This series reflects the latest advances and applications in machine learning and pattern recognition through the publication of a broad range of reference works, textbooks, and handbooks. The inclusion of concrete examples, applications, and methods is highly encouraged. The scope of the series includes, but is not limited to, titles in the areas of machine learning, pattern recognition, computational intelligence, robotics, computational/statistical learning theory, natural language processing, computer vision, game AI, game theory, neural networks, computational neuroscience, and other relevant topics, such as machine learning applied to bioinformatics or cognitive science, which might be proposed by potential contributors.
Model-Based Reinforcement Learning.
In the model-free approaches above, policies are learned without explicitly modeling the unknown environment (i.e., the transition probability p(s'|s,a) of the agent in the environment). The model-based approach, on the other hand, explicitly learns a model of the environment in advance and uses the learned environment model for policy learning.
Artificial samples can be generated from the learned environment model at no additional sampling cost. The model-based approach is therefore particularly useful when data collection is expensive (e.g., in robot control). However, accurately estimating the transition model from a limited amount of trajectory data in multi-dimensional continuous state and action spaces is highly challenging. Part IV of this book focuses on model-based reinforcement learning. Chapter 10 introduces a non-parametric transition model estimator that achieves the optimal convergence rate with high computational efficiency. However, even with the optimal convergence rate, estimating the transition model in high-dimensional state and action spaces remains challenging. Chapter 11 introduces a dimensionality reduction method that can be efficiently embedded into the transition model estimation procedure, and its usefulness is demonstrated through experiments.
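To make the model-based idea concrete, below is a minimal Python sketch (not the book's algorithm): it fits a maximum-likelihood estimate of p(s'|s,a) from logged transitions in a small discrete environment and then draws artificial rollouts from the fitted model. The state/action space sizes, the add-one smoothing, and the rollout length are illustrative assumptions; Chapter 10 treats the much harder multi-dimensional continuous case.

    # Minimal model-based RL sketch: estimate p(s'|s,a) from logged data,
    # then generate artificial samples from the learned model at no
    # additional sampling cost in the real environment.
    # (All sizes and constants below are illustrative assumptions.)
    import numpy as np

    n_states, n_actions = 5, 2
    rng = np.random.default_rng(0)

    # Logged transitions (s, a, s') collected from the real environment.
    logged = [(rng.integers(n_states), rng.integers(n_actions),
               rng.integers(n_states)) for _ in range(1000)]

    # Maximum-likelihood estimate of p(s'|s,a) via transition counts,
    # with add-one smoothing so unseen transitions keep nonzero mass.
    counts = np.ones((n_states, n_actions, n_states))
    for s, a, s_next in logged:
        counts[s, a, s_next] += 1
    p_hat = counts / counts.sum(axis=2, keepdims=True)

    # Artificial trajectory from the learned model: once p_hat is fitted,
    # sampling here touches the real environment zero times.
    s = 0
    for t in range(10):
        a = rng.integers(n_actions)              # e.g., exploratory policy
        s = rng.choice(n_states, p=p_hat[s, a])  # simulated next state
        print(f"t={t}: a={a} -> s'={s}")

In the discrete case the counting estimator above suffices, which is exactly why the continuous, high-dimensional setting of Part IV requires the non-parametric estimation and dimensionality reduction machinery of Chapters 10 and 11.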
Contents.
Foreword.
Preface.
Author.
I Introduction.
1 Introduction to Reinforcement Learning.
1.1 Reinforcement Learning.
1.2 Mathematical Formulation.
1.3 Structure of the Book.
1.3.1 Model-Free Policy Iteration.
1.3.2 Model-Free Policy Search.
1.3.3 Model-Based Reinforcement Learning.
II Model-Free Policy Iteration.
2 Policy Iteration with Value Function Approximation.
2.1 Value Functions.
2.1.1 State Value Functions.
2.1.2 State-Action Value Functions.
2.2 Least-Squares Policy Iteration.
2.2.1 Immediate-Reward Regression.
2.2.2 Algorithm.
2.2.3 Regularization.
2.2.4 Model Selection.
2.3 Remarks.
3 Basis Design for Value Function Approximation.
3.1 Gaussian Kernels on Graphs.
3.1.1 MDP-Induced Graph.
3.1.2 Ordinary Gaussian Kernels.
3.1.3 Geodesic Gaussian Kernels.
3.1.4 Extension to Continuous State Spaces.
3.2 Illustration.
3.2.1 Setup.
3.2.2 Geodesic Gaussian Kernels.
3.2.3 Ordinary Gaussian Kernels.
3.2.4 Graph-Laplacian Eigenbases.
3.2.5 Diffusion Wavelets.
3.3 Numerical Examples.
3.3.1 Robot-Arm Control.
3.3.2 Robot-Agent Navigation.
3.4 Remarks.
4 Sample Reuse in Policy Iteration.
4.1 Formulation.
4.2 Off-Policy Value Function Approximation.
4.2.1 Episodic Importance Weighting.
4.2.2 Per-Decision Importance Weighting.
4.2.3 Adaptive Per-Decision Importance Weighting.
4.2.4 Illustration.
4.3 Automatic Selection of Flattening Parameter.
4.3.1 Importance-Weighted Cross-Validation.
4.3.2 Illustration.
4.4 Sample-Reuse Policy Iteration.
4.4.1 Algorithm.
4.4.2 Illustration.
4.5 Numerical Examples.
4.5.1 Inverted Pendulum.
4.5.2 Mountain Car.
4.6 Remarks.
5 Active Learning in Policy Iteration.
5.1 Efficient Exploration with Active Learning.
5.1.1 Problem Setup.
5.1.2 Decomposition of Generalization Error.
5.1.3 Estimation of Generalization Error.
5.1.4 Designing Sampling Policies.
5.1.5 Illustration.
5.2 Active Policy Iteration.
5.2.1 Sample-Reuse Policy Iteration with Active Learning.
5.2.2 Illustration.
5.3 Numerical Examples.
5.4 Remarks.
6 Robust Policy Iteration.
6.1 Robustness and Reliability in Policy Iteration.
6.1.1 Robustness.
6.1.2 Reliability.
6.2 Least Absolute Policy Iteration.
6.2.1 Algorithm.
6.2.2 Illustration.
6.2.3 Properties.
6.3 Numerical Examples.
6.4 Possible Extensions.
6.4.1 Huber Loss.
6.4.2 Pinball Loss.
6.4.3 Deadzone-Linear Loss.
6.4.4 Chebyshev Approximation.
6.4.5 Conditional Value-At-Risk.
6.5 Remarks.
III Model-Free Policy Search.
7 Direct Policy Search by Gradient Ascent.
7.1 Formulation.
7.2 Gradient Approach.
7.2.1 Gradient Ascent.
7.2.2 Baseline Subtraction for Variance Reduction.
7.2.3 Variance Analysis of Gradient Estimators.
7.3 Natural Gradient Approach.
7.3.1 Natural Gradient Ascent.
7.3.2 Illustration.
7.4 Application in Computer Graphics: Artist Agent.
7.4.1 Sumie Painting.
7.4.2 Design of States, Actions, and Immediate Rewards.
7.4.3 Experimental Results.
7.5 Remarks.
8 Direct Policy Search by Expectation-Maximization.
8.1 Expectation-Maximization Approach.
8.2 Sample Reuse.
8.2.1 Episodic Importance Weighting.
8.2.2 Per-Decision Importance Weighting.
8.2.3 Adaptive Per-Decision Importance Weighting.
8.2.4 Automatic Selection of Flattening Parameter.
8.2.5 Reward-Weighted Regression with Sample Reuse.
8.3 Numerical Examples.
8.4 Remarks.
9 Policy-Prior Search.
9.1 Formulation.
9.2 Policy Gradients with Parameter-Based Exploration.
9.2.1 Policy-Prior Gradient Ascent.
9.2.2 Baseline Subtraction for Variance Reduction.
9.2.3 Variance Analysis of Gradient Estimators.
9.2.4 Numerical Examples.
9.3 Sample Reuse in Policy-Prior Search.
9.3.1 Importance Weighting.
9.3.2 Variance Reduction by Baseline Subtraction.
9.3.3 Numerical Examples.
9.4 Remarks.
IV Model-Based Reinforcement Learning.
10 Transition Model Estimation.
10.1 Conditional Density Estimation.
10.1.1 Regression-Based Approach.
10.1.2 ε-Neighbor Kernel Density Estimation.
10.1.3 Least-Squares Conditional Density Estimation.
10.2 Model-Based Reinforcement Learning.
10.3 Numerical Examples.
10.3.1 Continuous Chain Walk.
10.3.2 Humanoid Robot Control.
10.4 Remarks.
11 Dimensionality Reduction for Transition Model Estimation.
11.1 Sufficient Dimensionality Reduction.
11.2 Squared-Loss Conditional Entropy.
11.2.1 Conditional Independence.
11.2.2 Dimensionality Reduction with SCE.
11.2.3 Relation to Squared-Loss Mutual Information.
11.3 Numerical Examples.
11.3.1 Artificial and Benchmark Datasets.
11.3.2 Humanoid Robot.
11.4 Remarks.
References.
Index.