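Lessons from AlphaZero for Optimal, Model Predictive, and Adaptive Control (Original Textbooks from Internationally Renowned Universities) (English Edition) / Information Technology and Electrical Engineering Series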

  • Author: Dimitri P. Bertsekas (USA) | Editor: Gu Xue
  • Publisher: Tsinghua University Press
  • ISBN: 9787302684718
  • Publication date: 2025/04/01
  • Binding: Paperback
  • Pages: 227
List price: RMB 79

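Synopsis
    This book develops a new conceptual framework for approximate dynamic programming and reinforcement learning, one that is concise yet ambitious. The framework centers on two algorithms, off-line training and on-line play, which are independent of each other yet organically linked through Newton's method. Today's new generation of artificial intelligence technology is developing in rich and varied ways, and beneath the seemingly tangled surface of data and algorithms lie simple and elegant principles. Readers will come to appreciate how powerful classical optimal control theory is for analyzing and understanding the performance of modern reinforcement learning algorithms, and how the new wave of algorithms exemplified by AlphaZero creates new opportunities for developing the classical theory. The book is suitable as a textbook for graduate students and senior undergraduates in information science and technology, and for self-study by researchers in the field.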

About the Author
Dimitri P. Bertsekas (USA) | Editor: Gu Xue

Table of Contents
1.AlphaZero, Off-Line Training, and On-Line Play
  1.1.Off-Line Training and Policy Iteration
  1.2.On-Line Play and Approximation in Value Space - Truncated Rollout
  1.3.The Lessons of AlphaZero
  1.4.A New Conceptual Framework for Reinforcement Learning
  1.5.Notes and Sources
2.Deterministic and Stochastic Dynamic Programming
  2.1.Optimal Control Over an Infinite Horizon
  2.2.Approximation in Value Space
  2.3.Notes and Sources
3.An Abstract View of Reinforcement Learning
  3.1.Bellman Operators
  3.2.Approximation in Value Space and Newton's Method
  3.3.Region of Stability
  3.4.Policy Iteration, Rollout, and Newton's Method
  3.5.How Sensitive is On-Line Play to the Off-Line Training Process?
  3.6.Why Not Just Train a Policy Network and Use it Without On-Line Play?
  3.7.Multiagent Problems and Multiagent Rollout
  3.8.On-Line Simplified Policy Iteration
  3.9.Exceptional Cases
  3.10.Notes and Sources
4.The Linear Quadratic Case - Illustrations
  4.1.Optimal Solution
  4.2.Cost Functions of Stable Linear Policies
  4.3.Value Iteration
  4.4.One-Step and Multistep Lookahead - Newton Step Interpretations
  4.5.Sensitivity Issues
  4.6.Rollout and Policy Iteration
  4.7.Truncated Rollout - Length of Lookahead Issues
  4.8.Exceptional Behavior in Linear Quadratic Problems
  4.9.Notes and Sources
5.Adaptive and Model Predictive Control
  5.1.Systems with Unknown Parameters - Robust and PID Control
  5.2.Approximation in Value Space, Rollout, and Adaptive Control
  5.3.Approximation in Value Space, Rollout, and Model Predictive Control
  5.4.Terminal Cost Approximation - Stability Issues
  5.5.Notes and Sources
6.Finite Horizon Deterministic Problems - Discrete Optimization
  6.1.Deterministic Discrete Spaces Finite Horizon Problems
  6.2.General Discrete Optimization Problems
  6.3.Approximation in Value Space
  6.4.Rollout Algorithms for Discrete Optimization
  6.5.Rollout and Approximation in Value Space with Multistep Lookahead
    6.5.1.Simplified Multistep Rollout - Double Rollout
    6.5.2.Incremental Rollout for Multistep Approximation in Value Space
  6.6.Constrained Forms of Rollout Algorithms
  6.7.Adaptive Control by Rollout with a POMDP Formulation
  6.8.Rollout for Minimax Control
  6.9.Small Stage Costs and Long Horizon - Continuous-Time Rollout
  6.10.Epilogue

Appendix A: Newton's Method and Error Bounds
  A.1.Newton's Method for Differentiable Fixed Point Problems
  A.2.Newton's Method Without Differentiability of the Bellman Operator
  A.3.Local and Global Error Bounds for Approximation in Value Space
  A.4.Local and Global Error Bounds for Approximate Policy Iteration
References
