Bayesian Brain: Probabilistic Approaches to Neural Coding


Uploaded On 2020-08-04




Presentation Transcript

Slide1

Bayesian Brain: Probabilistic Approaches to Neural Coding

Chapter 12: Optimal Control Theory

Kenji Doya, Shin Ishii, Alexandre Pouget, and Rajesh P. N. Rao

Summarized by Seung-Joon Yi

Slide2

Chapter overview

- Discrete Control
  - Dynamic programming
  - Value iteration / Policy iteration
  - Markov decision process
- Continuous Control
  - The Hamilton-Jacobi-Bellman equation
- Deterministic Control
  - Pontryagin’s Maximum Principle
- Linear-Quadratic-Gaussian Control
  - Riccati equations

© 2008, SNU Biointelligence Lab, http://bi.snu.ac.kr/


Slide3

Discrete control setting

- State
- Action
- Future state
- Cost
- Problem: find an action sequence and corresponding state sequence minimizing the total cost

Slide4

Dynamic Programming

- Bellman optimality principle: if the given state-action sequence is optimal, its subsequence generated by removing its first state and action is also optimal.
- The optimal value function
- The Bellman equations for the optimal policy
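The Bellman equation can be illustrated on a toy problem. The sketch below assumes a small acyclic graph in which each action leads deterministically to a next state at a fixed cost; the graph and the names `GRAPH`, `optimal_value`, `optimal_policy`, and `optimal_path` are hypothetical and not taken from the slides.

```python
from functools import lru_cache

# Hypothetical acyclic graph: state -> {action: (next_state, cost)}.
GRAPH = {
    "A": {"to_B": ("B", 1.0), "to_C": ("C", 4.0)},
    "B": {"to_C": ("C", 2.0), "to_D": ("D", 6.0)},
    "C": {"to_D": ("D", 1.0)},
    "D": {},  # terminal state: no actions, zero cost-to-go
}


@lru_cache(maxsize=None)
def optimal_value(state):
    """Bellman equation: v(x) = min_u [cost(x, u) + v(next(x, u))]."""
    actions = GRAPH[state]
    if not actions:
        return 0.0
    return min(cost + optimal_value(nxt) for nxt, cost in actions.values())


def optimal_policy(state):
    """Action that is greedy with respect to the optimal value function."""
    actions = GRAPH[state]
    return min(actions, key=lambda u: actions[u][1] + optimal_value(actions[u][0]))


def optimal_path(state):
    """Follow the optimal policy until a terminal state is reached."""
    path = [state]
    while GRAPH[state]:
        state = GRAPH[state][optimal_policy(state)][0]
        path.append(state)
    return path
```

Memoizing `optimal_value` means each state is solved once, which is the practical content of the optimality principle: the tail of an optimal path is itself optimal, so it can be reused.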

Slide5

Value iteration and Policy iteration

- Relaxation scheme for graphs with loops
- Value iteration update
- Policy iteration update
- Both algorithms provably converge in a finite number of steps
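A minimal sketch of both relaxation schemes, assuming a hypothetical three-state graph with a loop between `A` and `B` and an absorbing goal `G`. The policy evaluation step here simply follows the policy to the goal, so it assumes the starting policy reaches `G` from every state; none of the names come from the slides.

```python
# Hypothetical graph with a loop (A <-> B); "G" is an absorbing goal state.
GRAPH = {
    "A": {"to_B": ("B", 1.0), "to_G": ("G", 5.0)},
    "B": {"to_A": ("A", 1.0), "to_G": ("G", 2.0)},
    "G": {},
}


def q_value(v, state, action):
    """Cost of taking `action` in `state`, then paying v at the next state."""
    nxt, cost = GRAPH[state][action]
    return cost + v[nxt]


def value_iteration(max_sweeps=100):
    """Repeat v(x) <- min_u [cost(x, u) + v(next(x, u))] until nothing changes."""
    v = {x: 0.0 for x in GRAPH}
    for _ in range(max_sweeps):
        new_v = {x: min((q_value(v, x, u) for u in GRAPH[x]), default=0.0)
                 for x in GRAPH}
        if new_v == v:
            break
        v = new_v
    return v


def evaluate(policy):
    """Cost-to-go of a fixed policy; assumes the policy reaches G from every state."""
    def cost_to_go(x):
        total = 0.0
        while GRAPH[x]:
            nxt, cost = GRAPH[x][policy[x]]
            total += cost
            x = nxt
        return total
    return {x: cost_to_go(x) for x in GRAPH}


def policy_iteration(policy):
    """Alternate policy evaluation and greedy improvement until the policy is stable."""
    while True:
        v = evaluate(policy)
        improved = {x: min(GRAPH[x], key=lambda u: q_value(v, x, u))
                    for x in GRAPH if GRAPH[x]}
        if improved == policy:
            return policy, v
        policy = improved
```

Starting policy iteration from `{"A": "to_G", "B": "to_G"}` keeps every intermediate policy free of cycles on this graph, which is why the simple evaluation loop terminates.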

Slide6

Markov Decision Process

- Stochastic transition case
  - Transition function
  - Value function
- Markov decision process
  - An optimal control problem with discrete state and stochastic state transitions
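In the stochastic case the Bellman backup takes an expectation over next states. The following sketch assumes a hypothetical two-state MDP with an absorbing, zero-cost goal; the table `P` and the function names are illustrative only.

```python
# Hypothetical MDP: P[state][action] = (cost, {next_state: probability}).
P = {
    "S": {
        "safe": (2.0, {"G": 1.0}),
        "risky": (0.5, {"G": 0.5, "S": 0.5}),
    },
    "G": {},  # absorbing goal with zero cost
}


def backup(v, state, action):
    """Expected cost: cost(x, u) + sum over x' of p(x' | x, u) * v(x')."""
    cost, transitions = P[state][action]
    return cost + sum(prob * v[nxt] for nxt, prob in transitions.items())


def mdp_value_iteration(tol=1e-12, max_sweeps=10_000):
    """Value iteration with the expectation taken over stochastic transitions."""
    v = {x: 0.0 for x in P}
    for _ in range(max_sweeps):
        new_v = {x: min((backup(v, x, u) for u in P[x]), default=0.0) for x in P}
        if max(abs(new_v[x] - v[x]) for x in P) < tol:
            return new_v
        v = new_v
    return v


def greedy_policy(v):
    """Policy that minimizes the expected backup under value function v."""
    return {x: min(P[x], key=lambda u: backup(v, x, u)) for x in P if P[x]}
```

For this toy problem the fixed point can be checked by hand: always choosing `risky` gives v(S) = 0.5 + 0.5 v(S), so v(S) = 1, which beats the `safe` cost of 2.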

Slide7

Continuous state control

- Real-valued state
- Real-valued control
- Controlled Ito diffusion process
- Total cost function
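A controlled diffusion and its total cost can be approximated numerically. The sketch below uses the Euler-Maruyama scheme for a scalar state, assuming dynamics of the form dx = f(x,u) dt + F(x,u) dw and a cost made of a running term plus a final term; the function `simulate` and all of its parameters are hypothetical names chosen for illustration.

```python
import math
import random


def simulate(x0, policy, drift, diffusion, running_cost, final_cost,
             horizon=1.0, dt=0.01, seed=0):
    """Euler-Maruyama rollout of dx = f(x,u) dt + F(x,u) dw with accumulated cost.

    Returns the final state and the total cost
    final_cost(x(T)) + sum over steps of running_cost(x, u) * dt.
    """
    rng = random.Random(seed)
    x, total = x0, 0.0
    steps = round(horizon / dt)
    for k in range(steps):
        u = policy(x, k * dt)
        total += running_cost(x, u) * dt
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment ~ N(0, dt)
        x += drift(x, u) * dt + diffusion(x, u) * dw
    return x, total + final_cost(x)
```

For example, `simulate(1.0, lambda x, t: -x, lambda x, u: u, lambda x, u: 0.2, lambda x, u: 0.5 * u * u, lambda x: x * x)` rolls out a proportional feedback law under additive noise; averaging the returned cost over many seeds gives a Monte Carlo estimate of the expected total cost.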

Slide8

The Hamilton-Jacobi-Bellman equation

- Apply DP approach to the time-discretized stochastic problem
- The resulting HJB equation
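The equation itself did not survive in this transcript. For orientation, the standard form of the HJB equation for a controlled diffusion is sketched below; the symbols (f, F, S, ℓ, h, v) are assumed notation and may differ from what the original slide used.

```latex
% Assumed notation (not preserved in the transcript):
%   dynamics  dx = f(x,u)\,dt + F(x,u)\,dw,   S(x,u) = F(x,u) F(x,u)^{\top}
%   cost      J = h(x(T)) + \int_0^T \ell(x(t),u(t),t)\,dt
\begin{aligned}
-v_t(x,t) &= \min_{u}\Big\{ \ell(x,u,t) + f(x,u)^{\top} v_x(x,t)
             + \tfrac{1}{2}\operatorname{tr}\!\big(S(x,u)\, v_{xx}(x,t)\big) \Big\}, \\
v(x,T) &= h(x).
\end{aligned}
```

The second-order trace term comes from the noise; with F = 0 it vanishes and the equation becomes first order.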

Slide9

Solving the HJB equation

- A nonlinear, second-order PDE w.r.t. the unknown function v
- Does not always have classical solutions
- Many weak solutions can exist
- The idea of viscosity solutions provides a reassuring answer
- Parametric methods for approximate solutions

Slide10

Infinite-horizon case

- Discounted cost formulation
- Average-cost-per-stage formulation

Slide11

Pontryagin’s Maximum principle

- Two fundamental ideas of optimal control theory
  - Bellman’s DP and optimality principle
  - Pontryagin’s maximum principle
- The Maximum principle
  - Applies only to deterministic problems
  - Yields the same solutions as DP
  - However, the MP avoids the curse of dimensionality!

Slide12

Continuous-time maximum principle

- HJB equation for deterministic dynamics
- The costate vector
- The maximum principle
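The formulas on this slide were lost in extraction. A commonly used statement of the continuous-time maximum principle, written for cost minimization, is sketched below. The notation (Hamiltonian H, costate p, running cost ℓ, final cost h) is an assumption rather than a reconstruction of the slide.

```latex
% Assumed notation: H(x,u,p,t) = \ell(x,u,t) + f(x,u)^{\top} p,
% with the costate identified along the optimal trajectory as p(t) = v_x(x(t),t).
\begin{aligned}
\dot{x}(t) &= f\big(x(t),u(t)\big), & x(0) &= x_0, \\
-\dot{p}(t) &= \ell_x\big(x(t),u(t),t\big) + f_x\big(x(t),u(t)\big)^{\top} p(t), & p(T) &= h_x\big(x(T)\big), \\
u(t) &= \arg\min_{u} H\big(x(t),u,p(t),t\big).
\end{aligned}
```

These are ordinary differential equations along a single trajectory, with the state fixed at the start and the costate fixed at the end, rather than a PDE over the whole state space.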

Slide13

Discrete-time maximum principle

- Discrete-time optimal control problem
- The maximum principle
- Can be solved using gradient descent
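The gradient-descent idea can be sketched for a scalar problem. The example assumes linear dynamics and quadratic costs purely for convenience; the constants and function names are hypothetical. The backward pass computes the costate p_k, and the gradient of the total cost with respect to u_k is ℓ_u + f_u p_{k+1}.

```python
A, B = 1.0, 0.5             # assumed dynamics: x_{k+1} = A x_k + B u_k
Q, R, QF = 1.0, 0.1, 10.0   # assumed running and final cost weights


def rollout(x0, controls):
    """Forward pass: states x_0 .. x_n under the given control sequence."""
    xs = [x0]
    for u in controls:
        xs.append(A * xs[-1] + B * u)
    return xs


def total_cost(x0, controls):
    """Sum of running costs 0.5 (Q x^2 + R u^2) plus final cost 0.5 QF x_n^2."""
    xs = rollout(x0, controls)
    running = sum(0.5 * (Q * x * x + R * u * u) for x, u in zip(xs, controls))
    return running + 0.5 * QF * xs[-1] ** 2


def gradient(x0, controls):
    """Backward costate pass giving dJ/du_k for every step k."""
    xs = rollout(x0, controls)
    p = QF * xs[-1]                          # p_n = h_x(x_n)
    grad = [0.0] * len(controls)
    for k in reversed(range(len(controls))):
        grad[k] = R * controls[k] + B * p    # dJ/du_k = l_u + f_u * p_{k+1}
        p = Q * xs[k] + A * p                # p_k = l_x + f_x * p_{k+1}
    return grad


def gradient_descent(x0, n_steps=10, rate=0.02, iters=2000):
    """Plain gradient descent on the control sequence, starting from zeros."""
    controls = [0.0] * n_steps
    for _ in range(iters):
        g = gradient(x0, controls)
        controls = [u - rate * gk for u, gk in zip(controls, g)]
    return controls
```

One forward pass and one backward pass yield the full gradient at a cost linear in the horizon, with no value function tabulated over the state space. The step size `rate` is a hand-picked value for these constants and would need retuning for other problems.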

Slide14

Linear-Quadratic-Gaussian control

- LQG case
  - Linear dynamics
  - Quadratic costs
  - Additive Gaussian noise
- One of the rare cases with a closed-form optimal control law
  - Quadratic optimal value function
  - Allows minimization of the Hamiltonian in closed form

Slide15

Continuous case

- LQG condition
- Guess of the optimal VF in parametric form
- Optimal control law
- Continuous-time Riccati equation
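The slide's formulas are missing from the transcript. The standard continuous-time LQG result is sketched below under assumed notation (A, B, F for the dynamics; Q, R, Q^f for the costs; V(t), a(t) for the value function parameters); the original slide may have used different symbols.

```latex
% Assumed setting: dx = (A x + B u)\,dt + F\,dw,
% cost rate \tfrac12 u^{\top} R u + \tfrac12 x^{\top} Q x, final cost \tfrac12 x^{\top} Q^{f} x.
\begin{aligned}
v(x,t) &= \tfrac{1}{2}\, x^{\top} V(t)\, x + a(t)
  && \text{(parametric guess)} \\
u &= -R^{-1} B^{\top} V(t)\, x
  && \text{(optimal control law)} \\
-\dot{V}(t) &= Q + A^{\top} V(t) + V(t) A - V(t) B R^{-1} B^{\top} V(t),
  \quad V(T) = Q^{f}
  && \text{(Riccati equation)} \\
-\dot{a}(t) &= \tfrac{1}{2}\operatorname{tr}\!\big(F F^{\top} V(t)\big),
  \quad a(T) = 0.
\end{aligned}
```

The noise enters only through a(t), so the feedback gain is the same as in the deterministic problem.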

Slide16

Discrete case

- LQR condition (deterministic)
- Guess for the optimal VF
- Optimal control law
- Discrete-time Riccati equation
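A scalar sketch of the discrete-time Riccati recursion, assuming dynamics x_{k+1} = a x_k + b u_k, running cost ½(q x² + r u²), final cost ½ q_final x², and a value function guess ½ V_k x². The function names and parameters are hypothetical.

```python
def lqr_backward_pass(a, b, q, r, q_final, n_steps):
    """Scalar discrete-time Riccati recursion.

    Returns feedback gains L_k (so that u_k = -L_k x_k) and the value
    function coefficients V_k, with V_n = q_final.
    """
    v = [0.0] * (n_steps + 1)
    gains = [0.0] * n_steps
    v[n_steps] = q_final
    for k in reversed(range(n_steps)):
        gains[k] = b * v[k + 1] * a / (r + b * v[k + 1] * b)
        v[k] = q + a * v[k + 1] * (a - b * gains[k])
    return gains, v


def closed_loop_cost(x0, a, b, q, r, q_final, gains):
    """Total cost actually incurred when applying u_k = -L_k x_k from x0."""
    x, cost = x0, 0.0
    for gain in gains:
        u = -gain * x
        cost += 0.5 * (q * x * x + r * u * u)
        x = a * x + b * u
    return cost + 0.5 * q_final * x * x
```

A useful consistency check is that the cost incurred by the closed-loop system from x0 matches ½ V_0 x0², the value predicted by the recursion.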

Slide17

Optimal estimation and Kalman filter

- The dual to the optimal control problem
- Kalman filter
  - The most widely used estimator
  - Objective: compute the posterior given observations
- Kalman filter result
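A scalar Kalman filter in the common predict-then-update form gives a feel for the result. The model x_{k+1} = a x_k + w, y_k = h x_k + v and all parameter names are assumptions for illustration; the chapter may write the recursion in a different but equivalent form.

```python
def kalman_step(mean, var, y, a=1.0, h=1.0, process_var=0.0, obs_var=1.0):
    """One predict/update cycle for x' = a x + w, y = h x + v (scalar case)."""
    # Predict: push the Gaussian posterior through the linear dynamics.
    mean_pred = a * mean
    var_pred = a * var * a + process_var
    # Update: condition the predicted Gaussian on the new observation y.
    gain = var_pred * h / (h * var_pred * h + obs_var)
    mean_new = mean_pred + gain * (y - h * mean_pred)
    var_new = (1.0 - gain * h) * var_pred
    return mean_new, var_new


def kalman_filter(observations, mean0, var0, **model):
    """Run the filter over a sequence; returns (mean, variance) after each step."""
    mean, var = mean0, var0
    history = []
    for y in observations:
        mean, var = kalman_step(mean, var, y, **model)
        history.append((mean, var))
    return history
```

With a static state (`a=1`, `process_var=0`), unit observation noise, and a unit-variance prior at zero, three observations 1, 2, 3 give posterior mean 1.5 and variance 0.25, which matches the direct Bayesian computation for averaging Gaussian measurements.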

Slide18

Beyond the Kalman filter

- Nonlinear dynamics, non-Gaussian noise, etc.
- Extended Kalman filter
  - Uses local linearization centered at the current state estimate
- Unscented filter
  - Uses deterministic sampling
- Particle filtering
  - Propagates a cloud of points sampled from the posterior
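Of the three extensions, particle filtering is the easiest to sketch in a few lines. The example below is a bootstrap particle filter for an assumed scalar random-walk model with Gaussian noise, chosen so the answer can be compared against a Kalman filter. The model, the N(0, 1) prior, and all names are hypothetical.

```python
import math
import random


def particle_filter(observations, n_particles=5000, process_var=0.1,
                    obs_var=1.0, seed=0):
    """Bootstrap particle filter for x_{k+1} = x_k + w, y_k = x_k + v.

    Returns the estimated posterior mean after each observation.
    """
    rng = random.Random(seed)
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]  # prior N(0, 1)
    means = []
    for y in observations:
        # Propagate each particle through the dynamics with sampled noise.
        particles = [x + rng.gauss(0.0, math.sqrt(process_var)) for x in particles]
        # Weight by the (unnormalized) Gaussian observation likelihood.
        weights = [math.exp(-0.5 * (y - x) ** 2 / obs_var) for x in particles]
        total = sum(weights)
        means.append(sum(w * x for w, x in zip(weights, particles)) / total)
        # Resample particles in proportion to their weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return means
```

Nothing in the loop relies on linearity or Gaussianity: replacing the propagation line with nonlinear dynamics, or the weight line with another likelihood, leaves the rest unchanged. The price is sampling error that shrinks only as the number of particles grows.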

Slide19

Duality of optimal control and optimal estimation

- LQR controller and Kalman filter
  - Two Riccati equations
- Optimal control and MAP smoothing
- LQG control and Kalman smoothing

Slide20

Optimal control as a theory of biological movement

- The brain generates the best behavior it can, subject to the constraints imposed by the body and environment.
- We can assume that, at least in natural and well-practiced tasks, the observed behavior will be close to optimal.
- Minimum-energy, minimum-jerk, minimum-torque-change models, etc.
- Research directions
  - Motor learning and adaptation
  - Neural implementation of the optimal control laws
  - Distributed and hierarchical control
  - Inverse optimal control
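As one concrete instance of the models listed above, the minimum-jerk model has a well-known closed form for a point-to-point movement that starts and ends at rest: a fifth-order polynomial in normalized time. The sketch below assumes a one-dimensional movement; the function names are hypothetical.

```python
def minimum_jerk_position(x0, xf, duration, t):
    """Position at time t of a minimum-jerk movement from x0 to xf.

    Assumes zero velocity and acceleration at both endpoints.
    """
    tau = min(max(t / duration, 0.0), 1.0)  # normalized time in [0, 1]
    return x0 + (xf - x0) * (10 * tau**3 - 15 * tau**4 + 6 * tau**5)


def minimum_jerk_velocity(x0, xf, duration, t):
    """Velocity of the same trajectory: a symmetric, bell-shaped profile."""
    tau = min(max(t / duration, 0.0), 1.0)
    return (xf - x0) / duration * (30 * tau**2 - 60 * tau**3 + 30 * tau**4)
```

The velocity profile peaks at the midpoint of the movement at 1.875 times the average speed, one of the simple quantitative predictions that can be compared against recorded reaching movements.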