EE363 Winter 2008-09, Lecture 10: Linear Quadratic Stochastic Control with Partial State Observation
Page 1
EE363 Winter 2008-09
Lecture 10: Linear Quadratic Stochastic Control with Partial State Observation
partially observed linear-quadratic stochastic control problem
estimation-control separation principle
solution via dynamic programming
Page 2
Linear stochastic system
linear dynamical system, over finite time horizon:
\[ x_{t+1} = A x_t + B u_t + w_t, \quad t = 0, \ldots, N-1 \]
with state $x_t$, input $u_t$, and process noise $w_t$
linear noise corrupted observations:
\[ y_t = C x_t + v_t, \quad t = 0, \ldots, N \]
$y_t$ is output, $v_t$ is measurement noise
$x_0 \sim \mathcal{N}(0, X)$, $w_t \sim \mathcal{N}(0, W)$, $v_t \sim \mathcal{N}(0, V)$, all independent
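The model above can be simulated directly. The following is a minimal sketch, assuming NumPy; the function name `simulate_open_loop`, the example matrices, and the zero input sequence are illustrative choices, not part of the lecture.

```python
import numpy as np

def simulate_open_loop(A, B, C, X, W, V, u_seq, rng):
    """Simulate x_{t+1} = A x_t + B u_t + w_t and y_t = C x_t + v_t."""
    N = len(u_seq)                      # inputs u_0, ..., u_{N-1}
    n, p = A.shape[0], C.shape[0]
    xs = np.zeros((N + 1, n))           # states x_0, ..., x_N
    ys = np.zeros((N + 1, p))           # outputs y_0, ..., y_N
    xs[0] = rng.multivariate_normal(np.zeros(n), X)        # x_0 ~ N(0, X)
    for t in range(N + 1):
        v = rng.multivariate_normal(np.zeros(p), V)        # v_t ~ N(0, V)
        ys[t] = C @ xs[t] + v
        if t < N:
            w = rng.multivariate_normal(np.zeros(n), W)    # w_t ~ N(0, W)
            xs[t + 1] = A @ xs[t] + B @ u_seq[t] + w
    return xs, ys

# Hypothetical example data (not from the lecture).
rng = np.random.default_rng(0)
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
C = np.array([[1.0, 0.0]])
X, W, V = np.eye(2), 0.01 * np.eye(2), 0.1 * np.eye(1)
u_seq = np.zeros((20, 1))               # zero input, just to exercise the model
xs, ys = simulate_open_loop(A, B, C, X, W, V, u_seq, rng)
print(xs.shape, ys.shape)               # (21, 2) (21, 1)
```

Only `ys` would be available to a causal output feedback controller; `xs` is returned here purely for inspection.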
Page 3
Causal output feedback control policies
causal feedback policies: input must be a function of past and present outputs
roughly speaking: the current state $x_t$ is not known
\[ u_t = \phi_t(Y_t), \quad t = 0, \ldots, N-1 \]
$Y_t = (y_0, \ldots, y_t)$ is the output history at time $t$
$\phi_t : \mathbf{R}^{p(t+1)} \to \mathbf{R}^m$ is called the control policy at time $t$
closed-loop system is
\[ x_{t+1} = A x_t + B \phi_t(Y_t) + w_t, \qquad y_t = C x_t + v_t \]
$x_0, \ldots, x_N$, $y_0, \ldots, y_N$, $u_0, \ldots, u_{N-1}$ are all random
Page 4
Stochastic control with partial observations
objective:
\[ J = \mathbf{E}\left( \sum_{t=0}^{N-1} \left( x_t^T Q x_t + u_t^T R u_t \right) + x_N^T Q x_N \right) \]
with $Q \geq 0$, $R > 0$
partially observed linear quadratic stochastic control problem (a.k.a. LQG problem): choose output feedback policies $\phi_0, \ldots, \phi_{N-1}$ to minimize $J$
Page 5
Solution
optimal policies are
\[ \phi_t^\star(Y_t) = K_t \, \mathbf{E}(x_t \mid Y_t) \]
$K_t$ is the optimal feedback gain matrix for the associated LQR problem
$\mathbf{E}(x_t \mid Y_t)$ is the MMSE estimate of $x_t$ given measurements $Y_t$ (can be computed using the Kalman filter)
called the separation principle: the optimal policy consists of
  estimating the state via MMSE (ignoring the control problem)
  using the estimated state as if it were the actual state, for purposes of control
Page 6
LQR control gain computation
define $P_N = Q$, and for $t = N, \ldots, 1$,
\[ P_{t-1} = A^T P_t A + Q - A^T P_t B (R + B^T P_t B)^{-1} B^T P_t A \]
set
\[ K_t = -(R + B^T P_{t+1} B)^{-1} B^T P_{t+1} A, \quad t = 0, \ldots, N-1 \]
$K_t$ does not depend on data $C$, $X$, $W$, $V$
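The backward recursion for the LQR cost-to-go matrices $P_t$ (starting from $P_N = Q$) and the gains $K_t = -(R + B^T P_{t+1} B)^{-1} B^T P_{t+1} A$ might be coded as follows. This is a sketch assuming NumPy; the function name `lqr_gains` and the scalar example are illustrative.

```python
import numpy as np

def lqr_gains(A, B, Q, R, N):
    """Return cost-to-go matrices P_0..P_N and gains K_0..K_{N-1}."""
    P = [None] * (N + 1)
    K = [None] * N
    P[N] = Q
    for t in range(N, 0, -1):
        G = R + B.T @ P[t] @ B
        # K_{t-1} = -(R + B^T P_t B)^{-1} B^T P_t A
        K[t - 1] = -np.linalg.solve(G, B.T @ P[t] @ A)
        # P_{t-1} = Q + A^T P_t A - A^T P_t B (R + B^T P_t B)^{-1} B^T P_t A,
        # written using the gain just computed.
        P[t - 1] = Q + A.T @ P[t] @ A + A.T @ P[t] @ B @ K[t - 1]
    return P, K

# Scalar example (hypothetical data): A = B = Q = R = 1, horizon N = 2.
A = B = Q = R = np.array([[1.0]])
P, K = lqr_gains(A, B, Q, R, N=2)
```

For this scalar example the recursion gives $P_2 = 1$, $P_1 = 1.5$, $P_0 = 1.6$ and $K_1 = -0.5$, $K_0 = -0.6$, which can be checked by hand.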
Page 7
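Kalman filter current state estimate
define
  $\hat x_t = \mathbf{E}(x_t \mid Y_t)$ (current state estimate)
  $\Sigma_t = \mathbf{E}(x_t - \hat x_t)(x_t - \hat x_t)^T$ (current state estimate covariance)
  $\Sigma_{t+1|t} = A \Sigma_t A^T + W$ (next state estimate covariance)
start with $\Sigma_{0|-1} = X$; for $t = 0, \ldots, N$,
\[ \Sigma_t = \Sigma_{t|t-1} - \Sigma_{t|t-1} C^T (C \Sigma_{t|t-1} C^T + V)^{-1} C \Sigma_{t|t-1}, \qquad \Sigma_{t+1|t} = A \Sigma_t A^T + W \]
define
\[ L_t = \Sigma_{t|t-1} C^T (C \Sigma_{t|t-1} C^T + V)^{-1}, \quad t = 0, \ldots, N \]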
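The Kalman filter covariance recursion and the gains $L_t = \Sigma_{t|t-1} C^T (C \Sigma_{t|t-1} C^T + V)^{-1}$ can be computed offline, before any measurement arrives. A minimal NumPy sketch follows; the function name `kalman_gains` and the scalar data are assumptions made for illustration.

```python
import numpy as np

def kalman_gains(A, C, X, W, V, N):
    """Return filtered covariances Sigma_0..Sigma_N and gains L_0..L_N."""
    Sigma_pred = X                       # Sigma_{0|-1} = X
    Sigma, L = [], []
    for t in range(N + 1):
        S = C @ Sigma_pred @ C.T + V     # covariance of the output prediction error
        # L_t = Sigma_{t|t-1} C^T S^{-1}; S and Sigma_pred are symmetric.
        L_t = np.linalg.solve(S, C @ Sigma_pred).T
        # Sigma_t = Sigma_{t|t-1} - L_t C Sigma_{t|t-1}
        Sigma_t = Sigma_pred - L_t @ C @ Sigma_pred
        L.append(L_t)
        Sigma.append(Sigma_t)
        Sigma_pred = A @ Sigma_t @ A.T + W   # Sigma_{t+1|t}
    return Sigma, L

# Scalar example (hypothetical data): A = C = X = W = V = 1, N = 1.
one = np.array([[1.0]])
Sigma, L = kalman_gains(one, one, one, one, one, N=1)
```

By hand, this scalar case gives $L_0 = 0.5$, $\Sigma_0 = 0.5$, $\Sigma_{1|0} = 1.5$, $L_1 = 0.6$, $\Sigma_1 = 0.6$.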
Page 8
set $\hat x_0 = L_0 y_0$; for $t = 0, \ldots, N-1$,
\[ \hat x_{t+1} = A \hat x_t + B u_t + L_{t+1} e_{t+1}, \qquad e_{t+1} = y_{t+1} - C(A \hat x_t + B u_t) \]
$e_{t+1}$ is the next output prediction error
$e_{t+1} \sim \mathcal{N}(0, C \Sigma_{t+1|t} C^T + V)$, independent of $Y_t$
Kalman filter gains $L_t$ do not depend on data $B$, $Q$, $R$
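The estimate update with a known input, $\hat x_{t+1} = A \hat x_t + B u_t + L_{t+1} e_{t+1}$ with $e_{t+1} = y_{t+1} - C(A \hat x_t + B u_t)$, is a one-line prediction followed by a correction. The sketch below assumes NumPy; `kf_init`, `kf_step`, and all numerical values are hypothetical.

```python
import numpy as np

def kf_init(L0, y0):
    """Initial estimate x_hat_0 = L_0 y_0 (the prior mean of x_0 is zero)."""
    return L0 @ y0

def kf_step(xhat, u, y_next, A, B, C, L_next):
    """One Kalman filter step; returns (x_hat_{t+1}, e_{t+1})."""
    pred = A @ xhat + B @ u              # predicted next state
    e = y_next - C @ pred                # next output prediction error
    return pred + L_next @ e, e

# Scalar example with made-up gains and measurements.
A = B = C = np.array([[1.0]])
L0, L1 = np.array([[0.5]]), np.array([[0.6]])
xhat0 = kf_init(L0, np.array([2.0]))     # 0.5 * 2.0 = 1.0
u0 = np.array([-0.6])                    # e.g. u_0 = K_0 x_hat_0 with K_0 = -0.6
xhat1, e1 = kf_step(xhat0, u0, np.array([1.4]), A, B, C, L1)
```

Because the input $u_t$ is known to the controller, it enters only the prediction and leaves the gains unchanged.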
Page 9
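Solution via dynamic programming
let $V_t(Y_t)$ be the optimal value of the LQG problem, from $t$ on, conditioned on the output history $Y_t$:
\[ V_t(Y_t) = \min_{\phi_t, \ldots, \phi_{N-1}} \mathbf{E}\left( \sum_{\tau=t}^{N-1} \left( x_\tau^T Q x_\tau + u_\tau^T R u_\tau \right) + x_N^T Q x_N \;\Big|\; Y_t \right) \]
we'll show that $V_t$ is a quadratic function plus a constant, in fact,
\[ V_t(Y_t) = \hat x_t^T P_t \hat x_t + q_t, \quad t = 0, \ldots, N, \]
where $P_t$ is the LQR cost-to-go matrix ($\hat x_t$ is a linear function of $Y_t$)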
Page 10
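we have
\[ V_N(Y_N) = \mathbf{E}(x_N^T Q x_N \mid Y_N) = \hat x_N^T Q \hat x_N + \mathbf{Tr}(Q \Sigma_N) \]
(using $x_N \mid Y_N \sim \mathcal{N}(\hat x_N, \Sigma_N)$), so $P_N = Q$, $q_N = \mathbf{Tr}(Q \Sigma_N)$
dynamic programming (DP) equation is
\[ V_t(Y_t) = \min_{u_t} \mathbf{E}\left( x_t^T Q x_t + u_t^T R u_t + V_{t+1}(Y_{t+1}) \mid Y_t \right) \]
(and argmin, which is a function of $Y_t$, is the optimal input)
with $V_{t+1}(Y_{t+1}) = \hat x_{t+1}^T P_{t+1} \hat x_{t+1} + q_{t+1}$, the DP equation becomes
\[ V_t(Y_t) = \min_{u_t} \mathbf{E}\left( x_t^T Q x_t + u_t^T R u_t + \hat x_{t+1}^T P_{t+1} \hat x_{t+1} + q_{t+1} \mid Y_t \right) \]
\[ = \mathbf{E}(x_t^T Q x_t \mid Y_t) + q_{t+1} + \min_{u_t} \left( u_t^T R u_t + \mathbf{E}(\hat x_{t+1}^T P_{t+1} \hat x_{t+1} \mid Y_t) \right) \]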
Page 11
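using $x_t \mid Y_t \sim \mathcal{N}(\hat x_t, \Sigma_t)$, the first term is
\[ \mathbf{E}(x_t^T Q x_t \mid Y_t) = \hat x_t^T Q \hat x_t + \mathbf{Tr}(Q \Sigma_t) \]
using $\hat x_{t+1} = A \hat x_t + B u_t + L_{t+1} e_{t+1}$, with $e_{t+1} \sim \mathcal{N}(0, C \Sigma_{t+1|t} C^T + V)$, independent of $Y_t$, we get
\[ \mathbf{E}(\hat x_{t+1}^T P_{t+1} \hat x_{t+1} \mid Y_t) = \hat x_t^T A^T P_{t+1} A \hat x_t + u_t^T B^T P_{t+1} B u_t + 2 \hat x_t^T A^T P_{t+1} B u_t + \mathbf{Tr}\left( (L_{t+1}^T P_{t+1} L_{t+1})(C \Sigma_{t+1|t} C^T + V) \right) \]
using $L_{t+1} = \Sigma_{t+1|t} C^T (C \Sigma_{t+1|t} C^T + V)^{-1}$, the last term becomes
\[ \mathbf{Tr}\left( P_{t+1} \Sigma_{t+1|t} C^T (C \Sigma_{t+1|t} C^T + V)^{-1} C \Sigma_{t+1|t} \right) = \mathbf{Tr}\, P_{t+1} (\Sigma_{t+1|t} - \Sigma_{t+1}) \]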
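Both expectations on this slide rest on one standard identity for a random vector $z$ with mean $\mu$ and covariance $\Sigma_z$ (the symbols $z$, $\mu$, $\Sigma_z$ are introduced here only for this worked step):

```latex
\begin{aligned}
\mathbf{E}\, z^T Q z
  &= \mathbf{E}\, \mathbf{Tr}(Q z z^T)
   = \mathbf{Tr}\big( Q \, \mathbf{E}\, z z^T \big) \\
  &= \mathbf{Tr}\big( Q (\Sigma_z + \mu \mu^T) \big)
   = \mu^T Q \mu + \mathbf{Tr}(Q \Sigma_z).
\end{aligned}
```

Taking $z = x_t$ conditioned on $Y_t$ (mean $\hat x_t$, covariance $\Sigma_t$) gives the first term. Taking $z = L_{t+1} e_{t+1}$ (mean zero, covariance $L_{t+1}(C \Sigma_{t+1|t} C^T + V) L_{t+1}^T$) with $P_{t+1}$ in place of $Q$ gives the trace term; the cross terms between $A \hat x_t + B u_t$ and $L_{t+1} e_{t+1}$ vanish because $e_{t+1}$ has zero mean and is independent of $Y_t$.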
Page 12
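combining all terms we get
\[ V_t(Y_t) = \hat x_t^T (Q + A^T P_{t+1} A) \hat x_t + q_{t+1} + \mathbf{Tr}(Q \Sigma_t) + \mathbf{Tr}\, P_{t+1} (\Sigma_{t+1|t} - \Sigma_{t+1}) + \min_{u_t} \left( u_t^T (R + B^T P_{t+1} B) u_t + 2 \hat x_t^T A^T P_{t+1} B u_t \right) \]
minimization is the same as in the deterministic LQR problem
thus the optimal policy is $\phi_t^\star(Y_t) = K_t \hat x_t$, with
\[ K_t = -(R + B^T P_{t+1} B)^{-1} B^T P_{t+1} A \]
plugging in the optimal $u_t$ we get $V_t(Y_t) = \hat x_t^T P_t \hat x_t + q_t$, where
\[ P_t = A^T P_{t+1} A + Q - A^T P_{t+1} B (R + B^T P_{t+1} B)^{-1} B^T P_{t+1} A \]
\[ q_t = q_{t+1} + \mathbf{Tr}(Q \Sigma_t) + \mathbf{Tr}\, P_{t+1} (\Sigma_{t+1|t} - \Sigma_{t+1}) \]
recursion for $P_t$ is exactly the same as for deterministic LQR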
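The minimization over $u_t$ is an unconstrained convex quadratic and can be worked out by setting the gradient to zero. As a sketch, write $H = R + B^T P_{t+1} B$ and $g = B^T P_{t+1} A \hat x_t$ (shorthand introduced here, not in the slides); $H$ is positive definite because $R > 0$ and $P_{t+1} \geq 0$:

```latex
\begin{aligned}
\nabla_u \big( u^T H u + 2 g^T u \big) &= 2 H u + 2 g = 0
  \;\Longrightarrow\;
  u^\star = -H^{-1} g = -(R + B^T P_{t+1} B)^{-1} B^T P_{t+1} A \, \hat x_t, \\
\min_u \big( u^T H u + 2 g^T u \big) &= -g^T H^{-1} g
  = -\hat x_t^T A^T P_{t+1} B (R + B^T P_{t+1} B)^{-1} B^T P_{t+1} A \, \hat x_t .
\end{aligned}
```

Adding this minimum value to $\hat x_t^T (Q + A^T P_{t+1} A) \hat x_t$ yields $\hat x_t^T P_t \hat x_t$ with the Riccati recursion for $P_t$, and the remaining terms, which do not depend on $\hat x_t$ or $u_t$, collect into $q_t$.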
Page 13
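Optimal objective
optimal LQG cost is
\[ J^\star = \mathbf{E}\, V_0(y_0) = q_0 + \mathbf{E}\, \hat x_0^T P_0 \hat x_0 = q_0 + \mathbf{Tr}\, P_0 (X - \Sigma_0) \]
using $\hat x_0 \sim \mathcal{N}(0, X - \Sigma_0)$
using $q_N = \mathbf{Tr}(Q \Sigma_N)$ and
\[ q_t = q_{t+1} + \mathbf{Tr}(Q \Sigma_t) + \mathbf{Tr}\, P_{t+1} (\Sigma_{t+1|t} - \Sigma_{t+1}) \]
we get
\[ J^\star = \sum_{t=0}^{N} \mathbf{Tr}(Q \Sigma_t) + \sum_{t=0}^{N} \mathbf{Tr}\, P_t (\Sigma_{t|t-1} - \Sigma_t) \]
using $\Sigma_{0|-1} = X$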
Page 14
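we can write this as
\[ J^\star = \sum_{t=0}^{N} \mathbf{Tr}((Q - P_t) \Sigma_t) + \sum_{t=1}^{N} \mathbf{Tr}(P_t (A \Sigma_{t-1} A^T + W)) + \mathbf{Tr}(P_0 X) \]
which simplifies to $J^\star = J_{\mathrm{lqr}} + J_{\mathrm{est}}$, where
\[ J_{\mathrm{lqr}} = \mathbf{Tr}(P_0 X) + \sum_{t=1}^{N} \mathbf{Tr}(P_t W) \]
\[ J_{\mathrm{est}} = \mathbf{Tr}((Q - P_0) \Sigma_0) + \sum_{t=1}^{N} \left( \mathbf{Tr}((Q - P_t) \Sigma_t) + \mathbf{Tr}(P_t A \Sigma_{t-1} A^T) \right) \]
$J_{\mathrm{lqr}}$ is the stochastic LQR cost, i.e., the optimal objective if you knew the state
$J_{\mathrm{est}}$ is the cost of not knowing (i.e., estimating) the state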
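Given the LQR matrices $P_0, \ldots, P_N$ and the filtered covariances $\Sigma_0, \ldots, \Sigma_N$, the two parts of the optimal cost, $J_{\mathrm{lqr}} = \mathbf{Tr}(P_0 X) + \sum_{t=1}^N \mathbf{Tr}(P_t W)$ and $J_{\mathrm{est}} = \mathbf{Tr}((Q - P_0)\Sigma_0) + \sum_{t=1}^N (\mathbf{Tr}((Q - P_t)\Sigma_t) + \mathbf{Tr}(P_t A \Sigma_{t-1} A^T))$, are straightforward to evaluate. A sketch assuming NumPy, with a hypothetical helper name and hand-computed scalar inputs:

```python
import numpy as np

def lqg_cost_split(A, Q, X, W, P, Sigma):
    """Return (J_lqr, J_est) from lists P[0..N] and Sigma[0..N]."""
    N = len(P) - 1
    # J_lqr = Tr(P_0 X) + sum_{t=1}^N Tr(P_t W)
    J_lqr = np.trace(P[0] @ X) + sum(np.trace(P[t] @ W) for t in range(1, N + 1))
    # J_est = Tr((Q - P_0) Sigma_0)
    #         + sum_{t=1}^N [ Tr((Q - P_t) Sigma_t) + Tr(P_t A Sigma_{t-1} A^T) ]
    J_est = np.trace((Q - P[0]) @ Sigma[0]) + sum(
        np.trace((Q - P[t]) @ Sigma[t]) + np.trace(P[t] @ A @ Sigma[t - 1] @ A.T)
        for t in range(1, N + 1)
    )
    return J_lqr, J_est

# Scalar example with N = 1 and A = B = C = Q = R = X = W = V = 1 (made-up data),
# for which P_0 = 1.5, P_1 = 1 and Sigma_0 = 0.5, Sigma_1 = 0.6.
one = np.array([[1.0]])
P = [np.array([[1.5]]), np.array([[1.0]])]
Sigma = [np.array([[0.5]]), np.array([[0.6]])]
J_lqr, J_est = lqg_cost_split(one, one, one, one, P, Sigma)
```

In this example $J_{\mathrm{lqr}} = 2.5$ and $J_{\mathrm{est}} = 0.25$, so $J^\star = 2.75$, which agrees with evaluating $\sum_t \mathbf{Tr}(Q\Sigma_t) + \sum_t \mathbf{Tr}\,P_t(\Sigma_{t|t-1} - \Sigma_t) = 1.1 + 0.75 + 0.9$ directly.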
Page 15
when state measurements are exact ($C = I$, $V = 0$), we have $\Sigma_t = 0$, so we get
\[ J^\star = J_{\mathrm{lqr}} = \mathbf{Tr}(P_0 X) + \sum_{t=1}^{N} \mathbf{Tr}(P_t W) \]
Page 16
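Infinite horizon LQG
choose policies to minimize infinite horizon average stage cost
\[ J = \lim_{N \to \infty} \frac{1}{N} \mathbf{E} \sum_{t=0}^{N-1} \left( x_t^T Q x_t + u_t^T R u_t \right) \]
optimal average stage cost is
\[ J^\star = \mathbf{Tr}(Q \Sigma) + \mathbf{Tr}\big( P (\hat\Sigma - \Sigma) \big) \]
where $P$ and $\hat\Sigma$ are PSD solutions of AREs
\[ P = Q + A^T P A - A^T P B (R + B^T P B)^{-1} B^T P A, \qquad \hat\Sigma = A \hat\Sigma A^T + W - A \hat\Sigma C^T (C \hat\Sigma C^T + V)^{-1} C \hat\Sigma A^T \]
and $\Sigma = \hat\Sigma - \hat\Sigma C^T (C \hat\Sigma C^T + V)^{-1} C \hat\Sigma$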
Page 17
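optimal average stage cost doesn't depend on $X$
(an) optimal policy is
\[ u_t = K \hat x_t, \qquad \hat x_{t+1} = A \hat x_t + B u_t + L \left( y_{t+1} - C(A \hat x_t + B u_t) \right) \]
where
\[ K = -(R + B^T P B)^{-1} B^T P A, \qquad L = \hat\Sigma C^T (C \hat\Sigma C^T + V)^{-1} \]
$K$ is the steady-state LQR feedback gain
$L$ is the steady-state Kalman filter gain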
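One simple way to obtain the steady-state quantities is to iterate the two Riccati recursions until they stop changing. The sketch below assumes NumPy and assumes the iterations converge (which needs the usual stabilizability and detectability conditions); the function name, the iteration count, and the scalar data are illustrative. Here `S` denotes the steady-state predicted covariance (the limit of $\Sigma_{t|t-1}$) and `Sigma` the steady-state filtered covariance.

```python
import numpy as np

def steady_state_lqg(A, B, C, Q, R, W, V, iters=2000):
    """Fixed-point iteration of the control and estimation Riccati recursions."""
    P = Q.copy()
    S = W.copy()
    for _ in range(iters):
        P = Q + A.T @ P @ A - A.T @ P @ B @ np.linalg.solve(
            R + B.T @ P @ B, B.T @ P @ A)
        S = A @ S @ A.T + W - A @ S @ C.T @ np.linalg.solve(
            C @ S @ C.T + V, C @ S @ A.T)
    K = -np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)     # LQR gain
    L = np.linalg.solve(C @ S @ C.T + V, C @ S).T          # Kalman gain S C^T (...)^{-1}
    Sigma = S - L @ C @ S                                  # filtered covariance
    J = np.trace(Q @ Sigma) + np.trace(P @ (S - Sigma))    # average stage cost
    return K, L, P, S, Sigma, J

# Scalar example (hypothetical data): every parameter equal to 1.
one = np.array([[1.0]])
K, L, P, S, Sigma, J = steady_state_lqg(one, one, one, one, one, one, one)
```

In this scalar case both Riccati equations reduce to $P^2 - P - 1 = 0$, so $P$ and the predicted covariance both equal the golden ratio $(1 + \sqrt{5})/2$, the gains are $K = -L \approx -0.618$, and the average cost is $\sqrt{5}$.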
Page 18
Example
system with $n = 5$ states, $m = 2$ inputs, $p = 3$ outputs; infinite horizon
$A$, $B$, $C$ chosen randomly; $A$ scaled so $\max_i |\lambda_i(A)| = 1$
… = 0…, … = 0…
we compare LQG with the case where state is known (stochastic LQR)
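An experiment of this kind can be set up in a few lines. The sketch below assumes NumPy; the weights `Q`, `R`, the noise covariances `W`, `V`, the random seed, and the run length are assumed values chosen for illustration, not the lecture's data, and the Riccati equations are solved by plain fixed-point iteration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, p = 5, 2, 3
A = rng.standard_normal((n, n))
A /= np.max(np.abs(np.linalg.eigvals(A)))      # scale so the spectral radius is 1
B = rng.standard_normal((n, m))
C = rng.standard_normal((p, n))
# Assumed problem data (illustrative, not taken from the lecture).
Q, R = np.eye(n), np.eye(m)
W, V = 0.5 * np.eye(n), 0.5 * np.eye(p)
Wh, Vh = np.linalg.cholesky(W), np.linalg.cholesky(V)

# Steady-state Riccati solutions by fixed-point iteration.
P, S = Q.copy(), W.copy()
for _ in range(2000):
    P = Q + A.T @ P @ A - A.T @ P @ B @ np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    S = A @ S @ A.T + W - A @ S @ C.T @ np.linalg.solve(C @ S @ C.T + V, C @ S @ A.T)
K = -np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)     # steady-state LQR gain
L = np.linalg.solve(C @ S @ C.T + V, C @ S).T          # steady-state Kalman gain

def average_cost(T, use_estimate):
    """Average stage cost over T steps; LQG if use_estimate, else state feedback."""
    x, xhat, total = np.zeros(n), np.zeros(n), 0.0
    for _ in range(T):
        u = K @ (xhat if use_estimate else x)
        total += x @ Q @ x + u @ R @ u
        x_next = A @ x + B @ u + Wh @ rng.standard_normal(n)
        y_next = C @ x_next + Vh @ rng.standard_normal(p)
        pred = A @ xhat + B @ u
        xhat = pred + L @ (y_next - C @ pred)          # Kalman filter update
        x = x_next
    return total / T

J_lqg = average_cost(5000, use_estimate=True)
J_lqr = average_cost(5000, use_estimate=False)
print(J_lqg, J_lqr)
```

The runs start from $x_0 = \hat x_0 = 0$ rather than from the steady-state distribution, so the averages include a short transient; with long runs the LQG average would be expected to exceed the state-feedback average by roughly the estimation cost.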
Page 19
Sample trajectories
sample traces in steady state
[Figure: two time-series panels, horizontal axis t from 0 to 50; blue: LQG, red: stochastic LQR]
Page 20
Cost histogram
histogram of stage costs for 5000 steps in steady state
[Figure: two histogram panels, horizontal axis stage cost (ticks 10 to 30), vertical axis count (ticks 100 to 500)]