OPTIMAL SEQUENTIAL VECTOR QUANTIZATION OF MARKOV SOURCES
VIVEK S. BORKAR, SANJOY K. MITTER, AND SEKHAR TATIKONDA

SIAM J. CONTROL OPTIM., Vol. 40, No. 1, pp. 135–148. © 2001 Society for Industrial and Applied Mathematics

Abstract. The problem of sequential vector quantization of a stationary Markov source is cast as an equivalent stochastic control problem with partial observations. This problem is analyzed using the techniques of dynamic programming, leading to a characterization of optimal encoding schemes.

Key words. optimal vector quantization, sequential source coding, Markov sources, control under partial observations, dynamic programming

AMS subject classifications. 94A29, 90E20, 90C39

PII. S0363012999365261

1. Introduction. In this paper, we consider the problem of optimal sequential vector quantization of stationary Markov sources. In the traditional rate distortion framework, the well-known result of Shannon shows that one can achieve entropy rates arbitrarily close to the rate distortion function for suitably long lossy block codes [9]. Unfortunately, long block codes imply long delays in communication systems. In particular, control applications require causal coding and decoding schemes.

These concerns are not new, and there is a sizable body of literature addressing these issues. We shall briefly mention a few key contributions. Witsenhausen [24] looked at the optimal finite horizon sequential quantization problem for finite state encoders and decoders. His encoder had a fixed number of levels. He showed that the optimal encoder for a $k$th order Markov source depends on at most the last $k$ symbols and the present state of the decoder's memory. Walrand and Varaiya [23] looked at the infinite horizon sequential quantization problem for sources with finite alphabets. Using Markov decision theory, they were able to show that the optimal encoder for a Markov source depends only on the current input and the current state of the decoder. Gaarder and Slepian [12] look at sequential quantization over classes of finite state encoders and decoders. Though they lay down several useful definitions, their results, by their own admission, are incomplete. Other related works include a neural network based scheme [17] and a study of optimality properties of codes in specific cases [3], [10]. Some abstract theoretical results are given in [19].

Received by the editors December 20, 1999; accepted for publication (in revised form) December 6, 2000; published electronically May 31, 2001. A preliminary version of this paper appeared in the Proceedings of the 1998 IEEE International Symposium on Information Theory, IEEE Information Theory Society, Piscataway, NJ, 1998, p. 71.
http://www.siam.org/journals/sicon/40-1/36526.html

School of Technology and Computer Science, Tata Institute of Fundamental Research, Homi Bhabha Road, Mumbai 400005, India (borkar@tifr.res.in). This work was done while this author was visiting the Laboratory for Information and Decision Systems, Massachusetts Institute of Technology. The research of this author was supported by a Homi Bhabha fellowship, NSF KDI: Learning, Adaptation, and Layered Intelligent Systems grant 6756500, and grant 5(12)/96-ET of the Department of Science and Technology, Government of India.

Laboratory for Information and Decision Systems, Massachusetts Institute of Technology, Room 35-403, Cambridge, MA 02139 (mitter@lids.mit.edu). The research of this author was supported by NSF KDI: Learning, Adaptation, and Layered Intelligent Systems grant ECS-9873451.

University of California at Berkeley, Soda Hall, Room 485, Berkeley, CA 94720 (tatikond@eecs.berkeley.edu). The research of this author was supported by U.S. Army grant DAAL03-92-G-0115.
A formulation similar in spirit to ours (insofar as it aims to minimize a "Lagrangian distortion measure" described below) is studied in [7], [8]. They show empirically that one can make gains in performance by entropy coding the codewords. In [7] the entropy constrained vector quantization problem for a block is formulated and a Max–Lloyd-type algorithm is introduced. In [8] they introduce the conditional entropy constrained vector quantization problem and show that one should use conditional entropy coders when the codewords are not independent from block to block. In these papers there is more emphasis on synthesizing algorithms and less emphasis on proving rigorously the optimality of the schemes proposed.

Along with this work there is a large literature on differential predictive coding, where one encodes the innovation. Other than the Gauss–Markov case, though, it is not apparent how one may prove the optimality of such innovation coding schemes.

Herein we emphasize, through the dynamic programming formulation, the optimality properties of the sequential quantization scheme. This leads the way for the application of many powerful approximate dynamic programming tools.

In this paper we do not impose a fixed number of levels on the quantizer. The aim is to somehow jointly optimize the entropy rate of the quantized process (in order to obtain a better compression rate) as well as a suitable distortion measure. The traditional rate distortion framework calls for the minimization of the former with a hard constraint on the latter. We shall, however, consider the analytically more tractable Lagrangian distortion measure of [7], [8], which is a weighted combination of the two.

We approach the problem from a stochastic control viewpoint, treating the choice of the sequential quantizer as a control choice. The correct "state space" then turns out to be the space of conditional laws of the underlying process given the quantizer outputs, these conditional laws serving as the "state" or "sufficient statistics." The "state dynamics" is then given by the appropriate nonlinear filter. While this is very reminiscent of the finite state quantizers studied, e.g., in [16], the state space here is not finite, and the state process has the familiar stochastic control interpretation as the output of a nonlinear filter. We then consider the "separated" or "certainty equivalent" control problem of controlling this nonlinear filter so as to minimize an appropriately transformed Lagrangian distortion measure. This problem can be analyzed in the traditional dynamic programming framework. This in turn can be made a basis for computational schemes for near-optimal code design.

To summarize, the main contributions of this paper are as follows.
(i) We formulate a stochastic control problem equivalent to the optimal vector quantization problem. In the process, we make precise the passage from the source output to its encoded version in a manner that ensures the well-posedness of the control problem.
(ii) We underscore the crucial role of the process of conditional laws of the source given the quantized process as the correct "sufficient statistics" for the problem.
(iii) We analyze the equivalent control problem using the methodology of Markov decision theory. This opens up the possibility of using the computational machinery of Markov decision theory for code design.

Specifically, we consider a pair of "state" process $\{X_n\}$ and an associated "observation" process $\{Y_n\}$, given by the dynamics
$$X_{n+1} = g(X_n, \xi_n), \qquad Y_{n+1} = h(X_n, \xi'_n),$$
where $\{\xi_n\}$, $\{\xi'_n\}$ are independently and identically distributed (i.i.d.) driving noise processes. We quantize $Y_{n+1}$ into its quantized version $q_{n+1}$ that has a finite range and is selected based on the "history" $q^n = [q_0, q_1, \ldots, q_n]$. The aim then is to minimize the long run average of the Lagrangian distortion measure
$$R_n = E\big[H(q_{n+1}/q^n) + \lambda \|Y_n - \bar q_n\|^2\big],$$
where $\lambda > 0$ is a prescribed constant, $H(\cdot/\cdot)$ is the conditional entropy, and $\bar q_n$ is the best estimate of $Y_n$ given $q^n$. Let $\pi_n$ be the regular conditional law of $X_n$ given $q^n$ for $n \ge 0$. From $\pi_n$ one can easily derive the regular conditional law of $Y_{n+1}$ given $q^n$. Using Bayes's rule, $\{\pi_n\}$ can be evaluated recursively by a nonlinear filter. Furthermore, one can express $R_n$ as the expected value of a function of $\pi_n$ and a "control" process $Q_n$ alone. ($Q_n$ is, in fact, the finite set depicting the range of the vector quantization of $Y_{n+1}$ prior to its encoding into a fixed finite alphabet.) This allows us to consider the equivalent problem of controlling $\{\pi_n\}$ with the aim of minimizing the long run average of the $R_n$ recast as above.

This then fits the framework of traditional Markov decision theory and can be approached by dynamic programming. As usual, one has to derive the dynamic programming equations for the average cost control problem by a "vanishing discount" argument applied to the associated infinite horizon discounted control problem, for which the dynamic programming equation is easier to justify.

The structure of the paper is as follows. In section 2, we describe the sequential quantization problem and introduce the formalism. Section 3 derives the equivalent control problem. This is analyzed in section 4 using the formalism of Markov decision theory.

2. Sequential quantization. This section formulates the sequential vector quantization problem. In particular, it describes the passage from the observation process to its quantized version, which in turn gets mapped into its encoding with respect to a fixed alphabet. We also lay down our key assumptions which, apart from making the coding scheme robust, also make its subsequent control formulation well-posed. The section concludes with a precise statement of this "long run average cost" control problem with partial observations that is equivalent to our original vector quantization problem.

Throughout, for a Polish (i.e., complete separable metric) space $S$, $\mathcal{P}(S)$ will denote the Polish space of probability measures on $S$ with the Prohorov topology [6, Chapter 2]. For a random process $\{Z_m\}$, set $Z^n = \{Z_m, 0 \le m \le n\}$, its past up to time $n$. Finally, $K$ will denote a finite positive constant, depending on the context.

Let $\{X_n\}$ be an ergodic Markov process taking values in $\mathbb{R}^s$, $s \ge 1$, with an associated "observation" process $\{Y_n\}$ taking values in $\mathbb{R}^d$, $d \ge 1$. (Thus $\{Y_n\}$ is the actual process being observed.) Their joint evolution is governed by a transition kernel $x \mapsto p(x, dz, dy)$, as described below. We assume this map to be continuous and, further, that
$$p(x, dz, dy) = \varphi(y, z \mid x)\, dz\, dy$$
for a density $\varphi(\cdot, \cdot \mid \cdot)$ that is continuous and strictly positive, and furthermore, $\varphi(y, z \mid \cdot)$ is Lipschitz uniformly in $y, z$. The evolution law is as follows. For $A \subset \mathbb{R}^s$, $B \subset \mathbb{R}^d$ Borel,
$$P(X_{n+1} \in A,\ Y_{n+1} \in B / X^n, Y^n) = \int_A \int_B p(X_n, dz, dy) = \int_A \int_B \varphi(y, z \mid X_n)\, dy\, dz.$$
Following [13], we call the pair $(\{X_n\}, \{Y_n\})$ a Markov source, though the terminology "hidden Markov model" is more common nowadays. We impose on $\{X_n\}$ the condition of "asymptotic flatness" described next. We assume that these processes are given recursively by the dynamics
$$X_{n+1} = g(X_n, \xi_n), \tag{2.1}$$
$$Y_{n+1} = h(X_n, \xi'_n), \tag{2.2}$$
where $\{\xi_n\}$, $\{\xi'_n\}$ are i.i.d. $\mathbb{R}^m$-valued (say) random variables independent of each other and of $X_0$, and $g$, $h$ are prescribed measurable maps satisfying
$$\|g(x, u)\|,\ \|h(x, u)\| \le K(1 + \|x\|) \quad \text{for all } u.$$
Equations (2.1) and (2.2) and the laws of $\xi_n$, $\xi'_n$ completely specify $p(x, dz, dy)$, and therefore the conditions we impose on the latter will implicitly restrict the choice of the former. Let $\{X_n(x)\}$, $\{X_n(y)\}$ denote the solutions to (2.1), (2.2) for $X_0 = x, y$, respectively, with the same driving noises $\{\xi_n\}$. The assumption of asymptotic flatness then is that there exist $K > 0$, $0 < \beta < 1$, such that
$$E[\|X_n(x) - X_n(y)\|] \le K \beta^n \|x - y\|, \quad n \ge 0.$$
A simple example would be the case when, for all $u$, the map $g(\cdot, u)$ is a contraction with respect to some equivalent norm on $\mathbb{R}^s$. This covers, e.g., the usual linear quadratic Gaussian (LQG) case when the state process is stable. Another example would be a discretization of the continuous time asymptotically flat processes considered in [1], where a Lyapunov-type sufficient condition for asymptotic flatness is given. This assumption, one must add, is not required for our formulation of the optimization problem per se but will play a key role in our derivation of the dynamic programming equations in section 4.

Let $\Sigma = \{a_1, \ldots, a_N\}$ be an ordered set that will serve as the alphabet for our vector quantizer. Let $\{q_n\}$ denote the $\Sigma$-valued process that stands for the "vector quantized" version of $\{Y_n\}$. The passage from $\{Y_n\}$ to $\{q_n\}$ is described below.

Let $\mathcal{D}$ denote the set of finite nonempty subsets of $\mathbb{R}^d$ with cardinality at most $N \ge 1$, satisfying the following condition (†). There exist $M > 0$ ("large") and $\Delta > 0$ ("small") such that
(i) $x \in A \in \mathcal{D}$ implies $\|x\| \le M$,
(ii) for $x = [x_1, \ldots, x_d]$, $y = [y_1, \ldots, y_d] \in A$, $x \ne y$ implies $|x_i - y_i| > \Delta$ for all $i$.
We endow $\mathcal{D}$ with the Hausdorff metric, which renders it a compact Polish space. For $A \in \mathcal{D}$, let $l_A : \mathbb{R}^d \to A$ denote the map that maps $x$ to the element of $A$ nearest to it with reference to the Euclidean norm $\|\cdot\|$, any tie being resolved according to some fixed priority rule. Let $i_A : A \to \Sigma$ denote the map that first orders the elements of $A$ lexicographically and then maps them to $\Sigma$ preserving the order. Let $\Sigma^\infty = \Sigma \times \Sigma \times \cdots$ (i.e., the one-sided countably infinite product; analogous notation will be used elsewhere). At each time $n$, a measurable map $\eta_n : \Sigma^{n+1} \to \mathcal{D}$ is chosen. With $Q_n = \eta_n(q^n)$, one sets
$$q_{n+1} = i_{Q_n}(l_{Q_n}(Y_{n+1})).$$
This defines $\{q_n\}$ recursively as the quantized process that is to be encoded and transmitted across a communication channel.
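As a hedged illustration of the two maps just described (nearest-element quantization followed by an order-preserving labeling), the following Python sketch works on a small codebook stored as the rows of an array. The function names, the use of NumPy, the integer alphabet {0, ..., N-1}, and the lowest-row-index tie-breaking rule are assumptions made for the example and do not come from the paper.

```python
import numpy as np

def nearest_codeword(codebook, y):
    """Quantization map: row index of the codeword nearest to y (Euclidean norm).
    Ties go to the lowest row index (an assumed fixed priority rule)."""
    dists = np.linalg.norm(codebook - np.asarray(y, dtype=float), axis=1)
    return int(np.argmin(dists))

def lexicographic_order(codebook):
    """Row indices of the codebook sorted lexicographically by coordinates."""
    return sorted(range(len(codebook)), key=lambda i: tuple(codebook[i]))

def encode(codebook, y):
    """Labeling map applied after quantization: rank of the nearest codeword
    in the lexicographic order, used as the transmitted symbol."""
    row = nearest_codeword(codebook, y)
    return lexicographic_order(codebook).index(row)

def decode(codebook, symbol):
    """Inverse of the labeling map: recover the codeword from its symbol."""
    return codebook[lexicographic_order(codebook)[symbol]]
```

Because the decoder can recompute the lexicographic order from the codebook alone, the symbol identifies the codeword uniquely even when the codebook changes from one time step to the next.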
The explanation of this scheme is as follows. In the case of a fixed quantizer, the finite subset of $\mathbb{R}^d$ to which the signal gets mapped can itself be identified with the alphabet $\Sigma$. In our case, however, this set will vary from one instant to another and therefore must be mapped to a fixed alphabet $\Sigma$ in a uniquely invertible manner. This is achieved through the map $i_A$. Assuming that the receiver knows ahead of time the deterministic maps $\{\eta_n\}$ (later on we argue that a single fixed map will suffice), she can reconstruct $Q_n$ as $\eta_n(q^n)$ on having received $q^n$ by time $n$. In turn, she can reconstruct $i_{Q_n}^{-1}(q_{n+1}) = l_{Q_n}(Y_{n+1})$ as the vector quantized version of $Y_{n+1}$. The main contribution of the condition (†) is to render the encoding map continuous. Not only does this make sense from the point of view of robust decoding, but it also makes the control problem we formulate later well-posed.

As mentioned in the introduction, our aim will be to jointly optimize over the choice of $\{\eta_n\}$ the average entropy rate of $\{q_n\}$ (that is, the average code length if the encoding is done optimally) and the average distortion. The conventional rate distortion theoretic formulation would be to minimize the average entropy rate
$$\limsup_{n \to \infty} \frac{1}{n} \sum_{m=0}^{n-1} E[H(q_{m+1}/q^m)]$$
($H(\cdot/\cdot)$ being the (conditional) Shannon entropy) subject to a hard constraint on the distortion
$$\limsup_{n \to \infty} \frac{1}{n} \sum_{m=0}^{n-1} E[\|Y_m - \bar q_m\|^2] \le K,$$
where $\bar q_m$ denotes the vector quantized version of $Y_m$. We shall, however, consider the simpler problem of minimizing the Lagrangian distortion measure
$$\limsup_{n \to \infty} \frac{1}{n} \sum_{m=0}^{n-1} E\big[H(q_{m+1}/q^m) + \lambda \|Y_m - \bar q_m\|^2\big], \tag{2.3}$$
where $\lambda > 0$ is a prescribed constant.
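To make the trade-off in a Lagrangian cost of this kind concrete, here is a minimal Python sketch that evaluates "entropy plus $\lambda$ times squared error" on a batch of scalar samples for a fixed codebook. It is a simplification under stated assumptions: the source is scalar, the codebook does not change over time, and the plain empirical entropy of the output symbols stands in for the conditional entropy $H(q_{m+1}/q^m)$, which is accurate only when the symbols are independent. The function name and arguments are hypothetical.

```python
import math
from collections import Counter

def lagrangian_cost(samples, codebook, lam):
    """Empirical output entropy (bits) plus lam times mean squared error
    for a fixed scalar nearest-neighbor quantizer."""
    # Index of the nearest codeword for each sample (ties: lowest index).
    labels = [min(range(len(codebook)), key=lambda i: (y - codebook[i]) ** 2)
              for y in samples]
    n = len(samples)
    counts = Counter(labels)
    entropy = sum(-(c / n) * math.log2(c / n) for c in counts.values())
    distortion = sum((y - codebook[l]) ** 2 for y, l in zip(samples, labels)) / n
    return entropy + lam * distortion
```

With samples `[0.0, 0.0, 1.0, 1.0]` and `lam = 2.0`, the two-level codebook `[0.0, 1.0]` pays one bit of entropy and no distortion, while the one-level codebook `[0.5]` pays no entropy and a distortion of 0.25; which of the two is cheaper depends on `lam`.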

One may think of $\lambda$ as a Lagrange multiplier, though, strictly speaking, such an interpretation is lacking given our arbitrary choice thereof.

3. Reduction to the control problem. This section derives the "completely observed" optimal stochastic control problem equivalent to the optimal vector quantization problem described above. In this, we follow the usual "separation" idea of stochastic control by identifying the regular conditional law of the state given past observations (in our case, past encodings of the actual observations) as the new state process for the completely observed control problem. The original cost function is rewritten in an equivalent form that displays it as a function of the new state and control processes alone. Under the assumptions of the previous section on the permissible vector quantization schemes (as reflected in our definition of the set $\mathcal{D}$ of admissible quantizer ranges), the above controlled Markov process is shown to have a transition kernel continuous in the initial state and control. Finally, a relaxation of this control problem is outlined, which allows for a larger class of controls. This is purely a technical convenience required for the proofs of the next section and does not affect our control problem in any essential manner.
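The separation idea outlined above, with the conditional law of the state given past encodings serving as the new state, can be sketched for a toy model. The following Python example is only an illustration under loud assumptions that are not in the paper: the state space is finite, the observation is scalar Gaussian with a mean that depends on the current state, the state and observation noises are independent, and the quantizer cells are the intervals between midpoints of a sorted scalar codebook. All names are hypothetical.

```python
import math
import numpy as np

def normal_cdf(t):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def cell_probabilities(codebook, means, sigma):
    """c[x, a] = P(observation falls in the a-th quantizer cell | state x),
    for an observation distributed as N(means[x], sigma^2)."""
    pts = np.sort(np.asarray(codebook, dtype=float))
    edges = np.concatenate(([-np.inf], (pts[:-1] + pts[1:]) / 2.0, [np.inf]))
    c = np.empty((len(means), len(pts)))
    for x, m in enumerate(means):
        cdf = np.array([normal_cdf((e - m) / sigma) for e in edges])
        c[x] = np.diff(cdf)
    return c

def filter_step(pi, P, c, a):
    """One Bayes update of the conditional law: reweight by the likelihood
    of the received symbol a, normalize, then push through the transition
    matrix P[x, z] = P(next state z | state x)."""
    weighted = pi * c[:, a]
    return (weighted @ P) / weighted.sum()

def symbol_entropy(pi, c):
    """Entropy (bits) of the next symbol given the current conditional law."""
    h = pi @ c
    h = h[h > 0]
    return float(-(h * np.log2(h)).sum())
```

In this toy setting the pair (conditional law, codebook) determines both the law of the next symbol and the next conditional law, which is what allows the cost to be written in terms of these two quantities alone.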
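Let $\pi_n(dx)$ denote the conditional law of $X_n$ given $q^n$, $n \ge 0$. A standard application of the Bayes rule shows that $\{\pi_n\}$ is given recursively by the nonlinear filter
$$\pi_{n+1}(dz) = \frac{\Big(\int \pi_n(dx) \int I\{i_{Q_n}(l_{Q_n}(y)) = q_{n+1}\}\, \varphi(y, z \mid x)\, dy\Big)\, dz}{\int \pi_n(dx) \int\!\!\int I\{i_{Q_n}(l_{Q_n}(y)) = q_{n+1}\}\, \varphi(y, z' \mid x)\, dy\, dz'}. \tag{3.1}$$
By (†), $l_A^{-1}(i_A^{-1}(a))$ contains an open subset of $\mathbb{R}^d$ for any $a$, $A$. Given this fact and the condition that $\varphi(\cdot, \cdot \mid \cdot) > 0$, it follows that the denominator above is strictly positive, and hence the ratio is well defined. The initial condition for the recursion (3.1) is $\pi_0$, the conditional law of $X_0$ given $q_0$. We assume the initial quantizer to be the trivial quantizer, i.e., $q_0 \equiv 0$, say, so that $\pi_0$ is the law of $X_0$. Thus defined, $\{\pi_n\}$ can be viewed as a $\mathcal{P}(\mathbb{R}^s)$-valued controlled Markov process with a $\mathcal{D}$-valued "control" process $\{Q_n\}$.

To complete the description of the control problem, we need to define our cost (2.3) in terms of $\{\pi_n\}$, $\{Q_n\}$. For this purpose, let $\varphi(y \mid x) = \int \varphi(y, z \mid x)\, dz$ for all $x, y$. Note that for $a \in \Sigma$,
$$P(q_{n+1} = a / q^n) = \int \pi_n(dx) \int I\{i_{Q_n}(l_{Q_n}(y)) = a\}\, \varphi(y \mid x)\, dy = h_a(\pi_n, Q_n),$$
where $h_a$ is defined by
$$h_a(\pi, A) = \int \pi(dx)\, f_a(x, A) \quad \text{with} \quad f_a(x, A) = \int I\{i_A(l_A(y)) = a\}\, \varphi(y \mid x)\, dy.$$
Also define
$$f(x, A) = \int \|y - l_A(y)\|^2\, \varphi(y \mid x)\, dy, \qquad k(\pi, A) = -\sum_a h_a(\pi, A) \log h_a(\pi, A), \qquad r(\pi, A) = \int \pi(dx)\, f(x, A),$$
where the logarithm is to the base 2. We assume $f(\cdot, A)$, $f_a(\cdot, A)$ to be Lipschitz uniformly in $a$, $A$. This would be implied in particular by the condition that $\varphi(y \mid \cdot)$ be Lipschitz uniformly in $y$. Now (2.3) can be rewritten as
$$\limsup_{n \to \infty} \frac{1}{n} \sum_{m=0}^{n-1} E[k(\pi_m, Q_m) + \lambda r(\pi_m, Q_m)]. \tag{3.2}$$
Strictly speaking, we should consider the problem of controlling $\{\pi_n\}$ given by (3.1) so as to minimize the cost (3.2). We shall, however, introduce some further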
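simplifications, thereby replacing (3.2) by an approximation of the same. Let $\epsilon > 0$ be a small positive constant. For $n \ge 1$, let $\mathcal{P}_n^\epsilon$ denote the simplex of probability vectors in $\mathbb{R}^n$ which have each component bounded from below by $\epsilon$. That is, $\mathcal{P}_n^\epsilon = \{x : x_i \in [\epsilon, 1]\ \forall i,\ \sum_i x_i = 1\}$. Similarly, let $\mathcal{P}_n = \{x : x_i \in [0, 1]\ \forall i,\ \sum_i x_i = 1\}$ denote the entire simplex of probability vectors in $\mathbb{R}^n$. Let $\Pi_n^\epsilon : \mathcal{P}_n \to \mathcal{P}_n^\epsilon$ denote the projection map. Let $h(\pi, A) = [h_a(\pi, A)]_a$ denote the vector of symbol probabilities and $h^\epsilon(\pi, A) = [h_a^\epsilon(\pi, A)]_a$ its image under the projection map. Note that
$$|\log h_a^\epsilon(\pi, A)| \le -\log \epsilon \quad \forall\, a, A. \tag{3.3}$$
Finally, let $k^\epsilon(\pi, A) = -\sum_a h_a^\epsilon(\pi, A) \log h_a^\epsilon(\pi, A)$. The control problem we consider is that of controlling $\{\pi_n\}$ so as to minimize the cost
$$\limsup_{n \to \infty} \frac{1}{n} \sum_{m=0}^{n-1} E[k^\epsilon(\pi_m, Q_m) + \lambda r(\pi_m, Q_m)]. \tag{3.4}$$
Replacing $k$ by $k^\epsilon$ is purely a technical convenience to suit the needs of the developments to come in section 4. We believe that it should be possible to obtain the same results directly for (3.2), though possibly at the expense of considerable additional technical overhead.

We shall analyze this problem using techniques of Markov decision processes. With this in mind, call $\{Q_n\}$ a stationary control policy if $Q_n = v(\pi_n)$ for all $n$ for a measurable $v : \mathcal{P}(\mathbb{R}^s) \to \mathcal{D}$. The map $v(\cdot)$ itself may be referred to as the stationary control policy by a standard abuse of notation. Let $(\pi, A) \mapsto p(\pi, A, d\pi')$ denote the transition kernel of the controlled Markov process $\{\pi_n\}$.

Lemma 3.1. The map $(\pi, A) \mapsto p(\pi, A, d\pi')$ is continuous.

Proof. It suffices to check that for $f \in C_b(\mathcal{P}(\mathbb{R}^s))$ the map $(\pi, A) \mapsto \int f(\pi')\, p(\pi, A, d\pi')$ is continuous. Let $(\pi_n, A_n) \to (\pi_\infty, A_\infty)$ in $\mathcal{P}(\mathbb{R}^s) \times \mathcal{D}$. Then $\{\pi_n\}$ are tight, and therefore, for any $\epsilon > 0$, we can find a compact $S_\epsilon \subset \mathbb{R}^s$ such that $\pi_n(S_\epsilon) > 1 - \epsilon$ for $n = 1, 2, \ldots, \infty$. Fix $\epsilon$ and $S_\epsilon$. By the Stone–Weierstrass theorem, any $f \in C_b(\mathcal{P}(S_\epsilon))$ can be approximated uniformly on $\mathcal{P}(S_\epsilon)$ by $\hat f$ of the form
$$\hat f(\mu) = F\Big(\int f_1\, d\mu, \ldots, \int f_l\, d\mu\Big)$$
for some $l \ge 1$, $f_1, \ldots, f_l \in C(S_\epsilon)$, and $F \in C_b(\mathbb{R}^l)$. Then
$$\Big|\int f(\pi')\, p(\pi_n, A_n, d\pi') - \int f(\pi')\, p(\pi_\infty, A_\infty, d\pi')\Big| \le K\epsilon + K \sup |f - \hat f| + \Big|\int \hat f(\pi')\, p(\pi_n, A_n, d\pi') - \int \hat f(\pi')\, p(\pi_\infty, A_\infty, d\pi')\Big|. \tag{3.5}$$
Let
$$\nu_{ai}(\pi, A) = \int \pi(dx) \int\!\!\int f_i(z)\, I\{i_A(l_A(y)) = a\}\, \varphi(y, z \mid x)\, dy\, dz$$
for $1 \le i \le l$. Direct verification leads to
$$\int \hat f(\pi')\, p(\pi, A, d\pi') = \sum_a h_a(\pi, A)\, F\Big(\frac{\nu_{a1}(\pi, A)}{h_a(\pi, A)}, \ldots, \frac{\nu_{al}(\pi, A)}{h_a(\pi, A)}\Big). \tag{3.6}$$
Note that for all $a$,
$$I\{i_{A_n}(l_{A_n}(y)) = a\} \to I\{i_{A_\infty}(l_{A_\infty}(y)) = a\}$$
almost everywhere (a.e.), because this convergence fails only on the boundaries of the regions $l_{A_\infty}^{-1}(i_{A_\infty}^{-1}(a))$, which have zero Lebesgue measure. (These are the so-called Voronoi regions in the vector quantization literature, viz., sets in the partition generated by the quantizer $l_{A_\infty}(\cdot)$.) Therefore, if $x_n \to x_\infty$ in $\mathbb{R}^s$, then for all $a$ the integrands $I\{i_{A_n}(l_{A_n}(y)) = a\}\, \varphi(y \mid x_n)$ converge a.e. to $I\{i_{A_\infty}(l_{A_\infty}(y)) = a\}\, \varphi(y \mid x_\infty)$. By Scheffé's theorem [6, p. 26], $\varphi(y \mid x_n)\, dy \to \varphi(y \mid x_\infty)\, dy$ in total variation. Hence for any $a$,
$$\int I\{i_{A_n}(l_{A_n}(y)) = a\}\, \varphi(y \mid x_n)\, dy \to \int I\{i_{A_\infty}(l_{A_\infty}(y)) = a\}\, \varphi(y \mid x_\infty)\, dy.$$
That is, the map $(x, A) \mapsto \int I\{i_A(l_A(y)) = a\}\, \varphi(y \mid x)\, dy$ is continuous. It is clearly bounded. The continuity of $\nu_{ai}$ follows. That of $h_a$ follows similarly. The continuity of the sum in (3.6) then follows by one more application of Scheffé's theorem. Thus the last term on the right-hand side (RHS) of (3.5) tends to zero as $n \to \infty$. Since $\epsilon$ was arbitrary and the second term on the RHS of (3.5) can be made arbitrarily small by a suitable choice of $\hat f$, the claim follows.

We conclude this section with a description of a certain relaxation of this control problem wherein we permit a larger class of control policies, the so-called wide sense admissible controls used in [11]. Let $(\Omega, \mathcal{F}, P)$ denote the underlying probability space, where, without loss of generality, we may suppose that $\mathcal{F} = \vee_n \mathcal{F}_n$ for $\mathcal{F}_n = \sigma(X_m, Y_m, q_m, Q_m, m \le n)$, $n \ge 0$. Define a new probability measure $P_0$ on $(\Omega, \mathcal{F})$ as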
follows. Let $\psi_{n+1}$ denote the regular conditional law of $q_{n+1}$ given […] for $n \ge 0$. (Thus we are now allowing for a randomized choice of $\{Q_n\}$, i.e., $Q_n$ is not necessarily a deterministic function of the past encodings.) Let $\Gamma$ be any fixed probability measure on $\Sigma$ with full support. If, for $n \ge 0$, $P_n$, $P_{0n}$ denote the restrictions of $P$, $P_0$ to $(\Omega, \mathcal{F}_n)$, respectively, then $P_n \ll P_{0n}$ with
$$\frac{dP_n}{dP_{0n}} = \prod_{m=0}^{n-1} \frac{\psi_{m+1}(\cdot)(q_{m+1})}{\Gamma(q_{m+1})}.$$
Then, under $P_0$, $\{q_n\}$ are independent of $\{X_n\}$ and are i.i.d. with law $\Gamma$. We say that $\{Q_n\}$ is a wide sense admissible control if under $P_0$, $(q_{n+1}, q_{n+2}, \ldots)$ is independent of […] for $n \ge 0$. Note that this includes $\{Q_n\}$ of the type $Q_n = \eta_n(q^n)$ for suitable maps $\{\eta_n\}$. It should be kept in mind that this allows explicit randomization in the choice of $\{Q_n\}$, whence the entropy rate expression in (3.2) or (3.4) is no longer valid. Nevertheless, we continue with wide sense admissible controls in the context of (3.1)–(3.4) because, for us, this is strictly a temporary technical device to facilitate proofs. The dynamic programming formulation that we shall finally arrive at in section 4 will permit us to return without any loss of generality to the apparently more restrictive class of $\{Q_n\}$ we started out with.

4. The vanishing discount limit. This section derives the dynamic programming equations for the equivalent "separated" control problem by extending the traditional "vanishing discount" argument to the present setup. Deriving the dynamic programming equations for the long run average cost control of the separated control problem has been an outstanding open problem in the general case. We solve it here by using in a crucial manner the asymptotic flatness assumption introduced earlier. It should be noted that this assumption was not required at all in the development thus far and is included purely for facilitating the vanishing discount limit argument that follows. In particular, it could be dispensed with altogether were we to consider the finite horizon or infinite horizon discounted cost. For an alternative set of conditions (also strong) under which the dynamic programming equations for the average cost control under partial observations have been derived, see [21].

Our first step will be to modify the construction at the end of section 3 so as to construct on a common probability space two controlled nonlinear filters with a common control process but differing in their initial condition. This allows us to compare discounted cost value functions for two different initial laws. In turn, this allows us to show that their difference, with one of the initial laws fixed arbitrarily, remains bounded and equicontinuous with respect to a certain complete metric on the space of probability measures, as the discount factor approaches unity. (This is where one uses the condition of asymptotic flatness.) The rest of the derivation mimics the classical arguments in this field.

For $\alpha \in (0, 1)$, consider the discounted control problem of minimizing
$$J_\alpha(\pi_0, \{Q_n\}) = E\Big[\sum_{m=0}^{\infty} \alpha^m \big(k^\epsilon(\pi_m, Q_m) + \lambda r(\pi_m, Q_m)\big)\Big] \tag{4.1}$$
over the set of all wide sense admissible controls, with the prescribed $\pi_0$. Define the associated value function
$$V_\alpha(\pi_0) = \inf J_\alpha(\pi_0, \{Q_n\}),$$
the infimum being over wide sense admissible controls.
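Standard dynamic programming arguments show that $V_\alpha(\cdot)$ satisfies
$$V_\alpha(\pi) = \min_A \Big[k^\epsilon(\pi, A) + \lambda r(\pi, A) + \alpha \int p(\pi, A, d\pi')\, V_\alpha(\pi')\Big] \tag{4.2}$$
for $\pi \in \mathcal{P}(\mathbb{R}^s)$. We shall arrive at the dynamic programming equation for our original problem by taking a "vanishing discount" limit of a variant of (4.2). For this purpose, we need to compare $V_\alpha(\cdot)$ for two distinct values of its argument. In order to do so, we first set up a framework for comparing (4.1) for two choices of $\pi_0$, but with a "common" wide sense admissible control $\{Q_n\}$. This will be done by modifying the construction at the end of the preceding section.

Let $(\Omega, \mathcal{F}, P_0)$ be a probability space on which we have
(i) $\mathbb{R}^s$-valued, possibly dependent random variables $X_0$, $X_0'$ with laws $\pi_0$, $\pi_0'$, respectively;
(ii) $\mathbb{R}^m$-valued i.i.d. random processes $\{\xi_n\}$, $\{\xi_n'\}$, independent of each other and of $[X_0, X_0']$, with laws as in (2.1), (2.2); and
(iii) $\Sigma$-valued i.i.d. random sequences $\{q_n\}$, $\{q_n'\}$ with law $\Gamma$.
Also defined on $(\Omega, \mathcal{F}, P_0)$ is a $\mathcal{D}$-valued process $\{Q_n\}$ independent of $([X_0, X_0'], \{\xi_n\}, \{\xi_n'\})$ and satisfying the following. For $n \ge 0$, $(q_{n+1}, q_{n+1}', q_{n+2}, q_{n+2}', \ldots)$ is independent of […]. Let $\{(X_n, Y_n)\}$, $\{(X_n', Y_n')\}$ be solutions to (2.1), (2.2) with $X_0$, $X_0'$ as above. Without loss of generality, we may suppose that $\mathcal{F} = \vee_n \mathcal{F}_n$ with $\mathcal{F}_n$ generated by all of these processes up to time $n$. Define a new probability measure $P$ on $(\Omega, \mathcal{F})$ as follows. If $P_n$, $P_{0n}$ denote the restrictions of $P$, $P_0$, respectively, to $(\Omega, \mathcal{F}_n)$, then $P_n \ll P_{0n}$ with
$$\frac{dP_n}{dP_{0n}} = \prod_{m=0}^{n-1} \frac{\psi_{m+1}(\cdot)(q_{m+1})\, \psi_{m+1}'(\cdot)(q_{m+1}')}{\Gamma(q_{m+1})\, \Gamma(q_{m+1}')},$$
where the $\psi_{m+1}$ (respectively, $\psi_{m+1}'$) are the regular conditional laws of $q_{m+1}$ given […] (respectively, of $q_{m+1}'$ given […]) for $m \ge 0$.

What this construction achieves is the identification of each wide sense admissible control for initial law $\pi_0$ with one wide sense admissible control for $\pi_0'$. (This identification can be many-one.) By a symmetric argument that interchanges the roles of $\pi_0$ and $\pi_0'$, we can identify each wide sense admissible control for $\pi_0'$ with one for $\pi_0$. Now suppose that $V_\alpha(\pi_0) \le V_\alpha(\pi_0')$. Then for a wide sense admissible control $\{Q_n\}$ that is optimal for $\pi_0$ (existence of this follows by standard dynamic programming arguments), we have
$$|V_\alpha(\pi_0) - V_\alpha(\pi_0')| = V_\alpha(\pi_0') - V_\alpha(\pi_0) \le J_\alpha(\pi_0', \{Q_n\}) - J_\alpha(\pi_0, \{Q_n\}) \le \sup |J_\alpha(\pi_0, \{Q_n\}) - J_\alpha(\pi_0', \{Q_n\})|,$$
where we use the above identification. If $V_\alpha(\pi_0) > V_\alpha(\pi_0')$, a symmetric argument applies. Thus we have proved the following lemma.

Lemma 4.1. $|V_\alpha(\pi_0) - V_\alpha(\pi_0')| \le \sup |J_\alpha(\pi_0, \{Q_n\}) - J_\alpha(\pi_0', \{Q_n\})|$, the supremum being over wide sense admissible controls.

Next, let $\mathcal{P}_1(\mathbb{R}^s) = \{\mu \in \mathcal{P}(\mathbb{R}^s) : \int \|x\|\, \mu(dx) < \infty\}$, topologized by the (complete) Wasserstein metric [20]
$$\rho(\mu, \mu') = \inf E[\|X - X'\|],$$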
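where the infimum is over all joint laws of $(X, X')$ such that the law of $X$ (respectively, $X'$) is $\mu$ (respectively, $\mu'$). We shall assume from now on that $\pi_0 \in \mathcal{P}_1(\mathbb{R}^s)$. Given the linear growth condition on $g$, $h$ of (2.1), (2.2), uniformly in the noise variable, it is then easily deduced that $E[\|X_n\|] < \infty$ for all $n$, and therefore $\pi_n \in \mathcal{P}_1(\mathbb{R}^s)$ almost surely (a.s.) for all $n$. Thus we may and do view $\{\pi_n\}$ as a $\mathcal{P}_1(\mathbb{R}^s)$-valued process. We then have the following lemma.

Lemma 4.2. For $\alpha \in (0, 1)$ and $\pi_0, \pi_0' \in \mathcal{P}_1(\mathbb{R}^s)$,
$$|V_\alpha(\pi_0) - V_\alpha(\pi_0')| \le K \rho(\pi_0, \pi_0').$$

Proof. Let $\{\pi_n\}$, $\{\pi_n'\}$ be solutions to (3.1) with initial conditions $\pi_0$, $\pi_0'$, respectively, and a "common" wide sense admissible control $\Phi = \{Q_n\}$. Then for $\{X_n\}$, $\{X_n'\}$ as above (with $K$ denoting a generic positive constant that may change from step to step),
$$|E[r(\pi_n, Q_n)] - E[r(\pi_n', Q_n)]| = |E[f(X_n, Q_n)] - E[f(X_n', Q_n)]| \le K E[\|X_n - X_n'\|] \quad \text{(by the Lipschitz condition on } f\text{)}$$
$$\le K \beta^n E[\|X_0 - X_0'\|] \quad \text{(by asymptotic flatness).}$$
Now consider $E[k^\epsilon(\pi_n, Q_n)]$. Suppose that $E[k^\epsilon(\pi_n, Q_n)] \ge E[k^\epsilon(\pi_n', Q_n)]$. Then
$$E[k^\epsilon(\pi_n, Q_n)] - E[k^\epsilon(\pi_n', Q_n)] \le [\ldots] \le K E[\|X_n - X_n'\|] \le K \beta^n E[\|X_0 - X_0'\|],$$
where the intermediate steps use Jensen's inequality and we use (3.3) to arrive at the second to last inequality. A symmetric argument works if $E[k^\epsilon(\pi_n, Q_n)] < E[k^\epsilon(\pi_n', Q_n)]$, leading to the same conclusion. Combining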
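everything, we have
$$\big|E[k^\epsilon(\pi_n, Q_n) + \lambda r(\pi_n, Q_n)] - E[k^\epsilon(\pi_n', Q_n) + \lambda r(\pi_n', Q_n)]\big| \le K \beta^n E[\|X_0 - X_0'\|].$$
Therefore, by Lemma 4.1,
$$|V_\alpha(\pi_0) - V_\alpha(\pi_0')| \le K \sum_{n=0}^{\infty} \alpha^n \beta^n E[\|X_0 - X_0'\|] \le K E[\|X_0 - X_0'\|].$$
For any $\eta > 0$, we can render $E[\|X_0 - X_0'\|] \le \rho(\pi_0, \pi_0') + \eta$ by suitably choosing the joint law of $(X_0, X_0')$. Since $\eta$ is arbitrary, the claim follows.

Fix $\pi^* \in \mathcal{P}_1(\mathbb{R}^s)$ and define $\bar V_\alpha(\pi) = V_\alpha(\pi) - V_\alpha(\pi^*)$ for $\alpha \in (0, 1)$. By the above lemma, $\bar V_\alpha(\cdot)$ is bounded equicontinuous. Letting $\alpha \to 1$, we use the Arzelà–Ascoli theorem to conclude that $\bar V_\alpha(\cdot)$ converges in $C(\mathcal{P}_1(\mathbb{R}^s))$ to some $V(\cdot)$ along a subsequence $\{\alpha(n)\}$, $\alpha(n) \to 1$. By dropping to a further subsequence if necessary, we may also suppose that $\{(1 - \alpha(n)) V_{\alpha(n)}(\pi^*)\}$, which is clearly bounded, converges to some $\varrho \in \mathbb{R}$ as $n \to \infty$. These will turn out to be, respectively, the value function and optimal cost for our original control problem.

Our main result is the following theorem.

Theorem 4.3.
(i) $(V(\cdot), \varrho)$ solve the dynamic programming equation
$$V(\pi) = \min_u \Big(k^\epsilon(\pi, u) + \lambda r(\pi, u) - \varrho + \int p(\pi, u, d\pi')\, V(\pi')\Big). \tag{4.3}$$
(ii) $\varrho$ is the optimal cost, independent of the initial condition. Furthermore, a stationary policy $v(\cdot)$ is optimal for any initial condition if
$$v(\pi) \in \mathrm{Argmin} \Big(k^\epsilon(\pi, \cdot) + \lambda r(\pi, \cdot) + \int p(\pi, \cdot, d\pi')\, V(\pi')\Big).$$
In particular, an optimal stationary policy exists.
(iii) If $v(\cdot)$ is an optimal stationary policy and $\mu$ is a corresponding ergodic probability measure for $\{\pi_n\}$, then
$$V(\pi) = k^\epsilon(\pi, v(\pi)) + \lambda r(\pi, v(\pi)) - \varrho + \int p(\pi, v(\pi), d\pi')\, V(\pi') \quad \mu\text{-a.s.}$$

Proof. For (i), rewrite (4.2) as
$$\bar V_\alpha(\pi) = \min_u \Big(k^\epsilon(\pi, u) + \lambda r(\pi, u) - (1 - \alpha) V_\alpha(\pi^*) + \alpha \int p(\pi, u, d\pi')\, \bar V_\alpha(\pi')\Big).$$
Let $\alpha \to 1$ along $\{\alpha(n)\}$ to obtain (4.3). For (ii), note that the first two statements follow by a standard argument which may be found, e.g., in [15, Theorem 5.2.4, pp. 80–81]. The last claim follows from a standard measurable selection theorem; see, e.g., [22].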
For (iii), note that the claim holds if "$=$" is replaced by "$\le$". If the claim is false, we can integrate both sides with respect to $\mu$ to obtain
$$\varrho < \int \big(k^\epsilon(\pi, v(\pi)) + \lambda r(\pi, v(\pi))\big)\, \mu(d\pi),$$
where $\varrho$ denotes the optimal cost of Theorem 4.3. The RHS is the cost under $v(\cdot)$, whereby this inequality contradicts the optimality of $v(\cdot)$. The claim follows.
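For a finite state and action space, an average cost dynamic programming equation of the same shape as (4.3) can be solved numerically by relative value iteration. The Python sketch below is a standard textbook technique swapped in for illustration, not the paper's method: the state space of conditional laws would first have to be discretized, which the sketch does not attempt, and the array layout, function name, and reference-state normalization are assumptions.

```python
import numpy as np

def relative_value_iteration(cost, trans, ref=0, tol=1e-10, max_iter=10_000):
    """Average cost dynamic programming on a finite model.

    cost[s, a]     : one-step cost of action a in state s
    trans[a, s, t] : probability of moving from s to t under action a
    Returns (gain, bias, policy), where bias[ref] = 0 and
    bias + gain = min over actions of (cost + expected next bias).
    """
    v = np.zeros(cost.shape[0])
    gain = 0.0
    for _ in range(max_iter):
        q = cost + np.einsum("ast,t->sa", trans, v)  # action values
        tv = q.min(axis=1)
        gain = tv[ref]            # running estimate of the optimal average cost
        v_new = tv - gain         # renormalize so the reference state has bias 0
        done = np.max(np.abs(v_new - v)) < tol
        v = v_new
        if done:
            break
    q = cost + np.einsum("ast,t->sa", trans, v)
    return gain, v, q.argmin(axis=1)
```

The returned `gain` plays the role of the optimal average cost and `bias` the role of the relative value function; convergence of this simple iteration is only guaranteed under additional conditions (for example, aperiodicity of the controlled chain), which are assumed rather than checked here.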

This result opens up the possibility of exploiting the computational machinery of Markov decision theory (see, e.g., […], [18], [21]) for code design.

Finally, we briefly consider the decoder's problem. If transmission is error free, the decoder can construct $\{\pi_n\}$ recursively given $\{q_n\}$ and the stationary policy $v(\cdot)$. Then $X_n$, $Y_{n+1}$ may be estimated by the maximum a posteriori (MAP) estimates:
$$\hat X_n = \mathrm{argmax}\ \pi_n(\cdot), \qquad \hat Y_{n+1} = \mathrm{argmax}_y \int\!\!\int I\{i_{Q_n}(l_{Q_n}(y)) = q_{n+1}\}\, \varphi(y, z \mid x)\, dz\, \pi_n(dx).$$
Suppose the decoder receives $\{q_n\}$ through a noisy but memoryless channel with input alphabet $\Sigma$ and output alphabet another finite set $O$, with transition probabilities $p(i, j)$, $i \in \Sigma$, $j \in O$. Thus $p(i, j) \ge 0$ and $\sum_j p(i, j) = 1$ for all $i$. Let $o_n$ be the channel output at time $n$. The decoder can estimate $X_n$ given $o^n$, but this is no longer easy because he cannot reconstruct $\{\pi_n\}$ exactly in the absence of his knowledge of $\{q_n\}$. Thus he should estimate $\{q_n\}$ by $\{\hat q_n\}$, say (e.g., by maximum likelihood), given $\{o_n\}$ and use these estimates in place of $\{q_n\}$ in the nonlinear filter for $\{\pi_n\}$, giving an approximation $\{\hat \pi_n\}$ to $\{\pi_n\}$. The guess for $X_n$ then is $\mathrm{argmax}\ \hat \pi_n(\cdot)$, $n \ge 0$.

5. Conclusions and extensions. In this paper we have considered the problem of optimal sequential vector quantization of a stationary Markov source. We have formulated the problem as a stochastic control problem. We have used the methodology of Markov decision theory. Further, we have shown that the conditional law of the source given the quantized past is a sufficient statistic for the problem. Thus the optimal encoding scheme has a separated structure. The conditional laws are given recursively by the nonlinear filter described in (3.1). The optimal policy is characterized by Theorem 4.3.

The next step is to apply traditional Markov decision problem approximation techniques to compute approximate schemes. If we have access to training data, then we can use the tools of reinforcement learning. Here the idea is to parametrize the value function space or the control law itself and apply stochastic approximation techniques to optimize those parameters.

In general, the nonlinear filter recursion is very complicated. In the literature people have approximated this by linear prediction of the mean. These linear predictive methods can be considered an approximation to the general nonlinear filter.

REFERENCES

[1] G. Basak and R. N. Bhattacharya, Stability in distribution for a class of singular diffusions, Ann. Probab., 20 (1992), pp. 312–321.
[2] D. Bertsekas and J. Tsitsiklis, Neuro-Dynamic Programming, Athena Scientific, Belmont, MA, 1996.
[3] A. Bist, Differential state quantization of high order Gauss-Markov processes, in Proceedings of the IEEE Data Compression Conference, Snowbird, UT, 1994, pp. 62–71.
[4] V. S. Borkar, Optimal Control of Diffusion Processes, Pitman Lecture Notes in Math. 203, Longman Scientific and Technical, Harlow, UK, 1989.
[5] V. S. Borkar, Topics in Controlled Markov Chains, Pitman Lecture Notes in Math. 240, Longman Scientific and Technical, Harlow, UK, 1991.
[6] V. S. Borkar, Probability Theory: An Advanced Course, Springer-Verlag, New York, 1995.
[7] P. Chou, T. Lookabaugh, and R. Gray, Entropy-constrained vector quantization, IEEE Trans. Acoust. Speech Signal Process., 37 (1989), pp. 31–42.
[8] P. Chou and T. Lookabaugh, Conditional entropy-constrained vector quantization of linear predictive coefficients, in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Vol. 1, Albuquerque, NM, 1990, pp. 197–200.
[9] T. Cover and J. Thomas, Elements of Information Theory, John Wiley, New York, 1991.
[10] J. G. Dunham, An iterative theory for code design, in Proceedings of the IEEE International Symposium on Information Theory, St. Jovite, QC, Canada, 1983, pp. 88–90.
[11] W. Fleming and E. Pardoux, Optimal control for partially observed diffusions, SIAM J. Control Optim., 20 (1982), pp. 261–285.
[12] N. T. Gaarder and D. Slepian, On optimal finite-state digital transmission systems, IEEE Trans. Inform. Theory, 28 (1982), pp. 167–186.
[13] R. E. Gallager, Information Theory and Reliable Communication, John Wiley, New York, 1968.
[14] P. Hall and C. C. Heyde, Martingale Limit Theory and Its Applications, Academic Press, New York, London, 1980.
[15] O. Hernandez-Lerma and J. B. Lasserre, Discrete-Time Markov Control Processes, Springer-Verlag, New York, 1996.
[16] J. C. Kieffer, Stochastic stability of feedback quantization schemes, IEEE Trans. Inform. Theory, 28 (1982), pp. 248–254.
[17] E. Levine, Stochastic vector quantization and stochastic VQ with state feedback using neural networks, in Proceedings of the IEEE Data Compression Conference, Snowbird, UT, 1996, pp. 330–339.
[18] S. P. Meyn, Algorithms for optimization and stabilization of controlled Markov chains, in SADHANA: Indian Academy of Sciences Proceedings in Engineering Sciences 24, Bangalore, 1999, pp. 339–368.
[19] D. Neuhoff and R. K. Gilbert, Causal source codes, IEEE Trans. Inform. Theory, 28 (1982), pp. 701–713.
[20] S. T. Rachev, Probability Metrics and the Stability of Stochastic Models, John Wiley, Chichester, UK, 1991.
[21] W. J. Runggaldier and L. Stettner, Approximations of Discrete Time Partially Observed Control Problems, Applied Maths. Monographs 6, Giardini Editori Stampatori, Pisa, Italy, 1994.
[22] D. H. Wagner, Survey of measurable selection theorems, SIAM J. Control Optim., 15 (1977), pp. 859–903.
[23] J. Walrand and P. P. Varaiya, Optimal causal coding-decoding problems, IEEE Trans. Inform. Theory, 29 (1983), pp. 814–820.
[24] H. Witsenhausen, On the structure of real-time source coders, The Bell System Technical Journal, 58 (1979), pp. 1437–1451.