They may be distributed outside this class only with the permission of the Instructor.

8.1 Codes

Codes are functions that convert strings over some alphabet into (typically) shorter strings over another alphabet. We recall different types of codes and ...
Lecture 8: Source Coding Theorem, Huffman coding

Conversely, for all sets $\{l(x)\}_{x \in \mathcal{X}}$ of numbers satisfying (8.1), there exists a prefix code $C : \mathcal{X} \to \{1, 2, \ldots, D\}^*$ such that $l(x)$ is the length of $C(x)$ for each $x$.

The idea behind the proof is to note that each prefix code (over an alphabet of $D$ symbols) corresponds to a finite $D$-ary tree having the codewords as some of its leaves. Thus the second sentence of the theorem readily follows. The first sentence of the theorem follows from noting that if we draw the tree so that its branches form $\pi/D$ radian angles and scale the tree so that the leaves hover over the unit interval $[0, 1]$, then each leaf $L$ hovers over a subinterval lying to the right of those covered by the leaves left of $L$. That is, the leaves map to disjoint subintervals of $[0, 1]$ (a leaf at depth $l$ covers a subinterval of width $D^{-l}$), so the widths sum to at most 1.

Proposition 8.2 The ideal codelengths for a prefix code with smallest expected codelength are

$$l^*(x) = \log_D \frac{1}{p(x)} \quad \text{(Shannon information content)}.$$

Proof: In last class, we showed that for all length functions $l$ of prefix codes, $E[l^*(x)] = H_p(X) \le E[l(x)]$.

While the Shannon information contents are in general not integer-valued and hence cannot be the lengths of codewords, the integers $\{\lceil \log_D \frac{1}{p(x)} \rceil\}_{x \in \mathcal{X}}$ satisfy the Kraft-McMillan Inequality and hence there exists some uniquely decodable code $C$ for which

$$H_p(X) \le E[l(x)] < H_p(X) + 1 \qquad (8.2)$$

by Theorem 8.1. Such a code is called a Shannon code. Moreover, the lengths of codewords for such a code $C$ achieve the entropy of $X$ asymptotically, i.e. if Shannon codes are constructed for strings of symbols $x^n$ with $n \to \infty$, instead of for individual symbols. Assuming $X_1, X_2, \ldots$ form an iid process, for all $n = 1, 2, \ldots$

$$H(X) = \frac{H(X_1, X_2, \ldots, X_n)}{n} \le \frac{E[l(x_1, \ldots, x_n)]}{n} < \frac{H(X_1, X_2, \ldots, X_n)}{n} + \frac{1}{n} = H(X) + \frac{1}{n}$$

by (8.2), and hence $E\left[\frac{l(x_1, \ldots, x_n)}{n}\right] \to H(X)$ as $n \to \infty$. If $X_1, X_2, \ldots$ form a stationary process, then a similar argument shows that $E\left[\frac{l(x_1, \ldots, x_n)}{n}\right] \to H(\mathcal{X})$ as $n \to \infty$, where $H(\mathcal{X})$ is the entropy rate of the process.

Theorem 8.3 (Shannon Source Coding Theorem) A collection of $n$ iid random variables, each with entropy $H(X)$, can be compressed into $nH(X)$ bits on average with negligible loss as $n \to \infty$. Conversely, no uniquely decodable code can compress them to less than $nH(X)$ bits without loss of information.

8.1.2 Non-singular vs. Uniquely decodable codes

Can we gain anything by giving up unique decodability and only requiring the code to be non-singular? First, the question is not really fair, because a sequence of symbols each encoded with a non-singular code cannot be decoded easily. Second, (as we argue below) non-singular codes only provide a small improvement in expected codelength over entropy.

8.1.3 Cost of using wrong distribution

We can use relative entropy to quantify the deviation from optimality that comes from using the wrong probability distribution $q \ne p$ on the source symbols. Suppose $l(x) = \lceil \log_D \frac{1}{q(x)} \rceil$ is the Shannon code assignment for a wrong distribution $q \ne p$. Then, with logarithms taken to base $D$,

$$H(p) + D(p \| q) \le E_p[l(X)] < H(p) + D(p \| q) + 1.$$

Thus $D(p \| q)$ measures the deviation from optimality in code lengths.

Proof: First, the upper bound:

$$E_p[l(X)] = \sum_x p(x) l(x) = \sum_x p(x) \left\lceil \log_D \frac{1}{q(x)} \right\rceil < \sum_x p(x) \left( \log_D \frac{1}{q(x)} + 1 \right) = \sum_x p(x) \log_D \left( \frac{p(x)}{q(x)} \cdot \frac{1}{p(x)} \right) + 1 = D(p \| q) + H(p) + 1.$$

The lower bound follows similarly:

$$E_p[l(X)] \ge \sum_x p(x) \log_D \frac{1}{q(x)} = D(p \| q) + H(p).$$

8.1.4 Huffman Coding

Is there a prefix code with expected length shorter than the Shannon code? The answer is yes. The optimal (shortest expected length) prefix code for a given distribution can be constructed by a simple algorithm due to Huffman.

We introduce this optimal symbol code, called a Huffman code. We fix $\mathcal{Y} = \{0, 1\}$ and hence consider binary codes, although the procedure described here readily adapts to more general $\mathcal{Y}$. Simply put, we define the Huffman code $C : \mathcal{X} \to \{0, 1\}^*$ as the coding scheme that builds a binary tree from the leaves up: it takes the two symbols having the least probabilities, assigns them equal lengths, merges them, and then repeats the entire process on the reduced alphabet.

Formally, we describe the code as follows. Let $\mathcal{X} = \{x_1, \ldots, x_N\}$, $p_1 = p(x_1), p_2 = p(x_2), \ldots, p_N = p(x_N)$. The procedure Huff is defined as follows:

    Huff(p_1, ..., p_N):
        if N <= 2 then
            C(1) <- 0, C(2) <- 1
        else
            sort p_1 >= p_2 >= ... >= p_N
            C' <- Huff(p_1, p_2, ..., p_{N-2}, p_{N-1} + p_N)
            for each i
                if i <= N-2 then C(i) <- C'(i)
                else if i = N-1 then C(i) <- C'(N-1) . 0
                else C(i) <- C'(N-1) . 1
        return C

Here $C'(N-1)$ denotes the codeword that $C'$ assigns to the merged symbol of probability $p_{N-1} + p_N$, and "." denotes appending a bit.

Proof: The collection of prefix codes is well-ordered under expected lengths of codewords. Hence there exists a (not necessarily unique) optimal prefix code.

To see (1), suppose $C$ is an optimal prefix code. Let $C'$ be the code interchanging $C(x_j)$ and $C(x_k)$ for some $j < k$ (so that $p_j \ge p_k$). Then

$$0 \le L(C') - L(C) = \sum_i p_i l'_i - \sum_i p_i l_i = p_j l_k + p_k l_j - p_j l_j - p_k l_k = (p_j - p_k)(l_k - l_j)$$

and hence, when $p_j > p_k$, we get $l_k - l_j \ge 0$, or equivalently $l_j \le l_k$.

To see (2), note that if the two longest codewords had differing lengths, a bit could be removed from the end of the longest codeword while the code remains a prefix code, giving a strictly lower expected length. An application of (1) yields (2), since it tells us that the longest codewords correspond to the least likely symbols.

We claim that Huffman codes are optimal, at least among all prefix codes. Because our proof involves multiple codes, we avoid ambiguity by writing $L(C)$ for the expected length of a codeword coded by $C$, for each $C$.

Proposition 8.7 Huffman codes are optimal prefix codes.

Proof: Define a sequence $\{A_N\}_{N = 2, \ldots, |\mathcal{X}|}$ of sets of source symbols, and associated probabilities $P_N = \{p_1, p_2, \ldots, p_{N-1}, p_N + p_{N+1} + \cdots + p_{|\mathcal{X}|}\}$. Let $C_N$ denote a Huffman encoding on the set of source symbols $A_N$ with probabilities $P_N$. We induct on the size $N$ of the alphabets.

1. For the base case $N = 2$, the Huffman code maps $x_1$ and $x_2$ to one bit each and is hence optimal.
2. Inductively assume that the Huffman code $C_{N-1}$ is an optimal prefix code.
3. We will show that the Huffman code $C_N$ is also an optimal prefix code.

Notice that the code $C_{N-1}$ is formed by taking the common prefix of the two longest codewords (least-likely symbols) in $\{x_1, \ldots, x_N\}$ and allotting it to a symbol with probability $p_{N-1} + p_N$. In other words, the Huffman tree for the merged alphabet is the merge of the Huffman tree for the original alphabet. This is true simply by the definition of the Huffman procedure. Let $l_i$ denote the length of the codeword for symbol $i$ in $C_N$ and let $l'_i$ denote the length of the codeword for symbol $i$ in $C_{N-1}$. Then

$$L(C_N) = \sum_{i=1}^{N-2} p_i l_i + p_{N-1} l_{N-1} + p_N l_N = \underbrace{\sum_{i=1}^{N-2} p_i l'_i + (p_{N-1} + p_N) l'_{N-1}}_{L(C_{N-1})} + (p_{N-1} + p_N),$$

the last equality following from the Huffman construction, since $l_i = l'_i$ for $i \le N-2$ and $l_{N-1} = l_N = l'_{N-1} + 1$.

Suppose, to the contrary, that $C_N$ were not optimal. Let $\bar{C}_N$ be optimal (existence is guaranteed by the previous Lemma). We can take $\bar{C}_{N-1}$ to be obtained by merging the two least likely symbols, which have the same length by Lemma 8.6. But then

$$L(\bar{C}_N) = L(\bar{C}_{N-1}) + (p_{N-1} + p_N) \ge L(C_{N-1}) + (p_{N-1} + p_N) = L(C_N),$$

where the inequality holds since $C_{N-1}$ is optimal. Hence, $C_N$ had to be optimal.
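As a numerical companion to this lecture, the following is a minimal Python sketch of the binary case (D = 2). It computes Shannon code lengths, builds a Huffman code, and lets one check the bounds from (8.2) and Section 8.1.3 on small examples. It is an illustration under stated assumptions, not the notes' own implementation: the function names (`entropy`, `kl_divergence`, `shannon_lengths`, `huffman_code`, `expected_length`) are hypothetical, and the Huffman code is built with an iterative priority-queue merge rather than the recursive Huff procedure. Both repeatedly merge the two least likely symbols.

```python
import heapq
import math
from itertools import count


def entropy(p):
    """Shannon entropy H(p) in bits; zero-probability terms contribute 0."""
    return -sum(px * math.log2(px) for px in p if px > 0)


def kl_divergence(p, q):
    """Relative entropy D(p||q) in bits, assuming q(x) > 0 wherever p(x) > 0."""
    return sum(px * math.log2(px / qx) for px, qx in zip(p, q) if px > 0)


def shannon_lengths(q):
    """Binary Shannon code lengths ceil(log2(1/q(x))), assuming all q(x) > 0."""
    return [math.ceil(-math.log2(qx)) for qx in q]


def expected_length(p, lengths):
    """Expected codeword length sum_x p(x) l(x)."""
    return sum(px * l for px, l in zip(p, lengths))


def huffman_code(p):
    """Return binary codewords (as strings) for symbols 0..len(p)-1.

    Iterative variant: repeatedly pop the two least likely groups of
    symbols, prepend a distinguishing bit to every codeword in each
    group, and push the merged group back onto the heap.
    """
    if len(p) == 1:
        return ["0"]
    tie = count()  # unique tie-breaker so equal probabilities compare cleanly
    heap = [(px, next(tie), [i]) for i, px in enumerate(p)]
    heapq.heapify(heap)
    code = [""] * len(p)
    while len(heap) > 1:
        p_a, _, syms_a = heapq.heappop(heap)
        p_b, _, syms_b = heapq.heappop(heap)
        for i in syms_a:
            code[i] = "0" + code[i]
        for i in syms_b:
            code[i] = "1" + code[i]
        heapq.heappush(heap, (p_a + p_b, next(tie), syms_a + syms_b))
    return code


# Example: a dyadic distribution, where Huffman coding meets the entropy.
p = [0.5, 0.25, 0.125, 0.125]
print(huffman_code(p))     # -> ['0', '10', '110', '111']
print(shannon_lengths(p))  # -> [1, 2, 3, 3]
```

For a non-dyadic distribution such as `[0.4, 0.3, 0.2, 0.1]`, comparing `expected_length` for the Huffman and Shannon lengths against `entropy` shows the Huffman code sitting between the entropy and the Shannon code, consistent with Proposition 8.7 and (8.2).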