Dec 19, 2019 IITB: Compression




Presentation Transcript

1. Data Compression: A survey. Madhu Sudan, Harvard.

2. Basic problem
Given: a distribution P over a finite alphabet Σ.
Want: E : Σ → {0,1}* and D : {0,1}* → Σ s.t.
Basic correctness: D(E(x)) = x for all x ∈ Σ.
Prefix-free correctness: E(x) not a prefix of E(y) for x ≠ y. (Ensures can compress arbitrarily long sequences of el'ts of Σ.)
Prefix-free correctness ⇒ Basic correctness.
Performance measure: expected length 𝔼_{x∼P}[|E(x)|]?
Assumption: P: some source described finitely (independent of the length of the sequence being compressed).
Measure: 𝔼_{x∼P}[|E(x)|].
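To illustrate why the prefix-free condition is what makes concatenated encodings decodable, here is a minimal sketch in Python; the three-symbol code and helper names are invented for illustration:

```python
def is_prefix_free(code):
    """True iff no codeword is a prefix of another. It suffices to check
    lexicographically adjacent codewords: a prefix always sorts
    immediately before any of its extensions."""
    words = sorted(code.values())
    return all(not b.startswith(a) for a, b in zip(words, words[1:]))

def encode_seq(code, seq):
    # Concatenate codewords with no separators.
    return "".join(code[s] for s in seq)

def decode_seq(code, bits):
    """Decode left to right; unambiguous precisely because no codeword
    is a prefix of another."""
    inv = {w: s for s, w in code.items()}
    out, cur = [], ""
    for b in bits:
        cur += b
        if cur in inv:
            out.append(inv[cur])
            cur = ""
    return out

# Hypothetical 3-symbol prefix-free code over Sigma = {a, b, c}.
code = {"a": "0", "b": "10", "c": "11"}
```

A round trip such as `decode_seq(code, encode_seq(code, list("abcba")))` recovers the original sequence with no delimiters sent, which is exactly the "arbitrarily long sequences" point above.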

3. Many types of problems:
Single-shot or amortized?
- Single-shot: compress a single x ∼ P; measure 𝔼[|E(x)|].
- Amortized: compress X₁…Xₙ; measure lim_{n→∞} (1/n)·𝔼[|E(X₁…Xₙ)|].
"Finite" length analysis: smallest n for which we get ε-close to the limit.
Source: Independent or Markovian or …?
Known or Unknown (aka "Universal Coding")?
Algorithmic!

4. Overview of talk
- Single-shot: Kraft, Shannon, Huffman
- Amortized, iid: Shannon: Random coding + converse
- Amortized, Markovian: Rate = ?
- Gap to capacity: Polar coding
- Unknown sources: Lempel-Ziv
- Unknown sources + Gap to capacity: LPN

5. I. Single-shot

6. Single-shot
Source given by P : Σ → [0,1].
Kraft: Given lengths ℓ : Σ → ℕ, a prefix-free encoding E with |E(x)| = ℓ(x) exists iff Σ_{x∈Σ} 2^(−ℓ(x)) ≤ 1.
Proof of existence: sort Σ according to ℓ; greedily assign strings.
Proof of tightness: a string of length ℓ is a prefix of a 2^(−ℓ) fraction of infinite-length binary strings.
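The existence direction of Kraft's theorem is constructive, and the greedy assignment is short enough to sketch; the length assignment below is a made-up example whose Kraft sum is exactly 1:

```python
def kraft_sum(lengths):
    # The Kraft quantity: sum over symbols of 2^{-l(x)}.
    return sum(2.0 ** -l for l in lengths.values())

def kraft_code(lengths):
    """Greedy construction from the existence proof: sort symbols by
    codeword length and assign consecutive binary values, left-shifting
    the running value as lengths grow. Assumes kraft_sum(lengths) <= 1,
    which guarantees the running value always fits in l bits."""
    code, c, prev = {}, 0, 0
    for sym, l in sorted(lengths.items(), key=lambda kv: kv[1]):
        c <<= l - prev                     # extend running value to l bits
        code[sym] = format(c, "0%db" % l)  # zero-padded binary, width l
        c += 1
        prev = l
    return code

lengths = {"a": 1, "b": 2, "c": 3, "d": 3}  # Kraft sum = 1/2+1/4+1/8+1/8 = 1
```

Running `kraft_code(lengths)` yields `a→0, b→10, c→110, d→111`, which is prefix-free with exactly the requested lengths.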

7. Single-shot (contd.)
Shannon + Kraft: Set ℓ(x) = ⌈log₂(1/P(x))⌉. Compression length ≤ H(P) + 1.
Huffman: "Merge" the two smallest-probability symbols x, y; compress the smaller symbol set; extend the encoding of the merged symbol by 0 (or 1) to get the encoding of x (or y).
Optimal encoding! Compression length ≤ H(P) + 1.
Lower bounds? From amortized setting!
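Huffman's merge rule translates directly into a heap-based loop; a sketch, with a made-up dyadic distribution for which the construction is exactly optimal (average length = H(P)):

```python
import heapq

def huffman(probs):
    """Huffman's rule: repeatedly merge the two least-probable groups of
    symbols; the merged groups' codewords get extended by a leading 0/1.
    Integer tie-breakers keep heap comparisons well defined."""
    heap = [(p, i, [s]) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    code = {s: "" for s in probs}
    while len(heap) > 1:
        p0, _, group0 = heapq.heappop(heap)
        p1, tie, group1 = heapq.heappop(heap)
        for s in group0:
            code[s] = "0" + code[s]
        for s in group1:
            code[s] = "1" + code[s]
        heapq.heappush(heap, (p0 + p1, tie, group0 + group1))
    return code

probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}  # dyadic example
```

For this distribution the codeword lengths come out as 1, 2, 3, 3 and the expected length is 1.75 bits, matching H(P) exactly; for non-dyadic distributions the loss is at most 1 bit, as the slide states.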

8. II. Amortized, iid

9. The original Shannon Problem: compress X₁…Xₙ, iid ∼ P.
Encoding: Random function E : Σⁿ → {0,1}^m.
Decoding: Maximum likelihood.
Amortized compression length → H(P).
Lower bound: AEP + "Pigeonhole Principle".
AEP: n-fold iid samples place 1 − o(1) mass on 2^{n(H(P) ± o(1))} points with near-uniform distribution.
Asymptotic compression length = H(P).
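The AEP can be checked empirically: the per-symbol "surprisal" −(1/n)·log₂ P(X₁…Xₙ) of a long iid sample concentrates around H(P). A small demo; the distribution and the seed are arbitrary choices made for this sketch:

```python
import math
import random

def entropy(p):
    # Shannon entropy H(P) in bits.
    return -sum(q * math.log2(q) for q in p.values() if q > 0)

def surprisal_rate(p, n, rng):
    """Draw n iid samples from p and return -(1/n) log2 P(x1...xn),
    the empirical per-symbol information content."""
    syms, weights = zip(*p.items())
    xs = rng.choices(syms, weights=weights, k=n)
    return -sum(math.log2(p[x]) for x in xs) / n

p = {"a": 0.7, "b": 0.2, "c": 0.1}
rng = random.Random(0)
rate = surprisal_rate(p, 100_000, rng)
```

With n = 100,000 the observed rate lands within a few hundredths of a bit of H(P) ≈ 1.157; this concentration is what makes "typical set" encoding (and the pigeonhole lower bound) work.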

10. Speed of convergence to capacity
To get ε-close to capacity: what is the fastest run time?
- With a non-linear encoder: Simple.
- With a linear (fixed-length) encoder (ε error): Polar coding/decoding!
Should all the above be …?

11. Unknown source / Universal Coding?
Simple: Just sample enough to learn the distribution to within small TVD; then compress as if known.
A slightly more "natural" algorithm: Given X₁…Xₙ, let P̂ be the empirical distribution. Encode: send P̂ (the symbol counts), then the encodings of X₁…Xₙ under a code matched to P̂. (Works well if alphabet large enough; use blocks of symbols if not big enough.)
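A rough bit-accounting of this "learn, then compress" scheme, as a sketch; the header format (one count per alphabet symbol, n+1 possible values each) is an assumption made for illustration, and codeword rounding is ignored:

```python
import math
from collections import Counter

def two_part_length(xs, alphabet_size):
    """Idealized bit-length of the two-part code: (1) a header sending
    the empirical symbol counts, (2) a body encoding the sequence with
    an ideal code matched to the empirical distribution, costing
    -log2(empirical probability) per symbol."""
    n = len(xs)
    counts = Counter(xs)
    header = alphabet_size * math.ceil(math.log2(n + 1))  # one count per symbol
    emp = {s: c / n for s, c in counts.items()}
    body = sum(-math.log2(emp[x]) for x in xs)            # ~= n * H(empirical)
    return header + body
```

The header costs O(|Σ|·log n) bits and so vanishes per symbol as n grows, which is the sense in which not knowing the source costs (asymptotically) nothing; the parenthetical about large alphabets on the slide is precisely about when this header stops being negligible.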

12. III. Amortized, Markov

13. Markovian Sources
Source: a k-state Markov chain; each state s carries a distribution P_s over Σ. The chain produces a state sequence s₁, s₂, …, and outputs X_t ∼ P_{s_t}, drawn independently for each t.

14. Entropy of Markovian Source
Challenge: Closed-form expression? Polytime computable? Computable? Computably approximable? ("Single-letter characterization?")
Entropy rate = lim_{n→∞} (1/n)·H(X₁…Xₙ).
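For contrast with the hidden-state difficulty raised above: when the state sequence itself is the output (no per-state emission distribution), the entropy rate does have a simple closed form, H = Σ_s π_s·H(A_s), with π the stationary distribution of the transition matrix A. A sketch; the 2-state matrix is a made-up example:

```python
import math

def stationary(A, iters=1000):
    """Stationary distribution of a row-stochastic matrix A via power
    iteration; assumes the chain is ergodic so the iteration converges."""
    k = len(A)
    pi = [1.0 / k] * k
    for _ in range(iters):
        pi = [sum(pi[i] * A[i][j] for i in range(k)) for j in range(k)]
    return pi

def row_entropy(row):
    return -sum(p * math.log2(p) for p in row if p > 0)

def entropy_rate(A):
    # Closed form for an observed (non-hidden) Markov chain:
    # stationary-weighted average of the per-state transition entropies.
    pi = stationary(A)
    return sum(pi[s] * row_entropy(A[s]) for s in range(len(A)))

A = [[0.9, 0.1], [0.5, 0.5]]
```

For this A the stationary distribution is (5/6, 1/6) and the rate is about 0.5575 bits/step. Once each state's output is filtered through its own P_s, the state becomes hidden and no such closed form is known, which is what makes the questions on this slide nontrivial.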

15. Compression of Markovian Sources
"Universal" Compression (Unknown source): Lempel-Ziv algorithm:
Partition X₁…Xₙ into distinct phrases w₁, …, w_t s.t. each wᵢ = wⱼ ∘ σ for some j < i, σ ∈ Σ.
Output encodings of the pairs (j, σ).
Example: 1011010100010 parses as 1, 0, 11, 01, 010, 00, 10.
Thm: For every Markovian source and every ε > 0, ∃ n₀ s.t. for n ≥ n₀, LZ compresses to within ε of the entropy rate.
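The phrase-parsing rule is exactly the LZ78 variant of Lempel-Ziv; a minimal sketch, exercised on an arbitrary binary string:

```python
def lz78_parse(x):
    """Parse x into distinct phrases, each equal to an earlier phrase
    (possibly the empty phrase, index 0) extended by one symbol; emit
    the pair (index of the earlier phrase, new symbol)."""
    phrases, dictionary = [], {"": 0}
    cur = ""
    for c in x:
        if cur + c in dictionary:
            cur += c
        else:
            phrases.append((dictionary[cur], c))
            dictionary[cur + c] = len(dictionary)
            cur = ""
    if cur:  # leftover input repeats an existing phrase exactly
        phrases.append((dictionary[cur], ""))
    return phrases

def lz78_unparse(phrases):
    """Invert the parse: rebuild each phrase from its (index, symbol) pair."""
    words, out = [""], []
    for idx, c in phrases:
        w = words[idx] + c
        words.append(w)
        out.append(w)
    return "".join(out)
```

With t phrases, each pair costs roughly log₂ t + log₂ |Σ| bits, for a total of about t·log t, so the theorem amounts to showing t·log t approaches n times the entropy rate.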

16. Proof
Step 1: ∃ a finite-state compressor that compresses close to capacity.
Idea: Pick a large block length k. Compute the conditional distribution of a block given the preceding block w; let E_w = the Huffman code of Σ^k according to that conditional distribution. Partition the input into blocks; output their E_w-encodings. #States ≈ |Σ|^k; correctness via concentration (handle the odd-indexed and even-indexed blocks separately).

17. Proof
Step 1: ∃ a finite-state compressor that compresses close to capacity.
Step 2: LZ is not much worse than any finite-state compressor.
Key ingredient: Complexity of a string x = the number t of distinct phrases s.t. x = w₁ ∘ ⋯ ∘ w_t. Clearly |LZ(x)| ≈ t·log t.
Lemma: Compression length of any q-state compressor is at least roughly t·log(t/q²).

18. Finite length analysis?
What is the smallest n, as a function of the source parameters and ε, for which universal compression can be achieved?
Analysis thus far: has many "non-explicit" elements; best guess = ?
Necessary? Thm [GNS]: connected to Learning Parity with Noise.

19. Compressing known Markovian Source?
(Obviously easier … but can ask for strong finite-length analysis.)
Till recently … unexplored.
[GNS '17]: Polar codes compress Markovian sources! At lengths ≈ the length at which the chain converges to the entropy rate.
Surprise factor: Polar codes are good for compressing iid sources … but Markovian is not iid!

20. Reduction to ind. case
Arrange the sequence into a grid and compress column by column: elements within a column are independent conditioned on the previous columns!

21. Conclusions
- Can compress many stochastic sources.
- Markovian sources: many unknowns. Entropy rate? Finite length analysis of universal compression.
Did not discuss …
- Interactive compression
- Uncertain compression

22. Thank You!