
Cross-scale Predictive Dictionaries for Image and Video Res - PowerPoint Presentation

Uploaded On 2018-01-04



Presentation Transcript

Slide1

Cross-scale Predictive Dictionaries for Image and Video Restoration (Paper ID: 2314)

Vishwanath Saragadam, Aswin Sankaranarayanan, Xin Li

Slide2

Compressive sensing

Solving an underdetermined linear system of equations

Relies on sparsity of the signal

Orthogonal Matching Pursuit: efficient recovery method
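The slide's equation did not survive in the transcript. A standard way to write the setup it describes is sketched below; the symbols are assumptions chosen for illustration (y for the measurements, \Phi for the measurement matrix, x for the signal of interest, D for a dictionary, \alpha for the sparse coefficients) and may differ from the authors' notation.

```latex
\begin{aligned}
y &= \Phi x, & y \in \mathbb{R}^{M},\ x \in \mathbb{R}^{n},\ M < n \\
x &= D\alpha, & D \in \mathbb{R}^{n \times N},\ \|\alpha\|_0 \le K
\end{aligned}
```

With fewer measurements than unknowns the system has infinitely many solutions, so the sparsity constraint on \alpha is what makes recovery possible.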

Signal of interest

Measurement matrix

Slide3

Orthogonal matching pursuit

Greedy algorithm to solve for a sparse representation

1-sparse representation: find the most correlated atom

K-sparse: iteratively find 1-sparse solutions, K times

Number of correlations increases linearly with the number of dictionary atoms
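A minimal NumPy sketch of the greedy procedure described on this slide. The function name `omp`, the random dictionary, and the toy signal are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def omp(D, x, K):
    """Greedy K-sparse approximation of x over the columns (atoms) of D."""
    residual = x.astype(float).copy()
    support = []
    coeffs = np.zeros(0)
    for _ in range(K):
        # 1-sparse step: one correlation per atom, so the work grows
        # linearly with the number of atoms D.shape[1].
        correlations = np.abs(D.T @ residual)
        if support:
            correlations[support] = 0.0  # never pick the same atom twice
        support.append(int(np.argmax(correlations)))
        # Least-squares fit of x on the atoms selected so far.
        coeffs, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coeffs
    alpha = np.zeros(D.shape[1])
    alpha[support] = coeffs
    return alpha

# Toy example (hypothetical sizes): a signal that is exactly 2-sparse in D.
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 256))
D /= np.linalg.norm(D, axis=0)           # unit-norm atoms
x = 2.0 * D[:, 5] - 1.5 * D[:, 100]
alpha = omp(D, x, K=2)
```

Each iteration computes one inner product per atom, which is why the search cost scales with dictionary size.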

 

Slide4

Accuracy increases with dictionary size

Y. Hitomi, J. Gu, M. Gupta, T. Mitsunaga, and S. K. Nayar. Video from a single coded exposure photograph using a learned over-complete dictionary. In IEEE Intl. Conf. Computer Vision, 2011

Slide5

A real example: High speed video

Frames 1 - 36

Coded image

Recovered video

video patches, 100,000 atoms

hour to recover 36 frames

Y. Hitomi, J. Gu, M. Gupta, T. Mitsunaga, and S. K. Nayar. Video from a single coded exposure photograph using a learned over-complete dictionary. In IEEE Intl. Conf. Computer Vision, 2011

Slide6

Need structured dictionaries

Need very large dictionaries for high accuracy

Computational complexity increases with the number of dictionary elements

Endow the dictionary with structure to reduce search time

Slide7

Structure across scales for visual signals

Image

Wavelet transform: sparse, multiscale

Slide8

Wavelet zero tree

Wavelet transform: sparse, multiscale, predictive

Well known only for images

Parent coefficient zero: child coefficients most likely zero

Extend wavelet zero tree to dictionaries

Slide9

Cross-scale predictive dictionaries

Slide10

Proposed signal model

[Diagram: signal, dictionaries, and sparse representation at two scales, linked by downsampling; zero tree structure of sparse coefficients]

Slide11

Signal model

Downsample

Restrict support

Parent coefficient, child coefficients

Parent atom, child atoms
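The equations on this slide are missing from the transcript. One way to write down the model that the labels describe is sketched here. All symbols are assumptions for illustration: x_h and x_\ell are the high and low resolution signals, \mathcal{S} is the downsampling operator, D_h and D_\ell are the two dictionaries, and \mathcal{C}(j) is the set of child atoms in D_h of parent atom j in D_\ell.

```latex
\begin{aligned}
x_{\ell} &= \mathcal{S}\,x_{h} \\
x_{\ell} &\approx D_{\ell}\,\alpha_{\ell}, \qquad x_{h} \approx D_{h}\,\alpha_{h} \\
\operatorname{supp}(\alpha_{h}) &\subseteq \bigcup_{j \in \operatorname{supp}(\alpha_{\ell})} \mathcal{C}(j)
\end{aligned}
```

Read the last line as the zero tree rule: if a parent coefficient is zero, all of its child coefficients are zero.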

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Use the low resolution approximation to restrict the high resolution approximation

Slide12

Speedup

[Equations: speedup expression and toy-problem assumptions]

Slide13

Proposed signal model

[Diagram: signal, dictionaries, and sparse representation at two scales; zero tree structure of sparse coefficients]

Use OMP at each scale

Zero tree of coefficients

Zero tree OMP
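The two-stage idea named here (OMP at each scale, with the high resolution search restricted by the zero tree) might look like the sketch below. This is an illustration under stated assumptions, not the authors' algorithm: the `children` lookup (a list mapping each parent atom to the indices of its child atoms), the pairwise-averaging `downsample`, and the random toy dictionaries are all hypothetical.

```python
import numpy as np

def omp(D, x, K, candidates=None):
    """OMP restricted to the atom indices in `candidates` (all atoms if None)."""
    if candidates is None:
        candidates = np.arange(D.shape[1])
    candidates = np.asarray(candidates)
    sub = D[:, candidates]
    residual = x.astype(float).copy()
    picked = []
    coeffs = np.zeros(0)
    for _ in range(min(K, len(candidates))):
        corr = np.abs(sub.T @ residual)
        if picked:
            corr[picked] = 0.0
        picked.append(int(np.argmax(corr)))
        coeffs, *_ = np.linalg.lstsq(sub[:, picked], x, rcond=None)
        residual = x - sub[:, picked] @ coeffs
    alpha = np.zeros(D.shape[1])
    alpha[candidates[picked]] = coeffs
    return alpha

def zero_tree_omp(x_high, downsample, D_low, D_high, children, K_low, K_high):
    # Step 1: ordinary OMP on the downsampled signal with the small dictionary.
    alpha_low = omp(D_low, downsample(x_high), K_low)
    parents = np.flatnonzero(alpha_low)
    # Step 2: only children of active parents may be non-zero (zero tree),
    # so the high resolution search covers a small subset of D_high.
    candidates = np.concatenate([children[p] for p in parents])
    alpha_high = omp(D_high, x_high, K_high, candidates)
    return alpha_low, alpha_high

# Toy setup with hypothetical sizes: 8 parent atoms, 4 children each.
rng = np.random.default_rng(0)
n_low, n_high, branch = 8, 32, 4
D_low = rng.standard_normal((8, n_low))
D_high = rng.standard_normal((16, n_high))
D_low /= np.linalg.norm(D_low, axis=0)
D_high /= np.linalg.norm(D_high, axis=0)
children = [np.arange(j * branch, (j + 1) * branch) for j in range(n_low)]
downsample = lambda v: v.reshape(-1, 2).mean(axis=1)  # average adjacent samples

x_high = rng.standard_normal(16)
alpha_low, alpha_high = zero_tree_omp(
    x_high, downsample, D_low, D_high, children, K_low=2, K_high=3)
```

In this toy setup the second stage correlates against 8 candidate atoms instead of all 32, which is where the saving comes from.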

Downsample

How do we learn these dictionaries?

Slide14

Dictionary learning

Given:

Step 1: Learn the low resolution dictionary

Data:

K-SVD

By-product:

Step 2: Learn the high resolution dictionary

Data:

Modify the K-SVD sparse approximation step

Use zero tree OMP instead of OMP
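For reference, the K-SVD dictionary update (a rank-1 SVD per atom) can be sketched as follows. The function name `ksvd_update_atom` and the random toy data are assumptions for illustration. Only the sparse approximation step differs in the proposed method, so this update would apply unchanged whether the codes `A` come from OMP or from zero tree OMP.

```python
import numpy as np

def ksvd_update_atom(D, A, X, j):
    """Update atom j of D and row j of the codes A in place (one K-SVD step)."""
    users = np.flatnonzero(A[j])          # training signals that use atom j
    if users.size == 0:
        return
    # Residual of those signals with atom j's own contribution added back.
    E = X[:, users] - D @ A[:, users] + np.outer(D[:, j], A[j, users])
    # The best rank-1 approximation of E gives the new atom and coefficients.
    U, s, Vt = np.linalg.svd(E, full_matrices=False)
    D[:, j] = U[:, 0]
    A[j, users] = s[0] * Vt[0]

# Toy data (hypothetical sizes): 200 signals, 32 atoms, random sparse codes.
rng = np.random.default_rng(0)
X = rng.standard_normal((16, 200))
D = rng.standard_normal((16, 32))
D /= np.linalg.norm(D, axis=0)
A = rng.standard_normal((32, 200)) * (rng.random((32, 200)) < 0.1)

err_before = np.linalg.norm(X - D @ A)
ksvd_update_atom(D, A, X, j=0)
err_after = np.linalg.norm(X - D @ A)
```

The update keeps the sparsity pattern of `A` fixed and cannot increase the approximation error.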

 

Recollect: K-SVD

-- Dictionary update: rank-1 SVD

-- Sparse approximation: OMP

Slide15

Dictionary learning – results

Data: 24x24 RGB patches

Low resolution dictionary: 12x12, 512 atoms

High resolution dictionary: 24x24, 8192 atoms

(Parent atom)

(Child atoms)

Slide16

Dictionary learning – results

Data: 24x24 RGB patches

Low resolution dictionary: 12x12, 512 atoms

High resolution dictionary: 24x24, 8192 atoms

Slide17

Dictionary learning – results

Data: 8x8x32 video patches

Low resolution dictionary: 4x4x16, 512 atoms

High resolution dictionary: 8x8x32, 8192 atoms

Slide18

Dictionary learning – results

Data: 8x8x32 video patches

Low resolution dictionary: 4x4x16, 512 atoms

High resolution dictionary: 8x8x32, 8192 atoms

Slide19

Signal recovery

Given:

Step 1: Low resolution recovery

Step 2: High resolution recovery

Upsampler, needed in the first step

Slide20

Application: Video compressive sensing

8 high speed video frames

Coded image

http://high_speed_video.colostate.edu/

Slide21

Application: Video compressive sensing

video patches

atom dictionary, OMP

atom high resolution, atom low resolution, Zero tree OMP

Slide22

Application: Video compressive sensing

Original video

Recovery using OMP

Time: 3.71 min

SNR: 15.79 dB

Recovery using zero tree OMP

Time: 16.5 s

SNR: 17.81 dB

About 13.5x speedup, with an SNR increase of about 2 dB
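The comparison can be checked with simple arithmetic from the reported times and SNR values. The `snr_db` helper uses the common energy-ratio definition, which is an assumption here since the slides do not state how SNR was computed.

```python
import numpy as np

def snr_db(reference, estimate):
    """SNR in dB: signal energy over reconstruction-error energy (assumed definition)."""
    reference = np.asarray(reference, dtype=float)
    error = reference - np.asarray(estimate, dtype=float)
    return 10.0 * np.log10(np.sum(reference**2) / np.sum(error**2))

omp_seconds = 3.71 * 60          # 3.71 min reported for OMP
zero_tree_seconds = 16.5         # 16.5 s reported for zero tree OMP
speedup = omp_seconds / zero_tree_seconds   # roughly 13.5
snr_gain_db = 17.81 - 15.79                 # roughly 2.02 dB
```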

 

video patches

atom dictionary, OMP

atom high resolution, atom low resolution, Zero tree OMP

http://high_speed_video.colostate.edu/

Slide23

Video compressive sensing: Real data

Coded image

Recovered video using 10,000 dictionary atoms: Time = 3 minutes

Recovered video using zero tree OMP: Time = 45 s

Thanks to Dengyu Liu and Yasunobu Hitomi for sharing the data they collected with us.

Slide24

Accuracy plots for images

Image denoising metrics

Image inpainting with randomly deleted pixels. N/M represents the number of pixels recovered for each known pixel value.

Slide25

Accuracy plots for videos

Video denoising metrics

Video compressive sensing. N/M represents the number of frames recovered from each coded image.

Slide26

Summary

[Diagram: signal, dictionaries, and sparse representation at two scales, linked by downsampling; zero tree structure of sparse coefficients]

Novel signal model inspired by the wavelet zero tree

speedup

 

Appealing for high dimensional signals like videos

Questions?

Slide27

Thank you

Slide28

Extra slides

Slide29

Model comparison table

Signal class  N         N_low     T_low  T_high  K_low  K_high  Speedup  Model accuracy (dB)  K-SVD accuracy (dB)
Images        8x8       4x4       64     1024    8      8       4.10     20.67                21.98
Images        24x24x3   12x12x3   512    8192    8      8       22.6     19.64                20.57
Videos        8x8x16    4x4x8     512    8192    16     16      15.87    22.62                24.09
Videos        8x8x16    4x4x8     512    8192    14     16      15.80    22.75                24.09
Videos        8x8x32    4x4x16    512    8192    16     16      23.81    20.72                21.36
Videos        8x8x16    4x4x8     512    16384   16     16      16.89    21.84                23.27

Slide30

OMP – Computational Requirements

Per-iteration costs

Forming proxy: O(MN)

Finding closest atom: O(N)

Least squares (MxK system): O(MK^2)

Dominant: O(MN + MK^2)

Total cost (K iterations): O(MNK + MK^3)
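The total cost above can be turned into a rough operation count. In this sketch `omp_cost` follows the slide's expression with constants dropped, reading M as the signal dimension, N as the number of atoms, and K as the sparsity (an assumed reading of the symbols). `zero_tree_omp_cost` is an assumption for illustration: it supposes the high resolution search only visits the children of the K_low selected parents, with children split evenly among parents. The slides' own speedup formula is not in the transcript.

```python
def omp_cost(M, N, K):
    """Total OMP cost from the slide, O(MNK + MK^3), with constants dropped."""
    return M * N * K + M * K**3

def zero_tree_omp_cost(M_low, N_low, K_low, M_high, N_high, K_high):
    """Assumed cost: full OMP at low resolution, then OMP at high resolution
    over only the children of the K_low selected parent atoms."""
    children_per_parent = N_high // N_low
    candidates = K_low * children_per_parent
    return omp_cost(M_low, N_low, K_low) + omp_cost(M_high, candidates, K_high)

# Toy sizes (hypothetical): 64-dim signals with 1024 atoms at high resolution,
# 16-dim signals with 64 atoms at low resolution, 8 non-zeros at each scale.
full = omp_cost(64, 1024, 8)
tree = zero_tree_omp_cost(16, 64, 8, 64, 1024, 8)
estimated_speedup = full / tree
```

This counts arithmetic only, so measured run times can differ from the estimate.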

Slide31

Zero tree OMP – algorithm

Slide32

Zero tree OMP speed up computation

Computation time for low resolution

Computation time for high resolution

 

Slide33

Zero tree OMP speed up computation

Computation time for OMP

Speedup

Simple case