Uploaded On 2017-05-02


Motion Estimation

ECE 569 – Spring 2010

Toan Nguyen

Shikhar Upadhaya

Outline

What is new with motion estimation

Four Step Search and Hexagon Search Algorithms

Parallelization strategies

Results and discussions

What is new with motion estimation?

The familiar way – Full search

Full search is exhaustive, so it is not efficient

Some of the most popular fast search algorithms:

Diamond search

Hexagon search

Three-step search

Four-step search

Orthogonal search

And many more

So what is the best?

There is a trade-off between the run time and the accuracy.

Full search is the most accurate because it checks every candidate position, but it requires more time.

Fast search is faster, but accuracy is reduced because only a subset of the candidate positions is checked.
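A minimal sketch of the full search baseline, assuming SAD (sum of absolute differences) as the matching cost and frames stored as nested lists. The names `sad`, `full_search`, `cur`, `ref`, and the synthetic frames are illustrative and not taken from the slides.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n block of `cur` at (bx, by) and the block of
    `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n, r):
    """Check every displacement in [-r, r] x [-r, r] that stays inside `ref`.

    Returns ((best_sad, dx, dy), number_of_points_checked).
    """
    h, w = len(ref), len(ref[0])
    best = None
    checked = 0
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            if not (0 <= bx + dx <= w - n and 0 <= by + dy <= h - n):
                continue  # displaced block would fall outside the frame
            cost = sad(cur, ref, bx, by, dx, dy, n)
            checked += 1
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best, checked


# Synthetic frames (assumption): `cur` is `ref` shifted by (2, 1).
W = H = 24
ref = [[x * x + y * y for x in range(W)] for y in range(H)]
cur = [[(x + 2) ** 2 + (y + 1) ** 2 for x in range(W)] for y in range(H)]

best, checked = full_search(cur, ref, bx=8, by=8, n=4, r=7)
print(best, checked)
```

With a search range of 7 in each direction, the exhaustive scan evaluates all 15 x 15 = 225 positions for a single block, which is the cost the fast algorithms try to avoid.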

We implemented two of the most popular fast search algorithms for comparison:

Four Step Search

Hexagon Search

Four Step Search Algorithm

Step 1: A minimum BDM point is found from a nine-checking-point pattern on a 5 x 5 window located at the center of the 15 x 15 search area. If the minimum BDM point is found at the center of the search window, go to Step 4; otherwise go to Step 2.

Step 2: The search window size is maintained at 5 x 5. However, the search pattern depends on the position of the previous minimum BDM point.

If the previous minimum BDM point is located at a corner of the previous search window, five additional checking points as shown in Fig. 2(b) are used.

If the previous minimum BDM point is located at the middle of the horizontal or vertical axis of the previous search window, three additional checking points as shown in Fig. 2(c) are used.

If the minimum BDM point is found at the center of the search window, go to Step 4; otherwise go to Step 3.

Step 3: The search pattern strategy is the same as in Step 2, but it always proceeds to Step 4 afterwards.

Step 4: The search window is reduced to 3 x 3 as shown in Fig. 2(d), and the overall motion vector is taken to be the minimum BDM point among these nine search points.

Four Step Search Example
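The four steps can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: `cost` is a hypothetical callable that returns the BDM (for example, the SAD of a block) for a displacement `(dx, dy)`, and a cache stands in for the "additional checking points" rule, since only points not evaluated before are computed.

```python
def four_step_search(cost, r=7):
    """Four Step Search over displacements within [-r, r] in each direction.

    `cost(dx, dy)` is a hypothetical callable returning the BDM of a
    displacement. Returns (motion_vector, number_of_points_evaluated).
    """
    cache = {}

    def bdm(p):
        # Evaluate each checking point at most once.
        if p not in cache:
            cache[p] = cost(p[0], p[1])
        return cache[p]

    def best_around(center, step):
        # Nine checking points: the center plus its eight neighbours at
        # distance `step` (step=2 gives the 5 x 5 window, step=1 the 3 x 3).
        cands = [(center[0] + i * step, center[1] + j * step)
                 for j in (-1, 0, 1) for i in (-1, 0, 1)]
        cands = [p for p in cands if abs(p[0]) <= r and abs(p[1]) <= r]
        # On ties the center wins, so the search does not drift.
        return min(cands, key=lambda p: (bdm(p), p != center))

    center = (0, 0)
    for _ in range(3):              # Steps 1 to 3 use the 5 x 5 window
        best = best_around(center, 2)
        if best == center:          # minimum at the center: go to Step 4
            break
        center = best
    return best_around(center, 1), len(cache)   # Step 4: 3 x 3 window


# Toy cost surface (assumption) with its minimum at displacement (3, -2).
mv, checked = four_step_search(lambda dx, dy: (dx - 3) ** 2 + (dy + 2) ** 2)
print(mv, checked)
```

On this toy surface the first step moves to the corner point (2, -2), the second step adds five new checking points and stays put, and the final 3 x 3 step lands on (3, -2) after 22 evaluations in total.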

Hexagon Search Algorithm

Step 1: The large hexagon with seven checking points is centered at the center of a predefined search window in the motion field. If the MBD point is found to be at the center of the hexagon, proceed to Step 3; otherwise, proceed to Step 2.

Step 2: With the MBD point from the previous search step as the center, a new large hexagon is formed. Three new candidate points are checked, and the MBD point is again identified. If the MBD point is still the center point of the newly formed hexagon, go to Step 3; otherwise, repeat this step.

Step 3: Switch the search pattern from the large hexagon to the small one. The four points covered by the small hexagon are evaluated and compared with the current MBD point. The new MBD point is the final solution for the motion vector.

Hexagon Search Example
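A comparable sketch of the three hexagon search steps. The offsets used for the large and small hexagons are an assumption, since the slides do not list them; they follow the commonly used hexagon-based search layout. `cost` is again a hypothetical callable returning the block distortion for a displacement.

```python
# Assumed pattern offsets; the slides do not list them.
LARGE_HEXAGON = [(2, 0), (-2, 0), (1, 2), (-1, 2), (1, -2), (-1, -2)]
SMALL_HEXAGON = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def hexagon_search(cost, r=7):
    """Hexagon search over displacements within [-r, r] in each direction.

    `cost(dx, dy)` is a hypothetical callable returning the block distortion.
    Returns (motion_vector, number_of_points_evaluated).
    """
    cache = {}

    def distortion(p):
        # Evaluate each checking point at most once.
        if p not in cache:
            cache[p] = cost(p[0], p[1])
        return cache[p]

    def best_of(center, offsets):
        cands = [center] + [(center[0] + ox, center[1] + oy)
                            for ox, oy in offsets]
        cands = [p for p in cands if abs(p[0]) <= r and abs(p[1]) <= r]
        # min() keeps the first minimum, so the center wins ties.
        return min(cands, key=distortion)

    center = (0, 0)
    while True:                     # Steps 1 and 2: move the large hexagon
        best = best_of(center, LARGE_HEXAGON)
        if best == center:
            break
        center = best
    return best_of(center, SMALL_HEXAGON), len(cache)   # Step 3


# Toy cost surface (assumption) with its minimum at displacement (3, -2).
mv, checked = hexagon_search(lambda dx, dy: (dx - 3) ** 2 + (dy + 2) ** 2)
print(mv, checked)
```

Because evaluated points are cached, each move of the large hexagon costs exactly three new evaluations, matching Step 2. On this toy surface the search reaches (3, -2) after 17 evaluations.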

Design Implementation

Parallelization is possible by dividing the image into small sub-image partitions.

Each thread works on a sub-image independently using the chosen algorithm (i.e., Four Step Search or Hexagon Search).

At the end, the minimum SAD values of all sub-images are compared to obtain the final minimum SAD and avoid settling on a local minimum.
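The partition-then-reduce design can be sketched on the CPU as follows. Several things here are assumptions rather than details from the slides: Python threads stand in for GPU threads, a plain exhaustive scan inside each sub-image stands in for the Four Step or Hexagon search, and each candidate position is assigned to the sub-image that contains its top-left corner.

```python
from concurrent.futures import ThreadPoolExecutor


def sad_at(image, window, x0, y0):
    """SAD between `window` and the same-sized patch of `image` at (x0, y0)."""
    n = len(window)
    return sum(abs(image[y0 + y][x0 + x] - window[y][x])
               for y in range(n) for x in range(n))


def search_tile(image, window, tx, ty, tile):
    """Best (sad, x, y) among positions whose top-left lies in one sub-image."""
    h, w, n = len(image), len(image[0]), len(window)
    best = None
    for y0 in range(ty, min(ty + tile, h - n + 1)):
        for x0 in range(tx, min(tx + tile, w - n + 1)):
            cand = (sad_at(image, window, x0, y0), x0, y0)
            if best is None or cand < best:
                best = cand
    return best  # None if no valid position falls in this sub-image


def parallel_min_sad(image, window, tile):
    """One worker per sub-image, then a final comparison of the local minima."""
    h, w = len(image), len(image[0])
    origins = [(tx, ty) for ty in range(0, h, tile) for tx in range(0, w, tile)]
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(
            lambda o: search_tile(image, window, o[0], o[1], tile), origins))
    return min(res for res in results if res is not None)


# Synthetic data (assumption): the window is cut from the image at (9, 5).
image = [[x * x + y * y for x in range(16)] for y in range(16)]
window = [row[9:13] for row in image[5:9]]
print(parallel_min_sad(image, window, tile=8))
```

The final `min` over the per-sub-image results is the comparison step described above: each worker only knows its own local minimum, and the global minimum emerges once all of them are compared.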

Implementation Notes

Since the number of threads we use is a multiple of 2, if the number of sub-images is not a multiple of 2, we pad the image with additional rows and columns and ignore the results from the extra sub-images.
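One way to read the padding rule, sketched under assumptions: "multiple of 2" is taken to mean an even number of sub-images along each axis, sub-images are square with side `tile`, and the fill value is 0. None of these specifics are stated in the slides, and the function names are illustrative.

```python
def padded_tile_grid(h, w, tile):
    """Number of sub-images along each axis, rounded up to an even count."""
    tiles_y = -(-h // tile)         # ceiling division
    tiles_x = -(-w // tile)
    tiles_y += tiles_y % 2          # add one extra row of tiles if odd
    tiles_x += tiles_x % 2          # add one extra column of tiles if odd
    return tiles_y, tiles_x


def pad_image(image, tile, fill=0):
    """Pad with extra rows and columns; also return the origins of the
    sub-images that overlap the original image, as (row, column) pairs."""
    h, w = len(image), len(image[0])
    tiles_y, tiles_x = padded_tile_grid(h, w, tile)
    ph, pw = tiles_y * tile, tiles_x * tile
    padded = [list(row) + [fill] * (pw - w) for row in image]
    padded += [[fill] * pw for _ in range(ph - h)]
    # Results from any other sub-image consist purely of padding
    # and are ignored.
    real = [(ty * tile, tx * tile)
            for ty in range(tiles_y) for tx in range(tiles_x)
            if ty * tile < h and tx * tile < w]
    return padded, real


image = [[1] * 36 for _ in range(20)]      # 20 rows x 36 columns
padded, real = pad_image(image, tile=8)
print(len(padded), len(padded[0]), len(real))
```

For a 20 x 36 image with 8 x 8 sub-images this yields a 32 x 48 padded image with 24 sub-images, of which 15 overlap the original data.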

When comparing run times for the performance analysis, we excluded the time it takes to read the text files and store the data in the window and image arrays.

Simulation Results

First, we varied the number of threads per block to find the configuration that gives the best run time.

256 threads/block gives the best performance.
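With one thread per sub-image, as in the design above, the number of blocks follows from the sub-image count. A small sketch of that arithmetic; the sub-image counts used below are hypothetical, since the slides do not state the sub-image size.

```python
def launch_config(num_subimages, threads_per_block=256):
    """Blocks needed for one thread per sub-image, plus idle threads
    in the last block."""
    blocks = -(-num_subimages // threads_per_block)   # ceiling division
    idle = blocks * threads_per_block - num_subimages
    return blocks, idle


# Hypothetical: a 4096x4096 image cut into 16x16 sub-images.
print(launch_config((4096 // 16) ** 2))
# Hypothetical: 1000 sub-images do not fill the last block.
print(launch_config(1000))
```

When the count is not a multiple of the block size, the surplus threads in the last block have no sub-image to process and would need to exit early.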

Simulation Results (cont.)

The run times of the serial and parallel versions of the different algorithms were collected and compared to see what performance improvement we achieved.

We only see a performance improvement when the image size is 256x256 or bigger. For any smaller image, the parallel version actually decreases performance.

Simulation Results (cont.)

So how much speed up do we get and which algorithm is better, Full Search, Four Step Search, or Hexagon Search?

Simulation Results (cont.)

Overall performance

Image size   Full_Serial  Full_Parallel  4SS_Serial  4SS_Parallel  Hexagon_Serial  Hexagon_Parallel
16X16        0            0              0           0.016         0               0.078
32X32        0            0.016          0           0.015         0               0.047
64X64        0.01         0.016          0.01        0.015         0.01            0.062
128X128      0.02         0.016          0.01        0.015         0.01            0.062
256X256      0.09         0.031          0.02        0.016         0.02            0.047
512X512      0.41         0.078          0.06        0.016         0.06            0.063
1024X1024    1.64         0.265          0.236       0.032         0.22            0.062
2048X2048    6.56         0.922          0.87        0.047         0.85            0.078
4096X4096    26.29        3.719          3.38        0.11          3.3             0.157
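The speedups implied by these measurements can be computed directly. The sketch below copies the numbers from the table; the helper names `speedup` and `crossover` are illustrative, and the time unit is left unstated because the slides do not give one.

```python
SIZES = [16, 32, 64, 128, 256, 512, 1024, 2048, 4096]

# (serial, parallel) run times per algorithm, copied from the table.
RUNTIME = {
    "Full": ([0, 0, 0.01, 0.02, 0.09, 0.41, 1.64, 6.56, 26.29],
             [0, 0.016, 0.016, 0.016, 0.031, 0.078, 0.265, 0.922, 3.719]),
    "4SS": ([0, 0, 0.01, 0.01, 0.02, 0.06, 0.236, 0.87, 3.38],
            [0.016, 0.015, 0.015, 0.015, 0.016, 0.016, 0.032, 0.047, 0.11]),
    "Hexagon": ([0, 0, 0.01, 0.01, 0.02, 0.06, 0.22, 0.85, 3.3],
                [0.078, 0.047, 0.062, 0.062, 0.047, 0.063, 0.062, 0.078, 0.157]),
}


def speedup(name, size):
    """Serial time divided by parallel time, or None if it was measured as 0."""
    serial, parallel = RUNTIME[name]
    i = SIZES.index(size)
    return serial[i] / parallel[i] if parallel[i] > 0 else None


def crossover(name):
    """Smallest image size at which the parallel version is faster."""
    serial, parallel = RUNTIME[name]
    for size, s, p in zip(SIZES, serial, parallel):
        if s > p:
            return size
    return None


for name in RUNTIME:
    print(name, round(speedup(name, 4096), 2), crossover(name))
```

At 4096x4096 the measured speedups are roughly 7.1x for Full Search, 30.7x for Four Step Search, and 21.0x for Hexagon Search. In these numbers the parallel version first becomes faster at 128x128 for Full Search, 256x256 for Four Step Search, and 1024x1024 for Hexagon Search.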

Simulation Results (cont.)

Performance comparison between NVIDIA 8400 GS and 9800 GT GPUs.

Simulation Results (cont.)

Distortion measurement (motion estimation quality).

Result Analysis Summary

1. The parallel versions of motion estimation only improve performance when the image is large (256x256 or bigger). Smaller images reduce performance.

Larger image ~ greater speedup

2. Fast search algorithms outperform the full search algorithm, hence “fast”.

3. Parallelization of Four Step Search gives it a slight edge over Hexagon Search.

4. The distortion we see with the two fast search algorithms is similar.

Result Conclusions

Based on the data collected from the different algorithms, Four Step Search gives slightly better performance than Hexagon Search, while the distortion is very similar.

Hence, Four Step Search is a better fast search algorithm than Hexagon Search.

Run motion estimation algorithms on the GPU only if the image size is 256x256 or larger. Smaller images should be run serially on the CPU.

Limitations

The image and window files contain random data.

The implementation does not make use of shared memory.

Other parallelization strategy

After each step, the SAD of the new checking points has to be computed. We can parallelize by having threads compute the SADs of all the points in the sub-image.

Then, when a step completes and the SADs for the new checking points are needed, they have already been computed by the threads in the previous step.
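A CPU sketch of this alternative, assuming Python threads in place of GPU threads and a dictionary as the precomputed SAD table; the names are illustrative. Once the table exists, every checking point a search step needs becomes a lookup.

```python
from concurrent.futures import ThreadPoolExecutor


def sad_at(image, window, x0, y0):
    """SAD between `window` and the same-sized patch of `image` at (x0, y0)."""
    n = len(window)
    return sum(abs(image[y0 + y][x0 + x] - window[y][x])
               for y in range(n) for x in range(n))


def precompute_sad_map(image, window):
    """Compute the SAD of every valid position up front, one task per point."""
    h, w, n = len(image), len(image[0]), len(window)
    positions = [(x0, y0)
                 for y0 in range(h - n + 1) for x0 in range(w - n + 1)]
    with ThreadPoolExecutor() as pool:
        values = list(pool.map(
            lambda p: sad_at(image, window, p[0], p[1]), positions))
    return dict(zip(positions, values))


# Synthetic data (assumption): the window is cut from the image at (3, 4).
image = [[x * x + y * y for x in range(12)] for y in range(12)]
window = [row[3:7] for row in image[4:8]]

sad_map = precompute_sad_map(image, window)
# A search step now reads its checking points instead of computing them.
print(sad_map[(3, 4)], min(sad_map, key=sad_map.get), len(sad_map))
```

Filling the table evaluates every position, which is the same amount of SAD work as a full search, while a fast search only ever reads a small fraction of the entries.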

Drawback of this strategy:

Not getting a considerable amount of speedup

Lots of data transfer between host and device

More complicated implementation

References

Deepak Turaga, Mohamed Alkanhal. "Search Algorithms for Block-Matching in Motion Estimation". ECE - CMU. March 06, 2010 <http://www.ece.cmu.edu/~ee899/project/deepak_mid.htm>.

Lai-Man Po, Wing-Chung Ma. "A Novel Four-Step Search Algorithm for Fast Block Motion Estimation". June 1996.

Xuan Jing, Lap-Pui Chau. "An Efficient Three-Step Search Algorithm for Block Motion Estimation". IEEE Transactions on Multimedia, June 2004: 435-437.

Chen Lu, Wang. "Diamond Search Algorithm". ECE, U of Texas. March 06, 2010 <http://users.ece.utexas.edu/~bevans/courses/ee381k/projects/fall98/chen-lu-wang/presentation/sld012.htm>.

Questions?