Slide1
Image Compression System
Megan Fuller and Ezzeldin Hamed
Slide2
Transforms of Images
Original Image
Image Reconstructed from 25% of DFT coefficients
Magnitude of the DFT of (Image - 128); without the 128 offset the DC component would be about 8e6
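The reconstruction from a fraction of the DFT coefficients can be imitated in software. The following is a minimal sketch, assuming NumPy and a synthetic 64x64 test image; the helper `keep_largest` and the image itself are hypothetical, and the threshold is derived here from a target fraction rather than supplied by a user.

```python
import numpy as np

def keep_largest(coeffs, fraction):
    """Zero all but (roughly) the largest-magnitude `fraction` of coefficients."""
    magnitudes = np.abs(coeffs).ravel()
    keep = max(1, int(round(fraction * magnitudes.size)))
    threshold = np.sort(magnitudes)[-keep]      # magnitude of the keep-th largest
    return np.where(np.abs(coeffs) >= threshold, coeffs, 0)

# Hypothetical test image: a smooth gradient plus a brighter square.
n = 64
rows, cols = np.mgrid[0:n, 0:n]
image = (rows + cols).astype(float)
image[16:32, 16:32] += 100.0

# Subtract 128 before transforming so the DC term does not dominate.
coeffs = np.fft.fft2(image - 128.0)
compressed = keep_largest(coeffs, 0.25)
reconstructed = np.real(np.fft.ifft2(compressed)) + 128.0

mse = np.mean((image - reconstructed) ** 2)
print(f"kept {np.count_nonzero(compressed)} of {coeffs.size} coefficients")
print(f"MSE = {mse:.3f}")
```

Ties at the threshold can keep slightly more than the requested fraction, which is acceptable for an illustration.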
Slide3
The 2D Discrete Fourier Transform
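$$X(k_1, k_2) = \sum_{n_1=0}^{N-1} \sum_{n_2=0}^{N-1} x(n_1, n_2)\, W_N^{n_1 k_1} W_N^{n_2 k_2}$$
Where $W_N = e^{-j 2\pi / N}$
This can be computed separably by rearranging:
$$X(k_1, k_2) = \sum_{n_1=0}^{N-1} W_N^{n_1 k_1} \left[ \sum_{n_2=0}^{N-1} x(n_1, n_2)\, W_N^{n_2 k_2} \right]$$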
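Separable computation amounts to a 1D DFT over every row followed by a 1D DFT over every column of the result. A minimal sketch with NumPy, where `dft2_separable` is an illustrative name and `np.fft.fft` stands in for the hardware FFT:

```python
import numpy as np

def dft2_separable(image):
    """2D DFT as row-wise 1D DFTs followed by column-wise 1D DFTs."""
    after_rows = np.fft.fft(image, axis=1)   # transform each row
    return np.fft.fft(after_rows, axis=0)    # transform each column

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(8, 8)).astype(float)

# The separable result should agree with NumPy's direct 2D FFT.
print(np.allclose(dft2_separable(image), np.fft.fft2(image)))
```

For an N x N image this takes N row transforms plus N column transforms.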
Slide4
The 2D Discrete Cosine Transform
Computed separably
Computed as a DFT + 1 multiply
Generally gives better energy compaction than DFT
Slide5
High Level Architecture
Separable, in-place 2D DFT/DCT
Input Memory
Coefficient > Threshold?
Output Module (sending data to PC)
The choice between DFT and DCT is made at compile time
The threshold is provided by the user at run time
Slide6
What’s Interesting?
Reducing the computation required
Sharing resources in the DCT case
Some memory organization tricks
Reducing bit width
Slide7
Number of FFTs
Using FFT to calculate the 1D-DFT
We need 2N FFTs (N for the rows, N for the columns) to calculate the 2D-DFT
Can we reduce the number of FFTs?
Slide8
Reduction for the DFT case
Using the DFT properties
Input is real
Output is symmetric
Combining rows
Even/Odd decomposition
[Diagram: 4x4 array of samples S00 ... S33]
N/2 FFTs of the rows, followed by Even/Odd decomposition
Output is symmetric (discard half the columns)
N/2 FFTs of the columns
Total of N FFT computations
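Combining rows rests on a standard DFT property: two real sequences can be packed into the real and imaginary parts of one complex input, and their individual DFTs recovered from the conjugate-symmetric (even) and conjugate-antisymmetric (odd) parts of the result. A minimal sketch, assuming NumPy; the function name is illustrative and this is not the authors' hardware implementation:

```python
import numpy as np

def fft_two_real_rows(a, b):
    """Return the DFTs of two real rows using a single complex FFT."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n = len(a)
    z = np.fft.fft(a + 1j * b)                    # one FFT for both rows
    z_rev = np.conj(z[(-np.arange(n)) % n])       # conj(Z(-k mod N))
    fa = 0.5 * (z + z_rev)                        # conjugate-symmetric part
    fb = -0.5j * (z - z_rev)                      # conjugate-antisymmetric part / j
    return fa, fb

rng = np.random.default_rng(1)
row0 = rng.integers(0, 256, size=16)
row1 = rng.integers(0, 256, size=16)
f0, f1 = fft_two_real_rows(row0, row1)
print(np.allclose(f0, np.fft.fft(row0)), np.allclose(f1, np.fft.fft(row1)))
```

Processing the rows in pairs this way is what brings the row pass down from N FFTs to N/2.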
Slide9
Reduction in the DCT case
Again combining the rows in the same way as in the DFT case (N/2 FFTs)
Even/Odd decomposition, then an extra multiplication to calculate the DCT
[Diagram: 4x4 array of samples, rows shown in the order S1x, S0x, S3x, S2x, with Real and Imag labels]
Results are not symmetric
But the DCT is real
We can combine the columns the same way we combined the rows (N/2 FFTs)
The same multiplier inside the FFT is used
Another Even/Odd decomposition is required here, with an extra complex multiplier
Total of N FFT computations + a few extra multiplications
Slide10
In-Place Radix-4 FFT
Critical path
Fixed point arithmetic
Bit width?
Quantization noise
Rounding instead of truncation
Avoid any overflow in the additions
Needs extra bits
Can we do better?
Slide11
Static Scaling Vs. Dynamic Scaling
Static scaling:
Shift when you expect an overflow
Shift after each addition
The location of the fraction point is fixed at each computation step
Almost no overhead compared to fixed point
Higher effective bit width only in the first computation steps
No effect on the critical path
Dynamic scaling:
Shift only when overflow occurs
Track overflows and account for them
The location of the fraction point is the same for each 1D-FFT frame
Needs simple circuitry to track the overflow and shift when required
Effective bit width depends on the data
No effect on the critical path
Slide12
Design Space Explored
[Diagram: design space tree. Dynamic scaling (yes/no) x transform (DFT/DCT) x bit width (8, 12, 16)]
8 bits with dynamic scaling considered later
8 bits without dynamic scaling (and 12 for DCT) perform too poorly to be considered
With dynamic scaling, 12 bits does as well as 16 bits in the DFT
Slide13
Dynamic Scaling of DFT
50% of the coefficients is sufficient for perfect reconstruction because of the symmetry of the DFT
16 bits without dynamic scaling does as well as floating point
12 bits with dynamic scaling also does nearly as well as floating point
Slide14
Dynamic Scaling of DFT (continued)
The improvement in performance when dynamic scaling is used more than makes up for the reduced compression caused by having to save the scaling bits
12 bits with dynamic scaling does nearly as well as 16 bits
Slide15
DCT Vs. DFT
All cases use dynamic scaling
DCT provides better energy compaction
For DCT, 12 bits gives a lower MSE for a given compression ratio (this was not the case for the DFT).
Slide16
8 Bits
Image reconstructed from 50% of the DFT coefficients, computed with 8 bits, using dynamic scaling. MSE = 452.
Image reconstructed from 6% of the DFT coefficients, computed with 16 bits, MSE = 129.
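The dynamic scaling mentioned in the first caption can be illustrated with a small software model. The sketch below is only an assumption-laden illustration of the two scaling policies: it uses plain add/subtract butterflies (a Walsh-Hadamard transform) as a stand-in for FFT stages, so there are no twiddle multiplications, and the function names are hypothetical. Static scaling shifts after every stage; dynamic scaling shifts only when a value would overflow the word, and counts the shifts so they can be undone later.

```python
import numpy as np

def butterfly_stage(x, span):
    """One add/subtract butterfly stage (no twiddle factors)."""
    y = x.copy()
    for start in range(0, len(x), 2 * span):
        a = x[start:start + span]
        b = x[start + span:start + 2 * span]
        y[start:start + span] = a + b
        y[start + span:start + 2 * span] = a - b
    return y

def transform(x, bits, dynamic):
    """Return (mantissas, shifts); the represented value is mantissas * 2**shifts."""
    limit = 1 << (bits - 1)              # signed range is [-limit, limit - 1]
    x = np.asarray(x, dtype=np.int64)
    shifts = 0
    span = 1
    while span < len(x):
        x = butterfly_stage(x, span)
        overflow = x.max() >= limit or x.min() < -limit
        if (not dynamic) or overflow:
            x = (x + 1) >> 1             # round, then drop one bit
            shifts += 1
        span *= 2
    return x, shifts

rng = np.random.default_rng(2)
signal = rng.integers(-4, 4, size=64)                 # low-amplitude input
exact, _ = transform(signal, bits=62, dynamic=True)   # wide word: never shifts

for dynamic in (False, True):
    mantissas, shifts = transform(signal, bits=12, dynamic=dynamic)
    error = np.max(np.abs(mantissas * 2 ** shifts - exact))
    print(f"dynamic={dynamic}: shifts={shifts}, max error={error}")
```

For low-amplitude data the dynamic policy never needs to shift, so no precision is lost, whereas the static policy discards a bit at every stage regardless.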
Slide17
Physical Considerations
Transform | # of Bits | Dynamic Scaling? | Critical Path | Slice Registers | Slice LUTs | BRAM | DSP48Es
DFT | 16 | No | 11.458 ns | 16% | 23% | 29% | 7
DFT | 16 | Yes | 11.763 ns | 17% | 24% | 29% | 7
DFT | 12 | No | 11.273 ns | 15% | 22% | 24% | 7
DFT | 12 | Yes | 11.464 ns | 16% | 23% | 24% | 7
DFT | 8 | Yes | 11.287 ns | 15% | 22% | 18% | 6
DCT | 16 | Yes | 11.458 ns | 19% | 26% | 29% | 10
DCT | 12 | Yes | 11.273 ns | 18% | 25% | 24% | 10
DCT | 8 | Yes | 11.066 ns | 17% | 23% | 18% | 8
Critical path is about the same for all designs and could probably be improved with tighter synthesis constraints
Resource usage increases with bit width, the addition of dynamic scaling, and DCT, but overall doesn't change much
DCT uses extra DSP blocks because of the extra multiplication
Slide18
Latency
Component | Latency (clock cycles) | Potential Frame Rate with 50 MHz Clock
Initialization | 870,000 | -
DCT | 263,900 | 189 images/second
DFT | 262,200 | 191 images/second
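The frame rates follow directly from dividing the clock frequency by the latency in cycles. A short check, assuming the one-off initialization is excluded and that images are processed one after another without overlap:

```python
CLOCK_HZ = 50_000_000  # 50 MHz clock, as stated in the table

# Latencies in clock cycles, copied from the table above.
latency_cycles = {"DCT": 263_900, "DFT": 262_200}

def frames_per_second(cycles, clock_hz=CLOCK_HZ):
    """Images per second if each image takes `cycles` clock cycles."""
    return clock_hz / cycles

for name, cycles in latency_cycles.items():
    print(f"{name}: {frames_per_second(cycles):.1f} images/second")
```

Rounded to the nearest integer these give 189 and 191 images per second, matching the table.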
Slide19
Future Work
Use of DRAM to allow compression of larger images
Support for color images
Support for rectangular images of arbitrary edge length
Combining the DCT and DFT into a single core that could compute either transform, as selected by the user at runtime
Slide20
Relationship Between the DFT and the DCT
The N-point DFT of a sequence is the Fourier Series coefficients for that sequence made periodic with period N.
Slide21
Relationship Between the DFT and the DCT (continued)
The N-point DCT of a sequence is a twiddle factor multiplied by the first N Fourier Series coefficients of the 2N point sequence y(n) made periodic with period 2N.
y(n) = x(n) + x(2N-1-n)
[Figure: the sequence x(n)]
Slide22
Relationship Between the DFT and the DCT (continued)
The DCT can be computed from the DFT as follows:
Define the sequences
y(n) = x(n) + x(2N-1-n)
v(n) = y(2n)
Compute the N-point DFT of v(n), V(k)
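The steps above can be sketched in NumPy. The final step is not shown in the text above; under the unnormalized DCT-II convention assumed here, C(k) = sum over n of 2 x(n) cos(pi (2n+1) k / (2N)), it would be C(k) = 2 Re[W_4N^k V(k)] with W_4N = exp(-j 2 pi / (4N)). Both that convention and the function names are assumptions, not taken from the slides.

```python
import numpy as np

def dct_direct(x):
    """Reference DCT by direct summation (assumed convention, see lead-in)."""
    x = np.asarray(x, dtype=float)
    n_pts = len(x)
    n = np.arange(n_pts)
    k = n.reshape(-1, 1)
    basis = np.cos(np.pi * (2 * n + 1) * k / (2 * n_pts))
    return (2.0 * x * basis).sum(axis=1)

def dct_via_fft(x):
    """DCT from one N-point FFT plus one complex multiply per output."""
    x = np.asarray(x, dtype=float)
    n_pts = len(x)
    # v(n) = y(2n): even-indexed samples, then odd-indexed samples reversed.
    v = np.concatenate([x[0::2], x[1::2][::-1]])
    big_v = np.fft.fft(v)                                 # V(k)
    k = np.arange(n_pts)
    twiddle = np.exp(-1j * np.pi * k / (2 * n_pts))       # W_4N^k
    return 2.0 * np.real(twiddle * big_v)

rng = np.random.default_rng(3)
x = rng.integers(0, 256, size=16)
print(np.allclose(dct_via_fft(x), dct_direct(x)))
```

The single twiddle multiplication after the FFT corresponds to the "DFT + 1 multiply" description of the DCT.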
Slide23
Rounding
Design | MSE Decrease with Rounding
12 bits, no dynamic scaling, DFT | 20
16 bits, no dynamic scaling, DFT | 0
12 bits, dynamic scaling, DFT | 2
16 bits, no dynamic scaling, DCT | 0
12 bits, dynamic scaling, DCT | 2
16 bits, dynamic scaling, DCT | 0
Conclusion: Rounding never hurt and often helped. It is free in hardware (just a register initialization), so always use it. All subsequent results use rounding.
Slide24
Dynamic Scaling of DCT
Slide25
Dynamic Scaling of DCT (continued)
Slide26
Limitations of MSE
Image reconstructed from 5.7% of the DCT coefficients, computed with dynamic scaling. MSE = 193
Image reconstructed from 6.1% of the DCT coefficients, computed without dynamic scaling. MSE = 338
Slide27
Performance of 8 Bit Systems
Slide28
More Limitations of MSE
(Left) 8 bit DFT coefficients, computed with rounding. Compression ratio = 2.3, MSE = 869.
(Right) 8 bit DFT coefficients, computed without rounding. Compression ratio = 2.1, MSE = 664
(Left) 8 bit DCT coefficients, computed with rounding. Compression ratio = 2.2, MSE = 517.
(Right) 8 bit DCT coefficients, computed without rounding. Compression ratio = 2.4, MSE = 563
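The difference between rounding and truncation when low-order bits are discarded can be shown with integer arithmetic. This is a minimal sketch, assuming round-half-up implemented by adding half an LSB before the shift; the helper names are hypothetical, and whether the authors' rounding works exactly this way is an assumption.

```python
import numpy as np

def truncate(x, drop_bits):
    """Discard the low bits (rounds toward minus infinity)."""
    return x >> drop_bits

def round_half_up(x, drop_bits):
    """Add half of the discarded range first, then discard the low bits."""
    half = 1 << (drop_bits - 1)
    return (x + half) >> drop_bits

drop_bits = 4
scale = 1 << drop_bits
x = np.arange(0, 1 << 12, dtype=np.int64)   # every 12-bit unsigned value

# Errors measured in units of the original least significant bit.
trunc_err = truncate(x, drop_bits) * scale - x
round_err = round_half_up(x, drop_bits) * scale - x

print("mean error, truncation:", trunc_err.mean())
print("mean error, rounding:  ", round_err.mean())
print("mean squared error, truncation:", (trunc_err ** 2).mean())
print("mean squared error, rounding:  ", (round_err ** 2).mean())
```

The extra addition is a fixed constant, which is one way rounding can cost almost nothing in hardware, for example by preloading that constant into an accumulator.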