Slide1
ENEL653 Project
By Messaoud Mohamed Anis
Slide2
Outline
FFT divide and conquer algorithm
ADSP21469 FFT coprocessor
Background telemetry channels
Hum detector application
Slide3
Discrete Fourier Transform (DFT)
$X(k) = \sum_{n=0}^{N-1} x(n)\, W_N^{kn}, \quad k = 0, \dots, N-1$
$W_N = e^{-j 2\pi / N}$
Complexity:
N² complex multiplications
N(N-1) complex additions
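The operation counts above can be checked with a minimal Python sketch of the direct evaluation. The function name `dft_direct` and the way operations are counted (one complex multiplication per term, including the trivial n = 0 term) are assumptions made for illustration.

```python
import cmath

def dft_direct(x):
    """Direct DFT: X(k) = sum_n x(n) * W_N^(k*n), with W_N = exp(-2j*pi/N).

    Returns the spectrum together with the number of complex
    multiplications and additions that were performed.
    """
    N = len(x)
    X = []
    mults = 0
    adds = 0
    for k in range(N):
        # First term (n = 0) starts the accumulator: one multiplication.
        acc = x[0] * cmath.exp(-2j * cmath.pi * k * 0 / N)
        mults += 1
        for n in range(1, N):
            # Each remaining term costs one multiplication and one addition.
            acc += x[n] * cmath.exp(-2j * cmath.pi * k * n / N)
            mults += 1
            adds += 1
        X.append(acc)
    return X, mults, adds

X, mults, adds = dft_direct([1, 1, 1, 1])
print(mults, adds)
```

For N = 4 the counters give N² = 16 multiplications and N(N-1) = 12 additions.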
Slide4
FFT
Family of algorithms based on the symmetry and periodicity properties of the twiddle factors $W_N^{k} = e^{-j 2\pi k / N}$
Symmetry: $W_N^{k+N/2} = -W_N^{k}$
Periodicity: $W_N^{k+N} = W_N^{k}$
Slide5
Divide And Conquer
Decompose N into N = LM
Remap the 1D array into a 2D one
Two possible configurations:
Column wise: n = l + mL
Row wise: n = Ml + m
Slide6
Divide and conquer simplification
k = Mp + q
n = mL + l
Remap the initial array
Rewrite the DFT equation
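With the two index mappings above, the rewritten DFT can be sketched as follows. The slide's own equation is not reproduced in the text, so the notation $x(l,m) = x(mL+l)$, $X(p,q) = X(Mp+q)$ and $W_N = e^{-j 2\pi / N}$ is assumed here.

```latex
\begin{aligned}
X(p,q) &= \sum_{l=0}^{L-1} \sum_{m=0}^{M-1} x(l,m)\, W_N^{(Mp+q)(mL+l)} \\
W_N^{(Mp+q)(mL+l)} &= W_N^{MLmp}\, W_N^{mLq}\, W_N^{Mpl}\, W_N^{lq}
                    = W_M^{mq}\, W_L^{lp}\, W_N^{lq} \\
X(p,q) &= \sum_{l=0}^{L-1} \Big\{ W_N^{lq} \Big[ \sum_{m=0}^{M-1} x(l,m)\, W_M^{mq} \Big] \Big\}\, W_L^{lp}
\end{aligned}
```

The second line uses $W_N^{MLmp} = W_N^{Nmp} = 1$, $W_N^{mLq} = W_{N/L}^{mq} = W_M^{mq}$ and $W_N^{Mpl} = W_{N/M}^{pl} = W_L^{lp}$.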
Slide7
Divide and conquer result interpretation
M point DFT
Multiplication with special twiddles
L point DFT
Step 1: L FFTs with M points: LM² multiplications and LM(M-1) additions
Step 2: N multiplications
Step 3: M FFTs with L points: ML² multiplications and ML(L-1) additions
Complexity:
Operation | Direct evaluation | Divide and conquer
Multiplication | N² | N(M+L+1)
Addition | N(N-1) | N(M+L-2)
The algorithm can be reiterated on M and L to break the calculation into even smaller DFTs
Slide8
Divide and conquer algorithm formulation [1]
Remap the input into a 2D array
Calculate the DFT along the first dimension
Multiply the obtained data by the special twiddle factors
Calculate the DFT along the second dimension
Rearrange the output data
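The five steps above can be sketched in plain Python. This is an illustration only and not the accelerator's implementation: the names `dft`, `dft_divide_and_conquer` and `multiplication_counts` are hypothetical, the input is assumed to be stored column-wise (n = l + mL) and the output row-wise (k = Mp + q), and the sub-transforms are evaluated directly rather than with a radix-2 FFT.

```python
import cmath

def dft(x):
    """Direct DFT, used for the small sub-transforms and as a reference."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def dft_divide_and_conquer(x, L, M):
    """DFT of length N = L*M following the five steps listed above."""
    N = L * M
    assert len(x) == N
    # Step 1: remap the input into an L x M array, n = l + m*L.
    grid = [[x[l + m * L] for m in range(M)] for l in range(L)]
    # Step 2: M-point DFT along index m (one transform per value of l).
    F = [dft(row) for row in grid]
    # Step 3: multiply by the special twiddle factors W_N^(l*q).
    G = [[F[l][q] * cmath.exp(-2j * cmath.pi * l * q / N) for q in range(M)]
         for l in range(L)]
    # Step 4: L-point DFT along index l (one transform per value of q).
    cols = [dft([G[l][q] for l in range(L)]) for q in range(M)]
    # Step 5: rearrange the output, k = M*p + q.
    X = [0j] * N
    for q in range(M):
        for p in range(L):
            X[M * p + q] = cols[q][p]
    return X

def multiplication_counts(L, M):
    """Complex multiplications: (direct evaluation, divide and conquer)."""
    N = L * M
    return N * N, N * (M + L + 1)

x = [complex(n, -n) for n in range(12)]
reference = dft(x)
result = dft_divide_and_conquer(x, L=3, M=4)
print(max(abs(a - b) for a, b in zip(reference, result)))
print(multiplication_counts(32, 32))
```

The first printed value is the largest deviation from the direct DFT and should be at rounding-error level. The second shows the multiplication count for N = 1024 split as 32 x 32 when the sub-transforms are evaluated directly.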
Picture taken from [1]
Slide9
ADSP21469 FFT accelerator
Features:
Supports FFT sizes from 16 – … points.
Computes a radix-2 decimation-in-time algorithm with automated bit reversal.
Contains a data memory unit of 1024 32-bit words.
Contains a twiddle coefficient memory unit of 512 32-bit words.
Contains a compute block unit with four floating-point multipliers and six floating-point adders.
Slide10
FFT limitation
The size of the local memory is limited, so only FFTs of 256 points or fewer can be calculated internally.
For FFTs longer than 256 points, the calculation is broken into smaller ones using the divide and conquer approach, which decreases the performance of the accelerator.
Only the internal memory can be used to load and store data related to the FFT accelerator.
The FFT accelerator uses the PCLK signal as its clock source, which runs at half the core clock frequency.
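The 256-point limit above means a longer transform has to be split into two factors that each fit in local memory. The sketch below shows one way to pick such a split. The function name `split_fft_size` and the "most balanced pair" rule are assumptions for illustration. The 16-point minimum and the power-of-two restriction come from the accelerator features, and V and H follow the N = V×H notation of the timing analysis.

```python
def split_fft_size(n, max_points=256, min_points=16):
    """Return (V, H) with V * H == n, or None if n fits without splitting.

    Both factors are powers of two between min_points and max_points,
    and the most balanced pair with V <= H is chosen.
    """
    if n <= max_points:
        return None  # small enough to be computed in one pass
    best = None
    v = min_points
    while v <= max_points:
        h, rem = divmod(n, v)
        h_is_pow2 = h > 0 and (h & (h - 1)) == 0
        if rem == 0 and h_is_pow2 and min_points <= h <= max_points and v <= h:
            best = (v, h)  # a later (larger) v gives a more balanced pair
        v *= 2
    if best is None:
        raise ValueError("size cannot be split into two supported FFT sizes")
    return best

print(split_fft_size(512))
print(split_fft_size(1024))
```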
Slide11
FFT accelerator state machine
Idle state: used to program the accelerator's control registers.
Read state: the coefficients and input data are loaded into the local memory.
Processing state: the FFT is computed.
Write state: all the computed data is written out to internal memory.
Slide12
FFT accelerator control registers
FFTCTL1:
FFTCTL2:
FFTDMASTAT:
Slide13
FFT local data organisation
Unpacked (only if …) / packed data format
Coefficient format
Slide14
FFT Transfer control block configuration
The local memory is only accessible through DMA
Two DMA channels are reserved for the FFT core
The DMA channels are configured using transfer control blocks (TCB)
FFT transfer control block
The TCB registers:
act as a pointer to the next read or write location
provide the signed increment by which the DMA controller post-modifies the corresponding memory index register
indicate the number of words remaining to be transferred to or from memory
configure a DMA circular buffer
hold the starting address of the TCB for the next DMA operation on the corresponding channel
Slide15
TCB configuration for N less than 256
Bits | Name | Description
18-0 | IIx address | Address of next chain pointer. This address must be offset by 0x80000
19 | PCI | Program controlled interrupt (PCI): 0 = no interrupt after current TCB, 1 = interrupt after current TCB
20 | COEFFSEL | Coefficient select for next TCB: 0 = next TCB is data TCB, 1 = next TCB is coefficient TCB
TCB Index register
Slide16
FFT C-API header file
Slide17
The ConfHardFFT function
Setup FFTCTL2 register
Slide18
The RunFFT function
Slide19
Twiddle generation MATLAB function
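The MATLAB listing from this slide is not reproduced in the text. As a stand-in, the Python sketch below shows how the two kinds of coefficients used earlier could be generated: the radix-2 twiddles W_N^k and the special twiddles W_N^(lq) of the divide and conquer step. The function names and the nested-list layout are assumptions and do not describe the accelerator's coefficient memory format.

```python
import cmath

def fft_twiddles(n):
    """Twiddle factors W_n^k = exp(-2j*pi*k/n) for k = 0 .. n/2 - 1."""
    return [cmath.exp(-2j * cmath.pi * k / n) for k in range(n // 2)]

def special_twiddles(L, M):
    """Special twiddles W_N^(l*q) with N = L*M, indexed as table[l][q]."""
    n = L * M
    return [[cmath.exp(-2j * cmath.pi * l * q / n) for q in range(M)]
            for l in range(L)]

# Small example: real and imaginary parts for a 16-point transform.
for w in fft_twiddles(16):
    print(f"{w.real:+.6f} {w.imag:+.6f}")
```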
Slide20
Timing Analysis
Cycle count
FFT type | Data read | Coefficient read | Data write | Compute | Total
FFT accelerator N<=256 | 2N×2 | 2N×2 | 2N | | 10N+N
FFT accelerator N=V×H>256: | | | | | 28N+
Vertical FFT | 2N×2 | 2V×2 | 2N | |
Special prod | 2N×2 | 4N×2 | 2N | 2×4N/4 |
Horizontal FFT | 2N×2 | 2H×2 | 2N | |
Reads from internal memory take 2 cycles/word.
Writes to internal memory take 1 cycle/word.
It takes 2 cycles to compute a single complex butterfly by the FFT core.
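The transfer terms of the table can be reproduced from the per-word costs above. The sketch below is a rough check under stated assumptions: 2N words are counted for N complex points, the FFT compute cycles are left out because their expressions are missing from the table, and the function names are hypothetical.

```python
READ_CYCLES_PER_WORD = 2   # reads from internal memory
WRITE_CYCLES_PER_WORD = 1  # writes to internal memory

def transfer_cycles_small(n):
    """Data read + coefficient read + data write for N <= 256 (no compute)."""
    words = 2 * n  # assumed: one real and one imaginary word per point
    data_read = words * READ_CYCLES_PER_WORD
    coeff_read = words * READ_CYCLES_PER_WORD
    data_write = words * WRITE_CYCLES_PER_WORD
    return data_read + coeff_read + data_write  # equals 10 * n

def overhead_cycles_large(v, h):
    """Cycles for N = V * H > 256, excluding the FFT compute terms."""
    n = v * h
    vertical = 2 * n * 2 + 2 * v * 2 + 2 * n
    special = 2 * n * 2 + 4 * n * 2 + 2 * n + (2 * 4 * n) // 4
    horizontal = 2 * n * 2 + 2 * h * 2 + 2 * n
    return vertical + special + horizontal  # equals 28 * n + 4 * v + 4 * h

print(transfer_cycles_small(256))
print(overhead_cycles_large(16, 32))
```

The first function gives the 10N term of the N<=256 total, and the second gives 28N plus the two short coefficient reads for the split case.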
Slide21
Timing Analysis: measured cycles
FFT type | Software | Hardware | Speed ratio
256 points | 9319 cycles | 2623 cycles | 3.55
512 points | 38919 cycles | 5303 cycles | 7.33
1024 points | - | 11151 cycles | -
Slide22
Background telemetry channels
It is a debug feature enabling real-time data exchange with the processor via the JTAG interface without interrupting processor execution.
BTC uses an intermediary I/O buffer in which the application has to periodically put the data.
Slide23
How to use BTC step 1: adding necessary files
Add the BTC library file
Include BTC headers and signal.h
Slide24
How to use BTC step 2: BTC initialization
The BTC initialization is done by calling the function btc_init().
For the ADSP21469, the BTC commands from the host are processed via the low-priority emulator interrupt (EMULI), so the corresponding interrupt vector must be installed.
Slide25
How to use BTC step 3: Channel definition
Buffer used by the application
Intermediary buffer used by BTC
Channel name (less than 32 characters)
Channel starting address
Channel size in 32-bit words
Slide26
How to use BTC step 3: Data update
BTC data must be updated periodically by the application.
The update rate shouldn't be too high (a maximum of ~1.5 to 2 MBytes of data per second is supported).
Channel number
Buffer starting address
Buffer size in 32-bit words
Slide27
How to use BTC step 3: Display Data (BTC memory)
Slide28
How to use BTC step 3: plot Data
Use the intermediary buffers
Slide29
Hum detector: Initialization and update tasks
Initialization task
Periodic Task
Slide30
Hum detector: preemptive Task
Slide31
Hum detector : Screen Capture
Slide32
Hum detector : resolution
49+51Hz
49+50Hz
49+52Hz
49+62Hz
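The tone pairs listed above can be explored with a small Python sketch of spectral resolution. Everything specific here is an assumption for illustration: the sampling rate (256 Hz), the transform length (256 points, giving 1 Hz bins), the rectangular window, and the rule that two tones count as resolved when a bin between their peaks drops below half the smaller peak. The slide's actual settings are not given in the text.

```python
import cmath
import math

def magnitude_spectrum(x):
    """Magnitudes of the first half of the DFT of a real sequence."""
    n = len(x)
    return [abs(sum(x[i] * cmath.exp(-2j * cmath.pi * k * i / n) for i in range(n)))
            for k in range(n // 2)]

def tones_resolved(f1, f2, fs=256.0, n=256):
    """True if the two tones show up as separate peaks with a dip in between."""
    x = [math.sin(2 * math.pi * f1 * i / fs) + math.sin(2 * math.pi * f2 * i / fs)
         for i in range(n)]
    mag = magnitude_spectrum(x)
    b1, b2 = sorted((round(f1 * n / fs), round(f2 * n / fs)))
    if b2 - b1 < 2:
        return False  # adjacent or identical bins: no bin left for a dip
    dip = min(mag[b1 + 1:b2])
    return dip < 0.5 * min(mag[b1], mag[b2])

for pair in [(49, 51), (49, 50), (49, 52), (49, 62)]:
    print(pair, tones_resolved(*pair))
```

With these assumed settings the bin spacing is fs/n = 1 Hz, so only tones at least two bins apart can show a dip between their peaks.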
Slide33
References
[1] Proakis, Digital Communications, 5th edition, ISBN: 0-07-295716-6, 2008
[2] ADSP-214xx SHARC® Processor Hardware Reference, Analog Devices
[3] SHARC® Processor Programming Reference, Analog Devices