Folding and Pipelining complex combinational circuits

Uploaded On 2016-06-15

Presentation Transcript

Slide1

Folding and Pipelining complex combinational circuits

Arvind
Computer Science & Artificial Intelligence Lab
Massachusetts Institute of Technology

February 22, 2012

L05-1

http://csg.csail.mit.edu/6.S078

Slide2

Combinational IFFT: Suppose we want to reduce the area of the circuit

[Figure: 64-point combinational IFFT. Inputs in0 ... in63 pass through three stages, each made of 16 Bfly4 blocks followed by a Permute block, producing out0 ... out63.]

Folding: reuse the same circuit three times to reduce area

Slide3

Reusing a combinational block

[Figure: block diagrams built from f and g]

We expect:
- Throughput to decrease: less parallelism
- Area to decrease: reusing a block

The clock needs to run faster for the same throughput, which means a hyper-linear increase in energy.

Introduce state elements to hold intermediate values

Slide4

Circular or folded pipeline: Reusing the Pipeline Stage

[Figure: inputs in0 ... in63 feed one stage of Bfly4 blocks followed by a Permute block; the result loops back to the stage input under control of a Stage Counter and finally leaves as out0 ... out63.]

Slide5

Folded pipeline

rule folded_pipeline (True);
   if (stage==0)
      begin sxIn = inQ.first(); inQ.deq(); end
   else sxIn = sReg;
   sxOut = f(stage, sxIn);
   if (stage==n-1) outQ.enq(sxOut);
   else sReg <= sxOut;
   stage <= (stage==n-1) ? 0 : stage+1;
endrule

[Figure: folded pipeline datapath with inQ, f, sReg, stage and outQ]

Notes:
- no for-loop
- Need type declarations for sxIn and sxOut
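The behaviour of the folded rule can be checked with a small cycle-level model. The following is only a sketch in Python, not BSV: `step`, `run` and `toy_f` are hypothetical names, `outQ` is modelled as an unbounded list, and the only guard modelled is "inQ must hold data when stage is 0".

```python
from collections import deque

def step(stage, s_reg, in_q, out_q, f, n):
    """Model one clock cycle of the folded rule; returns the new (stage, s_reg)."""
    if stage == 0:
        if not in_q:              # inQ.first() is not ready, so nothing happens
            return stage, s_reg
        sx_in = in_q.popleft()    # inQ.first(); inQ.deq()
    else:
        sx_in = s_reg
    sx_out = f(stage, sx_in)
    if stage == n - 1:
        out_q.append(sx_out)      # outQ.enq(sxOut)
    else:
        s_reg = sx_out            # sReg <= sxOut
    stage = 0 if stage == n - 1 else stage + 1
    return stage, s_reg

def run(inputs, f, n, cycles):
    """Clock the model for a number of cycles and return what reached outQ."""
    in_q, out_q = deque(inputs), []
    stage, s_reg = 0, None
    for _ in range(cycles):
        stage, s_reg = step(stage, s_reg, in_q, out_q, f, n)
    return out_q

def toy_f(stage, x):
    """Hypothetical stage function: stage k adds k + 1."""
    return x + stage + 1

results = run([10, 20], toy_f, n=3, cycles=6)
```

With n = 3 each input occupies the single block for three cycles, so two inputs need six cycles to drain. That is the throughput cost of folding.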

Slide6

BSV Code for stage_f

function Vector#(64, Complex) stage_f
      (Bit#(2) stage, Vector#(64, Complex) stage_in);
   // Intermediate vectors (declarations were not shown on the slide)
   Vector#(64, Complex) stage_temp, stage_out;
   begin
      for (Integer i = 0; i < 16; i = i + 1)
         begin
            Integer idx = i * 4;
            let twid = getTwiddle(stage, fromInteger(i));
            // stage_in[idx:idx+3] is slide shorthand for elements idx .. idx+3
            let y = bfly4(twid, stage_in[idx:idx+3]);
            stage_temp[idx]   = y[0]; stage_temp[idx+1] = y[1];
            stage_temp[idx+2] = y[2]; stage_temp[idx+3] = y[3];
         end
      // Permutation
      for (Integer i = 0; i < 64; i = i + 1)
         stage_out[i] = stage_temp[permute[i]];
   end
   return(stage_out);
endfunction

Slide7

Folded pipeline: multiple rules
another way of expressing the same computation

rule foldedEntry if (stage==0);
   sReg <= f(stage, inQ.first()); stage <= stage+1;
   inQ.deq();
endrule

rule foldedCirculate if ((stage!=0) && (stage<(n-1)));
   sReg <= f(stage, sReg); stage <= stage+1;
endrule

rule foldedExit if (stage==n-1);
   outQ.enq(f(stage, sReg)); stage <= 0;
endrule

[Figure: folded pipeline datapath with inQ, f, sReg, stage and outQ]

Disjoint firing conditions

Slide8

Superfolded circular pipeline: use just one Bfly-4 node!

[Figure: inputs in0 ... in63 feed a single Bfly4 block and a Permute block, producing out0 ... out63. Labels: 64 2-way muxes; 4 16-way muxes; 4 16-way demuxes; Index: 0 to 15, with an "Index == 15?" check; Stage: 0 to 2.]

Lab 3

Slide9

Combinational IFFT

- Lot of area and long combinational delay
- Folded or multi-cycle version can save area and reduce the combinational delay, but throughput per clock cycle gets worse
- Pipelining: a method to increase the circuit throughput to evaluate multiple IFFTs

[Figure: the three-stage combinational IFFT, in0 ... in63 through Bfly4 (x16) and Permute blocks to out0 ... out63]

Slide10

Pipelining a block

- Combinational (C): inQ, f1, f2, f3, outQ
- Pipeline (P): inQ, f1, f2, f3, outQ
- Folded Pipeline (FP): inQ, f, outQ

Clock? Area? Throughput?
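One way to reason about the Clock and Throughput questions is an idealized back-of-the-envelope model. The sketch below assumes that register and mux overhead is zero and that the clock period is set by the longest combinational path; the stage delays are made-up numbers, and `compare` and `throughput` are hypothetical helper names. Area is left out because it depends on the synthesis results.

```python
def compare(stage_delays):
    """Idealized clock period and cycles per result for C, P and FP."""
    n = len(stage_delays)
    slowest = max(stage_delays)
    return {
        # C: all stages sit in one combinational path
        "C": {"clock_period": sum(stage_delays), "cycles_per_result": 1},
        # P: one stage per cycle, a new input can enter every cycle
        "P": {"clock_period": slowest, "cycles_per_result": 1},
        # FP: one stage per cycle, but the single block is busy for n cycles
        "FP": {"clock_period": slowest, "cycles_per_result": n},
    }

def throughput(entry):
    """Results per unit time for one row of the table."""
    return 1.0 / (entry["clock_period"] * entry["cycles_per_result"])

table = compare([2.0, 2.0, 2.0])  # hypothetical stage delays in ns
```

Under these assumptions P and FP can run at the same clock, C needs a clock three times slower, and FP ends up with the same throughput as C while P is three times higher.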

Slide11

Inelastic pipeline

[Figure: inQ, f0, sReg1, f1, sReg2, f2, outQ]

rule sync_pipeline (True);
   inQ.deq();
   sReg1 <= f0(inQ.first());
   sReg2 <= f1(sReg1);
   outQ.enq(f2(sReg2));
endrule

This rule can fire only if
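The firing condition can be explored with a small Python model. This is a sketch under stated assumptions, not the course's code: `InelasticPipeline`, `can_fire` and `out_capacity` are hypothetical names, the registers are assumed to reset to 0, and the guard is taken to be the usual implicit one (deq and first need a non-empty inQ, enq needs a non-full outQ).

```python
from collections import deque

class InelasticPipeline:
    """Cycle-level model of the single-rule pipeline."""

    def __init__(self, f0, f1, f2, out_capacity):
        self.f0, self.f1, self.f2 = f0, f1, f2
        self.in_q = deque()
        self.out_q = deque()
        self.out_capacity = out_capacity
        self.s_reg1 = 0   # assumed reset values
        self.s_reg2 = 0

    def can_fire(self):
        # Guard collected from inQ.first/deq and outQ.enq
        return len(self.in_q) > 0 and len(self.out_q) < self.out_capacity

    def cycle(self):
        if not self.can_fire():
            return False                      # no state changes at all
        x = self.in_q.popleft()
        old1, old2 = self.s_reg1, self.s_reg2  # read old values first
        self.s_reg1 = self.f0(x)
        self.s_reg2 = self.f1(old1)
        self.out_q.append(self.f2(old2))
        return True

# Hypothetical stage functions, chosen only to make the trace easy to follow
pipe = InelasticPipeline(lambda x: x + 1, lambda x: x * 2, lambda x: x - 3,
                         out_capacity=8)
pipe.in_q.extend([1, 2, 3])
fired = [pipe.cycle() for _ in range(5)]
```

In this trace the first two outputs come from the reset values of the registers, and once inQ runs dry the rule stops firing with the last two inputs still sitting in sReg1 and sReg2.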

Slide12

Stage functions f0, f1 and f2

function f0(x);
   return (stage_f(0,x));
endfunction

function f1(x);
   return (stage_f(1,x));
endfunction

function f2(x);
   return (stage_f(2,x));
endfunction

The stage_f function was given earlier

Slide13

Problem: What about pipeline bubbles?

[Figure: inQ, f0, sReg1, f1, sReg2, f2, outQ]

rule sync_pipeline (True);
   inQ.deq();
   sReg1 <= f0(inQ.first());
   sReg2 <= f1(sReg1);
   outQ.enq(f2(sReg2));
endrule

Slide14

Explicit encoding of Valid/Invalid data

rule sync_pipeline (True);
   if (inQ.notEmpty())
      begin
         sReg1 <= f0(inQ.first()); inQ.deq();
         sReg1f <= Valid;
      end
   else sReg1f <= Invalid;
   sReg2 <= f1(sReg1); sReg2f <= sReg1f;
   if (sReg2f == Valid) outQ.enq(f2(sReg2));
endrule

[Figure: inQ, f0, sReg1, f1, sReg2, f2, outQ, with a Valid/Invalid flag next to sReg1 and sReg2]
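A Python model of the flagged pipeline shows the bubbles draining. It is only a sketch: `run_with_valid_bits` and the three stage functions are hypothetical, outQ is an unbounded list, and, unlike the rule on the slide, f1 is skipped when sReg1 is Invalid because Python cannot apply it to an empty register.

```python
from collections import deque

def run_with_valid_bits(inputs, f0, f1, f2, cycles):
    """Clock the pipeline with explicit valid flags; returns the outQ contents."""
    in_q, out_q = deque(inputs), []
    s_reg1 = s_reg2 = None        # data registers, ignored while the flag is False
    s_reg1f = s_reg2f = False     # False models Invalid, True models Valid
    for _ in range(cycles):
        # Compute every next value from the current state, then update together
        if in_q:
            next1, next1f = f0(in_q.popleft()), True
        else:
            next1, next1f = s_reg1, False
        next2 = f1(s_reg1) if s_reg1f else s_reg2
        next2f = s_reg1f
        if s_reg2f:
            out_q.append(f2(s_reg2))
        s_reg1, s_reg1f, s_reg2, s_reg2f = next1, next1f, next2, next2f
    return out_q

# Hypothetical stage functions
def f0(x): return x + 1
def f1(x): return x * 2
def f2(x): return x - 3

drained = run_with_valid_bits([1, 2, 3], f0, f1, f2, cycles=5)
```

Three inputs followed by two empty cycles are enough to push every result out, and nothing is enqueued for the Invalid slots.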

Slide15

When is this rule enabled?

inQ   sReg1f   sReg2f   outQ
NE    V        V        NF
NE    V        V        F
NE    V        I        NF
NE    V        I        F
NE    I        V        NF
NE    I        V        F
NE    I        I        NF
NE    I        I        F
E     V        V        NF
E     V        V        F
E     V        I        NF
E     V        I        F
E     I        V        NF
E     I        V        F
E     I        I        NF
E     I        I        F

NE = not empty, E = empty, V = Valid, I = Invalid, NF = not full, F = full

Answers shown on the slide: Yes; Yes1 = yes but no change

rule sync_pipeline (True);
   if (inQ.notEmpty())
      begin
         sReg1 <= f0(inQ.first()); inQ.deq();
         sReg1f <= Valid;
      end
   else sReg1f <= Invalid;
   sReg2 <= f1(sReg1); sReg2f <= sReg1f;
   if (sReg2f == Valid) outQ.enq(f2(sReg2));
endrule

Slide16

Area estimates
Tool: Synopsys Design Compiler

                Combinational area   Noncombinational area
Comb. FFT       16536                9279
Linear FFT      20610                18558
Circular FFT    29330                11603

Surprising? Explanation?

Slide17

The Maybe type data in the pipeline

typedef union tagged {
   void Invalid;
   data_T Valid;
} Maybe#(type data_T);

[Figure: each register holds a data field and a valid/invalid tag]

Registers contain Maybe type values

rule sync_pipeline (True);
   if (inQ.notEmpty())
      begin
         sReg1 <= tagged Valid f0(inQ.first()); inQ.deq();
      end
   else sReg1 <= tagged Invalid;
   case (sReg1) matches
      tagged Valid .sx1: sReg2 <= tagged Valid f1(sx1);
      tagged Invalid: sReg2 <= tagged Invalid;
   endcase
   case (sReg2) matches
      tagged Valid .sx2: outQ.enq(f2(sx2));
   endcase
endrule
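The same idea can be mimicked in Python by letting each register hold either a payload or `None`. This is a loose analogy rather than a translation of the BSV: `None` stands in for `tagged Invalid`, so a payload can never itself be `None`, and `maybe_map`, `run_with_maybe` and the stage functions are hypothetical names.

```python
from collections import deque

def maybe_map(f, m):
    """Apply f to a Valid payload; pass Invalid (None) through unchanged."""
    return None if m is None else f(m)

def run_with_maybe(inputs, f0, f1, f2, cycles):
    """Clock a pipeline whose registers hold Maybe-like values."""
    in_q, out_q = deque(inputs), []
    s_reg1 = None                 # tagged Invalid
    s_reg2 = None
    for _ in range(cycles):
        next1 = f0(in_q.popleft()) if in_q else None
        next2 = maybe_map(f1, s_reg1)      # the first case expression
        if s_reg2 is not None:             # the second case expression
            out_q.append(f2(s_reg2))
        s_reg1, s_reg2 = next1, next2
    return out_q

# Hypothetical stage functions
def g0(x): return x + 1
def g1(x): return x * 2
def g2(x): return x - 3

maybe_out = run_with_maybe([1, 2, 3], g0, g1, g2, cycles=5)
```

Compared with separate flag registers, the tag and the data travel together, so a stage cannot read a payload without first checking that it is Valid.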