How does CLA (carry look-ahead adder) work?

Uploaded On 2022-05-14

Slide1

How does CLA (carry look-ahead adder) work?

Wei-jen Hsu

TA for EE457 at USC, Fall 2004

Modified in Fall 2005

Slide2

A simple one-bit full adder

[Figure: one-bit full adder block (+) with inputs A, B, Cin and outputs S, Cout]

It takes A, B, and Cin as input and generates S and Cout in 2 gate delays (SOP)
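As a rough illustration of the two-level SOP form mentioned above, here is a minimal Python sketch. The function name `full_adder` and the use of 0/1 integers for signals are assumptions made for this example, not part of the original slides.

```python
def full_adder(a: int, b: int, cin: int) -> tuple[int, int]:
    """One-bit full adder written as two-level sum-of-products (AND, then OR)."""
    na, nb, nc = 1 - a, 1 - b, 1 - cin  # complemented inputs
    # S is 1 when an odd number of inputs are 1 (four minterms).
    s = (na & nb & cin) | (na & b & nc) | (a & nb & nc) | (a & b & cin)
    # Cout is 1 when at least two inputs are 1 (three product terms).
    cout = (a & b) | (a & cin) | (b & cin)
    return s, cout
```

Each output is one layer of AND gates followed by one OR gate, which is where the 2 gate delays come from.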

Slide3

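4-bit RCA

[Figure: four one-bit full adders (+) chained from bit 0 to bit 3. Adder i takes Ai, Bi and Ci and produces Si and Ci+1; C0 is the carry in and C4 is the final carry out.]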

Work from lowest bit to highest bit sequentially.

With A0, B0, and C0, the lowest bit adder generates S0 and C1 in 2 gate delays.

With A1, B1, and C1 ready, the second bit adder generates S1 and C2 in 2 gate delays.

Each bit adder has to wait for the lower bit adder to propagate the carry.

Carry propagation forms a long sequential wait chain, hence RCA is slow!!
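The sequential wait chain can be sketched in Python as follows. The name `ripple_carry_add`, the list-of-bits representation (index 0 is the lowest bit), and the simple "2 gate delays per bit" counter are assumptions for illustration only.

```python
def ripple_carry_add(a_bits, b_bits, c0=0):
    """Add two equal-length bit lists (index 0 = lowest bit), one bit at a time."""
    carry = c0
    s_bits = []
    delay = 0
    for a, b in zip(a_bits, b_bits):
        # Each bit adder must wait for the carry from the bit below it.
        s_bits.append(a ^ b ^ carry)
        carry = (a & b) | (a & carry) | (b & carry)
        delay += 2  # 2 gate delays per bit, accumulated sequentially
    return s_bits, carry, delay
```

For 4-bit operands the loop runs four times, so the last carry appears after 8 gate delays in this model, and the count grows linearly with the operand width.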

Slide4

Observations

The critical component each bit adder waits for is the carry input.

Instead of generating and propagating carry bit-by-bit, can we generate all of them in parallel and break the sequential chain?

This is exactly the idea of CLA (carry look-ahead adder).

Slide5

Carry Look Ahead Logic

Now even before the carry in (Cin) is available, based on the inputs (A,B) only, can we say anything about the carry out?

Under what condition will the bit-cell propagate an outgoing carry (Cout), if there is an incoming carry (Cin)?

Under what condition will the bit-cell generate an outgoing carry (Cout), regardless of whether there is an incoming carry (Cin)?

Slide6

1-bit CLA adder (a primary cell)

[Figure: 1-bit CLA cell (+) with inputs A, B, Cin and outputs S, p, g]

Instead of Cout, a 1-bit CLA adder block takes inputs A, B and generates p, g.

p = propagate => I will propagate the Cin to the next bit. p = A+B (OR)

(If either A or B is 1, then Cin=1 causes Cout=1)

g = generate => I will generate a Cout independent of what Cin is. g = A.B (AND)

(If both A and B are 1, Cout=1 for sure)

Together they determine the carry out: Cout = g + p.Cin

p, g are generated in 1 gate delay after we have A, B. Note that Cin is not needed to generate p, g.

S is generated in 2 gate delays after we get Cin later (using SOP or XOR).
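A small Python sketch of the propagate and generate signals defined above. The function names are hypothetical; the check that `g | (p & cin)` matches the usual full-adder carry is a property of these definitions, spelled out here for illustration.

```python
def propagate_generate(a: int, b: int) -> tuple[int, int]:
    """Return (p, g) for one bit cell; Cin is not needed."""
    p = a | b  # propagate: an incoming carry would be passed on
    g = a & b  # generate: a carry is produced regardless of Cin
    return p, g


def carry_out(p: int, g: int, cin: int) -> int:
    """Carry out of the cell, expressed through p and g only."""
    return g | (p & cin)
```

For every combination of A, B and Cin, `carry_out` agrees with the majority function `(a & b) | (a & cin) | (b & cin)` used in a plain full adder.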

Slide7

4-bit CLA

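[Figure: four 1-bit CLA cells (+), each with inputs A, B and outputs p, g (the lowest cell also receives C0), all feeding a CLL (carry look-ahead logic) block that outputs C1, C2, C3, C4]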

The CLL takes p, g from all 4 bit-cells and C0 as inputs and generates all Cs in 2 gate delays.

C1 = g0 + p0C0

C2 = g1 + p1g0 + p1p0C0

C3 = g2 + p2g1 + p2p1g0 + p2p1p0C0

C4 = g3 + p3g2 + p3p2g1 + p3p2p1g0 + p3p2p1p0C0

(Note: this C4 is too complicated to generate in a 2-level SOP representation, as it exceeds the fan-in of 4 for AND and OR.)
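The carry equations above can be written out directly in Python. This is only a sketch: the name `cll4` and the sequence-of-bits arguments (index 0 is the lowest bit) are assumptions, and software has no fan-in limit, so C4 is computed here even though the slide notes it does not fit a 2-level gate implementation.

```python
def cll4(p, g, c0):
    """Carry look-ahead logic for four cells.

    p, g: sequences of four 0/1 values, index 0 = lowest bit.
    Returns (C1, C2, C3, C4), each depending only on p, g and C0.
    """
    p0, p1, p2, p3 = p
    g0, g1, g2, g3 = g
    c1 = g0 | (p0 & c0)
    c2 = g1 | (p1 & g0) | (p1 & p0 & c0)
    c3 = g2 | (p2 & g1) | (p2 & p1 & g0) | (p2 & p1 & p0 & c0)
    c4 = (g3 | (p3 & g2) | (p3 & p2 & g1) | (p3 & p2 & p1 & g0)
          | (p3 & p2 & p1 & p0 & c0))
    return c1, c2, c3, c4
```

None of the four expressions refers to another carry, which is what lets hardware evaluate them side by side instead of one after another.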

Slide8

4-bit CLA

[Figure: four 1-bit CLA cells (+) with inputs A0..A3, B0..B3 and C0 send p0..p3, g0..g3 to the CLL (carry look-ahead logic), which returns C1, C2, C3 to the cells; the cells output S0..S3]

Given A, B's, all p, g's are generated in 1 gate delay in parallel.

Given all p, g's, all C's are generated in 2 gate delays in parallel.

Given all C's, all S's are generated in 2 gate delays in parallel.

Key virtue of CLA: sequential operation in RCA is broken into parallel operation!!
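To make the three parallel stages concrete, here is a hedged Python sketch of a complete 4-bit CLA working on small integers. The name `cla4_add`, the integer interface, and the fixed delay tally of 1 + 2 + 2 are assumptions for this example; the carries are expanded into flattened look-ahead form by a loop rather than typed out by hand.

```python
def cla4_add(a: int, b: int, c0: int = 0):
    """Add two 4-bit integers with look-ahead carries.

    Returns (sum_bits_as_int, carry_out, gate_delays).
    """
    a_bits = [(a >> i) & 1 for i in range(4)]
    b_bits = [(b >> i) & 1 for i in range(4)]

    # Stage 1 (1 gate delay): all p, g in parallel.
    p = [x | y for x, y in zip(a_bits, b_bits)]
    g = [x & y for x, y in zip(a_bits, b_bits)]

    # Stage 2 (2 gate delays): every carry from p, g and C0 only.
    c = [c0]
    for i in range(4):
        terms = []
        for j in range(i, -1, -1):
            term = g[j]                 # g_j passed through p_(j+1) .. p_i
            for k in range(j + 1, i + 1):
                term &= p[k]
            terms.append(term)
        term = c0                       # C0 passed through p_0 .. p_i
        for k in range(i + 1):
            term &= p[k]
        terms.append(term)
        c.append(int(any(terms)))

    # Stage 3 (2 gate delays): all sum bits in parallel.
    s_bits = [a_bits[i] ^ b_bits[i] ^ c[i] for i in range(4)]
    total = sum(bit << i for i, bit in enumerate(s_bits))
    return total, c[4], 1 + 2 + 2
```

Under this model the 4-bit CLA finishes in 5 gate delays, compared with 8 for a 4-bit RCA counted at 2 gate delays per bit.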

Slide9

Observation

The CLL block cannot be made arbitrarily big (at most 4 bits), because if the equations for the C's are too long they cannot be evaluated in 2 gate delays (the fan-in limitation on gates is 4, as required by VLSI technology).

So how about long operands, say 16 bits?

We add another layer of CLL and make a multi-level CLA.
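The fan-in argument can be checked with a few lines of Python. This sketch assumes the flattened two-level form of the carry equations; the function names and the default limit of 4 are illustrative choices taken from the statement above.

```python
def carry_fan_in(i: int) -> tuple[int, int]:
    """Fan-in needed for flattened C_i = g_(i-1) + ... + p_(i-1)...p_0 C0.

    Returns (inputs to the OR gate, inputs to the widest AND gate).
    """
    or_inputs = i + 1    # i generate terms plus the term containing C0
    widest_and = i + 1   # p_(i-1) ... p_0 together with C0
    return or_inputs, widest_and


def fits_two_level(i: int, limit: int = 4) -> bool:
    """True if C_i can be built in two levels with gates of the given fan-in."""
    return max(carry_fan_in(i)) <= limit
```

With a limit of 4, `fits_two_level(3)` is true and `fits_two_level(4)` is false, which matches the remark that C4 is the first carry that no longer fits.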

Slide10

16-bit CLA

Same as before, p, g's are generated in parallel in 1 gate delay.

Now, without an input carry, the first-tier CLLs cannot generate C's. Instead they generate P, G's (group propagate and group generate) in 2 gate delays.

P => This group will propagate the input carry as output carry. P = p0p1p2p3

G => This group will generate an output carry. G = g3 + p3g2 + p3p2g1 + p3p2p1g0

The second-tier CLL takes the P, G's from the first-tier CLLs and the C0 to generate the "seed C's" for the first-tier CLLs in 2 gate delays. (Note that the logic for generating "seed C's" from P, G's is exactly the same as that for generating C's from p, g's!)

The seed C's form the input Cin to the first-tier CLLs. They use this Cin and the p, g's to generate C's in 2 gate delays.

With all C's in place, S's are calculated in 2 gate delays.

Therefore, in total 1+2+2+2+2 = 9 gate delays to finish the whole thing!!
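A minimal Python sketch of the two-tier scheme for 16 bits, assuming 0/1 integers for signals. The names `lookahead`, `group_pg` and `cla16_add` are hypothetical, and the final carry out C16 is formed as G3 + P3.C12, which is a choice made for this example rather than something the slides specify.

```python
def lookahead(p, g, cin):
    """C1..C3 from four (p, g) pairs and a carry in; same logic in both tiers."""
    c1 = g[0] | (p[0] & cin)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & cin)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & cin)
    return c1, c2, c3


def group_pg(p, g):
    """Group propagate P and group generate G for one 4-bit group."""
    P = p[0] & p[1] & p[2] & p[3]
    G = g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1]) | (p[3] & p[2] & p[1] & g[0])
    return P, G


def cla16_add(a: int, b: int, c0: int = 0):
    """Add two 16-bit integers; returns (16-bit sum, carry out)."""
    a_bits = [(a >> i) & 1 for i in range(16)]
    b_bits = [(b >> i) & 1 for i in range(16)]
    p = [x | y for x, y in zip(a_bits, b_bits)]            # 1 gate delay
    g = [x & y for x, y in zip(a_bits, b_bits)]
    groups = [(p[i:i + 4], g[i:i + 4]) for i in range(0, 16, 4)]
    PG = [group_pg(gp, gg) for gp, gg in groups]           # first tier: P, G
    P = [pair[0] for pair in PG]
    G = [pair[1] for pair in PG]
    c4, c8, c12 = lookahead(P, G, c0)                      # second tier: seed C's
    c16 = G[3] | (P[3] & c12)                              # assumed carry-out form
    carries = []
    for (gp, gg), seed in zip(groups, [c0, c4, c8, c12]):  # first tier: C's
        c1, c2, c3 = lookahead(gp, gg, seed)
        carries += [seed, c1, c2, c3]
    s_bits = [a_bits[i] ^ b_bits[i] ^ carries[i] for i in range(16)]
    return sum(bit << i for i, bit in enumerate(s_bits)), c16
```

Note that `lookahead` is called once on the group signals and once per group on the bit signals, mirroring the remark that the same logic serves both tiers.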

Slide11

Now, how about 64-bit CLA?

You can visualize that in your mind by yourself now, I guess.
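One way to check your mental picture is to count gate delays per tier. The sketch below extrapolates the 4-bit and 16-bit counts under the same assumptions (4-way CLL blocks, 2 gate delays per CLL pass); the function name `cla_delay` and the 64-bit figure it produces are not stated in the slides.

```python
def cla_delay(tiers: int) -> int:
    """Worst-case gate delays for a CLA with the given number of CLL tiers.

    tiers=1 covers 4 bits, tiers=2 covers 16 bits, tiers=3 covers 64 bits.
    """
    pg = 1                      # bit-level p, g
    going_up = 2 * (tiers - 1)  # group P, G formed in every tier except the top
    going_down = 2 * tiers      # carries produced from the top tier downward
    sums = 2                    # final sum bits
    return pg + going_up + going_down + sums
```

This reproduces 5 gate delays for 4 bits and 9 for 16 bits, and gives 13 for a three-tier 64-bit adder under the stated assumptions.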

Slide12

A bit more details

[Figure: the 4-bit CLA again: four 1-bit cells (+) with inputs A0..A3, B0..B3 and C0, the CLL (carry look-ahead logic) with p0..p3, g0..g3 in and C1..C4 out, and sum outputs S0..S3]

Do all these 4 S’s (S3, S2, S1, S0) come together?

Actually no! Since C0 is available from the beginning, S0 can be calculated in 2 gate delays (using the original SOP expression for the S bit in a single-bit adder), before S3, S2, S1.

Slide13

A bit more details (Cont’d)

Again, actually not all the S’s come together!

C0 is readily available, so S0 can be calculated in 2 gate delays.

Since C0 is readily available, the lowest first-tier CLL can generate C1, C2, C3 independent of the second-tier CLL. Since C1, C2, C3 are done earlier, so are S1, S2, S3 (in 5 gate delays: 1 for (p,g), 2 for (C1,C2,C3), 2 for S).

When we get C4, C8, C12, we can start to calculate S4, S8, S12 and get them in 2 more gate delays. That is the same time when we get the other C's (purple guys), at 7 gate delays.

Timing for items generated (in terms of gate delay):

black = already available (the black color for S0 is incorrect, because S0 comes out at 2)

orange = 1, green = 3, blue = 5, purple = 7, brown = 9
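The arrival times described on this slide can be tabulated with a short Python sketch. The function name `sum_arrival` is hypothetical, and the model assumes the 16-bit two-tier CLA discussed above, with every CLL pass and every sum costing 2 gate delays.

```python
def sum_arrival(i: int) -> int:
    """Gate delay at which S_i of the 16-bit two-tier CLA is ready."""
    if i == 0:
        return 2                     # needs only A0, B0 and C0
    if i < 4:
        carry_ready = 1 + 2          # p,g, then the lowest first-tier CLL
    elif i % 4 == 0:
        carry_ready = 1 + 2 + 2      # p,g, group P,G, second-tier CLL
    else:
        carry_ready = 1 + 2 + 2 + 2  # one more first-tier CLL pass
    return carry_ready + 2           # sum bit after its carry arrives


arrivals = [sum_arrival(i) for i in range(16)]
```

The resulting list groups the sum bits into the same time slots as the color legend: S0 at 2, S1 to S3 at 5, S4, S8 and S12 at 7, and the remaining bits at 9.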

Slide14

Thanks!!

(You can distribute these slides as one whole file to anywhere you feel it may be useful.)