CS 314 Computer Organization

Uploaded On 2020-08-28


Fall 2017, Chapter 3: Arithmetic for Computers. Haojin Zhu (http://tdt.sjtu.edu.cn/~hjzhu/). Adapted from Computer Organization and Design, 4th Edition, Patterson & Hennessy, © 2012, MK.



Presentation Transcript

Slide1

CS 314 Computer Organization, Fall 2017
Chapter 3: Arithmetic for Computers

Haojin Zhu (http://tdt.sjtu.edu.cn/~hjzhu/)

[Adapted from Computer Organization and Design, 4th Edition, Patterson & Hennessy, © 2012, MK]

Slide2

Review: MIPS (RISC) Design Principles

Simplicity favors regularity
  fixed size instructions
  small number of instruction formats
  opcode always the first 6 bits

Smaller is faster
  limited instruction set
  limited number of registers in register file
  limited number of addressing modes

Make the common case fast
  arithmetic operands from the register file (load-store machine)
  allow instructions to contain immediate operands

Good design demands good compromises
  three instruction formats

Slide3

Specifying Branch Destinations

Use a register (like in lw and sw) added to the 16-bit offset
  which register? Instruction Address Register (the PC)
    its use is automatically implied by instruction
    PC gets updated (PC+4) during the fetch cycle so that it holds the address of the next instruction
  limits the branch distance to -2^15 to +2^15-1 (word) instructions from the (instruction after the) branch instruction, but most branches are local anyway

[Figure: the low order 16 bits of the branch instruction (the offset) are sign-extended to 32 bits with 00 appended, then added to PC + 4 to form the 32-bit branch destination address]
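The branch address calculation above can be sketched in Python. This is an illustrative model only: the function names are made up for this example, and the instruction is assumed to be given as a 32-bit integer.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        return value - (1 << bits)
    return value


def branch_target(pc, instruction):
    """Branch destination = (PC + 4) + (sign-extended 16-bit offset << 2)."""
    offset = sign_extend(instruction & 0xFFFF, 16)   # low order 16 bits
    return (pc + 4 + (offset << 2)) & 0xFFFFFFFF     # keep a 32-bit address


# A word offset of +3 from the instruction after the branch
print(hex(branch_target(0x00400000, 0x10000003)))
```

The offset counts words, not bytes, which is why it is shifted left by two bits before the addition.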

Slide4

Other Control Flow Instructions

MIPS also has an unconditional branch instruction or jump instruction:

    j label    # go to label

Instruction Format (J Format):

    | 0x02 | 26-bit address |

[Figure: the 32-bit jump destination is formed by concatenating the upper 4 bits of the PC, the low order 26 bits of the jump instruction, and 00]

Why shift left by two bits?
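A small Python sketch of the pseudo-direct jump address, under the assumption (standard for MIPS) that the upper 4 bits are taken from PC + 4. The function name is hypothetical. Instructions are word aligned, so the two low address bits are always 00 and do not need to be stored in the instruction; shifting left by two restores them.

```python
def jump_target(pc, instruction):
    """Upper 4 bits of (PC + 4) || 26-bit address field || 00."""
    address26 = instruction & 0x03FFFFFF          # low order 26 bits
    upper4 = (pc + 4) & 0xF0000000                # top 4 bits of the next PC
    return upper4 | (address26 << 2)              # append 00 by shifting


# opcode 0x02 in the top 6 bits, word address 0x100008 in the low 26 bits
instruction = (0x02 << 26) | (0x00400020 >> 2)
print(hex(jump_target(0x00400000, instruction)))
```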

Slide5

Review: MIPS Addressing Modes Illustrated

1. Register addressing
   | op | rs | rt | rd | funct |   -> Register: word operand

2. Base (displacement) addressing
   | op | rs | rt | offset |   -> base register + offset -> Memory: word or byte operand

3. Immediate addressing
   | op | rs | rt | operand |

4. PC-relative addressing
   | op | rs | rt | offset |   -> Program Counter (PC) + offset -> Memory: branch destination instruction

5. Pseudo-direct addressing
   | op | jump address |   -> Program Counter (PC) || jump address -> Memory: jump destination instruction

Slide6

Number Representations

32-bit signed numbers (2's complement), MSB on the left, LSB on the right:

    0000 0000 0000 0000 0000 0000 0000 0000_two =              0_ten
    0000 0000 0000 0000 0000 0000 0000 0001_two = +            1_ten
    ...
    0111 1111 1111 1111 1111 1111 1111 1110_two = +2,147,483,646_ten
    0111 1111 1111 1111 1111 1111 1111 1111_two = +2,147,483,647_ten   (maxint)
    1000 0000 0000 0000 0000 0000 0000 0000_two = -2,147,483,648_ten   (minint)
    1000 0000 0000 0000 0000 0000 0000 0001_two = -2,147,483,647_ten
    ...
    1111 1111 1111 1111 1111 1111 1111 1110_two = -            2_ten
    1111 1111 1111 1111 1111 1111 1111 1111_two = -            1_ten

Converting <32-bit values into 32-bit values
  copy the most significant bit (the sign bit) into the "empty" bits
    0010 -> 0000 0010
    1010 -> 1111 1010
  sign extend versus zero extend (lb vs. lbu)
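The difference between sign extension and zero extension can be shown with a short Python sketch. The helper names and the `to_bits` parameter are assumptions made for this example; they model what lb and lbu do to a loaded byte, not the hardware itself.

```python
def sign_extend(value, from_bits, to_bits=32):
    """Copy the sign bit of a from_bits-wide value into the upper bits."""
    value &= (1 << from_bits) - 1
    if value & (1 << (from_bits - 1)):                       # sign bit set?
        upper_ones = ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
        value |= upper_ones                                  # fill with 1s
    return value


def zero_extend(value, from_bits):
    """Fill the upper bits with 0s, regardless of the sign bit."""
    return value & ((1 << from_bits) - 1)


print(format(sign_extend(0b0010, 4, 8), "08b"))
print(format(sign_extend(0b1010, 4, 8), "08b"))
print(format(zero_extend(0b1010, 4), "08b"))
```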

Slide7

MIPS Arithmetic Logic Unit (ALU)

Must support the Arithmetic/Logic operations of the ISA
  add, addi, addiu, addu
  sub, subu
  mult, multu, div, divu
  sqrt
  and, andi, nor, or, ori, xor, xori
  beq, bne, slt, slti, sltiu, sltu

[Figure: ALU with 32-bit inputs A and B, a 4-bit operation select m, a 32-bit result, and 1-bit zero and ovf outputs]

With special handling for
  sign extend: addi, addiu, slti, sltiu
  zero extend: andi, ori, xori
  overflow detection: add, addi, sub

Slide8

Dealing with Overflow

Overflow occurs when the result of an operation cannot be represented in 32 bits, i.e., when the sign bit contains a value bit of the result and not the proper sign bit

When adding operands with different signs or when subtracting operands with the same sign, overflow can never occur

    Operation   Operand A   Operand B   Result indicating overflow
    A + B       >= 0        >= 0        < 0
    A + B       < 0         < 0         >= 0
    A - B       >= 0        < 0         < 0
    A - B       < 0         >= 0        >= 0

MIPS signals overflow with an exception (aka interrupt): an unscheduled procedure call where the EPC contains the address of the instruction that caused the exception
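The sign-based overflow conditions for signed 32-bit addition and subtraction can be checked with a short Python sketch. The function names are illustrative, and the operands are assumed to be Python integers already within the signed 32-bit range.

```python
def to_signed32(x):
    """Wrap an integer to 32 bits and read it as two's complement."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def add_overflows(a, b):
    """Overflow on A + B: same-sign operands, result with the other sign."""
    result = to_signed32(a + b)
    return (a >= 0 and b >= 0 and result < 0) or (a < 0 and b < 0 and result >= 0)


def sub_overflows(a, b):
    """Overflow on A - B: different-sign operands, result sign differs from A."""
    result = to_signed32(a - b)
    return (a >= 0 and b < 0 and result < 0) or (a < 0 and b >= 0 and result >= 0)


print(add_overflows(2147483647, 1))   # maxint + 1 wraps to a negative value
print(add_overflows(5, -7))           # different signs can never overflow
```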

Slide9

Addition & Subtraction

Just like in grade school (carry/borrow 1s)

      0111      0111      0110
    + 0110    - 0110    - 0101

Two's complement operations are easy
  do subtraction by negating and then adding

      0111   ->     0111
    - 0110        + 1010

Overflow (result too large for finite computer word)
  e.g., adding two n-bit numbers does not yield an n-bit number

      0111
    + 0001

Slide10

Addition & Subtraction

Just like in grade school (carry/borrow 1s)

      0111      0111      0110
    + 0110    - 0110    - 0101
      ----      ----      ----
      1101      0001      0001

Two's complement operations are easy
  do subtraction by negating and then adding

      0111   ->     0111
    - 0110        + 1010
      ----        ------
      0001        1 0001

Overflow (result too large for finite computer word)
  e.g., adding two n-bit numbers does not yield an n-bit number

      0111
    + 0001
      ----
      1000
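Subtraction by negating and then adding can be tried out with a few lines of Python. This is a sketch for small word sizes; the function names and the default 4-bit width are assumptions chosen to match the 4-bit examples on the slide.

```python
def negate(x, bits=4):
    """Two's complement negation: complement all the bits, then add 1."""
    mask = (1 << bits) - 1
    return (~x + 1) & mask


def subtract(a, b, bits=4):
    """Compute a - b as a + (-b); returns (n-bit result, carry out)."""
    mask = (1 << bits) - 1
    total = (a & mask) + negate(b, bits)
    return total & mask, total >> bits


result, carry_out = subtract(0b0111, 0b0110)
print(format(negate(0b0110), "04b"), format(result, "04b"), carry_out)
```

The carry out of the top bit is simply discarded; the 4-bit result 0001 is the correct difference.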

Slide11

Building a 1-bit Binary Adder

1 bit Full Adder

[Figure: 1-bit full adder block with inputs A, B and carry_in and outputs S and carry_out]

    A   B   carry_in   carry_out   S
    0   0   0          0           0
    0   0   1          0           1
    0   1   0          0           1
    0   1   1          1           0
    1   0   0          0           1
    1   0   1          1           0
    1   1   0          1           0
    1   1   1          1           1

S = A xor B xor carry_in
carry_out = A&B | A&carry_in | B&carry_in   (majority function)

How can we use it to build a 32-bit adder?
How can we modify it easily to build an adder/subtractor?

Slide12

Building 32-bit Adder

[Figure: 32 1-bit full adders (FA) in a chain; FA i takes A_i, B_i and carry c_i and produces S_i and c_(i+1); c_0 = carry_in, c_32 = carry_out]

Just connect the carry-out of the least significant bit FA to the carry-in of the next least significant bit and connect . . .

Ripple Carry Adder (RCA)
  advantage: simple logic, so small (low cost)
  disadvantage: slow and lots of glitching (so lots of energy consumption)
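A bit-level Python model of the ripple carry adder, built from the full adder equations S = A xor B xor carry_in and carry_out = majority(A, B, carry_in). The function names are chosen for this sketch; the loop plays the role of the carry wire running from one FA to the next.

```python
def full_adder(a, b, carry_in):
    """1-bit full adder: returns (sum bit, carry out)."""
    s = a ^ b ^ carry_in
    carry_out = (a & b) | (a & carry_in) | (b & carry_in)   # majority function
    return s, carry_out


def ripple_carry_add(a, b, carry_in=0, bits=32):
    """Chain `bits` full adders; the carry ripples from bit 0 upward."""
    result = 0
    carry = carry_in                       # c0 = carry_in
    for i in range(bits):
        s, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        result |= s << i
    return result, carry                   # carry is c_bits = carry_out


print(ripple_carry_add(0b0111, 0b0110, bits=4))
```

Each sum bit has to wait for the carry from the bit below it, which is the source of the delay mentioned as the disadvantage.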

Slide13

A 32-bit Ripple Carry Adder/Subtractor

Remember 2's complement is just
  complement all the bits
  add a 1 in the least significant bit

      A   0111   ->     0111
      B - 0110        +

[Figure: 32-bit ripple carry adder of 1-bit FAs (inputs A0..A31 and B0..B31, sums S0..S31, c0 = carry_in, c32 = carry_out) with an add/sub control line (0=add, 1=sub); each FA receives B_i if control = 0 and !B_i if control = 1]

Slide14

A 32-bit Ripple Carry Adder/Subtractor

Remember 2's complement is just
  complement all the bits
  add a 1 in the least significant bit

      A   0111   ->     0111
      B - 0110        + 1001
                      +    1
          ----        ------
          0001        1 0001

[Figure: 32-bit ripple carry adder of 1-bit FAs (inputs A0..A31 and B0..B31, sums S0..S31, c0 = carry_in, c32 = carry_out) with an add/sub control line (0=add, 1=sub); each FA receives B_i if control = 0 and !B_i if control = 1]
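The adder/subtractor can be modelled in Python by XORing each B bit with the control signal and feeding the same control signal in as c0. This is a sketch under the assumption that control doubles as the carry-in, which supplies the "add a 1 in the least significant bit" step; the function name is made up for the example.

```python
def add_sub(a, b, control, bits=32):
    """control = 0 computes a + b, control = 1 computes a - b."""
    result = 0
    carry = control                          # c0 = control adds the +1 for subtract
    for i in range(bits):
        a_i = (a >> i) & 1
        b_i = ((b >> i) & 1) ^ control       # B_i if control = 0, !B_i if control = 1
        s = a_i ^ b_i ^ carry
        carry = (a_i & b_i) | (a_i & carry) | (b_i & carry)
        result |= s << i
    return result, carry


# 0111 - 0110: B is complemented to 1001 and 1 is added through carry_in
print(add_sub(0b0111, 0b0110, control=1, bits=4))
```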

Slide15

Overflow Detection Logic

Carry into MSB != Carry out of MSB
  For an N-bit ALU: Overflow = CarryIn[N-1] XOR CarryOut[N-1]

    X   Y   X XOR Y
    0   0   0
    0   1   1
    1   0   1
    1   1   0

[Figure: four 1-bit ALUs chained through their carries; ALU i takes Ai, Bi and CarryIni and produces Resulti and CarryOuti; CarryIn0 = 0; Overflow = CarryIn3 XOR CarryOut3]

why?
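The rule Overflow = CarryIn[N-1] XOR CarryOut[N-1] can be checked against the sign-based definition with a small Python sketch. The function name and the 4-bit default are assumptions for this example; the carries are computed arithmetically rather than gate by gate.

```python
def add_with_overflow(a, b, bits=4):
    """Add two n-bit two's complement values; returns (result, overflow)."""
    mask = (1 << bits) - 1
    a &= mask
    b &= mask
    low_mask = mask >> 1                                   # all bits below the MSB
    carry_in_msb = ((a & low_mask) + (b & low_mask)) >> (bits - 1)
    total = a + b
    carry_out_msb = total >> bits
    return total & mask, carry_in_msb ^ carry_out_msb


print(add_with_overflow(0b0111, 0b0001))   # 7 + 1 does not fit in 4 signed bits
print(add_with_overflow(0b0111, 0b1010))   # 7 + (-6) fits
```

Intuitively, a carry into the sign position without a matching carry out (or the reverse) means the sign bit has been overwritten by a value bit.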

Slide16

Multiply

Binary multiplication is just a bunch of right shifts and adds

[Figure: an n-bit multiplicand and an n-bit multiplier produce a partial product array, which is summed into a 2n-bit double precision product]

The partial products can be formed in parallel and added in parallel for faster multiplication

Slide17

Multiplication

More complicated than addition
  Can be accomplished via shifting and adding

          0010   (multiplicand)
        x 1011   (multiplier)
        ------
          0010
         0010    (partial product
        0000      array)
       0010
      --------
      00010110   (product)

In every step
  multiplicand is shifted
  next bit of multiplier is examined (also a shifting step)
  if this bit is 1, shifted multiplicand is added to the product

Slide18

Multiplication Algorithm 1

In every step
  multiplicand is shifted
  next bit of multiplier is examined (also a shifting step)
  if this bit is 1, shifted multiplicand is added to the product
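A minimal Python sketch of this first multiplication algorithm, for unsigned operands. The function name is illustrative, and Python integers stand in for the 64-bit multiplicand and product registers.

```python
def multiply_v1(multiplicand, multiplier, bits=32):
    """Shift-and-add multiplication of two unsigned `bits`-wide numbers."""
    product = 0
    for _ in range(bits):
        if multiplier & 1:                 # examine the next multiplier bit
            product += multiplicand        # add the shifted multiplicand
        multiplicand <<= 1                 # shift multiplicand left
        multiplier >>= 1                   # shift multiplier right
    return product & ((1 << (2 * bits)) - 1)


print(format(multiply_v1(0b0010, 0b1011, bits=4), "08b"))
```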

Slide19

Slide20

Comments on Multiplication Algorithm 1

Performance
  Three basic steps for each bit
  If each step took a clock cycle, it would require almost 100 clock cycles (3 x 32 = 96) to multiply two 32-bit numbers

How to improve it?
  Motivation (performing the operations in parallel):
    Putting multiplier and the product together
    Shift them together

Slide21

Refined Multiplication Algorithm 2

[Figure: a multiplicand register feeds a 32-bit ALU; the product and the multiplier share one register that shifts right; Control issues add and shift right]

32-bit ALU and multiplicand is untouched
the sum keeps shifting right
at every step, number of bits in product + multiplier = 64, hence, they share a single 64-bit register

Slide22

Add and Right Shift Multiplier Hardware

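[Figure: a multiplicand register feeds a 32-bit ALU; the product and the multiplier share one register that shifts right; Control issues add and shift right]

    multiplicand           0 1 1 0             = 6
    product | multiplier   0 0 0 0  0 1 0 1    (multiplier = 5)

    add                    0 1 1 0  0 1 0 1
    shift right            0 0 1 1  0 0 1 0
    add (bit is 0)         0 0 1 1  0 0 1 0
    shift right            0 0 0 1  1 0 0 1
    add                    0 1 1 1  1 0 0 1
    shift right            0 0 1 1  1 1 0 0
    add (bit is 0)         0 0 1 1  1 1 0 0
    shift right            0 0 0 1  1 1 1 0    = 30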
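The add and right shift scheme, with the product and the multiplier sharing one register, can be traced in Python. This is a sketch with hypothetical names; a Python integer models the shared register, including the extra carry bit that the adder can produce before the shift.

```python
def multiply_v2(multiplicand, multiplier, bits=4):
    """Add-and-right-shift multiply; returns (product, trace of register values)."""
    mask = (1 << bits) - 1
    reg = multiplier & mask        # upper half: running product, lower half: multiplier
    trace = []
    for _ in range(bits):
        if reg & 1:                                    # low bit of the multiplier
            reg += (multiplicand & mask) << bits       # add into the upper half
        trace.append(("add", reg))
        reg >>= 1                                      # shift product and multiplier together
        trace.append(("shift", reg))
    return reg, trace


product, trace = multiply_v2(6, 5)
for step, value in trace:
    print(f"{step:5s} {value:08b}")
print(product)
```

The multiplicand never moves and the ALU only needs to be as wide as one operand, which is the point of the refinement.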

Slide23

Exercise

Using 4-bit numbers to save space, multiply 2_ten * 3_ten, or 0010_two * 0011_two

Slide24

Division

Division is just a bunch of quotient digit guesses and left shifts and subtracts
  dividend = quotient x divisor + remainder

[Figure: long-division layout showing the divisor, dividend, partial remainder array, quotient and remainder, with n-bit widths marked]

Slide25

Division

                           1001_ten     Quotient
    Divisor  1000_ten | 1001010_ten     Dividend
                       -1000
                           10
                           101
                           1010
                          -1000
                             10_ten     Remainder

At every step,
  shift divisor right and compare it with current dividend
  if divisor is larger, shift 0 as the next bit of the quotient
  if divisor is smaller, subtract to get new dividend and shift 1 as the next bit of the quotient

Slide26

First Version of Hardware for Division

A comparison requires a subtract; the sign of the result is examined; if the result is negative, the divisor must be added back

Slide27

Divide Algorithm

Start

1. Subtract the Divisor register from the Remainder register, and place the result in the Remainder register.

Test Remainder
  Remainder >= 0:
    2a. Shift the Quotient register to the left, setting the new rightmost bit to 1.
  Remainder < 0:
    2b. Restore the original value by adding the Divisor register to the Remainder register and place the sum in the Remainder register. Also shift the Quotient register to the left, setting the new LSB to 0.

3. Shift the Divisor register right 1 bit.

33rd repetition?
  No: < 33 repetitions (go back to step 1)
  Yes: 33 repetitions (Done)

Slide28

Divide Example

Divide 7_ten (0000 0111_two) by 2_ten (0010_two)

    Iter   Step             Quot   Divisor   Remainder
    0      Initial values
    1
    2
    3
    4
    5

Slide29

Divide Example

Divide 7_ten (0000 0111_two) by 2_ten (0010_two)

    Iter   Step                              Quot   Divisor     Remainder
    0      Initial values                    0000   0010 0000   0000 0111
    1      Rem = Rem - Div                   0000   0010 0000   1110 0111
           Rem < 0 -> +Div, shift 0 into Q   0000   0010 0000   0000 0111
           Shift Div right                   0000   0001 0000   0000 0111
    2      Same steps as 1                   0000   0001 0000   1111 0111
                                             0000   0001 0000   0000 0111
                                             0000   0000 1000   0000 0111
    3      Same steps as 1                   0000   0000 0100   0000 0111
    4      Rem = Rem - Div                   0000   0000 0100   0000 0011
           Rem >= 0 -> shift 1 into Q        0001   0000 0100   0000 0011
           Shift Div right                   0001   0000 0010   0000 0011
    5      Same steps as 4                   0011   0000 0001   0000 0001
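The first-version divide algorithm (subtract, test, restore if negative, shift the divisor right, repeat n + 1 times) can be sketched in Python for unsigned operands. The function name is hypothetical, and Python integers stand in for the double-width Divisor and Remainder registers.

```python
def divide_v1(dividend, divisor, bits=32):
    """Restoring division; returns (quotient, remainder) for unsigned inputs."""
    if divisor == 0:
        raise ZeroDivisionError("software must check the divisor")
    remainder = dividend
    div = divisor << bits            # divisor starts in the upper half
    quotient = 0
    for _ in range(bits + 1):        # n + 1 repetitions (33 for 32-bit operands)
        remainder -= div             # step 1: trial subtraction
        if remainder >= 0:
            quotient = (quotient << 1) | 1      # step 2a: shift in a 1
        else:
            remainder += div                    # step 2b: restore
            quotient <<= 1                      #          and shift in a 0
        div >>= 1                    # step 3: shift divisor right
    return quotient & ((1 << bits) - 1), remainder


print(divide_v1(7, 2, bits=4))
```

With bits=4 the loop runs five times, matching the five iterations of the 7 / 2 example.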

Slide30

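Efficient Division

[Figure: the first-version hardware (64-bit Divisor register with shift right, 64-bit ALU, 64-bit Remainder register with write, 32-bit Quotient register with shift left, Control) next to the refined hardware (divisor register, 32-bit ALU, one shared register holding the dividend/remainder and the quotient that shifts left, Control issuing subtract and shift left)]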

Slide31

Left Shift and Subtract Division Hardware

[Figure: divisor register, 32-bit ALU, one shared register holding the dividend/remainder and the quotient that shifts left, Control issuing subtract and shift left]

    divisor                0 0 1 0             = 2
    remainder | dividend   0 0 0 0  0 1 1 0    (dividend = 6)

    shift left             0 0 0 0  1 1 0 0
    sub                    1 1 1 0  1 1 0 0    rem neg, so quotient bit = 0
    restore remainder      0 0 0 0  1 1 0 0
    shift left             0 0 0 1  1 0 0 0
    sub                    1 1 1 1  1 0 0 0    rem neg, so quotient bit = 0
    restore remainder      0 0 0 1  1 0 0 0
    shift left             0 0 1 1  0 0 0 0
    sub                    0 0 0 1  0 0 0 1    rem pos, so quotient bit = 1
    shift left             0 0 1 0  0 0 1 0
    sub                    0 0 0 0  0 0 1 1    rem pos, so quotient bit = 1

    = 3 with 0 remainder
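A Python sketch of the left shift and subtract scheme with one shared register: the upper half holds the running remainder and the lower half holds the not-yet-used dividend bits plus the quotient bits shifted in so far. The names are illustrative. As a simplification, the code compares before subtracting instead of subtracting and then restoring; the resulting register contents are the same.

```python
def divide_shared(dividend, divisor, bits=4):
    """Restoring division with a shared remainder/quotient register."""
    if divisor == 0:
        raise ZeroDivisionError("software must check the divisor")
    mask = (1 << bits) - 1
    reg = dividend & mask                      # upper half (remainder) starts at 0
    for _ in range(bits):
        reg <<= 1                              # shift remainder and quotient left
        upper = reg >> bits
        if upper >= divisor:                   # trial subtraction is non-negative
            reg = ((upper - divisor) << bits) | (reg & mask) | 1   # quotient bit = 1
        # otherwise the remainder is left as it was (restored), quotient bit = 0
    return reg & mask, reg >> bits             # (quotient, remainder)


print(divide_shared(6, 2))
```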

Slide32

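Restoring Unsigned Integer Division

(z: dividend, d: divisor, q: quotient, s: partial remainder, k: number of quotient bits)

    s^{(0)} = z
    for j = 1 to k
        if 2 s^{(j-1)} - 2^k d >= 0
            q_{k-j} = 1
            s^{(j)} = 2 s^{(j-1)} - 2^k d
        else
            q_{k-j} = 0
            s^{(j)} = 2 s^{(j-1)}

No need to restore the remainder in the case of R - D >= 0
Restore the remainder in the case of R - D < 0
The remainder shifts left by 1 bit in every step
k = 32: the divisor is placed in the left 32 bits of the register (the 2^k d term)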

Slide33

Non-Restoring Unsigned Integer Division

    s^{(1)} = 2 z - 2^k d
    for j = 2 to k
        if s^{(j-1)} >= 0
            q_{k-(j-1)} = 1
            s^{(j)} = 2 s^{(j-1)} - 2^k d
        else
            q_{k-(j-1)} = 0
            s^{(j)} = 2 s^{(j-1)} + 2^k d
    end for
    if s^{(k)} >= 0
        q_0 = 1
    else
        q_0 = 0
        Correction step

If the previous step gave remainder - divisor >= 0, perform a subtraction in the current step
If the previous step gave remainder - divisor < 0, perform an addition in the current step

why?

Slide34

Restoring Unsigned Integer Division

    s^{(0)} = z
    for j = 1 to k
        if 2 s^{(j-1)} - 2^k d >= 0
            q_{k-j} = 1
            s^{(j)} = 2 s^{(j-1)} - 2^k d
        else
            q_{k-j} = 0
            s^{(j)} = 2 s^{(j-1)}

Non-Restoring Unsigned Integer Division

    s^{(1)} = 2 z - 2^k d
    for j = 2 to k
        if s^{(j-1)} >= 0
            q_{k-(j-1)} = 1
            s^{(j)} = 2 s^{(j-1)} - 2^k d
        else
            q_{k-(j-1)} = 0
            s^{(j)} = 2 s^{(j-1)} + 2^k d
    end for
    if s^{(k)} >= 0
        q_0 = 1
    else
        q_0 = 0
        Correction step

The two are equal. Why?

Consider two consecutive steps j-1 and j, in particular the case

    2 s^{(j-2)} - 2^k d < 0

In step j-1, the Restoring Algorithm computes

    q_{k-(j-1)} = 0
    s^{(j-1)} = 2 s^{(j-2)}

while the Non-Restoring Algorithm computes

    s^{(j-1)} = 2 s^{(j-2)} - 2^k d

In the subsequent step j, the Restoring Algorithm computes

    2 s^{(j-1)} - 2^k d = 2*2 s^{(j-2)} - 2^k d

In the subsequent step j, the Non-Restoring Algorithm computes

    2 s^{(j-1)} + 2^k d = 2*2 s^{(j-2)} - 2*2^k d + 2^k d
                        = 2*2 s^{(j-2)} - 2^k d

which is the same value, because 2x - y = 2(x - y) + y

Slide35

Non-restoring algorithm

set subtract_bit true

1: If subtract_bit is true:
       Subtract the Divisor register from the Remainder and place the result in the Remainder register
   else:
       Add the Divisor register to the Remainder and place the result in the Remainder register

2: If Remainder >= 0:
       Shift the Quotient register to the left, setting the rightmost bit to 1
       Set subtract_bit to true
   else:
       Shift the Quotient register to the left, setting the rightmost bit to 0
       Set subtract_bit to false

3: If this was not yet the 33rd repetition:
       Shift the Divisor register right 1 bit
       goto 1
   else:
       If Remainder < 0, add the Divisor register to the Remainder and place the sum in the Remainder register
       exit

Slide36

Example:

Perform n + 1 iterations for n bits

    Remainder 0000 1011
    Divisor   0011 0000

    Iteration      Operation   Remainder   Quotient   Divisor (after shift)
    1              subtract    1101 1011   0          0001 1000
    2              add         1111 0011   00         0000 1100
    3              add         1111 1111   000        0000 0110
    4              add         0000 0101   0001       0000 0011
    5              subtract    0000 0010   00011      0000 0001

Since the remainder is positive, done.
Q = 0011 and Rem = 0010
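A Python sketch of non-restoring division for unsigned operands, following the iteration pattern of the example: subtract after a non-negative remainder, add after a negative one, and apply a single correcting add at the end if the final remainder is negative. The names are hypothetical, and the correction adds the divisor value used in the last iteration, which is an assumption of this sketch.

```python
def divide_nonrestoring(dividend, divisor, bits=4):
    """Non-restoring division; returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("software must check the divisor")
    remainder = dividend
    div = divisor << bits              # divisor starts in the upper half
    quotient = 0
    subtract = True
    last_div = div
    for _ in range(bits + 1):          # n + 1 iterations for n bits
        last_div = div
        remainder = remainder - div if subtract else remainder + div
        if remainder >= 0:
            quotient = (quotient << 1) | 1
            subtract = True            # next step subtracts
        else:
            quotient <<= 1
            subtract = False           # next step adds instead of restoring
        div >>= 1
    if remainder < 0:                  # final correction step
        remainder += last_div
    return quotient, remainder


print(divide_nonrestoring(0b1011, 0b0011))
```

No intermediate restore is ever performed, so every iteration costs one add or subtract rather than up to two.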

Slide37

Exercise

Calculate A divided by B using restoring and non-restoring division. A=26, B=5

Slide38

MIPS Divide Instruction

Divide (div and divu) generates the remainder in hi and the quotient in lo

    div $s0, $s1    # lo = $s0 / $s1
                    # hi = $s0 mod $s1

    | 0 | 16 | 17 | 0 | 0 | 0x1A |

Instructions mfhi rd and mflo rd are provided to move the remainder and quotient to (user accessible) registers in the register file

As with multiply, divide ignores overflow so software must determine if the quotient is too large. Software must also check the divisor to avoid division by 0.
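The six fields shown for div $s0, $s1 can be packed into a 32-bit instruction word with a few shifts. This Python sketch assumes the standard R-format field widths of 6, 5, 5, 5, 5 and 6 bits; the function name is made up for the example.

```python
def encode_r_type(op, rs, rt, rd, shamt, funct):
    """Pack R-format fields (6/5/5/5/5/6 bits) into one 32-bit word."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


# div $s0, $s1: op = 0, rs = 16 ($s0), rt = 17 ($s1), rd = 0, shamt = 0, funct = 0x1A
word = encode_r_type(0, 16, 17, 0, 0, 0x1A)
print(f"{word:#010x}")
```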

Slide39

Lecture 1