Designing MIPS Processor (Single-Cycle)

Uploaded On 2021-08-24



Presentation Transcript

Designing MIPS Processor (Single-Cycle)
Presentation G
CSE 675.02: Introduction to Computer Architecture
Slides by Gojko Babić
Reading Assignment: 5.1 - 5.4
g. babic, Presentation G

Introduction

• We're now ready to look at an implementation of the system that includes a MIPS processor and memory.
• The design will support execution of only:
  – memory-reference instructions: lw & sw,
  – arithmetic-logical instructions: add, sub, and, or, slt & nor,
  – control flow instructions: beq & j,
  – exception handling: illegal instruction & overflow.
• That design will still give us the principles, so many more instructions could easily be added, such as: addu, lb, lbu, lui, addi, addiu, sltu, slti, andi, ori, xor, xori, jal, jr, jalr, bne, beqz, bgtz, bltz, nop, mfhi, mflo, mfepc, mfc0, lwc1, swc1, etc.

Single Cycle Design

• We will first design a simpler processor that executes each instruction in only one clock cycle.
• This is not efficient from a performance point of view, since:
  – the clock cycle time (i.e. the clock rate) must be chosen such that the longest instruction can be executed in one clock cycle, and
  – that makes shorter instructions execute in one unnecessarily long cycle.
• Additionally, no resource in the design may be used more than once per instruction, thus some resources will be duplicated.
• The single-cycle design will require:
  – two memories (instruction and data),
  – two additional adders.

Elements for Datapath Design

[Figure: the datapath building blocks, labeled a to h]
a. Program counter (PC): 32-bit input, 32-bit output
b. Register file: Read register 1, Read register 2 and Write register (5-bit register numbers); Write data, Read data 1 and Read data 2 (32 bits); control input RegWrite
c. ALU: 32-bit operands, 4-bit ALU control, 32-bit ALU result, Zero and overflow outputs
d. Adder: two 32-bit inputs, 32-bit Sum
e. Data memory unit: Address, Write data and Read data (32 bits); control inputs MemRead and MemWrite
f. Instruction memory: 32-bit Instruction address, 32-bit Instruction (annotated MemRead=1, MemWrite=0)
g. Sign-extension unit: 16-bit input, 32-bit output
h. Shift left 2: 32-bit input, 32-bit output

Abstract / Simplified View (1st look)

• This generic implementation:
  – uses the program counter (PC) to supply the instruction address,
  – gets the instruction from memory,
  – reads registers,
  – uses the instruction opcode to decide exactly what to do.

[Figure: PC, Instruction memory (Address, Instruction), Registers (Register #, Data), ALU and Data memory (Address, Data)]

Abstract / Simplified View (2nd look)

[Figure 5.1]
• PC is incremented by 4 by most instructions, and by 4 + 4 × offset by branch instructions.
• Jump instructions change PC differently (not shown).

Our Implementation

• An edge-triggered methodology
• Typical execution:
  – read contents of some state elements at the beginning of the clock cycle,
  – send values through some combinational logic,
  – write results to one or more state elements at the end of the clock cycle.

[Figure: State element 1, Combinational logic, State element 2, within one clock cycle]

• An edge-triggered methodology allows a state element to be read and written in the same clock cycle.

[Figure 5.5]

Incrementing PC & Fetching Instruction

[Figure 5.6 with addition in red: PC, Instruction memory (Read address, Instruction), an Add unit with constant 4, Clock]
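The fetch step and the edge-triggered update described above can be sketched in Python. This is only an illustrative model, not part of the slides: the function name `fetch`, the list-based instruction memory and the three sample instruction words are assumptions made for the example.

```python
# Hypothetical software model of "Incrementing PC & Fetching Instruction".
# All names here are illustrative assumptions, not taken from the slides.

def fetch(pc, instruction_memory):
    """Return (instruction, next_pc) for a word-aligned byte address pc."""
    instruction = instruction_memory[pc // 4]  # instruction memory is only read
    next_pc = (pc + 4) & 0xFFFFFFFF            # dedicated adder, 32-bit wraparound
    return instruction, next_pc

# Assumed sample program: three arbitrary 32-bit instruction words.
imem = [0x012A4020, 0x8D0B0004, 0x08000000]

pc = 0
trace = []
for _ in range(3):
    # During the cycle: the old PC value is read and sent through
    # combinational logic (memory read and adder).
    instr, next_pc = fetch(pc, imem)
    trace.append((pc, instr))
    # At the clock edge ending the cycle: the state element (PC) is written.
    pc = next_pc
```

The loop reads `pc` and overwrites it within the same iteration, which mirrors how an edge-triggered state element can be read and written in the same clock cycle.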

Datapath for R-type Instructions

R-type format:
  bits:   31-26   25-21  20-16  15-11  10-6   5-0
  field:  000000  rs     rt     rd     00000  funct

funct values: add = 32, sub = 34, and = 36, or = 37, nor = 39, slt = 42

[Figure: Instruction bits I 25-21, I 20-16 and I 15-11 drive Read register 1, Read register 2 and Write register of the Registers block; Read data 1 and Read data 2 feed the ALU (ALU result, Zero); Write data, RegWrite, 4-bit ALU control, Clock]

Complete Datapath for R-type Instructions

Based on the contents of the op-code and funct fields, the Control Unit sets ALU control appropriately and asserts RegWrite, i.e. RegWrite = 1.

[Figure: PC, Instruction memory (Read address, Instruction), Add unit with constant 4, Registers, ALU (ALU result, Zero), RegWrite, 4-bit ALU control, I 25-21, I 20-16, I 15-11, Clock]

Datapath for LW and SW Instructions

Control Unit sets:
• ALU control = 0010 (add) for address calculation for both lw and sw
• MemRead=0, MemWrite=1 and RegWrite=0 for sw
• MemRead=1, MemWrite=0 and RegWrite=1 for lw

sw or lw format:
  bits:   31-26   25-21  20-16  15-0
  field:  opcode  rs     rt     offset

[Figure: Instruction bits I 25-21, I 20-16, I 20-16 and I 15-0; Registers; Sign extend (16 to 32); ALU (ALU result, Zero, 4-bit ALU control); Data memory (Address, Write data, Read data); MemRead, MemWrite, RegWrite, Clock]

Datapath for R-type, LW & SW Instructions

Let us determine the setting of control lines for R-type, lw & sw instructions.

[Figure: PC, Instruction memory, Add unit with constant 4, Registers, Sign extend (16 to 32), ALU with 4-bit ALU control, Data memory, and 0/1 multiplexers controlled by RegDst, ALUSrc and MemtoReg; MemRead, MemWrite, Clock]
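The field layouts above can be checked with a short Python sketch that extracts the R-type fields and computes the lw/sw memory address (register rs plus the sign-extended 16-bit offset, which is what ALU control = 0010 performs). The function names, the dictionary-based register file and the example encodings are assumptions for illustration; the name `shamt` for bits 10-6 is the usual MIPS name for the field the slide shows as 00000.

```python
# Illustrative sketch only; helper names are assumptions, not from the slides.

def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def decode_r_type(instr):
    """Split a 32-bit instruction word into the R-type fields."""
    return {
        "opcode": (instr >> 26) & 0x3F,  # bits 31-26
        "rs":     (instr >> 21) & 0x1F,  # bits 25-21
        "rt":     (instr >> 16) & 0x1F,  # bits 20-16
        "rd":     (instr >> 11) & 0x1F,  # bits 15-11
        "shamt":  (instr >> 6) & 0x1F,   # bits 10-6
        "funct":  instr & 0x3F,          # bits 5-0
    }

def effective_address(instr, regs):
    """Address used by lw/sw: regs[rs] + sign-extended offset, modulo 2**32."""
    rs = (instr >> 21) & 0x1F
    offset = sign_extend16(instr & 0xFFFF)
    return (regs[rs] + offset) & 0xFFFFFFFF

# Assumed example encodings: add $8, $9, $10 and lw $11, -4($8).
add_fields = decode_r_type(0x012A4020)
lw_address = effective_address(0x8D0BFFFC, {8: 0x1000})
```

Here `add_fields["funct"]` is 32, matching the add entry in the funct list, and the negative offset shows why the 16-bit field has to pass through the sign-extension unit before reaching the ALU.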

Datapath for BEQ Instruction

beq format:
  bits:   31-26  25-21  20-16  15-0
  field:  beq    rs     rt     offset

Branch target = [PC] + 4 + 4 × offset

[Figure 5.9 with additions in red: Instruction fields rs, rt and offset; Registers; ALU with 4-bit ALU control, whose Zero output goes to the branch control logic; Sign extend (16 to 32) and Shift left 2 feeding an Add unit together with PC + 4 from the instruction datapath; the Sum is the Branch target; RegWrite]

Datapath for R-type, LW, SW & BEQ

[Figure 5.15 with additions in red: PC, Instruction memory (Read address, Instruction [31-0]), Instruction [25-21], Instruction [20-16], Instruction [15-11] and Instruction [15-0]; Registers; Sign extend (16 to 32); Shift left 2; Add unit with constant 4 and Add unit for the branch target; ALU (ALU result, Zero) with 4-bit ALU control; Data memory; four 0/1 multiplexers; control lines RegDst, ALUSrc, MemtoReg, MemRead, MemWrite, RegWrite and PCSrc; annotations rs, rt, rd, offset, MemRead=1, MemWrite=0, Clock]

Control Unit and Datapath

[Figure 5.17 with additions in red: the previous datapath plus a Control block driven by Instruction [31-26] with outputs RegDst, Branch, MemRead, MemtoReg, ALUOp, MemWrite, ALUSrc and RegWrite, and an ALU control block driven by Instruction [5-0]; annotations rs, rt, rd, funct, offset, opcode, MemRead=1, MemWrite=0, Clock, "Clock anded" (twice)]
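The branch-target formula, Branch target = [PC] + 4 + 4 × offset, can be sketched in Python as follows. The function names are hypothetical, and the selection rule PCSrc = Branch AND Zero is an assumption in this sketch: the slide text only says that Zero goes to the branch control logic and that a PCSrc signal selects the next PC.

```python
# Illustrative sketch; names and the PCSrc rule are assumptions, see lead-in.

def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def branch_target(pc, offset16):
    """[PC] + 4 + 4 * offset, with the offset sign-extended and shifted left by 2."""
    return (pc + 4 + (sign_extend16(offset16) << 2)) & 0xFFFFFFFF

def next_pc(pc, offset16, branch, zero):
    """Choose between PC + 4 and the branch target."""
    pc_src = bool(branch and zero)  # assumed: PCSrc = Branch AND Zero
    return branch_target(pc, offset16) if pc_src else (pc + 4) & 0xFFFFFFFF

taken = next_pc(0x100, 3, branch=1, zero=1)      # 0x100 + 4 + 12
not_taken = next_pc(0x100, 3, branch=1, zero=0)  # falls through to PC + 4
backward = branch_target(0x100, 0xFFFF)          # offset -1 lands on the beq itself
```

Computing both candidates and selecting one with a multiplexer-like condition is also why the single-cycle design needs the two additional adders: the ALU is busy comparing rs and rt in the same cycle.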

Truth Table for (Main) Control Unit

          Input    Output
          Op-code  RegDst  ALUSrc  MemtoReg  RegWrite  MemRead  MemWrite  Branch  ALUOp1  ALUOp0
  R-type  000000   1       0       0         1         d        0         0       1       0
  lw      100011   0       1       1         1         1        0         0       0       0
  sw      101011   d       1       d         0         0        1         0       0       0
  beq     000100   d       0       d         0         d        0         1       0       1

• ALUOp[1-0] = 00 → signal to the ALU Control unit for the ALU to perform the add function, i.e. set Ainvert = 0, Binvert = 0 and Operation = 10
• ALUOp[1-0] = 01 → signal to the ALU Control unit for the ALU to perform the subtract function, i.e. set Ainvert = 0, Binvert = 1 and Operation = 10
• ALUOp[1-0] = 10 → signal to the ALU Control unit to look at bits I[5-0] and, based on their pattern, to set Ainvert, Binvert and Operation so that the ALU performs the appropriate function, i.e. add, sub, slt, and, or & nor

Truth Table of ALU Control Unit

       Input                                   Output (ALU Control)
       ALUOp1  ALUOp0  F5  F4  F3  F2  F1  F0  Ainvert  Binvert  Operation
  add  0       0       d   d   d   d   d   d   0        0        10
  sub  0       1       d   d   d   d   d   d   0        1        10
  add  1       0       1   0   0   0   0   0   0        0        10
  sub  1       0       1   0   0   0   1   0   0        1        10
  and  1       0       1   0   0   1   0   0   0        0        00
  or   1       0       1   0   0   1   0   1   0        0        01
  slt  1       0       1   0   1   0   1   0   0        1        11
  nor  1       0       1   0   0   1   1   1   1        1        00
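The two truth tables can be written down as lookup tables, which makes the two-level decoding (main control produces ALUOp, ALU control combines ALUOp with the funct field) easy to follow. This is a minimal sketch under stated assumptions: the dictionary layout, the use of `None` for the don't-care entries marked d, and the function name `alu_control` are illustrative choices, not something the slides prescribe.

```python
# Illustrative sketch; None stands for a don't-care ("d") entry in the tables.

MAIN_CONTROL = {
    0b000000: dict(RegDst=1, ALUSrc=0, MemtoReg=0, RegWrite=1,          # R-type
                   MemRead=None, MemWrite=0, Branch=0, ALUOp=0b10),
    0b100011: dict(RegDst=0, ALUSrc=1, MemtoReg=1, RegWrite=1,          # lw
                   MemRead=1, MemWrite=0, Branch=0, ALUOp=0b00),
    0b101011: dict(RegDst=None, ALUSrc=1, MemtoReg=None, RegWrite=0,    # sw
                   MemRead=0, MemWrite=1, Branch=0, ALUOp=0b00),
    0b000100: dict(RegDst=None, ALUSrc=0, MemtoReg=None, RegWrite=0,    # beq
                   MemRead=None, MemWrite=0, Branch=1, ALUOp=0b01),
}

# funct field -> (Ainvert, Binvert, Operation), used when ALUOp = 10.
ALU_CONTROL_BY_FUNCT = {
    0b100000: (0, 0, 0b10),  # add
    0b100010: (0, 1, 0b10),  # sub
    0b100100: (0, 0, 0b00),  # and
    0b100101: (0, 0, 0b01),  # or
    0b101010: (0, 1, 0b11),  # slt
    0b100111: (1, 1, 0b00),  # nor
}

def alu_control(alu_op, funct):
    """Return (Ainvert, Binvert, Operation) for a 2-bit ALUOp and 6-bit funct."""
    if alu_op == 0b00:       # lw/sw: add, funct is don't-care
        return (0, 0, 0b10)
    if alu_op == 0b01:       # beq: subtract, funct is don't-care
        return (0, 1, 0b10)
    if alu_op == 0b10:       # R-type: decode the funct field
        return ALU_CONTROL_BY_FUNCT[funct]
    raise ValueError("ALUOp = 11 is not used in these tables")

lw_alu = alu_control(MAIN_CONTROL[0b100011]["ALUOp"], funct=0)
slt_alu = alu_control(MAIN_CONTROL[0b000000]["ALUOp"], funct=0b101010)
```

Keeping ALUOp as a small intermediate code means the main control only has to look at the op-code, while the funct bits are handled locally by the ALU control.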

Design of (Main) Control Unit

[Figure C.2.5: inputs Op5 to Op0 with product terms for R-format, lw, sw and beq; outputs RegDst, ALUSrc, MemtoReg, RegWrite, MemRead, MemWrite, Branch, ALUOp1, ALUOp0]

          Op-code bits
          5 4 3 2 1 0   RegDst  ALUSrc  MemtoReg  RegWrite  MemRead  MemWrite  Branch  ALUOp1  ALUOp0
  R-type  0 0 0 0 0 0   1       0       0         1         d        0         0       1       0
  lw      1 0 0 0 1 1   0       1       1         1         1        0         0       0       0
  sw      1 0 1 0 1 1   d       1       d         0         0        1         0       0       0
  beq     0 0 0 1 0 0   d       0       d         0         d        0         1       0       1

$\text{RegDst} = \overline{Op_5}\,\overline{Op_4}\,\overline{Op_3}\,\overline{Op_2}\,\overline{Op_1}\,\overline{Op_0}$

$\text{ALUSrc} = Op_5\,\overline{Op_4}\,\overline{Op_3}\,\overline{Op_2}\,Op_1\,Op_0 + Op_5\,\overline{Op_4}\,Op_3\,\overline{Op_2}\,Op_1\,Op_0$

…

Datapath for R-type, LW, SW, BEQ & J

j format:
  bits:   31-26  25-0
  field:  j      jump_target

PC ← PC[31-28] || jump_target || 00

[Figure 5.24 with correction in red: the datapath and control of the previous figures with labels PC + 4 [31-28], PC[31-28] and "Add 2 zeros"]
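The jump-address rule PC ← PC[31-28] || jump_target || 00 amounts to a mask, a shift and an OR. The sketch below follows the formula exactly as written on the slide (upper four bits taken from PC); the function name and the example values are assumptions for illustration.

```python
# Illustrative sketch of the concatenation PC[31-28] || jump_target || 00.
# Function name and example values are assumptions, not from the slides.

def jump_address(pc, instr):
    """Build the 32-bit jump address from the current PC and a j instruction."""
    jump_target = instr & 0x03FFFFFF   # bits 25-0 of the instruction
    upper = pc & 0xF0000000            # PC[31-28]
    return upper | (jump_target << 2)  # appending "00" is a shift left by 2

# Assumed example: opcode 000010 (j) with jump_target = 3.
address = jump_address(0x10000008, 0x08000003)
```

Shifting the 26-bit field left by 2 yields 28 bits, which together with the 4 upper PC bits gives the full 32-bit address, so no adder is involved in a jump.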

Design of Control Unit (J included)

[Figure: inputs Op5 to Op0 with product terms for R-format, lw, sw and beq; outputs RegDst, ALUSrc, MemtoReg, RegWrite, MemRead, MemWrite, Branch, ALUOp1, ALUOp0 and Jump]

          Op-code bits
          5 4 3 2 1 0   RegDst  ALUSrc  MemtoReg  RegWrite  MemRead  MemWrite  Branch  ALUOp1  ALUOp0  Jump
  R-type  0 0 0 0 0 0   1       0       0         1         d        0         0       1       0       0
  lw      1 0 0 0 1 1   0       1       1         1         1        0         0       0       0       0
  sw      1 0 1 0 1 1   d       1       d         0         0        1         0       0       0       0
  beq     0 0 0 1 0 0   d       0       d         0         d        0         1       0       1       0
  J       0 0 0 0 1 0   d       d       d         0         d        0         d       d       d       1

$\text{Jump} = \overline{Op_5}\,\overline{Op_4}\,\overline{Op_3}\,\overline{Op_2}\,Op_1\,\overline{Op_0}$

No changes in the ALU Control unit.

Cycle Time Calculation

• Let us assume that the only delays introduced are by the following tasks:
  – Memory access (read and write time = 3 nsec)
  – Register file access (read and write time = 1 nsec)
  – ALU to perform function (= 2 nsec)
• Under those assumptions, here are the instruction execution times:

          Instr fetch  Reg read  ALU oper  Data memory  Reg write  Total
  R-type  3            1         2                      1          7 nsec
  lw      3            1         2         3            1          10 nsec
  sw      3            1         2         3                       9 nsec
  branch  3            1         2                                 6 nsec
  jump    3                                                        3 nsec

• Thus the clock cycle time has to be 10 nsec, and clock rate = 1/10 nsec = 100 MHz.

Single Cycle Processor: Conclus

• Single Cycle Problems:
  – what if we had a more complicated instruction, like floating point?
  – the clock cycle would be much longer,
  – thus for shorter and more often used instructions, such as add & lw, it is wasteful of time.
• One Solution:
  – use a "smaller" cycle time, and
  – have different instructions take different numbers of cycles.
• And that is a "multi-cycle" processor.
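The cycle-time arithmetic from the Cycle Time Calculation slide can be reproduced with a few lines of Python. The delays and the per-instruction steps come from that slide; the dictionary names and the step labels `mem`, `reg` and `alu` are assumptions made for this sketch.

```python
# Illustrative sketch of the cycle-time arithmetic; names are assumptions.

DELAY_NS = {"mem": 3, "reg": 1, "alu": 2}  # memory, register file, ALU delays

# Functional units used by each instruction class, in order.
STEPS = {
    "R-type": ["mem", "reg", "alu", "reg"],         # fetch, read, ALU, write
    "lw":     ["mem", "reg", "alu", "mem", "reg"],  # fetch, read, ALU, data, write
    "sw":     ["mem", "reg", "alu", "mem"],         # fetch, read, ALU, data
    "branch": ["mem", "reg", "alu"],                # fetch, read, ALU
    "jump":   ["mem"],                              # fetch only
}

exec_time_ns = {name: sum(DELAY_NS[step] for step in steps)
                for name, steps in STEPS.items()}

# A single-cycle design must fit the slowest instruction into one cycle.
cycle_time_ns = max(exec_time_ns.values())
clock_rate_mhz = 1000 / cycle_time_ns  # 1 / (10 nsec) = 100 MHz
```

Only lw needs the full 10 nsec; a jump finishes its useful work in 3 nsec and idles for the rest of the cycle, which is the inefficiency the multi-cycle approach addresses.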