Branch Prediction: Define branch prediction.

luna
Uploaded On 2023-11-09

Presentation Transcript

1. Branch Prediction
- Define branch prediction.
- Draw a state machine for a 2-bit branch prediction scheme.
- Explain the impact on the compiler of branch delay.

2. Control Hazards (Chapter 4, The Processor)
Consider:
              add  $t1, $zero, $zero    # t1 = 0
              beq  $t1, $zero, Ifequal
    Notequal: addi $v0, $zero, 4
    Ifequal:  addi $v0, $zero, 17
- The branch determines the flow of control.
- Fetching the next instruction depends on the branch outcome.

3. Stall on Branch
- Wait until the branch outcome is determined before fetching the next instruction.
- The pipeline can’t determine the next instruction until the MEM stage of beq.
- It is still working on the ID stage of beq when IF of the next instruction should begin!
    add  $t1, $zero, $zero
    beq  $t1, $zero, Ifequal
    addi $v0, $zero, 4          # Notequal
    addi $v0, $zero, 17         # Ifequal
(Pipeline diagram annotation: "Next instr determined here")

4. Deciding earlier helps a little…
- Extra hardware can be designed to test registers and update the PC in the ID stage.
- Then IF of the next instruction can be done one step earlier.
- There is still a 1-cycle stall, however.
    add  $t1, $zero, $zero
    beq  $t1, $zero, Ifequal
    addi $v0, $zero, 4          # Notequal
    addi $v0, $zero, 17         # Ifequal
(Pipeline diagram annotation: "Next instr determined here with extra hardware")

5. Performance penalty of stalling on branch (CS2710 Computer Organization)
- 17% of instructions executed in the SPECint2006 benchmark are branch instructions.
- If we always stalled for 1 clock cycle on a branch, what performance penalty would we have?
- Other instructions: CPI of 1.
- Branches would take 2 cycles (1 + the 1-cycle stall).
- Average CPI: 0.83 * 1 + 0.17 * 2 = 1.17
- Result: a 17% slowdown.
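
The arithmetic on this slide can be checked with a short Python sketch. The function name `average_cpi` and its parameters are illustrative choices, not part of the original slides; the only inputs taken from the slide are the 17% branch fraction, the base CPI of 1, and the 1-cycle stall.

```python
def average_cpi(branch_fraction, base_cpi=1.0, stall_cycles=1):
    """Average CPI when every branch pays `stall_cycles` extra cycles."""
    branch_cpi = base_cpi + stall_cycles          # a branch costs 1 + stall
    non_branch_fraction = 1.0 - branch_fraction
    return non_branch_fraction * base_cpi + branch_fraction * branch_cpi


cpi = average_cpi(0.17)          # 17% branches, as on the slide
slowdown = cpi / 1.0 - 1.0       # relative to the ideal CPI of 1
print(f"CPI = {cpi:.2f}, slowdown = {slowdown:.0%}")
```

Changing `stall_cycles` shows how quickly the penalty grows if the branch is resolved later in the pipeline.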

6. Branch Prediction
A method of resolving branch hazards that assumes a given outcome for the branch and proceeds from that assumption rather than waiting to ascertain the actual outcome.

7. 1-bit Dynamic Branch Prediction
One possibility is to have each branch instruction reserve a bit that retains the “history” of the last decision:
- 0: branch not taken
- 1: branch taken
To execute a branch:
- Check the history bit and expect the same outcome.
- Start fetching from the fall-through (next instruction) or the branch target.
- If wrong, flush the pipeline and flip the prediction bit.
    add  $t1, $zero, $zero
    beq  $t1, $zero, Ifequal
    addi $v0, $zero, 4          # Notequal
(Pipeline diagram annotation: "Next actual instr determined here")
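
A minimal Python sketch of the 1-bit scheme described above. The class name, the dictionary keyed by branch address, the default of "not taken" for a branch that has never been seen, and the address `0x00400004` are all assumptions made for illustration; the slides do not specify how the bit is stored or initialized.

```python
class OneBitPredictor:
    """One history bit per branch: predict whatever the branch did last time."""

    def __init__(self):
        self.history = {}  # branch address -> last outcome (True = taken)

    def predict(self, branch_addr):
        # Assumption: a branch with no history is predicted "not taken".
        return self.history.get(branch_addr, False)

    def update(self, branch_addr, taken):
        """Record the real outcome; return True if the prediction was wrong."""
        mispredicted = self.predict(branch_addr) != taken
        # Storing the outcome is equivalent to flipping the bit on a miss.
        self.history[branch_addr] = taken
        return mispredicted


predictor = OneBitPredictor()
beq_addr = 0x00400004                      # hypothetical address of the beq
print(predictor.update(beq_addr, True))    # no history yet, so this is a miss
print(predictor.update(beq_addr, True))    # the bit now says "taken"
```

A `True` return value corresponds to the "flush pipeline and flip prediction bit" case on the slide.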

8. Problems with 1-bit Dynamic Branch Prediction
- Consider a loop branch that is taken 9 times in a row, then is not taken once (the end-of-loop condition is met).
- Branch taken 9 times, not taken 1 time.
- At steady state:
  - The first branch decision will be incorrect (the bit still holds "not taken" from the previous execution of the loop).
  - The final branch decision will be incorrect.
- Thus, the prediction accuracy would be only 80%.
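
The 80% figure can be reproduced with a small simulation. This is a sketch under stated assumptions: the function name `one_bit_accuracy` is hypothetical, and the history bit is assumed to start at "not taken", which matches the steady state left behind by a previous pass through the loop.

```python
def one_bit_accuracy(outcomes, initial_bit=False):
    """Fraction of correct predictions for a single 1-bit history entry."""
    bit = initial_bit            # True = predict taken
    correct = 0
    for taken in outcomes:
        if bit == taken:
            correct += 1
        bit = taken              # remember only the most recent outcome
    return correct / len(outcomes)


# One pass through the loop: taken 9 times, then not taken once.
loop_pass = [True] * 9 + [False]
print(one_bit_accuracy(loop_pass * 100))   # 100 passes through the loop
```

Each pass mispredicts twice: once on entry (the bit was left at "not taken") and once on exit.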

9. 2-Bit Predictor
Only change the prediction on two successive mispredictions.
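
The state diagram from this slide did not survive in the transcript, so the following Python sketch assumes the common 2-bit saturating-counter formulation (states 0 and 1 predict not taken, states 2 and 3 predict taken). The slide's own state machine may wire the weak states differently; the function name `two_bit_accuracy` and the starting state are also assumptions.

```python
def two_bit_accuracy(outcomes, state=3):
    """Accuracy of one 2-bit saturating counter.

    Assumed encoding: 0 = strongly not taken, 1 = weakly not taken,
                      2 = weakly taken,       3 = strongly taken.
    """
    correct = 0
    for taken in outcomes:
        predicted_taken = state >= 2
        if predicted_taken == taken:
            correct += 1
        # Move one step toward the observed outcome, saturating at 0 and 3.
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return correct / len(outcomes)


# Same loop as before: taken 9 times, then not taken once, repeated.
loop_pass = [True] * 9 + [False]
print(two_bit_accuracy(loop_pass * 100))
```

With this counter, the single not-taken outcome at loop exit only moves the state from strongly taken to weakly taken, so the first iteration of the next pass is still predicted correctly and only one prediction per pass is wrong.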

10. Loops and Static Branch Prediction
Consider the following loop of code. Which branch might we reliably predict?
    .text
    main: li   $t0, 100
    loop: addi $t0, $t0, -1
          add  $t0, $t0, $zero
          bnez $t0, loop
          # other instructions follow here…

11. Example 2: Assembly while-loop
    .text
    main:     li   $t0, 10
    loop:     beqz $t0, exitLoop
              addi $t0, $t0, -1
              add  $t0, $t0, $zero
              j    loop
    exitLoop: # Goto main
              j    main
Which branch is more probable?

12. Static prediction based on code analysis (done by compiler)
- Assume all branches to a previous address are always taken.
- Assume all branches to a subsequent address are not taken.
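
The rule above can be sketched in Python and applied to the conditional branches from slides 10 and 11. The function names and every address below are hypothetical (the slides give no addresses); the outcome lists follow from the loop counters in the two examples, where `bnez` is taken 99 times out of 100 and `beqz` is taken once out of 11.

```python
def predict_static(branch_addr, target_addr):
    """Backward branch (target at a previous address) -> predict taken.
    Forward branch (target at a subsequent address) -> predict not taken."""
    return target_addr < branch_addr


def accuracy(prediction, outcomes):
    """Fraction of outcomes that match a fixed prediction."""
    return sum(prediction == taken for taken in outcomes) / len(outcomes)


# Slide 10: bnez at the bottom of the loop jumps backward to `loop`.
bnez_prediction = predict_static(branch_addr=0x0040000C, target_addr=0x00400004)
bnez_outcomes = [True] * 99 + [False]

# Slide 11: beqz at the top of the loop jumps forward to `exitLoop`.
beqz_prediction = predict_static(branch_addr=0x00400004, target_addr=0x00400014)
beqz_outcomes = [False] * 10 + [True]

print(bnez_prediction, accuracy(bnez_prediction, bnez_outcomes))
print(beqz_prediction, accuracy(beqz_prediction, beqz_outcomes))
```

Under these assumptions the backward `bnez` is predicted taken and is right 99% of the time, while the forward `beqz` is predicted not taken and is right on 10 of its 11 executions.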

13. Dynamic versus static branch prediction
- Static branch prediction
  - Based on typical branch behavior
  - Example: loop and if-statement branches
    - Predict backward branches taken
    - Predict forward branches not taken
- Dynamic branch prediction
  - Hardware measures actual branch behavior, e.g., records the recent history of each branch
  - Assume future behavior will continue the trend
  - When wrong, stall while re-fetching, and update the history

14. Survey
The branch prediction methods we just discussed were examples of:
- Static Branch Prediction
- Dynamic Branch Prediction
- I haven’t a clue

15. MIPS approach: delayed branching
- Always assume “branch not taken”.
- This means the instruction immediately following the branch instruction will always begin to execute.
- The actual decision on whether to branch is not made until after that instruction begins to execute!
- This leaves it to the compiler to insert a “useful” instruction right after the branch, one that would have needed to execute whether or not the branch was taken.
    add  $t1, $zero, $zero
    beq  $t1, $zero, Ifequal
    # next inst after beq!!
(Pipeline diagram annotation: "Next actual instr determined here with extra hardware")

16. Delayed branching example
Before:
             # previous instructions
             add $s1, $s2, $s3
             beq $s2, $zero, Ifequal
             # no-branch instructions
    Ifequal: # branch instructions
After:
             # previous instructions
             beq $s2, $zero, Ifequal
             add $s1, $s2, $s3
             # no-branch instructions
    Ifequal: # branch instructions
- MIPS always assumes “branch not taken”, so the pipeline will automatically begin executing the next instruction following the beq.
- The actual branch will be delayed until AFTER the next instruction executes.
- The compiler must help out by inserting a “useful” instruction after the beq to execute while the branch decision is being made by the processor.

17. Delayed branching pitfall
Before:
             # previous instructions
             add $s2, $s1, $s3
             beq $s2, $zero, Ifequal
             # no-branch instructions
    Ifequal: # branch instructions
Not possible:
             # previous instructions
             beq $s2, $zero, Ifequal
             add $s2, $s1, $s3
             # no-branch instructions
    Ifequal: # branch instructions
- In this case, the beq instruction depends on $s2 being up-to-date before the branching decision is made.
- If the compiler moves the add instruction to after the beq, then $s2 will be updated too late: beq would use a “stale” value of $s2!!
- The compiler in this case would have to search for a different instruction that it could insert after the beq.
- If no such instruction can be found (which is rare), the pipeline will stall.
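
The dependence check behind slides 16 and 17 can be sketched in Python. This is a deliberately simplified illustration, not how any particular compiler is implemented: the `Instr` tuple and `can_fill_delay_slot` are hypothetical names, and the check only covers the one condition discussed on the slides (the moved instruction must not write a register that the branch reads).

```python
from collections import namedtuple

# Hypothetical, minimal instruction record: opcode, destination, sources.
Instr = namedtuple("Instr", ["op", "dest", "sources"])


def can_fill_delay_slot(candidate, branch):
    """True if `candidate`, which currently precedes `branch`, may be moved
    into the delay slot without changing what the branch compares."""
    return candidate.dest not in branch.sources


beq = Instr("beq", None, ("$s2", "$zero"))

safe_add = Instr("add", "$s1", ("$s2", "$s3"))     # slide 16: writes $s1
unsafe_add = Instr("add", "$s2", ("$s1", "$s3"))   # slide 17: writes $s2

print(can_fill_delay_slot(safe_add, beq))
print(can_fill_delay_slot(unsafe_add, beq))
```

The first add only reads $s2, so moving it after the beq is harmless; the second writes $s2, which the beq tests, so the compiler would have to look for a different instruction.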