ID: 3592062 • Letter: T
Question
TOPIC: Pipeline Forwarding
SUBJECT: Computer Architecture
PROBLEM: Answer the questions below for two MIPS assembly language instructions, add $t2, $t3, $t4 and sub $s2, $s3, $s4, that will require data forwarding to avoid a pipeline stall.
1) For each value that needs to be forwarded, explain:
a) Which register value(s) will need to be forwarded
b) Which instruction generated the value, and which pipeline register the value will be forwarded from
c) Which instruction, and which pipeline stage within that instruction, the value will be forwarded to
2) List the exact hazard detection rule that will catch this condition, and then explain the rule in English.
Explanation / Answer
• Hazards cause imperfect pipelining
• They prevent us from achieving CPI = 1
• They are generally caused by “counter flow” data dependences in the pipeline
• Three kinds
  • Structural -- contention for hardware resources
  • Data -- a data value is not available when/where it is needed.
  • Control -- the next instruction to execute is not known.
• Two ways to deal with hazards
  • Removal -- add hardware and/or complexity to work around the hazard so it does not exist
    • Bypassing/forwarding
    • Speculation
  • Stall -- sacrifice performance to prevent the hazard from occurring
    • Stalling causes “bubbles”

Data Dependences
• A data dependence occurs whenever one instruction needs a value produced by another.
  • Register values (for now)
  • Also memory accesses (more on this later)

    add $s0, $t0, $t1
    sub $t2, $s0, $t3
    add $t3, $s0, $t4
    and $t3, $t2, $t4
    sw $t1, 0($t2)
    ld $t3, 0($t2)
    ld $t4, 16($s4)

• In our simple pipeline, these instructions cause a hazard
• Dependences in the pipeline
[Figure: pipeline diagram. add $s0, $t0, $t1 and sub $t2, $s0, $t3 each pass through Fetch, Decode, EX, Mem, Write back; horizontal axis: Cycles]

How can we fix it?
• Ideas?

Solution 1: Make the compiler deal with it.
• Expose hazards to the big A architecture
  • A result is available N instructions after the instruction that generates it.
  • In the meantime, the register file has the old value.
  • “delay slots”
• What is N?
• Can it change?
• What can the compiler do?
[Figure: pipeline stages Fetch, Decode, EX, Mem, Write back]

Compiling for delay slots

    add $s0, $t0, $t1
    sub $t2, $s0, $t3
    add $t3, $s0, $t4
    and $t7, $t5, $t4

Rearrange instructions:

    add $s0, $t0, $t1
    and $t7, $t5, $t4
    sub $t2, $s0, $t3
    add $t3, $s0, $t4

• The compiler must fill the delay slots with other instructions
• What if it can’t?
  • No-ops

Solution 2: Stall
• When you need a value that is not ready, “stall”
  • Suspend the execution of the executing instruction
  • and those that follow.
• This introduces a pipeline “bubble.” A bubble is a lack of work to do.
• The bubble moves through the pipeline like an instruction.
[Figure: pipeline diagram. add $s0, $t0, $t1 proceeds through Fetch, Decode, EX, Mem, Write back; sub $t2, $s0, $t3 is fetched, then held at a Stall before continuing through Decode, EX, Mem, Write back; horizontal axis: Cycles]

Stalling the pipeline
• Freeze all pipeline stages before the stage where the hazard occurred.
  • Disable the PC update
  • Disable the pipeline registers
• This is essentially equivalent to always inserting a nop when a hazard exists
  • Insert nop control bits at the stalled stage (decode in our example)
• How is this solution still potentially “better” than relying on the compiler?
  • The compiler can still act like there are delay slots to avoid stalls.
  • Implementation details are not exposed in the ISA

The Impact of Stalling On Performance
• ET = I * CPI * CT (execution time = instruction count * cycles per instruction * cycle time)
• I and CT are constant
• What is the impact of stalling on CPI?
• What do we need to know to figure it out?
  • Fraction of instructions that stall: 30%
  • Baseline CPI = 1
  • Stall CPI = 1 + 2 = 3
• New CPI = 0.3*3 + 0.7*1 = 1.6

Solution 3: Bypassing/Forwarding
• Data values are computed in Ex and Mem but “publicized in write back”
• The data exists! We should use it.
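To see what forwarding has to win back, the stall arithmetic from the performance slide can be checked in a few lines. This is a minimal sketch using the slide's numbers (30% of instructions stall, 2 bubble cycles each, baseline CPI of 1); the function name `effective_cpi` and its parameters are illustrative assumptions, not part of the original answer.

```python
def effective_cpi(stall_fraction, stall_cycles, base_cpi=1.0):
    """Weighted-average CPI when a fraction of instructions each add stall cycles."""
    stalled_cpi = base_cpi + stall_cycles  # e.g. 1 + 2 = 3 for a stalled instruction
    return stall_fraction * stalled_cpi + (1.0 - stall_fraction) * base_cpi


# Numbers taken from the slide: 30% of instructions stall for 2 cycles each.
new_cpi = effective_cpi(stall_fraction=0.3, stall_cycles=2)
print(round(new_cpi, 2))

# With I and CT held constant, ET = I * CPI * CT scales directly with CPI,
# so the slowdown relative to the ideal CPI of 1 equals new_cpi.
slowdown = new_cpi / 1.0
```

Because I and CT are constant in ET = I * CPI * CT, the ratio of the two CPI values is also the ratio of execution times.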
[Figure: pipeline stages Fetch, Decode, EX, Mem, Write back, annotated with "inputs are needed", "results known", and "Results 'published' to registers"]
• Take the values, wherever they are
• Bypassing or Forwarding
[Figure: pipeline diagram. add $s0, $t0, $t1 followed by sub $t2, $s0, $t3; horizontal axis: Cycles]

Forwarding Paths
[Figure: pipeline diagram. add $s0, $t0, $t1 followed by sub $t2, $s0, $t3 shown in three successive positions; horizontal axis: Cycles]

Forwarding in Hardware
[Figure: pipelined datapath. PC, Instruction Memory, Register File, Sign Extend, Shift left 2, ALU, Data Memory, with pipeline registers IFetch/Dec, Dec/Exec, Exec/Mem, Mem/WB]

Forwarding for Loads
• Load values come from the Mem stage
[Figure: pipeline diagram. ld $s0, 0($t0) followed by sub $t2, $s0, $t3; horizontal axis: Cycles]
• Time travel presents significant implementation challenges: the loaded value is not available until the end of Mem, which is after the immediately following instruction's EX stage needs it, so forwarding alone cannot remove this hazard.
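The forwarding decision sketched in the slides can be written as the operand-mux select logic commonly given for textbook five-stage MIPS pipelines. This is a minimal sketch under stated assumptions, not the slides' own implementation: the function name `forward_select`, its parameters, and the 2-bit encodings (10 = take the value from Exec/Mem, 01 = take it from Mem/WB, 00 = use the register file) are assumptions introduced for illustration. It is applied to the slides' own example sequence rather than to the instructions in the problem statement.

```python
# MIPS register numbers for the registers used in the slide example.
T2, T3, T4, S0 = 10, 11, 12, 16


def forward_select(src_reg, ex_mem_regwrite, ex_mem_rd, mem_wb_regwrite, mem_wb_rd):
    """Return the 2-bit mux select for one ALU source operand of the instruction in EX.

    src_reg is the source register number held in the Dec/Exec pipeline register.
    """
    # EX hazard: the instruction one ahead writes the register we are about to read.
    if ex_mem_regwrite and ex_mem_rd != 0 and ex_mem_rd == src_reg:
        return 0b10
    # MEM hazard: the instruction two ahead writes it, and no newer result exists.
    if mem_wb_regwrite and mem_wb_rd != 0 and mem_wb_rd == src_reg:
        return 0b01
    # No hazard: use the value read from the register file.
    return 0b00


# sub $t2, $s0, $t3 is in EX while add $s0, $t0, $t1 sits in Exec/Mem.
sub_a = forward_select(S0, True, S0, False, 0)
sub_b = forward_select(T3, True, S0, False, 0)
print(format(sub_a, "02b"), format(sub_b, "02b"))  # prints: 10 00

# add $t3, $s0, $t4 is in EX one cycle later: sub is in Exec/Mem, add $s0 is in Mem/WB.
add_a = forward_select(S0, True, T2, True, S0)
print(format(add_a, "02b"))  # prints: 01
```

The check against register 0 reflects that $zero is hardwired to zero in MIPS, so a write to it must never be forwarded. Testing the Exec/Mem condition first gives priority to the most recent result when both pipeline registers hold the same destination.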