Question
Consider a program that executes a large number of instructions. Assume that the program does not suffer from stalls from data hazards or structural hazards. Assume that 20% of all instructions are branch instructions, and 75% of these branch instructions are Taken. What is the average CPI for this program when it executes on each of the processors listed below? All of these processors implement a 10-stage pipeline and resolve a branch outcome at the end of the 4th stage. The 1st stage fetches an instruction, the 2nd stage does decode, the 3rd stage does register read, and the 4th stage does the computations for the branch. (30 points)
1 The processor pauses instruction fetch as soon as it fetches a branch. Instruction fetch is resumed after the branch outcome has been resolved.
2 The processor always fetches instructions sequentially. If a branch is resolved as Taken, the incorrectly fetched instructions after the branch are squashed.
3 The processor implements three branch delay slots. The compiler fills the branch delay slots with three instructions that come before the branch in the original code (option A in the videos).
4 The processor does not implement branch delay slots. Instead, it implements a hardware branch predictor that makes correct predictions for 90% of all branches. When an incorrect prediction is discovered, the incorrectly fetched instructions after the branch are squashed.
Explanation / Answer
(1)
The processor pauses instruction fetch as soon as it fetches a branch and resumes only after the branch outcome has been resolved. Without stalls, a pipelined processor completes one instruction per cycle, so the base CPI is 1. A branch fetched in cycle 1 is resolved at the end of the 4th stage (cycle 4), so the next instruction cannot be fetched until cycle 5. Every branch, Taken or Not Taken, therefore costs 3 stall cycles on top of its own cycle, for an effective 4 clock cycles per branch.
So, average CPI of program is = 0.2 * 4 + 0.8 * 1
= 1.6
(2)
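The processor keeps fetching sequentially, so a Not-Taken branch costs nothing extra. If a branch is Taken, the three instructions fetched behind it (those in stages 1 to 3 when the branch finishes the 4th stage) are squashed. So a Taken branch effectively takes 4 clock cycles, while Not-Taken branches and all other instructions take 1 clock cycle.
So, average CPI of program is = 0.20 * 0.75 * 4 + 0.20 * 0.25 * 1 + 0.8 * 1
= 1.45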
(3)
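The three branch delay slots exactly cover the three cycles between fetching the branch and resolving it at the end of the 4th stage. Because the compiler fills them with instructions that come before the branch in the original code (option A), those instructions do useful work whether or not the branch is Taken, and nothing is stalled or squashed. Every instruction, including a branch, takes 1 clock cycle.
So, average CPI of program is = 0.20 * 1 + 0.8 * 1
= 1.0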
(4)
The hardware branch predictor makes a correct prediction for 90% of branches. A correctly predicted branch takes 1 clock cycle. For the 10% of branches that are mispredicted, the 3 instructions fetched after the branch are squashed, so a mispredicted branch effectively takes 4 clock cycles.
So, average CPI of program is = 0.20 * 0.90 * 1 + 0.20 * 0.10 * 4 + 0.8 * 1
= 1.06
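All four cases follow the same pattern: average CPI = base CPI + (fraction of branches) * (fraction of branches that pay the penalty) * (penalty cycles). The short Python sketch below recomputes each case from the parameters given in the question. It assumes a base CPI of 1, ignores pipeline fill time, and takes the penalty to be the resolve stage minus one (3 cycles for resolution at the end of the 4th stage). The function name `average_cpi` and the constant names are illustrative only.

```python
def average_cpi(branch_frac, penalized_frac, penalty_cycles, base_cpi=1.0):
    """Average CPI = base CPI + expected stall cycles per instruction."""
    return base_cpi + branch_frac * penalized_frac * penalty_cycles


# Parameters taken from the question.
BRANCH_FRAC = 0.20         # 20% of instructions are branches
TAKEN_FRAC = 0.75          # 75% of branches are Taken
PREDICTOR_ACCURACY = 0.90  # case 4 only
RESOLVE_STAGE = 4          # branch outcome known at end of stage 4

# Assumption: fetch slots lost per penalized branch = resolve stage - 1.
PENALTY = RESOLVE_STAGE - 1

designs = {
    # Case 1: every branch stalls fetch until it is resolved.
    "1 pause fetch on every branch": average_cpi(BRANCH_FRAC, 1.0, PENALTY),
    # Case 2: only Taken branches squash the sequentially fetched instructions.
    "2 fetch sequentially, squash on Taken": average_cpi(BRANCH_FRAC, TAKEN_FRAC, PENALTY),
    # Case 3: delay slots are always filled with useful work, so no branch pays.
    "3 three filled delay slots": average_cpi(BRANCH_FRAC, 0.0, PENALTY),
    # Case 4: only mispredicted branches squash instructions.
    "4 branch predictor": average_cpi(BRANCH_FRAC, 1.0 - PREDICTOR_ACCURACY, PENALTY),
}

for name, cpi in designs.items():
    print(f"{name}: CPI = {cpi:.2f}")
```

Changing `RESOLVE_STAGE` or `PREDICTOR_ACCURACY` shows how sensitive each design is to a deeper branch-resolution point or a weaker predictor.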