Question
Consider a program that executes a large number of instructions. Assume that the program does not suffer from stalls from data hazards. Assume that 20% of all instructions are branch instructions, and 65% of these branch instructions are Taken. What is the average CPI for this program when it executes on each of the processors listed below? All of these processors implement an 8-stage pipeline and resolve a branch at the end of the 3rd stage. The 1st stage fetches an instruction, the 2nd stage does decode, and the 3rd stage does register read and branch resolution.
1) The processor pauses instruction fetch as soon as it fetches a branch. Instruction fetch is resumed after the branch outcome has been resolved.
2) The processor always fetches instructions sequentially. If a branch is resolved as Taken, the incorrectly fetched instructions after the branch are squashed.
3) The processor implements one branch delay slot. The compiler fills the branch delay slot with an instruction that comes before the branch in the original code (option A in the videos). After fetching the branch delay slot instruction, the processor pauses instruction fetch until the branch outcome has been resolved.
4) The processor implements a hardware branch predictor that makes correct predictions for 95% of all branches. When an incorrect prediction is discovered, the incorrectly fetched instructions after the branch are squashed.
Explanation / Answer
Answer:
1) The processor pauses instruction fetch as soon as it fetches a branch, and resumes fetch after the branch outcome has been resolved. In an ideal pipeline with no stalls, each instruction has a CPI of 1. Here the CPI for branch instructions is higher, because fetch stays paused from the moment the branch is fetched until its outcome is resolved. The outcome is resolved at the end of the 3rd stage, so the next instruction is fetched 3 clock cycles after the branch was fetched (2 stall cycles). Each branch therefore effectively costs 3 clock cycles.
So, the average CPI of the program is = 0.20 * 3 + 0.80 * 1 = 1.40
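As a sanity check on this case, here is a minimal Python sketch that counts how many cycles the fetch stage is occupied by an instruction stream. The function name `fetch_stall_cpi` and its parameters are hypothetical, every 5th instruction is assumed to be a branch (20%), and pipeline fill time is ignored because the program is assumed to be long.

```python
def fetch_stall_cpi(num_instructions, branch_every, resolve_stage):
    """Average cycles per instruction when fetch pauses after every branch.

    Assumption: instruction i is a branch when i is a multiple of
    branch_every. Pipeline fill/drain cycles are ignored.
    """
    cycles = 0
    for i in range(num_instructions):
        is_branch = (i % branch_every == 0)
        # A branch blocks fetch until it finishes the resolve stage;
        # any other instruction occupies fetch for a single cycle.
        cycles += resolve_stage if is_branch else 1
    return cycles / num_instructions


cpi = fetch_stall_cpi(num_instructions=1000, branch_every=5, resolve_stage=3)
print(f"{cpi:.2f}")
```

With 1000 instructions this counts 200 branches at 3 cycles and 800 other instructions at 1 cycle, which agrees with the weighted sum in the answer.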
2)
If a branch is Taken, the two sequentially fetched instructions behind it (those in stages 1 and 2 when the branch resolves at the end of the 3rd stage) are squashed. So a Taken branch effectively costs 3 clock cycles, while Not Taken branches and all other instructions cost 1 clock cycle.
So, the average CPI of the program is = 0.20 * 0.65 * 3 + 0.20 * 0.35 * 1 + 0.80 * 1 = 1.26
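The same figure can be reached by counting squashed fetch slots directly. The sketch below is illustrative only: `squash_cpi` and the 100-instruction `stream` are hypothetical, with the stream built to match the stated mix (20 branches, 13 of them Taken).

```python
def squash_cpi(outcomes, squashed_per_taken=2):
    """Average CPI when each Taken branch squashes the instructions behind it.

    outcomes holds one entry per useful instruction:
    None = not a branch, True = Taken branch, False = Not Taken branch.
    """
    useful = len(outcomes)
    # Each Taken branch wastes the fetch slots of the squashed instructions.
    wasted = sum(squashed_per_taken for o in outcomes if o is True)
    return (useful + wasted) / useful


# Hypothetical stream: 13 Taken + 7 Not Taken branches, 80 other instructions.
stream = [True] * 13 + [False] * 7 + [None] * 80
cpi = squash_cpi(stream)
print(f"{cpi:.2f}")
```

The order of instructions in the stream does not matter here, since only the number of Taken branches affects the count.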
3)
The branch delay slot is filled with a useful instruction from before the branch, and the processor pauses fetch only after the delay slot instruction has been fetched. This hides one of the two stall cycles, so each branch effectively costs 2 clock cycles.
So, the average CPI of the program is = 0.20 * 2 + 0.80 * 1 = 1.20
4)
The hardware branch predictor makes correct predictions for 95% of branches. A branch costs 1 clock cycle when the prediction is correct and 3 clock cycles when the prediction is incorrect, because the 2 incorrectly fetched instructions are squashed.
So, the average CPI of the program is = 0.20 * 0.95 * 1 + 0.20 * 0.05 * 3 + 0.80 * 1 = 1.02
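All four cases follow one pattern: a base CPI of 1 plus the branch fraction, times the fraction of branches that pay a penalty, times the penalty in cycles. The sketch below restates this in Python under that reading. The helper `average_cpi` and the dictionary labels are hypothetical names, and `Fraction` is used only to keep the arithmetic exact.

```python
from fractions import Fraction


def average_cpi(branch_frac, penalized_frac, penalty_cycles):
    """Base CPI of 1 plus the average penalty cycles per instruction."""
    return 1 + branch_frac * penalized_frac * penalty_cycles


BRANCH = Fraction(20, 100)          # 20% of instructions are branches
TAKEN = Fraction(65, 100)           # 65% of branches are Taken
MISPREDICT = 1 - Fraction(95, 100)  # predictor is wrong 5% of the time

results = {
    "1) stall on every branch": average_cpi(BRANCH, 1, 2),
    "2) squash on Taken": average_cpi(BRANCH, TAKEN, 2),
    "3) one delay slot, then stall": average_cpi(BRANCH, 1, 1),
    "4) 95% accurate predictor": average_cpi(BRANCH, MISPREDICT, 2),
}

for name, cpi in results.items():
    print(f"{name}: {float(cpi):.2f}")
```

The penalty is 2 cycles wherever two fetch slots are lost (cases 1, 2 and 4) and 1 cycle in case 3, where the delay slot instruction fills one of them.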