Question
A RISC processor is driven by a 2-GHz clock. Instructions are executed in a five-stage pipeline. Instruction statistics in a large program are as follows: Please answer the following questions.
(a) Assume all memory access operations are cache hits. What is the ideal instruction throughput?
(b) Assume there are an instruction cache and a data cache. For both caches, it takes 4 cycles to complete the memory access on a cache miss. 80% of instruction fetches are cache hits and 20% are cache misses. For all Load instructions, the data to be accessed are in the data cache, but 30% of the Load instructions are followed by a dependent instruction, which stalls the pipeline for one cycle. 40% of the Store instructions store data into the data cache, while 60% of them store data into the main memory. What is the instruction throughput?
(c) Assume all memory accesses are cache hits. 40% of the branch instructions are unconditional, while 60% are conditional. 80% of the conditional branches are taken and 20% are not taken. The penalty for taking the branch is one cycle. What is the instruction throughput?
Explanation / Answer
a) Since every memory access is a cache hit, each stage completes in one cycle, so an instruction takes 5 cycles to pass through all five stages. Counting that full per-instruction latency (that is, without overlapping instructions in the pipeline), CPI (cycles per instruction) = 5.
Execution time = (Number of Instructions * CPI) / Clock Rate
Hence:
Instruction Throughput = Number of Instructions / Execution Time = Clock Rate / CPI
= 2 GHz / 5 = 2 * 10^9 / 5 = 400 million instructions per second
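The relation Throughput = Clock Rate / CPI can be checked with a few lines of Python. This is only a minimal sketch; the function name `instruction_throughput` is a hypothetical helper, and the CPI of 5 is the value used in this answer.

```python
def instruction_throughput(clock_hz: float, cpi: float) -> float:
    """Instructions per second, from throughput = clock rate / CPI."""
    if cpi <= 0:
        raise ValueError("CPI must be positive")
    return clock_hz / cpi


# Values from part (a): a 2 GHz clock and the CPI of 5 used in this answer.
ips = instruction_throughput(2e9, 5)
print(f"{ips / 1e6:.0f} million instructions per second")
```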
b) Calculate the CPI contribution of each stage separately.
Stage 1: 80% of instruction fetches take one cycle and 20% of instruction fetches take 4 cycles
CPI for stage 1 = (0.8 * 1) + (0.2 * 4) = 1.6
Stage 2: Instruction decode stage takes one cycle per instruction. Therefore CPI for stage 2 = 1
Stage 3: Instruction execution stage takes one cycle per instruction. Therefore CPI for stage 3 = 1
Stage 4: The cycles spent in the memory stage depend on the instruction type. This answer charges a Load 4 cycles for its memory access, plus 1 stall cycle (5 cycles in total) for the 30% of Loads that are followed by a dependent instruction. A Store takes 1 cycle when it writes to the data cache (40% of Stores) and 5 cycles when it writes to main memory (60% of Stores). All other instructions take 1 cycle. Taking Loads as 40% of all instructions, Stores as 20%, and the remaining instruction classes as 10% and 30%:
CPI for stage 4
= (0.4 * 0.3 * 5) (Loads that stall the pipeline)
+ (0.4 * 0.7 * 4) (Loads without a stall)
+ (0.2 * 0.4 * 1) (Stores into the data cache)
+ (0.2 * 0.6 * 5) (Stores into main memory)
+ ((0.1 + 0.3) * 1) (all other instructions)
= 0.6 + 1.12 + 0.08 + 0.6 + 0.4
= 2.8
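The stage 4 figure is a weighted average: each case contributes its share of all instructions times the cycles charged to it. The sketch below reproduces that sum. The names `stage4_cases` and `weighted_cpi` are hypothetical, and the fractions and cycle counts are simply copied from the calculation above, not derived independently.

```python
# Each entry: (fraction of all instructions, cycles charged in stage 4).
# The numbers are copied from the stage 4 sum in this answer.
stage4_cases = {
    "load, stalls pipeline": (0.4 * 0.3, 5),
    "load, no stall": (0.4 * 0.7, 4),
    "store to data cache": (0.2 * 0.4, 1),
    "store to main memory": (0.2 * 0.6, 5),
    "all other instructions": (0.1 + 0.3, 1),
}


def weighted_cpi(cases):
    """Sum of fraction * cycles over all cases; fractions must cover 100%."""
    total_fraction = sum(fraction for fraction, _ in cases.values())
    if abs(total_fraction - 1.0) > 1e-9:
        raise ValueError("instruction fractions must sum to 1")
    return sum(fraction * cycles for fraction, cycles in cases.values())


print(f"CPI for stage 4 = {weighted_cpi(stage4_cases):.2f}")
```

The check that the fractions sum to 1 guards against leaving an instruction class out of the mix.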
Stage 5: Write back stage takes one cycle per instruction. Therefore CPI for stage 5 = 1
Total CPI = 1.6 + 1 + 1 + 2.8 + 1 = 7.4
Instruction throughput = 2 GHz / 7.4 = 2 * 10^9 / 7.4 = 270.27 million instructions per second (approximately)
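As a quick check of the arithmetic in part (b), the per-stage values can be summed and divided into the clock rate. This is a sketch under the answer's own model, in which the per-stage CPI values are added together. The dictionary name `stage_cpi` and its keys are hypothetical labels.

```python
# Per-stage CPI values as computed in part (b) of this answer.
stage_cpi = {
    "fetch": 1.6,
    "decode": 1.0,
    "execute": 1.0,
    "memory": 2.8,
    "write back": 1.0,
}

clock_hz = 2e9  # 2 GHz clock from the question

total_cpi = sum(stage_cpi.values())
throughput = clock_hz / total_cpi

print(f"Total CPI = {total_cpi:.1f}")
print(f"Throughput = {throughput / 1e6:.2f} million instructions per second")
```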