Question

A RISC processor is driven by a 2-GHz clock. Instructions are executed in a five-stage pipeline. Instruction statistics in a large program are as follows: Please answer the following questions.

(a) Assume all memory access operations are cache hits. What is the ideal instruction throughput?

(b) Assume there are an instruction cache and a data cache. For both caches, a memory access takes 4 cycles to complete on a cache miss. 80% of instruction fetches are cache hits and 20% are cache misses. For all Load instructions, the data to be accessed are in the data cache, but 30% of the Load instructions are followed by a dependent instruction, which stalls the pipeline for one cycle. 40% of the Store instructions store data into the data cache, while 60% of them store data into the main memory. What is the instruction throughput?

(c) Assume all memory accesses are cache hits. 40% of the branch instructions are unconditional, while 60% are conditional. 80% of the conditional branches are taken and 20% are not taken. The penalty for taking the branch is one cycle. What is the instruction throughput?

Explanation / Answer

a) Since every memory access is a cache hit, each stage completes in one cycle. This answer counts all five stage cycles against each instruction, so an instruction takes 5 cycles to complete. Therefore CPI (cycles per instruction) = 5

Execution time = (Number of Instructions * CPI) / Clock Rate
Hence:
Instruction Throughput = Number of Instructions / Execution Time = Clock Rate / CPI
= 2 GHz / 5 = (2 * 10^9) / 5 = 400 million instructions per second
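
The formula above can be checked with a few lines of Python. This is a minimal sketch that simply reproduces the arithmetic of part (a), taking the answer's CPI of 5 as given; the function name `instruction_throughput` is illustrative and not part of the original answer.

```python
def instruction_throughput(clock_hz: float, cpi: float) -> float:
    """Return instructions per second as clock rate divided by CPI."""
    if cpi <= 0:
        raise ValueError("CPI must be positive")
    return clock_hz / cpi


CLOCK_HZ = 2e9   # 2 GHz clock, from the question
CPI_PART_A = 5   # CPI assumed by the answer for part (a)

# Convert instructions per second to millions of instructions per second.
mips_part_a = instruction_throughput(CLOCK_HZ, CPI_PART_A) / 1e6
print(f"{mips_part_a:.0f} million instructions per second")
```

Passing a different CPI to the same function gives the throughput for the other parts of the question.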

b) Calculate the CPI contribution of each pipeline stage separately.

Stage 1: 80% of instruction fetches take one cycle and 20% of instruction fetches take 4 cycles

CPI for stage 1 = (0.8 * 1) + (0.2 * 4) = 1.6

Stage 2: Instruction decode stage takes one cycle per instruction. Therefore CPI for stage 2 = 1

Stage 3: Instruction execution stage takes one cycle per instruction. Therefore CPI for stage 3 = 1

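Stage 4: In this answer a Load is charged 4 cycles for its memory access (5 when it also stalls the pipeline for one cycle), a Store is charged 1 cycle when it writes the data cache and 5 cycles when it writes main memory, and every other instruction takes 1 cycle.

It is given that 30% of Load instructions stall the pipeline for 1 cycle, and that 40% of Store instructions go to the data cache while 60% go to main memory. With Loads at 0.4 of the instruction mix, Stores at 0.2, and the remaining instructions at 0.1 + 0.3:

  (0.4 * 0.3 * 5)      Load instructions that stall the pipeline
+ (0.4 * 0.7 * 4)      Load instructions without a stall
+ (0.2 * 0.4 * 1)      Store instructions into the data cache
+ (0.2 * 0.6 * 5)      Store instructions into main memory
+ ((0.1 + 0.3) * 1)    all other instructions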

= 0.6 + 1.12 + 0.08 + 0.6 + 0.4

CPI for Stage 4 = 2.8
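
The stage 4 figure is a weighted average: each case contributes its share of all instructions times the cycles it is charged. The sketch below repeats that sum in Python. The fractions and cycle counts are copied from the terms of the calculation above; the dictionary layout and the helper name `weighted_cpi` are assumptions made for illustration.

```python
# Each entry: (fraction of all instructions, cycles charged in stage 4).
# The values are taken from the answer's own terms and are not independently derived.
STAGE4_CASES = {
    "load, stalls pipeline": (0.4 * 0.3, 5),
    "load, no stall": (0.4 * 0.7, 4),
    "store to data cache": (0.2 * 0.4, 1),
    "store to main memory": (0.2 * 0.6, 5),
    "other instructions": (0.1 + 0.3, 1),
}


def weighted_cpi(cases: dict) -> float:
    """Sum fraction * cycles over all cases, checking the fractions cover every instruction."""
    total_fraction = sum(fraction for fraction, _ in cases.values())
    if abs(total_fraction - 1.0) > 1e-9:
        raise ValueError(f"fractions sum to {total_fraction}, expected 1.0")
    return sum(fraction * cycles for fraction, cycles in cases.values())


stage4_cpi = weighted_cpi(STAGE4_CASES)
print(f"CPI for stage 4 = {stage4_cpi:.2f}")
```

The check that the fractions sum to 1 is a cheap guard against dropping or double-counting an instruction class when the mix is changed.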

Stage 5: Write back stage takes one cycle per instruction. Therefore CPI for stage 5 = 1

Total CPI = 1.6 + 1 + 1 + 2.8 + 1 = 7.4

Instruction throughput = 2 GHz / 7.4 = 270.27 million instructions per second (approximately)
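
As a cross-check of part (b), the per-stage values can be summed and divided into the clock rate. This sketch follows the answer's approach of adding the stage CPIs together; the stage labels and variable names are illustrative only.

```python
CLOCK_HZ = 2e9  # 2 GHz clock, from the question

# Per-stage CPI values as computed in the answer for part (b).
stage_cpi = {
    "fetch": 0.8 * 1 + 0.2 * 4,  # 80% take 1 cycle, 20% take 4 cycles
    "decode": 1.0,
    "execute": 1.0,
    "memory": 2.8,               # stage 4 result from the weighted sum
    "write back": 1.0,
}

total_cpi = sum(stage_cpi.values())
throughput_mips = CLOCK_HZ / total_cpi / 1e6

print(f"Total CPI = {total_cpi:.1f}")
print(f"Throughput = {throughput_mips:.2f} million instructions per second")
```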
