1. There are 4 caches with the organization and block size as indicated below. A
ID: 3778651 • Letter: 1
Question
1. There are 4 caches with the organization and block size as indicated below. All caches hold 128 words where each word is 4 bytes. Assuming a 32-bit address.
a. A direct-mapped cache with block size of 16 words
b. 2-way set-associative cache with block size of 8 words
c. 4-way set-associative cache with block size of 4 words
d. A fully associative cache with block size of 32 words.
Complete the table for each cache.
Cache a
Cache b
Cache c
Cache d
total # bits need for word + byte
displacement
32
32
32
32
# bits needed for index
4
2
2
7
# bits needed for tag
22
25
26
18
# bits (total) per set (including valid
and dirty bits)
56
59
70
52
2. Here is a series of addresses in hexadecimal:
20(w), 3C(r), 10(r), 16(w), 20(r), 04(w), 28(r), 6(r), 10(w), 17(w)
Assume a LRU replacement algorithm. For the four caches in problem 1, draw each cache as it would appear at the end of the series of references. Include the valid bit, dirty bit, and tag. Show the contents of the memory block using the byte address range such as M[20-23] for the word with address 22.
I already completed question 1. Just need help with question 2.
Cache a
Cache b
Cache c
Cache d
total # bits need for word + byte
displacement
32
32
32
32
# bits needed for index
4
2
2
7
# bits needed for tag
22
25
26
18
# bits (total) per set (including valid
and dirty bits)
56
59
70
52
Explanation / Answer
2.
The key to the problem is to note that two addresses that map to the same set in a set-associative cache may map to a different block in a direct-mapped cache. A simple example is with a four-word cache and 1 word blocks. The example sequence is addresses 0, 2, 4, 0, 2. In the direct-mapped cache, 0 and 4 map to block 0, while 2 maps to block 2. Accesses to 0 and 4 miss because they conflict in block 0, but the second access to 2 hits. The hit rate is 1/5
With a 2-way set-associative cache, all three address map to the first set. Thus after the first two misses, 4 kicks out 0, 0 kicks out 2, and 2 kicks out 4. The hit rate is 0/5.
Memory-stall clock cycles = Instructions/ Program * Misses/Instruction * Miss penalty
Misses/Instruction = Instruction miss rate + (Data miss rate* Data references/Instruction)
Miss penalty = 6 + Block size in words
Data references/Instruction = 50%
Cache 1: Block size = 1 word
Miss penalty = 6+1 =7 cycles
Instruction miss rate = 4%
Data miss rate = 8%
CPIstall
= (Instruction miss rate + (Data miss rate* Data references/Instruction) * Miss penalty
= (4% + (8%*50%)) * 7 = 0.56
CPI = CPIstall + CPIperfect = 2.0
CPIperfect = 2.0 - 0.56 = 1.44
Cache 2:
Block size = 4 words
Miss penalty = 6+4 = 10 cycles
Instruction miss rate = 2%
Data miss rate = 5%
CPIstall
= (Instruction miss rate + (Data miss rate* Data references/Instruction) * Miss penalty
= (2% + (5%*50%)) * 10 = 0.45
Cache 3:
Block size = 4 words
Miss penalty = 6+4 = 10 cycles
Instruction miss rate = 2%
Data miss rate = 4%
CPIstall
= (Instruction miss rate + (Data miss rate* Data references/Instruction) * Miss penalty
= (2% + (4%*50%)) * 10 = 0.4
So cache 1 spends the most cycles on cache misses.
NOTE: A direct- mapped cache
64 Kbytes = 16 K words = 2^14 words = 2^14 blocks
block size = 4 bytes => offset size = 2 bits,
#sets = #blocks = 2^14 => index size = 14 bits
tag size = address size - index size - offset size = 32 - 14 - 2 = 16 bits
bits/block = data bits + tag bits + valid bit = 32 + 16 + 1 = 49
bits in cache = #blocks x bits/block = 2^14 x 49 = 98 Kbytes
for a direct- mapped cache with 64 KBytes of data and 8 word blocks, assuming a 32-bit address
64 Kbytes = 2^14 words = (2^14)/8 = 2^11 blocks
block size = 32 bytes => offset size = 5 bits,
#sets = #blocks = 2^11 => index size = 11 bits
tag size = address size - index size - offset size = 32 - 11 - 5 = 16 bits
bits/block = data bits + tag bits + valid bit = 8x32 + 16 + 1 = 273 bits
bits in cache = #blocks x bits/block = 2^11 x 273 = 68.25 Kbytes
4-way set associative cache
block size and #blocks does not change
#sets = #blocks/4 = (2^14)/4 = 2^12 => index size = 12 bits
tag size = address size - index size - offset = 32 - 12 - 2 = 18 bits
bits/block = data bits + tag bits + valid bit = 32 + 18 + 1 = 51
bits in cache = #blocks x bits/block = 2^14 x 51 = 102 Kbytes
Related Questions
Navigate
Integrity-first tutoring: explanations and feedback only — we do not complete graded work. Learn more.