Question
Suppose we have an application running on a 32-processor multiprocessor, which takes 200 ns to handle a reference to a remote memory. For this application, assume that all references except those involving communication hit in the local memory hierarchy, which is slightly optimistic. Processors are stalled on a remote request, and the processor clock rate is 3.3 GHz. If the base CPI (assuming that all references hit in the cache) is 0.5, how much faster is the multiprocessor if there is no communication versus if 0.2% of the instructions involve a remote communication reference?
Explanation / Answer
It is simplest to first calculate the clock cycles per instruction (CPI). The
effective CPI for the multiprocessor with 0.2% remote references is
CPI = Base CPI + Remote request rate × Remote request cost
    = 0.5 + 0.2% × Remote request cost
At 3.3 GHz the cycle time is 1 / 3.3 GHz ≈ 0.303 ns, so the remote request cost is
Remote access time / Cycle time = 200 ns / 0.303 ns ≈ 660 cycles
Therefore
CPI = 0.5 + 0.002 × 660 = 0.5 + 1.32 = 1.82
The multiprocessor with all local references is 1.82 / 0.5 = 3.64 times faster.
(Rounding the cycle time to 0.3 ns gives about 667 cycles and a ratio of about
3.67.) In practice, the performance analysis is much more complex, since some
fraction of the noncommunication references will miss in the local hierarchy and
the remote access time does not have a single constant value. For example, the
cost of a remote reference could be quite a bit worse, since contention caused by
many references trying to use the global interconnect can lead to increased delays.
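
The arithmetic can be checked with a short Python sketch. It uses only the values stated in the question and the unrounded cycle time of 1 / 3.3 GHz, which gives 660 cycles per remote reference, an effective CPI of 1.82, and a ratio of about 3.64. The function names are illustrative, and the larger latencies in the final loop are hypothetical values chosen only to show how contention would change the result; they are not part of the problem.

```python
def remote_cost_cycles(remote_latency_ns, clock_ghz):
    """Stall cycles for one remote reference.

    A clock of f GHz completes f cycles per nanosecond, so the cost in
    cycles is latency (ns) * clock rate (GHz).
    """
    return remote_latency_ns * clock_ghz


def effective_cpi(base_cpi, remote_fraction, remote_latency_ns, clock_ghz):
    """Base CPI plus the average remote-stall cycles per instruction."""
    return base_cpi + remote_fraction * remote_cost_cycles(
        remote_latency_ns, clock_ghz
    )


# Values taken from the question.
BASE_CPI = 0.5
REMOTE_FRACTION = 0.002      # 0.2% of instructions
REMOTE_LATENCY_NS = 200.0
CLOCK_GHZ = 3.3

cost = remote_cost_cycles(REMOTE_LATENCY_NS, CLOCK_GHZ)
cpi = effective_cpi(BASE_CPI, REMOTE_FRACTION, REMOTE_LATENCY_NS, CLOCK_GHZ)
speedup = cpi / BASE_CPI     # all-local machine vs. 0.2% remote references

print(f"remote cost   = {cost:.0f} cycles")
print(f"effective CPI = {cpi:.2f}")
print(f"ratio         = {speedup:.2f}")

# Hypothetical latencies (NOT from the question) to illustrate how
# interconnect contention would inflate the effective CPI.
for latency_ns in (200.0, 400.0, 800.0):
    c = effective_cpi(BASE_CPI, REMOTE_FRACTION, latency_ns, CLOCK_GHZ)
    print(f"{latency_ns:.0f} ns -> CPI {c:.2f}, ratio {c / BASE_CPI:.2f}")
```

Because the remote term scales linearly with latency, doubling the remote access time doubles the stall cycles added to the base CPI.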