Question
Consider a GPU with the following characteristics:
Clock rate: 1.6 GHz
Number of SIMD processors (i.e. Streaming Multiprocessors (SM) in NVIDIA CUDA
terminology): 16
Number of Floating Point Units per SIMD processor: 16
GPU off-chip memory bandwidth: 100 GB/s
Compute the throughput in FLoating-point Operations Per Second (FLOPS) without considering
the memory bandwidth and assuming all memory latencies can be hidden. Assuming that each
FP operation requires two operands of 4 Bytes each and outputs one 4 Byte result, is this
throughput sustainable with current memory bandwidth?
Explanation / Answer
Given:
Clock rate = 1.6 GHz
SIMD processors = 16
Floating-point units per SIMD processor = 16
Memory bandwidth = 100 GB/s
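Peak throughput in FLOPS = clock rate x number of SIMD processors x FP units per SIMD processor (assuming each FP unit completes one operation per cycle)
1.6 GHz x 16 x 16 = 409.6 GFLOPS
Each operation reads two 4-byte operands and writes one 4-byte result, so it moves 12 bytes. Required memory bandwidth = 409.6 GFLOPS x 12 bytes = 4915.2 GB/s (about 4.9 TB/s)
Available memory bandwidth is 100 GB/s
4915.2 GB/s > 100 GB/s, so the peak throughput is not sustainable. At 100 GB/s, memory can feed only 100 / 12 ≈ 8.3 GFLOPS unless operands are reused from on-chip storage.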
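The arithmetic for this problem can be checked with a short script. This is only a sketch under stated assumptions: each FP unit completes one operation per clock cycle, and every operand and result travels to or from off-chip memory (no register or cache reuse). The function and constant names are illustrative and not part of the original question.

```python
def peak_flops(clock_hz, num_simd_processors, fpus_per_processor):
    """Peak FP throughput, assuming one FP operation per FPU per cycle."""
    return clock_hz * num_simd_processors * fpus_per_processor


def required_bandwidth(flops, bytes_per_op):
    """Bytes per second needed if every operation goes to off-chip memory."""
    return flops * bytes_per_op


# Values taken from the problem statement.
CLOCK_HZ = 1.6e9            # 1.6 GHz
NUM_SIMD_PROCESSORS = 16    # SMs in CUDA terminology
FPUS_PER_PROCESSOR = 16
MEM_BANDWIDTH = 100e9       # 100 GB/s

# Two 4-byte operands read plus one 4-byte result written.
BYTES_PER_OP = 2 * 4 + 4

flops = peak_flops(CLOCK_HZ, NUM_SIMD_PROCESSORS, FPUS_PER_PROCESSOR)
needed = required_bandwidth(flops, BYTES_PER_OP)

# Throughput the memory system alone could feed (assumes no data reuse).
sustainable_flops = MEM_BANDWIDTH / BYTES_PER_OP

print(f"Peak throughput:      {flops / 1e9:.1f} GFLOPS")
print(f"Required bandwidth:   {needed / 1e9:.1f} GB/s")
print(f"Available bandwidth:  {MEM_BANDWIDTH / 1e9:.1f} GB/s")
print(f"Sustainable at peak:  {needed <= MEM_BANDWIDTH}")
print(f"Memory-bound rate:    {sustainable_flops / 1e9:.1f} GFLOPS")
```

Under these assumptions the required bandwidth exceeds the available 100 GB/s by roughly a factor of 49, so the memory system could feed only about 2% of the peak rate.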