The floating-point format to be used in this problem is an 8-bit IEEE 754 normal
Question
The floating-point format to be used in this problem is an 8-bit IEEE 754 normalized format with 1 sign bit, 4 exponent bits, and 3 mantissa bits. It is identical to the 32-bit and 64-bit formats in terms of the meaning of fields and special encodings. The exponent field employs an excess-7 coding. The bit fields in a number are (sign, exponent, mantissa). Assume that we use unbiased rounding to the nearest even, as specified in the IEEE floating-point standard.
(a) Encode the following numbers in the 8-bit IEEE format: (1) 0.0011011 (binary) (2) 16.0 (decimal)
(b) Perform the computation 1.011 (binary) + 0.0011011 (binary), showing the correct state of the guard, round, and sticky bits. There are three mantissa bits.
(c) Decode the following 8-bit IEEE number into its decimal value: 1 1010 101
(d) Decide which number in each of the following pairs is greater in value (the numbers are in 8-bit IEEE 754 format): (1) 0 0100 100 and 0 0100 111 (2) 0 1100 100 and 1 1100 101
(e) In the 32-bit IEEE format, what is the encoding for negative zero?
(f) In the 32-bit IEEE format, what is the encoding for positive infinity?
Explanation / Answer
The problem uses an 8-bit floating-point number: 1 sign bit, 4 biased exponent bits (bias 7), and 3 mantissa bits.
(a) Encode the following numbers in the 8-bit IEEE format: (1) 0.0011011 (binary) (2) 16.0 (decimal)
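Ans a) (1) Expressing 0.0011011 (binary) in normalized scientific notation --> 1.1011 x 2^(-3) (exponent and base written in decimal). The biased exponent is -3 + 7 = 4 = 0100. The significand 1.1011 has four fraction bits but only three fit. The discarded bit is a single 1 with nothing after it, so the value lies exactly halfway between 1.101 and 1.110, and round-to-nearest-even picks 1.110.
So, the encoded number is 0 0100 110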
(2) 16 in decimal is 10000.0 in binary, which is 1.000 x 2^(4). The biased exponent is 4 + 7 = 11 = 1011.
So, the encoded number is 0 1011 000
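The encoding steps for part (a) can be sketched in Python. This is a minimal illustration, not a full IEEE 754 implementation: the function name encode8 is hypothetical, and the sketch assumes a nonzero input that lands in the normalized range (no zero, subnormal, infinity, or NaN handling).

```python
from fractions import Fraction

BIAS, EXP_BITS, MAN_BITS = 7, 4, 3

def encode8(value):
    """Encode a nonzero normalized value as 'S EEEE MMM' (round to nearest even)."""
    x = Fraction(value)
    if x == 0:
        raise ValueError("zero is not handled in this sketch")
    sign = 0
    if x < 0:
        sign, x = 1, -x
    # Normalize: find e such that 1 <= x < 2 after scaling by 2**(-e).
    e = 0
    while x >= 2:
        x /= 2
        e += 1
    while x < 1:
        x *= 2
        e -= 1
    # Scale so the hidden 1 and the three mantissa bits form the integer part.
    scaled = x * 2**MAN_BITS
    q = scaled.numerator // scaled.denominator
    rem = scaled - q
    # Round to nearest; on an exact tie, round to the even significand.
    if rem > Fraction(1, 2) or (rem == Fraction(1, 2) and q % 2 == 1):
        q += 1
    if q == 2 ** (MAN_BITS + 1):  # rounding carried out, e.g. 1.1111 -> 10.000
        q //= 2
        e += 1
    biased = e + BIAS
    if not 1 <= biased <= 2**EXP_BITS - 2:
        raise ValueError("outside the normalized range of this format")
    return f"{sign} {biased:04b} {q - 2**MAN_BITS:03b}"

print(encode8(Fraction(27, 128)))  # 0.0011011 (binary) -> 0 0100 110
print(encode8(16))                 # -> 0 1011 000
```

Exact rational arithmetic via Fraction is used here so that the tie case in (1) is detected exactly rather than through floating-point comparison.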
(b) Perform the computation 1.011 (binary) + 0.0011011 (binary), showing the correct state of the guard, round, and sticky bits. There are three mantissa bits.
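Ans b) The second number normalizes to 1.1011 x 2^(-3), but to add it we align it to the exponent of the larger operand (2^0). Only three fraction bits are kept; the bits shifted out feed the guard, round, and sticky positions:
0.0011011 = 0.001 | 1011, so guard = 1, round = 0, sticky = 1 (the OR of the remaining bits 1 and 1).
Adding the kept parts: 1.011 + 0.001 = 1.100, with guard/round/sticky = 1 0 1.
The guard bit is 1 and the round and sticky bits are not both 0, so the discarded part is more than half a unit in the last place and the result rounds up: 1.101 x 2^(0) = 1.625 (decimal), encoded as 0 0111 101. (The exact sum is 1.1001011.)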
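The guard, round, and sticky bits for this sum can be checked with a short Python sketch. The helper names parse_bin and add_grs are hypothetical. The sketch forms the exact sum and reads the three bits below the kept mantissa, which gives the same bits as aligning the smaller operand first in this case because the larger operand has no bits below its mantissa. It assumes both operands are positive and that the sum lies in [1, 2), so no renormalization is needed.

```python
def parse_bin(s):
    """Return (integer value of all bits, number of fraction bits) for e.g. '1.011'."""
    whole, _, frac = s.partition(".")
    return int(whole + frac, 2), len(frac)

def add_grs(a, b, man_bits=3):
    (ia, fa), (ib, fb) = parse_bin(a), parse_bin(b)
    f = max(fa, fb, man_bits + 2)                 # keep enough fraction bits for G and R
    total = (ia << (f - fa)) + (ib << (f - fb))   # exact sum with f fraction bits
    assert (1 << f) <= total < (2 << f), "sketch assumes a sum in [1, 2)"
    drop = f - man_bits                           # number of bits below the mantissa
    kept = total >> drop                          # 1.mmm as an integer
    tail = total & ((1 << drop) - 1)              # discarded bits
    guard = (tail >> (drop - 1)) & 1
    rnd = (tail >> (drop - 2)) & 1
    sticky = int(tail & ((1 << (drop - 2)) - 1) != 0)
    # Round to nearest even: round up if above half, or exactly half with an odd LSB.
    if guard and (rnd or sticky or (kept & 1)):
        kept += 1
    frac_mask = (1 << man_bits) - 1
    result = f"{kept >> man_bits:b}.{kept & frac_mask:0{man_bits}b}"
    return result, (guard, rnd, sticky)

print(add_grs("1.011", "0.0011011"))  # ('1.101', (1, 0, 1))
```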
(c) Decode the following 8-bit IEEE number into its decimal value: 1 1010 101
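Ans c) Sign = 1 (negative), biased exponent = (1010) binary = 10 (decimal), so the true exponent is 10 - 7 = 3; mantissa = 101, so the significand is 1.101. The value is -1.101 x 2^(3) = -1101 (binary) = -13 (decimal).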
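Decoding follows the same fields in reverse. Below is a minimal Python sketch; the function name decode8 is hypothetical, and it handles only normalized patterns (exponent fields 0000 and 1111 are rejected rather than interpreted).

```python
def decode8(bits, bias=7):
    """Decode a normalized 'S EEEE MMM' pattern into a Python float."""
    s, e, m = bits.split()
    exp = int(e, 2)
    if not 0 < exp < 15:
        raise ValueError("zero/subnormal/infinity/NaN are not handled in this sketch")
    sign = -1 if s == "1" else 1
    significand = 1 + int(m, 2) / 8      # hidden leading 1 plus three fraction bits
    return sign * significand * 2.0 ** (exp - bias)

print(decode8("1 1010 101"))  # -13.0
```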
(d) Decide which number in each of the following pairs is greater in value (the numbers are in 8-bit IEEE 754 format): (1) 0 0100 100 and 0 0100 111 (2) 0 1100 100 and 1 1100 101
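Ans d) (1) Decoding the numbers 0 0100 100 and 0 0100 111: both are positive and share the biased exponent 0100 (true exponent 4 - 7 = -3), so the mantissas decide. 0 0100 100 = 1.100 x 2^(-3) = 0.1875 and 0 0100 111 = 1.111 x 2^(-3) = 0.234375, so 0 0100 111 is greater.
(2) Decoding the numbers 0 1100 100 and 1 1100 101: the biased exponent 1100 = 12 gives a true exponent of 12 - 7 = 5. 0 1100 100 = +1.100 x 2^(5) = 48 and 1 1100 101 = -1.101 x 2^(5) = -52, so 0 1100 100 is greater. The sign bits alone settle this pair: a positive number is always greater than a negative one.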
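The comparison in part (d) does not strictly require full decoding, because the exponent sits above the mantissa in the bit pattern: for two numbers of the same sign, the larger 7-bit magnitude field is the larger magnitude. A minimal Python sketch of that idea follows; the function name greater is hypothetical, and NaN patterns are assumed not to occur.

```python
def greater(a, b):
    """Return whichever 'S EEEE MMM' pattern has the greater value (no NaNs assumed)."""
    def key(bits):
        raw = int(bits.replace(" ", ""), 2)
        sign, magnitude = raw >> 7, raw & 0x7F
        # Sign-magnitude ordering: negative patterns sort by descending magnitude.
        return -magnitude if sign else magnitude
    return a if key(a) >= key(b) else b

print(greater("0 0100 100", "0 0100 111"))  # 0 0100 111
print(greater("0 1100 100", "1 1100 101"))  # 0 1100 100
```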
(e) In the 32-bit IEEE format, what is the encoding for negative zero?
Sign bit = 1, all other fields zero
1 00000000 00000000000000000000000 (23 mantissa bits, all 0)
(f) In the 32-bit IEEE format, what is the encoding for positive infinity?
Sign bit = 0
Exponent field all 1s, mantissa field all 0s (an all-1s exponent with a nonzero mantissa encodes NaN, not infinity)
0 11111111 00000000000000000000000 (23 mantissa bits, all 0)
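The 32-bit encodings for parts (e) and (f) can be inspected directly with Python's standard struct module, which packs a value as an IEEE 754 single-precision float. The helper name float32_fields is hypothetical.

```python
import struct

def float32_fields(x):
    """Return the 32-bit IEEE 754 pattern of x as 'sign exponent mantissa'."""
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    bits = f"{raw:032b}"
    return f"{bits[0]} {bits[1:9]} {bits[9:]}"

print(float32_fields(-0.0))          # 1 00000000 00000000000000000000000
print(float32_fields(float("inf")))  # 0 11111111 00000000000000000000000
```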