Floating Point Numbers - Yr 2 Only

From TRCCompSci - AQA Computer Science
Revision as of 12:51, 16 June 2017 by Admin

Floating point numbers are a flexible way of representing numbers in binary, allowing range to be traded against precision within the same number of bits. A floating point number consists of 2 parts: a mantissa, which holds the significant binary digits of the number, and an exponent, which says how many places to shift the binary point. Both parts are stored in twos complement. For a floating point number to be normalised and make the best use of the available bits, the mantissa must begin with "0.1" for a positive number and "1.0" for a negative number. Any deviation from this wastes bits on leading digits that carry no information, bits that could otherwise hold extra precision.

For example, the number 32 could be represented by a floating point number with an 8 bit mantissa and a 5 bit exponent.

The mantissa would be as follows: 0.1000000

The mantissa 0.1000000 has the value 0.5, so the exponent must move the binary point 6 places to the right to give 32 (0.5 × 2^6 = 32). The exponent therefore has a value of 6: 00110
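
The normalisation rule can be checked mechanically: the sign bit and the bit after the point must differ. The sketch below is illustrative only; the function name `is_normalised` is hypothetical, and it assumes the mantissa is given as a string of bits with the point left out.

```python
def is_normalised(mantissa_bits: str) -> bool:
    """Return True if a twos complement mantissa starts with 01 or 10."""
    if len(mantissa_bits) < 2 or set(mantissa_bits) - {"0", "1"}:
        raise ValueError("mantissa must be a binary string of at least 2 bits")
    # Normalised means the sign bit and the bit after the point differ.
    return mantissa_bits[0] != mantissa_bits[1]


print(is_normalised("01000000"))  # 0.1000000, the mantissa used for 32
print(is_normalised("00010000"))  # 0.0010000, two leading bits wasted
print(is_normalised("10100000"))  # 1.0100000, a negative mantissa
```

The first and third calls print True and the second prints False, because 0.0010000 could be rewritten as 0.1000000 with a smaller exponent.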

Converting from Binary to Denary

  1. Write down the mantissa, with the point inserted after the sign bit. (Miss off trailing 0’s)
  2. If the mantissa is negative (sign bit = 1) then
    1. find the twos complement of the mantissa
  3. If the exponent is negative (sign bit = 1) then
    1. find the twos complement of the exponent
  4. Calculate the value of the exponent in denary
  5. If the exponent is positive then
    1. move the point in the mantissa to the right the number of places given by the exponent
  6. else {if the exponent is negative}
    1. move the point in the mantissa to the left the number of places given by the exponent
  7. Convert the mantissa to denary to obtain the answer, putting a minus sign in front if the mantissa was negative in step 2
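
The steps above can be sketched in Python. This is a minimal illustration rather than a reference implementation: the names `twos_complement_value` and `decode` are hypothetical, the bit strings are assumed to be written without the point, and the code takes a shortcut by reading each part directly as a twos complement number instead of moving the point by hand. `Fraction` is used so that the result is exact.

```python
from fractions import Fraction


def twos_complement_value(bits: str) -> int:
    """Interpret a bit string as a twos complement integer."""
    value = int(bits, 2)
    if bits[0] == "1":            # sign bit set, so the value is negative
        value -= 1 << len(bits)
    return value


def decode(mantissa_bits: str, exponent_bits: str) -> Fraction:
    """Return the exact denary value of a mantissa/exponent pair."""
    # The binary point sits after the sign bit, so the mantissa is the
    # twos complement integer scaled down by 2^(number of bits - 1).
    mantissa = Fraction(twos_complement_value(mantissa_bits),
                        1 << (len(mantissa_bits) - 1))
    exponent = twos_complement_value(exponent_bits)
    # A positive exponent moves the point right, a negative one moves it left.
    return mantissa * Fraction(2) ** exponent


print(float(decode("01000000", "00110")))      # 32.0
print(float(decode("1010000000", "111111")))   # -0.375
```

The first call uses the 8-bit mantissa and 5-bit exponent given for 32 earlier on this page; the second uses a 10-bit mantissa and 6-bit exponent.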

Converting from Denary to Binary

  1. Convert the denary number to an unsigned binary number (the mantissa)
  2. Normalise this (move the point so that it sits immediately in front of the leading 1)
  3. If the number is negative then
    1. represent it as its twos complement equivalent
  4. Count the number of places the point has been moved to give exponent
  5. If point moved left then
    1. exponent is positive
  6. else {if point moved right}
    1. exponent is negative
  7. Convert exponent to twos complement binary (6 bits in the examples below)
  8. Add 0s to the end of the mantissa if necessary (to give 10 bits in the examples below)
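
The conversion can also be sketched in code. The version below is an assumption-laden illustration: the names `to_twos_complement` and `encode` are hypothetical, the default widths (10-bit mantissa, 6-bit exponent) are taken from steps 7 and 8, and instead of normalising the unsigned value and then taking the twos complement, it scales the signed value directly until it lies in the normalised range. It raises an error rather than rounding when the number does not fit exactly.

```python
from fractions import Fraction


def to_twos_complement(value: int, width: int) -> str:
    """Return value as a twos complement bit string of the given width."""
    if not -(1 << (width - 1)) <= value < (1 << (width - 1)):
        raise ValueError(f"{value} does not fit in {width} bits")
    return format(value & ((1 << width) - 1), f"0{width}b")


def encode(number, mantissa_width: int = 10, exponent_width: int = 6) -> str:
    """Encode number as 'mantissa exponent' in normalised form."""
    value = Fraction(number)
    if value == 0:
        raise ValueError("zero has no normalised form")
    exponent = 0
    # Positive mantissas are normalised in [0.5, 1), negative in [-1, -0.5).
    while value >= 1 or value < -1:        # point moves left, exponent rises
        value /= 2
        exponent += 1
    while Fraction(-1, 2) <= value < Fraction(1, 2):   # point moves right
        value *= 2
        exponent -= 1
    # Shift the mantissa up so that it becomes a whole number of bits.
    scaled = value * (1 << (mantissa_width - 1))
    if scaled.denominator != 1:
        raise ValueError("number needs more mantissa bits than are available")
    return (to_twos_complement(int(scaled), mantissa_width) + " "
            + to_twos_complement(exponent, exponent_width))


print(encode(123.5))    # 0111101110 000111
print(encode(-0.375))   # 1010000000 111111
```

A value such as 0.1, which has no exact finite binary expansion, makes this sketch raise `ValueError`; a fuller implementation would have to choose a rounding rule.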

Example 1

Convert 123.5 to floating point form

  1. Convert number (123.5) to pure binary = 1111011.1
  2. Normalise mantissa = 0.11110111
  3. (Number not negative)
  4. The point has moved 7 places left, so exponent = 7
  5. Convert exponent to twos complement binary = 000111
  6. Add 0’s to the mantissa = 0.111101110

Answer = 0111101110 000111

Example 2

Convert 0.1875 to floating point form

  1. Convert number (0.1875) to pure binary = 0.0011
  2. Normalise mantissa = 0.11
  3. (Number not negative)
  4. The point has moved 2 places right, so exponent = -2
  5. Convert exponent to twos complement binary = 111110
  6. Add 0’s to the mantissa = 0.110000000

Answer = 0110000000 111110

Example 3

Convert -0.375 to floating point form

  1. Convert number (-0.375) to pure binary = -0.011
  2. Normalise mantissa = -0.11
  3. Number negative so find twos complement = 1.01
  4. The point has moved 1 place right, so exponent = -1
  5. Convert exponent to twos complement binary = 111111
  6. Add 0’s to the mantissa = 1.010000000

Answer = 1010000000 111111
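
As a cross-check on the exponents in the three examples, Python's standard `math.frexp` splits a float into a fraction and a power of two, with the fraction's magnitude between 0.5 and 1. That matches the normalised form used here for these three values. The list name `examples` is just an illustrative choice.

```python
import math

# (denary value, exponent stated in the worked example)
examples = [(123.5, 7), (0.1875, -2), (-0.375, -1)]

for value, stated_exponent in examples:
    # frexp returns (m, e) with value == m * 2**e and 0.5 <= abs(m) < 1
    mantissa, exponent = math.frexp(value)
    print(value, mantissa, exponent, exponent == stated_exponent)
```

Each line ends in True. Note that `frexp` keeps the sign separate from the fraction, so for a negative power of two such as -0.5 its exponent is one higher than the twos complement normalised form (mantissa 1.0, value -1) would give.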