2.4 Floating Point Representation


Floating-Point Representation

The term floating-point is derived from the fact that there is no fixed number of digits
before and after the decimal point; the point can "float". Representations in which the
number of digits before and after the decimal point is fixed are called fixed-point
representations.
In general, floating-point arithmetic is slower than fixed-point arithmetic and, for a
given number of bits, offers less precision, but it can handle a much larger range of
numbers. Some examples of real numbers:
● 3.14159265…ten (pi)
● 2.71828…ten (e)
● 0.000000001ten or 1.0ten × 10^-9 (seconds in a nanosecond)

Scientific notation

A notation that renders numbers with a single digit to the left of the decimal point.
Scientific notation lets scientists handle very large or very small numbers easily.
For example, instead of writing 0.0000000056, write 5.6 × 10^-9.

Normalised notation

A number in scientific notation that has no leading 0s is called a normalized number,
which is the usual way to write it. For example, 1.0ten × 10^-9 is in normalized
scientific notation, whereas 0.1ten × 10^-8 and 10.0ten × 10^-10 are not.

Any given real number can be written in the form m × 10^n in many ways: for example, 350
can be written as 3.5 × 10^2, 35 × 10^1, or 350 × 10^0.
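
The equivalence of these forms can be checked with a short Python sketch. The helper name `normalize` is hypothetical and only illustrates the idea of shifting the decimal point until a single nonzero digit remains to its left.

```python
from decimal import Decimal

def normalize(m, n):
    """Rewrite m x 10^n so that exactly one nonzero digit is left of the point."""
    if m == 0:
        raise ValueError("zero has no normalized form")
    while abs(m) >= 10:          # too many digits before the point
        m, n = m / 10, n + 1
    while abs(m) < 1:            # leading zero before the point
        m, n = m * 10, n - 1
    return m, n

# Three ways of writing 350 as m x 10^n
forms = [(Decimal("3.5"), 2), (Decimal("35"), 1), (Decimal("350"), 0)]
values = [m * Decimal(10) ** n for m, n in forms]
normalized = [normalize(m, n) for m, n in forms]

print(values)      # every entry equals 350
print(normalized)  # every entry is the pair (3.5, 2)
```

All three inputs describe the same value, and all three reduce to the same normalized form, 3.5 × 10^2.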

Similar to decimal numbers in scientific notation, binary numbers can also be
represented in scientific notation as below:
1.xxxxxxxxxtwo × 2^yyyy

For example, 1.101two × 2^-1 is a binary number represented in scientific notation. In
general, the syntax of the floating point number representation is as below:
significand × base^exponent
A significand contains the number’s digits. Negative significands represent negative
numbers.
An exponent says where the decimal (or binary) point is placed relative to the beginning
of the significand. Negative exponents represent numbers whose magnitude is less than 1.
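
As a rough illustration of how a significand and exponent combine, the following Python sketch evaluates a binary number written in scientific notation. The function name `binary_scientific_value` is hypothetical, and the significand is assumed to be given as a string of binary digits with a point.

```python
from fractions import Fraction

def binary_scientific_value(significand_bits, exponent):
    """Evaluate a base-2 string such as '1.101' times 2**exponent exactly."""
    whole, _, frac = significand_bits.partition(".")
    value = Fraction(int(whole, 2))
    # Each digit after the binary point is worth 1/2, 1/4, 1/8, ...
    for position, bit in enumerate(frac, start=1):
        value += Fraction(int(bit), 2 ** position)
    return value * Fraction(2) ** exponent

value = binary_scientific_value("1.101", -1)
print(value, float(value))
```

Here 1.101two is 1 + 1/2 + 1/8 = 1.625, and the exponent of -1 halves it to 0.8125.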

In computers, floating-point numbers are represented in scientific notation with a
fraction (F) and an exponent (E) and a radix of 2, in the form F×2^E. Both E and F can
be positive as well as negative.

FRACTION
The value, generally between 0 and 1, placed in the fraction field. The fraction is also
called the mantissa.

EXPONENT
In the numerical representation system of floating-point arithmetic, the value that is
placed in the exponent field.

Example :
1.2345 = 12345 × 10^-4
[ 12345 - Significand, 10 - Base, -4 - Exponent ]
Single Precision
Floating-point numbers are usually a multiple of the size of a word. A floating-point
value represented in a single 32-bit word is called single precision.

To represent a number in single precision format, it must first be converted to
normalised form. 1.101two × 2^-1 is a normalised number.
An IEEE 754 single precision floating point number has three fields, from the most
significant bit to the least: s is the sign of the floating point number (1 bit;
0-positive, 1-negative), exponent is the value of the 8 bit exponent field (including
the sign of the exponent), and fraction is the 23 bit number in the fraction field.
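
The three fields can be pulled out of a stored value with a small Python sketch using the standard `struct` module. The helper name `float32_fields` is hypothetical; the example value 0.8125 is the normalised number 1.101two × 2^-1 used above.

```python
import struct

def float32_fields(x):
    """Return (sign, exponent field, fraction field) of x stored in single precision."""
    # Pack as a big-endian 32-bit float, then reread the same bytes as an unsigned int.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                 # 1 bit
    exponent = (bits >> 23) & 0xFF    # 8 bits, stored in biased form (see Biasing)
    fraction = bits & 0x7FFFFF        # 23 bits
    return sign, exponent, fraction

sign, exponent, fraction = float32_fields(0.8125)
print(sign, format(exponent, "08b"), format(fraction, "023b"))
```

The fraction field holds only the digits after the binary point (101 followed by zeros), and the exponent field holds the stored, biased encoding of -1 rather than -1 itself.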

F is derived from the value in the fraction field and E from the value in the exponent
field.

In single precision format, numbers almost as small as 2.0ten × 10^-38 and almost as
large as 2.0ten × 10^38 can be represented. A situation in which a positive exponent
becomes too large to fit in the exponent field is called overflow in floating point. A
situation in which a negative exponent becomes too large to fit in the exponent field is
called underflow in floating point.
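
Both situations can be provoked with a minimal Python sketch. It assumes CPython's `struct` module, which reports single precision overflow by raising `OverflowError`; other environments may signal overflow differently, for example by producing infinity. The helper name `to_float32` is hypothetical.

```python
import struct

def to_float32(x):
    """Round x to single precision and widen it back to a Python float."""
    return struct.unpack(">f", struct.pack(">f", x))[0]

# Overflow: 1e39 needs an exponent larger than the 8-bit field can hold.
try:
    to_float32(1e39)
    overflowed = False
except OverflowError:
    overflowed = True

# Underflow: 1e-50 needs an exponent more negative than the field can hold,
# so the stored result collapses to zero.
underflowed = to_float32(1e-50)

print(overflowed, underflowed)
```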

Double Precision
The overflow and underflow problems of single precision format can be reduced by using
another format which has a larger exponent. A floating point value represented in two
32-bit words is called double precision. The representation of a double precision
floating point number takes two MIPS words, where s is the sign of the floating point
number (0-positive, 1-negative), exponent is the value of the 11 bit exponent field,
and fraction is the 52 bit number in the fraction field.
In double precision format, numbers almost as small as 2.0ten × 10^-308 and almost as
large as 2.0ten × 10^308 can be represented.
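
The same field extraction works for double precision with wider masks. This sketch assumes that Python's `float` is an IEEE 754 double, which holds on common platforms; the helper name `float64_fields` is hypothetical.

```python
import struct
import sys

def float64_fields(x):
    """Return (sign, exponent field, fraction field) of x stored in double precision."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = bits >> 63                      # 1 bit
    exponent = (bits >> 52) & 0x7FF        # 11 bits, stored in biased form
    fraction = bits & ((1 << 52) - 1)      # 52 bits
    return sign, exponent, fraction

fields = float64_fields(-0.8125)
largest = sys.float_info.max     # largest finite double
smallest = sys.float_info.min    # smallest positive normalized double

print(fields, largest, smallest)
```

The printed limits show the order of magnitude quoted above: roughly 10^308 at the top of the range and 10^-308 at the bottom.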

Biasing
IEEE 754 uses a bias of 127 for single precision, so an exponent of -1 is represented
by the bit pattern of the value -1 + 127ten, or 126ten = 0111 1110two, and +1 is
represented by 1 + 127ten, or 128ten = 1000 0000two.
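
The arithmetic above can be reproduced with a few lines of Python. The name `encode_exponent` is hypothetical. The range check reflects an assumption not stated in this text: IEEE 754 reserves the all-zeros and all-ones exponent patterns, so normalized numbers use only the patterns in between.

```python
SINGLE_BIAS = 127

def encode_exponent(exponent, bias=SINGLE_BIAS, width=8):
    """Return the bit pattern stored in the exponent field for a true exponent."""
    stored = exponent + bias
    # Assumption: all-zeros and all-ones patterns are reserved by IEEE 754.
    if not 0 < stored < (1 << width) - 1:
        raise ValueError("exponent outside the normalized range")
    return format(stored, "0{}b".format(width))

minus_one = encode_exponent(-1)
plus_one = encode_exponent(1)
print(minus_one, plus_one)
```

Because the stored values are unsigned, a larger exponent always has a larger bit pattern, which lets exponents be compared as plain integers.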

The exponent bias for double precision is 1023. Biased exponent means that the value
represented by a floating-point number is really

(-1)^S × (1 + Fraction) × 2^(Exponent - Bias).

The range of single precision numbers is then from as small as

±1.00000000000000000000000two × 2^-126

to as large as

±1.11111111111111111111111two × 2^+127.
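
The formula and the two range limits can be evaluated directly. In this sketch the function name `decode_single` is hypothetical, and it handles only normalized numbers, with the exponent field between 1 and 254.

```python
def decode_single(sign, exponent_field, fraction_field):
    """Apply (-1)^S x (1 + Fraction) x 2^(Exponent - Bias) with Bias = 127."""
    fraction = fraction_field / 2 ** 23   # 23-bit field read as a value in [0, 1)
    return (-1) ** sign * (1 + fraction) * 2.0 ** (exponent_field - 127)

smallest = decode_single(0, 1, 0)              # 1.000...0two x 2^-126
largest = decode_single(0, 254, 2 ** 23 - 1)   # 1.111...1two x 2^+127
example = decode_single(0, 126, 0b101 << 20)   # 1.101two x 2^-1

print(smallest, largest, example)
```

In decimal terms the two limits come out at roughly 1.2 × 10^-38 and 3.4 × 10^38, in line with the approximate range quoted for single precision.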

Example : 2
Converting Binary to Decimal Floating Point
What decimal number is represented by this single precision float?
