Software Layers 2 Introduction to unix 2


Real Number Representation



Date: 28.01.2017


  • Fixed point notation

    • Top bit specifies the sign (as in signed magnitude): 0 = positive, 1 = negative.

    • Some bits = integer part (normal format), some bits = fractional part

      • The fractional columns are 2^-1, 2^-2, ...

      • To convert the fractional part:

          Repeat
            Multiply the number by 2
            Write down & discard the integer part
          Until the number reaches 0

    • eg: -37.90625

      • Sign bit is: 1

      • Integer part:

          Base   Num   Rem
            2     37
                  18     1
                   9     0
                   4     1
                   2     0
                   1     0
                   0     1

          ∴ the integer part is: 100101

      • Fractional part :

          Base   Num       Int
            2    0.90625
                 1.8125     1
                 1.625      1
                 1.25       1
                 0.5        0
                 1.0        1
                 0

          ∴ the fractional part is: 11101

      • ∴ -37.90625₁₀ = 1100101.11101₂ (sign bit 1, followed by 100101.11101)

    • Problem is how many bits to assign to integer & fractional parts

      • More bits in integer part allows larger magnitude numbers

      • More bits in fractional part is more accurate

      • note: Some decimal fractions have infinite binary representations

        • eg: 0.4

          Base   Num   Int
            2    0.4
                 0.8    0
                 1.6    1
                 1.2    1
                 0.4    0
                 0.8    0
                 etc.
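The two conversion procedures above (repeated division by 2 for the integer part, repeated multiplication by 2 for the fractional part) can be sketched in Python. This is only an illustration: the function names `integer_bits` and `fraction_bits` and the `max_bits` cut-off are assumptions made for this sketch, and `Fraction` is used so that decimal inputs such as 0.4 are held exactly.

```python
from fractions import Fraction

def integer_bits(n):
    """Repeatedly divide by 2; the remainders, last one first, are the bits."""
    bits = ""
    while n > 0:
        n, rem = divmod(n, 2)
        bits = str(rem) + bits   # later remainders are more significant
    return bits or "0"

def fraction_bits(frac, max_bits=16):
    """Repeatedly multiply by 2; write down and discard the integer part."""
    bits = ""
    # max_bits (an assumed limit) stops fractions that never reach 0
    while frac != 0 and len(bits) < max_bits:
        frac *= 2
        whole = int(frac)
        bits += str(whole)
        frac -= whole
    return bits

value = Fraction("37.90625")
whole = int(value)
print(integer_bits(whole))                          # 100101
print(fraction_bits(value - whole))                 # 11101
print(fraction_bits(Fraction("0.4"), max_bits=12))  # 011001100110
```

The last line shows the 0.4 case: the pattern 0110 keeps repeating, so the loop only ends because of the bit limit.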

 

  • Scientific Notation

    • eg: -37.90625₁₀ = -0.3790625E+2₁₀

    • The format is Sign Significand E Exponent

    • The signed significand is multiplied by Base^Exponent

    • Any number can be normalized so it starts 0.

    • This can be done base 2

    • eg: 37.90625₁₀ = 100101.11101₂ = 0.10010111101₂ * 2^6
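As a rough sketch of base-2 normalization, the loop below shifts a positive value until it has the form 0.1... in binary, that is, until the significand lies in the range 0.5 to 1. The name `normalize` is a hypothetical helper for this illustration. Python's standard `math.frexp` uses the same convention, so it serves as a cross-check.

```python
import math

def normalize(value):
    """Return (m, e) with value == m * 2**e and 0.5 <= m < 1, for value > 0."""
    m, e = value, 0
    while m >= 1:      # too large: move the binary point left
        m /= 2
        e += 1
    while m < 0.5:     # too small: move the binary point right
        m *= 2
        e -= 1
    return m, e

m, e = normalize(37.90625)
print(m, e)                    # 0.59228515625 6
print(math.frexp(37.90625))    # (0.59228515625, 6)
```

Here 0.59228515625 is the decimal value of 0.10010111101 in binary, and the exponent 6 matches the example above.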

 

  • Floating Point Representation

    • The top bit specifies the sign: 0 = positive, 1 = negative.

    • Some bits = exponent (in biased notation), some bits = significand

      • note: Every normalized significand starts 0.1

      • The 0.1 is not stored, ie. one free bit

    • eg: -37.90625, 7 bit exponent, 8 bit significand

      • Sign bit is 1

      • Exponent is 1000110

      • Significand is 00101111

    • More bits in the exponent part allows larger and smaller magnitude numbers

      • Very large numbers cannot be represented → overflow

      • Numbers close to 0 cannot be represented → underflow

      • 0 cannot be stored, due to the implicit 0.1. How is this handled?

    • More bits in the significand is more accurate

    • Typically: 8 bits exponent, 23 bits significand (plus 1 bit for the sign) → gives you a 32 bit floating point number.
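The encoding described above can be sketched as follows. This models only the teaching format in these notes (significand of the form 0.1..., with the leading 1 not stored), not any particular hardware format. The function name `encode` is hypothetical. Two details are assumptions inferred from the worked examples: the bias is taken to be 2^(exponent bits - 1), which gives 64 for 7 bits, and extra significand bits are truncated rather than rounded.

```python
import math

def encode(value, exp_bits, sig_bits):
    """Encode a nonzero value as (sign, exponent, significand) bit strings."""
    if value == 0:
        raise ValueError("zero has no normalized form in this scheme")
    sign = "1" if value < 0 else "0"
    m, e = math.frexp(abs(value))    # abs(value) == m * 2**e, 0.5 <= m < 1
    bias = 1 << (exp_bits - 1)       # assumed bias, e.g. 64 for 7 bits
    biased = e + bias
    if biased >= (1 << exp_bits):
        raise OverflowError("magnitude too large for the exponent field")
    if biased < 0:
        raise ArithmeticError("magnitude too close to 0 (underflow)")
    # Keep sig_bits + 1 bits of the significand (truncating), then drop
    # the leading 1, which is implied and therefore not stored.
    scaled = int(m * (1 << (sig_bits + 1)))
    stored = scaled - (1 << sig_bits)
    return sign, format(biased, f"0{exp_bits}b"), format(stored, f"0{sig_bits}b")

print(encode(-37.90625, exp_bits=7, sig_bits=8))
# ('1', '1000110', '00101111')
```

The output agrees with the sign bit, exponent and significand listed for -37.90625 above. The explicit error for 0 reflects the open question in the notes: with an implied leading 1, this sketch has no bit pattern left over for zero.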

  • An Example:

One way is an 8 bit floating point representation:

    • 1 bit sign, 3 bits exponent, 4 bits significand

(note: the exponent is in biased-4 representation)
eg: the number 3.5 can be stored:

+3.5 in binary is +11.1

+11.1 is normalized to become 0.111 * 2^10 (the exponent 10 is written in binary, i.e. 2)
(normalized numbers always start with 0.1, so the first bit can be assumed)

The sign is +, represented by 0

The exponent is 2 (10), and in biased-4 that is 6 (110)

The significand bits are 111 from 0.111


(which can be stored as 1100 since the first bit is assumed)
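Putting the three fields together gives the byte 0 110 1100. As a check, the sketch below decodes such a byte back into a number. The function name `decode` and the bit-string input are assumptions made for this illustration; the field widths and the bias of 4 follow the 8 bit format described above.

```python
def decode(bits, exp_bits=3, sig_bits=4):
    """Decode a bit string laid out as sign | biased exponent | significand."""
    sign = -1 if bits[0] == "1" else 1
    bias = 1 << (exp_bits - 1)                       # 4 for a 3 bit exponent
    exponent = int(bits[1:1 + exp_bits], 2) - bias
    stored = int(bits[1 + exp_bits:], 2)
    # Put back the assumed leading 1, then read the field as 0.1xxxx in binary.
    significand = (stored + (1 << sig_bits)) / (1 << (sig_bits + 1))
    return sign * significand * 2 ** exponent

print(decode("01101100"))   # 3.5
print(decode("01111111"))   # 7.75     (largest positive value)
print(decode("00000000"))   # 0.03125  (smallest positive value)
```

Even the all-zero bit pattern decodes to a nonzero value here, which illustrates why 0 needs special handling in a format with an implied leading bit.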



