Data Representation

CS 272
Sam Houston State Univ.
Dr. Tim McGuire

Positional Number Systems

• Decimal (base 10) is an example
• e.g.,
• 435 means
• 400 + 30 + 5
• 4 x 102 + 3 x 101 + 5 x 100
• Example of a non-positional system: Roman numerials
• inconvenient for humans
• unusable for computers
• Base 2 -- natural for computers
• 0 represents OFF, 1 represents ON
• Base 10 -- natural for humans
• decimal system uses 10 symbols, 0 - 9
• binary system uses 2 symbols, 0 & 1
• 0, 1, 10, 11, 100, 101, 110, 111, 1000, 1001, etc.
• 11012 may be written as
• 1x23 + 1x22 + 0x21 + 1x20 = 8 + 4 + 0 + 1 = 1310
Problems with Binary
• Binary numbers are long and cumbersome for people:
• 100010110001111102 = 71,23010
• Easy to drop a digit by hand
• Therefore, we often use a binary-compatible base (decimal is not binary-compatible)
• Two binary compatible bases are (a) octal -- base 8, and (b) hexadecimal -- base 16
• For base 8, we need eight symbols, 0 - 7
`decimal 0  1   2   3   4   5   6   7    8    9   10`
`binary  0  1  10  11 100 101 110 111 1000 1001 1010`
`octal   0  1   2   3   4   5   6   7   10   11   12`
• When binary changes from 3 symbols to 4, octal changes from 1 symbol to 2
• Reason: 3 binary digits form one octal digit
• Octal was used on several machines, most notably the PDP-11
• 318 = 3x81 + 1x80 = 24 + 1 = 2510
• Hexadecimal is performed similarly, only 16 symbols are needed (0 - 9 and A - F)
• Four bits for one hexadecimal digit
`decimal 0  1   2   3   4   5   6   7    8    9   10   11   12`
`binary  0  1  10  11 100 101 110 111 1000 1001 1010 1011 1100`
`hex     0  1   2   3   4   5   6   7    8    9    A    B    C`
• 0001 0110 0011 1110 = 163E16
Conversions
2BD416 = 2x163 + Bx162 + Dx161 + 4x160
= 2x4096 + 11x256 + 13x16 + 4x1
= 8192 + 2816 + 208 + 4 = 11,22010
95 ¸ 16 = 5 r 15 --> 9510 = 5F16
• Hexadecimal to octal (go through binary)
3F7416 = 0011 1111 0111 0100
= 0 011 111 101 110 100 = 0375648
Exercise:
• Convert the following octal numbers to hexadecimal:
• 12, 5655, 2550276, 76545336, 3726755
• base 10
2546
+1872
4418
• base 16
5B39
+7AF4
D62D
• base 2
• easier than hex because the addition table is so small
100101111
+000110110
101100101
A Note on Notation
• We will represent base 2 numbers with a b suffix, e.g., 0100111100001010b
• Base 16 numbers are represented with an h suffix, e.g., 4F0Ah
• if the hex number starts with A-F, prefix a 0 at the beginning so it doesn't look like a variable name, e.g., 0BACh , not BACH
• Base 10 can be suffixed with a d, but no suffix indicates base 10 by default
• We don't worry about octal on Intel processors
Two's Complement Arithmetic
• So far, the numbers we have discussed have had no size restriction on them
• However, since computer arithmetic is performed in registers, this will restrict the size of the numbers
• On some machines, this is 8 bits, others 16 bits, and others it is 32 or even 64 bits
• To keep it simple, we'll use 4 bits at first
The Fabulous Four-bit Machine (FFM)
• The possible numbers it can hold are:
0000, 0001, 0010, 0011, 0100, 0101, 0110, 0111,
1000, 1001, 1010, 1011, 1100, 1101, 1110, 1111
• In order to identify particular bits in a byte or word, we use the following terms:
• lsb -- least significant bit -- rightmost bit position
• always numbered as bit 0
• msb -- most significant bit -- leftmost bit postion
• on FFM it is bit 3
Unsigned Integers
• An unsigned integer is one which is never negative (addresses of memory locations, ASCII codes, counters, etc.)
• On the FFM, then, we can represent numbers from 0 to 15 (0000b to 1111b)
• On the Intel 8086 the range of integers is 0 to 216-1 (0 to 65535)
• If the lsb is 0, the number is even; if it is 1, the number is odd
Signed Integers
• How do we represent negative numbers?
• We have three possible methods:
• Sign and magnitude
• One's complement
• Two's complement
Sign and Magnitude
• The msb of the number represents the sign of the number
• 0 means positive, 1 means negative
• On FFM
• 0000 to 0111 represent 0 - 7
• 1001 to 1111 represent -1 to -7
• 1000 is not used ( negative 0)
• Easy to understand, but doesn't work well in computer circuits (too many special cases)
One's Complement
• The one's complement of an integer is obtained by complementing each bit
• The one's complement of 5 (0101b) is 1010b = -5
• The one's complement of 0 (0000b) is 1111b = -0 (here we go again)
• - 5 + 5 = 0101 + 1010 = 1111 = -0
• The negative 0 problem can be solved by using two's complement
Two's Complement
• To get the two's complement of an integer, just add 1 to its one's complement
• The two's complement of 5 (0101) is 1010+1 = 1011
• When we add 5 and -5 we get
0100
+ 1011
10000
• Because the FFM can only hold 4 bits, the 1 carried out from the msb is lost and the 4-bit result is 0
Complementing the Complements
• It should be obvious that taking the 1's complement of a number twice will give the original number ( -(-5) = 5)
• If the 2's complement is to be useful, it must have the same property
• 5 = 0101, -5 = 1011, -(-5) = 0100+1 = 0101
Moving past 4 bits
• Everything true of the FFM is still true for 8, 16, 32, or even 64 bit machines
• Example: Show how the base-10 integer -97 would be represented in (a) 8 bits and (b) in 16 bits, expressing the answer in hex
• (a)
•  97 = 6x16+1 = 61h  = 0110 0001b
• -97 = 1001 1110 + 1 = 1001 1111b = 9Fh
• (b)
•  97 = 0000 0000 0110 0001b
• -97 = 1111 1111 1001 1111b = FF9Fh