Chapter 2: Representing and Manipulating Information

Modern computers store and process information represented as two-valued signals.

2.1 Information Storage

Rather than accessing individual bits in memory, most computers use blocks of 8 bits, or bytes, as the smallest addressable unit of memory.

2.1.1 Hexadecimal Notation

One simple trick for doing the conversion in your head is to memorize the decimal equivalents of hex digits A, C, and F.

2.1.2 Data Sizes

Since a virtual address is encoded by such a word, the most important system parameter determined by the word size is the maximum size of the virtual address space.

Signed          Unsigned         32-bit  64-bit
[signed] char   unsigned char    1       1
short           unsigned short   2       2
long            unsigned long    4       8
char *                           4       8

Typical sizes (in bytes) of basic C data types.
The number of bytes allocated varies with how the program is compiled.
This chart shows the values typical of 32-bit and 64-bit programs.

To avoid the vagaries of relying on “typical” sizes and different compiler settings, ISO C99 introduced int32_t and int64_t, which have exactly 4 and 8 bytes, respectively.
Although most compilers and machines treat char as signed data, the C standard does not guarantee this. The programmer should use the declaration signed char to guarantee a 1-byte signed value.

2.1.3 Addressing and Byte Ordering

In virtually all machines, a multi-byte object is stored as a contiguous sequence of bytes, with the address of the object given by the smallest address of the bytes used.
For ordering the bytes representing an object, some machines choose to store the object in memory ordered from least significant byte to most (little endian), while other machines store them from most to least (big endian).
For most application programmers, the byte ordering used by their machines is totally invisible.
At times, however, byte ordering becomes an issue:

  • when binary data are communicated over a network between different machines.
  • when looking at the byte sequences representing integer data.
  • when programs are written that circumvent the normal type system (such as casts and unions in C).

2.1.4 Representing Strings

Observe that the ASCII code for decimal digit x happens to be 0x3x, and this same result would be obtained on any system using ASCII as its character code, independent of the byte ordering and word size conventions.
As a consequence, text data are more platform independent than binary data.

2.1.5 Representing Code

A fundamental concept of computer systems is that a program, from the perspective of the machine, is simply a sequence of bytes.

2.1.6 Introduction to Boolean Algebra

George Boole observed that by encoding logic values TRUE and FALSE as binary values 1 and 0, he could formulate an algebra that captures the basic principles of logical reasoning.

2.1.7 Bit-Level Operations in C

One useful feature of C is that it supports bitwise Boolean operations.

2.1.8 Logical Operations in C

2.1.9 Shift Operations in C

The C standards do not precisely define which type of right shift should be used with signed numbers. In practice, however, almost all compiler/machine combinations use arithmetic right shifts for signed data.
For unsigned data, on the other hand, right shifts must be logical.

2.2 Integer Representations

In this section, we describe two different ways bits can be used to encode integers: unsigned and two's-complement (signed). We will see that they are strongly related, both in their mathematical properties and in their machine-level implementations.

2.2.1 Integral Data Types

C data type      Minimum   Maximum
[signed] char    -128      127
unsigned char    0         255
unsigned short   0         65,535
unsigned long    0         18,446,744,073,709,551,615

Typical ranges for C integral data types for 64-bit programs.

C data type      Minimum   Maximum
[signed] char    -127      127
unsigned char    0         255
unsigned short   0         65,535
unsigned long    0         4,294,967,295

Guaranteed ranges for C integral data types.
The C standards require that the data type have at least these ranges of values.

The C standards define minimum ranges of values that each data type must be able to represent; these are the same as or smaller than those of typical implementations.
In particular, with the exception of the fixed-size data types, they require only a symmetric range of positive and negative numbers. We also see that data type int could be implemented with 2-byte numbers, although this is mostly a throwback to the days of 16-bit machines.

2.2.2 Unsigned Encodings

PRINCIPLE: Definition of unsigned encoding
For vector \(\vec{x} = [x_{w-1}, x_{w-2}, \dots , x_0]: B2U_w(\vec{x}) = \sum\limits ^{w-1}_{i=0}x_i2^i\)
PRINCIPLE: Uniqueness of unsigned encoding
Function \(B2U_w\) is a bijection.

2.2.3 Two’s-Complement Encodings

PRINCIPLE: Definition of two’s-complement encoding
For vector \(\vec{x} = [x_{w-1}, x_{w-2}, \dots , x_0]: B2T_w(\vec{x}) = -x_{w-1}2^{w-1} + \sum\limits ^{w-2}_{i=0}x_i2^i\)
PRINCIPLE: Uniqueness of two’s-complement encoding
Function \(B2T_w\) is a bijection.

2.2.4 Conversions between Signed and Unsigned

Although the C standard does not specify it precisely, there is a general rule for how most C implementations handle conversions between signed and unsigned numbers with the same word size: the numeric values might change, but the bit patterns do not.
From the definitions of \(B2U_w\) and \(B2T_w\) and the fact that both are bijections, it follows that, numerically, \(T2U_w(x)\) and \(U2T_w(x)\) simply add or subtract \(x_{w-1}2^w\), where \(x_{w-1}\) is the most significant bit of the w-bit representation.

Conversion from two’s complement to unsigned. (bijection)

2.2.5 Signed versus Unsigned in C

In C, when either operand of a comparison is unsigned, the other operand is implicitly cast to unsigned, which causes some nonintuitive cases:

  • -1 > 0U
  • 2147483647U < -2147483647-1
  • 2147483647 > (int)2147483648U

2.2.6 Expanding the Bit Representation of a Number

PRINCIPLE: Expansion of unsigned number by zero extension
PRINCIPLE: Expansion of two’s-complement number by sign extension
The C standards specify that when converting from short to unsigned, the program first changes the size and then the type. (E.g. -12345 -> 4294954951)

2.2.7 Truncating Numbers

Whether the number is unsigned or two's-complement, truncation to k bits simply drops the high-order bits at the bit level; numerically, it computes the value modulo \(2^k\) (reinterpreted under the target encoding).

2.2.8 Advice on Signed versus Unsigned

As we have seen, the implicit casting of signed to unsigned leads to some nonintuitive behavior.

2.3 Integer Arithmetic

2.3.1 Unsigned Addition


2.3.2 Two’s-Complement Addition

In hardware, two's-complement addition uses exactly the same circuitry as unsigned addition.

2.3.3 Two’s-Complement Negation

(At the bit level, there are two equivalent implementations: complement every bit and add one; or leave the bits up to and including the rightmost 1 unchanged and complement every bit to its left.)

2.3.4 Unsigned Multiplication

2.3.5 Two’s-Complement Multiplication

In assembly, however, with 64-bit operands, a single multiply instruction serves for signed and unsigned alike when the result is also 64 bits. When the full 128-bit product is wanted, signed and unsigned need separate instructions: the operands are effectively extended to 128 bits first, and sign extension and zero extension give different bit patterns, so the two cases must be distinguished.



2.3.6 Multiplying by Constants

Whether signed or unsigned, multiplying by \(2^k\) (modulo \(2^w\)) is just a left shift by k bits followed by truncation.
For example: x * 14 = x * [1110] = (x<<3) + (x<<2) + (x<<1) = (x<<4) - (x<<1)

2.3.7 Dividing by Powers of 2

PRINCIPLE: Unsigned division by a power of 2
For C variables x and k with unsigned values x and k, such that \(0 \le k < w\), the C expression x >> k yields the value \(\lfloor x/2^k \rfloor\)
PRINCIPLE: Two’s-complement division by a power of 2, rounding down
Let C variables x and k have two’s-complement value x and unsigned value k, respectively, such that \(0 \le k < w\). The C expression x >> k, when the shift is performed arithmetically, yields the value \(\lfloor x/2^k \rfloor\).
PRINCIPLE: Two’s-complement division by a power of 2, rounding up
Let C variables x and k have two’s-complement value x and unsigned value k, respectively, such that \(0 \le k < w\). The C expression (x + (1<<k)-1) >> k, when the shift is performed arithmetically, yields the value \(\lceil x/2^k \rceil\).
The biasing technique exploits the property that \(\lceil x/y \rceil = \lfloor (x + y-1) / y \rfloor\)
Integer division always rounds toward zero.

2.3.8 Final Thoughts on Integer Arithmetic

As we have seen, the “integer” arithmetic performed by computers is really a form of modular arithmetic. The finite word size used to represent numbers limits the range of possible values, and the resulting operations can overflow.
Two’s-complement representation provides a clever way to represent both negative and positive values, while using the same bit-level implementations as are used to perform unsigned arithmetic: operations such as addition, subtraction, multiplication, and even division have either identical or very similar bit-level behaviors, whether the operands are in unsigned or two’s-complement form.

2.4 Floating Point

We will see that since the IEEE format is based on a small and consistent set of principles, it is really quite elegant and understandable.

2.4.1 Fractional Binary Numbers

A first step in understanding floating-point numbers is to consider binary numbers having fractional values.
\(b = \sum\limits_{i=-n}^m2^i \times b_i\)

2.4.2 IEEE Floating-Point Representation

The IEEE floating-point standard represents a number in a form \(V = (-1)^s \times M \times 2^E\)

Standard floating-point formats. Floating-point numbers are represented by three fields. For the two most common formats, these are packed in 32-bit (single-precision) or 64-bit (double-precision) words.

Categories of single-precision floating-point values.

Case 1: Normalized Values
It occurs when the bit pattern of exp is neither all zeros nor all ones.
The exponent field exp is interpreted as representing a signed integer in biased form. That is, the exponent value is E = e - Bias, where e is the unsigned number having bit representation \(e_{k-1} \cdots e_1e_0\) and Bias is a bias value equal to \(2^{k-1}-1\). This yields exponent ranges from -126 to +127 for single precision and from -1022 to +1023 for double precision. (This differs from two’s-complement.)
The fraction field frac is interpreted as representing the fractional value f, where \(0 \le f < 1\), having binary representation \(0.f_{n-1} \cdots f_1f_0\). The significand is defined to be M = 1 + f.

Case 2: Denormalized Values
It occurs when the exponent field is all zeros.
The exponent value is E = 1 - Bias (equal to the smallest exponent value of the normalized case).
The significand value is M = f, that is, the value of the fraction field without an implied leading 1.
Denormalized numbers serve two purposes:

  • provide a way to represent numeric value 0
    The values -0.0 and +0.0 are considered different in some ways and the same in others
  • represent numbers that are very close to 0.0
    This provides a property known as gradual underflow, in which possible numeric values are spaced evenly near 0.0

Case 3: Special Values
It occurs when the exponent field is all ones.
When the fraction field is all zeros, the resulting values represent infinity: \(+\infty\) when s = 0, \(-\infty\) when s = 1. Infinity can represent results that overflow, as when we multiply two very large numbers, or when we divide by zero.
When the fraction field is nonzero, the resulting value is called NaN (not a number). It arises when a result cannot be given as a real number or as infinity, as when computing \(\sqrt{-1}\) or \(\infty - \infty\). NaN is also useful in some applications for representing uninitialized data.

2.4.3 Example Numbers

The set of values that can be represented in a hypothetical 6-bit format having k=3 exponent bits and n=2 fraction bits:
A very important example:

Example nonnegative vales for 8-bit floating-point format. There are k=4 exponent bits and n=3 fraction bits. The bias is 7.

Observe the smooth transition between the largest denormalized number 7/512 and the smallest normalized number 8/512.
This smoothness is due to our definition of E for denormalized values. By making it 1 - Bias rather than -Bias, we compensate for the fact that the significand of a denormalized number does not have an implied leading 1.

Also from the example, we can observe that the IEEE format was designed so that floating-point numbers could be sorted using an integer sorting routine. We can perform comparisons without requiring floating-point operations.

The smallest positive integer that cannot be represented exactly: \(2^{24}+1\) for float and \(2^{53}+1\) for double.

Besides overflow, floating-point values can also underflow, when they are so close to 0.0 that they are changed to zero.

2.4.4 Rounding

The IEEE floating-point format defines four different rounding modes. The default method finds a closest match, while the other three can be used for computing upper and lower bounds.

  • Round-to-even (the default method, also called round-to-nearest):
    It attempts to find a closest match.
    For values exactly halfway between two possibilities, it rounds the number either upward or downward such that the least significant digit of the result is even.
    Rounding toward even numbers avoids statistical bias in most real-life situations (for example, when averaging a set of numbers that each need to be rounded).
  • Round-toward-zero
  • Round-up
  • Round-down

2.4.5 Floating-Point Operations

Conceptually, a floating-point operation treats its operands as exact real numbers, computes the result under real arithmetic, and then rounds that result back to the floating-point format.

When one of the arguments is a special value, such as -0, \(\infty\), or NaN, the standard specifies conventions that attempt to be reasonable. For example, 1/-0.0 is defined to yield \(-\infty\), while 1/+0.0 is defined to yield \(+\infty\).

One strength of the IEEE standard’s method of specifying the behavior of floating-point operations is that it is independent of any particular hardware or software realization. Thus we can examine its abstract mathematical properties without considering how it is actually implemented.

We saw earlier that integer addition, both unsigned and two’s-complement, forms an abelian group.
Floating-point addition is not associative (e.g. (3.14+1e10)-1e10 = 0.0 while 3.14+(1e10-1e10) = 3.14).
Infinities and NaN have no additive inverses (since \(+\infty - \infty = NaN\) and NaN + x = NaN).
Floating-point multiplication is not associative (e.g. (1e20*1e20)*1e-20 = \(+\infty\) while 1e20*(1e20*1e-20) = 1e20).
In addition, floating-point multiplication does not distribute over addition (e.g. 1e20*(1e20-1e20) = 0.0 while 1e20*1e20 - 1e20*1e20 = NaN).
This lack of associativity and distributivity is of serious concern to scientific programmers and compiler writers.
On the other hand, floating-point addition and multiplication satisfy some monotonicity properties.

2.4.6 Floating Point in C

Since the C standards do not require the machine to use IEEE floating point, there are no standard methods to change the rounding mode or to get special values such as -0, \(+\infty\), \(-\infty\), or NaN. Most systems provide a combination of include (.h) files and procedure libraries to give access to these features.

When casting values between int, float, and double formats:

  • From int to float: the number cannot overflow, but it may be rounded
  • From int or float to double, the exact numeric value can be preserved, because double has both greater range (i.e., the range of representable values) and greater precision (i.e., the number of significant bits)
  • From float or double to int, the value will be rounded toward zero.
    Furthermore, the value may overflow; the C standards do not specify a fixed result for this case.
    Intel-compatible microprocessors designate the bit pattern [10…00] as an integer indefinite value.
    Any conversion from floating point to integer that cannot assign a reasonable integer approximation yields this value. (e.g. (int)+1e10 = -2147483648)

Assume variables x, f, and d are of type int, float, double, respectively.
x == (int)(double)x
x != (int)(float)x
d != (double)(float)d
f == (float)(double)f
f == -(-f)
(a floating-point number is negated by simply inverting its sign bit)

2.5 Summary

Both unsigned and two’s-complement arithmetic satisfy many properties of integer arithmetic, including associativity, commutativity, and distributivity. This allows compilers to do many optimizations (unlike with floating-point operations).