
The Guard Digit

The guard digit is useful when subtracting nearly equal numbers. Suppose $a = (1.0)_2 \times 2^{0}$ and $b = (1.11\ldots1)_2 \times 2^{-1}$, with 23 1's after the binary point. Both $a$ and $b$ are single precision floating point numbers. The mathematical result is $a - b = (1.0)_2 \times 2^{-24}$. It is also a floating point number, hence the numerical result should be identical to the mathematical result, $fl(a - b) = a - b$.
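The representability claims can be checked with a short Python sketch. It assumes the operands are $a = 1.0$ and $b = 1 - 2^{-24}$ (the value of a significand with 23 1's after the binary point and exponent $-1$); the helper `to_float32` is a hypothetical name that rounds a double to single precision through the standard `struct` module.

```python
import struct

def to_float32(x: float) -> float:
    """Round a Python float (a double) to the nearest single precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

a = 1.0                # (1.0)_2 x 2^0
b = 1.0 - 2.0**-24     # (1.11...1)_2 x 2^-1, 23 ones after the binary point
diff = a - b           # computed in double precision, where it is exact

# A value is a single precision number if it survives the round trip unchanged.
is_single = [to_float32(v) == v for v in (a, b, diff)]
print(is_single, diff == 2.0**-24)
```

All three values survive the round trip, so a correct single precision subtraction has no excuse for returning anything other than $2^{-24}$.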

When we subtract the numbers, we align them by shifting $b$ one position to the right. If computer registers are 24 bits long, then one of the following situations may occur.

1. Shift $b$ and ``chop'' it to single precision format (to fit the register), then subtract. The shifted operand loses its last 1 and becomes $(0.11\ldots1)_2 \times 2^{0}$, with 23 1's after the binary point.

The result is $2^{-23}$, twice the mathematical value.

2. Shift $b$ and ``round'' it to single precision format (to fit the register), then subtract. The shifted operand rounds up to $(1.00\ldots0)_2 \times 2^{0}$.

The result is $0$, and all the meaningful information is lost.

3. Append an extra guard bit to the registers. When we shift $b$, the guard bit will hold the 23rd 1. The subtraction is then performed in 25 bits.

The result is normalized, and is rounded back to 24 bits. This result is $2^{-24}$, precisely the mathematical value.

Fun fact: Cray supercomputers lack the guard bit. In practice, many processors perform additions and subtractions in extended precision, even if the operands are single or double precision. This effectively provides 16 guard bits for these operations. This does not come for free: the additional hardware makes the processor more expensive and, besides, the longer the word, the slower the arithmetic operation.
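The three alignment strategies for the single precision example can be imitated with integer significands. This is only a sketch under stated assumptions: $a = 1.0$ and $b = 1 - 2^{-24}$, significands stored as Python integers, and exact `Fraction` values used to read off the results. The names `A`, `B`, `chopped`, `rounded` and `guarded` are illustrative.

```python
from fractions import Fraction

A = 1 << 23          # a = 1.00...0 x 2^0, significand in units of 2^-23
B = (1 << 24) - 1    # b = 1.11...1 x 2^-1, significand in units of 2^-24

# 1. Shift b right by one bit and chop the bit that falls off the register.
chopped = Fraction(A - (B >> 1), 2**23)

# 2. Shift b right by one bit and round to nearest; the dropped bit is a 1,
#    so the 24-bit operand rounds up to 1.00...0.
rounded = Fraction(A - ((B + 1) >> 1), 2**23)

# 3. Keep the shifted-out bit in a guard position: subtract in units of 2^-24.
guarded = Fraction((A << 1) - B, 2**24)

exact = Fraction(1, 2**24)
print(chopped / exact, rounded, guarded == exact)
```

Chopping yields twice the exact difference, rounding yields zero, and only the 25-bit subtraction reproduces $2^{-24}$.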

The following theorem (see David Goldberg, p. 160) shows the importance of the additional guard digit. Let $x$, $y$ be FP numbers in a FP system with parameters $\beta$ (the base) and $p$ (the number of mantissa digits), and let $\epsilon$ denote the machine epsilon;

if we compute $x - y$ using $p$ digits, then the relative rounding error in the result can be as large as $\beta - 1$ (i.e. all the digits are corrupted!).
if we compute $x - y$ using $p + 1$ digits, then the relative rounding error in the result is less than $2\epsilon$.
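A small decimal case illustrates both statements, taking the bounds as they appear in Goldberg's paper ($\beta - 1$ without a guard digit, $2\epsilon$ with one). The sketch below assumes a toy format with $\beta = 10$ and $p = 3$ and the operands $x = 1.00 \times 10^{0}$, $y = 9.99 \times 10^{-1}$; the function `subtract` and its integer-significand convention are hypothetical, and no final rounding is applied because the differences here are already representable.

```python
from fractions import Fraction

BETA, P = 10, 3   # assumed toy format: base 10, 3 significand digits

def subtract(mx, ex, my, ey, guard_digits):
    """Return x - y for x = mx * BETA**(ex-P+1) >= y = my * BETA**(ey-P+1).

    y is shifted right to x's exponent and chopped to P + guard_digits digits.
    The difference is returned exactly (no final rounding is applied)."""
    shift = ex - ey
    scale = BETA ** guard_digits
    y_aligned = (my * scale) // BETA ** shift   # chop the digits shifted out
    return Fraction(mx * scale - y_aligned, scale) * Fraction(BETA) ** (ex - P + 1)

def rel_err(result, exact):
    """Relative error of result with respect to the exact value."""
    return abs(result - exact) / exact

exact = Fraction(100, 100) - Fraction(999, 1000)   # 1.00 - 0.999 = 0.001

no_guard = subtract(100, 0, 999, -1, guard_digits=0)    # 1.00 - 0.99
one_guard = subtract(100, 0, 999, -1, guard_digits=1)   # 1.000 - 0.999

print(rel_err(no_guard, exact), rel_err(one_guard, exact))
```

Without the guard digit the computed difference is $0.01$ instead of $0.001$, a relative error of $9 = \beta - 1$; with one guard digit the difference is exact.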

Note that, although using an additional guard digit greatly improves accuracy, it does not guarantee that the result will be exactly rounded (i.e. will obey the IEEE requirement). As an example consider $a = 1.10 \times 10^{2}$, $b = 8.59 \times 10^{0}$ in our toy FP system. In exact arithmetic, $a - b = 101.41$, which rounds to $fl(a - b) = 1.01 \times 10^{2}$. With guard digit arithmetic, we first shift $b$ and chop it to 4 digits, $\hat{b} = 0.085 \times 10^{2}$. Now $a - \hat{b} = 1.015 \times 10^{2}$ (calculation done with 4 mantissa digits). When we round this number to the nearest (even) we obtain $1.02 \times 10^{2}$, a value different from the exactly rounded result.
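The failure of a single guard digit can be reproduced with Python's `decimal` module. The sketch assumes the operands of Goldberg's example, $a = 1.10 \times 10^{2}$ and $b = 8.59$, in a decimal system with 3 mantissa digits; these values and the variable names are assumptions made for illustration.

```python
from decimal import Context, Decimal, ROUND_DOWN, ROUND_HALF_EVEN

P = 3                                               # assumed mantissa length
round_p = Context(prec=P, rounding=ROUND_HALF_EVEN)

a = Decimal("110")     # 1.10 x 10^2
b = Decimal("8.59")    # 8.59 x 10^0

# Exactly rounded result: subtract exactly, then round once to P digits.
exactly_rounded = round_p.plus(a - b)               # a - b = 101.41

# One guard digit: align b with a's exponent and chop it to P + 1 digits,
# i.e. keep only multiples of 10^(2 - P) = 0.1.
b_chopped = b.quantize(Decimal("0.1"), rounding=ROUND_DOWN)   # 0.085 x 10^2
guarded = a - b_chopped                                        # 1.015 x 10^2
guard_rounded = round_p.plus(guarded)               # tie, resolved to even

print(exactly_rounded, guard_rounded)
```

The two printed values differ in the last digit, because chopping $b$ turned $101.41$ into the tie $101.5$ before the final rounding.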

However, by introducing a second guard digit and a third, ``sticky'' bit, the result is the same as if the difference were computed exactly and then rounded (D. Goldberg, p. 177).
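A decimal analogue of this scheme can be sketched as follows. The assumptions are loud ones: a 3-digit decimal toy format, operands with $x \ge y > 0$ given as integer significands and exponents, and a sticky position that holds 1 whenever any nonzero digit was shifted out. The function names `sub_guard_sticky` and `sub_exact_then_round` are hypothetical.

```python
from decimal import Context, Decimal, ROUND_HALF_EVEN

P = 3                                               # assumed mantissa length
round_p = Context(prec=P, rounding=ROUND_HALF_EVEN)

def sub_guard_sticky(mx, ex, my, ey):
    """Round x - y using two guard digits and a sticky digit.

    x = mx * 10**ex and y = my * 10**ey with P-digit integers mx, my,
    and x >= y > 0 (so ex >= ey)."""
    shift = ex - ey
    # Align y with x, keeping two extra (guard) digits.
    y_kept, dropped = divmod(my * 100, 10 ** shift)
    sticky = 1 if dropped else 0        # records only that something was lost
    # Subtract in units of 10**(ex - 3): two guard digits plus the sticky one.
    diff = (mx * 100 - y_kept) * 10 - sticky
    return round_p.plus(Decimal(f"{diff}E{ex - 3}"))

def sub_exact_then_round(mx, ex, my, ey):
    """Reference: exact difference followed by a single rounding."""
    exact = Decimal(f"{mx}E{ex}") - Decimal(f"{my}E{ey}")
    return round_p.plus(exact)

# 110 - 8.59, the case where a single guard digit gives 102.
print(sub_guard_sticky(110, 0, 859, -2), sub_exact_then_round(110, 0, 859, -2))
```

The sticky digit never changes which way the final rounding goes; it only prevents a chopped operand from looking like an exact tie.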


Adrian Sandu 2001-08-26