Next: Double-Valued Vectors ( vector Up: Number Types and Linear Previous: The smod kernel of   Contents   Index

# A Floating Point Filter ( floatf )

Definition

The type floatf provides a clean and efficient way to approximately compute with large integers. Consider an expression E with integer operands and operators + , -, and *, and suppose that we want to determine the sign of E. In general, the integer arithmetic provided by our machines does not suffice to evaluate E since intermediate results might overflow. Resorting to arbitrary precision integer arithmetic is a costly process. An alternative is to evaluate the expression using floating point arithmetic, i.e., to convert the operands to doubles and to use floating-point addition, subtraction, and multiplication.

Of course, only an approximation E' of the true value E is computed. However, E' might still be able to tell us something about the sign of E. If E' is far away from zero (the forward error analysis carried out in the next section gives a precise meaning to "far away") then the signs of E' and E agree and if E' is zero then we may be able to conclude under certain circumstances that E is zero. Again, forward error analysis can be used to say what `certain circumstances' are.

The type floatf encapsulates this kind of approximate integer arithmetic. Any integer (= object of type integer) can be converted to a floatf; floatfs can be added, subtracted, multiplied, and their sign can be computed: for any floatf x the function Sign(x) returns either the sign of x (-1 if x < 0, 0 if x = 0, and +1 if x > 0) or the special value NOIDEA. If x approximates X, i.e., X is the integer value obtained by an exact computation, then Sign(x)! = NOIDEA implies that Sign(x) is actually the sign of X if Sign(x) = NOIDEA then no claim is made about the sign of X.

#include < LEDA/numbers/floatf.h >

Creation

 floatf x introduces a variable x of type floatf and initializes it with zero. floatf x(integer i) introduces a variable x of type floatf and initializes it with integer i.

Operations

 floatf const floatf& a + const floatf& b Addition. floatf const floatf& a - const floatf& b Subtraction. floatf const floatf& a * const floatf& b Multiplication. int Sign(const floatf& f) as described above.

Implementation

A floatf is represented by a double (its value) and an error bound. An operation on floatfs performs the corresponding operation on the values and also computes the error bound for the result. For this reason the cost of a floatf operation is about four times the cost of the corresponding operation on doubles. The rules used to compute the error bounds are described in ([65]).

Example

see [65] for an application in a sweep line algorithm.

Next: Double-Valued Vectors ( vector Up: Number Types and Linear Previous: The smod kernel of   Contents   Index