Talkback: Discuss this article with peers
Abstract
In this article I propose a practical exploration of how Linux behaves when performing single or double-precision calculations. I use a chaotic function to show how the calculation results of a same program can vary quite a lot under Linux or a Microsoft operating system.
It is intended for math and physics students and teachers, though the equations involved are quite accessible to just about everybody.
I use Pascal, C and Java as they are the main programming languages in use today.
This discussion focusses on the Intel architecture. Basic concepts are the same for other types of processor, though the details can vary somewhat.
May functions
These functions build up a series of terms with the form:
x0 is given in [0;1]
xk+1 = mu.xk.(1 - xk) where mu is a parameter
They were introduced by Robert May in 1976, to study the evolution of a closed insect population. It can be shown that:
Simplifying things somewhat, the difference between a chaotic and a deterministic system is their sensibility to initial conditions. A chaotic system is very sensible: a small variation of the initial value of x0 will lead to increasing differences in subsequent terms. Thus any error that creeps into the calculations -- such as lack of precision -- will eventually give very different final results.
Other examples of chaotic systems are satellite orbitals and weather prediction.
On the other hand, a deterministic system is not so sensible. A small error in x0 will make us calculate terms that, while differing from their exact value, will be "close enough" aproximations (whatever that means).
An example of a deterministic system is the trajectory of a ping-pong ball.
So chaotic functions are useful to test the precision of calculations on different systems and with various compilers.
Our example
In this example, I propose to use the following values:
mu = 3.8
x0 = 0.5
A precise calculation with a special 1000-digit precision packet gives the following results:
k x(k) ----- --------- 10 0.18509 20 0.23963 30 0.90200 40 0.82492 50 0.53713 60 0.66878 70 0.53202 80 0.93275 90 0.79885 100 0.23161
As you see, the series fluctuates merrily up and down the scale between 0 and 1.
Programming in Turbo-Pascal
A program to calculate this function is easily written in Turbo Pascal for MS-DOS: (text version)
program caos; {$n+} { you need to activate hardware floating-point calculation in order to use the extended type } uses crt; var s : single; { 32-bit real } r : real; { 48-bit real } d : double; { 64-bit real } e : extended; { 80-bit real } i : integer; begin clrscr; s := 0.5; r := 0.5; d := 0.5; e := 0.5; for i := 1 to 100 do begin s := 3.8 * s * (1 - s); r := 3.8 * r * (1 - r); d := 3.8 * d * (1 - d); e := 3.8 * e * (1 - e); if (i/10 = int(i/10)) then begin writeln (i:10, s:16:5, r:16:5, d:16:5, e:16:5); end; end; readln; end.
As you can see, Turbo Pascal has quite a number of floating-point types, each on a different number of bits. In each case, specific bits are set aside for:
For example, on a 386, an 80-bit floating-point is coded as:
Naturally, hardware FP coding is determined by the processor manufacturer. However, the compiler designer can specify different codings for internal calculations. If FP-math emulation is not used, the compiler must then provide means to translate compiler codings to hardware. This is the case for Turbo Pascal.
The results of the above program are:
k single real double extended ---- --------- --------- --------- ---------- 10 0.18510 0.18510 0.18510 0.18510 20 0.23951 0.23963 0.23963 0.23963 30 0.88423 0.90200 0.90200 0.90200 40 0.23013 0.82492 0.82493 0.82493 50 0.76654 0.53751 0.53714 0.53714 60 0.42039 0.64771 0.66878 0.66879 70 0.93075 0.57290 0.53190 0.53203 80 0.28754 0.72695 0.93557 0.93275 90 0.82584 0.39954 0.69203 0.79884 100 0.38775 0.48231 0.41983 0.23138
The first terms are rather close in all cases, as heavy calculation precision losses (from truncation) have not yet occurred. Then the least precise (single) format already loses touch with reality around x30, while the real format goes out around x60 and the double around x90. These are all compiler FP codings.
The extended format -- which is the native hardware FP coding -- retains sufficient precision right up to x100. As an educated guess, it would probably go out around x110.
p2c under Linux
The above program can be compiled with almost no changes with the p2c translating program under Linux:
p2c caos.pas | translate caos.pas to caos.c |
cc caos.c -lp2c -o caos | compile caos.c + p2c library using gcc |
Results are then:
k single real double extended ---- --------- --------- --------- ---------- 10 0.18510 0.18510 0.18510 0.18510 20 0.23951 0.23963 0.23963 0.23963 30 0.88423 0.90200 0.90200 0.90200 40 0.23013 0.82493 0.82493 0.82493 50 0.76654 0.53714 0.53714 0.53714 60 0.42039 0.66878 0.66878 0.66878 70 0.93075 0.53190 0.53190 0.53190 80 0.28754 0.93558 0.93558 0.93558 90 0.82584 0.69174 0.69174 0.69174 100 0.38775 0.49565 0.49565 0.49565
It is interesting to note that the p2c translator converts Pascal single precision FP to C single, while the real, double and extended types all convert to C double. This is a format that keeps precision up to around x80.
I have no data to substantiate the following, but my impression is that C double FP coding is also on 64 bits, but with a different magnitude vs. exponent distribution than for Turbo Pascal.
gcc under Linux
The above program, rewritten in C and compiled with gcc, naturally gives the very same results as with p2c: (text version)
#include <stdio.h> int main() { float f; double d; int i; f = 0.5; d = 0.5; for (i = 1; i <= 100; i++) { f = 3.8 * f * (1 - f); d = 3.8 * d * (1 - d); if (i % 10 == 0) printf ("%10d %20.5f %20.5f\n", i, f, d); } }
Java
The Java programming language is another case altogether, as from the start it was designed to work on many different platforms.
A Java .class file contains the source program compiled in a Virtual Machine Language format. This "executable" file is then interpreted on a client box by whatever java interpreter is available.
However, the Java specification took FP precision very much into account. Any java interpreter should perform single and double precision FP calculations with precisely the same results.
This means that one same program will:
The reader can easily experiment these facts. The following applet calculates the May series we have been talking about. Compare its results on your own setup, viewed with Netscape, HotJava, appletviewer, etc. You could also compare with the same browsers, or others, under Windoze. Just open this page with each browser:
I have, so far, only found one single exception to this rule. Guess who? Microsoft Explorer 3.0!
Finally, the java source file was: (text version)
import java.applet.Applet; import java.lang.String; import java.awt.*; public class caos extends Applet { public void paint (Graphics g) { float f; double d; String s; int i, y; f = (float)0.5; d = 0.5; g.setColor (Color.black); g.drawString ("k", 10, 10); g.drawString ("float", 50, 10); g.drawString ("double", 150, 10); g.setColor (Color.red); y = 20; for (i = 1; i <= 100; i++) { f = (float)3.8* f * ((float)1.0 - f); d = 3.8 * d * (1.0 - d); if (i % 10 == 0) { y += 12; g.drawString (java.lang.String.valueOf(i), 10, y); g.drawString (java.lang.String.valueOf(f), 50, y); g.drawString (java.lang.String.valueOf(d), 150, y); } } } }
Further reading
An introduction to Chaotic Dynamical Systems, R.L. Devaney
Jurassic Park I and II, Michael Crichton (the books, not the films!)
The Intel 386-SX microprocessor data sheet, Intel Corp. (available at http://developer.intel.com)