forked from douglascrockford/DEC64
-
Notifications
You must be signed in to change notification settings - Fork 2
/
Copy pathdec64.html
133 lines (122 loc) · 11.4 KB
/
dec64.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title>DEC64: Decimal Floating Point</title>
<style>
@import url(http://fonts.googleapis.com/css?family=Cousine:700);
@import url(http://fonts.googleapis.com/css?family=Inika:400,700);
body {
background-color: cornsilk;
border-left: 1em solid indianred;
font-family: 'Inika', serif;
margin-left: 1em;
max-width: 50em;
padding: 10%;
}
code, pre {
font-weight: bold;
}
pre {
border-left: 1em solid indianred;
padding-left: 1em;
padding-bottom: 0.25em;
padding-top: 0.25em;
}
table {
margin-top: -1em;
}
td {
background-color: navajowhite;
border: 2pt solid black;
text-align: center;
}
th {
background-color: inherit;
border: 0;
font-weight: lighter;
padding-left: 3pt;
padding-right: 3pt;
position: relative;
text-align: justify;
top: 1em;
}
/* This terrible hack instead of text_align: force */
th:after {
content: "";
display: inline-block;
height: 0;
width: 100%;
}
.dec64, h1, h2 {
font-family: 'Cousine', sans-serif;
};
</style>
</head>
<body>
<img src="dec64.png" width="398" height="103" alt="DEC64">
<h1>Overview</h1>
<p><span class=dec64>DEC64</span> is a number type. It can precisely represent
decimal fractions with 16 decimal places, which makes it very well suited
to all applications that are concerned with money. It can represent
values as gargantuan as <code>3.6028797018963967E+143</code> or as measly
as <code>1.0E-127</code>, which makes it well suited to most scientific
applications. It can provide very fast performance on integer values,
eliminating the need for a separate <code>int</code> type and avoiding the
terrible errors than can result from <code>int</code> truncation.</p>
<p><span class=dec64>DEC64</span> is intended to be the only number type in the
next generation of application programming languages.</p>
<h1 id="representation">Representation</h1>
<p><span class=dec64>DEC64</span> represents numbers as 64 bit values composed of 2 two’s complement components: a 56 bit <var>coefficient</var> and an 8 bit <var>exponent</var>. The <var>coefficient</var> is in the high order end, and the <var>exponent</var> is in the low order end. The <var>coefficient</var>’s decimal point is between bits 8 and 7.</p>
<table cellspacing=1 cellpadding="0" width=100%><tbody>
<tr><th>63 8</th><th>7 0</th></tr>
<tr>
<td width=70%><em>coefficient</em></td>
<td width=10%><em>exponent</em></td>
</tr></tbody>
</table>
<p>The <var>coefficient</var> is in the range -<code>36028797018963968</code> .. <code>36028797018963967. </code>The <var>exponent</var> is in the range <code>-127</code> .. <code>127</code>. Numbers may not use an <var>exponent</var> containing <code>-128</code><code></code>. The <var>value</var> of a number is obtained from this formula:</p>
<blockquote><var>value</var><code> = </code><var>coefficient</var><code> * 10</code><sup><var>exponent</var></sup></blockquote>
<p>Normalization is not required, and is usually not desired. Integers can have an <var>exponent</var> of <code>0</code> as long as the <var>coefficient</var> is less than 36 quadrillion. Addition of numbers with equal <var>exponents</var> could be performed in a single machine cycle.</p>
<p>There are 255 possible representations of zero. They are all considered to be equal.</p>
<p>There is a special value called <code>nan</code> that has a <var>coefficient</var> of <code>0</code> and an <var>exponent</var> of <code>-128</code>. The result of division by zero is <code>nan</code>. <code>nan</code> is also the result of operations that produce results that are too large to be represented. <code>nan</code> is equal to itself.</p>
<p>When an arithmetic operation has an input with an <var>exponent</var> of <code>-128</code>, the result will be <code>nan</code>. Applications are free to use the <var>coefficient</var> as they wish when the <var>exponent</var> is <code>-128</code>, since in that case the <var>coefficient</var> has no arithmetic significance. One possible use is to store object pointers in the <var>coefficient</var>. </p>
<h1 id="implementation">Implementation</h1>
<p><span class=dec64>DEC64</span> can be implemented efficiently in hardware or software. </p>
<p>Conversion to and from textual representations is simple and straightforward and free of the complexities that binary floating formats must wrestle with to minimize the inevitable errors caused by the fundamental incompatibility of the binary and decimal systems. <span class=dec64>DEC64</span> instead uses an internal representation that is very compatible with the <code>E</code> notation.</p>
<p>To convert an <code>int</code> to <span class=dec64>DEC64,</span> shift it left 8 bits. To unpack a <var>coefficient</var>, shift it right 8 bits with sign extension. The <var>exponent</var> can be unpacked at no cost on x64 architecture because the least significant byte can be accessed directly.</p>
<p>There is a fast path for addition of integers in a software implementation that takes only 5 instructions while also providing for <i>not-a-number</i> and overflow protection.</p>
<p>x64:</p>
<pre>; Add rdx to rax.
mov cl,al ; load the exponent of rax into cl
or cl,dl ; 'or' the two exponents together
jnz slow_path ; if both exponents are zero, take the fast path
add rax,rdx ; add the coefficients together
jo overflow ; if there was no overflow, we are done</pre>
<p>ARM64:</p>
<pre>; Add x1 to x0
orr x2, x0, x1 ; 'or' the two numbers together
ands xzr, x2, #255 ; examine the exponent part
b.ne slow_path ; if both exponents are zero, take the fast path
adds x0, x0, x1 ; add the coefficients together
b.vs overflow ; if there was no overflow, we are done</pre>
<p>The fast path for addition in a hardware implementation should take only 1 cycle when the two exponents are equal to each other and there is no overflow. The fast path for multiplication in hardware takes the time it takes to do a 56*56 signed multiply when there is no overflow. </p>
<p>A reference implementation is available Intel/AMD x64 and ARM64 at <a href="https://github.com/douglascrockford/DEC64">https://github.com/douglascrockford/DEC64</a>. <a href="http://dec64.com/dec64.obj.html">It provides the DEC64 elementary operations.</a></p>
<p>Conversion between <span class=dec64>DEC64</span> and strings is trivially easy. This is demonstrated by <a href="http://www.dec64.com/dec64_string.html">dec64_string</a>.</p>
<p>The elementary functions (sine, log, sqrt, etc) are demonstrated by <a href="http://dec64.com/dec64_math.html">dec64_math</a>.</p>
<h1>Motivation</h1>
<p>The idea of using powers of <em>ten</em> instead of powers of <em>two</em> is not new. For example, </p>
<blockquote>
<p>Floating point subroutines and interpretive systems for early machines were coded by D. J. Wheeler and others, and the first publication of such routines was in <i>The Preparation of Programs for an Electronic Digital Computer</i> by Wilkes, Wheeler, and Gill (Reading, Mass.: Addison-Wesley, 1951), subroutines A1-A11, pages 35-37 and 105-117. It is interesting to note that floating <i>decimal</i> subroutines are described here, although a binary computer was being used; in other words, the numbers were represented as 10<i><sup>e</sup>f</i>, not 2<i><sup>e</sup>f</i>, and therefore the scaling operations required multiplication or division by 10.</p>
<p><i>The Art of Computer Programming, Volume 2: Seminumerical Algorithms, Third Edition</i> by Donald Knuth (Addison-Wesley, 1998), page 226.</p>
</blockquote>
<p>The book Knuth cited may have been the first software book. It described some of the libraries and conventions of Maurice Wilkes’s EDSAC, one of the first generation of Von Neumann machines. Some of its subroutines used a numeric format that was very similar to <span class=dec64>DEC64</span>.</p>
<p>Floating point was so important that support for it was moved into hardware for better performance. This led to the development of binary floating point because a shift could be implemented much more easily than a divide by 10. It was discovered that by biasing the exponent and moving it to the position just after the sign bit that floating point numbers could be compared with integer opcodes, a nifty optimization. It was also discovered that because normalization always left a 1 bit in the most significant position of the significand, that that bit could be omitted, providing an additional bit of significance.</p>
<p>The Burroughs 5000 series had a floating point format in which an exponent of zero allowed the mantissa to be treated as an ordinary integer. <span class=dec64>DEC64</span> incorporates that brilliant idea.</p>
<p>Languages for scientific computing like FORTRAN provided multiple floating point types such as <code>REAL</code> and <code>DOUBLE</code> <code>PRECISION</code> as well as <code>INTEGER</code>, often also in multiple sizes. This was to allow programmers to reduce program size and running time. This convention was adopted by later languages like C and Java. In modern systems, this sort of memory saving is pointless. By giving programmers a choice of number types, programmers are required to waste their time making choices that don’t matter. Even worse, making a bad choice can lead to a loss of accuracy or destructive bugs. This is a bad practice that is very deeply ingrained.</p>
<p>Binary floating point trades away familiarity and decimal compatibility for performance. This made it unsuitable for business languages like COBOL. Decimal fractions cannot be represented accurately in binary floating point, which is a problem for programs that interact with humans, and is dangerous in programs that manipulate money. Exactness is required, so most business processing used BCD (Binary Coded Decimal) in which each digit is encoded in 4 bits. That created some inefficiency, but benefited from allowing a shift by 4 bits in place of the more complex divide by 10. For a time, mainframes could be ordered with optional floating point units for scientific computing, and optional BCD units for business computing.</p>
<p>The BASIC language eliminated much of the complexity of FORTRAN by having a single number type. This simplified the programming model and avoided a class of errors caused by selection of the wrong type. The efficiencies that could have gained from having numerous number types proved to be insignificant. </p>
<p>Business Basic was a dialect of BASIC that was developed by Basic/Four Corporation for its small business minicomputers. It used decimal floating point, much like the EDSAC, so the language could be used for both scientific and business applications. Business Basic could do everything that BASIC could do, and it could also handle money.</p>
<p>Intel undertook an ambitious architecture that was marketed as iAPX432. A decimal number type was considered for the 432 but rejected in favor of conventional binary floating point. The 432 failed, but its floating point unit was salvaged and repackaged as the 8087, a numeric coprocessor for the 8086. The 8087’s number types became the basis of the IEEE 754 floating point standard. The standard was so successful that it destroyed the market for decimal computing. A later revision of IEEE 754 attempted to remedy this, but the formats it recommended were so inefficient that it has not found much acceptance. <span class=dec64>DEC64</span> is a better alternative.</p>
</body>
</html>