Vector vol. 23 Nos. 1&2 Contents Editorial Stephen Taylor 2 Sustaining Members News

Date	28.01.2017


Design Decisions in APLX64

by Richard Nabavi, MicroAPL Ltd (richardnabavi@microapl.co.uk)

When MicroAPL launched its first APL microcomputer in 1980, the CPU was a Zilog Z80, which could address a maximum of 64KB. By squeezing the system and APL interpreter software down to an absolute minimum, we were able to offer a workspace size of around 28KB. Remarkably, users were able to write some quite sophisticated APL systems with this tiny resource, but it was tight!

The big breakthrough came in 1983, when the MicroAPL Spectrum was launched, with a 68000 processor and a maximum of 16MB of RAM (24-bit addressing). For the first time, microcomputer users could enjoy a workspace size as big as, or much bigger than, what a mainframe could offer. Within a couple of years, MicroAPL was selling 68020-based systems with full 32-bit addressing.

At this time, IBM PC users were still largely limited to a total of 640KB, and this was segmented into 64KB chunks, which limited the maximum size of arrays. Eventually the PC world caught up, and unsegmented 32-bit systems, addressing up to a theoretical 4GB of memory (of which 2GB is the most that Windows applications can address), have been the norm for the last decade or more. Because they use (signed) 32-bit integers internally, most APL interpreters available today have a maximum workspace size of 2GB, and the maximum number of elements in an array is at most 2,147,483,647 (¯1+2*31).

But now a new standard is emerging. First AMD, and subsequently Intel, have extended the original x86 architecture to a full 64 bits. Desktop computers now often contain a 64-bit processor (such as the Intel Core Duo or AMD Athlon), and Intel have now standardised on 64 bits for their entire range of desktop and server processors. Currently, nearly all of these are used in 32-bit mode, but the fact remains: low-cost 64-bit systems are here. And APL is one of the few software products which can make good use of the new power and enhanced memory addressing which are now available.

APLX64 is a fully 64-bit version of APLX, which is designed to take advantage of this new memory addressing capability. It is currently available for Linux and Windows.

Representation and Conversion of Numbers

Integer size

In designing APLX64, one of the first design decisions was: how big should integers be? Our starting point was that we did not want to impose any artificial restrictions on workspace size or array dimensions, so in APLX64 all array dimensions and all internal pointers are 64-bit. This means that, in theory, the maximum workspace size is 8,589,934,592GB, and the maximum size of an array is 9,223,372,036,854,775,807 elements (¯1+2*63). That should be enough for another decade or two, and takes us well beyond the current generation of 64-bit operating systems, which are typically limited to 128GB of physical RAM.
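The limits quoted above follow directly from the width of a signed integer. The short Python sketch below recomputes them; Python is used purely as a calculator, and the constant and variable names are illustrative. It assumes that a gigabyte means 2**30 bytes.

```python
# Size limits implied by signed 32-bit and signed 64-bit integers.
# Assumption: one gigabyte is taken as 2**30 bytes.
GIGABYTE = 2**30

max_elements_32 = 2**31 - 1            # APL notation: ¯1+2*31
max_elements_64 = 2**63 - 1            # APL notation: ¯1+2*63
max_workspace_gb = 2**63 // GIGABYTE   # theoretical 64-bit workspace, in gigabytes

print(f"{max_elements_32:,}")    # 2,147,483,647
print(f"{max_elements_64:,}")    # 9,223,372,036,854,775,807
print(f"{max_workspace_gb:,}")   # 8,589,934,592
```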

In order to index these potentially massive arrays, we decided to implement full 64-bit integers. This was partly to avoid having to use floating-point numbers to index arrays, but also because 64-bit integers are needed in other contexts in 64-bit systems. These other uses include positions in native files, record numbers and IDs in SQL databases, and handles, pointers and other 64-bit values returned from external calls (⎕NA). In addition, we wanted the APL user to be able to do full-precision 64-bit integer arithmetic.

Booleans remain as one bit per element, making it possible to handle huge Boolean arrays without excessive memory requirements.

Representation of floating-point numbers

In most 32-bit APLs, including APLX, floating-point numbers are represented in 64-bit IEEE format. This representation has 53 bits of precision, and a range of ¯1.797693135E308 to +1.797693135E308.

In APLX64, we decided to keep this same 64-bit format for floating-point numbers. The principal motivation for this decision was that current processors and compilers support 64-bit floats directly, whereas higher-precision representations (such as 80-bit or 128-bit) are not available on many platforms. It does not look as though this will change in the near future. A secondary motivation was to save space on large floating-point arrays.

Conversion between integers and floats

The choice of 64-bit float types presents a potential problem. Up to 2*53, integers can be represented exactly as 64-bit floats. Above 2*53, the floats start to lose so much precision that a given float bit-pattern covers a range which includes more than one integer (maybe many thousands of integers). So what happens to the rules for converting integers to floats and vice versa in APL?
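The 2*53 boundary can be observed in any language that uses IEEE 64-bit floats. The following Python sketch is an illustration only (it is not APLX64 code, and the variable names are made up for the example). It shows that whole numbers stay distinct just below the boundary, collide just above it, and that at a magnitude of 2*62 neighbouring floats are 1024 apart.

```python
import math

LIMIT = 2**53  # 53 bits of precision in an IEEE 64-bit float

# Just below the limit, consecutive whole numbers are still distinct floats.
exact_below = float(LIMIT - 1) + 1.0 == float(LIMIT)

# Just above it, 2**53 + 1 has no float of its own and rounds to 2**53.
collides = float(LIMIT + 1) == float(LIMIT)

# Spacing between adjacent floats at magnitude 2**62: each bit pattern
# stands for a whole block of integers.
gap_at_2_62 = math.ulp(float(2**62))

print(exact_below)   # True
print(collides)      # True
print(gap_at_2_62)   # 1024.0
```

`math.ulp` requires Python 3.9 or later.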

In APLX64 the Floor ⌊ and Ceiling ⌈ primitives have been modified so that, given a float number greater than or equal to 2*53, the number is considered to have overflowed precision, and hence the primitives return the float value unchanged (as a 64-bit float). This is effectively the same behaviour as already happens in 32-bit APLs at 2*31. The reasoning here is that it is wrong to appear to create a spurious precision by choosing one particular 64-bit integer to represent the floor or ceiling, when the interpreter could equally validly choose many other integers.

For the same reason, any float greater than 2*53 cannot be used in expressions which require an exact integer (for example, to index an array, or as a file pointer). A DOMAIN ERROR will be reported.
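The rule described above can be sketched in Python as follows. This is a model of the stated behaviour, not MicroAPL's implementation: the function names are hypothetical, `ValueError` stands in for DOMAIN ERROR, negative numbers are assumed to be treated symmetrically, and the same "greater than or equal to 2*53" threshold is assumed for both checks. Integer tolerance is ignored here for simplicity.

```python
import math

PRECISION_LIMIT = 2.0**53  # floats at or beyond this magnitude are inexact

def floor_like_aplx64(x):
    """Hypothetical model of Floor: integer result below the limit,
    the float returned unchanged once precision has overflowed."""
    if isinstance(x, int):
        return x                      # already an exact integer
    if abs(x) >= PRECISION_LIMIT:
        return x                      # do not invent spurious precision
    return math.floor(x)              # Python returns an exact int here

def require_exact_integer(x):
    """Hypothetical check for contexts such as array indexing or file
    positions; ValueError stands in for DOMAIN ERROR."""
    if isinstance(x, int):
        return x
    if abs(x) >= PRECISION_LIMIT or x != math.floor(x):
        raise ValueError("DOMAIN ERROR")
    return int(x)

print(floor_like_aplx64(float(2**48)))        # 281474976710656
print(type(floor_like_aplx64(float(2**62))))  # <class 'float'>
```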

Integer tolerance

According to the APL2 Programming Language Reference:

A number R is treated as integer if the difference between R and some integer is less than approximately 1E¯13×1⌈|R .

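This definition would have strange consequences for large numbers. Once R reaches about 1E13 (approximately 2*43), the tolerance 1E¯13×|R| is about 1, so every floating-point number of that size or larger lies within tolerance of some integer. In other words, ALL floating-point numbers greater than 1E13 would be regarded as integers.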

To avoid this problem, APLX64 applies the following rules:

• If the resulting integer would fit in a 32-bit integer, we adopt the existing APL2 rule.

• For larger integers, we use a fixed distance, the same as that which we would use for 2*32, i.e.

1E¯13 × biggest 32-bit integer => 0.0000488

This has the desirable consequence that 10,000,000,000,000.5 is not regarded as an integer.
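A minimal Python sketch of the two rules, under stated assumptions, may make the difference concrete. The function names are hypothetical, the nearest integer is found with Python's `round`, and the fixed distance is computed exactly as the rule is worded (1E¯13 times the largest signed 32-bit integer, roughly 0.000215 in this sketch). The conclusion of the example does not depend on the precise value of that distance.

```python
INT32_MAX = 2**31 - 1
RELATIVE_TOLERANCE = 1e-13
FIXED_DISTANCE = RELATIVE_TOLERANCE * INT32_MAX  # about 0.000215 in this sketch

def near_integer_apl2(r):
    """The APL2 rule quoted above: tolerance grows with the size of R."""
    return abs(r - round(r)) < RELATIVE_TOLERANCE * max(1.0, abs(r))

def near_integer_fixed(r):
    """Sketch of the modified rule: beyond 32-bit range the tolerance
    stops growing and a fixed distance is used instead."""
    nearest = round(r)
    if abs(nearest) <= INT32_MAX:
        return near_integer_apl2(r)
    return abs(r - nearest) < FIXED_DISTANCE

r = 10_000_000_000_000.5
print(near_integer_apl2(r))    # True: tolerance is about 1.0 at this size
print(near_integer_fixed(r))   # False: 0.5 is far outside the fixed distance
```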

Comparison tolerance

In APLX64 the default value of ⎕CT has been reduced from 1E¯13 to 3E¯15. This is a compromise between a value which is small enough to distinguish X from X+1 at high values of X, and not giving false negatives for true float comparisons because of calculation and representational inaccuracies. The new default value means that, for X up to 2*48, the expression X=X+1 always returns 0, irrespective of the internal representation of X.
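The effect of the smaller ⎕CT can be checked numerically. The Python sketch below assumes the conventional APL definition of tolerant equality (A equals B when the magnitude of A-B does not exceed ⎕CT times the larger magnitude of A and B); the article does not spell this definition out, and the function name is illustrative.

```python
def tolerant_equal(a, b, ct=3e-15):
    """Assumed tolerant equality: |a-b| <= ct * max(|a|, |b|)."""
    return abs(a - b) <= ct * max(abs(a), abs(b))

x48 = float(2**48)
x50 = float(2**50)

print(tolerant_equal(x48, x48 + 1))             # False: distinct at ct=3E-15
print(tolerant_equal(x48, x48 + 1, ct=1e-13))   # True: old default merges them
print(tolerant_equal(x50, x50 + 1))             # True: beyond 2*48 they merge again
```

With a tolerance of 3E¯15 the product ⎕CT×X first reaches 1 at about 3.3E14, which is just above 2*48 (about 2.8E14). This is why 2*48 is the boundary quoted above.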

Default display of numbers

In APLX64 the rules for the default display of numbers have been changed. Numbers represented internally as integers are displayed in full precision irrespective of ⎕PP (this is also true in most 32-bit APLs, although it may not be obvious because of the limited allowed range of ⎕PP). In addition, numbers internally represented as floats which are less than 2*53, and which are ‘exact’ integers, are also displayed in full precision irrespective of ⎕PP. The practical effect of this is that, at the point where the floats lose precision and cannot be converted back to integers, the default display switches into E format. Below that, true 64-bit integers, and floats which are close to or exactly integers, both display in the same way (full precision).

Example

The following sequence illustrates how this all works:

      BIGINT←2*48
      BIGINT                          ⍝ 64-bit integer
281474976710656
      ⎕DR BIGINT
2                                     ⍝ Data representation 2 means Integer
      BIGFLOAT←1.0×BIGINT             ⍝ Multiply by float
                                      ⍝ forces result to float
      BIGFLOAT
281474976710656                       ⍝ Looks the same as BIGINT, though.
                                      ⍝ It could be used as a file position,
                                      ⍝ array index, etc
      ⎕DR BIGFLOAT
3                                     ⍝ .. but Data Representation 3
                                      ⍝ i.e. float
      ⌊BIGFLOAT
281474976710656                       ⍝ Floor produces same whole number.
                                      ⍝ Good!
      ⎕DR ⌊ BIGFLOAT
2                                     ⍝ Internally converted to integer
      BIGINT = BIGINT+1
0
      BIGFLOAT = BIGFLOAT+1
0                                     ⍝ Distinct numbers at default ⎕CT

      VERYBIGINT←2*62                 ⍝ A rather bigger 64-bit integer
      VERYBIGINT
4611686018427387904
      ⎕DR VERYBIGINT
2
      VERYBIGFLOAT←1.0×VERYBIGINT     ⍝ Force it to 64-bit float form
      ⎕DR VERYBIGFLOAT
3                                     ⍝ Data Representation 3, i.e. float
      VERYBIGFLOAT
4.611686018E18                        ⍝ Lost precision:
                                      ⍝ displays in E format.
                                      ⍝ It can NOT be used as a file
                                      ⍝ position, array index, etc
      ⌊VERYBIGFLOAT
4.611686018E18                        ⍝ Floor cannot restore the lost
                                      ⍝ precision
      ⎕DR ⌊VERYBIGFLOAT
3                                     ⍝ so it returns the same float
                                      ⍝ number
      VERYBIGINT+1
4611686018427387905                   ⍝ Great! We can add 1 to a
                                      ⍝ 64-bit integer!
      VERYBIGINT = VERYBIGINT+1
0                                     ⍝ Integer comparison:
                                      ⍝ They are distinct
      VERYBIGFLOAT=VERYBIGFLOAT+1
1                                     ⍝ Float comparison:
                                      ⍝ Same (within ⎕CT)
      VERYBIGFLOAT+1
4.611686018E18                        ⍝ Actually, the addition does
                                      ⍝ nothing.
                                      ⍝ We have only 53 bits of precision,
                                      ⍝ so the extra 1 is lost off the end
                                      ⍝ for a number of magnitude 2*62

Summary of integer-float conversion issues

The practical effect of these design choices is that, for whole numbers below 2*48, the APL programmer does not need to know or care whether the number is internally represented as a float or as a 64-bit integer; it will behave and display in the same way, and comparisons will always give the expected result. Any conversion between the two internal forms loses no precision, and hence is reversible (e.g. using Floor or Ceiling). Either representation can be used to index an array, or represent a position in a huge native file.

For numbers between 2*48 and 2*52, the same is true, except that the APL programmer might need to reduce ⎕CT to avoid comparison problems, or alternatively use Floor or Ceiling to force the numbers to integer before doing a compare.

Above 2*52, if the APL programmer needs exact integers (for example, for doing high-precision arithmetic, or if the integers are 64-bit database record numbers), APLX64 can correctly handle this requirement. However, in this case the APL programmer needs to be careful to ensure that the integers do not accidentally get converted to float (for example, by mixing record numbers and float values in a single N×2 matrix, or by doing arithmetic operations which are intrinsically non-integer, such as divide). Fortunately, if this does happen, it should be obvious, because the display will flip into E format at the point where precision has been lost, and operations which require an integer will give DOMAIN ERROR rather than giving the wrong answer.
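The danger of an accidental conversion can be illustrated outside APL. In the Python sketch below (illustrative only, with made-up record numbers), a large 64-bit record number is pushed through a 64-bit float and back, as would happen if it were stored alongside float values. A smaller number survives the same round trip.

```python
# Hypothetical 64-bit database record numbers.
large_id = 2**62 + 1
small_id = 2**48 + 1

# Simulate an accidental conversion to 64-bit float and back.
large_round_trip = int(float(large_id))
small_round_trip = int(float(small_id))

print(large_round_trip == large_id)   # False: the low-order bits were lost
print(large_id - large_round_trip)    # 1
print(small_round_trip == small_id)   # True: below 2*53 nothing is lost
```

Python silently returns the altered value here, whereas the behaviour described above (E-format display and DOMAIN ERROR) is designed to make the loss visible.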
