Arrays in Forth


A natural question that beginners often ask is: Why doesn't Forth have features that are standard in other languages, for example, arrays? The answer is that Forth is so facile at creating new data types that it is usually easier to invent something that exactly suits your needs than it is to force your program to conform to an arbitrary standard.


Indexed and unindexed arrays

The term array in Forth has come to signify two different kinds of structures, which I distinguish by calling them unindexed and indexed.

An unindexed array allots a specified number of bytes at compile-time and returns the address of its origin at run-time. It is not really an array, in my opinion, but a work area or buffer. The defining word ARRAY in F-PC is of this type. Here it is, suitably renamed, with an example:

: unindexed-array ( n -- ) ( -- a) 
     create allot ;
80 unindexed-array u-foo   \ Make an 80-byte unindexed array
u-foo                      \ Return the origin addr of u-foo
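
To show what such a work area is good for, here is a minimal sketch that stores two characters in u-foo and reads them back. The definition is repeated so the example stands alone; the choice of characters is only an illustration.

```forth
: unindexed-array ( n -- ) ( -- a )
     create allot ;

80 unindexed-array u-foo       \ an 80-byte work area

char A u-foo c!                \ store a character in byte 0
char B u-foo 1 chars + c!      \ store a character in byte 1
u-foo c@ emit                  \ fetch byte 0 and print it
u-foo 1 chars + c@ emit        \ fetch byte 1 and print it
```

Note that the address arithmetic is done by hand. Nothing in u-foo knows about elements or indexes.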

An indexed array regards the area as divided into elements of equal length. It replaces an integer n at the top of the stack by the address of the nth element.

"Variable-like" and other kinds of indexed arrays

I call an indexed array variable-like if it returns an address. One can also have value-like arrays that return their contents, execution arrays ("vectors") that perform the contained action, and mixed arrays composed of several of these types.
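
As a rough sketch of the value-like and execution kinds, assuming elements one cell long: the names value-array, execution-array, v-foo, greet, hello and bye are hypothetical, and I store into the arrays through >body only because these arrays do not return addresses.

```forth
\ Value-like array: indexing returns the contents of the cell.
: value-array ( n -- ) ( i -- x )
     create cells allot
     does> swap cells + @ ;

\ Execution array: indexing executes the word whose xt is in the cell.
: execution-array ( n -- ) ( i -- )
     create cells allot
     does> swap cells + @ execute ;

4 value-array v-foo
42 ' v-foo >body 3 cells + !       \ store 42 in element 3
3 v-foo .                          \ prints 42

: hello ( -- ) ." hello " ;
: bye   ( -- ) ." goodbye " ;

2 execution-array greet
' hello ' greet >body !            \ element 0 performs hello
' bye   ' greet >body 1 cells + !  \ element 1 performs bye
0 greet                            \ prints hello
1 greet                            \ prints goodbye
```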

Many Forth implementations provide a variable-like indexed array called simply array. It creates a one-dimensional variable-like array with elements one cell in length. Most programmers think of this when "array" is mentioned, and may even assume that it is standard. Its popularity is warranted, as it is quick and simple. Note that using 3 as the index returns the address of the fourth element, because numbering conventionally starts with zero in Forth.

If you wish, you can change the definition so that 1 becomes the first element. If you want an array of size 100 indexed from 1, though, it is probably simpler to define it with 101 elements and ignore element 0.

: array ( n -- ) ( i -- addr)
     create cells allot
     does> swap cells + ;
100 array foo              \ Make an array with 100 cells
3 foo                      \ Return address of fourth element

Remember, zero is the first element.

Note (after a comment by Mike Condron, 08/07/2019): the swap after does> is required. At run-time the address of the array's data field is on top of the stack, above the index. Without the swap, cells would be applied to that address rather than to the index.
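
Here is a short sketch of array in use, together with a one-based variant for comparison. The variant, which I call array1, and the name bar are my own illustrations and not something your Forth is likely to provide.

```forth
\ Zero-based, with swap after does> so that cells scales the index.
: array ( n -- ) ( i -- addr )
     create cells allot
     does> swap cells + ;

\ Hypothetical one-based variant: subtract 1 from the index first.
: array1 ( n -- ) ( i -- addr )
     create cells allot
     does> swap 1- cells + ;

100 array foo
100 array1 bar

1234 3 foo !      \ store into the fourth element of foo
3 foo @ .         \ prints 1234
5678 1 bar !      \ store into the first element of bar
1 bar @ .         \ prints 5678
```

Neither version checks that the index is in range.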


A flexible array

The elements of array are limited to a single cell in length. This is not adequate even for our first adventure game, Game 0, which will need five cells to store data for each room: one for the room descriptor, and four for the destinations.

The defining word long-element-array will create variable-like arrays whose elements are any desired length. At compile-time, the number of elements and their length in cells are on the stack, and at run-time, an index returns the address of the corresponding element. In the example, the array th-room provides for 10 rooms of 5 cells each.

: long-element-array ( n len -- ) ( i -- addr)
     create  dup ,  * cells allot        \ store len, then reserve n*len cells
     does>   dup @ rot * cells + cell+ ; \ step i*len cells past the len cell
10 5 long-element-array th-room  \ Create array for 10 rooms
4 th-room                        \ Find address of room 4
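
A quick way to convince yourself that the address arithmetic is right is to measure the distance between two neighbouring elements. In this sketch the run-time code uses rot to bring the index to the top and cell+ to step over the cell in which the element length is stored; treat that detail as my assumption about the intended layout.

```forth
: long-element-array ( n len -- ) ( i -- addr )
     create  dup ,  * cells allot
     does>   dup @ rot * cells + cell+ ;

10 5 long-element-array th-room

4 th-room 3 th-room - 1 cells / .   \ prints 5: elements are 5 cells apart
77 4 th-room !                      \ store into the first cell of room 4
4 th-room @ .                       \ prints 77
```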


Subfields

Dividing array elements into fields

Since the records of an array may contain multiple data items, we need a way to access their components. The principle is simple: add an offset. To make it easy, we will construct a defining word to create the offsets.

Before defining a group of offsets, reset the variable current-offset to 0. Then define each offset according to how many cells it will contain.

variable current-offset
: offset ( n -- ) ( addr -- addr')
     create current-offset @ ,   \ record where this field starts
     current-offset +!           \ advance by n cells for the next field
     does> @ cells + ;

In Game 0, each room needs 5 cells: 1 to contain the room-descriptor (actually a pointer to a Forth word that will print the string) and 4 for directions, north, east, south, and west.

I name the offsets with a right curly-bracket, which is pronounced "to." Remember that an offset finds an address. You still have to fetch or otherwise manipulate the data at that address.

0 current-offset !           \ Set variable to 0
     1 offset }descriptor
     1 offset }north
     1 offset }east
     1 offset }south
     1 offset }west

\ Examples:
     3 th-room }north @     \ Rm#   The room north of room 3
     4 th-room }descriptor @ execute
                            \       Print the description of room 4
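
The following sketch puts the pieces together: it fills in two fields of room 3 and reads them back. It assumes an offset that advances current-offset after each definition and a long-element-array that steps over its stored length. The word .cellar and the room numbers are made up for the illustration.

```forth
: long-element-array ( n len -- ) ( i -- addr )
     create  dup ,  * cells allot
     does>   dup @ rot * cells + cell+ ;

variable current-offset
: offset ( n -- ) ( addr -- addr' )
     create current-offset @ ,  current-offset +!
     does> @ cells + ;

0 current-offset !
     1 offset }descriptor
     1 offset }north
     1 offset }east
     1 offset }south
     1 offset }west

10 5 long-element-array th-room

: .cellar ( -- ) ." A damp cellar." ;   \ hypothetical description word

' .cellar 3 th-room }descriptor !   \ room 3 prints the cellar text
7 3 th-room }north !                \ room 7 lies north of room 3

3 th-room }north @ .                \ prints 7
3 th-room }descriptor @ execute     \ prints A damp cellar.
```

After the five definitions current-offset holds 5, which is a handy check that the fields add up to the element length.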

"Faking" an indexed array

Much of the above-described apparatus is unnecessary if your program has only one or two arrays.

  • Step 1. Create an unindexed array or work area. You may use unindexed-array described above, or simply allot the required number of bytes. It is convenient to enclose the name of this array in parentheses.
    create (rooms) 200 allot
  • Step 2. Create a word that will calculate the address, given the index. The word room assumes that there are 20 bytes assigned to each element of the array.
    : room ( i -- a )
         20 * (rooms) + ;
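
The figure of 20 bytes ties the code to one particular cell size. If each element is meant to hold a fixed number of cells, a variation like the following lets the system work out the byte count. The names #rooms and /room, and the choice of 10 rooms of 5 cells, are my assumptions.

```forth
10 constant #rooms                \ hypothetical: number of rooms
5 cells constant /room            \ bytes per room, whatever the cell size

create (rooms)  #rooms /room * allot

: room ( i -- a )
     /room * (rooms) + ;

99 2 room !       \ store into the first cell of room 2
2 room @ .        \ prints 99
```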

An array of bits

It is possible to make an area of RAM into an array of bits, each of which can be accessed by number, as described in Learning Forth Bit by Bit.
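
To give the flavor of the idea, here is a minimal sketch of a bit array. It is my own illustration, not the code from that article. It assumes 8-bit bytes and uses erase from the Core Extension word set; all the names are hypothetical.

```forth
\ Reserve enough bytes for n bits and clear them all.
: bit-array ( n -- ) ( -- a )
     create  7 + 8 /  here over allot  swap erase ;

\ Convert a bit number and array address to a mask and a byte address.
: bit-addr ( i a -- mask addr )
     over 8 / +                  \ address of the byte holding bit i
     swap 7 and 1 swap lshift    \ mask for the bit within that byte
     swap ;

: set-bit   ( i a -- )     bit-addr dup c@ rot or swap c! ;
: clear-bit ( i a -- )     bit-addr dup c@ rot invert and swap c! ;
: bit?      ( i a -- flag ) bit-addr c@ and 0<> ;

64 bit-array flags
10 flags set-bit
10 flags bit? .     \ prints -1 (true)
11 flags bit? .     \ prints 0
10 flags clear-bit
10 flags bit? .     \ prints 0
```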


The following code meets requirements for an ANS Forth standard program.

Updated: 7/25/96

Source code

: unindexed-array ( n -- ) ( -- a) 
     create allot ;
80 unindexed-array u-foo   \ Make an 80-byte unindexed array
u-foo                      \ Return the origin addr of u-foo

: array ( n -- ) ( i -- addr)
     create cells allot
     does> swap cells + ;
100 array foo              \ Make an array with 100 cells
3 foo                      \ Return address of fourth element

: long-element-array ( n len -- ) ( i -- addr)
     create  dup ,  * cells allot
     does>   dup @ rot * cells + cell+ ; \ step i*len cells past the len cell
10 5 long-element-array th-room  \ Create array for 10 rooms
4 th-room                        \ Find address of room 4

variable current-offset
: offset ( n -- ) ( addr -- addr')
     create current-offset @ ,   \ record where this field starts
     current-offset +!           \ advance by n cells for the next field
     does> @ cells + ;

0 current-offset !           \ Set variable to 0
     1 offset }descriptor
     1 offset }north
     1 offset }east
     1 offset }south
     1 offset }west

\ Examples:
     3 th-room }north @     \ Rm#   The room north of room 3
     4 th-room }descriptor @ execute
                            \       Print the description of room 4