Welcome to RetroForth

RetroForth is an implementation of the Forth programming language. It provides a flexible set of words, a block editor, and an interactive environment in one small package. The implementation is built over the Rx Core, a very small library that provides the core functionality needed by a Forth system. The development of Rx Core and RetroForth are related. This handbook is based on the Rx Handbook, and some areas will reflect that.

RetroForth does not follow any of the various standards for Forth that have been created over the years. Rather, the words and their behaviour have evolved over a period of eight years to serve those using it. The code is continually being refined: new things are added, and obsolete things are phased out. It is an active codebase.

What are word classes?

A concept called \i word \i classes is at the heart of RetroForth. Word classes are a way to group related words, based on their compilation and execution behaviors. A special word, called a \i class \i handler, is defined to handle an execution token (xt) passed to it on the stack. The compiler uses a variable named class to set the default class when compiling a word. We'll take a closer look at class later.

Rx provides several classes with differing behaviors:

------------------------------------------------
.class ( xt -- )
  The class handler for the provided classes. During compilation, a call to the xt is compiled. During interpretation, the xt is executed.
.data ( xt -- )
  .data is used for data structures and numbers. It will leave the xt on the stack during interpretation and compile the xt as a literal during compilation.
.forth ( xt -- )
  The class handler for forth words. Semantics are the same as with .class
.self ( xt -- )
  Words using this class are always called, unless used in a definition with a prefixing word immediately before them. Generally words using this class will need to be state-smart.
.macro ( xt -- )
  Used for compiler macros. Words with a class of .macro are always called during compilation (unless preceded by a prefixing word). They are ignored during interpretation.
.inline ( xt -- )
  This is a special class used for certain primitives. It will copy the machine code (up to a hex value of $C3, the ret instruction) into the current definition during compilation. During interpretation the word is called normally. Only words that do not contain the byte $C3 and have no calls to other words can use the .inline class.
------------------------------------------------

The class mechanism is not limited to these classes. You can write custom classes at any time. On entry, the custom handler should take the XT passed on the stack and do something with it. Generally the handler should also check the compiler state to determine what to do in either interpretation or compilation. As a simple example, let's take a look at the \b .data handler, written in Forth:

------------------------------------------------
( handler for data types )
( This compiles the XT as a literal in a definition. If interpreting, just leave )
( the XT on the data stack. )
( For constants, we make use of a clever trick. We set the XT to the value of the )
( constant and give it a class of .data allowing it to work without requiring any )
( special code or create/does> constructs )
: .data ( xt -- xt | ) state @ if literal, then ;
------------------------------------------------

When writing programs in Rx, you will need to set the class of each word you write. There are numerous ways to do this. The example below shows how each method appears in source. We'll use the \b .forth class in the examples, but the same approach applies to all of the classes.

------------------------------------------------
' .forth class !             ( set the default class to .forth )
' .forth reclass             ( set the class of the last defined word to .forth )
' .forth reclass: wordname   ( set wordname to the .forth class )
------------------------------------------------

Because changing the default class will be done regularly, the main classes also have wrapper functions provided. These wrappers are given a class of \b .forth and are listed below. In general, it's recommended to use these wrappers to set the default class for readability reasons.

------------------------------------------------
forth ( -- )
  Set .forth as the active class
macro ( -- )
  Set .macro as the active class
self ( -- )
  Set .self as the active class
inline ( -- )
  Set .inline as the active class
------------------------------------------------

Dictionary Headers

The Rx Core has a single dictionary consisting of a linked list of headers. The current form of a header is shown in the chart below. Pay special attention to the accessors. Each of these words corresponds to a field in the dictionary header. When dealing with dictionary headers, it is recommended that you use the accessors to access the fields, since it is expected that the exact structure of the header will change over time.

------------------------------------------------
accessor: :link    offset: 0
Use: Link to the previous header. If set to 0, no prior entries are visible.

accessor: :xt      offset: 4
Use: Link to the compiled code definition. For constants, the value of the constant is stored in this field.

accessor: :class   offset: 8
Use: Link to the class handler.

accessor: :doc     offset: 12
Use: Store a pointer to a documentation string, source, or other documentation source.

accessor: :name    offset: 16
Use: The name of the word, stored as a packed string.
------------------------------------------------

Word names are stored in the dictionary as packed strings. Use \b unpack to obtain a counted string from a packed string.
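
As a small, untested sketch of the accessors in use, the following prints the name stored in the most recently created header. It uses only words described in this handbook (\b last, \b :name and \b unpack from the glossary, \b type and \b cr as they appear in the error handler example); the name show-last is made up for illustration.

------------------------------------------------
( print the name stored in the newest dictionary header )
( show-last is a hypothetical name, not part of Rx )
: show-last ( -- ) last @ :name unpack type cr ;
show-last
------------------------------------------------

Going through \b :name rather than adding a fixed offset keeps code like this working if the header layout changes.
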
In memory, a packed string looks like:

------------------------------------------------
count (one byte)
actual string (a sequence of bytes)
------------------------------------------------

Because a single byte is used for the count, word names are limited to 255 characters.

The most recently created header can be accessed by using a variable named \b last. To access a specific header, you can use \b >entry to get the header that points to a specified XT. A special variable named \b which is also updated by find each time a word is looked up in the dictionary. This variable always points to the entry of the most recently found word.

Having only a single dictionary might seem limiting; however, both vocabularies and lexical scope have been implemented on top of it. We will look at each of these now.

Lexical Scope

There will often be times you would like to "hide" some words that you only use once or twice to build other words. In Rx, the lexical scoping words \b loc: and \b ;loc provide the ability to do this. A typical use looks like:

------------------------------------------------
loc:                          ( start a lexically scoped sequence )
  loc:                        ( start a lexically scoped sequence )
    10 constant ten
    20 constant twenty
    10 20 * constant two-hundred
    here ] two-hundred ;      ( anonymous definition )
    words                     ( debugging; show words in dictionary )
  ;loc is value               ( close the lexically scoped sequence and reveal )
                              ( the anonymous definition as "value" )
  words                       ( debugging; show words in dictionary )
  : a value 10 / ;
  : b value 20 / ;
  here ] a b * . cr ;         ( anonymous definition )
;loc is foo                   ( close the lexically scoped sequence and reveal )
                              ( the anonymous definition as "foo" )
words                         ( debugging; show words in dictionary )
------------------------------------------------

As you can see in the example above, when a lexically scoped definition is closed, the headers for all words inside the scoped region are removed.
Only words explicitly revealed are kept in the global dictionary. Also, as is shown in the example, you can nest lexically scoped sequences. Nesting is possible up to four levels deep.

Lexical scopes are used frequently by the developers of Rx and allow a good level of factoring without polluting the actual dictionary with words that are not needed outside a select few definitions.

Handling Syntax Errors

By default, the Rx Core ignores syntax errors. This behavior can be altered freely by those using Rx by replacing the \b word? hook with a new definition. The \b word? hook is the "last chance" error handler. When the interpreter obtains a token it tries to process it in a defined order:

------------------------------------------------
1. Look up the token in the dictionary. If found, execute its class handler
2. Attempt to convert the token to a number. If valid, execute the .data class handler to determine the course of action to take
3. Call word? to hopefully resolve the problem.
------------------------------------------------

You can extend this or simply report errors by changing \b word? to point to a new definition. Because \b word? is a vectored word, this is easily done. A simple example is shown below.

------------------------------------------------
( Example of a custom error handler. )
here is word? ] ( ac-f ) ." could not identify: " type cr false ;
------------------------------------------------

This replaces the default definition of \b word? with a new handler that displays a nice error message. Of course it is possible to write significantly more flexible error handlers. For instance, you could write a handler for entering floating point numbers, say \b >float, and hook that into \b word? to allow the user to enter floating point numbers directly at the interpreter.

Vectored Words

Many words in Rx are defined as vectored words. These are similar to, but more flexible than, deferred words in ANS Forth.
A vectored word will have vector as the first word in its definition. This is the only step necessary to make a word vectorable. Four words are provided for creating and working with vectors.

------------------------------------------------
vector ( - )
  Put this as the first word in a definition to make it into a vectorable word.
is ( a"- )
  Set the following wordname to point to the address on the stack. If the word does not exist, create it as a vector defaulting to the passed address.
devector ( "- )
  Remove the vector set by is.
default: ( a"- )
  Used in a special form, e.g.:
    : foo vector vector ;
  This will set the *second* vector to the passed address.
------------------------------------------------

As an example:

------------------------------------------------
( simple example of vectors )
: a vector 12 ;
: b vector 23 ;
: c vector a b * ;
: d a . ." times " b . ." equals " c . cr ;
d
here is a ] 23 ; d
here is b ] 13 ; d
here is c ] a b + ; d
devector a d
devector b d
devector c d
------------------------------------------------

While a bit simplistic, this does show how to use vectors. Vectors can also be used with lexically scoped sequences to make revealing multiple words cleaner:

------------------------------------------------
( another example )
: a vector ;
: b vector ;
loc:
  10 constant ten
  20 constant twenty
  : * later prior * ;
  : + later prior + ;
  10 + 20 constant thirty
  20 * 20 constant four-hundred
  here is a ] ." 10+20=" thirty . cr ;
  here is b ] ." 20*20=" four-hundred . cr ;
;loc
------------------------------------------------

Learning to use vectored words will allow you to explore new options in your programming.

Word Prefixes

Release 9.2 of RetroForth introduced \i word \i prefixes, which are single character prefixes that can enhance readability and add new features. \b More \b to \b come.
------------------------------------------------
~name         -- return the xt of name
`name         -- compile a call to name
!name @name   -- store and fetch from a variable called name
+name -name   -- add to and subtract from the variable name's current contents
------------------------------------------------

Quotes

\b coming \b soon

Can be nested up to four levels deep. Useful inside words or at the console.

------------------------------------------------
[[ ]]
------------------------------------------------

Appendix 1: Glossary of Words

------------------------------------------------
.class ( xt -- ) .class
  Class handler for the provided classes
.forth ( xt -- ) .class
  Class handler for Forth words
.macro ( xt -- ) .class
  Class handler for compiler macros
.inline ( xt -- ) .class
  Class handler for Forth words that can be inlined
.data ( xt -- ) .class
  Class handler for basic data structures
forth ( -- ) .forth
  Set the .forth class as the active class
macro ( -- ) .forth
  Set the .macro class as the active class
self ( -- ) .forth
  Set the .self class as the active class
inline ( -- ) .forth
  Set the .inline class as the active class
entry ( addr count -- ) .forth
  Create a dictionary entry pointing to here. A string containing the name of the word to be created is passed on the stack. The word is created with the current class.
lookup ( addr count -- xt flag | addr count flag ) .forth
  Attempt to find a word in the dictionary. If successful, it returns the address of the word and the true value. If unsuccessful, it leaves the string containing the word to search for on the stack and returns a value of false.
>number ( addr count -- n flag | addr count flag ) .forth
  Attempt to convert a string to a number in the current base. If successful, leave the number and a value of true on the stack. If unsuccessful, leave the string containing the number on the stack and return a value of false.
] ( -- ) .forth
  Turn the compiler mode on. This sets state to -1.
compile ( xt -- ) .forth
  Compile a call to the specified xt.
  Technical note: The call instruction is relative. To obtain the address of a word from a compiled definition, try something like:
    : foo words ;
    ' foo 1+ @ here + 5 +
  This will adjust the first call (1+ skips over the call opcode) to point to the valid address for the function used. The "5 +" is the adjustment, based on the size of a call. Call opcodes on x86 are 5 bytes long.
, ( n -- ) .forth
  Place a cell (4 byte) value into the heap
1, ( n -- ) .forth
  Place a single byte into the heap
2, ( n -- ) .forth
  Place a word (2 bytes) into the heap
3, ( n -- ) .forth
  Place three bytes into the heap
eval ( addr count -- ) .forth
  Evaluate (using the current compiler state) the string. The results of the evaluation are left on the stack.
parse ( delimiter -- addr count ) .forth
  Search ahead in the tib for the delimiter. This skips leading whitespace and, if the delimiter is not found, will return a string containing the rest of the input line.
reset ( ... -- ) .forth
  Remove all values from the stack
last ( -- addr ) .forth
  Get the address of a pointer to the most recent dictionary entry
tib ( -- addr ) .forth
  Get the address of the text input buffer
word? ( addr count -- flag ) .forth
  Last-chance error handler for syntax errors. Receives a string containing a token that was not located in the dictionary or converted to a number. The returned flag should tell whether it was handled successfully or not.
not ( n -- n ) .forth
  Perform a bitwise NOT operation on the TOS
@ ( addr -- n ) .forth
  Fetch a cell-sized value from the address specified
! ( n addr -- ) .forth
  Store a cell-sized value into the address specified
w@ ( addr -- n ) .forth
  Fetch a word-sized value from the address specified
w! ( n addr -- ) .forth
  Store a word-sized value into the address specified
c@ ( addr -- n ) .forth
  Fetch a single byte from the address specified
c! ( n addr -- ) .forth
  Store a single byte into the address specified
+ ( x y -- z ) .forth
- ( x y -- z ) .forth
* ( x y -- z ) .forth
/ ( x y -- z ) .forth
mod ( x y -- z ) .forth
  These are the basic math words. Insert the operation between the "x" and "y" in the stack comment to convert to infix notation.
/mod ( x y -- z n ) .forth
  Divide, returning the result and the remainder (x/y)
wsparse ( "word" -- addr count ) .forth
  Parse ahead in the tib until either a space or end of line is reached.
lnparse ( "line" -- addr count ) .forth
  Parse ahead in the tib until the end of the line is reached
' ( "name" -- xt | addr count ) .forth
  Return the address of a function in the dictionary or the name of the word if it doesn't exist.
  Technical note: This word will set which to the dictionary entry of the word if found.
x' ( "name" -- xt flag | addr count flag ) .forth
  Return the address of a function in the dictionary or the name of the word if it doesn't exist. Also returns a flag that can be used with the conditional words.
>> ( x y -- z ) .forth
  Perform a bitwise right shift
<< ( x y -- z ) .forth
  Perform a bitwise left shift
here ( -- addr ) .forth
  Return the address of the top of the heap
:link ( dt -- addr ) .forth
  Return the address of the link field in a dictionary entry
:xt ( dt -- addr ) .forth
  Return the address of the xt field in a dictionary entry
:class ( dt -- addr ) .forth
  Return the address of the class field in a dictionary entry
:doc ( dt -- addr ) .forth
  Return the address of the docstring
:name ( dt -- addr ) .forth
  Return the address of the name field in a dictionary entry. Use count to convert this to a string.
reclass ( class -- ) .forth
  Change the class of the most recently defined word to the specified class.
reclass: ( xt "name" -- ) .forth
  Change the class of the word specified by name to the specified class
cells ( x -- y ) .forth
  Return x multiplied by the size of a cell (4 bytes)
cell+ ( x -- y ) .forth
  Add the size of a cell (4 bytes) to x
cell- ( x -- y ) .forth
  Subtract the size of a cell (4 bytes) from x
word+ ( x -- y ) .forth
  Add the size of a word (2 bytes) to x
word- ( x -- y ) .forth
  Subtract the size of a word (2 bytes) from x
rot ( x y z -- y z x ) .forth
  Rotate the stack so that the third element is on top.
-rot ( x y z -- z x y ) .forth
  Rotate the stack twice. Basically the same as doing rot rot.
over ( x y -- x y x ) .forth
  Place a copy of NOS above TOS
tuck ( x y -- y x y ) .forth
  Place a copy of TOS under NOS
2dup ( x y -- x y x y ) .forth
  Duplicate the top two items on the stack
later ( rs:x rs:y -- rs:y rs:x ) .forth
  Defer execution of the rest of the definition until the caller finishes executing.
0; ( n -- n | ) .forth
  Exit a word if the TOS is equal to 0, dropping the TOS before exiting. If the TOS is non-zero, leave it alone and continue.
execute ( xt -- ) .forth
  Call the code that xt points to. This word ignores the class!
  Technical note: This treats the passed xt as a .self word.
  General note: You can use get-class to obtain the class handler for the specified xt.
create: ( "name" -- ) .forth
  Create a new word with a name of name and the current class
literal, ( n -- ) .forth
  Compile a literal into a definition. This is different from the macro word literal which is called at compile time. This word is called at runtime.
literal? ( n -- n | ) .forth
  If compiling, compile 'n' as a literal. If interpreting, leave it on the stack.
create ( "name" -- ) .forth
  Create a new word named name using the .data class.
variable ( "name" -- ) .forth
  Create a variable named name with an initial value of 0
variable: ( n "name" -- ) .forth
  Create a variable named name with an initial value of n.
constant ( n "name" -- ) .forth
  Create a constant named name with a value of n.
+! ( n addr -- ) .forth
  Add n to the contents of the memory location specified by addr.
-! ( n addr -- ) .forth
  Subtract n from the contents of the memory location specified by addr.
allot ( n -- ) .forth
  Allocate n bytes in the heap
alias ( xt "name" -- ) .forth
  Create an alternate dictionary entry for xt with a name of name
loc: ( -- ) .forth
  Start a lexically scoped section of code. These can be nested up to four deep.
;loc ( -- ) .forth
  End the current lexically scoped section of code.
fill ( addr count value -- ) .forth
  Fill count bytes of memory, starting at addr, with the specified value.
move ( source dest count -- ) .forth
  Move count bytes from the memory address specified by source to dest.
copy ( source dest count -- dest count ) .forth
  Copy "count" bytes from source to dest, leaving the destination and count on the stack.
pad ( -- addr ) .forth
  Return the address of the pad, a floating buffer above here which is used for temporary strings and other structures.
" ( "string" -- addr count ) .forth
  Parse ahead until a " is encountered. Place this string in the pad and return a pointer to it.
z" ( "string" -- addr ) .forth
  Like ", but returns a zero-terminated string
$, ( addr count -- ) .forth
  Compile a string into the current definition.
hex ( -- ) .forth
  Switch to base 16
decimal ( -- ) .forth
  Switch to base 10
binary ( -- ) .forth
  Switch to base 2
octal ( -- ) .forth
  Switch to base 8
>entry ( xt -- dt ) .forth
  Given an xt, obtain the corresponding dictionary entry.
unpack ( addr -- addr count ) .forth
  Convert a packed string to a counted string (useful with dictionary entries)
is ( xt "name" -- ) .forth
  Change the vectored word name to point to xt. If name does not exist, create a new vectored word with a default definition of xt.
default: ( xt "name" -- ) .forth
  Change the second vector in a double-vectored word to point to xt.
devector ( "name" -- ) .forth
  Remove the vector of name, restoring the original definition.
zt ( addr count -- addr ) .forth
  Convert a counted string to a zero-terminated string. Cleanup of allocated space is done automatically.
get-class ( xt -- xt class ) .forth
  Obtain the class for a specified xt; if the xt is not found in the dictionary, assumes a default class of .forth
= ( x y -- flag ) .forth
<> ( x y -- flag ) .forth
< ( x y -- flag ) .forth
> ( x y -- flag ) .forth
  Compare x and y, returning a true/false flag that can be used with the new conditionals or 'if'.
t ( flag xt -- ) .forth
  Execute xt if the flag is true
f ( flag xt -- ) .forth
  Execute the xt if the flag is false
t/f ( flag xt1 xt2 -- ) .forth
  Execute xt1 if the flag is true, execute xt2 if the flag is false
on ( addr -- ) .forth
off ( addr -- ) .forth
  'on' sets a variable to 'true', 'off' sets it to false
toggle ( addr -- ) .forth
  Toggle a variable between true and false
( ( "comment" -- ) .self
  Parse ahead until the ) character or end of line is found. Ignore everything that was parsed.
: ( "name" -- ) .self
  Create a new word with a name of "name" and start the compiler. The word is created in the current class.
[[ ( -- ) .self
]] ( -- xt ) .self
  Start and end an anonymous definition. This can be used inside other definitions, and nested up to four levels deep.
; ( -- ) .macro
  End the current definition
  See also: ;;
[ ( -- ) .macro
  Switch to interpretation mode. Sets state to 0.
;; ( -- ) .macro
  Compile an exit to the current word without ending the definition
  Technical note: This word will compile either a "ret" ($c3 opcode) or change the last compiled call opcode to a jump opcode. This provides inherent tail call elimination and allows for safe recursion. As ; uses this, the same applies to it.
literal ( n -- ) .macro
  Compile a literal from the stack into the current definition
x: ( "name" -- ) .macro
  Compile a call to name ignoring the action defined by the word's class. This treats the "name" as a .self word.
['] ( "name" -- ) .macro
  Compile the xt of name into the current word
c: ( "name" -- ) .macro
  Compile the code needed to compile a call to name into the current definition. Does the same thing as: ['] name compile
as ( "class" -- ) .macro
  Change the class of the most recently created word to the specified class
>r ( n -- rs:n ) .macro
  Move TOS onto the return stack.
r> ( rs:n -- n ) .macro
  Move TORS to the data stack
r ( -- n ) .macro
  Get a copy of TORS on the data stack.
  Historical Note: In most Forths this word is called r@. In RetroForth, it has traditionally been called r. We chose to continue with the RetroForth naming in this case.
  Technical Note: This has the same effect as: r> dup >r
rdrop ( rs:n -- ) .macro
  Drop the TORS.
  Technical Note: Has the same effect as: r> drop
repeat ( -- ) .macro
  Start an unconditional loop
again ( -- ) .macro
  Close an unconditional loop, branching back to the most recent repeat
  Technical Note: Loops constructed using repeat and again are properly tail recursive.
for ( n -- rs:counter ) .macro
  Start a counted loop. The counter is put on the return stack
next ( rs:counter -- rs:counter | ) .macro
  Close a loop that starts with for. Decrements the counter. If 0, exit, otherwise branch back to for.
(if) ( n -- ) .macro
  Factor of most of the forms of if
  General Note: There is seldom any reason to call this word in normal code. Its sole purpose is to factor out a chunk of code shared between the other conditionals.
<>if ( x y -- ) .macro
=if ( x y -- ) .macro
<if ( x y -- ) .macro
  We describe these together since they are very similar in functionality. To visualize the action, insert the symbol between "x" and "y" in the stack diagram. <> denotes inequality. These words start a conditional, ending with then.
if ( flag -- ) .macro
  If the flag is true, execute the code between if and then. If not true, skip ahead to then.
then ( -- ) .macro
  Terminate any of the if constructs. This patches the conditional jump to point to the proper offset in the compiled definition.
;then ( -- ) .macro
  Compile an exit to the word and then terminate any if construct.
  Technical Note: The same as: ;; then
if; ( flag -- ) .macro
  If true, then exit the word. Otherwise continue execution of the word.
  Technical Note: The same as: if ;; then
prior ( "word" -- ) .macro
  Compile a call to an earlier definition of word.
vector ( -- ) .macro
  Make the word a vectored word. This must be the first word in the definition if you use it.
s" ( "string" -- addr count ) .macro
  Compile a string into the definition
{ ( "}" -- ) .macro
  Compile a sequence of code to be evaluated at runtime. The compiler state is not altered by this.
dup ( n -- n n ) .inline
  Duplicate the top item on the stack
1+ ( x -- y ) .inline
  Increment the TOS
1- ( x -- y ) .inline
  Decrement the TOS
swap ( x y -- y x ) .inline
  Exchange the location of the top two items on the stack
drop ( x -- ) .inline
  Discard the top element of the stack
nip ( x y -- y ) .inline
  Discard the second element on the stack
2drop ( x y -- ) .inline
  Discard the top two elements on the stack
and ( x y -- z ) .inline
  Perform a bitwise AND operation
or ( x y -- z ) .inline
  Perform a bitwise OR operation
xor ( x y -- z ) .inline
  Perform a bitwise XOR operation
h0 ( -- addr ) .data
  Variable pointing to the top of the heap
base ( -- addr ) .data
  Variable holding the current base
>in ( -- addr ) .data
  Variable that holds the current address in the sequence being evaluated.
class ( -- addr ) .data
  Variable that stores the address of the current class handler for creating new words.
state ( -- addr ) .data
  Variable which contains the state of the compiler. true is compiling, false is interpreting.
which ( -- addr ) .data
  Variable pointing to the dictionary entry of the most recently found word.
whitespace ( -- addr ) .data
  Variable which contains a character used as the whitespace to search for.
list ( -- addr ) .data
  Variable pointing to a data structure used to implement lexical scoping.
true ( -- addr ) .data
  Constant returning the value -1
false ( -- addr ) .data
  Constant returning the value 0
------------------------------------------------

Appendix 2: Assembly Listings

Many words in the bootstrap code for Rx are written using raw opcodes for various instructions. This has been a cause of much confusion, so we'd like to take the time now to document what each word actually compiles to. Due to the little-endian nature of the x86, the opcodes in the source are in reverse order. These listings have the normal order.

------------------------------------------------
dup      8946FC   mov [esi-4], eax
         8D76FC   lea esi, [esi-4]
drop     AD       lodsd
swap     8706     xchg eax, [esi]
nip      83C604   add esi, 4
1+       40       inc eax
1-       48       dec eax
2drop    AD       lodsd
         AD       lodsd
and      2306     and eax, [esi]
         83C604   add esi, 4
or       0B06     or eax, [esi]
         83C604   add esi, 4
xor      3306     xor eax, [esi]
         83C604   add esi, 4
r>       58       pop eax
>r       50       push eax
         AD       lodsd
r        58       pop eax
         50       push eax
rdrop    5B       pop ebx
=if      0F85     jnz
<if      0F8D     jnl
<>if     0F84     jz
(if)     3B06     cmp eax, [esi]
         AD       lodsd
         AD       lodsd
         ????     ??????
then     $90      nop
vector   $E9      jmp ??????
+        0306     add eax, [esi]
         83C604   add esi, 4
-        2906     sub [esi], eax
         AD       lodsd
*        F726     mul dword [esi]
         83C604   add esi, 4
/mod     89C3     mov ebx, eax
         AD       lodsd
         99       cdq
         F7FB     idiv ebx
         8946FC   mov [esi-4], eax
         8D76FC   lea esi, [esi-4]
         89D0     mov eax, edx
         8706     xchg eax, [esi]
@        8B00     mov eax, [eax]
!        89C2     mov edx, eax
         AD       lodsd
         8902     mov [edx], eax
         AD       drop
c!       89C2     mov edx, eax
         AD       lodsd
         8802     mov [edx], al
         AD       drop
w!       89C2     mov edx, eax
         AD       lodsd
         668902   mov [edx], ax
         AD       drop
next     58       pop eax
         48       dec eax
         0F8F     jg near ????
         AD       lodsd
>>       89C1     mov ecx, eax
         AD       lodsd
         D3E8     shr eax, cl
<<       89C1     mov ecx, eax
         AD       lodsd
         D3E0     shl eax, cl
------------------------------------------------
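
To make the remark about byte order concrete, here is a hedged sketch of how the two instructions listed for dup might be written as numbers in source. It assumes that \b 3, stores the low three bytes of a number lowest byte first, as is usual on the x86, and that hexadecimal digits may be typed in upper case after \b hex. Neither detail is confirmed by this handbook, and the fragment is an illustration, not an excerpt from the Rx source.

------------------------------------------------
( assumption: 3, lays down the lowest byte first )
hex
FC4689 3,   ( bytes in memory: 89 46 FC   mov [esi-4], eax )
FC768D 3,   ( bytes in memory: 8D 76 FC   lea esi, [esi-4] )
decimal
------------------------------------------------

Read as numbers, FC4689 and FC768D are the listing's 8946FC and 8D76FC with the byte order reversed. Run as written, this only places six bytes at here; it does not define a word.
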