+(
+ CASE ----------------------------------------------------------------------
+
+ CASE...ENDCASE is how we do switch statements in FORTH. There is no generally
+ agreed syntax for this, so I've gone for the syntax mandated by the ISO standard
+ FORTH (ANS-FORTH).
+
+ ( some value on the stack )
+ CASE
+ test1 OF ... ENDOF
+ test2 OF ... ENDOF
+ testn OF ... ENDOF
+ ... ( default case )
+ ENDCASE
+
+ The CASE statement tests the value on the stack by comparing it for equality with
+ test1, test2, ..., testn and executes the matching piece of code within OF ... ENDOF.
+ If none of the test values match then the default case is executed. Inside the ... of
+ the default case, the value is still at the top of stack (it is implicitly DROP-ed
+ by ENDCASE). When ENDOF is executed it jumps after ENDCASE (ie. there is no "fall-through"
+ and no need for a break statement like in C).
+
+ The default case may be omitted. In fact the tests may also be omitted so that you
+ just have a default case, although this is probably not very useful.
+
+ An example (assuming that 'q', etc. are words which push the ASCII value of the letter
+ on the stack):
+
+ 0 VALUE QUIT
+ 0 VALUE SLEEP
+ KEY CASE
+ 'q' OF 1 TO QUIT ENDOF
+ 's' OF 1 TO SLEEP ENDOF
+ ( default case: )
+ ." Sorry, I didn't understand key <" DUP EMIT ." >, try again." CR
+ ENDCASE
+
+ (In some versions of FORTH, more advanced tests are supported, such as ranges, etc.
+ Other versions of FORTH need you to write OTHERWISE to indicate the default case.
+ As I said above, this FORTH tries to follow the ANS FORTH standard).
+
+ The implementation of CASE...ENDCASE is somewhat non-trivial. I'm following the
+ implementations from here:
+ http://www.uni-giessen.de/faq/archiv/forthfaq.case_endcase/msg00000.html
+
+ The general plan is to compile the code as a series of IF statements:
+
+ CASE (push 0 on the immediate-mode parameter stack)
+ test1 OF ... ENDOF test1 OVER = IF DROP ... ELSE
+ test2 OF ... ENDOF test2 OVER = IF DROP ... ELSE
+ testn OF ... ENDOF testn OVER = IF DROP ... ELSE
+ ... ( default case ) ...
+ ENDCASE DROP THEN [THEN [THEN ...]]
+
+ The CASE statement pushes 0 on the immediate-mode parameter stack, and that number
+ is used to count how many THEN statements we need when we get to ENDCASE so that each
+ IF has a matching THEN. The counting is done implicitly. If you recall from the
+ implementation above of IF, each IF pushes a code address on the immediate-mode stack,
+ and these addresses are non-zero, so by the time we get to ENDCASE the stack contains
+ some number of non-zeroes, followed by a zero. The number of non-zeroes is how many
+ times IF has been called, so how many times we need to match it with THEN.
+
+ This code uses [COMPILE] so that we compile calls to IF, ELSE, THEN instead of
+ actually calling them while we're compiling the words below.
+
+ As is the case with all of our control structures, they only work within word
+ definitions, not in immediate mode.
+)
+: CASE IMMEDIATE
+ 0 ( push 0 to mark the bottom of the stack )
+;
+
+: OF IMMEDIATE
+ ' OVER , ( compile OVER )
+ ' = , ( compile = )
+ [COMPILE] IF ( compile IF )
+ ' DROP , ( compile DROP )
+;
+
+: ENDOF IMMEDIATE
+ [COMPILE] ELSE ( ENDOF is the same as ELSE )
+;
+
+: ENDCASE IMMEDIATE
+ ' DROP , ( compile DROP )
+
+ ( keep compiling THEN until we get to our zero marker )
+ BEGIN
+ ?DUP
+ WHILE
+ [COMPILE] THEN
+ REPEAT
+;
+
+(
+ DECOMPILER ----------------------------------------------------------------------
+
+ CFA> is the opposite of >CFA. It takes a codeword and tries to find the matching
+ dictionary definition.
+
+ In this FORTH this is not so easy. In fact we have to search through the dictionary
+ because we don't have a convenient back-pointer (as is often the case in other versions
+ of FORTH).
+
+ This word returns 0 if it doesn't find a match.
+)
+: CFA>
+ LATEST @ ( start at LATEST dictionary entry )
+ BEGIN
+ DUP 0<> ( while link pointer is not null )
+ WHILE
+ DUP >CFA ( cfa curr curr-cfa )
+ 2 PICK ( cfa curr curr-cfa cfa )
+ = IF ( found a match? )
+ NIP ( leave curr dictionary entry on the stack )
+ EXIT ( and return from the function )
+ THEN
+ @ ( follow link pointer back )
+ REPEAT
+ 2DROP ( restore stack )
+ 0 ( sorry, nothing found )
+;
+
+(
+ SEE decompiles a FORTH word.
+
+ We search for the dictionary entry of the word, then search again for the next
+ word (effectively, the end of the compiled word). This results in two pointers:
+
+ +---------+---+---+---+---+------------+------------+------------+------------+
+ | LINK | 3 | T | E | N | DOCOL | LIT | 10 | EXIT |
+ +---------+---+---+---+---+------------+------------+------------+------------+
+ ^ ^
+ | |
+ Start of word End of word
+
+ With this information we can have a go at decompiling the word. We need to
+ recognise "meta-words" like LIT, LITSTRING, BRANCH, etc. and treat those separately.
+)
+: SEE
+ WORD FIND ( find the dictionary entry to decompile )
+
+ ( Now we search again, looking for the next word in the dictionary. This gives us
+ the length of the word that we will be decompiling. (Well, mostly it does). )
+ HERE @ ( address of the end of the last compiled word )
+ LATEST @ ( word last curr )
+ BEGIN
+ 2 PICK ( word last curr word )
+ OVER ( word last curr word curr )
+ <> ( word last curr word<>curr? )
+ WHILE ( word last curr )
+ NIP ( word curr )
+ DUP @ ( word curr prev (which becomes: word last curr) )
+ REPEAT
+
+ DROP ( at this point, the stack is: start-of-word end-of-word )
+ SWAP ( end-of-word start-of-word )
+
+ ( begin the definition with : NAME [IMMEDIATE] )
+ ':' EMIT SPACE DUP ID. SPACE
+ DUP ?IMMEDIATE IF ." IMMEDIATE " THEN
+
+ >DFA ( get the data address, ie. points after DOCOL | end-of-word start-of-data )
+
+ ( now we start decompiling until we hit the end of the word )
+ BEGIN ( end start )
+ 2DUP >
+ WHILE
+ DUP @ ( end start codeword )
+
+ CASE
+ ' LIT OF ( is it LIT ? )
+ 4 + DUP @ ( get next word which is the integer constant )
+ . ( and print it )
+ ENDOF
+ ' LITSTRING OF ( is it LITSTRING ? )
+ [ CHAR S ] LITERAL EMIT '"' EMIT SPACE ( print S"<space> )
+ 4 + DUP @ ( get the length word )
+ SWAP 4 + SWAP ( end start+4 length )
+ 2DUP TELL ( print the string )
+ '"' EMIT SPACE ( finish the string with a final quote )
+ + ALIGNED ( end start+4+len, aligned )
+ 4 - ( because we're about to add 4 below )
+ ENDOF
+ ' 0BRANCH OF ( is it 0BRANCH ? )
+ ." 0BRANCH ( "
+ 4 + DUP @ ( print the offset )
+ .
+ ')' EMIT SPACE
+ ENDOF
+ ' BRANCH OF ( is it BRANCH ? )
+ ." BRANCH ( "
+ 4 + DUP @ ( print the offset )
+ .
+ ')' EMIT SPACE
+ ENDOF
+ ' ' OF ( is it ' (TICK) ? )
+ [ CHAR ' ] LITERAL EMIT SPACE
+ 4 + DUP @ ( get the next codeword )
+ CFA> ( and force it to be printed as a dictionary entry )
+ ID. SPACE
+ ENDOF
+ ' EXIT OF ( is it EXIT? )
+ ( We expect the last word to be EXIT, and if it is then we don't print it
+ because EXIT is normally implied by ;. EXIT can also appear in the middle
+ of words, and then it needs to be printed. )
+ 2DUP ( end start end start )
+ 4 + ( end start end start+4 )
+ <> IF ( end start | we're not at the end )
+ ." EXIT "
+ THEN
+ ENDOF
+ ( default case: )
+ DUP ( in the default case we always need to DUP before using )
+ CFA> ( look up the codeword to get the dictionary entry )
+ ID. SPACE ( and print it )
+ ENDCASE
+
+ 4 + ( end start+4 )
+ REPEAT
+
+ ';' EMIT CR
+
+ 2DROP ( restore stack )
+;
+
+(
+ DOES> ----------------------------------------------------------------------
+
+ CREATE ... DOES> is a tricky construct allowing you to create words which create other words.
+ For example CONSTANT (defined above) is a word which creates words, and it could have been
+ written as follows:
+
+ : CONSTANT CREATE DOCOL , , DOES> @ ;
+
+ Even explaining what DOES> is supposed to do is tricky. It's possible that the implementation
+ is easier to understand than the explanation.
+
+ If we look at the definition of CONSTANT here, and remember that when it is called the value
+ of the constant is on the stack and the name follows. So first CREATE makes the header of a
+ new word with the name. Secondly the codeword is set to DOCOL. Thirdly , (COMMA) takes the
+ value off the stack and adds it to the definition. At this point (just before executing DOES>)
+ the word looks like this:
+
+ ________ CREATE _______ _ DOCOL ,_ ____ , ___
+ / \ / \ / \
+ +---------+---+---+---+---+------------+------------+
+ | LINK | 3 | T | E | N | DOCOL | 10 |
+ +---------+---+---+---+---+------------+------------+
+ ^ len codeword
+ |
+ LATEST
+
+
+)
+
+
+
+
+: DOES>
+ R> LATEST @ >DFA !
+;
+
+(
+ C STRINGS ----------------------------------------------------------------------
+
+ FORTH strings are represented by a start address and length kept on the stack or in memory.
+
+ Most FORTHs don't handle C strings, but we need them in order to access the process arguments
+ and environment left on the stack by the Linux kernel.
+
+ The main function we need is STRLEN which works out the length of a C string. DUP STRLEN is
+ a common idiom which 'converts' a C string into a FORTH string. (For example, DUP STRLEN TELL
+ prints a C string).
+)
+
+( STRLEN returns the length of a C string )
+: STRLEN ( str -- len )
+ DUP ( save start address )
+ BEGIN
+ DUP C@ 0<> ( zero byte found? )
+ WHILE
+ 1+
+ REPEAT
+
+ SWAP - ( calculate the length )
+;
+
+(
+ STRNCMP compares two strings up to a length. As with C's strncmp it returns 0 if they
+ are equal, or a number > 0 or < 0 indicating their order.
+)
+: STRNCMP ( str1 str2 len -- eq? )
+ BEGIN
+ ?DUP
+ WHILE
+ ROT ( len str1 str2 )
+ DUP C@ ( len str1 str2 char2 )
+ 2 PICK C@ ( len str1 str2 char2 char1 )
+ OVER ( len str1 str2 char2 char1 char2 )
+ - ( len str1 str2 char2 char1-char2 )
+
+ ?DUP IF ( strings not the same at this position? )
+ NIP ( len str1 str2 diff )
+ ROT ( len diff str1 str2 )
+ DROP DROP ( len diff )
+ NIP ( diff )
+ EXIT
+ THEN
+
+ 0= IF ( characters are equal, but is this the end of the C string? )
+ DROP DROP DROP
+ 0
+ EXIT
+ THEN
+
+ 1+ ( len str1 str2+1 )
+ ROT ( str2+1 len str1 )
+ 1+ ROT ( str1+1 str2+1 len )
+ 1- ( str1+1 str2+1 len-1 )
+ REPEAT
+
+ 2DROP ( restore stack )
+ 0 ( equal )
+;
+
+(
+ THE ENVIRONMENT ----------------------------------------------------------------------
+
+ Linux makes the process arguments and environment available to us on the stack.
+
+ The top of stack pointer is saved by the early assembler code when we start up in the FORTH
+ variable S0, and starting at this pointer we can read out the command line arguments and the
+ environment.
+
+ Starting at S0, S0 itself points to argc (the number of command line arguments).
+
+ S0+4 points to argv[0], S0+8 points to argv[1] etc up to argv[argc-1].
+
+ argv[argc] is a NULL pointer.
+
+ After that the stack contains environment variables, a set of pointers to strings of the
+ form NAME=VALUE and on until we get to another NULL pointer.
+
+ The first word that we define, ARGC, pushes the number of command line arguments (note that
+ as with C argc, this includes the name of the command).
+)
+: ARGC
+ S0 @ @
+;
+
+(
+ n ARGV gets the nth command line argument.
+
+ For example to print the command name you would do:
+ 0 ARGV TELL CR
+)
+: ARGV ( n -- str u )
+ 1+ CELLS S0 @ + ( get the address of argv[n] entry )
+ @ ( get the address of the string )
+ DUP STRLEN ( and get its length / turn it into a FORTH string )
+;
+
+(
+ ENVIRON returns the address of the first environment string. The list of strings ends
+ with a NULL pointer.
+
+ For example to print the first string in the environment you could do:
+ ENVIRON @ DUP STRLEN TELL
+)
+: ENVIRON ( -- addr )
+ ARGC ( number of command line parameters on the stack to skip )
+ 2 + ( skip command line count and NULL pointer after the command line args )
+ CELLS ( convert to an offset )
+ S0 @ + ( add to base stack address )
+;
+
+(
+ ANS FORTH ----------------------------------------------------------------------
+
+ From this point we're trying to fill in the missing parts of the ISO standard, commonly
+ referred to as ANS FORTH.
+
+ http://www.taygeta.com/forth/dpans.html
+ http://www.taygeta.com/forth/dpansf.htm (list of words)
+)
+( BL pushes the ASCII character code of space on the stack. )
+: BL 32 ;
+
+( C, writes a byte at the HERE pointer. )
+: C, HERE @ C! 1 HERE +! ;
+
+
+