1 /* A sometimes minimal FORTH compiler and tutorial for Linux / i386 systems. -*- asm -*-
2 By Richard W.M. Jones <rich@annexia.org> http://annexia.org/forth
4 gcc -m32 -nostdlib -static -Wl,-Ttext,0 -o jonesforth jonesforth.S
6 INTRODUCTION ----------------------------------------------------------------------
8 FORTH is one of those alien languages which most working programmers regard in the same
9 way as Haskell, LISP, and so on. Something so strange that they'd rather any thoughts
10 of it just go away so they can get on with writing this paying code. But that's wrong
11 and if you care at all about programming then you should at least understand all these
12 languages, even if you will never use them.
14 LISP is the ultimate high-level language, and features from LISP are being added every
15 decade to the more common languages. But FORTH is in some ways the ultimate in low level
16 programming. Out of the box it lacks features like dynamic memory management and even
17 strings. In fact, at its primitive level it lacks even basic concepts like IF-statements
20 Why then would you want to learn FORTH? There are several very good reasons. First
21 and foremost, FORTH is minimal. You really can write a complete FORTH in, say, 2000
22 lines of code. I don't just mean a FORTH program, I mean a complete FORTH operating
23 system, environment and language. You could boot such a FORTH on a bare PC and it would
24 come up with a prompt where you could start doing useful work. The FORTH you have here
25 isn't minimal and uses a Linux process as its 'base PC' (both for the purposes of making
26 it a good tutorial). It's possible to completely understand the system. Who can say they
27 completely understand how Linux works, or gcc?
29 Secondly FORTH has a peculiar bootstrapping property. By that I mean that after writing
30 a little bit of assembly to talk to the hardware and implement a few primitives, all the
31 rest of the language and compiler is written in FORTH itself. Remember I said before
32 that FORTH lacked IF-statements and loops? Well of course it doesn't really because
33 such a lanuage would be useless, but my point was rather that IF-statements and loops are
34 written in FORTH itself.
36 Now of course this is common in other languages as well, and in those languages we call
37 them 'libraries'. For example in C, 'printf' is a library function written in C. But
38 in FORTH this goes way beyond mere libraries. Can you imagine writing C's 'if' in C?
39 And that brings me to my third reason: If you can write 'if' in FORTH, then why restrict
40 yourself to the usual if/while/for/switch constructs? You want a construct that iterates
41 over every other element in a list of numbers? You can add it to the language. What
42 about an operator which pulls in variables directly from a configuration file and makes
43 them available as FORTH variables? Or how about adding Makefile-like dependencies to
44 the language? No problem in FORTH. This concept isn't common in programming languages,
45 but it has a name (in fact two names): "macros" (by which I mean LISP-style macros, not
46 the lame C preprocessor) and "domain specific languages" (DSLs).
48 This tutorial isn't about learning FORTH as the language. I'll point you to some references
49 you should read if you're not familiar with using FORTH. This tutorial is about how to
50 write FORTH. In fact, until you understand how FORTH is written, you'll have only a very
51 superficial understanding of how to use it.
53 So if you're not familiar with FORTH or want to refresh your memory here are some online
56 http://en.wikipedia.org/wiki/Forth_%28programming_language%29
58 http://galileo.phys.virginia.edu/classes/551.jvn.fall01/primer.htm
60 http://wiki.laptop.org/go/Forth_Lessons
62 Here is another "Why FORTH?" essay: http://www.jwdt.com/~paysan/why-forth.html
64 SETTING UP ----------------------------------------------------------------------
66 Let's get a few housekeeping things out of the way. Firstly because I need to draw lots of
67 ASCII-art diagrams to explain concepts, the best way to look at this is using a window which
68 uses a fixed width font and is at least this wide:
70 <------------------------------------------------------------------------------------------------------------------------>
72 ASSEMBLING ----------------------------------------------------------------------
74 If you want to actually run this FORTH, rather than just read it, you will need Linux on an
75 i386. Linux because instead of programming directly to the hardware on a bare PC which I
76 could have done, I went for a simpler tutorial by assuming that the 'hardware' is a Linux
77 process with a few basic system calls (read, write and exit and that's about all). i386
78 is needed because I had to write the assembly for a processor, and i386 is by far the most
79 common. (Of course when I say 'i386', any 32- or 64-bit x86 processor will do. I'm compiling
80 this on a 64 bit AMD Opteron).
82 Again, to assemble this you will need gcc and gas (the GNU assembler). The commands to
83 assemble and run the code (save this file as 'jonesforth.S') are:
85 gcc -m32 -nostdlib -static -Wl,-Ttext,0 -o jonesforth jonesforth.S
88 You will see lots of 'Warning: unterminated string; newline inserted' messages from the
89 assembler. That's just because the GNU assembler doesn't have a good syntax for multi-line
90 strings (or rather it used to, but the developers removed it!) so I've abused the syntax
91 slightly to make things readable. Ignore these warnings.
93 ASSEMBLER ----------------------------------------------------------------------
95 (You can just skip to the next section -- you don't need to be able to read assembler to
96 follow this tutorial).
98 However if you do want to read the assembly code here are a few notes about gas (the GNU assembler):
100 (1) Register names are prefixed with '%', so %eax is the 32 bit i386 accumulator. The registers
101 available on i386 are: %eax, %ebx, %ecx, %edx, %esi, %edi, %ebp and %esp, and most of them
102 have special purposes.
104 (2) Add, mov, etc. take arguments in the form SRC,DEST. So mov %eax,%ecx moves %eax -> %ecx
106 (3) Constants are prefixed with '$', and you mustn't forget it! If you forget it then it
107 causes a read from memory instead, so:
108 mov $2,%eax moves number 2 into %eax
109 mov 2,%eax reads the 32 bit word from address 2 into %eax (ie. most likely a mistake)
111 (4) gas has a funky syntax for local labels, where '1f' (etc.) means label '1:' "forwards"
112 and '1b' (etc.) means label '1:' "backwards".
114 (5) 'ja' is "jump if above", 'jb' for "jump if below", 'je' "jump if equal" etc.
116 (6) gas has a reasonably nice .macro syntax, and I use them a lot to make the code shorter and
119 For more help reading the assembler, do "info gas" at the Linux prompt.
121 Now the tutorial starts in earnest.
123 INDIRECT THREADED CODE ----------------------------------------------------------------------
142 /* Macros to deal with the return stack. */
144 lea -4(%ebp),%ebp // push reg on to return stack
149 mov (%ebp),\reg // pop top of return stack to reg
153 /* ELF entry point. */
158 mov %esp,var_S0 // Store the initial data stack pointer.
159 mov $return_stack,%ebp // Initialise the return stack.
161 mov $cold_start,%esi // Initialise interpreter.
162 NEXT // Run interpreter!
165 cold_start: // High-level code without a codeword.
168 /* DOCOL - the interpreter! */
172 PUSHRSP %esi // push %esi on to the return stack
173 addl $4,%eax // %eax points to codeword, so make
174 movl %eax,%esi // %esi point to first data word
177 /*----------------------------------------------------------------------
178 * Fixed sized buffers for everything.
182 /* FORTH return stack. */
183 #define RETURN_STACK_SIZE 8192
185 .space RETURN_STACK_SIZE
188 /* Space for user-defined words. */
189 #define USER_DEFS_SIZE 16384
192 .space USER_DEFS_SIZE
199 /*----------------------------------------------------------------------
200 * Built-in words defined the long way.
203 #define F_HIDDEN 0x20
205 // Store the chain of links.
208 .macro defcode name, namelen, flags=0, label
214 .set link,name_\label
215 .byte \flags+\namelen // flags + length byte
216 .ascii "\name" // the name
220 .int code_\label // codeword
224 code_\label : // assembler code follows
227 .macro defword name, namelen, flags=0, label
233 .set link,name_\label
234 .byte \flags+\namelen // flags + length byte
235 .ascii "\name" // the name
239 .int DOCOL // codeword - the interpreter
240 // list of word pointers follow
243 .macro defvar name, namelen, flags=0, label, initial=0
244 defcode \name,\namelen,\flags,\label
253 // Some easy ones, written in assembly for speed
254 defcode "DROP",4,,DROP
255 pop %eax // drop top of stack
259 pop %eax // duplicate top of stack
264 defcode "SWAP",4,,SWAP
265 pop %eax // swap top of stack
271 defcode "OVER",4,,OVER
272 mov 4(%esp),%eax // get the second element of stack
273 push %eax // and push it on top
285 defcode "-ROT",4,,NROT
295 incl (%esp) // increment top of stack
299 decl (%esp) // decrement top of stack
302 defcode "4+",2,,INCR4
303 addl $4,(%esp) // increment top of stack
306 defcode "4-",2,,DECR4
307 subl $4,(%esp) // decrement top of stack
324 push %eax // ignore overflow
332 push %eax // push quotient
340 push %edx // push remainder
343 defcode "=",1,,EQU // top two words are equal?
353 defcode "<>",2,,NEQU // top two words are not equal?
363 defcode "0=",2,,ZEQU // top of stack equals 0?
382 defcode "INVERT",6,,INVERT
386 // COLD must not return (ie. must not call EXIT).
387 defword "COLD",4,,COLD
388 // XXX reinitialisation of the interpreter
389 .int INTERPRETER // call the interpreter loop (never returns)
390 .int LIT,1,SYSEXIT // hmmm, but in case it does, exit(1).
392 defcode "EXIT",4,,EXIT
393 POPRSP %esi // pop return stack into %esi
397 // %esi points to the next command, but in this case it points to the next
398 // literal 32 bit integer. Get that literal into %eax and increment %esi.
399 // On x86, it's a convenient single byte instruction! (cf. NEXT macro)
401 push %eax // push the literal number on to stack
404 defcode "LITSTRING",9,,LITSTRING
405 lodsl // get the length of the string
406 push %eax // push it on the stack
407 push %esi // push the address of the start of the string
408 addl %eax,%esi // skip past the string
409 addl $3,%esi // but round up to next 4 byte boundary
413 defcode "BRANCH",6,,BRANCH
414 add (%esi),%esi // add the offset to the instruction pointer
417 defcode "0BRANCH",7,,ZBRANCH
419 test %eax,%eax // top of stack is zero?
420 jz code_BRANCH // if so, jump back to the branch function above
421 lodsl // otherwise we need to skip the offset
425 pop %ebx // address to store at
426 pop %eax // data to store there
427 mov %eax,(%ebx) // store it
431 pop %ebx // address to fetch
432 mov (%ebx),%eax // fetch it
433 push %eax // push value onto stack
436 defcode "+!",2,,ADDSTORE
438 pop %eax // the amount to add
439 addl %eax,(%ebx) // add it
442 defcode "-!",2,,SUBSTORE
444 pop %eax // the amount to subtract
445 subl %eax,(%ebx) // add it
448 /* ! and @ (STORE and FETCH) store 32-bit words. It's also useful to be able to read and write bytes.
449 * I don't know whether FORTH has these words, so I invented my own, called !b and @b.
450 * Byte-oriented operations only work on architectures which permit them (i386 is one of those).
451 * UPDATE: writing a byte to the dictionary pointer is called C, in FORTH.
453 defcode "!b",2,,STOREBYTE
454 pop %ebx // address to store at
455 pop %eax // data to store there
456 movb %al,(%ebx) // store it
459 defcode "@b",2,,FETCHBYTE
460 pop %ebx // address to fetch
462 movb (%ebx),%al // fetch it
463 push %eax // push value onto stack
466 // The STATE variable is 0 for execute mode, != 0 for compile mode
467 defvar "STATE",5,,STATE
469 // This points to where compiled words go.
470 defvar "HERE",4,,HERE,user_defs_start
472 // This is the last definition in the dictionary.
473 defvar "LATEST",6,,LATEST,name_SYSEXIT // SYSEXIT must be last in built-in dictionary
475 // _X, _Y and _Z are scratch variables used by standard words.
480 // This stores the top of the data stack.
483 // This stores the top of the return stack.
484 defvar "R0",2,,RZ,return_stack
486 defcode "DSP@",4,,DSPFETCH
491 defcode "DSP!",4,,DSPSTORE
496 pop %eax // pop parameter stack into %eax
497 PUSHRSP %eax // push it on to the return stack
500 defcode "R>",2,,FROMR
501 POPRSP %eax // pop return stack on to %eax
502 push %eax // and push on to parameter stack
505 defcode "RSP@",4,,RSPFETCH
509 defcode "RSP!",4,,RSPSTORE
513 defcode "RDROP",5,,RDROP
514 lea 4(%ebp),%ebp // pop return stack and throw away
517 #include <asm-i386/unistd.h>
521 push %eax // push return value on stack
533 1: // out of input; use read(2) to fetch more input from stdin
534 xor %ebx,%ebx // 1st param: stdin
535 mov $buffer,%ecx // 2nd param: buffer
537 mov $buffend-buffer,%edx // 3rd param: max length
538 mov $__NR_read,%eax // syscall: read
540 test %eax,%eax // If %eax <= 0, then exit.
542 addl %eax,%ecx // buffer+%eax = bufftop
546 2: // error or out of input: exit
548 mov $__NR_exit,%eax // syscall: exit
551 defcode "EMIT",4,,EMIT
556 mov $1,%ebx // 1st param: stdout
558 // write needs the address of the byte to write
560 mov $2f,%ecx // 2nd param: address
562 mov $1,%edx // 3rd param: nbytes = 1
564 mov $__NR_write,%eax // write syscall
569 2: .space 1 // scratch used by EMIT
571 defcode "WORD",4,,WORD
573 push %ecx // push length
574 push %edi // push base address
578 /* Search for first non-blank character. Also skip \ comments. */
580 call _KEY // get next key, returned in %eax
581 cmpb $'\\',%al // start of a comment?
582 je 3f // if so, skip the comment
584 jbe 1b // if so, keep looking
586 /* Search for the end of the word, storing chars as we go. */
587 mov $5f,%edi // pointer to return buffer
589 stosb // add character to return buffer
590 call _KEY // get next key, returned in %al
591 cmpb $' ',%al // is blank?
592 ja 2b // if not, keep looping
594 /* Return the word (well, the static buffer) and length. */
596 mov %edi,%ecx // return length of the word
597 mov $5f,%edi // return address of the word
600 /* Code to skip \ comments to end of the current line. */
603 cmpb $'\n',%al // end of line yet?
608 // A static buffer where WORD returns. Subsequent calls
609 // overwrite this buffer. Maximum word length is 32 chars.
612 defcode "EMITSTRING",10,,EMITSTRING
613 mov $1,%ebx // 1st param: stdout
614 pop %ecx // 2nd param: address of string
615 pop %edx // 3rd param: length of string
617 mov $__NR_write,%eax // write syscall
623 pop %eax // Get the number to print into %eax
624 call _DOT // Easier to do this recursively ...
627 mov $10,%ecx // Base 10
631 xor %edx,%edx // %edx:%eax / %ecx -> quotient %eax, remainder %edx
645 // Parse a number from a string on the stack -- almost the opposite of . (DOT)
646 // Note that there is absolutely no error checking. In particular the length of the
647 // string must be >= 1 bytes.
648 defcode "SNUMBER",7,,SNUMBER
658 imull $10,%eax // %eax *= 10
661 subb $'0',%bl // ASCII -> digit
667 defcode "FIND",4,,FIND
668 pop %edi // %edi = address
669 pop %ecx // %ecx = length
675 push %esi // Save %esi so we can use it in string comparison.
677 // Now we start searching backwards through the dictionary for this word.
678 mov var_LATEST,%edx // LATEST points to name header of the latest word in the dictionary
680 test %edx,%edx // NULL pointer? (end of the linked list)
683 // Compare the length expected and the length of the word.
684 // Note that if the F_HIDDEN flag is set on the word, then by a bit of trickery
685 // this won't pick the word (the length will appear to be wrong).
687 movb 4(%edx),%al // %al = flags+length field
688 andb $(F_HIDDEN|0x1f),%al // %al = name length
689 cmpb %cl,%al // Length is the same?
692 // Compare the strings in detail.
693 push %ecx // Save the length
694 push %edi // Save the address (repe cmpsb will move this pointer)
695 lea 5(%edx),%esi // Dictionary string we are checking against.
696 repe cmpsb // Compare the strings.
699 jne 2f // Not the same.
701 // The strings are the same - return the header pointer in %eax
707 mov (%edx),%edx // Move back through the link field to the previous word
708 jmp 1b // .. and loop.
712 xor %eax,%eax // Return zero to indicate not found.
715 defcode ">CFA",4,,TCFA // DEA -> Codeword address
722 add $4,%edi // Skip link pointer.
723 movb (%edi),%al // Load flags+len into %al.
724 inc %edi // Skip flags+len byte.
725 andb $0x1f,%al // Just the length, not the flags.
726 add %eax,%edi // Skip the name.
727 addl $3,%edi // The codeword is 4-byte aligned.
731 defcode "CHAR",4,,CHAR
732 call _WORD // Returns %ecx = length, %edi = pointer to word.
734 movb (%edi),%al // Get the first character of the word.
735 push %eax // Push it onto the stack.
740 // Get the word and create a dictionary entry header for it.
741 call _WORD // Returns %ecx = length, %edi = pointer to word.
742 mov %edi,%ebx // %ebx = address of the word
744 movl var_HERE,%edi // %edi is the address of the header
745 movl var_LATEST,%eax // Get link pointer
746 stosl // and store it in the header.
748 mov %cl,%al // Get the length.
749 orb $F_HIDDEN,%al // Set the HIDDEN flag on this entry.
750 stosb // Store the length/flags byte.
752 mov %ebx,%esi // %esi = word
753 rep movsb // Copy the word
755 addl $3,%edi // Align to next 4 byte boundary.
758 movl $DOCOL,%eax // The codeword for user-created words is always DOCOL (the interpreter)
761 // Header built, so now update LATEST and HERE.
762 // We'll be compiling words and putting them HERE.
767 // And go into compile mode by setting STATE to 1.
772 pop %eax // Code pointer to store.
776 movl var_HERE,%edi // HERE
778 movl %edi,var_HERE // Update HERE (incremented)
781 defcode "HIDDEN",6,,HIDDEN
785 movl var_LATEST,%edi // LATEST word.
786 addl $4,%edi // Point to name/flags byte.
787 xorb $F_HIDDEN,(%edi) // Toggle the HIDDEN bit.
790 defcode "IMMEDIATE",9,F_IMMED,IMMEDIATE
794 movl var_LATEST,%edi // LATEST word.
795 addl $4,%edi // Point to name/flags byte.
796 xorb $F_IMMED,(%edi) // Toggle the IMMED bit.
799 defcode ";",1,F_IMMED,SEMICOLON
800 movl $EXIT,%eax // EXIT is the final codeword in compiled words.
801 call _COMMA // Store it.
802 call _HIDDEN // Toggle the HIDDEN flag (unhides the new word).
803 xor %eax,%eax // Set STATE to 0 (back to execute mode).
807 /* This definiton of ' (TICK) is strictly cheating - it also only works in compiled code. */
809 lodsl // Get the address of the next word and skip it.
810 pushl %eax // Push it on the stack.
813 /* This interpreter is pretty simple, but remember that in FORTH you can always override
814 * it later with a more powerful one!
816 defword "INTERPRETER",11,,INTERPRETER
817 .int INTERPRET,RDROP,INTERPRETER
819 defcode "INTERPRET",9,,INTERPRET
820 call _WORD // Returns %ecx = length, %edi = pointer to word.
822 // Is it in the dictionary?
824 movl %eax,interpret_is_lit // Not a literal number (not yet anyway ...)
825 call _FIND // Returns %eax = pointer to header or 0 if not found.
826 test %eax,%eax // Found?
829 // In the dictionary. Is it an IMMEDIATE codeword?
830 mov %eax,%edi // %edi = dictionary entry
831 movb 4(%edi),%al // Get name+flags.
832 push %ax // Just save it for now.
833 call _TCFA // Convert dictionary entry (in %edi) to codeword pointer.
835 andb $F_IMMED,%al // Is IMMED flag set?
837 jnz 4f // If IMMED, jump straight to executing.
841 1: // Not in the dictionary (not a word) so assume it's a literal number.
842 incl interpret_is_lit
843 call _SNUMBER // Returns the parsed number in %eax
845 mov $LIT,%eax // The word is LIT
847 2: // Are we compiling or executing?
850 jz 4f // Jump if executing.
852 // Compiling - just append the word to the current dictionary definition.
854 mov interpret_is_lit,%ecx // Was it a literal?
857 mov %ebx,%eax // Yes, so LIT is followed by a number.
861 4: // Executing - run it!
862 mov interpret_is_lit,%ecx // Literal?
863 test %ecx,%ecx // Literal?
866 // Not a literal, execute it now. This never returns, but the codeword will
867 // eventually call NEXT which will reenter the loop in INTERPRETER.
870 5: // Executing a literal, which means push it on the stack.
877 .int 0 // Flag used to record if reading a literal
879 // NB: SYSEXIT must be the last entry in the built-in dictionary.
880 defcode SYSEXIT,7,,SYSEXIT
885 /*----------------------------------------------------------------------
886 * Input buffer & initial input.
891 // XXX gives 'Warning: unterminated string; newline inserted' messages which you can ignore
893 \\ Define some character constants
899 \\ CR prints a carriage return
902 \\ SPACE prints a space
903 : SPACE 'SPACE' EMIT ;
905 \\ Primitive . (DOT) function doesn't follow with a blank, so redefine it to behave like FORTH.
906 \\ Notice how we can trivially redefine existing functions.
909 \\ DUP, DROP are defined in assembly for speed, but this is how you might define them
910 \\ in FORTH. Notice use of the scratch variables _X and _Y.
911 \\ : DUP _X ! _X @ _X @ ;
914 \\ The 2... versions of the standard operators work on pairs of stack entries. They're not used
915 \\ very commonly so not really worth writing in assembler. Here is how they are defined in FORTH.
919 \\ More standard FORTH words.
923 \\ [ and ] allow you to break into immediate mode while compiling a word.
924 : [ IMMEDIATE \\ define [ as an immediate word
925 0 STATE ! \\ go into immediate mode
929 1 STATE ! \\ go back to compile mode
932 \\ LITERAL takes whatever is on the stack and compiles LIT <foo>
934 ' LIT , \\ compile LIT
935 , \\ compile the literal itself (from the stack)
938 \\ condition IF true-part THEN rest
940 \\ condition 0BRANCH OFFSET true-part rest
941 \\ where OFFSET is the offset of 'rest'
942 \\ condition IF true-part ELSE false-part THEN
944 \\ condition 0BRANCH OFFSET true-part BRANCH OFFSET2 false-part rest
945 \\ where OFFSET if the offset of false-part and OFFSET2 is the offset of rest
947 \\ IF is an IMMEDIATE word which compiles 0BRANCH followed by a dummy offset, and places
948 \\ the address of the 0BRANCH on the stack. Later when we see THEN, we pop that address
949 \\ off the stack, calculate the offset, and back-fill the offset.
951 ' 0BRANCH , \\ compile 0BRANCH
952 HERE @ \\ save location of the offset on the stack
953 0 , \\ compile a dummy offset
958 HERE @ SWAP - \\ calculate the offset from the address saved on the stack
959 SWAP ! \\ store the offset in the back-filled location
963 ' BRANCH , \\ definite branch to just over the false-part
964 HERE @ \\ save location of the offset on the stack
965 0 , \\ compile a dummy offset
966 SWAP \\ now back-fill the original (IF) offset
967 DUP \\ same as for THEN word above
972 \\ BEGIN loop-part condition UNTIL
974 \\ loop-part condition 0BRANCH OFFSET
975 \\ where OFFSET points back to the loop-part
976 \\ This is like do { loop-part } while (condition) in the C language
978 HERE @ \\ save location on the stack
982 ' 0BRANCH , \\ compile 0BRANCH
983 HERE @ - \\ calculate the offset from the address saved on the stack
984 , \\ compile the offset here
987 \\ BEGIN loop-part AGAIN
989 \\ loop-part BRANCH OFFSET
990 \\ where OFFSET points back to the loop-part
991 \\ In other words, an infinite loop which can only be returned from with EXIT
993 ' BRANCH , \\ compile BRANCH
994 HERE @ - \\ calculate the offset back
995 , \\ compile the offset here
998 \\ BEGIN condition WHILE loop-part REPEAT
1000 \\ condition 0BRANCH OFFSET2 loop-part BRANCH OFFSET
1001 \\ where OFFSET points back to condition (the beginning) and OFFSET2 points to after the whole piece of code
1002 \\ So this is like a while (condition) { loop-part } loop in the C language
1004 ' 0BRANCH , \\ compile 0BRANCH
1005 HERE @ \\ save location of the offset2 on the stack
1006 0 , \\ compile a dummy offset2
1010 ' BRANCH , \\ compile BRANCH
1011 SWAP \\ get the original offset (from BEGIN)
1012 HERE @ - , \\ and compile it after BRANCH
1014 HERE @ SWAP - \\ calculate the offset2
1015 SWAP ! \\ and back-fill it in the original location
1018 \\ With the looping constructs, we can now write SPACES, which writes n spaces to stdout.
1021 SPACE \\ print a space
1022 1- \\ until we count down to 0
1027 \\ .S prints the contents of the stack. Very useful for debugging.
1029 DSP@ \\ get current stack pointer
1031 DUP @ . \\ print the stack element
1033 DUP S0 @ 4- = \\ stop when we get to the top
1038 \\ DEPTH returns the depth of the stack.
1039 : DEPTH S0 @ DSP@ - ;
1041 \\ .\" is the print string operator in FORTH. Example: .\" Something to print\"
1042 \\ The space after the operator is the ordinary space required between words.
1043 \\ This is tricky to define because it has to do different things depending on whether
1044 \\ we are compiling or in immediate mode. (Thus the word is marked IMMEDIATE so it can
1045 \\ detect this and do different things).
1046 \\ In immediate mode we just keep reading characters and printing them until we get to
1047 \\ the next double quote.
1048 \\ In compile mode we have the problem of where we're going to store the string (remember
1049 \\ that the input buffer where the string comes from may be overwritten by the time we
1050 \\ come round to running the function). We store the string in the compiled function
1052 \\ LITSTRING, string length, string rounded up to 4 bytes, EMITSTRING, ...
1054 STATE @ \\ compiling?
1056 ' LITSTRING , \\ compile LITSTRING
1057 HERE @ \\ save the address of the length word on the stack
1058 0 , \\ dummy length - we don't know what it is yet
1060 KEY \\ get next character of the string
1063 HERE @ !b \\ store the character in the compiled image
1064 1 HERE +! \\ increment HERE pointer by 1 byte
1066 DROP \\ drop the double quote character at the end
1067 DUP \\ get the saved address of the length word
1068 HERE @ SWAP - \\ calculate the length
1069 4- \\ subtract 4 (because we measured from the start of the length word)
1070 SWAP ! \\ and back-fill the length location
1071 HERE @ \\ round up to next multiple of 4 bytes for the remaining code
1075 ' EMITSTRING , \\ compile the final EMITSTRING
1077 \\ In immediate mode, just read characters and print them until we get
1078 \\ to the ending double quote. Much simpler than the above code!
1081 DUP '\"' = IF EXIT THEN
1087 \\ While compiling, [COMPILE] WORD compiles WORD if it would otherwise be IMMEDIATE.
1088 : [COMPILE] IMMEDIATE
1089 WORD \\ get the next word
1090 FIND \\ find it in the dictionary
1091 >CFA \\ get its codeword
1092 , \\ and compile that
1095 \\ RECURSE makes a recursive call to the current word that is being compiled.
1096 \\ Normally while a word is being compiled, it is marked HIDDEN so that references to the
1097 \\ same word within are calls to the previous definition of the word.
1099 LATEST @ >CFA \\ LATEST points to the word being compiled at the moment
1103 \\ ALLOT is used to allocate (static) memory when compiling. It increases HERE by
1104 \\ the amount given on the stack.
1108 \\ Finally print the welcome prompt.