1 /* A sometimes minimal FORTH compiler and tutorial for Linux / i386 systems. -*- asm -*-
2 By Richard W.M. Jones <rich@annexia.org> http://annexia.org/forth
4 gcc -m32 -nostdlib -static -Wl,-Ttext,0 -o jonesforth jonesforth.S
6 INTRODUCTION ----------------------------------------------------------------------
8 FORTH is one of those alien languages which most working programmers regard in the same
9 way as Haskell, LISP, and so on. Something so strange that they'd rather any thoughts
10 of it just go away so they can get on with writing this paying code. But that's wrong
11 and if you care at all about programming then you should at least understand all these
12 languages, even if you will never use them.
14 LISP is the ultimate high-level language, and features from LISP are being added every
15 decade to the more common languages. But FORTH is in some ways the ultimate in low level
16 programming. Out of the box it lacks features like dynamic memory management and even
17 strings. In fact, at its primitive level it lacks even basic concepts like IF-statements
20 Why then would you want to learn FORTH? There are several very good reasons. First
21 and foremost, FORTH is minimal. You really can write a complete FORTH in, say, 2000
22 lines of code. I don't just mean a FORTH program, I mean a complete FORTH operating
23 system, environment and language. You could boot such a FORTH on a bare PC and it would
24 come up with a prompt where you could start doing useful work. The FORTH you have here
25 isn't minimal and uses a Linux process as its 'base PC' (both for the purposes of making
26 it a good tutorial). It's possible to completely understand the system. Who can say they
27 completely understand how Linux works, or gcc?
29 Secondly FORTH has a peculiar bootstrapping property. By that I mean that after writing
30 a little bit of assembly to talk to the hardware and implement a few primitives, all the
31 rest of the language and compiler is written in FORTH itself. Remember I said before
32 that FORTH lacked IF-statements and loops? Well of course it doesn't really because
33 such a lanuage would be useless, but my point was rather that IF-statements and loops are
34 written in FORTH itself.
36 Now of course this is common in other languages as well, and in those languages we call
37 them 'libraries'. For example in C, 'printf' is a library function written in C. But
38 in FORTH this goes way beyond mere libraries. Can you imagine writing C's 'if' in C?
39 And that brings me to my third reason: If you can write 'if' in FORTH, then why restrict
40 yourself to the usual if/while/for/switch constructs? You want a construct that iterates
41 over every other element in a list of numbers? You can add it to the language. What
42 about an operator which pulls in variables directly from a configuration file and makes
43 them available as FORTH variables? Or how about adding Makefile-like dependencies to
44 the language? No problem in FORTH. This concept isn't common in programming languages,
45 but it has a name (in fact two names): "macros" (by which I mean LISP-style macros, not
46 the lame C preprocessor) and "domain specific languages" (DSLs).
48 This tutorial isn't about learning FORTH as the language. I'll point you to some references
49 you should read if you're not familiar with using FORTH. This tutorial is about how to
50 write FORTH. In fact, until you understand how FORTH is written, you'll have only a very
51 superficial understanding of how to use it.
53 So if you're not familiar with FORTH or want to refresh your memory here are some online
56 http://en.wikipedia.org/wiki/Forth_%28programming_language%29
58 http://galileo.phys.virginia.edu/classes/551.jvn.fall01/primer.htm
60 http://wiki.laptop.org/go/Forth_Lessons
62 Here is another "Why FORTH?" essay: http://www.jwdt.com/~paysan/why-forth.html
64 ACKNOWLEDGEMENTS ----------------------------------------------------------------------
66 This code draws heavily on the design of LINA FORTH (http://home.hccnet.nl/a.w.m.van.der.horst/lina.html)
67 by Albert van der Horst. Any similarities in the code are probably not accidental.
69 SETTING UP ----------------------------------------------------------------------
71 Let's get a few housekeeping things out of the way. Firstly because I need to draw lots of
72 ASCII-art diagrams to explain concepts, the best way to look at this is using a window which
73 uses a fixed width font and is at least this wide:
75 <------------------------------------------------------------------------------------------------------------------------>
77 Secondly make sure TABS are set to 8 characters. The following should be a vertical
78 line. If not, sort out your tabs.
84 Thirdly I assume that your screen is at least 50 characters high.
86 ASSEMBLING ----------------------------------------------------------------------
88 If you want to actually run this FORTH, rather than just read it, you will need Linux on an
89 i386. Linux because instead of programming directly to the hardware on a bare PC which I
90 could have done, I went for a simpler tutorial by assuming that the 'hardware' is a Linux
91 process with a few basic system calls (read, write and exit and that's about all). i386
92 is needed because I had to write the assembly for a processor, and i386 is by far the most
93 common. (Of course when I say 'i386', any 32- or 64-bit x86 processor will do. I'm compiling
94 this on a 64 bit AMD Opteron).
96 Again, to assemble this you will need gcc and gas (the GNU assembler). The commands to
97 assemble and run the code (save this file as 'jonesforth.S') are:
99 gcc -m32 -nostdlib -static -Wl,-Ttext,0 -o jonesforth jonesforth.S
102 You will see lots of 'Warning: unterminated string; newline inserted' messages from the
103 assembler. That's just because the GNU assembler doesn't have a good syntax for multi-line
104 strings (or rather it used to, but the developers removed it!) so I've abused the syntax
105 slightly to make things readable. Ignore these warnings.
107 ASSEMBLER ----------------------------------------------------------------------
109 (You can just skip to the next section -- you don't need to be able to read assembler to
110 follow this tutorial).
112 However if you do want to read the assembly code here are a few notes about gas (the GNU assembler):
114 (1) Register names are prefixed with '%', so %eax is the 32 bit i386 accumulator. The registers
115 available on i386 are: %eax, %ebx, %ecx, %edx, %esi, %edi, %ebp and %esp, and most of them
116 have special purposes.
118 (2) Add, mov, etc. take arguments in the form SRC,DEST. So mov %eax,%ecx moves %eax -> %ecx
120 (3) Constants are prefixed with '$', and you mustn't forget it! If you forget it then it
121 causes a read from memory instead, so:
122 mov $2,%eax moves number 2 into %eax
123 mov 2,%eax reads the 32 bit word from address 2 into %eax (ie. most likely a mistake)
125 (4) gas has a funky syntax for local labels, where '1f' (etc.) means label '1:' "forwards"
126 and '1b' (etc.) means label '1:' "backwards".
128 (5) 'ja' is "jump if above", 'jb' for "jump if below", 'je' "jump if equal" etc.
130 (6) gas has a reasonably nice .macro syntax, and I use them a lot to make the code shorter and
133 For more help reading the assembler, do "info gas" at the Linux prompt.
135 Now the tutorial starts in earnest.
137 THE DICTIONARY ----------------------------------------------------------------------
139 In FORTH as you will know, functions are called "words", as just as in other languages they
140 have a name and a definition. Here are two FORTH words:
142 : DOUBLE DUP + ; \ name is "DOUBLE", definition is "DUP +"
143 : QUADRUPLE DOUBLE DOUBLE ; \ name is "QUADRUPLE", definition is "DOUBLE DOUBLE"
145 Words, both built-in ones and ones which the programmer defines later, are stored in a dictionary
146 which is just a linked list of dictionary entries.
148 <--- DICTIONARY ENTRY (HEADER) ----------------------->
149 +------------------------+--------+---------- - - - - +----------- - - - -
150 | LINK POINTER | LENGTH/| NAME | DEFINITION
152 +--- (4 bytes) ----------+- byte -+- n bytes - - - - +----------- - - - -
154 I'll come to the definition of the word later. For now just look at the header. The first
155 4 bytes are the link pointer. This points back to the previous word in the dictionary, or, for
156 the first word in the dictionary it is just a NULL pointer. Then comes a length/flags byte.
157 The length of the word can be up to 31 characters (5 bits used) and the top three bits are used
158 for various flags which I'll come to later. This is followed by the name itself, and in this
159 implementation the name is rounded up to a multiple of 4 bytes by padding it with zero bytes.
160 That's just to ensure that the definition starts on a 32 bit boundary.
162 A FORTH variable called LATEST contains a pointer to the most recently defined word, in
163 other words, the head of this linked list.
165 DOUBLE and QUADRUPLE might look like this:
167 pointer to previous word
170 +--|------+---+---+---+---+---+---+---+---+------------- - - - -
171 | LINK | 6 | D | O | U | B | L | E | 0 | (definition ...)
172 +---------+---+---+---+---+---+---+---+---+------------- - - - -
175 +--|------+---+---+---+---+---+---+---+---+---+---+---+---+------------- - - - -
176 | LINK | 9 | Q | U | A | D | R | U | P | L | E | 0 | 0 | (definition ...)
177 +---------+---+---+---+---+---+---+---+---+---+---+---+---+------------- - - - -
183 You shoud be able to see from this how you might implement functions to find a word in
184 the dictionary (just walk along the dictionary entries starting at LATEST and matching
185 the names until you either find a match or hit the NULL pointer at the end of the dictionary),
186 and add a word to the dictionary (create a new definition, set its LINK to LATEST, and set
187 LATEST to point to the new word). We'll see precisely these functions implemented in
188 assembly code later on.
190 One interesting consequence of using a linked list is that you can redefine words, and
191 a newer definition of a word overrides an older one. This is an important concept in
192 FORTH because it means that any word (even "built-in" or "standard" words) can be
193 overridden with a new definition, either to enhance it, to make it faster or even to
194 disable it. However because of the way that FORTH words get compiled, which you'll
195 understand below, words defined using the old definition of a word continue to use
196 the old definition. Only words defined after the new definition use the new definition.
198 DIRECT THREADED CODE ----------------------------------------------------------------------
200 Now we'll get to the really crucial bit in understanding FORTH, so go and get a cup of tea
201 or coffee and settle down. It's fair to say that if you don't understand this section, then you
202 won't "get" how FORTH works, and that would be a failure on my part for not explaining it well.
203 So if after reading this section a few times you don't understand it, please email me
206 Let's talk first about what "threaded code" means. Imagine a peculiar version of C where
207 you are only allowed to call functions without arguments. (Don't worry for now that such a
208 language would be completely useless!) So in our peculiar C, code would look like this:
217 and so on. How would a function, say 'f' above, be compiled by a standard C compiler?
218 Probably into assembly code like this. On the right hand side I've written the actual
222 CALL a E8 08 00 00 00
223 CALL b E8 1C 00 00 00
224 CALL c E8 2C 00 00 00
225 ; ignore the return from the function for now
227 "E8" is the x86 machine code to "CALL" a function. In the first 20 years of computing
228 memory was hideously expensive and we might have worried about the wasted space being used
229 by the repeated "E8" bytes. We can save 20% in code size (and therefore, in expensive memory)
230 by compressing this into just:
232 08 00 00 00 Just the function addresses, without
233 1C 00 00 00 the CALL prefix.
236 [Historical note: If the execution model that FORTH uses looks strange from the following
237 paragraphs, then it was motivated entirely by the need to save memory on early computers.
238 This code compression isn't so important now when our machines have more memory in their L1
239 caches than those early computers had in total, but the execution model still has some
242 Of course this code won't run directly any more. Instead we need to write an interpreter
243 which takes each pair of bytes and calls it.
245 On an i386 machine it turns out that we can write this interpreter rather easily, in just
246 two assembly instructions which turn into just 3 bytes of machine code. Let's store the
247 pointer to the next word to execute in the %esi register:
249 08 00 00 00 <- We're executing this one now. %esi is the _next_ one to execute.
253 The all-important x86 instruction is called LODSL (or in Intel manuals, LODSW). It does
254 two things. Firstly it reads the memory at %esi into the accumulator (%eax). Secondly it
255 increments %esi by 4 bytes. So after LODSL, the situation now looks like this:
257 08 00 00 00 <- We're still executing this one
258 1C 00 00 00 <- %eax now contains this address (0x0000001C)
261 Now we just need to jump to the address in %eax. This is again just a single x86 instruction
262 written JMP *(%eax). And after doing the jump, the situation looks like:
265 1C 00 00 00 <- Now we're executing this subroutine.
268 To make this work, each subroutine is followed by the two instructions 'LODSL; JMP *(%eax)'
269 which literally make the jump to the next subroutine.
271 And that brings us to our first piece of actual code! Well, it's a macro.
280 /* The macro is called NEXT. That's a FORTH-ism. It expands to those two instructions.
282 Every FORTH primitive that we write has to be ended by NEXT. Think of it kind of like
285 The above describes what is known as direct threaded code.
287 To sum up: We compress our function calls down to a list of addresses and use a somewhat
288 magical macro to act as a "jump to next function in the list". We also use one register (%esi)
289 to act as a kind of instruction pointer, pointing to the next function in the list.
291 I'll just give you a hint of what is to come by saying that a FORTH definition such as:
293 : QUADRUPLE DOUBLE DOUBLE ;
295 actually compiles (almost, not precisely but we'll see why in a moment) to a list of
296 function addresses for DOUBLE, DOUBLE and a special function called EXIT to finish off.
298 At this point, REALLY EAGLE-EYED ASSEMBLY EXPERTS are saying "JONES, YOU'VE MADE A MISTAKE!".
300 I lied about JMP *(%eax).
302 INDIRECT THREADED CODE ----------------------------------------------------------------------
304 It turns out that direct threaded code is interesting but only if you want to just execute
305 a list of functions written in assembly language. So QUADRUPLE would work only if DOUBLE
306 was an assembly language function. In the direct threaded code, QUADRUPLE would look like:
309 | addr of DOUBLE --------------------> (assembly code to do the double)
310 +------------------+ NEXT
311 %esi -> | addr of DOUBLE |
314 We can add an extra indirection to allow us to run both words written in assembly language
315 (primitives written for speed) and words written in FORTH themselves as lists of addresses.
317 The extra indirection is the reason for the brackets in JMP *(%eax).
319 Let's have a look at how QUADRUPLE and DOUBLE really look in FORTH:
321 : QUADRUPLE DOUBLE DOUBLE ;
324 | codeword | : DOUBLE DUP + ;
326 | addr of DOUBLE ---------------> +------------------+
327 +------------------+ | codeword |
328 | addr of DOUBLE | +------------------+
329 +------------------+ | addr of DUP --------------> +------------------+
330 | addr of EXIT | +------------------+ | codeword -------+
331 +------------------+ %esi -> | addr of + --------+ +------------------+ |
332 +------------------+ | | assembly to <-----+
333 | addr of EXIT | | | implement DUP |
334 +------------------+ | | .. |
337 | +------------------+
339 +-----> +------------------+
341 +------------------+ |
342 | assembly to <------+
349 This is the part where you may need an extra cup of tea/coffee/favourite caffeinated
350 beverage. What has changed is that I've added an extra pointer to the beginning of
351 the definitions. In FORTH this is sometimes called the "codeword". The codeword is
352 a pointer to the interpreter to run the function. For primitives written in
353 assembly language, the "interpreter" just points to the actual assembly code itself.
355 In words written in FORTH (like QUADRUPLE and DOUBLE), the codeword points to an interpreter
358 I'll show you the interpreter function shortly, but let's recall our indirect
359 JMP *(%eax) with the "extra" brackets. Take the case where we're executing DOUBLE
360 as shown, and DUP has been called. Note that %esi is pointing to the address of +.
362 The assembly code for DUP eventually does a NEXT. That:
364 (1) reads the address of + into %eax %eax points to the codeword of +
365 (2) increments %esi by 4
366 (3) jumps to the indirect %eax jumps to the address in the codeword of +,
367 ie. the assembly code to implement +
372 | addr of DOUBLE ---------------> +------------------+
373 +------------------+ | codeword |
374 | addr of DOUBLE | +------------------+
375 +------------------+ | addr of DUP --------------> +------------------+
376 | addr of EXIT | +------------------+ | codeword -------+
377 +------------------+ | addr of + --------+ +------------------+ |
378 +------------------+ | | assembly to <-----+
379 %esi -> | addr of EXIT | | | implement DUP |
380 +------------------+ | | .. |
383 | +------------------+
385 +-----> +------------------+
387 +------------------+ |
388 now we're | assembly to <------+
389 executing | implement + |
395 So I hope that I've convinced you that NEXT does roughly what you'd expect. This is
396 indirect threaded code.
398 I've glossed over four things. I wonder if you can guess without reading on what they are?
404 My list of four things are: (1) What does "EXIT" do? (2) which is related to (1) is how do
405 you call into a function, ie. how does %esi start off pointing at part of QUADRUPLE, but
406 then point at part of DOUBLE. (3) What goes in the codeword for the words which are written
407 in FORTH? (4) How do you compile a function which does anything except call other functions
408 ie. a function which contains a number like : DOUBLE 2 * ; ?
410 THE INTERPRETER AND RETURN STACK ------------------------------------------------------------
412 Going at these in no particular order, let's talk about issues (3) and (2), the interpreter
413 and the return stack.
415 Words which are defined in FORTH need a codeword which points to a little bit of code to
416 give them a "helping hand" in life. They don't need much, but they do need what is known
417 as an "interpreter", although it doesn't really "interpret" in the same way that, say,
418 Java bytecode used to be interpreted (ie. slowly). This interpreter just sets up a few
419 machine registers so that the word can then execute at full speed using the indirect
420 threaded model above.
422 One of the things that needs to happen when QUADRUPLE calls DOUBLE is that we save the old
423 %esi ("instruction pointer") and create a new one pointing to the first word in DOUBLE.
424 Because we will need to restore the old %esi at the end of DOUBLE (this is, after all, like
425 a function call), we will need a stack to store these "return addresses" (old values of %esi).
427 As you will have read, when reading the background documentation, FORTH has two stacks,
428 an ordinary stack for parameters, and a return stack which is a bit more mysterious. But
429 our return stack is just the stack I talked about in the previous paragraph, used to save
430 %esi when calling from a FORTH word into another FORTH word.
432 In this FORTH, we are using the normal stack pointer (%esp) for the parameter stack.
433 We will use the i386's "other" stack pointer (%ebp, usually called the "frame pointer")
434 for our return stack.
436 I've got two macros which just wrap up the details of using %ebp for the return stack:
439 /* Macros to deal with the return stack. */
441 lea -4(%ebp),%ebp // push reg on to return stack
446 mov (%ebp),\reg // pop top of return stack to reg
451 And with that we can now talk about the interpreter.
453 In FORTH the interpreter function is often called DOCOL (I think it means "DO COLON" because
454 all FORTH definitions start with a colon, as in : DOUBLE DUP + ;
456 The "interpreter" (it's not really "interpreting") just needs to push the old %esi on the
457 stack and set %esi to the first word in the definition. Remember that we jumped to the
458 function using JMP *(%eax)? Well a consequence of that is that conveniently %eax contains
459 the address of this codeword, so just by adding 4 to it we get the address of the first
460 data word. Finally after setting up %esi, it just does NEXT which causes that first word
464 /* DOCOL - the interpreter! */
468 PUSHRSP %esi // push %esi on to the return stack
469 addl $4,%eax // %eax points to codeword, so make
470 movl %eax,%esi // %esi point to first data word
474 Just to make this absolutely clear, let's see how DOCOL works when jumping from QUADRUPLE
480 +------------------+ DOUBLE:
481 | addr of DOUBLE ---------------> +------------------+
482 +------------------+ %eax -> | addr of DOCOL |
483 %esi -> | addr of DOUBLE | +------------------+
484 +------------------+ | addr of DUP -------------->
485 | addr of EXIT | +------------------+
486 +------------------+ | etc. |
488 First, the call to DOUBLE causes DOCOL (the codeword of DOUBLE). DOCOL does this: It
489 pushes the old %esi on the return stack. %eax points to the codeword of DOUBLE, so we
490 just add 4 on to it to get our new %esi:
495 +------------------+ DOUBLE:
496 | addr of DOUBLE ---------------> +------------------+
497 +------------------+ | addr of DOCOL |
498 | addr of DOUBLE | +------------------+
499 +------------------+ %esi -> | addr of DUP -------------->
500 | addr of EXIT | +------------------+
501 +------------------+ | etc. |
503 Then we do NEXT, and because of the magic of threaded code that increments %esi again
506 Well, it seems to work.
508 One minor point here. Because DOCOL is the first bit of assembly actually to be defined
509 in this file (the others were just macros), and because I usually compile this code with the
510 text segment starting at address 0, DOCOL has address 0. So if you are disassembling the
511 code and see a word with a codeword of 0, you will immediately know that the word is
512 written in FORTH (it's not an assembler primitive) and so uses DOCOL as the interpreter.
518 /* ELF entry point. */
523 mov %esp,var_S0 // Store the initial data stack pointer.
524 mov $return_stack,%ebp // Initialise the return stack.
526 mov $cold_start,%esi // Initialise interpreter.
527 NEXT // Run interpreter!
530 cold_start: // High-level code without a codeword.
533 /*----------------------------------------------------------------------
534 * Fixed sized buffers for everything.
538 /* FORTH return stack. */
539 #define RETURN_STACK_SIZE 8192
541 .space RETURN_STACK_SIZE
544 /* Space for user-defined words. */
545 #define USER_DEFS_SIZE 16384
548 .space USER_DEFS_SIZE
555 /*----------------------------------------------------------------------
556 * Built-in words defined the long way.
559 #define F_HIDDEN 0x20
561 // Store the chain of links.
564 .macro defcode name, namelen, flags=0, label
570 .set link,name_\label
571 .byte \flags+\namelen // flags + length byte
572 .ascii "\name" // the name
576 .int code_\label // codeword
580 code_\label : // assembler code follows
583 .macro defword name, namelen, flags=0, label
589 .set link,name_\label
590 .byte \flags+\namelen // flags + length byte
591 .ascii "\name" // the name
595 .int DOCOL // codeword - the interpreter
596 // list of word pointers follow
599 .macro defvar name, namelen, flags=0, label, initial=0
600 defcode \name,\namelen,\flags,\label
609 // Some easy ones, written in assembly for speed
610 defcode "DROP",4,,DROP
611 pop %eax // drop top of stack
615 pop %eax // duplicate top of stack
620 defcode "SWAP",4,,SWAP
621 pop %eax // swap top of stack
627 defcode "OVER",4,,OVER
628 mov 4(%esp),%eax // get the second element of stack
629 push %eax // and push it on top
641 defcode "-ROT",4,,NROT
651 incl (%esp) // increment top of stack
655 decl (%esp) // decrement top of stack
658 defcode "4+",2,,INCR4
659 addl $4,(%esp) // increment top of stack
662 defcode "4-",2,,DECR4
663 subl $4,(%esp) // decrement top of stack
680 push %eax // ignore overflow
688 push %eax // push quotient
696 push %edx // push remainder
699 defcode "=",1,,EQU // top two words are equal?
709 defcode "<>",2,,NEQU // top two words are not equal?
719 defcode "0=",2,,ZEQU // top of stack equals 0?
738 defcode "INVERT",6,,INVERT
742 // COLD must not return (ie. must not call EXIT).
743 defword "COLD",4,,COLD
744 // XXX reinitialisation of the interpreter
745 .int INTERPRETER // call the interpreter loop (never returns)
746 .int LIT,1,SYSEXIT // hmmm, but in case it does, exit(1).
748 defcode "EXIT",4,,EXIT
749 POPRSP %esi // pop return stack into %esi
753 // %esi points to the next command, but in this case it points to the next
754 // literal 32 bit integer. Get that literal into %eax and increment %esi.
755 // On x86, it's a convenient single byte instruction! (cf. NEXT macro)
757 push %eax // push the literal number on to stack
760 defcode "LITSTRING",9,,LITSTRING
761 lodsl // get the length of the string
762 push %eax // push it on the stack
763 push %esi // push the address of the start of the string
764 addl %eax,%esi // skip past the string
765 addl $3,%esi // but round up to next 4 byte boundary
769 defcode "BRANCH",6,,BRANCH
770 add (%esi),%esi // add the offset to the instruction pointer
773 defcode "0BRANCH",7,,ZBRANCH
775 test %eax,%eax // top of stack is zero?
776 jz code_BRANCH // if so, jump back to the branch function above
777 lodsl // otherwise we need to skip the offset
781 pop %ebx // address to store at
782 pop %eax // data to store there
783 mov %eax,(%ebx) // store it
787 pop %ebx // address to fetch
788 mov (%ebx),%eax // fetch it
789 push %eax // push value onto stack
792 defcode "+!",2,,ADDSTORE
794 pop %eax // the amount to add
795 addl %eax,(%ebx) // add it
798 defcode "-!",2,,SUBSTORE
800 pop %eax // the amount to subtract
801 subl %eax,(%ebx) // add it
804 /* ! and @ (STORE and FETCH) store 32-bit words. It's also useful to be able to read and write bytes.
805 * I don't know whether FORTH has these words, so I invented my own, called !b and @b.
806 * Byte-oriented operations only work on architectures which permit them (i386 is one of those).
807 * UPDATE: writing a byte to the dictionary pointer is called C, in FORTH.
809 defcode "!b",2,,STOREBYTE
810 pop %ebx // address to store at
811 pop %eax // data to store there
812 movb %al,(%ebx) // store it
815 defcode "@b",2,,FETCHBYTE
816 pop %ebx // address to fetch
818 movb (%ebx),%al // fetch it
819 push %eax // push value onto stack
822 // The STATE variable is 0 for execute mode, != 0 for compile mode
823 defvar "STATE",5,,STATE
825 // This points to where compiled words go.
826 defvar "HERE",4,,HERE,user_defs_start
828 // This is the last definition in the dictionary.
829 defvar "LATEST",6,,LATEST,name_SYSEXIT // SYSEXIT must be last in built-in dictionary
831 // _X, _Y and _Z are scratch variables used by standard words.
836 // This stores the top of the data stack.
839 // This stores the top of the return stack.
840 defvar "R0",2,,RZ,return_stack
842 defcode "DSP@",4,,DSPFETCH
847 defcode "DSP!",4,,DSPSTORE
852 pop %eax // pop parameter stack into %eax
853 PUSHRSP %eax // push it on to the return stack
856 defcode "R>",2,,FROMR
857 POPRSP %eax // pop return stack on to %eax
858 push %eax // and push on to parameter stack
861 defcode "RSP@",4,,RSPFETCH
865 defcode "RSP!",4,,RSPSTORE
869 defcode "RDROP",5,,RDROP
870 lea 4(%ebp),%ebp // pop return stack and throw away
873 #include <asm-i386/unistd.h>
877 push %eax // push return value on stack
889 1: // out of input; use read(2) to fetch more input from stdin
890 xor %ebx,%ebx // 1st param: stdin
891 mov $buffer,%ecx // 2nd param: buffer
893 mov $buffend-buffer,%edx // 3rd param: max length
894 mov $__NR_read,%eax // syscall: read
896 test %eax,%eax // If %eax <= 0, then exit.
898 addl %eax,%ecx // buffer+%eax = bufftop
902 2: // error or out of input: exit
904 mov $__NR_exit,%eax // syscall: exit
907 defcode "EMIT",4,,EMIT
912 mov $1,%ebx // 1st param: stdout
914 // write needs the address of the byte to write
916 mov $2f,%ecx // 2nd param: address
918 mov $1,%edx // 3rd param: nbytes = 1
920 mov $__NR_write,%eax // write syscall
925 2: .space 1 // scratch used by EMIT
927 defcode "WORD",4,,WORD
929 push %ecx // push length
930 push %edi // push base address
934 /* Search for first non-blank character. Also skip \ comments. */
936 call _KEY // get next key, returned in %eax
937 cmpb $'\\',%al // start of a comment?
938 je 3f // if so, skip the comment
940 jbe 1b // if so, keep looking
942 /* Search for the end of the word, storing chars as we go. */
943 mov $5f,%edi // pointer to return buffer
945 stosb // add character to return buffer
946 call _KEY // get next key, returned in %al
947 cmpb $' ',%al // is blank?
948 ja 2b // if not, keep looping
950 /* Return the word (well, the static buffer) and length. */
952 mov %edi,%ecx // return length of the word
953 mov $5f,%edi // return address of the word
956 /* Code to skip \ comments to end of the current line. */
959 cmpb $'\n',%al // end of line yet?
964 // A static buffer where WORD returns. Subsequent calls
965 // overwrite this buffer. Maximum word length is 32 chars.
968 defcode "EMITSTRING",10,,EMITSTRING
969 mov $1,%ebx // 1st param: stdout
970 pop %ecx // 2nd param: address of string
971 pop %edx // 3rd param: length of string
973 mov $__NR_write,%eax // write syscall
979 pop %eax // Get the number to print into %eax
980 call _DOT // Easier to do this recursively ...
983 mov $10,%ecx // Base 10
987 xor %edx,%edx // %edx:%eax / %ecx -> quotient %eax, remainder %edx
1001 // Parse a number from a string on the stack -- almost the opposite of . (DOT)
1002 // Note that there is absolutely no error checking. In particular the length of the
1003 // string must be >= 1 bytes.
1004 defcode "SNUMBER",7,,SNUMBER
1014 imull $10,%eax // %eax *= 10
1017 subb $'0',%bl // ASCII -> digit
1023 defcode "FIND",4,,FIND
1024 pop %edi // %edi = address
1025 pop %ecx // %ecx = length
1031 push %esi // Save %esi so we can use it in string comparison.
1033 // Now we start searching backwards through the dictionary for this word.
1034 mov var_LATEST,%edx // LATEST points to name header of the latest word in the dictionary
1036 test %edx,%edx // NULL pointer? (end of the linked list)
1039 // Compare the length expected and the length of the word.
1040 // Note that if the F_HIDDEN flag is set on the word, then by a bit of trickery
1041 // this won't pick the word (the length will appear to be wrong).
1043 movb 4(%edx),%al // %al = flags+length field
1044 andb $(F_HIDDEN|0x1f),%al // %al = name length
1045 cmpb %cl,%al // Length is the same?
1048 // Compare the strings in detail.
1049 push %ecx // Save the length
1050 push %edi // Save the address (repe cmpsb will move this pointer)
1051 lea 5(%edx),%esi // Dictionary string we are checking against.
1052 repe cmpsb // Compare the strings.
1055 jne 2f // Not the same.
1057 // The strings are the same - return the header pointer in %eax
1063 mov (%edx),%edx // Move back through the link field to the previous word
1064 jmp 1b // .. and loop.
1068 xor %eax,%eax // Return zero to indicate not found.
1071 defcode ">CFA",4,,TCFA // DEA -> Codeword address
1078 add $4,%edi // Skip link pointer.
1079 movb (%edi),%al // Load flags+len into %al.
1080 inc %edi // Skip flags+len byte.
1081 andb $0x1f,%al // Just the length, not the flags.
1082 add %eax,%edi // Skip the name.
1083 addl $3,%edi // The codeword is 4-byte aligned.
1087 defcode "CHAR",4,,CHAR
1088 call _WORD // Returns %ecx = length, %edi = pointer to word.
1090 movb (%edi),%al // Get the first character of the word.
1091 push %eax // Push it onto the stack.
1094 defcode ":",1,,COLON
1096 // Get the word and create a dictionary entry header for it.
1097 call _WORD // Returns %ecx = length, %edi = pointer to word.
1098 mov %edi,%ebx // %ebx = address of the word
1100 movl var_HERE,%edi // %edi is the address of the header
1101 movl var_LATEST,%eax // Get link pointer
1102 stosl // and store it in the header.
1104 mov %cl,%al // Get the length.
1105 orb $F_HIDDEN,%al // Set the HIDDEN flag on this entry.
1106 stosb // Store the length/flags byte.
1108 mov %ebx,%esi // %esi = word
1109 rep movsb // Copy the word
1111 addl $3,%edi // Align to next 4 byte boundary.
1114 movl $DOCOL,%eax // The codeword for user-created words is always DOCOL (the interpreter)
1117 // Header built, so now update LATEST and HERE.
1118 // We'll be compiling words and putting them HERE.
1120 movl %eax,var_LATEST
1123 // And go into compile mode by setting STATE to 1.
1127 defcode ",",1,,COMMA
1128 pop %eax // Code pointer to store.
1132 movl var_HERE,%edi // HERE
1134 movl %edi,var_HERE // Update HERE (incremented)
1137 defcode "HIDDEN",6,,HIDDEN
1141 movl var_LATEST,%edi // LATEST word.
1142 addl $4,%edi // Point to name/flags byte.
1143 xorb $F_HIDDEN,(%edi) // Toggle the HIDDEN bit.
1146 defcode "IMMEDIATE",9,F_IMMED,IMMEDIATE
1150 movl var_LATEST,%edi // LATEST word.
1151 addl $4,%edi // Point to name/flags byte.
1152 xorb $F_IMMED,(%edi) // Toggle the IMMED bit.
1155 defcode ";",1,F_IMMED,SEMICOLON
1156 movl $EXIT,%eax // EXIT is the final codeword in compiled words.
1157 call _COMMA // Store it.
1158 call _HIDDEN // Toggle the HIDDEN flag (unhides the new word).
1159 xor %eax,%eax // Set STATE to 0 (back to execute mode).
1163 /* This definiton of ' (TICK) is strictly cheating - it also only works in compiled code. */
1165 lodsl // Get the address of the next word and skip it.
1166 pushl %eax // Push it on the stack.
1169 /* This interpreter is pretty simple, but remember that in FORTH you can always override
1170 * it later with a more powerful one!
1172 defword "INTERPRETER",11,,INTERPRETER
1173 .int INTERPRET,RDROP,INTERPRETER
1175 defcode "INTERPRET",9,,INTERPRET
1176 call _WORD // Returns %ecx = length, %edi = pointer to word.
1178 // Is it in the dictionary?
1180 movl %eax,interpret_is_lit // Not a literal number (not yet anyway ...)
1181 call _FIND // Returns %eax = pointer to header or 0 if not found.
1182 test %eax,%eax // Found?
1185 // In the dictionary. Is it an IMMEDIATE codeword?
1186 mov %eax,%edi // %edi = dictionary entry
1187 movb 4(%edi),%al // Get name+flags.
1188 push %ax // Just save it for now.
1189 call _TCFA // Convert dictionary entry (in %edi) to codeword pointer.
1191 andb $F_IMMED,%al // Is IMMED flag set?
1193 jnz 4f // If IMMED, jump straight to executing.
1197 1: // Not in the dictionary (not a word) so assume it's a literal number.
1198 incl interpret_is_lit
1199 call _SNUMBER // Returns the parsed number in %eax
1201 mov $LIT,%eax // The word is LIT
1203 2: // Are we compiling or executing?
1206 jz 4f // Jump if executing.
1208 // Compiling - just append the word to the current dictionary definition.
1210 mov interpret_is_lit,%ecx // Was it a literal?
1213 mov %ebx,%eax // Yes, so LIT is followed by a number.
1217 4: // Executing - run it!
1218 mov interpret_is_lit,%ecx // Literal?
1219 test %ecx,%ecx // Literal?
1222 // Not a literal, execute it now. This never returns, but the codeword will
1223 // eventually call NEXT which will reenter the loop in INTERPRETER.
1226 5: // Executing a literal, which means push it on the stack.
1233 .int 0 // Flag used to record if reading a literal
1235 // NB: SYSEXIT must be the last entry in the built-in dictionary.
1236 defcode SYSEXIT,7,,SYSEXIT
1241 /*----------------------------------------------------------------------
1242 * Input buffer & initial input.
1247 // XXX gives 'Warning: unterminated string; newline inserted' messages which you can ignore
1249 \\ Define some character constants
1255 \\ CR prints a carriage return
1258 \\ SPACE prints a space
1259 : SPACE 'SPACE' EMIT ;
1261 \\ Primitive . (DOT) function doesn't follow with a blank, so redefine it to behave like FORTH.
1262 \\ Notice how we can trivially redefine existing functions.
1265 \\ DUP, DROP are defined in assembly for speed, but this is how you might define them
1266 \\ in FORTH. Notice use of the scratch variables _X and _Y.
1267 \\ : DUP _X ! _X @ _X @ ;
1270 \\ The 2... versions of the standard operators work on pairs of stack entries. They're not used
1271 \\ very commonly so not really worth writing in assembler. Here is how they are defined in FORTH.
1275 \\ More standard FORTH words.
1279 \\ [ and ] allow you to break into immediate mode while compiling a word.
1280 : [ IMMEDIATE \\ define [ as an immediate word
1281 0 STATE ! \\ go into immediate mode
1285 1 STATE ! \\ go back to compile mode
1288 \\ LITERAL takes whatever is on the stack and compiles LIT <foo>
1290 ' LIT , \\ compile LIT
1291 , \\ compile the literal itself (from the stack)
1294 \\ condition IF true-part THEN rest
1296 \\ condition 0BRANCH OFFSET true-part rest
1297 \\ where OFFSET is the offset of 'rest'
1298 \\ condition IF true-part ELSE false-part THEN
1300 \\ condition 0BRANCH OFFSET true-part BRANCH OFFSET2 false-part rest
1301 \\ where OFFSET if the offset of false-part and OFFSET2 is the offset of rest
1303 \\ IF is an IMMEDIATE word which compiles 0BRANCH followed by a dummy offset, and places
1304 \\ the address of the 0BRANCH on the stack. Later when we see THEN, we pop that address
1305 \\ off the stack, calculate the offset, and back-fill the offset.
1307 ' 0BRANCH , \\ compile 0BRANCH
1308 HERE @ \\ save location of the offset on the stack
1309 0 , \\ compile a dummy offset
1314 HERE @ SWAP - \\ calculate the offset from the address saved on the stack
1315 SWAP ! \\ store the offset in the back-filled location
1319 ' BRANCH , \\ definite branch to just over the false-part
1320 HERE @ \\ save location of the offset on the stack
1321 0 , \\ compile a dummy offset
1322 SWAP \\ now back-fill the original (IF) offset
1323 DUP \\ same as for THEN word above
1328 \\ BEGIN loop-part condition UNTIL
1330 \\ loop-part condition 0BRANCH OFFSET
1331 \\ where OFFSET points back to the loop-part
1332 \\ This is like do { loop-part } while (condition) in the C language
1334 HERE @ \\ save location on the stack
1338 ' 0BRANCH , \\ compile 0BRANCH
1339 HERE @ - \\ calculate the offset from the address saved on the stack
1340 , \\ compile the offset here
1343 \\ BEGIN loop-part AGAIN
1345 \\ loop-part BRANCH OFFSET
1346 \\ where OFFSET points back to the loop-part
1347 \\ In other words, an infinite loop which can only be returned from with EXIT
1349 ' BRANCH , \\ compile BRANCH
1350 HERE @ - \\ calculate the offset back
1351 , \\ compile the offset here
1354 \\ BEGIN condition WHILE loop-part REPEAT
1356 \\ condition 0BRANCH OFFSET2 loop-part BRANCH OFFSET
1357 \\ where OFFSET points back to condition (the beginning) and OFFSET2 points to after the whole piece of code
1358 \\ So this is like a while (condition) { loop-part } loop in the C language
1360 ' 0BRANCH , \\ compile 0BRANCH
1361 HERE @ \\ save location of the offset2 on the stack
1362 0 , \\ compile a dummy offset2
1366 ' BRANCH , \\ compile BRANCH
1367 SWAP \\ get the original offset (from BEGIN)
1368 HERE @ - , \\ and compile it after BRANCH
1370 HERE @ SWAP - \\ calculate the offset2
1371 SWAP ! \\ and back-fill it in the original location
1374 \\ With the looping constructs, we can now write SPACES, which writes n spaces to stdout.
1377 SPACE \\ print a space
1378 1- \\ until we count down to 0
1383 \\ .S prints the contents of the stack. Very useful for debugging.
1385 DSP@ \\ get current stack pointer
1387 DUP @ . \\ print the stack element
1389 DUP S0 @ 4- = \\ stop when we get to the top
1394 \\ DEPTH returns the depth of the stack.
1395 : DEPTH S0 @ DSP@ - ;
1397 \\ .\" is the print string operator in FORTH. Example: .\" Something to print\"
1398 \\ The space after the operator is the ordinary space required between words.
1399 \\ This is tricky to define because it has to do different things depending on whether
1400 \\ we are compiling or in immediate mode. (Thus the word is marked IMMEDIATE so it can
1401 \\ detect this and do different things).
1402 \\ In immediate mode we just keep reading characters and printing them until we get to
1403 \\ the next double quote.
1404 \\ In compile mode we have the problem of where we're going to store the string (remember
1405 \\ that the input buffer where the string comes from may be overwritten by the time we
1406 \\ come round to running the function). We store the string in the compiled function
1408 \\ LITSTRING, string length, string rounded up to 4 bytes, EMITSTRING, ...
1410 STATE @ \\ compiling?
1412 ' LITSTRING , \\ compile LITSTRING
1413 HERE @ \\ save the address of the length word on the stack
1414 0 , \\ dummy length - we don't know what it is yet
1416 KEY \\ get next character of the string
1419 HERE @ !b \\ store the character in the compiled image
1420 1 HERE +! \\ increment HERE pointer by 1 byte
1422 DROP \\ drop the double quote character at the end
1423 DUP \\ get the saved address of the length word
1424 HERE @ SWAP - \\ calculate the length
1425 4- \\ subtract 4 (because we measured from the start of the length word)
1426 SWAP ! \\ and back-fill the length location
1427 HERE @ \\ round up to next multiple of 4 bytes for the remaining code
1431 ' EMITSTRING , \\ compile the final EMITSTRING
1433 \\ In immediate mode, just read characters and print them until we get
1434 \\ to the ending double quote. Much simpler than the above code!
1437 DUP '\"' = IF EXIT THEN
1443 \\ While compiling, [COMPILE] WORD compiles WORD if it would otherwise be IMMEDIATE.
1444 : [COMPILE] IMMEDIATE
1445 WORD \\ get the next word
1446 FIND \\ find it in the dictionary
1447 >CFA \\ get its codeword
1448 , \\ and compile that
1451 \\ RECURSE makes a recursive call to the current word that is being compiled.
1452 \\ Normally while a word is being compiled, it is marked HIDDEN so that references to the
1453 \\ same word within are calls to the previous definition of the word.
1455 LATEST @ >CFA \\ LATEST points to the word being compiled at the moment
1459 \\ ALLOT is used to allocate (static) memory when compiling. It increases HERE by
1460 \\ the amount given on the stack.
1464 \\ Finally print the welcome prompt.