1 /* A sometimes minimal FORTH compiler and tutorial for Linux / i386 systems. -*- asm -*-
2 By Richard W.M. Jones <rich@annexia.org> http://annexia.org/forth
4 gcc -m32 -nostdlib -static -Wl,-Ttext,0 -o jonesforth jonesforth.S
6 INTRODUCTION ----------------------------------------------------------------------
8 FORTH is one of those alien languages which most working programmers regard in the same
9 way as Haskell, LISP, and so on. Something so strange that they'd rather any thoughts
10 of it just go away so they can get on with writing this paying code. But that's wrong
11 and if you care at all about programming then you should at least understand all these
12 languages, even if you will never use them.
14 LISP is the ultimate high-level language, and features from LISP are being added every
15 decade to the more common languages. But FORTH is in some ways the ultimate in low level
16 programming. Out of the box it lacks features like dynamic memory management and even
17 strings. In fact, at its primitive level it lacks even basic concepts like IF-statements
20 Why then would you want to learn FORTH? There are several very good reasons. First
21 and foremost, FORTH is minimal. You really can write a complete FORTH in, say, 2000
22 lines of code. I don't just mean a FORTH program, I mean a complete FORTH operating
23 system, environment and language. You could boot such a FORTH on a bare PC and it would
24 come up with a prompt where you could start doing useful work. The FORTH you have here
25 isn't minimal and uses a Linux process as its 'base PC' (both for the purposes of making
26 it a good tutorial). It's possible to completely understand the system. Who can say they
27 completely understand how Linux works, or gcc?
29 Secondly FORTH has a peculiar bootstrapping property. By that I mean that after writing
30 a little bit of assembly to talk to the hardware and implement a few primitives, all the
31 rest of the language and compiler is written in FORTH itself. Remember I said before
32 that FORTH lacked IF-statements and loops? Well of course it doesn't really because
33 such a lanuage would be useless, but my point was rather that IF-statements and loops are
34 written in FORTH itself.
36 Now of course this is common in other languages as well, and in those languages we call
37 them 'libraries'. For example in C, 'printf' is a library function written in C. But
38 in FORTH this goes way beyond mere libraries. Can you imagine writing C's 'if' in C?
39 And that brings me to my third reason: If you can write 'if' in FORTH, then why restrict
40 yourself to the usual if/while/for/switch constructs? You want a construct that iterates
41 over every other element in a list of numbers? You can add it to the language. What
42 about an operator which pulls in variables directly from a configuration file and makes
43 them available as FORTH variables? Or how about adding Makefile-like dependencies to
44 the language? No problem in FORTH. This concept isn't common in programming languages,
45 but it has a name (in fact two names): "macros" (by which I mean LISP-style macros, not
46 the lame C preprocessor) and "domain specific languages" (DSLs).
48 This tutorial isn't about learning FORTH as the language. I'll point you to some references
49 you should read if you're not familiar with using FORTH. This tutorial is about how to
50 write FORTH. In fact, until you understand how FORTH is written, you'll have only a very
51 superficial understanding of how to use it.
53 So if you're not familiar with FORTH or want to refresh your memory here are some online
56 http://en.wikipedia.org/wiki/Forth_%28programming_language%29
58 http://galileo.phys.virginia.edu/classes/551.jvn.fall01/primer.htm
60 http://wiki.laptop.org/go/Forth_Lessons
62 Here is another "Why FORTH?" essay: http://www.jwdt.com/~paysan/why-forth.html
64 ACKNOWLEDGEMENTS ----------------------------------------------------------------------
66 This code draws heavily on the design of LINA FORTH (http://home.hccnet.nl/a.w.m.van.der.horst/lina.html)
67 by Albert van der Horst. Any similarities in the code are probably not accidental.
69 SETTING UP ----------------------------------------------------------------------
71 Let's get a few housekeeping things out of the way. Firstly because I need to draw lots of
72 ASCII-art diagrams to explain concepts, the best way to look at this is using a window which
73 uses a fixed width font and is at least this wide:
75 <------------------------------------------------------------------------------------------------------------------------>
77 Secondly make sure TABS are set to 8 characters. The following should be a vertical
78 line. If not, sort out your tabs.
84 Thirdly I assume that your screen is at least 50 characters high.
86 ASSEMBLING ----------------------------------------------------------------------
88 If you want to actually run this FORTH, rather than just read it, you will need Linux on an
89 i386. Linux because instead of programming directly to the hardware on a bare PC which I
90 could have done, I went for a simpler tutorial by assuming that the 'hardware' is a Linux
91 process with a few basic system calls (read, write and exit and that's about all). i386
92 is needed because I had to write the assembly for a processor, and i386 is by far the most
93 common. (Of course when I say 'i386', any 32- or 64-bit x86 processor will do. I'm compiling
94 this on a 64 bit AMD Opteron).
96 Again, to assemble this you will need gcc and gas (the GNU assembler). The commands to
97 assemble and run the code (save this file as 'jonesforth.S') are:
99 gcc -m32 -nostdlib -static -Wl,-Ttext,0 -o jonesforth jonesforth.S
102 You will see lots of 'Warning: unterminated string; newline inserted' messages from the
103 assembler. That's just because the GNU assembler doesn't have a good syntax for multi-line
104 strings (or rather it used to, but the developers removed it!) so I've abused the syntax
105 slightly to make things readable. Ignore these warnings.
107 ASSEMBLER ----------------------------------------------------------------------
109 (You can just skip to the next section -- you don't need to be able to read assembler to
110 follow this tutorial).
112 However if you do want to read the assembly code here are a few notes about gas (the GNU assembler):
114 (1) Register names are prefixed with '%', so %eax is the 32 bit i386 accumulator. The registers
115 available on i386 are: %eax, %ebx, %ecx, %edx, %esi, %edi, %ebp and %esp, and most of them
116 have special purposes.
118 (2) Add, mov, etc. take arguments in the form SRC,DEST. So mov %eax,%ecx moves %eax -> %ecx
120 (3) Constants are prefixed with '$', and you mustn't forget it! If you forget it then it
121 causes a read from memory instead, so:
122 mov $2,%eax moves number 2 into %eax
123 mov 2,%eax reads the 32 bit word from address 2 into %eax (ie. most likely a mistake)
125 (4) gas has a funky syntax for local labels, where '1f' (etc.) means label '1:' "forwards"
126 and '1b' (etc.) means label '1:' "backwards".
128 (5) 'ja' is "jump if above", 'jb' for "jump if below", 'je' "jump if equal" etc.
130 (6) gas has a reasonably nice .macro syntax, and I use them a lot to make the code shorter and
133 For more help reading the assembler, do "info gas" at the Linux prompt.
135 Now the tutorial starts in earnest.
137 THE DICTIONARY ----------------------------------------------------------------------
139 In FORTH as you will know, functions are called "words", as just as in other languages they
140 have a name and a definition. Here are two FORTH words:
142 : DOUBLE DUP + ; \ name is "DOUBLE", definition is "DUP +"
143 : QUADRUPLE DOUBLE DOUBLE ; \ name is "QUADRUPLE", definition is "DOUBLE DOUBLE"
145 Words, both built-in ones and ones which the programmer defines later, are stored in a dictionary
146 which is just a linked list of dictionary entries.
148 <--- DICTIONARY ENTRY (HEADER) ----------------------->
149 +------------------------+--------+---------- - - - - +----------- - - - -
150 | LINK POINTER | LENGTH/| NAME | DEFINITION
152 +--- (4 bytes) ----------+- byte -+- n bytes - - - - +----------- - - - -
154 I'll come to the definition of the word later. For now just look at the header. The first
155 4 bytes are the link pointer. This points back to the previous word in the dictionary, or, for
156 the first word in the dictionary it is just a NULL pointer. Then comes a length/flags byte.
157 The length of the word can be up to 31 characters (5 bits used) and the top three bits are used
158 for various flags which I'll come to later. This is followed by the name itself, and in this
159 implementation the name is rounded up to a multiple of 4 bytes by padding it with zero bytes.
160 That's just to ensure that the definition starts on a 32 bit boundary.
162 A FORTH variable called LATEST contains a pointer to the most recently defined word, in
163 other words, the head of this linked list.
165 DOUBLE and QUADRUPLE might look like this:
167 pointer to previous word
170 +--|------+---+---+---+---+---+---+---+---+------------- - - - -
171 | LINK | 6 | D | O | U | B | L | E | 0 | (definition ...)
172 +---------+---+---+---+---+---+---+---+---+------------- - - - -
175 +--|------+---+---+---+---+---+---+---+---+---+---+---+---+------------- - - - -
176 | LINK | 9 | Q | U | A | D | R | U | P | L | E | 0 | 0 | (definition ...)
177 +---------+---+---+---+---+---+---+---+---+---+---+---+---+------------- - - - -
183 You shoud be able to see from this how you might implement functions to find a word in
184 the dictionary (just walk along the dictionary entries starting at LATEST and matching
185 the names until you either find a match or hit the NULL pointer at the end of the dictionary),
186 and add a word to the dictionary (create a new definition, set its LINK to LATEST, and set
187 LATEST to point to the new word). We'll see precisely these functions implemented in
188 assembly code later on.
190 One interesting consequence of using a linked list is that you can redefine words, and
191 a newer definition of a word overrides an older one. This is an important concept in
192 FORTH because it means that any word (even "built-in" or "standard" words) can be
193 overridden with a new definition, either to enhance it, to make it faster or even to
194 disable it. However because of the way that FORTH words get compiled, which you'll
195 understand below, words defined using the old definition of a word continue to use
196 the old definition. Only words defined after the new definition use the new definition.
198 DIRECT THREADED CODE ----------------------------------------------------------------------
200 Now we'll get to the really crucial bit in understanding FORTH, so go and get a cup of tea
201 or coffee and settle down. It's fair to say that if you don't understand this section, then you
202 won't "get" how FORTH works, and that would be a failure on my part for not explaining it well.
203 So if after reading this section a few times you don't understand it, please email me
206 Let's talk first about what "threaded code" means. Imagine a peculiar version of C where
207 you are only allowed to call functions without arguments. (Don't worry for now that such a
208 language would be completely useless!) So in our peculiar C, code would look like this:
217 and so on. How would a function, say 'f' above, be compiled by a standard C compiler?
218 Probably into assembly code like this. On the right hand side I've written the actual
222 CALL a E8 08 00 00 00
223 CALL b E8 1C 00 00 00
224 CALL c E8 2C 00 00 00
225 ; ignore the return from the function for now
227 "E8" is the x86 machine code to "CALL" a function. In the first 20 years of computing
228 memory was hideously expensive and we might have worried about the wasted space being used
229 by the repeated "E8" bytes. We can save 20% in code size (and therefore, in expensive memory)
230 by compressing this into just:
232 08 00 00 00 Just the function addresses, without
233 1C 00 00 00 the CALL prefix.
236 [Historical note: If the execution model that FORTH uses looks strange from the following
237 paragraphs, then it was motivated entirely by the need to save memory on early computers.
238 This code compression isn't so important now when our machines have more memory in their L1
239 caches than those early computers had in total, but the execution model still has some
242 Of course this code won't run directly any more. Instead we need to write an interpreter
243 which takes each pair of bytes and calls it.
245 On an i386 machine it turns out that we can write this interpreter rather easily, in just
246 two assembly instructions which turn into just 3 bytes of machine code. Let's store the
247 pointer to the next word to execute in the %esi register:
249 08 00 00 00 <- We're executing this one now. %esi is the _next_ one to execute.
253 The all-important x86 instruction is called LODSL (or in Intel manuals, LODSW). It does
254 two things. Firstly it reads the memory at %esi into the accumulator (%eax). Secondly it
255 increments %esi by 4 bytes. So after LODSL, the situation now looks like this:
257 08 00 00 00 <- We're still executing this one
258 1C 00 00 00 <- %eax now contains this address (0x0000001C)
261 Now we just need to jump to the address in %eax. This is again just a single x86 instruction
262 written JMP *(%eax). And after doing the jump, the situation looks like:
265 1C 00 00 00 <- Now we're executing this subroutine.
268 To make this work, each subroutine is followed by the two instructions 'LODSL; JMP *(%eax)'
269 which literally make the jump to the next subroutine.
271 And that brings us to our first piece of actual code! Well, it's a macro.
280 /* The macro is called NEXT. That's a FORTH-ism. It expands to those two instructions.
282 Every FORTH primitive that we write has to be ended by NEXT. Think of it kind of like
285 The above describes what is known as direct threaded code.
287 To sum up: We compress our function calls down to a list of addresses and use a somewhat
288 magical macro to act as a "jump to next function in the list". We also use one register (%esi)
289 to act as a kind of instruction pointer, pointing to the next function in the list.
291 I'll just give you a hint of what is to come by saying that a FORTH definition such as:
293 : QUADRUPLE DOUBLE DOUBLE ;
295 actually compiles (almost, not precisely but we'll see why in a moment) to a list of
296 function addresses for DOUBLE, DOUBLE and a special function called EXIT to finish off.
298 At this point, REALLY EAGLE-EYED ASSEMBLY EXPERTS are saying "JONES, YOU'VE MADE A MISTAKE!".
300 I lied about JMP *(%eax).
302 INDIRECT THREADED CODE ----------------------------------------------------------------------
304 It turns out that direct threaded code is interesting but only if you want to just execute
305 a list of functions written in assembly language. So QUADRUPLE would work only if DOUBLE
306 was an assembly language function. In the direct threaded code, QUADRUPLE would look like:
309 | addr of DOUBLE --------------------> (assembly code to do the double)
310 +------------------+ NEXT
311 %esi -> | addr of DOUBLE |
314 We can add an extra indirection to allow us to run both words written in assembly language
315 (primitives written for speed) and words written in FORTH themselves as lists of addresses.
317 The extra indirection is the reason for the brackets in JMP *(%eax).
319 Let's have a look at how QUADRUPLE and DOUBLE really look in FORTH:
321 : QUADRUPLE DOUBLE DOUBLE ;
324 | codeword | : DOUBLE DUP + ;
326 | addr of DOUBLE ---------------> +------------------+
327 +------------------+ | codeword |
328 | addr of DOUBLE | +------------------+
329 +------------------+ | addr of DUP --------------> +------------------+
330 | addr of EXIT | +------------------+ | codeword -------+
331 +------------------+ %esi -> | addr of + --------+ +------------------+ |
332 +------------------+ | | assembly to <-----+
333 | addr of EXIT | | | implement DUP |
334 +------------------+ | | .. |
337 | +------------------+
339 +-----> +------------------+
341 +------------------+ |
342 | assembly to <------+
349 This is the part where you may need an extra cup of tea/coffee/favourite caffeinated
350 beverage. What has changed is that I've added an extra pointer to the beginning of
351 the definitions. In FORTH this is sometimes called the "codeword". The codeword is
352 a pointer to the interpreter to run the function. For primitives written in
353 assembly language, the "interpreter" just points to the actual assembly code itself.
354 They don't need interpreting, they just run.
356 In words written in FORTH (like QUADRUPLE and DOUBLE), the codeword points to an interpreter
359 I'll show you the interpreter function shortly, but let's recall our indirect
360 JMP *(%eax) with the "extra" brackets. Take the case where we're executing DOUBLE
361 as shown, and DUP has been called. Note that %esi is pointing to the address of +
363 The assembly code for DUP eventually does a NEXT. That:
365 (1) reads the address of + into %eax %eax points to the codeword of +
366 (2) increments %esi by 4
367 (3) jumps to the indirect %eax jumps to the address in the codeword of +,
368 ie. the assembly code to implement +
373 | addr of DOUBLE ---------------> +------------------+
374 +------------------+ | codeword |
375 | addr of DOUBLE | +------------------+
376 +------------------+ | addr of DUP --------------> +------------------+
377 | addr of EXIT | +------------------+ | codeword -------+
378 +------------------+ | addr of + --------+ +------------------+ |
379 +------------------+ | | assembly to <-----+
380 %esi -> | addr of EXIT | | | implement DUP |
381 +------------------+ | | .. |
384 | +------------------+
386 +-----> +------------------+
388 +------------------+ |
389 now we're | assembly to <------+
390 executing | implement + |
396 So I hope that I've convinced you that NEXT does roughly what you'd expect. This is
397 indirect threaded code.
399 I've glossed over four things. I wonder if you can guess without reading on what they are?
405 My list of four things are: (1) What does "EXIT" do? (2) which is related to (1) is how do
406 you call into a function, ie. how does %esi start off pointing at part of QUADRUPLE, but
407 then point at part of DOUBLE. (3) What goes in the codeword for the words which are written
408 in FORTH? (4) How do you compile a function which does anything except call other functions
409 ie. a function which contains a number like : DOUBLE 2 * ; ?
411 THE INTERPRETER AND RETURN STACK ------------------------------------------------------------
413 Going at these in no particular order, let's talk about issues (3) and (2), the interpreter
414 and the return stack.
416 Words which are defined in FORTH need a codeword which points to a little bit of code to
417 give them a "helping hand" in life. They don't need much, but they do need what is known
418 as an "interpreter", although it doesn't really "interpret" in the same way that, say,
419 Java bytecode used to be interpreted (ie. slowly). This interpreter just sets up a few
420 machine registers so that the word can then execute at full speed using the indirect
421 threaded model above.
423 One of the things that needs to happen when QUADRUPLE calls DOUBLE is that we save the old
424 %esi ("instruction pointer") and create a new one pointing to the first word in DOUBLE.
425 Because we will need to restore the old %esi at the end of DOUBLE (this is, after all, like
426 a function call), we will need a stack to store these "return addresses" (old values of %esi).
428 As you will have read, when reading the background documentation, FORTH has two stacks,
429 an ordinary stack for parameters, and a return stack which is a bit more mysterious. But
430 our return stack is just the stack I talked about in the previous paragraph, used to save
431 %esi when calling from a FORTH word into another FORTH word.
433 In this FORTH, we are using the normal stack pointer (%esp) for the parameter stack.
434 We will use the i386's "other" stack pointer (%ebp, usually called the "frame pointer")
435 for our return stack.
437 I've got two macros which just wrap up the details of using %ebp for the return stack.
438 You use them as for example "PUSHRSP %eax" (push %eax on the return stack) or "POPRSP %ebx"
439 (pop top of return stack into %ebx).
442 /* Macros to deal with the return stack. */
444 lea -4(%ebp),%ebp // push reg on to return stack
449 mov (%ebp),\reg // pop top of return stack to reg
454 And with that we can now talk about the interpreter.
456 In FORTH the interpreter function is often called DOCOL (I think it means "DO COLON" because
457 all FORTH definitions start with a colon, as in : DOUBLE DUP + ;
459 The "interpreter" (it's not really "interpreting") just needs to push the old %esi on the
460 stack and set %esi to the first word in the definition. Remember that we jumped to the
461 function using JMP *(%eax)? Well a consequence of that is that conveniently %eax contains
462 the address of this codeword, so just by adding 4 to it we get the address of the first
463 data word. Finally after setting up %esi, it just does NEXT which causes that first word
467 /* DOCOL - the interpreter! */
471 PUSHRSP %esi // push %esi on to the return stack
472 addl $4,%eax // %eax points to codeword, so make
473 movl %eax,%esi // %esi point to first data word
477 Just to make this absolutely clear, let's see how DOCOL works when jumping from QUADRUPLE
483 +------------------+ DOUBLE:
484 | addr of DOUBLE ---------------> +------------------+
485 +------------------+ %eax -> | addr of DOCOL |
486 %esi -> | addr of DOUBLE | +------------------+
487 +------------------+ | addr of DUP -------------->
488 | addr of EXIT | +------------------+
489 +------------------+ | etc. |
491 First, the call to DOUBLE causes DOCOL (the codeword of DOUBLE). DOCOL does this: It
492 pushes the old %esi on the return stack. %eax points to the codeword of DOUBLE, so we
493 just add 4 on to it to get our new %esi:
498 +------------------+ DOUBLE:
499 | addr of DOUBLE ---------------> +------------------+
500 top of return +------------------+ %eax -> | addr of DOCOL |
501 stack points -> | addr of DOUBLE | + 4 = +------------------+
502 +------------------+ %esi -> | addr of DUP -------------->
503 | addr of EXIT | +------------------+
504 +------------------+ | etc. |
506 Then we do NEXT, and because of the magic of threaded code that increments %esi again
509 Well, it seems to work.
511 One minor point here. Because DOCOL is the first bit of assembly actually to be defined
512 in this file (the others were just macros), and because I usually compile this code with the
513 text segment starting at address 0, DOCOL has address 0. So if you are disassembling the
514 code and see a word with a codeword of 0, you will immediately know that the word is
515 written in FORTH (it's not an assembler primitive) and so uses DOCOL as the interpreter.
517 STARTING UP ----------------------------------------------------------------------
519 Now let's get down to nuts and bolts. When we start the program we need to set up
520 a few things like the return stack. But as soon as we can, we want to jump into FORTH
521 code (albeit much of the "early" FORTH code will still need to be written as
522 assembly language primitives).
524 This is what the set up code does. Does a tiny bit of house-keeping, sets up the
525 separate return stack (NB: Linux gives us the ordinary parameter stack already), then
526 immediately jumps to a FORTH word called COLD. COLD stands for cold-start. In ISO
527 FORTH (but not in this FORTH), COLD can be called at any time to completely reset
528 the state of FORTH, and there is another word called WARM which does a partial reset.
531 /* ELF entry point. */
536 mov %esp,var_S0 // Store the initial data stack pointer.
537 mov $return_stack,%ebp // Initialise the return stack.
539 mov $cold_start,%esi // Initialise interpreter.
540 NEXT // Run interpreter!
543 cold_start: // High-level code without a codeword.
547 We also allocate some space for the return stack and some space to store user
548 definitions. These are static memory allocations using fixed-size buffers, but it
549 wouldn't be a great deal of work to make them dynamic.
553 /* FORTH return stack. */
554 #define RETURN_STACK_SIZE 8192
556 .space RETURN_STACK_SIZE
557 return_stack: // Initial top of return stack.
559 /* Space for user-defined words. */
560 #define USER_DEFS_SIZE 16384
563 .space USER_DEFS_SIZE
566 BUILT-IN WORDS ----------------------------------------------------------------------
568 Remember our dictionary entries (headers). Let's bring those together with the codeword
569 and data words to see how : DOUBLE DUP + ; really looks in memory.
571 pointer to previous word
574 +--|------+---+---+---+---+---+---+---+---+------------+------------+------------+------------+
575 | LINK | 6 | D | O | U | B | L | E | 0 | DOCOL | DUP | + | EXIT |
576 +---------+---+---+---+---+---+---+---+---+------------+--|---------+------------+------------+
579 LINK in next word points to codeword of DUP
581 Initially we can't just write ": DOUBLE DUP + ;" (ie. that literal string) here because we
582 don't yet have anything to read the string, break it up at spaces, parse each word, etc. etc.
583 So instead we will have to define built-in words using the GNU assembler data constructors
584 (like .int, .byte, .string, .ascii and so on -- look them up in the gas info page if you are
587 The long way would be:
588 .int <link to previous word>
590 .ascii "DOUBLE" // string
592 DOUBLE: .int DOCOL // codeword
593 .int DUP // pointer to codeword of DUP
594 .int PLUS // pointer to codeword of +
595 .int EXIT // pointer to codeword of EXIT
597 That's going to get quite tedious rather quickly, so here I define an assembler macro
598 so that I can just write:
600 defword "DOUBLE",6,,DOUBLE
603 and I'll get exactly the same effect.
605 Don't worry too much about the exact implementation details of this macro - it's complicated!
608 // Store the chain of links.
611 .macro defword name, namelen, flags=0, label
617 .set link,name_\label
618 .byte \flags+\namelen // flags + length byte
619 .ascii "\name" // the name
623 .int DOCOL // codeword - the interpreter
624 // list of word pointers follow
628 Similarly I want a way to write words written in assembly language. There will quite a few
629 of these to start with because, well, everything has to start in assembly before there's
630 enough "infrastructure" to be able to start writing FORTH words, but also I want to define
631 some common FORTH words in assembly language for speed, even though I could write them in FORTH.
633 This is what DUP looks like in memory:
635 pointer to previous word
638 +--|------+---+---+---+---+------------+
639 | LINK | 3 | D | U | P | code_DUP ---------------------> points to the assembly
640 +---------+---+---+---+---+------------+ code used to write DUP,
641 ^ len codeword which is ended with NEXT.
645 Again, for brevity in writing the header I'm going to use an assembler macro called defcode.
648 .macro defcode name, namelen, flags=0, label
654 .set link,name_\label
655 .byte \flags+\namelen // flags + length byte
656 .ascii "\name" // the name
660 .int code_\label // codeword
664 code_\label : // assembler code follows
668 Now some easy FORTH primitives. These are written in assembly for speed. If you understand
669 i386 assembly language then it is worth reading these. However if you don't understand assembly
670 you can skip the details.
674 pop %eax // duplicate top of stack
679 defcode "DROP",4,,DROP
680 pop %eax // drop top of stack
683 defcode "SWAP",4,,SWAP
684 pop %eax // swap top of stack
690 defcode "OVER",4,,OVER
691 mov 4(%esp),%eax // get the second element of stack
692 push %eax // and push it on top
704 defcode "-ROT",4,,NROT
714 incl (%esp) // increment top of stack
718 decl (%esp) // decrement top of stack
721 defcode "4+",2,,INCR4
722 addl $4,(%esp) // increment top of stack
725 defcode "4-",2,,DECR4
726 subl $4,(%esp) // decrement top of stack
743 push %eax // ignore overflow
751 push %eax // push quotient
759 push %edx // push remainder
762 defcode "=",1,,EQU // top two words are equal?
772 defcode "<>",2,,NEQU // top two words are not equal?
782 defcode "0=",2,,ZEQU // top of stack equals 0?
801 defcode "INVERT",6,,INVERT
807 #define F_HIDDEN 0x20
809 // COLD must not return (ie. must not call EXIT).
810 defword "COLD",4,,COLD
811 // XXX reinitialisation of the interpreter
812 .int INTERPRETER // call the interpreter loop (never returns)
813 .int LIT,1,SYSEXIT // hmmm, but in case it does, exit(1).
815 defcode "EXIT",4,,EXIT
816 POPRSP %esi // pop return stack into %esi
820 // %esi points to the next command, but in this case it points to the next
821 // literal 32 bit integer. Get that literal into %eax and increment %esi.
822 // On x86, it's a convenient single byte instruction! (cf. NEXT macro)
824 push %eax // push the literal number on to stack
827 defcode "LITSTRING",9,,LITSTRING
828 lodsl // get the length of the string
829 push %eax // push it on the stack
830 push %esi // push the address of the start of the string
831 addl %eax,%esi // skip past the string
832 addl $3,%esi // but round up to next 4 byte boundary
836 defcode "BRANCH",6,,BRANCH
837 add (%esi),%esi // add the offset to the instruction pointer
840 defcode "0BRANCH",7,,ZBRANCH
842 test %eax,%eax // top of stack is zero?
843 jz code_BRANCH // if so, jump back to the branch function above
844 lodsl // otherwise we need to skip the offset
848 pop %ebx // address to store at
849 pop %eax // data to store there
850 mov %eax,(%ebx) // store it
854 pop %ebx // address to fetch
855 mov (%ebx),%eax // fetch it
856 push %eax // push value onto stack
859 defcode "+!",2,,ADDSTORE
861 pop %eax // the amount to add
862 addl %eax,(%ebx) // add it
865 defcode "-!",2,,SUBSTORE
867 pop %eax // the amount to subtract
868 subl %eax,(%ebx) // add it
871 /* ! and @ (STORE and FETCH) store 32-bit words. It's also useful to be able to read and write bytes.
872 * I don't know whether FORTH has these words, so I invented my own, called !b and @b.
873 * Byte-oriented operations only work on architectures which permit them (i386 is one of those).
874 * UPDATE: writing a byte to the dictionary pointer is called C, in FORTH.
876 defcode "!b",2,,STOREBYTE
877 pop %ebx // address to store at
878 pop %eax // data to store there
879 movb %al,(%ebx) // store it
882 defcode "@b",2,,FETCHBYTE
883 pop %ebx // address to fetch
885 movb (%ebx),%al // fetch it
886 push %eax // push value onto stack
889 .macro defvar name, namelen, flags=0, label, initial=0
890 defcode \name,\namelen,\flags,\label
899 // The STATE variable is 0 for execute mode, != 0 for compile mode
900 defvar "STATE",5,,STATE
902 // This points to where compiled words go.
903 defvar "HERE",4,,HERE,user_defs_start
905 // This is the last definition in the dictionary.
906 defvar "LATEST",6,,LATEST,name_SYSEXIT // SYSEXIT must be last in built-in dictionary
908 // _X, _Y and _Z are scratch variables used by standard words.
913 // This stores the top of the data stack.
916 // This stores the top of the return stack.
917 defvar "R0",2,,RZ,return_stack
919 defcode "DSP@",4,,DSPFETCH
924 defcode "DSP!",4,,DSPSTORE
929 pop %eax // pop parameter stack into %eax
930 PUSHRSP %eax // push it on to the return stack
933 defcode "R>",2,,FROMR
934 POPRSP %eax // pop return stack on to %eax
935 push %eax // and push on to parameter stack
938 defcode "RSP@",4,,RSPFETCH
942 defcode "RSP!",4,,RSPSTORE
946 defcode "RDROP",5,,RDROP
947 lea 4(%ebp),%ebp // pop return stack and throw away
950 #include <asm-i386/unistd.h>
954 push %eax // push return value on stack
966 1: // out of input; use read(2) to fetch more input from stdin
967 xor %ebx,%ebx // 1st param: stdin
968 mov $buffer,%ecx // 2nd param: buffer
970 mov $buffend-buffer,%edx // 3rd param: max length
971 mov $__NR_read,%eax // syscall: read
973 test %eax,%eax // If %eax <= 0, then exit.
975 addl %eax,%ecx // buffer+%eax = bufftop
979 2: // error or out of input: exit
981 mov $__NR_exit,%eax // syscall: exit
984 defcode "EMIT",4,,EMIT
989 mov $1,%ebx // 1st param: stdout
991 // write needs the address of the byte to write
993 mov $2f,%ecx // 2nd param: address
995 mov $1,%edx // 3rd param: nbytes = 1
997 mov $__NR_write,%eax // write syscall
1002 2: .space 1 // scratch used by EMIT
1004 defcode "WORD",4,,WORD
1006 push %ecx // push length
1007 push %edi // push base address
1011 /* Search for first non-blank character. Also skip \ comments. */
1013 call _KEY // get next key, returned in %eax
1014 cmpb $'\\',%al // start of a comment?
1015 je 3f // if so, skip the comment
1017 jbe 1b // if so, keep looking
1019 /* Search for the end of the word, storing chars as we go. */
1020 mov $5f,%edi // pointer to return buffer
1022 stosb // add character to return buffer
1023 call _KEY // get next key, returned in %al
1024 cmpb $' ',%al // is blank?
1025 ja 2b // if not, keep looping
1027 /* Return the word (well, the static buffer) and length. */
1029 mov %edi,%ecx // return length of the word
1030 mov $5f,%edi // return address of the word
1033 /* Code to skip \ comments to end of the current line. */
1036 cmpb $'\n',%al // end of line yet?
1041 // A static buffer where WORD returns. Subsequent calls
1042 // overwrite this buffer. Maximum word length is 32 chars.
1045 defcode "EMITSTRING",10,,EMITSTRING
1046 mov $1,%ebx // 1st param: stdout
1047 pop %ecx // 2nd param: address of string
1048 pop %edx // 3rd param: length of string
1050 mov $__NR_write,%eax // write syscall
1056 pop %eax // Get the number to print into %eax
1057 call _DOT // Easier to do this recursively ...
1060 mov $10,%ecx // Base 10
1064 xor %edx,%edx // %edx:%eax / %ecx -> quotient %eax, remainder %edx
1078 // Parse a number from a string on the stack -- almost the opposite of . (DOT)
1079 // Note that there is absolutely no error checking. In particular the length of the
1080 // string must be >= 1 bytes.
1081 defcode "SNUMBER",7,,SNUMBER
1091 imull $10,%eax // %eax *= 10
1094 subb $'0',%bl // ASCII -> digit
1100 defcode "FIND",4,,FIND
1101 pop %edi // %edi = address
1102 pop %ecx // %ecx = length
1108 push %esi // Save %esi so we can use it in string comparison.
1110 // Now we start searching backwards through the dictionary for this word.
1111 mov var_LATEST,%edx // LATEST points to name header of the latest word in the dictionary
1113 test %edx,%edx // NULL pointer? (end of the linked list)
1116 // Compare the length expected and the length of the word.
1117 // Note that if the F_HIDDEN flag is set on the word, then by a bit of trickery
1118 // this won't pick the word (the length will appear to be wrong).
1120 movb 4(%edx),%al // %al = flags+length field
1121 andb $(F_HIDDEN|0x1f),%al // %al = name length
1122 cmpb %cl,%al // Length is the same?
1125 // Compare the strings in detail.
1126 push %ecx // Save the length
1127 push %edi // Save the address (repe cmpsb will move this pointer)
1128 lea 5(%edx),%esi // Dictionary string we are checking against.
1129 repe cmpsb // Compare the strings.
1132 jne 2f // Not the same.
1134 // The strings are the same - return the header pointer in %eax
1140 mov (%edx),%edx // Move back through the link field to the previous word
1141 jmp 1b // .. and loop.
1145 xor %eax,%eax // Return zero to indicate not found.
1148 defcode ">CFA",4,,TCFA // DEA -> Codeword address
1155 add $4,%edi // Skip link pointer.
1156 movb (%edi),%al // Load flags+len into %al.
1157 inc %edi // Skip flags+len byte.
1158 andb $0x1f,%al // Just the length, not the flags.
1159 add %eax,%edi // Skip the name.
1160 addl $3,%edi // The codeword is 4-byte aligned.
1164 defcode "CHAR",4,,CHAR
1165 call _WORD // Returns %ecx = length, %edi = pointer to word.
1167 movb (%edi),%al // Get the first character of the word.
1168 push %eax // Push it onto the stack.
1171 defcode ":",1,,COLON
1173 // Get the word and create a dictionary entry header for it.
1174 call _WORD // Returns %ecx = length, %edi = pointer to word.
1175 mov %edi,%ebx // %ebx = address of the word
1177 movl var_HERE,%edi // %edi is the address of the header
1178 movl var_LATEST,%eax // Get link pointer
1179 stosl // and store it in the header.
1181 mov %cl,%al // Get the length.
1182 orb $F_HIDDEN,%al // Set the HIDDEN flag on this entry.
1183 stosb // Store the length/flags byte.
1185 mov %ebx,%esi // %esi = word
1186 rep movsb // Copy the word
1188 addl $3,%edi // Align to next 4 byte boundary.
1191 movl $DOCOL,%eax // The codeword for user-created words is always DOCOL (the interpreter)
1194 // Header built, so now update LATEST and HERE.
1195 // We'll be compiling words and putting them HERE.
1197 movl %eax,var_LATEST
1200 // And go into compile mode by setting STATE to 1.
1204 defcode ",",1,,COMMA
1205 pop %eax // Code pointer to store.
1209 movl var_HERE,%edi // HERE
1211 movl %edi,var_HERE // Update HERE (incremented)
1214 defcode "HIDDEN",6,,HIDDEN
1218 movl var_LATEST,%edi // LATEST word.
1219 addl $4,%edi // Point to name/flags byte.
1220 xorb $F_HIDDEN,(%edi) // Toggle the HIDDEN bit.
1223 defcode "IMMEDIATE",9,F_IMMED,IMMEDIATE
1227 movl var_LATEST,%edi // LATEST word.
1228 addl $4,%edi // Point to name/flags byte.
1229 xorb $F_IMMED,(%edi) // Toggle the IMMED bit.
1232 defcode ";",1,F_IMMED,SEMICOLON
1233 movl $EXIT,%eax // EXIT is the final codeword in compiled words.
1234 call _COMMA // Store it.
1235 call _HIDDEN // Toggle the HIDDEN flag (unhides the new word).
1236 xor %eax,%eax // Set STATE to 0 (back to execute mode).
1240 /* This definiton of ' (TICK) is strictly cheating - it also only works in compiled code. */
1242 lodsl // Get the address of the next word and skip it.
1243 pushl %eax // Push it on the stack.
1246 /* This interpreter is pretty simple, but remember that in FORTH you can always override
1247 * it later with a more powerful one!
1249 defword "INTERPRETER",11,,INTERPRETER
1250 .int INTERPRET,RDROP,INTERPRETER
1252 defcode "INTERPRET",9,,INTERPRET
1253 call _WORD // Returns %ecx = length, %edi = pointer to word.
1255 // Is it in the dictionary?
1257 movl %eax,interpret_is_lit // Not a literal number (not yet anyway ...)
1258 call _FIND // Returns %eax = pointer to header or 0 if not found.
1259 test %eax,%eax // Found?
1262 // In the dictionary. Is it an IMMEDIATE codeword?
1263 mov %eax,%edi // %edi = dictionary entry
1264 movb 4(%edi),%al // Get name+flags.
1265 push %ax // Just save it for now.
1266 call _TCFA // Convert dictionary entry (in %edi) to codeword pointer.
1268 andb $F_IMMED,%al // Is IMMED flag set?
1270 jnz 4f // If IMMED, jump straight to executing.
1274 1: // Not in the dictionary (not a word) so assume it's a literal number.
1275 incl interpret_is_lit
1276 call _SNUMBER // Returns the parsed number in %eax
1278 mov $LIT,%eax // The word is LIT
1280 2: // Are we compiling or executing?
1283 jz 4f // Jump if executing.
1285 // Compiling - just append the word to the current dictionary definition.
1287 mov interpret_is_lit,%ecx // Was it a literal?
1290 mov %ebx,%eax // Yes, so LIT is followed by a number.
1294 4: // Executing - run it!
1295 mov interpret_is_lit,%ecx // Literal?
1296 test %ecx,%ecx // Literal?
1299 // Not a literal, execute it now. This never returns, but the codeword will
1300 // eventually call NEXT which will reenter the loop in INTERPRETER.
1303 5: // Executing a literal, which means push it on the stack.
1310 .int 0 // Flag used to record if reading a literal
1312 // NB: SYSEXIT must be the last entry in the built-in dictionary.
1313 defcode SYSEXIT,7,,SYSEXIT
1318 /*----------------------------------------------------------------------
1319 * Input buffer & initial input.
1324 // XXX gives 'Warning: unterminated string; newline inserted' messages which you can ignore
1326 \\ Define some character constants
1332 \\ CR prints a carriage return
1335 \\ SPACE prints a space
1336 : SPACE 'SPACE' EMIT ;
1338 \\ Primitive . (DOT) function doesn't follow with a blank, so redefine it to behave like FORTH.
1339 \\ Notice how we can trivially redefine existing functions.
1342 \\ DUP, DROP are defined in assembly for speed, but this is how you might define them
1343 \\ in FORTH. Notice use of the scratch variables _X and _Y.
1344 \\ : DUP _X ! _X @ _X @ ;
1347 \\ The 2... versions of the standard operators work on pairs of stack entries. They're not used
1348 \\ very commonly so not really worth writing in assembler. Here is how they are defined in FORTH.
1352 \\ More standard FORTH words.
1356 \\ [ and ] allow you to break into immediate mode while compiling a word.
1357 : [ IMMEDIATE \\ define [ as an immediate word
1358 0 STATE ! \\ go into immediate mode
1362 1 STATE ! \\ go back to compile mode
1365 \\ LITERAL takes whatever is on the stack and compiles LIT <foo>
1367 ' LIT , \\ compile LIT
1368 , \\ compile the literal itself (from the stack)
1371 \\ condition IF true-part THEN rest
1373 \\ condition 0BRANCH OFFSET true-part rest
1374 \\ where OFFSET is the offset of 'rest'
1375 \\ condition IF true-part ELSE false-part THEN
1377 \\ condition 0BRANCH OFFSET true-part BRANCH OFFSET2 false-part rest
1378 \\ where OFFSET if the offset of false-part and OFFSET2 is the offset of rest
1380 \\ IF is an IMMEDIATE word which compiles 0BRANCH followed by a dummy offset, and places
1381 \\ the address of the 0BRANCH on the stack. Later when we see THEN, we pop that address
1382 \\ off the stack, calculate the offset, and back-fill the offset.
1384 ' 0BRANCH , \\ compile 0BRANCH
1385 HERE @ \\ save location of the offset on the stack
1386 0 , \\ compile a dummy offset
1391 HERE @ SWAP - \\ calculate the offset from the address saved on the stack
1392 SWAP ! \\ store the offset in the back-filled location
1396 ' BRANCH , \\ definite branch to just over the false-part
1397 HERE @ \\ save location of the offset on the stack
1398 0 , \\ compile a dummy offset
1399 SWAP \\ now back-fill the original (IF) offset
1400 DUP \\ same as for THEN word above
1405 \\ BEGIN loop-part condition UNTIL
1407 \\ loop-part condition 0BRANCH OFFSET
1408 \\ where OFFSET points back to the loop-part
1409 \\ This is like do { loop-part } while (condition) in the C language
1411 HERE @ \\ save location on the stack
1415 ' 0BRANCH , \\ compile 0BRANCH
1416 HERE @ - \\ calculate the offset from the address saved on the stack
1417 , \\ compile the offset here
1420 \\ BEGIN loop-part AGAIN
1422 \\ loop-part BRANCH OFFSET
1423 \\ where OFFSET points back to the loop-part
1424 \\ In other words, an infinite loop which can only be returned from with EXIT
1426 ' BRANCH , \\ compile BRANCH
1427 HERE @ - \\ calculate the offset back
1428 , \\ compile the offset here
1431 \\ BEGIN condition WHILE loop-part REPEAT
1433 \\ condition 0BRANCH OFFSET2 loop-part BRANCH OFFSET
1434 \\ where OFFSET points back to condition (the beginning) and OFFSET2 points to after the whole piece of code
1435 \\ So this is like a while (condition) { loop-part } loop in the C language
1437 ' 0BRANCH , \\ compile 0BRANCH
1438 HERE @ \\ save location of the offset2 on the stack
1439 0 , \\ compile a dummy offset2
1443 ' BRANCH , \\ compile BRANCH
1444 SWAP \\ get the original offset (from BEGIN)
1445 HERE @ - , \\ and compile it after BRANCH
1447 HERE @ SWAP - \\ calculate the offset2
1448 SWAP ! \\ and back-fill it in the original location
1451 \\ With the looping constructs, we can now write SPACES, which writes n spaces to stdout.
1454 SPACE \\ print a space
1455 1- \\ until we count down to 0
1460 \\ .S prints the contents of the stack. Very useful for debugging.
1462 DSP@ \\ get current stack pointer
1464 DUP @ . \\ print the stack element
1466 DUP S0 @ 4- = \\ stop when we get to the top
1471 \\ DEPTH returns the depth of the stack.
1472 : DEPTH S0 @ DSP@ - ;
1474 \\ .\" is the print string operator in FORTH. Example: .\" Something to print\"
1475 \\ The space after the operator is the ordinary space required between words.
1476 \\ This is tricky to define because it has to do different things depending on whether
1477 \\ we are compiling or in immediate mode. (Thus the word is marked IMMEDIATE so it can
1478 \\ detect this and do different things).
1479 \\ In immediate mode we just keep reading characters and printing them until we get to
1480 \\ the next double quote.
1481 \\ In compile mode we have the problem of where we're going to store the string (remember
1482 \\ that the input buffer where the string comes from may be overwritten by the time we
1483 \\ come round to running the function). We store the string in the compiled function
1485 \\ LITSTRING, string length, string rounded up to 4 bytes, EMITSTRING, ...
1487 STATE @ \\ compiling?
1489 ' LITSTRING , \\ compile LITSTRING
1490 HERE @ \\ save the address of the length word on the stack
1491 0 , \\ dummy length - we don't know what it is yet
1493 KEY \\ get next character of the string
1496 HERE @ !b \\ store the character in the compiled image
1497 1 HERE +! \\ increment HERE pointer by 1 byte
1499 DROP \\ drop the double quote character at the end
1500 DUP \\ get the saved address of the length word
1501 HERE @ SWAP - \\ calculate the length
1502 4- \\ subtract 4 (because we measured from the start of the length word)
1503 SWAP ! \\ and back-fill the length location
1504 HERE @ \\ round up to next multiple of 4 bytes for the remaining code
1508 ' EMITSTRING , \\ compile the final EMITSTRING
1510 \\ In immediate mode, just read characters and print them until we get
1511 \\ to the ending double quote. Much simpler than the above code!
1514 DUP '\"' = IF EXIT THEN
1520 \\ While compiling, [COMPILE] WORD compiles WORD if it would otherwise be IMMEDIATE.
1521 : [COMPILE] IMMEDIATE
1522 WORD \\ get the next word
1523 FIND \\ find it in the dictionary
1524 >CFA \\ get its codeword
1525 , \\ and compile that
1528 \\ RECURSE makes a recursive call to the current word that is being compiled.
1529 \\ Normally while a word is being compiled, it is marked HIDDEN so that references to the
1530 \\ same word within are calls to the previous definition of the word.
1532 LATEST @ >CFA \\ LATEST points to the word being compiled at the moment
1536 \\ ALLOT is used to allocate (static) memory when compiling. It increases HERE by
1537 \\ the amount given on the stack.
1541 \\ Finally print the welcome prompt.