1 /* A sometimes minimal FORTH compiler and tutorial for Linux / i386 systems. -*- asm -*-
2 By Richard W.M. Jones <rich@annexia.org> http://annexia.org/forth
3 This is PUBLIC DOMAIN (see public domain release statement below).
4 $Id: jonesforth.S,v 1.42 2007-10-07 11:07:15 rich Exp $
6 gcc -m32 -nostdlib -static -Wl,-Ttext,0 -Wl,--build-id=none -o jonesforth jonesforth.S
10 INTRODUCTION ----------------------------------------------------------------------
12 FORTH is one of those alien languages which most working programmers regard in the same
13 way as Haskell, LISP, and so on. Something so strange that they'd rather any thoughts
14 of it just go away so they can get on with writing this paying code. But that's wrong
15 and if you care at all about programming then you should at least understand all these
16 languages, even if you will never use them.
18 LISP is the ultimate high-level language, and features from LISP are being added every
19 decade to the more common languages. But FORTH is in some ways the ultimate in low level
20 programming. Out of the box it lacks features like dynamic memory management and even
21 strings. In fact, at its primitive level it lacks even basic concepts like IF-statements
24 Why then would you want to learn FORTH? There are several very good reasons. First
25 and foremost, FORTH is minimal. You really can write a complete FORTH in, say, 2000
26 lines of code. I don't just mean a FORTH program, I mean a complete FORTH operating
27 system, environment and language. You could boot such a FORTH on a bare PC and it would
28 come up with a prompt where you could start doing useful work. The FORTH you have here
29 isn't minimal and uses a Linux process as its 'base PC' (both for the purposes of making
30 it a good tutorial). It's possible to completely understand the system. Who can say they
31 completely understand how Linux works, or gcc?
33 Secondly FORTH has a peculiar bootstrapping property. By that I mean that after writing
34 a little bit of assembly to talk to the hardware and implement a few primitives, all the
35 rest of the language and compiler is written in FORTH itself. Remember I said before
36 that FORTH lacked IF-statements and loops? Well of course it doesn't really because
37 such a lanuage would be useless, but my point was rather that IF-statements and loops are
38 written in FORTH itself.
40 Now of course this is common in other languages as well, and in those languages we call
41 them 'libraries'. For example in C, 'printf' is a library function written in C. But
42 in FORTH this goes way beyond mere libraries. Can you imagine writing C's 'if' in C?
43 And that brings me to my third reason: If you can write 'if' in FORTH, then why restrict
44 yourself to the usual if/while/for/switch constructs? You want a construct that iterates
45 over every other element in a list of numbers? You can add it to the language. What
46 about an operator which pulls in variables directly from a configuration file and makes
47 them available as FORTH variables? Or how about adding Makefile-like dependencies to
48 the language? No problem in FORTH. How about modifying the FORTH compiler to allow
49 complex inlining strategies -- simple. This concept isn't common in programming languages,
50 but it has a name (in fact two names): "macros" (by which I mean LISP-style macros, not
51 the lame C preprocessor) and "domain specific languages" (DSLs).
53 This tutorial isn't about learning FORTH as the language. I'll point you to some references
54 you should read if you're not familiar with using FORTH. This tutorial is about how to
55 write FORTH. In fact, until you understand how FORTH is written, you'll have only a very
56 superficial understanding of how to use it.
58 So if you're not familiar with FORTH or want to refresh your memory here are some online
61 http://en.wikipedia.org/wiki/Forth_%28programming_language%29
63 http://galileo.phys.virginia.edu/classes/551.jvn.fall01/primer.htm
65 http://wiki.laptop.org/go/Forth_Lessons
67 http://www.albany.net/~hello/simple.htm
69 Here is another "Why FORTH?" essay: http://www.jwdt.com/~paysan/why-forth.html
71 Discussion and criticism of this FORTH here: http://lambda-the-ultimate.org/node/2452
73 ACKNOWLEDGEMENTS ----------------------------------------------------------------------
75 This code draws heavily on the design of LINA FORTH (http://home.hccnet.nl/a.w.m.van.der.horst/lina.html)
76 by Albert van der Horst. Any similarities in the code are probably not accidental.
78 Some parts of this FORTH are also based on this IOCCC entry from 1992:
79 http://ftp.funet.fi/pub/doc/IOCCC/1992/buzzard.2.design.
80 I was very proud when Sean Barrett, the original author of the IOCCC entry, commented in the LtU thread
81 http://lambda-the-ultimate.org/node/2452#comment-36818 about this FORTH.
83 And finally I'd like to acknowledge the (possibly forgotten?) authors of ARTIC FORTH because their
84 original program which I still have on original cassette tape kept nagging away at me all these years.
85 http://en.wikipedia.org/wiki/Artic_Software
87 PUBLIC DOMAIN ----------------------------------------------------------------------
89 I, the copyright holder of this work, hereby release it into the public domain. This applies worldwide.
91 In case this is not legally possible, I grant any entity the right to use this work for any purpose,
92 without any conditions, unless such conditions are required by law.
94 SETTING UP ----------------------------------------------------------------------
96 Let's get a few housekeeping things out of the way. Firstly because I need to draw lots of
97 ASCII-art diagrams to explain concepts, the best way to look at this is using a window which
98 uses a fixed width font and is at least this wide:
100 <------------------------------------------------------------------------------------------------------------------------>
102 Secondly make sure TABS are set to 8 characters. The following should be a vertical
103 line. If not, sort out your tabs.
109 Thirdly I assume that your screen is at least 50 characters high.
111 ASSEMBLING ----------------------------------------------------------------------
113 If you want to actually run this FORTH, rather than just read it, you will need Linux on an
114 i386. Linux because instead of programming directly to the hardware on a bare PC which I
115 could have done, I went for a simpler tutorial by assuming that the 'hardware' is a Linux
116 process with a few basic system calls (read, write and exit and that's about all). i386
117 is needed because I had to write the assembly for a processor, and i386 is by far the most
118 common. (Of course when I say 'i386', any 32- or 64-bit x86 processor will do. I'm compiling
119 this on a 64 bit AMD Opteron).
121 Again, to assemble this you will need gcc and gas (the GNU assembler). The commands to
122 assemble and run the code (save this file as 'jonesforth.S') are:
124 gcc -m32 -nostdlib -static -Wl,-Ttext,0 -Wl,--build-id=none -o jonesforth jonesforth.S
125 cat jonesforth.f - | ./jonesforth
127 If you want to run your own FORTH programs you can do:
129 cat jonesforth.f myprog.f | ./jonesforth
131 If you want to load your own FORTH code and then continue reading user commands, you can do:
133 cat jonesforth.f myfunctions.f - | ./jonesforth
135 ASSEMBLER ----------------------------------------------------------------------
137 (You can just skip to the next section -- you don't need to be able to read assembler to
138 follow this tutorial).
140 However if you do want to read the assembly code here are a few notes about gas (the GNU assembler):
142 (1) Register names are prefixed with '%', so %eax is the 32 bit i386 accumulator. The registers
143 available on i386 are: %eax, %ebx, %ecx, %edx, %esi, %edi, %ebp and %esp, and most of them
144 have special purposes.
146 (2) Add, mov, etc. take arguments in the form SRC,DEST. So mov %eax,%ecx moves %eax -> %ecx
148 (3) Constants are prefixed with '$', and you mustn't forget it! If you forget it then it
149 causes a read from memory instead, so:
150 mov $2,%eax moves number 2 into %eax
151 mov 2,%eax reads the 32 bit word from address 2 into %eax (ie. most likely a mistake)
153 (4) gas has a funky syntax for local labels, where '1f' (etc.) means label '1:' "forwards"
154 and '1b' (etc.) means label '1:' "backwards".
156 (5) 'ja' is "jump if above", 'jb' for "jump if below", 'je' "jump if equal" etc.
158 (6) gas has a reasonably nice .macro syntax, and I use them a lot to make the code shorter and
161 For more help reading the assembler, do "info gas" at the Linux prompt.
163 Now the tutorial starts in earnest.
165 THE DICTIONARY ----------------------------------------------------------------------
167 In FORTH as you will know, functions are called "words", and just as in other languages they
168 have a name and a definition. Here are two FORTH words:
170 : DOUBLE DUP + ; \ name is "DOUBLE", definition is "DUP +"
171 : QUADRUPLE DOUBLE DOUBLE ; \ name is "QUADRUPLE", definition is "DOUBLE DOUBLE"
173 Words, both built-in ones and ones which the programmer defines later, are stored in a dictionary
174 which is just a linked list of dictionary entries.
176 <--- DICTIONARY ENTRY (HEADER) ----------------------->
177 +------------------------+--------+---------- - - - - +----------- - - - -
178 | LINK POINTER | LENGTH/| NAME | DEFINITION
180 +--- (4 bytes) ----------+- byte -+- n bytes - - - - +----------- - - - -
182 I'll come to the definition of the word later. For now just look at the header. The first
183 4 bytes are the link pointer. This points back to the previous word in the dictionary, or, for
184 the first word in the dictionary it is just a NULL pointer. Then comes a length/flags byte.
185 The length of the word can be up to 31 characters (5 bits used) and the top three bits are used
186 for various flags which I'll come to later. This is followed by the name itself, and in this
187 implementation the name is rounded up to a multiple of 4 bytes by padding it with zero bytes.
188 That's just to ensure that the definition starts on a 32 bit boundary.
190 A FORTH variable called LATEST contains a pointer to the most recently defined word, in
191 other words, the head of this linked list.
193 DOUBLE and QUADRUPLE might look like this:
195 pointer to previous word
198 +--|------+---+---+---+---+---+---+---+---+------------- - - - -
199 | LINK | 6 | D | O | U | B | L | E | 0 | (definition ...)
200 +---------+---+---+---+---+---+---+---+---+------------- - - - -
203 +--|------+---+---+---+---+---+---+---+---+---+---+---+---+------------- - - - -
204 | LINK | 9 | Q | U | A | D | R | U | P | L | E | 0 | 0 | (definition ...)
205 +---------+---+---+---+---+---+---+---+---+---+---+---+---+------------- - - - -
211 You should be able to see from this how you might implement functions to find a word in
212 the dictionary (just walk along the dictionary entries starting at LATEST and matching
213 the names until you either find a match or hit the NULL pointer at the end of the dictionary);
214 and add a word to the dictionary (create a new definition, set its LINK to LATEST, and set
215 LATEST to point to the new word). We'll see precisely these functions implemented in
216 assembly code later on.
218 One interesting consequence of using a linked list is that you can redefine words, and
219 a newer definition of a word overrides an older one. This is an important concept in
220 FORTH because it means that any word (even "built-in" or "standard" words) can be
221 overridden with a new definition, either to enhance it, to make it faster or even to
222 disable it. However because of the way that FORTH words get compiled, which you'll
223 understand below, words defined using the old definition of a word continue to use
224 the old definition. Only words defined after the new definition use the new definition.
226 DIRECT THREADED CODE ----------------------------------------------------------------------
228 Now we'll get to the really crucial bit in understanding FORTH, so go and get a cup of tea
229 or coffee and settle down. It's fair to say that if you don't understand this section, then you
230 won't "get" how FORTH works, and that would be a failure on my part for not explaining it well.
231 So if after reading this section a few times you don't understand it, please email me
234 Let's talk first about what "threaded code" means. Imagine a peculiar version of C where
235 you are only allowed to call functions without arguments. (Don't worry for now that such a
236 language would be completely useless!) So in our peculiar C, code would look like this:
245 and so on. How would a function, say 'f' above, be compiled by a standard C compiler?
246 Probably into assembly code like this. On the right hand side I've written the actual
250 CALL a E8 08 00 00 00
251 CALL b E8 1C 00 00 00
252 CALL c E8 2C 00 00 00
253 ; ignore the return from the function for now
255 "E8" is the x86 machine code to "CALL" a function. In the first 20 years of computing
256 memory was hideously expensive and we might have worried about the wasted space being used
257 by the repeated "E8" bytes. We can save 20% in code size (and therefore, in expensive memory)
258 by compressing this into just:
260 08 00 00 00 Just the function addresses, without
261 1C 00 00 00 the CALL prefix.
264 On a 16-bit machine like the ones which originally ran FORTH the savings are even greater - 33%.
266 [Historical note: If the execution model that FORTH uses looks strange from the following
267 paragraphs, then it was motivated entirely by the need to save memory on early computers.
268 This code compression isn't so important now when our machines have more memory in their L1
269 caches than those early computers had in total, but the execution model still has some
272 Of course this code won't run directly any more. Instead we need to write an interpreter
273 which takes each pair of bytes and calls it.
275 On an i386 machine it turns out that we can write this interpreter rather easily, in just
276 two assembly instructions which turn into just 3 bytes of machine code. Let's store the
277 pointer to the next word to execute in the %esi register:
279 08 00 00 00 <- We're executing this one now. %esi is the _next_ one to execute.
283 The all-important i386 instruction is called LODSL (or in Intel manuals, LODSW). It does
284 two things. Firstly it reads the memory at %esi into the accumulator (%eax). Secondly it
285 increments %esi by 4 bytes. So after LODSL, the situation now looks like this:
287 08 00 00 00 <- We're still executing this one
288 1C 00 00 00 <- %eax now contains this address (0x0000001C)
291 Now we just need to jump to the address in %eax. This is again just a single x86 instruction
292 written JMP *(%eax). And after doing the jump, the situation looks like:
295 1C 00 00 00 <- Now we're executing this subroutine.
298 To make this work, each subroutine is followed by the two instructions 'LODSL; JMP *(%eax)'
299 which literally make the jump to the next subroutine.
301 And that brings us to our first piece of actual code! Well, it's a macro.
310 /* The macro is called NEXT. That's a FORTH-ism. It expands to those two instructions.
312 Every FORTH primitive that we write has to be ended by NEXT. Think of it kind of like
315 The above describes what is known as direct threaded code.
317 To sum up: We compress our function calls down to a list of addresses and use a somewhat
318 magical macro to act as a "jump to next function in the list". We also use one register (%esi)
319 to act as a kind of instruction pointer, pointing to the next function in the list.
321 I'll just give you a hint of what is to come by saying that a FORTH definition such as:
323 : QUADRUPLE DOUBLE DOUBLE ;
325 actually compiles (almost, not precisely but we'll see why in a moment) to a list of
326 function addresses for DOUBLE, DOUBLE and a special function called EXIT to finish off.
328 At this point, REALLY EAGLE-EYED ASSEMBLY EXPERTS are saying "JONES, YOU'VE MADE A MISTAKE!".
330 I lied about JMP *(%eax).
332 INDIRECT THREADED CODE ----------------------------------------------------------------------
334 It turns out that direct threaded code is interesting but only if you want to just execute
335 a list of functions written in assembly language. So QUADRUPLE would work only if DOUBLE
336 was an assembly language function. In the direct threaded code, QUADRUPLE would look like:
339 | addr of DOUBLE --------------------> (assembly code to do the double)
340 +------------------+ NEXT
341 %esi -> | addr of DOUBLE |
344 We can add an extra indirection to allow us to run both words written in assembly language
345 (primitives written for speed) and words written in FORTH themselves as lists of addresses.
347 The extra indirection is the reason for the brackets in JMP *(%eax).
349 Let's have a look at how QUADRUPLE and DOUBLE really look in FORTH:
351 : QUADRUPLE DOUBLE DOUBLE ;
354 | codeword | : DOUBLE DUP + ;
356 | addr of DOUBLE ---------------> +------------------+
357 +------------------+ | codeword |
358 | addr of DOUBLE | +------------------+
359 +------------------+ | addr of DUP --------------> +------------------+
360 | addr of EXIT | +------------------+ | codeword -------+
361 +------------------+ %esi -> | addr of + --------+ +------------------+ |
362 +------------------+ | | assembly to <-----+
363 | addr of EXIT | | | implement DUP |
364 +------------------+ | | .. |
367 | +------------------+
369 +-----> +------------------+
371 +------------------+ |
372 | assembly to <------+
379 This is the part where you may need an extra cup of tea/coffee/favourite caffeinated
380 beverage. What has changed is that I've added an extra pointer to the beginning of
381 the definitions. In FORTH this is sometimes called the "codeword". The codeword is
382 a pointer to the interpreter to run the function. For primitives written in
383 assembly language, the "interpreter" just points to the actual assembly code itself.
384 They don't need interpreting, they just run.
386 In words written in FORTH (like QUADRUPLE and DOUBLE), the codeword points to an interpreter
389 I'll show you the interpreter function shortly, but let's recall our indirect
390 JMP *(%eax) with the "extra" brackets. Take the case where we're executing DOUBLE
391 as shown, and DUP has been called. Note that %esi is pointing to the address of +
393 The assembly code for DUP eventually does a NEXT. That:
395 (1) reads the address of + into %eax %eax points to the codeword of +
396 (2) increments %esi by 4
397 (3) jumps to the indirect %eax jumps to the address in the codeword of +,
398 ie. the assembly code to implement +
403 | addr of DOUBLE ---------------> +------------------+
404 +------------------+ | codeword |
405 | addr of DOUBLE | +------------------+
406 +------------------+ | addr of DUP --------------> +------------------+
407 | addr of EXIT | +------------------+ | codeword -------+
408 +------------------+ | addr of + --------+ +------------------+ |
409 +------------------+ | | assembly to <-----+
410 %esi -> | addr of EXIT | | | implement DUP |
411 +------------------+ | | .. |
414 | +------------------+
416 +-----> +------------------+
418 +------------------+ |
419 now we're | assembly to <-----+
420 executing | implement + |
426 So I hope that I've convinced you that NEXT does roughly what you'd expect. This is
427 indirect threaded code.
429 I've glossed over four things. I wonder if you can guess without reading on what they are?
435 My list of four things are: (1) What does "EXIT" do? (2) which is related to (1) is how do
436 you call into a function, ie. how does %esi start off pointing at part of QUADRUPLE, but
437 then point at part of DOUBLE. (3) What goes in the codeword for the words which are written
438 in FORTH? (4) How do you compile a function which does anything except call other functions
439 ie. a function which contains a number like : DOUBLE 2 * ; ?
441 THE INTERPRETER AND RETURN STACK ------------------------------------------------------------
443 Going at these in no particular order, let's talk about issues (3) and (2), the interpreter
444 and the return stack.
446 Words which are defined in FORTH need a codeword which points to a little bit of code to
447 give them a "helping hand" in life. They don't need much, but they do need what is known
448 as an "interpreter", although it doesn't really "interpret" in the same way that, say,
449 Java bytecode used to be interpreted (ie. slowly). This interpreter just sets up a few
450 machine registers so that the word can then execute at full speed using the indirect
451 threaded model above.
453 One of the things that needs to happen when QUADRUPLE calls DOUBLE is that we save the old
454 %esi ("instruction pointer") and create a new one pointing to the first word in DOUBLE.
455 Because we will need to restore the old %esi at the end of DOUBLE (this is, after all, like
456 a function call), we will need a stack to store these "return addresses" (old values of %esi).
458 As you will have read, when reading the background documentation, FORTH has two stacks,
459 an ordinary stack for parameters, and a return stack which is a bit more mysterious. But
460 our return stack is just the stack I talked about in the previous paragraph, used to save
461 %esi when calling from a FORTH word into another FORTH word.
463 In this FORTH, we are using the normal stack pointer (%esp) for the parameter stack.
464 We will use the i386's "other" stack pointer (%ebp, usually called the "frame pointer")
465 for our return stack.
467 I've got two macros which just wrap up the details of using %ebp for the return stack.
468 You use them as for example "PUSHRSP %eax" (push %eax on the return stack) or "POPRSP %ebx"
469 (pop top of return stack into %ebx).
472 /* Macros to deal with the return stack. */
474 lea -4(%ebp),%ebp // push reg on to return stack
479 mov (%ebp),\reg // pop top of return stack to reg
484 And with that we can now talk about the interpreter.
486 In FORTH the interpreter function is often called DOCOL (I think it means "DO COLON" because
487 all FORTH definitions start with a colon, as in : DOUBLE DUP + ;
489 The "interpreter" (it's not really "interpreting") just needs to push the old %esi on the
490 stack and set %esi to the first word in the definition. Remember that we jumped to the
491 function using JMP *(%eax)? Well a consequence of that is that conveniently %eax contains
492 the address of this codeword, so just by adding 4 to it we get the address of the first
493 data word. Finally after setting up %esi, it just does NEXT which causes that first word
497 /* DOCOL - the interpreter! */
501 PUSHRSP %esi // push %esi on to the return stack
502 addl $4,%eax // %eax points to codeword, so make
503 movl %eax,%esi // %esi point to first data word
507 Just to make this absolutely clear, let's see how DOCOL works when jumping from QUADRUPLE
513 +------------------+ DOUBLE:
514 | addr of DOUBLE ---------------> +------------------+
515 +------------------+ %eax -> | addr of DOCOL |
516 %esi -> | addr of DOUBLE | +------------------+
517 +------------------+ | addr of DUP |
518 | addr of EXIT | +------------------+
519 +------------------+ | etc. |
521 First, the call to DOUBLE calls DOCOL (the codeword of DOUBLE). DOCOL does this: It
522 pushes the old %esi on the return stack. %eax points to the codeword of DOUBLE, so we
523 just add 4 on to it to get our new %esi:
528 +------------------+ DOUBLE:
529 | addr of DOUBLE ---------------> +------------------+
530 top of return +------------------+ %eax -> | addr of DOCOL |
531 stack points -> | addr of DOUBLE | + 4 = +------------------+
532 +------------------+ %esi -> | addr of DUP |
533 | addr of EXIT | +------------------+
534 +------------------+ | etc. |
536 Then we do NEXT, and because of the magic of threaded code that increments %esi again
539 Well, it seems to work.
541 One minor point here. Because DOCOL is the first bit of assembly actually to be defined
542 in this file (the others were just macros), and because I usually compile this code with the
543 text segment starting at address 0, DOCOL has address 0. So if you are disassembling the
544 code and see a word with a codeword of 0, you will immediately know that the word is
545 written in FORTH (it's not an assembler primitive) and so uses DOCOL as the interpreter.
547 STARTING UP ----------------------------------------------------------------------
549 Now let's get down to nuts and bolts. When we start the program we need to set up
550 a few things like the return stack. But as soon as we can, we want to jump into FORTH
551 code (albeit much of the "early" FORTH code will still need to be written as
552 assembly language primitives).
554 This is what the set up code does. Does a tiny bit of house-keeping, sets up the
555 separate return stack (NB: Linux gives us the ordinary parameter stack already), then
556 immediately jumps to a FORTH word called QUIT. Despite its name, QUIT doesn't quit
557 anything. It resets some internal state and starts reading and interpreting commands.
558 (The reason it is called QUIT is because you can call QUIT from your own FORTH code
559 to "quit" your program and go back to interpreting).
562 /* Assembler entry point. */
567 mov %esp,var_S0 // Save the initial data stack pointer in FORTH variable S0.
568 mov $return_stack_top,%ebp // Initialise the return stack.
569 call set_up_data_segment
571 mov $cold_start,%esi // Initialise interpreter.
572 NEXT // Run interpreter!
575 cold_start: // High-level code without a codeword.
579 BUILT-IN WORDS ----------------------------------------------------------------------
581 Remember our dictionary entries (headers)? Let's bring those together with the codeword
582 and data words to see how : DOUBLE DUP + ; really looks in memory.
584 pointer to previous word
587 +--|------+---+---+---+---+---+---+---+---+------------+------------+------------+------------+
588 | LINK | 6 | D | O | U | B | L | E | 0 | DOCOL | DUP | + | EXIT |
589 +---------+---+---+---+---+---+---+---+---+------------+--|---------+------------+------------+
592 LINK in next word points to codeword of DUP
594 Initially we can't just write ": DOUBLE DUP + ;" (ie. that literal string) here because we
595 don't yet have anything to read the string, break it up at spaces, parse each word, etc. etc.
596 So instead we will have to define built-in words using the GNU assembler data constructors
597 (like .int, .byte, .string, .ascii and so on -- look them up in the gas info page if you are
600 The long way would be:
601 .int <link to previous word>
603 .ascii "DOUBLE" // string
605 DOUBLE: .int DOCOL // codeword
606 .int DUP // pointer to codeword of DUP
607 .int PLUS // pointer to codeword of +
608 .int EXIT // pointer to codeword of EXIT
610 That's going to get quite tedious rather quickly, so here I define an assembler macro
611 so that I can just write:
613 defword "DOUBLE",6,,DOUBLE
616 and I'll get exactly the same effect.
618 Don't worry too much about the exact implementation details of this macro - it's complicated!
621 /* Flags - these are discussed later. */
624 .set F_LENMASK,0x1f // length mask
626 // Store the chain of links.
629 .macro defword name, namelen, flags=0, label
635 .set link,name_\label
636 .byte \flags+\namelen // flags + length byte
637 .ascii "\name" // the name
638 .align 4 // padding to next 4 byte boundary
641 .int DOCOL // codeword - the interpreter
642 // list of word pointers follow
646 Similarly I want a way to write words written in assembly language. There will quite a few
647 of these to start with because, well, everything has to start in assembly before there's
648 enough "infrastructure" to be able to start writing FORTH words, but also I want to define
649 some common FORTH words in assembly language for speed, even though I could write them in FORTH.
651 This is what DUP looks like in memory:
653 pointer to previous word
656 +--|------+---+---+---+---+------------+
657 | LINK | 3 | D | U | P | code_DUP ---------------------> points to the assembly
658 +---------+---+---+---+---+------------+ code used to write DUP,
659 ^ len codeword which ends with NEXT.
663 Again, for brevity in writing the header I'm going to write an assembler macro called defcode.
666 .macro defcode name, namelen, flags=0, label
672 .set link,name_\label
673 .byte \flags+\namelen // flags + length byte
674 .ascii "\name" // the name
675 .align 4 // padding to next 4 byte boundary
678 .int code_\label // codeword
682 code_\label : // assembler code follows
686 Now some easy FORTH primitives. These are written in assembly for speed. If you understand
687 i386 assembly language then it is worth reading these. However if you don't understand assembly
688 you can skip the details.
691 defcode "DROP",4,,DROP
692 pop %eax // drop top of stack
695 defcode "SWAP",4,,SWAP
696 pop %eax // swap top two elements on stack
703 mov (%esp),%eax // duplicate top of stack
707 defcode "OVER",4,,OVER
708 mov 4(%esp),%eax // get the second element of stack
709 push %eax // and push it on top
721 defcode "-ROT",4,,NROT
730 defcode "?DUP",4,,QDUP // duplicate top of stack if non-zero
738 incl (%esp) // increment top of stack
742 decl (%esp) // decrement top of stack
745 defcode "4+",2,,INCR4
746 addl $4,(%esp) // add 4 to top of stack
749 defcode "4-",2,,DECR4
750 subl $4,(%esp) // subtract 4 from top of stack
754 pop %eax // get top of stack
755 addl %eax,(%esp) // and add it to next word on stack
759 pop %eax // get top of stack
760 subl %eax,(%esp) // and subtract it from next word on stack
767 push %eax // ignore overflow
771 In this FORTH, only /MOD is primitive. Later we will define the / and MOD words in
772 terms of the primitive /MOD. The design of the i386 assembly instruction idiv which
773 leaves both quotient and remainder makes this the obvious choice.
776 defcode "/MOD",4,,DIVMOD
781 push %edx // push remainder
782 push %eax // push quotient
786 Lots of comparison operations.
788 ANS FORTH says that the comparison words should return all (binary) 1's for
789 TRUE and all 0's for FALSE. However this is a bit of a strange convention
790 so this FORTH breaks it and returns the more normal (for C programmers ...)
791 1 meaning TRUE and 0 meaning FALSE.
794 defcode "=",1,,EQU // top two words are equal?
803 defcode "<>",2,,NEQU // top two words are not equal?
848 defcode "0=",2,,ZEQU // top of stack equals 0?
856 defcode "0<>",3,,ZNEQU // top of stack not 0?
864 defcode "0<",2,,ZLT // comparisons with 0
896 defcode "AND",3,,AND // bitwise AND
901 defcode "OR",2,,OR // bitwise OR
906 defcode "XOR",3,,XOR // bitwise XOR
911 defcode "INVERT",6,,INVERT // this is the FORTH bitwise "NOT" function (cf. NEGATE and NOT)
916 RETURNING FROM FORTH WORDS ----------------------------------------------------------------------
918 Time to talk about what happens when we EXIT a function. In this diagram QUADRUPLE has called
919 DOUBLE, and DOUBLE is about to exit (look at where %esi is pointing):
924 +------------------+ DOUBLE
925 | addr of DOUBLE ---------------> +------------------+
926 +------------------+ | codeword |
927 | addr of DOUBLE | +------------------+
928 +------------------+ | addr of DUP |
929 | addr of EXIT | +------------------+
930 +------------------+ | addr of + |
932 %esi -> | addr of EXIT |
935 What happens when the + function does NEXT? Well, the following code is executed.
938 defcode "EXIT",4,,EXIT
939 POPRSP %esi // pop return stack into %esi
943 EXIT gets the old %esi which we saved from before on the return stack, and puts it in %esi.
944 So after this (but just before NEXT) we get:
949 +------------------+ DOUBLE
950 | addr of DOUBLE ---------------> +------------------+
951 +------------------+ | codeword |
952 %esi -> | addr of DOUBLE | +------------------+
953 +------------------+ | addr of DUP |
954 | addr of EXIT | +------------------+
955 +------------------+ | addr of + |
960 And NEXT just completes the job by, well, in this case just by calling DOUBLE again :-)
962 LITERALS ----------------------------------------------------------------------
964 The final point I "glossed over" before was how to deal with functions that do anything
965 apart from calling other functions. For example, suppose that DOUBLE was defined like this:
969 It does the same thing, but how do we compile it since it contains the literal 2? One way
970 would be to have a function called "2" (which you'd have to write in assembler), but you'd need
971 a function for every single literal that you wanted to use.
973 FORTH solves this by compiling the function using a special word called LIT:
975 +---------------------------+-------+-------+-------+-------+-------+
976 | (usual header of DOUBLE) | DOCOL | LIT | 2 | * | EXIT |
977 +---------------------------+-------+-------+-------+-------+-------+
979 LIT is executed in the normal way, but what it does next is definitely not normal. It
980 looks at %esi (which now points to the number 2), grabs it, pushes it on the stack, then
981 manipulates %esi in order to skip the number as if it had never been there.
983 What's neat is that the whole grab/manipulate can be done using a single byte single
984 i386 instruction, our old friend LODSL. Rather than me drawing more ASCII-art diagrams,
985 see if you can find out how LIT works:
989 // %esi points to the next command, but in this case it points to the next
990 // literal 32 bit integer. Get that literal into %eax and increment %esi.
991 // On x86, it's a convenient single byte instruction! (cf. NEXT macro)
993 push %eax // push the literal number on to stack
997 MEMORY ----------------------------------------------------------------------
999 As important point about FORTH is that it gives you direct access to the lowest levels
1000 of the machine. Manipulating memory directly is done frequently in FORTH, and these are
1001 the primitive words for doing it.
1004 defcode "!",1,,STORE
1005 pop %ebx // address to store at
1006 pop %eax // data to store there
1007 mov %eax,(%ebx) // store it
1010 defcode "@",1,,FETCH
1011 pop %ebx // address to fetch
1012 mov (%ebx),%eax // fetch it
1013 push %eax // push value onto stack
1016 defcode "+!",2,,ADDSTORE
1018 pop %eax // the amount to add
1019 addl %eax,(%ebx) // add it
1022 defcode "-!",2,,SUBSTORE
1024 pop %eax // the amount to subtract
1025 subl %eax,(%ebx) // add it
1029 ! and @ (STORE and FETCH) store 32-bit words. It's also useful to be able to read and write bytes
1030 so we also define standard words C@ and C!.
1032 Byte-oriented operations only work on architectures which permit them (i386 is one of those).
1035 defcode "C!",2,,STOREBYTE
1036 pop %ebx // address to store at
1037 pop %eax // data to store there
1038 movb %al,(%ebx) // store it
1041 defcode "C@",2,,FETCHBYTE
1042 pop %ebx // address to fetch
1044 movb (%ebx),%al // fetch it
1045 push %eax // push value onto stack
1048 /* C@C! is a useful byte copy primitive. */
1049 defcode "C@C!",4,,CCOPY
1050 movl 4(%esp),%ebx // source address
1051 movb (%ebx),%al // get source character
1052 pop %edi // destination address
1053 stosb // copy to destination
1054 push %edi // increment destination address
1055 incl 4(%esp) // increment source address
1058 /* and CMOVE is a block copy operation. */
1059 defcode "CMOVE",5,,CMOVE
1060 mov %esi,%edx // preserve %esi
1062 pop %edi // destination address
1063 pop %esi // source address
1064 rep movsb // copy source to destination
1065 mov %edx,%esi // restore %esi
1069 BUILT-IN VARIABLES ----------------------------------------------------------------------
1071 These are some built-in variables and related standard FORTH words. Of these, the only one that we
1072 have discussed so far was LATEST, which points to the last (most recently defined) word in the
1073 FORTH dictionary. LATEST is also a FORTH word which pushes the address of LATEST (the variable)
1074 on to the stack, so you can read or write it using @ and ! operators. For example, to print
1075 the current value of LATEST (and this can apply to any FORTH variable) you would do:
1079 To make defining variables shorter, I'm using a macro called defvar, similar to defword and
1080 defcode above. (In fact the defvar macro uses defcode to do the dictionary header).
1083 .macro defvar name, namelen, flags=0, label, initial=0
1084 defcode \name,\namelen,\flags,\label
1094 The built-in variables are:
1096 STATE Is the interpreter executing code (0) or compiling a word (non-zero)?
1097 LATEST Points to the latest (most recently defined) word in the dictionary.
1098 HERE Points to the next free byte of memory. When compiling, compiled words go here.
1099 S0 Stores the address of the top of the parameter stack.
1100 BASE The current base for printing and reading numbers.
1103 defvar "STATE",5,,STATE
1104 defvar "HERE",4,,HERE
1105 defvar "LATEST",6,,LATEST,name_SYSCALL0 // SYSCALL0 must be last in built-in dictionary
1107 defvar "BASE",4,,BASE,10
1110 BUILT-IN CONSTANTS ----------------------------------------------------------------------
1112 It's also useful to expose a few constants to FORTH. When the word is executed it pushes a
1113 constant value on the stack.
1115 The built-in constants are:
1117 VERSION Is the current version of this FORTH.
1118 R0 The address of the top of the return stack.
1119 DOCOL Pointer to DOCOL.
1120 F_IMMED The IMMEDIATE flag's actual value.
1121 F_HIDDEN The HIDDEN flag's actual value.
1122 F_LENMASK The length mask in the flags/len byte.
1124 SYS_* and the numeric codes of various Linux syscalls (from <asm/unistd.h>)
1127 //#include <asm-i386/unistd.h> // you might need this instead
1128 #include <asm/unistd.h>
1130 .macro defconst name, namelen, flags=0, label, value
1131 defcode \name,\namelen,\flags,\label
1136 defconst "VERSION",7,,VERSION,JONES_VERSION
1137 defconst "R0",2,,RZ,return_stack_top
1138 defconst "DOCOL",5,,__DOCOL,DOCOL
1139 defconst "F_IMMED",7,,__F_IMMED,F_IMMED
1140 defconst "F_HIDDEN",8,,__F_HIDDEN,F_HIDDEN
1141 defconst "F_LENMASK",9,,__F_LENMASK,F_LENMASK
1143 defconst "SYS_EXIT",8,,SYS_EXIT,__NR_exit
1144 defconst "SYS_OPEN",8,,SYS_OPEN,__NR_open
1145 defconst "SYS_CLOSE",9,,SYS_CLOSE,__NR_close
1146 defconst "SYS_READ",8,,SYS_READ,__NR_read
1147 defconst "SYS_WRITE",9,,SYS_WRITE,__NR_write
1148 defconst "SYS_CREAT",9,,SYS_CREAT,__NR_creat
1149 defconst "SYS_BRK",7,,SYS_BRK,__NR_brk
1151 defconst "O_RDONLY",8,,__O_RDONLY,0
1152 defconst "O_WRONLY",8,,__O_WRONLY,1
1153 defconst "O_RDWR",6,,__O_RDWR,2
1154 defconst "O_CREAT",7,,__O_CREAT,0100
1155 defconst "O_EXCL",6,,__O_EXCL,0200
1156 defconst "O_TRUNC",7,,__O_TRUNC,01000
1157 defconst "O_APPEND",8,,__O_APPEND,02000
1158 defconst "O_NONBLOCK",10,,__O_NONBLOCK,04000
1161 RETURN STACK ----------------------------------------------------------------------
1163 These words allow you to access the return stack. Recall that the register %ebp always points to
1164 the top of the return stack.
1168 pop %eax // pop parameter stack into %eax
1169 PUSHRSP %eax // push it on to the return stack
1172 defcode "R>",2,,FROMR
1173 POPRSP %eax // pop return stack on to %eax
1174 push %eax // and push on to parameter stack
1177 defcode "RSP@",4,,RSPFETCH
1181 defcode "RSP!",4,,RSPSTORE
1185 defcode "RDROP",5,,RDROP
1186 addl $4,%ebp // pop return stack and throw away
1190 PARAMETER (DATA) STACK ----------------------------------------------------------------------
1192 These functions allow you to manipulate the parameter stack. Recall that Linux sets up the parameter
1193 stack for us, and it is accessed through %esp.
1196 defcode "DSP@",4,,DSPFETCH
1201 defcode "DSP!",4,,DSPSTORE
1206 INPUT AND OUTPUT ----------------------------------------------------------------------
1208 These are our first really meaty/complicated FORTH primitives. I have chosen to write them in
1209 assembler, but surprisingly in "real" FORTH implementations these are often written in terms
1210 of more fundamental FORTH primitives. I chose to avoid that because I think that just obscures
1211 the implementation. After all, you may not understand assembler but you can just think of it
1212 as an opaque block of code that does what it says.
1214 Let's discuss input first.
1216 The FORTH word KEY reads the next byte from stdin (and pushes it on the parameter stack).
1217 So if KEY is called and someone hits the space key, then the number 32 (ASCII code of space)
1218 is pushed on the stack.
1220 In FORTH there is no distinction between reading code and reading input. We might be reading
1221 and compiling code, we might be reading words to execute, we might be asking for the user
1222 to type their name -- ultimately it all comes in through KEY.
1224 The implementation of KEY uses an input buffer of a certain size (defined at the start of this
1225 file). It calls the Linux read(2) system call to fill this buffer and tracks its position
1226 in the buffer using a couple of variables, and if it runs out of input buffer then it refills
1227 it automatically. The other thing that KEY does is if it detects that stdin has closed, it
1228 exits the program, which is why when you hit ^D the FORTH system cleanly exits.
1233 +-------------------------------+--------------------------------------+
1234 | INPUT READ FROM STDIN ....... | unused part of the buffer |
1235 +-------------------------------+--------------------------------------+
1238 currkey (next character to read)
1240 <---------------------- BUFFER_SIZE (4096 bytes) ---------------------->
1244 defcode "KEY",3,,KEY
1246 push %eax // push return value on stack
1251 jge 1f // exhausted the input buffer?
1258 1: // Out of input; use read(2) to fetch more input from stdin.
1259 xor %ebx,%ebx // 1st param: stdin
1260 mov $buffer,%ecx // 2nd param: buffer
1262 mov $BUFFER_SIZE,%edx // 3rd param: max length
1263 mov $__NR_read,%eax // syscall: read
1265 test %eax,%eax // If %eax <= 0, then exit.
1267 addl %eax,%ecx // buffer+%eax = bufftop
1271 2: // Error or end of input: exit the program.
1273 mov $__NR_exit,%eax // syscall: exit
1279 .int buffer // Current place in input buffer (next character to read).
1281 .int buffer // Last valid data in input buffer + 1.
1284 By contrast, output is much simpler. The FORTH word EMIT writes out a single byte to stdout.
1285 This implementation just uses the write system call. No attempt is made to buffer output, but
1286 it would be a good exercise to add it.
1289 defcode "EMIT",4,,EMIT
1294 mov $1,%ebx // 1st param: stdout
1296 // write needs the address of the byte to write
1297 mov %al,emit_scratch
1298 mov $emit_scratch,%ecx // 2nd param: address
1300 mov $1,%edx // 3rd param: nbytes = 1
1302 mov $__NR_write,%eax // write syscall
1306 .data // NB: easier to fit in the .data section
1308 .space 1 // scratch used by EMIT
1311 Back to input, WORD is a FORTH word which reads the next full word of input.
1313 What it does in detail is that it first skips any blanks (spaces, tabs, newlines and so on).
1314 Then it calls KEY to read characters into an internal buffer until it hits a blank. Then it
1315 calculates the length of the word it read and returns the address and the length as
1316 two words on the stack (with the length at the top of stack).
1318 Notice that WORD has a single internal buffer which it overwrites each time (rather like
1319 a static C string). Also notice that WORD's internal buffer is just 32 bytes long and
1320 there is NO checking for overflow. 31 bytes happens to be the maximum length of a
1321 FORTH word that we support, and that is what WORD is used for: to read FORTH words when
1322 we are compiling and executing code. The returned strings are not NUL-terminated.
1324 Start address+length is the normal way to represent strings in FORTH (not ending in an
1325 ASCII NUL character as in C), and so FORTH strings can contain any character including NULs
1326 and can be any length.
1328 WORD is not suitable for just reading strings (eg. user input) because of all the above
1329 peculiarities and limitations.
1331 Note that when executing, you'll see:
1333 which puts "FOO" and length 3 on the stack, but when compiling:
1335 is an error (or at least it doesn't do what you might expect). Later we'll talk about compiling
1336 and immediate mode, and you'll understand why.
1339 defcode "WORD",4,,WORD
1341 push %edi // push base address
1342 push %ecx // push length
1346 /* Search for first non-blank character. Also skip \ comments. */
1348 call _KEY // get next key, returned in %eax
1349 cmpb $'\\',%al // start of a comment?
1350 je 3f // if so, skip the comment
1352 jbe 1b // if so, keep looking
1354 /* Search for the end of the word, storing chars as we go. */
1355 mov $word_buffer,%edi // pointer to return buffer
1357 stosb // add character to return buffer
1358 call _KEY // get next key, returned in %al
1359 cmpb $' ',%al // is blank?
1360 ja 2b // if not, keep looping
1362 /* Return the word (well, the static buffer) and length. */
1363 sub $word_buffer,%edi
1364 mov %edi,%ecx // return length of the word
1365 mov $word_buffer,%edi // return address of the word
1368 /* Code to skip \ comments to end of the current line. */
1371 cmpb $'\n',%al // end of line yet?
1375 .data // NB: easier to fit in the .data section
1376 // A static buffer where WORD returns. Subsequent calls
1377 // overwrite this buffer. Maximum word length is 32 chars.
1382 As well as reading in words we'll need to read in numbers and for that we are using a function
1383 called NUMBER. This parses a numeric string such as one returned by WORD and pushes the
1384 number on the parameter stack.
1386 The function uses the variable BASE as the base (radix) for conversion, so for example if
1387 BASE is 2 then we expect a binary number. Normally BASE is 10.
1389 If the word starts with a '-' character then the returned value is negative.
1391 If the string can't be parsed as a number (or contains characters outside the current BASE)
1392 then we need to return an error indication. So NUMBER actually returns two items on the stack.
1393 At the top of stack we return the number of unconverted characters (ie. if 0 then all characters
1394 were converted, so there is no error). Second from top of stack is the parsed number or a
1395 partial value if there was an error.
1397 defcode "NUMBER",6,,NUMBER
1398 pop %ecx // length of string
1399 pop %edi // start address of string
1401 push %eax // parsed number
1402 push %ecx // number of unparsed characters (0 = no error)
1409 test %ecx,%ecx // trying to parse a zero-length string is an error, but will return 0.
1412 movl var_BASE,%edx // get BASE (in %dl)
1414 // Check if first character is '-'.
1415 movb (%edi),%bl // %bl = first character in string
1417 push %eax // push 0 on stack
1418 cmpb $'-',%bl // negative number?
1421 push %ebx // push <> 0 on stack, indicating negative
1424 pop %ebx // error: string is only '-'.
1428 // Loop reading digits.
1429 1: imull %edx,%eax // %eax *= BASE
1430 movb (%edi),%bl // %bl = next character in string
1433 // Convert 0-9, A-Z to a number 0-35.
1434 2: subb $'0',%bl // < '0'?
1436 cmp $10,%bl // <= '9'?
1438 subb $17,%bl // < 'A'? (17 is 'A'-'0')
1442 3: cmp %dl,%bl // >= BASE?
1445 // OK, so add it to %eax and loop.
1450 // Negate the result if first character was '-' (saved on the stack).
1459 DICTIONARY LOOK UPS ----------------------------------------------------------------------
1461 We're building up to our prelude on how FORTH code is compiled, but first we need yet more infrastructure.
1463 The FORTH word FIND takes a string (a word as parsed by WORD -- see above) and looks it up in the
1464 dictionary. What it actually returns is the address of the dictionary header, if it finds it,
1467 So if DOUBLE is defined in the dictionary, then WORD DOUBLE FIND returns the following pointer:
1473 +---------+---+---+---+---+---+---+---+---+------------+------------+------------+------------+
1474 | LINK | 6 | D | O | U | B | L | E | 0 | DOCOL | DUP | + | EXIT |
1475 +---------+---+---+---+---+---+---+---+---+------------+------------+------------+------------+
1477 See also >CFA and >DFA.
1479 FIND doesn't find dictionary entries which are flagged as HIDDEN. See below for why.
1482 defcode "FIND",4,,FIND
1483 pop %ecx // %ecx = length
1484 pop %edi // %edi = address
1486 push %eax // %eax = address of dictionary entry (or NULL)
1490 push %esi // Save %esi so we can use it in string comparison.
1492 // Now we start searching backwards through the dictionary for this word.
1493 mov var_LATEST,%edx // LATEST points to name header of the latest word in the dictionary
1494 1: test %edx,%edx // NULL pointer? (end of the linked list)
1497 // Compare the length expected and the length of the word.
1498 // Note that if the F_HIDDEN flag is set on the word, then by a bit of trickery
1499 // this won't pick the word (the length will appear to be wrong).
1501 movb 4(%edx),%al // %al = flags+length field
1502 andb $(F_HIDDEN|F_LENMASK),%al // %al = name length
1503 cmpb %cl,%al // Length is the same?
1506 // Compare the strings in detail.
1507 push %ecx // Save the length
1508 push %edi // Save the address (repe cmpsb will move this pointer)
1509 lea 5(%edx),%esi // Dictionary string we are checking against.
1510 repe cmpsb // Compare the strings.
1513 jne 2f // Not the same.
1515 // The strings are the same - return the header pointer in %eax
1520 2: mov (%edx),%edx // Move back through the link field to the previous word
1521 jmp 1b // .. and loop.
1525 xor %eax,%eax // Return zero to indicate not found.
1529 FIND returns the dictionary pointer, but when compiling we need the codeword pointer (recall
1530 that FORTH definitions are compiled into lists of codeword pointers). The standard FORTH
1531 word >CFA turns a dictionary pointer into a codeword pointer.
1533 The example below shows the result of:
1535 WORD DOUBLE FIND >CFA
1537 FIND returns a pointer to this
1538 | >CFA converts it to a pointer to this
1541 +---------+---+---+---+---+---+---+---+---+------------+------------+------------+------------+
1542 | LINK | 6 | D | O | U | B | L | E | 0 | DOCOL | DUP | + | EXIT |
1543 +---------+---+---+---+---+---+---+---+---+------------+------------+------------+------------+
1548 Because names vary in length, this isn't just a simple increment.
1550 In this FORTH you cannot easily turn a codeword pointer back into a dictionary entry pointer, but
1551 that is not true in most FORTH implementations where they store a back pointer in the definition
1552 (with an obvious memory/complexity cost). The reason they do this is that it is useful to be
1553 able to go backwards (codeword -> dictionary entry) in order to decompile FORTH definitions
1556 What does CFA stand for? My best guess is "Code Field Address".
1559 defcode ">CFA",4,,TCFA
1566 add $4,%edi // Skip link pointer.
1567 movb (%edi),%al // Load flags+len into %al.
1568 inc %edi // Skip flags+len byte.
1569 andb $F_LENMASK,%al // Just the length, not the flags.
1570 add %eax,%edi // Skip the name.
1571 addl $3,%edi // The codeword is 4-byte aligned.
1576 Related to >CFA is >DFA which takes a dictionary entry address as returned by FIND and
1577 returns a pointer to the first data field.
1579 FIND returns a pointer to this
1580 | >CFA converts it to a pointer to this
1582 | | >DFA converts it to a pointer to this
1585 +---------+---+---+---+---+---+---+---+---+------------+------------+------------+------------+
1586 | LINK | 6 | D | O | U | B | L | E | 0 | DOCOL | DUP | + | EXIT |
1587 +---------+---+---+---+---+---+---+---+---+------------+------------+------------+------------+
1590 (Note to those following the source of FIG-FORTH / ciforth: My >DFA definition is
1591 different from theirs, because they have an extra indirection).
1593 You can see that >DFA is easily defined in FORTH just by adding 4 to the result of >CFA.
1596 defword ">DFA",4,,TDFA
1597 .int TCFA // >CFA (get code field address)
1598 .int INCR4 // 4+ (add 4 to it to get to next word)
1599 .int EXIT // EXIT (return from FORTH word)
1602 COMPILING ----------------------------------------------------------------------
1604 Now we'll talk about how FORTH compiles words. Recall that a word definition looks like this:
1608 and we have to turn this into:
1610 pointer to previous word
1613 +--|------+---+---+---+---+---+---+---+---+------------+------------+------------+------------+
1614 | LINK | 6 | D | O | U | B | L | E | 0 | DOCOL | DUP | + | EXIT |
1615 +---------+---+---+---+---+---+---+---+---+------------+--|---------+------------+------------+
1616 ^ len pad codeword |
1618 LATEST points here points to codeword of DUP
1620 There are several problems to solve. Where to put the new word? How do we read words? How
1621 do we define the words : (COLON) and ; (SEMICOLON)?
1623 FORTH solves this rather elegantly and as you might expect in a very low-level way which
1624 allows you to change how the compiler works on your own code.
1626 FORTH has an INTERPRET function (a true interpreter this time, not DOCOL) which runs in a
1627 loop, reading words (using WORD), looking them up (using FIND), turning them into codeword
1628 pointers (using >CFA) and deciding what to do with them.
1630 What it does depends on the mode of the interpreter (in variable STATE).
1632 When STATE is zero, the interpreter just runs each word as it looks them up. This is known as
1635 The interesting stuff happens when STATE is non-zero -- compiling mode. In this mode the
1636 interpreter appends the codeword pointer to user memory (the HERE variable points to the next
1637 free byte of user memory -- see DATA SEGMENT section below).
1639 So you may be able to see how we could define : (COLON). The general plan is:
1641 (1) Use WORD to read the name of the function being defined.
1643 (2) Construct the dictionary entry -- just the header part -- in user memory:
1645 pointer to previous word (from LATEST) +-- Afterwards, HERE points here, where
1646 ^ | the interpreter will start appending
1648 +--|------+---+---+---+---+---+---+---+---+------------+
1649 | LINK | 6 | D | O | U | B | L | E | 0 | DOCOL |
1650 +---------+---+---+---+---+---+---+---+---+------------+
1653 (3) Set LATEST to point to the newly defined word, ...
1655 (4) .. and most importantly leave HERE pointing just after the new codeword. This is where
1656 the interpreter will append codewords.
1658 (5) Set STATE to 1. This goes into compile mode so the interpreter starts appending codewords to
1659 our partially-formed header.
1661 After : has run, our input is here:
1666 Next byte returned by KEY will be the 'D' character of DUP
1668 so the interpreter (now it's in compile mode, so I guess it's really the compiler) reads "DUP",
1669 looks it up in the dictionary, gets its codeword pointer, and appends it:
1671 +-- HERE updated to point here.
1674 +---------+---+---+---+---+---+---+---+---+------------+------------+
1675 | LINK | 6 | D | O | U | B | L | E | 0 | DOCOL | DUP |
1676 +---------+---+---+---+---+---+---+---+---+------------+------------+
1679 Next we read +, get the codeword pointer, and append it:
1681 +-- HERE updated to point here.
1684 +---------+---+---+---+---+---+---+---+---+------------+------------+------------+
1685 | LINK | 6 | D | O | U | B | L | E | 0 | DOCOL | DUP | + |
1686 +---------+---+---+---+---+---+---+---+---+------------+------------+------------+
1689 The issue is what happens next. Obviously what we _don't_ want to happen is that we
1690 read ";" and compile it and go on compiling everything afterwards.
1692 At this point, FORTH uses a trick. Remember the length byte in the dictionary definition
1693 isn't just a plain length byte, but can also contain flags. One flag is called the
1694 IMMEDIATE flag (F_IMMED in this code). If a word in the dictionary is flagged as
1695 IMMEDIATE then the interpreter runs it immediately _even if it's in compile mode_.
1697 This is how the word ; (SEMICOLON) works -- as a word flagged in the dictionary as IMMEDIATE.
1699 And all it does is append the codeword for EXIT on to the current definition and switch
1700 back to immediate mode (set STATE back to 0). Shortly we'll see the actual definition
1701 of ; and we'll see that it's really a very simple definition, declared IMMEDIATE.
1703 After the interpreter reads ; and executes it 'immediately', we get this:
1705 +---------+---+---+---+---+---+---+---+---+------------+------------+------------+------------+
1706 | LINK | 6 | D | O | U | B | L | E | 0 | DOCOL | DUP | + | EXIT |
1707 +---------+---+---+---+---+---+---+---+---+------------+------------+------------+------------+
1713 And that's it, job done, our new definition is compiled, and we're back in immediate mode
1714 just reading and executing words, perhaps including a call to test our new word DOUBLE.
1716 The only last wrinkle in this is that while our word was being compiled, it was in a
1717 half-finished state. We certainly wouldn't want DOUBLE to be called somehow during
1718 this time. There are several ways to stop this from happening, but in FORTH what we
1719 do is flag the word with the HIDDEN flag (F_HIDDEN in this code) just while it is
1720 being compiled. This prevents FIND from finding it, and thus in theory stops any
1721 chance of it being called.
1723 The above explains how compiling, : (COLON) and ; (SEMICOLON) works and in a moment I'm
1724 going to define them. The : (COLON) function can be made a little bit more general by writing
1725 it in two parts. The first part, called CREATE, makes just the header:
1727 +-- Afterwards, HERE points here.
1730 +---------+---+---+---+---+---+---+---+---+
1731 | LINK | 6 | D | O | U | B | L | E | 0 |
1732 +---------+---+---+---+---+---+---+---+---+
1735 and the second part, the actual definition of : (COLON), calls CREATE and appends the
1736 DOCOL codeword, so leaving:
1738 +-- Afterwards, HERE points here.
1741 +---------+---+---+---+---+---+---+---+---+------------+
1742 | LINK | 6 | D | O | U | B | L | E | 0 | DOCOL |
1743 +---------+---+---+---+---+---+---+---+---+------------+
1746 CREATE is a standard FORTH word and the advantage of this split is that we can reuse it to
1747 create other types of words (not just ones which contain code, but words which contain variables,
1748 constants and other data).
1751 defcode "CREATE",6,,CREATE
1753 // Get the name length and address.
1754 pop %ecx // %ecx = length
1755 pop %ebx // %ebx = address of name
1758 movl var_HERE,%edi // %edi is the address of the header
1759 movl var_LATEST,%eax // Get link pointer
1760 stosl // and store it in the header.
1762 // Length byte and the word itself.
1763 mov %cl,%al // Get the length.
1764 stosb // Store the length/flags byte.
1766 mov %ebx,%esi // %esi = word
1767 rep movsb // Copy the word
1769 addl $3,%edi // Align to next 4 byte boundary.
1772 // Update LATEST and HERE.
1774 movl %eax,var_LATEST
1779 Because I want to define : (COLON) in FORTH, not assembler, we need a few more FORTH words
1782 The first is , (COMMA) which is a standard FORTH word which appends a 32 bit integer to the user
1783 memory pointed to by HERE, and adds 4 to HERE. So the action of , (COMMA) is:
1785 previous value of HERE
1788 +---------+---+---+---+---+---+---+---+---+-- - - - - --+------------+
1789 | LINK | 6 | D | O | U | B | L | E | 0 | | <data> |
1790 +---------+---+---+---+---+---+---+---+---+-- - - - - --+------------+
1795 and <data> is whatever 32 bit integer was at the top of the stack.
1797 , (COMMA) is quite a fundamental operation when compiling. It is used to append codewords
1798 to the current word that is being compiled.
1801 defcode ",",1,,COMMA
1802 pop %eax // Code pointer to store.
1806 movl var_HERE,%edi // HERE
1808 movl %edi,var_HERE // Update HERE (incremented)
1812 Our definitions of : (COLON) and ; (SEMICOLON) will need to switch to and from compile mode.
1814 Immediate mode vs. compile mode is stored in the global variable STATE, and by updating this
1815 variable we can switch between the two modes.
1817 For various reasons which may become apparent later, FORTH defines two standard words called
1818 [ and ] (LBRAC and RBRAC) which switch between modes:
1820 Word Assembler Action Effect
1821 [ LBRAC STATE := 0 Switch to immediate mode.
1822 ] RBRAC STATE := 1 Switch to compile mode.
1824 [ (LBRAC) is an IMMEDIATE word. The reason is as follows: If we are in compile mode and the
1825 interpreter saw [ then it would compile it rather than running it. We would never be able to
1826 switch back to immediate mode! So we flag the word as IMMEDIATE so that even in compile mode
1827 the word runs immediately, switching us back to immediate mode.
1830 defcode "[",1,F_IMMED,LBRAC
1832 movl %eax,var_STATE // Set STATE to 0.
1835 defcode "]",1,,RBRAC
1836 movl $1,var_STATE // Set STATE to 1.
1840 Now we can define : (COLON) using CREATE. It just calls CREATE, appends DOCOL (the codeword), sets
1841 the word HIDDEN and goes into compile mode.
1844 defword ":",1,,COLON
1845 .int WORD // Get the name of the new word
1846 .int CREATE // CREATE the dictionary entry / header
1847 .int LIT, DOCOL, COMMA // Append DOCOL (the codeword).
1848 .int LATEST, FETCH, HIDDEN // Make the word hidden (see below for definition).
1849 .int RBRAC // Go into compile mode.
1850 .int EXIT // Return from the function.
1853 ; (SEMICOLON) is also elegantly simple. Notice the F_IMMED flag.
1856 defword ";",1,F_IMMED,SEMICOLON
1857 .int LIT, EXIT, COMMA // Append EXIT (so the word will return).
1858 .int LATEST, FETCH, HIDDEN // Toggle hidden flag -- unhide the word (see below for definition).
1859 .int LBRAC // Go back to IMMEDIATE mode.
1860 .int EXIT // Return from the function.
1863 EXTENDING THE COMPILER ----------------------------------------------------------------------
1865 Words flagged with IMMEDIATE (F_IMMED) aren't just for the FORTH compiler to use. You can define
1866 your own IMMEDIATE words too, and this is a crucial aspect when extending basic FORTH, because
1867 it allows you in effect to extend the compiler itself. Does gcc let you do that?
1869 Standard FORTH words like IF, WHILE, ." and so on are all written as extensions to the basic
1870 compiler, and are all IMMEDIATE words.
1872 The IMMEDIATE word toggles the F_IMMED (IMMEDIATE flag) on the most recently defined word,
1873 or on the current word if you call it in the middle of a definition.
1877 : MYIMMEDWORD IMMEDIATE
1881 but some FORTH programmers write this instead:
1887 The two usages are equivalent, to a first approximation.
1890 defcode "IMMEDIATE",9,F_IMMED,IMMEDIATE
1891 movl var_LATEST,%edi // LATEST word.
1892 addl $4,%edi // Point to name/flags byte.
1893 xorb $F_IMMED,(%edi) // Toggle the IMMED bit.
1897 'addr HIDDEN' toggles the hidden flag (F_HIDDEN) of the word defined at addr. To hide the
1898 most recently defined word (used above in : and ; definitions) you would do:
1902 'HIDE word' toggles the flag on a named 'word'.
1904 Setting this flag stops the word from being found by FIND, and so can be used to make 'private'
1905 words. For example, to break up a large word into smaller parts you might do:
1907 : SUB1 ... subword ... ;
1908 : SUB2 ... subword ... ;
1909 : SUB3 ... subword ... ;
1910 : MAIN ... defined in terms of SUB1, SUB2, SUB3 ... ;
1915 After this, only MAIN is 'exported' or seen by the rest of the program.
1918 defcode "HIDDEN",6,,HIDDEN
1919 pop %edi // Dictionary entry.
1920 addl $4,%edi // Point to name/flags byte.
1921 xorb $F_HIDDEN,(%edi) // Toggle the HIDDEN bit.
1924 defword "HIDE",4,,HIDE
1925 .int WORD // Get the word (after HIDE).
1926 .int FIND // Look up in the dictionary.
1927 .int HIDDEN // Set F_HIDDEN flag.
1928 .int EXIT // Return.
1931 ' (TICK) is a standard FORTH word which returns the codeword pointer of the next word.
1933 The common usage is:
1937 which appends the codeword of FOO to the current word we are defining (this only works in compiled code).
1939 You tend to use ' in IMMEDIATE words. For example an alternate (and rather useless) way to define
1940 a literal 2 might be:
1943 ' LIT , \ Appends LIT to the currently-being-defined word
1944 2 , \ Appends the number 2 to the currently-being-defined word
1951 (If you don't understand how LIT2 works, then you should review the material about compiling words
1952 and immediate mode).
1954 This definition of ' uses a cheat which I copied from buzzard92. As a result it only works in
1955 compiled code. It is possible to write a version of ' based on WORD, FIND, >CFA which works in
1959 lodsl // Get the address of the next word and skip it.
1960 pushl %eax // Push it on the stack.
1964 BRANCHING ----------------------------------------------------------------------
1966 It turns out that all you need in order to define looping constructs, IF-statements, etc.
1969 BRANCH is an unconditional branch. 0BRANCH is a conditional branch (it only branches if the
1970 top of stack is zero).
1972 The diagram below shows how BRANCH works in some imaginary compiled word. When BRANCH executes,
1973 %esi starts by pointing to the offset field (compare to LIT above):
1975 +---------------------+-------+---- - - ---+------------+------------+---- - - - ----+------------+
1976 | (Dictionary header) | DOCOL | | BRANCH | offset | (skipped) | word |
1977 +---------------------+-------+---- - - ---+------------+-----|------+---- - - - ----+------------+
1980 | +-----------------------+
1981 %esi added to offset
1983 The offset is added to %esi to make the new %esi, and the result is that when NEXT runs, execution
1984 continues at the branch target. Negative offsets work as expected.
1986 0BRANCH is the same except the branch happens conditionally.
1988 Now standard FORTH words such as IF, THEN, ELSE, WHILE, REPEAT, etc. can be implemented entirely
1989 in FORTH. They are IMMEDIATE words which append various combinations of BRANCH or 0BRANCH
1990 into the word currently being compiled.
1992 As an example, code written like this:
1994 condition-code IF true-part THEN rest-code
1998 condition-code 0BRANCH OFFSET true-part rest-code
2004 defcode "BRANCH",6,,BRANCH
2005 add (%esi),%esi // add the offset to the instruction pointer
2008 defcode "0BRANCH",7,,ZBRANCH
2010 test %eax,%eax // top of stack is zero?
2011 jz code_BRANCH // if so, jump back to the branch function above
2012 lodsl // otherwise we need to skip the offset
2016 LITERAL STRINGS ----------------------------------------------------------------------
2018 LITSTRING is a primitive used to implement the ." and S" operators (which are written in
2019 FORTH). See the definition of those operators later.
2021 TELL just prints a string. It's more efficient to define this in assembly because we
2022 can make it a single Linux syscall.
2025 defcode "LITSTRING",9,,LITSTRING
2026 lodsl // get the length of the string
2027 push %esi // push the address of the start of the string
2028 push %eax // push it on the stack
2029 addl %eax,%esi // skip past the string
2030 addl $3,%esi // but round up to next 4 byte boundary
2034 defcode "TELL",4,,TELL
2035 mov $1,%ebx // 1st param: stdout
2036 pop %edx // 3rd param: length of string
2037 pop %ecx // 2nd param: address of string
2038 mov $__NR_write,%eax // write syscall
2043 QUIT AND INTERPRET ----------------------------------------------------------------------
2045 QUIT is the first FORTH function called, almost immediately after the FORTH system "boots".
2046 As explained before, QUIT doesn't "quit" anything. It does some initialisation (in particular
2047 it clears the return stack) and it calls INTERPRET in a loop to interpret commands. The
2048 reason it is called QUIT is because you can call it from your own FORTH words in order to
2049 "quit" your program and start again at the user prompt.
2051 INTERPRET is the FORTH interpreter ("toploop", "toplevel" or "REPL" might be a more accurate
2052 description -- see: http://en.wikipedia.org/wiki/REPL).
2055 // QUIT must not return (ie. must not call EXIT).
2056 defword "QUIT",4,,QUIT
2057 .int RZ,RSPSTORE // R0 RSP!, clear the return stack
2058 .int INTERPRET // interpret the next word
2059 .int BRANCH,-8 // and loop (indefinitely)
2062 This interpreter is pretty simple, but remember that in FORTH you can always override
2063 it later with a more powerful one!
2065 defcode "INTERPRET",9,,INTERPRET
2066 call _WORD // Returns %ecx = length, %edi = pointer to word.
2068 // Is it in the dictionary?
2070 movl %eax,interpret_is_lit // Not a literal number (not yet anyway ...)
2071 call _FIND // Returns %eax = pointer to header or 0 if not found.
2072 test %eax,%eax // Found?
2075 // In the dictionary. Is it an IMMEDIATE codeword?
2076 mov %eax,%edi // %edi = dictionary entry
2077 movb 4(%edi),%al // Get name+flags.
2078 push %ax // Just save it for now.
2079 call _TCFA // Convert dictionary entry (in %edi) to codeword pointer.
2081 andb $F_IMMED,%al // Is IMMED flag set?
2083 jnz 4f // If IMMED, jump straight to executing.
2087 1: // Not in the dictionary (not a word) so assume it's a literal number.
2088 incl interpret_is_lit
2089 call _NUMBER // Returns the parsed number in %eax, %ecx > 0 if error
2093 mov $LIT,%eax // The word is LIT
2095 2: // Are we compiling or executing?
2098 jz 4f // Jump if executing.
2100 // Compiling - just append the word to the current dictionary definition.
2102 mov interpret_is_lit,%ecx // Was it a literal?
2105 mov %ebx,%eax // Yes, so LIT is followed by a number.
2109 4: // Executing - run it!
2110 mov interpret_is_lit,%ecx // Literal?
2111 test %ecx,%ecx // Literal?
2114 // Not a literal, execute it now. This never returns, but the codeword will
2115 // eventually call NEXT which will reenter the loop in QUIT.
2118 5: // Executing a literal, which means push it on the stack.
2122 6: // Parse error (not a known word or a number in the current BASE).
2123 // Print an error message followed by up to 40 characters of context.
2124 mov $2,%ebx // 1st param: stderr
2125 mov $errmsg,%ecx // 2nd param: error message
2126 mov $errmsgend-errmsg,%edx // 3rd param: length of string
2127 mov $__NR_write,%eax // write syscall
2130 mov (currkey),%ecx // the error occurred just before currkey position
2132 sub $buffer,%edx // %edx = currkey - buffer (length in buffer before currkey)
2133 cmp $40,%edx // if > 40, then print only 40 characters
2136 7: sub %edx,%ecx // %ecx = start of area to print, %edx = length
2137 mov $__NR_write,%eax // write syscall
2140 mov $errmsgnl,%ecx // newline
2142 mov $__NR_write,%eax // write syscall
2148 errmsg: .ascii "PARSE ERROR: "
2150 errmsgnl: .ascii "\n"
2152 .data // NB: easier to fit in the .data section
2155 .int 0 // Flag used to record if reading a literal
2158 ODDS AND ENDS ----------------------------------------------------------------------
2160 CHAR puts the ASCII code of the first character of the following word on the stack. For example
2161 CHAR A puts 65 on the stack.
2163 EXECUTE is used to run execution tokens. See the discussion of execution tokens in the
2164 FORTH code for more details.
2166 SYSCALL0, SYSCALL1, SYSCALL2, SYSCALL3 make a standard Linux system call. (See <asm/unistd.h>
2167 for a list of system call numbers). As their name suggests these forms take between 0 and 3
2168 syscall parameters, plus the system call number.
2170 In this FORTH, SYSCALL0 must be the last word in the built-in (assembler) dictionary because we
2171 initialise the LATEST variable to point to it. This means that if you want to extend the assembler
2172 part, you must put new words before SYSCALL0, or else change how LATEST is initialised.
2175 defcode "CHAR",4,,CHAR
2176 call _WORD // Returns %ecx = length, %edi = pointer to word.
2178 movb (%edi),%al // Get the first character of the word.
2179 push %eax // Push it onto the stack.
2182 defcode "EXECUTE",7,,EXECUTE
2183 pop %eax // Get xt into %eax
2184 jmp *(%eax) // and jump to it.
2185 // After xt runs its NEXT will continue executing the current word.
2187 defcode "SYSCALL3",8,,SYSCALL3
2188 pop %eax // System call number (see <asm/unistd.h>)
2189 pop %ebx // First parameter.
2190 pop %ecx // Second parameter
2191 pop %edx // Third parameter
2193 push %eax // Result (negative for -errno)
2196 defcode "SYSCALL2",8,,SYSCALL2
2197 pop %eax // System call number (see <asm/unistd.h>)
2198 pop %ebx // First parameter.
2199 pop %ecx // Second parameter
2201 push %eax // Result (negative for -errno)
2204 defcode "SYSCALL1",8,,SYSCALL1
2205 pop %eax // System call number (see <asm/unistd.h>)
2206 pop %ebx // First parameter.
2208 push %eax // Result (negative for -errno)
2211 defcode "SYSCALL0",8,,SYSCALL0
2212 pop %eax // System call number (see <asm/unistd.h>)
2214 push %eax // Result (negative for -errno)
2218 DATA SEGMENT ----------------------------------------------------------------------
2220 Here we set up the Linux data segment, used for user definitions and variously known as just
2221 the 'data segment', 'user memory' or 'user definitions area'. It is an area of memory which
2222 grows upwards and stores both newly-defined FORTH words and global variables of various
2225 It is completely analogous to the C heap, except there is no generalised 'malloc' and 'free'
2226 (but as with everything in FORTH, writing such functions would just be a Simple Matter
2227 Of Programming). Instead in normal use the data segment just grows upwards as new FORTH
2228 words are defined/appended to it.
2230 There are various "features" of the GNU toolchain which make setting up the data segment
2231 more complicated than it really needs to be. One is the GNU linker which inserts a random
2232 "build ID" segment. Another is Address Space Randomization which means we can't tell
2233 where the kernel will choose to place the data segment (or the stack for that matter).
2235 Therefore writing this set_up_data_segment assembler routine is a little more complicated
2236 than it really needs to be. We ask the Linux kernel where it thinks the data segment starts
2237 using the brk(2) system call, then ask it to reserve some initial space (also using brk(2)).
2239 You don't need to worry about this code.
2242 .set INITIAL_DATA_SEGMENT_SIZE,65536
2243 set_up_data_segment:
2244 xor %ebx,%ebx // Call brk(0)
2247 movl %eax,var_HERE // Initialise HERE to point at beginning of data segment.
2248 addl $INITIAL_DATA_SEGMENT_SIZE,%eax // Reserve nn bytes of memory for initial data segment.
2249 movl %eax,%ebx // Call brk(HERE+INITIAL_DATA_SEGMENT_SIZE)
2255 We allocate static buffers for the return static and input buffer (used when
2256 reading in files and text that the user types in).
2258 .set RETURN_STACK_SIZE,8192
2259 .set BUFFER_SIZE,4096
2262 /* FORTH return stack. */
2265 .space RETURN_STACK_SIZE
2266 return_stack_top: // Initial top of return stack.
2268 /* This is used as a temporary input buffer when reading from files or the terminal. */
2274 START OF FORTH CODE ----------------------------------------------------------------------
2276 We've now reached the stage where the FORTH system is running and self-hosting. All further
2277 words can be written as FORTH itself, including words like IF, THEN, .", etc which in most
2278 languages would be considered rather fundamental.
2280 I used to append this here in the assembly file, but I got sick of fighting against gas's
2281 crack-smoking (lack of) multiline string syntax. So now that is in a separate file called
2284 If you don't already have that file, download it from http://annexia.org/forth in order
2285 to continue the tutorial.
2288 /* END OF jonesforth.S */