Tuesday, 3 April 2012

Itsy-Forth: the Dictionary and Inner Interpreter

Itsy Forth is a minimal Forth compiler implemented in under 1kB. Earlier we examined Itsy's outer interpreter. Now we take a closer look at the dictionary and inner interpreter.

Forth Dictionary

Itsy's dictionary is a linked list holding the name and code for each word (subroutine). Each entry in the list has a header containing a link, a counted string and an XT (execution token). For example, here's the dictionary entry for nip:

        ; header
        dw link_to_previous_word
        db 3, 'nip'
xt_nip  dw docolon
        ; body
        dw xt_swap
        dw xt_drop
        dw xt_exit

The first line of the header links to the previous word in the dictionary. The second line holds the word's name preceded by its length. The final line contains the XT, a pointer to the routine which performs the actual operation of the word. Itsy uses four different XTs:

  • docolon - The word is a list of pointers to XTs. Call each in turn.
  • doconst - The word is a constant. Place its value on the data stack.
  • dovar - The word is a variable. Place its address on the data stack.
  • pointer to body - The word is a primitive (machine code). Execute it.
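
To make the layout easier to picture, here's a rough Python model of the linked list. It isn't Itsy's code: the `Entry` class, the `find` helper and the three example words are assumptions made up for illustration, and the real dictionary is a run of bytes in memory rather than Python objects. The model assumes a search starts at the newest word and follows each link back until the list runs out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    link: Optional["Entry"]  # previous word in the dictionary, None for the first word
    name: str                # Itsy stores this as a counted string
    code: str                # 'docolon', 'doconst', 'dovar' or 'primitive'
    body: list               # XTs for a colon word, empty here for primitives


def find(latest, name):
    """Walk the links from the newest entry back to the oldest."""
    entry = latest
    while entry is not None:
        if entry.name == name:
            return entry
        entry = entry.link
    return None


# Hypothetical three-word dictionary, oldest first.
swap = Entry(None, 'swap', 'primitive', [])
drop = Entry(swap, 'drop', 'primitive', [])
nip = Entry(drop, 'nip', 'docolon', ['xt_swap', 'xt_drop', 'xt_exit'])
latest = nip

found = find(latest, 'drop')
missing = find(latest, 'dup')
```

Because each header only points backwards, adding a word never disturbs the entries already in the list.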


I'm not a big fan of macros. They're ugly and lock the code to a particular assembler. On the other hand they can add flexibility and make the code less prone to errors. Compare the definition of + with and without macros:

Without macros:

        dw link_to_previous_word
        db 1, '+'
xt_plus dw mc_plus
mc_plus pop ax
        add bx,ax
        jmp next

With macros:

        primitive '+',plus
        pop ax
        add bx,ax
        jmp next

The NASM macros to set up headers and maintain the linked list are pretty simple:

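        %define link 0
        %define immediate 080h

        ; %1 = name, %2 = label suffix, %3 = flags, %4 = code field
        %macro head 4
        %%link dw link
        %define link %%link
        %strlen %%count %1
        db %3 + %%count,%1
        xt_ %+ %2 dw %4
        %endmacro

        ; machine code follows the header, so the XT points to $+2
        %macro primitive 2-3 0
        head %1,%2,%3,$+2
        %endmacro

        %macro colon 2-3 0
        head %1,%2,%3,docolon
        %endmacro

        %macro constant 3
        head %1,%2,0,doconst
        val_ %+ %2 dw %3
        %endmacro

        %macro variable 3
        head %1,%2,0,dovar
        val_ %+ %2 dw %3
        %endmacro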
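
As a cross-check on what `head` emits, here's a small Python sketch that builds the same bytes: a two-byte link, a count byte with the flags added in, the name, then a two-byte code field. Only the field order and the `immediate` value of 080h come from the macros above. The `Dictionary` class, its method names, the 0x100 load address and the `DOCOLON` address are all hypothetical, and the sketch doesn't emit word bodies.

```python
import struct

IMMEDIATE = 0x80   # same value as the `immediate` define
DOCOLON = 0x0200   # hypothetical address of the docolon routine


class Dictionary:
    def __init__(self, origin=0x100):
        self.origin = origin   # assumed load address
        self.mem = bytearray()
        self.link = 0          # like `%define link 0`: the first word links to 0

    def here(self):
        return self.origin + len(self.mem)

    def head(self, name, flags=0, code=None):
        """Append a header and return the address of its code field."""
        link_addr = self.here()
        self.mem += struct.pack('<H', self.link)   # dw link
        self.link = link_addr                      # the next header links back here
        self.mem += bytes([flags + len(name)])     # db flags + count
        self.mem += name.encode('ascii')           # the name itself
        xt = self.here()
        if code is None:
            code = xt + 2                          # primitive: like `$+2`
        self.mem += struct.pack('<H', code)        # dw code field
        return xt


d = Dictionary()
xt_plus = d.head('+')                   # primitive
xt_nip = d.head('nip', 0, DOCOLON)      # colon word
xt_semi = d.head(';', IMMEDIATE, DOCOLON)
```

The count and the flags share one byte, which is why the macro can simply add them together.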

Macro Examples

constant is used to define a Forth constant. E.g. to define false = 0:

        constant 'false',false,0

variable creates a Forth variable. E.g. to create base and initialise to 10:

        variable 'base',base,10

primitive sets up an assembly language word. E.g. to create drop:

        primitive 'drop',drop
        pop bx
        jmp next

colon defines a compiled Forth word. E.g. to define nip:

        colon 'nip',nip
        dw xt_swap
        dw xt_drop
        dw xt_exit

Register Allocation

Itsy's use of the registers is similar to most 8086 Forths. The system stack is used for the data stack while a register is used for the return stack. Note the top element of the data stack is kept in a register to enhance performance:

  • sp - data stack pointer
  • bp - return stack pointer
  • si - Forth instruction pointer
  • di - pointer to current XT
  • bx - TOS (top of data stack)
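
Here's a rough Python model of the data stack with its top cached in a register. The `drop` and `plus` methods mirror the assembly shown earlier. The `push` method is my assumption about how a value would be placed on the stack (spill `bx`, then load the new value), and the class and attribute names are made up for illustration.

```python
class DataStack:
    def __init__(self):
        self.mem = []   # the part of the stack held in memory (addressed by sp)
        self.bx = 0     # top of stack, cached in a register

    def push(self, value):
        self.mem.append(self.bx)            # assumed: push bx
        self.bx = value                     # the new top lives in the register

    def drop(self):
        self.bx = self.mem.pop()            # pop bx

    def plus(self):
        ax = self.mem.pop()                 # pop ax
        self.bx = (self.bx + ax) & 0xFFFF   # add bx,ax (16-bit wraparound)


s = DataStack()
s.push(2)
s.push(3)
s.plus()
total = s.bx
```

With the top already in `bx`, `+` touches memory once instead of three times (two pops and a push).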

Itsy's Inner Interpreter

The Forth inner interpreter needs only three simple routines:

  • docolon - the XT to enter a Forth word. Save the Forth IP on the return stack then point it to the word being entered.
  • exit - return from a compiled Forth word. exit recovers the Forth IP from the return stack.
  • next - return from a primitive (machine code) word and call the next XT.

docolon dec bp
        dec bp
        mov word[bp],si
        lea si,[di+2]

next    lodsw
        xchg di,ax
        jmp word[di]

        primitive 'exit',exit
        mov si,word[bp]
        inc bp
        inc bp
        jmp next
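
To watch the three routines work together, here's a toy Python model of the inner interpreter. It's a sketch under several assumptions, not a translation of Itsy: addresses are list indices, a code field holds a Python method instead of a pointer to machine code, the data stack skips the TOS caching, and the `bye` word and `run` loop are hypothetical additions so the model has a way to stop.

```python
class VM:
    """Toy model of the inner interpreter. Addresses are list indices."""

    def __init__(self):
        self.mem = []        # cells: a routine (code field) or an XT (body cell)
        self.data = []       # data stack
        self.rstack = []     # return stack (bp in Itsy)
        self.si = 0          # Forth instruction pointer
        self.di = 0          # current XT
        self.running = False
        self.xt_exit = self.primitive(self.exit)
        self.xt_swap = self.primitive(self.swap)
        self.xt_drop = self.primitive(self.drop)
        self.xt_bye = self.primitive(self.bye)   # hypothetical: stops the model

    def primitive(self, routine):
        self.mem.append(routine)         # code field holds the routine
        return len(self.mem) - 1         # the XT is the code field's address

    def colon(self, *body):
        xt = len(self.mem)
        self.mem.append(self.docolon)    # code field
        self.mem.extend(body)            # body: a list of XTs ending in xt_exit
        return xt

    def docolon(self):
        self.rstack.append(self.si)      # save the caller's IP on the return stack
        self.si = self.di + 1            # point the IP at the first body cell

    def exit(self):
        self.si = self.rstack.pop()      # recover the caller's IP

    def next(self):
        self.di = self.mem[self.si]      # fetch the next XT (lodsw, xchg di,ax)
        self.si += 1
        self.mem[self.di]()              # jump through its code field

    def swap(self):
        self.data[-1], self.data[-2] = self.data[-2], self.data[-1]

    def drop(self):
        self.data.pop()

    def bye(self):
        self.running = False

    def run(self, xt):
        start = len(self.mem)
        self.mem.extend([xt, self.xt_bye])   # scratch thread: the word, then stop
        self.si = start
        self.running = True
        while self.running:              # stands in for the `jmp next` ending each routine
            self.next()


vm = VM()
xt_nip = vm.colon(vm.xt_swap, vm.xt_drop, vm.xt_exit)
xt_2nip = vm.colon(xt_nip, xt_nip, vm.xt_exit)   # a colon word calling colon words

vm.data = [1, 2, 3]
vm.run(xt_nip)
after_nip = list(vm.data)

vm.data = [1, 2, 3]
vm.run(xt_2nip)
after_2nip = list(vm.data)
```

The second word nests two calls to nip, so the return stack grows to two entries before unwinding. In the real code there is no loop: every primitive ends with `jmp next`, and docolon simply falls through into it.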

Next we'll define approximately 30 words and finally get the interpreter up and running. In the meantime I'd love to hear any comments on the code so far :-)


  1. Instead of using assembler-specific macros, you could use a general macro processor. Even though "the C pre-processor is for C," that hasn't stopped people from using it to pre-process assembly -- myself included :-). Something like m4 might also be an option.

  2. I second M4, it is pretty awesome (it was made for this kind of stuff). I have also seen people use FASM macros for things other than assembler. It has a very powerful macro language. You could also use Forth as a macro language; it works really well for writing assemblers.

  3. Any examples of FASM macros implementing FORTH elements would be useful to me. cwpjr02@gmail.com
    Thanks, Clyde

    1. I thought that RevaForth was an example, since that's what Google told me ... but when I downloaded it and looked at the sources, it's in NASM format. I don't know if that helps, but if it does, google for Reva201101.zip