Sunday, 13 March 2011

Efficiency in Forth

I'm busy implementing my own Forth at the moment and it's taking a little longer than anticipated. Each Forth word is a puzzle in itself:

  • What's the least number of words needed to write it?
  • What's the most efficient implementation?

For example, take a look at Implementing MIN in Forth without Conditional Code. The smallest version of MIN is 2 words shorter than eForth's MIN.

Even words with a trivial implementation can pose an interesting problem. For example which of the following is most efficient:

Jones Forth TUCK


: TUCK DUP -ROT ;

Alternative TUCK


: TUCK SWAP OVER ;

If DUP, -ROT, SWAP and OVER are all primitive, the alternative implementation will execute two fewer instructions on an 80x86 Forth. However, -ROT is often implemented in Forth which would cause the Jones Forth TUCK to be substantially slower. Here's a typical implementation of -ROT:

: -ROT SWAP >R SWAP R> ;

If you can think of an alternative two word implementation of TUCK or four word implementation of -ROT, please let me know. :-)

6 comments:

  1. Premature optmisation is the root of all evil John! Finish it and then optimise!

    Cheers,
    Ruben

    ReplyDelete
  2. Ruben: true, but only about 97% of the time ;-)

    ReplyDelete
  3. "but only about 97% of the time ;-) "

    Give examples of the 3%. If you don't finish it, then optimize, how do you know if a) you need to optimize, and b) if your "optimization" does any good.

    ReplyDelete
  4. In support of optimization: 1) it's fun, and 2) this is the core!
    --from someone who has implemented a Forth kernel before (on PowerPC).

    ReplyDelete
  5. Optimisation, in the '97% of the time, premature optimisation is the root of all evil' quote actually refers to 'micro-optimisation' which is about shaving a few processor cycles here and there.

    An example of the 3% would be when your optimisation forces larger changes...for example, in a software renderer, forcing power-of-two textures, allowing the use of bitshifts rather than multiplication and division.

    ReplyDelete