Go FORTH and Code

taylorza · April 2, 2013, 6:09pm

Let me apologize upfront for the long post…

While I have not been around much lately, mostly due to a hectic travel schedule, I have also been busy working on something that I am really excited about. I am now ready to start sharing a little information and some demo code.

I have been building an interface that makes integrating fast “near native” code routines into your .NETMF projects much easier, without having to leave Visual Studio or install additional toolchains.

After a number of prototypes and tests of almost complete solutions, I settled on implementing a Forth compiler that runs on the device directly. Other prototypes have included a partial C scripting compiler and a nearly complete Pascal compiler.

So why Forth?

Learn more about Forth

I first learned Forth in the mid-80s and have had a passion for the language ever since. Forth is unique in that it can run interactively, i.e. interpret your commands and execute them directly, while at the same time you can compile a snippet of code that will run with improved performance when executed, all in the same session and without leaving the environment.

It is this dual nature that has always drawn me to Forth. The goal is to have code deployed to the board as part of the VS project and, at the same time, to connect to the board via the serial port and use a terminal to test out a few things before having them compiled into your code. The testing does not need the board to be in any special state: it can be running code from the VS project while you initiate commands via the serial port.

Here is a simple example of a factorial routine written in Forth. This implementation does not use recursion, but recursion is fully supported.


: factorial ( +n1 -- +n2 )
  dup 2 < if  drop 1 exit then
  dup
  begin dup 2 > while
    1-  swap over *  swap
  repeat  drop
;

This code defines a new Forth word called factorial. A word defined like this is compiled immediately, so executing it does not require interpretation, which provides a significant speed advantage.
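For readers who do not know Forth, the stack effects of factorial can be followed with a small model. This is only an illustrative Python sketch, assuming a plain list as the data stack; it is not how the VM is implemented, and each line is annotated with the Forth word it mirrors.

```python
# Illustrative Python model of the Forth data stack; not the VM's implementation.

def factorial(stack):
    """Mirror of the Forth word `factorial ( +n1 -- +n2 )`, one step per Forth word."""
    if stack[-1] < 2:                                  # dup 2 < if
        stack.pop()                                    #   drop
        stack.append(1)                                #   1 exit
        return
    stack.append(stack[-1])                            # dup    ( acc k )
    while stack[-1] > 2:                               # begin dup 2 > while
        stack[-1] -= 1                                 #   1-   ( acc k-1 )
        stack[-1], stack[-2] = stack[-2], stack[-1]    #   swap ( k-1 acc )
        stack.append(stack[-2])                        #   over ( k-1 acc k-1 )
        top = stack.pop()                              #   *
        stack[-1] *= top                               #        ( k-1 acc*(k-1) )
        stack[-1], stack[-2] = stack[-2], stack[-1]    #   swap ( acc' k-1 )
    stack.pop()                                        # repeat drop ( acc )


stack = [5]        # what interpreting "5" does
factorial(stack)   # what interpreting "factorial" does
print(stack)       # [120]
```

The running product stays second on the stack while the counter sits on top, which is why each pass through the loop needs two swaps.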

The most direct way to execute this code in .NETMF would be to use the Forth.Execute method, a .NETMF method that calls into the Forth VM.


Forth.Execute("5 factorial");

This will push the number 5 onto the stack and execute factorial. The result, 120, will be left on the stack to be picked up by the next code to execute, or you can use the Forth.Peek()/Forth.Pop() functions to pull the result off the stack into a .NETMF variable.

Here is a more visual example.

First we have some code that defines a few constants. The word ‘hex’ tells the environment to interpret all numbers from this point on as hex numbers; ‘decimal’ puts the environment into decimal parsing mode (octal and binary are also supported).


hex
500000 constant LCD_BASE_REG
LCD_BASE_REG @ constant LCD_BUFFER_ADDR

decimal
320 constant SCREEN_WIDTH
240 constant SCREEN_HEIGHT
SCREEN_WIDTH 2* constant SCREEN_WIDTH_BYTES
SCREEN_WIDTH_BYTES SCREEN_HEIGHT * constant SCREEN_TOTAL_BYTES

Next we define some new words in the language:

1. fillscreen: takes the number off the top of the stack and fills the screen with that color
2. setpixel: sets the pixel at the location and color defined by the three numbers on the stack
3. showoff: calls fillscreen in a loop to demonstrate the speed of the code


: fillscreen ( color -- )
  SCREEN_TOTAL_BYTES 0 do dup LCD_BUFFER_ADDR i + w! 2 +loop drop 
; 

: setpixel ( color x y -- )
  SCREEN_WIDTH * + 2* LCD_BUFFER_ADDR + w! 
;

hex
: showoff ( -- )
  10 0 do
    001f fillscreen
    0fc0 fillscreen
    f800 fillscreen
  loop
;
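The address arithmetic in fillscreen and setpixel can be checked with a small model. The Python sketch below is illustrative only: a bytearray stands in for the memory at LCD_BUFFER_ADDR, the little-endian 16-bit store is an assumption about the target, and `store16` and `pixel_at` are hypothetical helpers that are not part of the Forth code.

```python
# Illustrative model only: a bytearray stands in for the LCD buffer.
# Little-endian 16-bit stores are an assumption about the target.
import struct

SCREEN_WIDTH = 320
SCREEN_HEIGHT = 240
SCREEN_WIDTH_BYTES = SCREEN_WIDTH * 2            # 2* : two bytes per pixel
SCREEN_TOTAL_BYTES = SCREEN_WIDTH_BYTES * SCREEN_HEIGHT

framebuffer = bytearray(SCREEN_TOTAL_BYTES)

def store16(offset, value):
    """Model of `w!`: store a 16-bit value at a byte offset into the buffer."""
    struct.pack_into("<H", framebuffer, offset, value & 0xFFFF)

def fillscreen(color):
    # SCREEN_TOTAL_BYTES 0 do ... 2 +loop : step through the buffer 2 bytes at a time
    for i in range(0, SCREEN_TOTAL_BYTES, 2):
        store16(i, color)

def setpixel(color, x, y):
    # SCREEN_WIDTH * + 2* : byte offset = (y * SCREEN_WIDTH + x) * 2
    store16((y * SCREEN_WIDTH + x) * 2, color)

def pixel_at(x, y):
    """Hypothetical read-back helper used only to check the model."""
    return struct.unpack_from("<H", framebuffer, (y * SCREEN_WIDTH + x) * 2)[0]

fillscreen(0x001F)
setpixel(0xF800, 10, 2)
print(SCREEN_TOTAL_BYTES, hex(pixel_at(10, 2)), hex(pixel_at(0, 0)))
# 153600 0xf800 0x1f
```

Because showoff is defined while hex is active, its `10 0 do` loop runs 16 times (0x10), so one call performs 48 full-screen fills of 76,800 pixels each.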

To run ‘showoff’ for example you would just issue the following command on the .NETMF side


Forth.Execute("showoff");

And to fill the screen with white you would issue the following command


Forth.Execute("fillscreen", Color.White);

This overload of Execute will push the additional arguments onto the stack before executing the ‘fillscreen’ word.

You can push a .NETMF byte array onto the stack and have the FORTH code populate the array; when Forth.Execute returns, the managed array will be filled with data from the routine.

Everything is still a little rough and ready. It is currently prototyped using RLP and has essentially no optimizations at the moment.
If this proves useful, I would like to do two things:

1. Have a custom firmware which contains the support for using FORTH to write the code that needs more speed
2. Provide a custom DL40 based module which can be programmed using FORTH via the UART or through the DaisyLink interface to have custom code running on the device.

Here is a video showing the ‘showoff’ word being executed.

Architect · April 2, 2013, 6:13pm

Wow! Very interesting. I played with Forth a long time ago. How big is the Forth VM code?

taylorza · April 2, 2013, 6:23pm

The code is not very big; I wrote the entire VM in 2 days. It is not a port of an existing implementation, so the code is still evolving. It currently compiles to less than 25KB, but that includes the string tables for the error messages and debugging, which will come out.

Architect · April 2, 2013, 6:26pm

Very cool! Looking forward to trying it.

Gus_Issa · April 2, 2013, 6:36pm

Love it and look forward to seeing how fast it runs

Blue_Hair_Bob · April 2, 2013, 7:19pm

Big forth fan. Looking forward to it.

fradav · April 2, 2013, 8:17pm

Looks very promising.

Just a question: by “near native” speed, do you mean giving access to “near real-time” with μs resolution?

taylorza · April 2, 2013, 11:42pm

@ Blue Hair Bob - That is great to hear, I was worried that no one would even recognize the language.

@ fradav - I have not run any serious performance tests, but it should be much faster than managed code. It will still be much slower than native code, especially with the prototype implementation, since it still has the overhead of the C calling convention, which is a downside of implementing the Forth VM in C and not using assembly stubs.

Unknown · April 3, 2013, 12:11am

So the idea is that you can do Edit and Continue on the embedded device?

taylorza · April 3, 2013, 12:29am

Correct, but the edit/continue or REPL will be via a serial terminal.

jasdev · April 3, 2013, 12:35am

@ taylorza - very very cool!
A number of years ago I wrote an embedded application that processed real-time radar data (synchronous serial data at a relatively low bit rate) on a controller board that executed Forth code “natively”. I can’t remember the name of the device, but I think it was made by Rockwell. The radars were used in the air traffic control system in Taiwan.

taylorza · April 3, 2013, 12:36am

Here are the first quantitative performance tests. Keep in mind that this code is only 2 days old and not optimized.

Test : Double the value of each element of a 10,000 byte array

Managed Code : 476 ms
Forth Code : 25 ms

Forth Code x18 faster

Managed Code


byte[] _data = new byte[10000];


int len = _data.Length;
DateTime startTime = DateTime.Now;
for (int i = 0; i < len; i++)
{
  _data[i] *= 2;
}
DateTime endTime = DateTime.Now;
Debug.Print("Execution time : " + (endTime.Ticks - startTime.Ticks) / TimeSpan.TicksPerMillisecond);

Forth Code (Manipulating the same managed array)


: doublearray ( a-addr len -- )
  0 do
    dup i + dup c@ 2* swap c!
  loop
  drop
;
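As a rough guide to what the two versions compute, here is a minimal Python sketch of doublearray. It is illustrative only: a bytearray stands in for the managed byte[], and it assumes that `c!` keeps only the low 8 bits of the doubled value, which matches how a byte wraps in the managed loop.

```python
# Illustrative model: a bytearray stands in for the managed byte[] passed to Forth.

def doublearray(data):
    # 0 do ... loop : i runs from 0 to len-1
    for i in range(len(data)):
        value = data[i]               # dup i + dup c@ : fetch one byte
        value = (value << 1) & 0xFF   # 2* then c!     : shift left, keep the low 8 bits
        data[i] = value               # swap c!        : store back to the same address


data = bytearray([1, 2, 100, 200])
doublearray(data)
print(list(data))   # [2, 4, 200, 144]
```

The last element shows the wraparound: 200 doubled is 400, which does not fit in a byte, so 144 is stored.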

Architect · April 3, 2013, 1:02am

Amazing!

taylorza · April 3, 2013, 1:03am

@ jasdev - I am really excited to hear that there are some Forth guys here. At least the code will not look like total gibberish to everyone.

taylorza · April 3, 2013, 1:04am

@ Architect - Thanks. When I start to tune the compiler I think I can squeeze out a few more clock cycles.

Architect · April 3, 2013, 1:12am

I have no doubts!

P.S.
I hope the bank crisis didn’t hit you hard

leforban · April 3, 2013, 3:42am

Looks promising!!! However, the demo uses a constant multiplication by two, which may not be representative because most modern compilers do not perform a real multiplication but emit a left shift instead. Have you already implemented such a mechanism? What about the CLR?

Blue_Hair_Bob · April 3, 2013, 3:46am

When are you going to want alpha testers?

Justin · April 3, 2013, 4:13am

@ taylorza - Nice work young man

taylorza · April 3, 2013, 5:03am

@ leforban - You are correct, the test is not of the computational complexity but of the loop iteration. I did not want an empty loop body, otherwise the compiler would optimize it away, and I also wanted to make use of the stack in the Forth version to be as fair as possible, while at the same time testing the manipulation of the managed array from the Forth code.

I believe that .NETMF would be doing a left shift, since the firmware is written in C and, as you say, modern compilers will issue a left shift.

The Forth compiler does the same thing, and the following instruction is emitted for 2*:


mov	r2, r2, asl #1