CS 3304 Project #2

Date Assigned: October 15, 2003
Date Due: October 29, 2003, 11:59pm Eastern Time

In this assignment, you are going to implement an interpreter for (a subset of) the language Forth using ML. We begin by describing Forth and then give some guidelines on implementing your interpreter.

What is FORTH?

Forth is a cute language meant for control and embedded systems applications. It comes with a lot of functionality, but for this assignment, you will be implementing only a small subset of the entire language. You can read more about Forth at, e.g., the pForth website.

The central entity in Forth is a stack. You push things onto the stack and do operations from there. A Forth program is a bunch of "words" with spaces between them. For instance, the following is a simple Forth program:
23 7 91
All it does is push these three numbers onto the stack. 91 ends up on top and 23 at the bottom (last in, first out).

Here's another Forth program:
23 7 91 .
The only difference from the previous program is the "dot" (.) at the end. "dot" pops (removes) the top element off the stack and prints it, so this program prints "91". Afterwards, only two numbers remain on the stack (23 and 7), with 7 on top.

The next command we will encounter is ".S" (dot-S). Consider the program:
23 7 91 .S
This prints "23 7 91". dot-S prints the entire stack (bottom element first, topmost element rightmost) but doesn't remove anything.

So, we can put things onto the stack and remove them. The following program prints "91 7 23" and leaves the stack empty.
23 7 91 . . .
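The behaviour of integer literals, "dot", and "dot-S" described so far can be modelled in a few lines. The following is a minimal sketch in Python, used here only as neutral notation (the assignment itself must be written in ML). The names step and run are hypothetical, the stack is assumed to be an immutable tuple with the top element last, and popping from an empty stack is deliberately not handled.

```python
from typing import Tuple

Stack = Tuple[int, ...]    # bottom of the stack first, top of the stack last
Output = Tuple[str, ...]   # everything printed so far, in order


def step(word: str, stack: Stack, out: Output) -> Tuple[Stack, Output]:
    """Apply one word and return the new (stack, output) pair."""
    if word == ".":
        # Pop the top element and print it.
        return stack[:-1], out + (str(stack[-1]),)
    if word == ".S":
        # Print the whole stack, bottom first, without removing anything.
        return stack, out + tuple(str(n) for n in stack)
    # Assumption: any other word is an integer literal to push.
    return stack + (int(word),), out


def run(program: str) -> Tuple[Stack, Output]:
    """Run a whitespace-separated program, starting from an empty stack."""
    def go(words, stack, out):
        if not words:
            return stack, out
        new_stack, new_out = step(words[0], stack, out)
        return go(words[1:], new_stack, new_out)
    return go(tuple(program.split()), (), ())


print(run("23 7 91 ."))      # ((23, 7), ('91',))
print(run("23 7 91 .S"))     # ((23, 7, 91), ('23', '7', '91'))
print(run("23 7 91 . . ."))  # ((), ('91', '7', '23'))
```

Returning a fresh (stack, output) pair from every step, rather than updating a shared variable, keeps the model free of side effects.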
We can also do arithmetic operations on the stack. "+" takes the top two elements off the stack, adds them, and pushes the result back. "-", "*", and "/" work the same way, with the element pushed first as the left operand and the top element as the right operand. So,
4 5 + .
causes "9" to be printed and leaves the stack empty.
4 5 - .
causes "-1" to be printed and leaves the stack empty.
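The operand order matters for "-" and "/", as the "4 5 -" example shows. Below is a small Python sketch of the arithmetic words (Python is only neutral notation here; the assignment must be written in ML). The names ARITHMETIC and apply_arithmetic are hypothetical. The rounding behaviour of "/" is an assumption, because the text above does not specify it, and stacks with fewer than two elements are not handled.

```python
import operator

# Assumption: "/" is integer division that rounds toward negative infinity
# (Python's //). The assignment text does not say how "/" rounds.
ARITHMETIC = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.floordiv,
}


def apply_arithmetic(word, stack):
    """Pop the top two elements, combine them, and push the result.

    The element pushed earlier is the left operand, so "4 5 -" computes 4 - 5.
    The stack is a tuple with the top element last.
    """
    *rest, left, right = stack
    return tuple(rest) + (ARITHMETIC[word](left, right),)


print(apply_arithmetic("+", (4, 5)))     # (9,)
print(apply_arithmetic("-", (4, 5)))     # (-1,)
print(apply_arithmetic("*", (2, 4, 5)))  # (2, 20)
```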

Some other useful operations in Forth are given now. DUP duplicates the topmost element of the stack.
77 DUP .S
causes "77 77" to be printed (and the stack now contains these two integers). Similarly, SWAP swaps the topmost two elements on the stack.
8 7 SWAP .S
prints "7 8". DROP drops the topmost element from the stack (unlike "dot", it doesn't print anything). OVER causes a copy of the second element on the stack to leapfrog over the first. So,
8 9 OVER .S
causes "8 9 8" to be printed (and remain on the stack). Finally, ROT takes the third element from the stack and moves it to the top.
7 8 9 ROT .S
prints "8 9 7" (and this also remains on the stack). As more examples,
11 22 33 SWAP DUP
causes "11 33 22 22" to be on the stack (doesn't print anything). Similarly,
11 22 33 ROT DROP
leaves "22 33" on the stack (it doesn't print anything).

For your assignment, you only have to support the words described above: ".", ".S", "+", "-", "*", "/", DUP, SWAP, DROP, OVER, and ROT. Furthermore, you can assume that only integers are pushed onto the stack. A syntactically correct FORTH program is simply a string that contains any combination of integers and FORTH words, separated by white space. When printing with "dot" or "dot-S", you can assume that all output goes on the same line, with a space between the printed elements.
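The five stack-shuffling words can each be written as a small pure function from one stack to the next. The sketch below uses Python purely as neutral notation (your submission must be in ML). All names in it are hypothetical, the stack is assumed to be a tuple with the top element last, and no underflow checks are made.

```python
def dup(stack):
    """( a -- a a )"""
    return stack + (stack[-1],)


def swap(stack):
    """( a b -- b a )"""
    *rest, a, b = stack
    return (*rest, b, a)


def drop(stack):
    """( a -- )"""
    return stack[:-1]


def over(stack):
    """( a b -- a b a )"""
    return stack + (stack[-2],)


def rot(stack):
    """( a b c -- b c a )"""
    *rest, a, b, c = stack
    return (*rest, b, c, a)


WORDS = {"DUP": dup, "SWAP": swap, "DROP": drop, "OVER": over, "ROT": rot}


def run_words(words, stack=()):
    """Recursively apply a sequence of integers and shuffling words."""
    if not words:
        return stack
    head, tail = words[0], words[1:]
    # Assumption: anything that is not one of the five words is an integer.
    next_stack = WORDS[head](stack) if head in WORDS else stack + (int(head),)
    return run_words(tail, next_stack)


print(run_words("8 9 OVER".split()))            # (8, 9, 8)
print(run_words("7 8 9 ROT".split()))           # (8, 9, 7)
print(run_words("11 22 33 SWAP DUP".split()))   # (11, 33, 22, 22)
print(run_words("11 22 33 ROT DROP".split()))   # (22, 33)
```

Keeping the words in a lookup table means the recursive driver never needs to change when a new word is added.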

An Interpreter in ML

The interpreter reads lines from a file named "input.txt" (which will be in the current directory), where each line is a string denoting a syntactically correct FORTH program. It processes each program and produces a tuple (a,b), which is written to a file named "output.txt" (in the same directory). "a" is a string containing everything printed during the execution of the program. "b" is a string describing what remains on the stack (again with the topmost element to the right). Each line of the input has a corresponding line in the output.
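The line-by-line correspondence between the two files can be sketched as follows. This is an illustration in Python, not ML, and process_file and its interpret parameter are hypothetical names: interpret stands for whatever function turns one program line into one output line. Ending every output line with a newline is an assumption.

```python
import tempfile
from pathlib import Path


def process_file(interpret, in_path="input.txt", out_path="output.txt"):
    """Write interpret(line) to out_path for every line of in_path."""
    lines = Path(in_path).read_text().splitlines()
    results = map(interpret, lines)  # one result per input line, in order
    # Assumption: each output line, including the last, ends with a newline.
    Path(out_path).write_text("".join(result + "\n" for result in results))


# Demonstration in a temporary directory with a stand-in for the interpreter.
with tempfile.TemporaryDirectory() as folder:
    source = Path(folder) / "input.txt"
    target = Path(folder) / "output.txt"
    source.write_text("1 2 +\n3 4 .\n")
    process_file(lambda line: "<" + line + ">", source, target)
    print(target.read_text())
```

All input and output is confined to this one function, so the interpreter proper can stay free of side effects.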

For instance, for the input
11 22 33 ROT DROP .
the output is
("33","22")
(yes, you have to print those brackets and quotes). For the input
11 1 3 4 + 5 7 DROP . .S
the output is
("5 11 1 7","11 1 7")
If something illegal is attempted, your interpreter should print
(illegal,illegal)
This can happen, for instance, when dot, DUP, or DROP is attempted on an empty stack, when an arithmetic operation, SWAP, or OVER is attempted on a stack that contains fewer than two elements, or when ROT is attempted on a stack that contains fewer than three elements.

If nothing is printed by the program, or if the stack is empty, you should substitute an empty string ("") in the appropriate place in the returned tuple. For instance,
3 2 . .
produces
("2 3","")
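Putting the output rules together, the following Python sketch maps one input line to one output line, including the (illegal,illegal) case and the empty-string cases. It is a reference model of the expected behaviour only, not an ML solution, and every name in it is hypothetical. Two points are assumptions because the text above does not settle them: "/" is taken to be floor division, and division by zero is treated as illegal.

```python
# word -> (operands required, function from those operands to the values pushed back)
# Assumption: "/" is floor division; the assignment does not specify rounding.
STACK_WORDS = {
    "+":    (2, lambda a, b: (a + b,)),
    "-":    (2, lambda a, b: (a - b,)),
    "*":    (2, lambda a, b: (a * b,)),
    "/":    (2, lambda a, b: (a // b,)),
    "DUP":  (1, lambda a: (a, a)),
    "DROP": (1, lambda a: ()),
    "SWAP": (2, lambda a, b: (b, a)),
    "OVER": (2, lambda a, b: (a, b, a)),
    "ROT":  (3, lambda a, b, c: (b, c, a)),
}


def step(word, stack, out):
    """Return the next (stack, out) pair, or None if the word is illegal here."""
    if word == ".S":
        return stack, out + stack
    if word == ".":
        return (stack[:-1], out + stack[-1:]) if stack else None
    if word in STACK_WORDS:
        arity, effect = STACK_WORDS[word]
        # Assumption: dividing by zero is reported as illegal.
        if len(stack) < arity or (word == "/" and stack[-1] == 0):
            return None
        return stack[:-arity] + effect(*stack[-arity:]), out
    return stack + (int(word),), out  # integer literal


def run(words, stack=(), out=()):
    """Recursively process the words; None signals an illegal operation."""
    if not words:
        return stack, out
    state = step(words[0], stack, out)
    return None if state is None else run(words[1:], *state)


def interpret(line):
    """Format the result of one program as the tuple line for output.txt."""
    result = run(tuple(line.split()))
    if result is None:
        return "(illegal,illegal)"
    stack, out = result
    show = lambda numbers: '"' + " ".join(map(str, numbers)) + '"'
    return "(" + show(out) + "," + show(stack) + ")"


print(interpret("11 22 33 ROT DROP ."))       # ("33","22")
print(interpret("11 1 3 4 + 5 7 DROP . .S"))  # ("5 11 1 7","11 1 7")
print(interpret("3 2 . ."))                   # ("2 3","")
print(interpret("1 2 ROT"))                   # (illegal,illegal)
```

Recording the number of required operands next to each word lets a single length check cover every underflow case listed above.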

Implementation Guidelines

Your implementation must adhere to the tenets of functional programming. Specifically, your program must utilize recursion and higher-order functions to the fullest extent. It must also be free of side effects with the exception of input/output.

You may not use any imperative features of ML (remember, the language does have some impurities). For example, you may not use statement blocks of the form (e1; e2; ...), which evaluate several expressions in sequence for their side effects. You may use let expressions, however, as they do not run contrary to the spirit of functional programming. Absolutely no iteration is permitted (use recursion).

Your program must contain at least one curry-able function and you must make use of some curried form of it in your solution.
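One place where currying fits naturally is the family of arithmetic words, which differ only in the operation they apply. The sketch below shows the idea in Python with nested functions (illustration only; your submission must be in ML, where a function declared as fun f x y = ... is curried in the same sense). The names binary_word, add, and subtract are hypothetical.

```python
def binary_word(operation):
    """Curried: binary_word(operation)(stack) pops two operands and pushes the result."""
    def on_stack(stack):
        *rest, left, right = stack
        return (*rest, operation(left, right))
    return on_stack


# Supplying only the first argument yields a specialised stack function.
add = binary_word(lambda a, b: a + b)
subtract = binary_word(lambda a, b: a - b)

print(add((4, 5)))          # (9,)
print(subtract((1, 4, 5)))  # (1, -1)
```

Each partially applied form, such as add, is itself a function from stacks to stacks and can be stored in a table or passed to a higher-order function.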

Adopt an iterative development approach. For example, first get your interpreter to handle simple pushing onto and popping off the stack, and then proceed to add the more complex FORTH words.

You might want to open the SML-supplied TextIO, Int, Char, and String structures to see if they provide any useful functions and datatypes (opening a structure in interactive mode dumps its list of declarations on the screen). See also the code from the lecture summaries (especially the one from Oct 13). Keep in mind that when you open structures, their definitions might override other functions. To be careful, you can always prefix functions with the structure name (e.g., TextIO.endOfStream).

Submitting Your Program

Your ML implementation is to be organized in a single file. Recall that you can "read" in ML statements from a file by typing use "<filename>"; inside the SML session. Alternatively, you can just do "sml < <filename>" from the UNIX command prompt (this is really how we will test your submission).

All documentation and identifying information should be placed in comments within the single source file. The comments at the beginning of the file should identify the assignment and give your full name. Every ML function should be preceded by comments explaining its purpose, the meaning of each parameter, and the general strategy of its implementation if applicable.

Your program will be submitted electronically through the Web-CAT Curator for automatic grading. 80% of the score comes from correct implementation and execution; 20% of the score comes from (functional) programming style, sound design, and good coding principles (comments, readability). The system will begin accepting submissions on 10/22. You will have unlimited submissions and we will grade the final submission only. Remember we do not accept late assignments or projects, unless you have already arranged something with us. Therefore, please stay abreast of the deadline and plan your schedule accordingly.

