Write simple scheme interpreter on ruby
01 Nov 2015TL;DR: github repo
Every developer has a moment in his life where he wants to write his own programming language. In this article, I want to show you how to do this for a simple lisp compiler.
Why scheme and lisp?
Firstly, lisp is very simple for create and for understanding.
Lisp (LISt Processor) is a family of languages which is based on the idea of S-expressions.
S-Expression are needed for data representation and may consist of atoms (numbers, symbols, boolean expressions) or are the expression of the form (x . y)
where x
and y
are s-expressions.
This expression may be formed as lists ((1 . ( 2 . 3))
this equals (1 2 3)
) and trees (((1 . (2 . 3)) . (4 . 5))
).
Secondly, after creating interpreter you can better understand the language (the author fully understood the environment idea). Also you can understand the main idea of compilers and interpreters.
Now let’s begin our journey into the world of compilers and interpreters so that we can write a simple scheme interpreter.
Main idea
Our language will contain two parts: a parser which translates the string to AST and eval
function.
This function will take the AST with envariement value and will return the result of the code.
Schematically, it looks like this:
First step. Parser.
To begin, let’s define what we want to get.
For example, we have a string '(+ 1 1 1)'
.
What should our parser return? What kind of data structure should we receive? I think, that an array would be correct.
Let write simple test code:
As you can see, this is a simple code, so I just displayed parse
method code:
As you know, in lisp you can write your code with nested operators, for example - (+ (* 2 2) (- 5 3))
.
And this code will return 6.
When we use our parser for this code, the result is not quite what we need, so let’s update our test code:
As you might guess, the most obvious way to fix our code is to call parse
method in recursion and all array elements from '('
to ')'
.
We will move to a nested array.
The code will be look loke this:
We did it! Let’s start to make eval
method.
Eval method
As I said earlier, our interpreter consists of two parts: a parser and eval
function.
The eval
function will take two arguments: an expression, exp
, that we want to evaluate, and an environment, env
, in which to evaluate it. An environment is a mapping from variable names to their values.
By default, eval will use a instance value that includes the names for a bunch of standard things.
let’s implement @env
variable with car
, cdr
and cons
functions:
The next step is to make eval
function which will look for a match on the first element of the input array
Now we have a problem: what will happen when the first element of the array will be not be a symbol (integer for example) and what will be happen when we have nested functions? I think we can add a check to the element type like this:
Some (eg arithmetic), we can easily add to env
variable, and some do not.
Therefore, we need to extend the checking in eval
function. We will add a check for function name.
For example, the code below will demonstrate quote
and if
functions:
The next step is to initialize define
and lambda
functions.
In scheme define
function syntax is as follows:
And our code must create a new key-value pair in the env
hash:
The last function, lambda
in scheme will have this syntax:
The first thing that comes to mind is to return the new lambda
object with a new value inside env
that will serve our code:
As you can see, we made basic functionality of the our interpreter.
I will leave learning how to do the implement the arithmetic methods and other methods such as true
, false
, list
, etc up to the reader.
REPL
In main REPL have really simple idea repl takes single user inputs, evaluates them, and returns the result to the user. And all this is happens in an infinite loop:
As a result, you should get something like this:
Conclusions
Right now, we have a simple scheme interpreter. It is easy to expand since we wrote a simple repl, and considered the basic idea of the interpreter. In this article, we did not consider such important concepts as macros, multithreading, code optimization, work with the system, and much more. These concepts will be discussed in future articles.
Further reading
- Lisp interpreter written on python (http://norvig.com/lispy.html)
- Lisp interpreter written on C lang (http://www.buildyourownlisp.com)
- http://www.wikiwand.com/en/Scheme_(programming_language)
- http://www.wikiwand.com/en/S-expression