Tail recursion in REBOL

REBOL 1.0 is properly tail recursive. Proper tail recursion is usually easy to achieve in most languages, but in REBOL it surprisingly difficult. Here is a short description of the problem and solution.

-----

A quick intro to the language

REBOL has been characterized as `like Lisp but without the parenthesis' or `the result of Ousterhout meeting Sussman in a dark alley'. Like Lisp, REBOL uses prefix notation; however, REBOL's notation is unbracketed, making it much closer to the `Polish notation' developed by Jan Lukasiewicz in the 1920's.

In order to provide an unbracketed notation, REBOL requires that all functions have fixed `arity'. No function takes optional or unlimited numbers of arguments (there are mechanisms available to work around this limitation). When REBOL encounters a function during evaluation, it assumes that the function will be applied to subsequent arguments. A special quoting mechanism indicates that a function is to be passed as an argument rather than invoked.

REBOL supports a `pseudo-infix notation' for mathematical operators. Infix operators are binary, and there is a simple syntactic translation that converts infix to prefix. REBOL's infix notation is evaluated left-to-right; there are no standard `rules of precedence'.

Because REBOL supports infix notation, order of evaluation for infix operators may be controlled by parenthesis. Parenthesis are not specific to the infix operators, they can be used anywhere else as well, but the prefix notation of REBOL makes them superfluous (to the interpreter, at least. The user may want to parenthesize some expressions in order to make them more readable.)

A typical REBOL expression may look like this:

 floor random expt 2 x length foo

Which would be roughly equivalent to the lisp form:

 (floor (random (expt 2 x)) (length foo))

Evaluation would proceed as follows:

  FLOOR takes two arguments, recursively find the first argument...

      RANDOM takes one argument, recursively find it ...

          EXPT takes two arguments, recursively find the first ...

              2 is atomic, return it

          Get second arg for EXPT

              x is a variable, (assume for this example x equals 16)

          Apply EXPT to 2 and 16 yielding 65536, return this....

      Apply RANDOM to 65536 getting back ????

  Get second argument to FLOOR...

      LENGTH takes one argument....

          FOO is not a function, return it....

      Apply LENGTH to value of FOO.....

  Apply FLOOR to the random number and result of length.

Expressions may be concatenated in order to execute them sequentially. Again, no special syntax is necessary.

A function that is the result of a computation will not automatically be applied. To illustrate this, suppose `make-adder' is a function that takes an integer and returns a function that adds that integer to an argument. The REBOL expression

 MAKE-ADDER 3

would evaluate to a function, but the expression

 MAKE-ADDER 3 4

would NOT apply that function. Rather, this expression would be treated as a sequence of two expressions: The call to make-adder (the result of which would be discarded) and the self-evaluating expression `4'. The primitive function `DO' may be used to invoke a computed function on an argument.

-----

Blocks

A `block' in REBOL is indicated by square brackets around an expression or series of expressions. A block is a delayed evaluation (a thunk). It is lexically scoped, and not memoized. Because blocks delay evaluation, there is no need for special forms within the language. For instance, a `when' function could be written as

 my-when: func [pred consequences] [if pred do consequences]

The DO primitive, when applied to a block, forces execution of the block.

Why tail recursion is important in REBOL

Since blocks are thunks, all user-defined control constructs are written in continuation-passing style. Without proper tail-recursion, user-defined control constructs will blow out the stack. One possible solution is to have a plethora of primitive control constructs (if, when, while, until, loop, repeat, etc.) The drawback of this method is that each new control construct must be written in C, and care must be taken to preserve the language semantics (for instance, one must be careful to ensure lexical scoping). A properly tail-recursive implementation is more flexible for the user.

Unfortunately, making REBOL properly tail recursive is quite difficult. Let us explore the problem by writing a toy rebol interpreter in Scheme.

(define (eval-sequence block env)
  (if (empty-block? block)
      block
      (multiple-value-bind (value remainder)
          (eval-1 block env)  ;; ****
        (if (null? remainder)
            value
            (eval-sequence remainder env)))))

(define (eval-1 block env)
  (let ((token (car block)))
    (cond ((quoted? token) (values (unquote token) (cdr block)))
          ((assignment? token) (eval-assignment token (cdr block) env))
          ((block? token) (values (make-closure token env) (cdr block))
          (else
            (let ((value (lookup token env)))
              (if (or (fquoted? token)
                      (not (procedure? value)))
                  (values value (cdr block))
                  (eval-arguments value (cdr block) env))))))))

(define (eval-arguments function block env)
  (eval-arguments-1 function block env (arity function) '()))

(define (eval-arguments-1 function block env arity arglist)
  (cond ((zero? arity) (values (apply function (reverse arglist)) block))
        ((empty-block? block) (error "Too few arguments."))
	;;; fexpr processing would go here
        (else
          (multiple-value-bind (value remainder)
              (eval-1 block env)
            (eval-arguments-1 function remainder env (-1+ arity) (cons value arglist))))))

Note the line with asterisks. When evaluating a sequence of expressions in a block, we recursively evaluate the first expression in the sequence obtaining its value and the `remaining' part of the block. We don't know how much of the block will remain because we will `eat' arguments from the block incrementally. We only know that if there are any elements left, then we should continue evaluation of the block rather than return the value.

If this were a scheme interpreter, we would have a simple solution:

(define (eval-sequence sequence env)  ;; scheme version
  (cond ((empty-sequence? sequence) nil)
        ((at-last-element? sequence) (eval (car sequence) env)) ;; tail recurse
        (t (eval (car sequence) env)
           (eval-sequence (cdr sequence) env))))

We would tail-recurse onto eval when we reach the last element. Again, we cannot do this in REBOL because we cannot write an `at-last-element?' function.

-----

The solution

The solution seems obvious in retrospect, but it took several weeks before I could convince myself that I wasn't just missing something trivial. I tried reorganizing the code, passing back flags, etc. Finally I pulled out the big guns and considered the denotational semantics of the language.

Tail recursion comes down to this: when recursively invoking the interpreter, do not create `trivial' continuations. A trivial continuation is the `identity' function, i.e., it simply returns the value handed to it.

How does one find the trivial continuations? Convert the function to `continuation-passing style' to make the continuations explicit, then eta-reduce any continuation that you can. So a properly tail-recursive program is one in which the continuation-passing-style has been eta-reduced.

Here's the cps version of a Rebol interpreter in Scheme:

(define (eval-sequence block env k)
  (if (empty-block? block)
      block
      (eval-1 block env
        (lambda (value more)
          (eval-sequence more env k))
	(lambda (value) (k value)) ; *****
       )))

(define (eval-1 block env k-more k-last)
  (define (return value)
    (if (null? (cdr block))
        (k-last value)
        (k-more value (cdr block))))

  (let ((token (car block)))
    (cond ((quoted? token) (return (unquote token)))
          ((assignment? token) (eval-assignment token (cdr block) env k-more k-last))
          ((block? token) (return (make-closure token env)))
          (else
            (let ((value (lookup token env)))
              (if (or (fquoted? token)
                      (not (procedure? value)))
                  (return value)
                  (eval-arguments value (cdr block) env k-more k-last)))))))

(define (eval-arguments function block env k-more k-last)
  (eval-arguments-1 function block env (arity function) '() k-more k-last))

(define (eval-arguments-1 function block env arity arglist k-more k-last)
  (cond ((zero? arity) (apply function (reverse arglist)
                              (if (empty-block? block)
                                  k-last
                                  k-more)))
        ((empty-block? block) (error "Too few arguments."))
	;;; fexpr processing would go here
        (else
          (eval-1 block env
                  (lambda (last-arg)
		    (eval-arguments-1 function NIL NIL
                                      (1- arity) (cons last-arg arglist)
                                      NIL k-last))
                  (lambda (arg more)
                    (eval-arguments-1 function more env
                                      (1- arity) (cons arg arglist)
                                      k-more
                                      k-last))))))

Again, note the line with asterisks. This is the problematic continuation. It is trivial in that it simply passes its argument back to its caller, so it could be written as:

(define (eval-sequence block env k)
  (if (empty-block? block)
      block
      (eval-1 block env
        (lambda (value more)
          (eval-sequence more env k))
	k ; eta-reduction of (lambda (value) (k value))
       )))

This version of the interpreter will be properly tail recursive.

There is a surprise here. Notice that EVAL-1 (what would correspond to EVAL in a Lisp or Scheme interpreter) takes two continuations rather than the usual one. One continuation is invoked in the case where `more' of the sequence needs to be processed, the other in the case that the sequence is `all used up'. Furthemore, note that APPLY does not need an extra continuation.

-----

Implementation in C

It is often the case when you are implementing a language in terms of another that you wish to find a mapping between the host language's continuations and the target language's continuations. This allows you to use the function call and return mechanism in the host language (C) to implement the same in the target language (REBOL). Having two continuations for EVAL poses a problem. In C, there is a single continuation for each function. (Never mind the fact that C does not eta-reduce its continuations, i.e. it isn't tail recursive anyway).

It is here that Henry Baker's Cheney on the MTA trick comes in real handy. With `Cheney on the MTA', we write our C code in CPS and never return from a C function. This creates the mapping between the function calls of the host language and GOTOs (which will become calls, tail-calls, and returns) in the target language.

void eval_sequence (BLOCK b, int offset, ENVIRONMENT env, CONTINUATION_1 k)
{
    struct continuation_2 kmore = malloc (sizeof (struct continuation_2));
    kmore->function = & eval_sequence_continue;
    kmore->env = env;
    kmore->parent = k;

    eval_1 (b, offset, env, &kmore, k);
}

void eval_sequence_continue (CONTINUATION_2 self, VALUE val, BLOCK b, int offset)
{
    eval_sequence (b, offset, self->env, self->parent);
}

void eval_1 (BLOCK b, int offset, ENVIRONMENT env, CONTINUATION_2 k_more, CONTINUATION_1 k_last)
{
   OBJECT token = fetch_token (b, offset++);

   if (token_is (token, QUOTED)) {
       if (offset == length (block))
           invoke_continuation_1 (k_last, unquote (token));
           invoke_continuation_2 (k_more, unquote (token), b, offset);
       }
    ....

    else
        eval_arguments (value, b, offset, env, k_more, k_last);
}

This code never returns a value, but proceeds by explicitly creating continuation objects and passing them along to the next procedure. So what happens when we blow off the top of the stack? Well, there is nothing on the stack of interest other than the topmost frame. So we'll check for stack overflow in eval_1, and if we are nearing the end, we'll squirrel away the arguments to eval_1 in some globals, longjmp to clear the stack, and call eval_1 with the saved arguments.

This code CONSes continuations like they are going out of style. Every time we evaluate an argument, we create two continuations. Every time we return a value, we discard one continuation. To avoid interlocking with the GC for every argument evaluation, we can allocate our continuations on the stack. Essentially, we use the machine's `stack pointer' as the `heap frontier' pointer.

void eval_sequence (BLOCK b, int offset, ENVIRONMENT env, CONTINUATION_1 k)
{
    struct continuation_2 kmore;
    kmore.function = & eval_sequence_continue;
    kmore.env = env;
    kmore.parent = k;

    eval_1 (b, offset, env, &kmore, k);
}

This solves the consing problem, but now when we blow off the top of the stack, we have to evacuate the current continuation chain to the heap. We could tag every continuation on the stack so that a stack evacuator can parse the continuation chain, but there is a niftier way:

In each continuation, we place a slot that holds a pointer to a function that knows how to evacuate that particular kind of continuation. When we approach the end of the stack, we jump to the evacuation function of the topmost continuation. It will evacuate itself from the stack and recurse on its parent, etc. etc. When all continuations are off the stack, we can then do a longjmp and restart the eval.

void eval_sequence (BLOCK b, int offset, ENVIRONMENT env, CONTINUATION_1 k)
{
    struct continuation_2 kmore;
    kmore.evacuate = & eval_sequence_continue_evacuate;
    kmore.function = & eval_sequence_continue;
    kmore.env = env;
    kmore.parent = k;

    eval_1 (b, offset, env, &kmore, k);
}

-----

Issues of implementation

PROS

This is *very* portable. The code ran identically on Windows, Unix, Mac and Amiga.

CONS

A big drawback of this implementation technique can be seen in the object code. Every function call has a `preamble' that creates a `stack frame' for the function, and the return address will be saved at the call point. However, neither the return address nor the value of the previous `stack frame' pointer is ever used. The C language is creating and maintaining continuations for its own use, but we are intentionally not using them.
Another drawback is that since we only grow the stack, we may be violating assumptions that the hardware makes about stack use. If the stack buffer is large, though, this may not be a problem.
This still conses quite a bit.
Performance is so-so.
I think there is a way to show that in each continuation pair one continuation strictly contains the other. If this is the case, then there is no need to cons both, and the code could simply compute an offset to the correct continuation. I did not investigate this.
The layout of the continuations echoes the layout of the stack frames. It would be worthwhile to avoid copying data out of the arguments and into the continuation structs if possible.
You need a degree in Comp. Sci. from one of a handful of places to understand this. This turned out to be one of the biggest hurdles. No one else in the company could extend the interpreter. Nonetheless, I don't see an easy way of achieving tail-recursion otherwise.