What's so Taxing about Return?

Some responses to my blog on Neal Gafter's closures talk expressed concern about the handling of the return statement in BGGA closures. Since I am done with my tax return, I am blogging about the intricacies of the return statement inside closures.

The Dual Role of return

Many programming languages get along just fine without a return statement. In Pascal, for example, the return value is assigned to a dummy variable whose name equals the name of the function.

function foo(arg: integer): integer;
begin
   (* compute return value *)
   foo := retval;
   (* maybe some more cleanup *)
end;

In Scheme, the body of a function is a sequence of expressions, and the value of the last expression is returned as the result of the function call:

(define foo (lambda (arg) expr1 expr2 .... retvalExpr))

And, of course, in assembly, the return value is simply deposited in a register :-)

movl retval, %eax

In Pascal and Scheme, you return to the caller when you reach the end of the function. There is no equivalent to the “quick and dirty”

if (somethingAbnormal) return null;
// more work in the normal case...

This shows that there are really two aspects of returning: yielding a value, and jumping to the end of the function.
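To see the two roles side by side, here is a small Java sketch. The class and method names are made up for illustration. The first method uses return both to yield a value and to bail out early; the second mimics the Pascal style, where a result variable is assigned and control simply reaches the end.

```java
class ReturnStyles {
    // Java style: return yields the value and leaves the method at once.
    static String describeEarly(Integer n) {
        if (n == null) return null; // abnormal case: bail out immediately
        return n % 2 == 0 ? "even" : "odd";
    }

    // Pascal style: assign to a result variable and fall through to the end.
    static String describeSingleExit(Integer n) {
        String result = null;
        if (n != null) {
            result = n % 2 == 0 ? "even" : "odd";
        }
        return result; // the only exit point
    }
}
```

Both methods compute the same thing; only the first relies on the jump that return performs.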

Closures

A closure is a block of code, packaged up for execution at a later point. When it executes, all the references to the surrounding code should just work as if the code had executed in the defining scope.

Here is a typical example, using the BGGA 0.5 syntax (which, like all proposal syntax, is highly subject to change):

public static void main(String[] args) {
    JFrame frame = new JFrame();
    JButton button = new JButton("Click me!");
    frame.add(button);
    int counter = 0;
    button.addActionListener({ ActionEvent e => 
        counter++; 
        frame.setTitle("Clicked " + counter + " times."); });
    frame.pack();
    frame.setVisible(true);
}

When the button is clicked, the closure gets called, the counter variable in the enclosing scope is updated, and the frame title is set to reflect the click count.

Wait, there is a problem here. By the time the button is clicked, main has terminated and the local variable counter is dead and gone.

Actually, though, the closure will capture a reference to a new int[1] containing the counter, and of course, a reference to the JFrame object.

All this can be done with any of the various closure proposals, by gussying up inner classes with the ability to capture non-final locals.
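The int[1] trick can be written by hand with an ordinary inner class. The following is only a sketch of the idea, not the code that any closure proposal would generate. To keep it runnable without a GUI, a Runnable stands in for the action listener and a StringBuilder stands in for the frame title; both substitutions, and the names CounterCapture and makeClickHandler, are my own assumptions.

```java
class CounterCapture {
    // Returns a handler that outlives this method call.
    static Runnable makeClickHandler(final StringBuilder title) {
        // The "local variable" lives in a heap-allocated one-element array,
        // so it is still there after makeClickHandler has returned.
        final int[] counter = new int[1];
        return new Runnable() {
            @Override
            public void run() {
                counter[0]++;
                title.setLength(0);
                title.append("Clicked ").append(counter[0]).append(" times.");
            }
        };
    }

    public static void main(String[] args) {
        StringBuilder title = new StringBuilder();
        Runnable handler = makeClickHandler(title);
        handler.run();
        handler.run();
        System.out.println(title); // prints "Clicked 2 times."
    }
}
```

The array reference itself is final, which satisfies the inner class rules, while its single element remains mutable.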

Unlike some other closures proposals, BGGA goes further and says that closures also need to capture the meaning of the execution transfer statements, i.e. break, continue, and return. (What about throw? Its target is always determined dynamically, never lexically, so there is nothing to capture.)

At first glance, this seems like an odd thing to do. When the action listener executes a break statement, surely we don't want to go back in time and revive the main method (at least not until the proposal to add continuations to Java :-))

But it comes in handy for another use of closures: programmer-provided control statements. Let's say I want to provide an easy way of iterating over a matrix:

for each(int i, int j: matrix) { // look, ma, no matrix[i].length!
    if (matrix[i][j] == 0)
        continue;
    . . .
}

This actually means: Pass matrix and the closure

{ int i, int j =>
    if (matrix[i][j] == 0)
        continue;
    . . .
}

to the each method:

public static void each(int[][] a, { int, int => void } block) {
    int i = 0;
    int j = 0;
    // Wrap j back to 0 after the last column of row i, then advance i
    for (; i < a.length; j = (j + 1 == a[i].length ? 0 : j + 1), i = (i + ((j == 0) ? 1 : 0))) {
        block.invoke(i, j);
    }     
}

Of course, now continue should mean the right thing: continue after the block, with the next iteration of the for statement. (Sorry about the tortured logic in the for update; each update must be an assignment, increment, method call, or new expression. I suppose I could use a closure invocation { => if (j + 1 < a[i].length) j++; else { i++; j = 0; }}.invoke()...)
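For comparison, here is a rough sketch of the same idea in plain Java without closures, using an anonymous inner class. IntPairBlock is a hypothetical interface standing in for the function type { int, int => void }, and countNonZero is an invented caller. In this version the block cannot say continue, since that would be a compile-time error inside invoke; the closest it can get is to return from invoke, which ends only that one invocation.

```java
class MatrixEach {
    // Hypothetical stand-in for the BGGA function type { int, int => void }.
    interface IntPairBlock {
        void invoke(int i, int j);
    }

    // Calls the block once for every valid index pair of the matrix.
    static void each(int[][] a, IntPairBlock block) {
        for (int i = 0; i < a.length; i++) {
            for (int j = 0; j < a[i].length; j++) {
                block.invoke(i, j);
            }
        }
    }

    // Example caller: counts the nonzero entries.
    static int countNonZero(final int[][] matrix) {
        final int[] count = new int[1];
        each(matrix, new IntPairBlock() {
            @Override
            public void invoke(int i, int j) {
                if (matrix[i][j] == 0)
                    return; // plays the role of continue: skips to the next pair
                count[0]++;
            }
        });
        return count[0];
    }
}
```

Using return to mean "continue" is exactly the kind of shift in meaning that BGGA tries to avoid by capturing the transfer statements.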

The Point of No Return?

Back to the topic of returns. A closure returns a value (if it has a result type). For example,

{ int x, int y => Math.max(x, y) }

returns an integer, the max of its parameters. But if a closure contains a return statement, that means to return from the enclosing block. For example,

{ int x, int y => return Math.max(x, y); }

is a closure with return type void that, when invoked, causes its caller to return the max of the parameters (or, presumably, if the caller can't return an int, throw an exception).
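To make the idea of a return that escapes the closure more concrete, here is a sketch of how one might emulate it by hand in plain Java: the block throws a private exception carrying the value, and the enclosing method catches it and returns. This is purely an illustration under my own assumptions. The names EarlyReturn, IntPairBlock, and firstNegative are hypothetical, and I am not claiming this is how BGGA specifies or implements the feature.

```java
class NonLocalReturnSketch {
    // Hypothetical stand-in for the function type { int, int => void }.
    interface IntPairBlock {
        void invoke(int i, int j);
    }

    // Hypothetical carrier for the value being "returned" out of the block.
    static final class EarlyReturn extends RuntimeException {
        final int value;

        EarlyReturn(int value) {
            super(null, null, false, false); // no message, cause, or stack trace
            this.value = value;
        }
    }

    static void each(int[][] a, IntPairBlock block) {
        for (int i = 0; i < a.length; i++) {
            for (int j = 0; j < a[i].length; j++) {
                block.invoke(i, j);
            }
        }
    }

    // Returns the first negative entry in row-major order, or 0 if there is none.
    static int firstNegative(final int[][] matrix) {
        try {
            each(matrix, new IntPairBlock() {
                @Override
                public void invoke(int i, int j) {
                    if (matrix[i][j] < 0)
                        throw new EarlyReturn(matrix[i][j]); // acts as "return from firstNegative"
                }
            });
        } catch (EarlyReturn e) {
            return e.value;
        }
        return 0;
    }
}
```

The emulation only works while firstNegative is still on the call stack. If the block were stored and invoked later, there would be no enclosing method left to return from, which is the situation the callback use case runs into.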

Several commenters on my earlier blog point to this issue as the Achilles heel of the BGGA proposal.

More unhappily, the closure

{ int x, int y => Math.max(x, y); }

computes the max of its parameters, discards it, and returns no value.

When I heard about that, my gut reaction was fear...the fear of students queuing up for my office hours.

Ultimately, the culprit is the dual nature of return: yielding a value, and jumping to the end of the method code. In Pascal or Scheme, none of this is an issue. These languages have no return (or break or continue) to worry about.

Many Happy Returns

Let's try to throw some syntax at this. A BGGA closure body is a sequence of statements followed by an optional expression. Maybe that's too subtle. Let's make the return expression more prominent. Something like

{ int x, int y => int : /* 0 or more statements */ => Math.max(x, y) }

I already see the line outside my office getting shorter. (I suppose it would also allow early returns from a closure, but I don't want to go there...That's what got us in trouble in the first place.)

But I have to agree with Stephen Colebourne that there are two entirely separate use cases here.

When one uses a closure for a control abstraction, the return type must be void since the closure denotes a statement. And it is pretty clear that return means to return from the enclosing scope.

When one uses a closure as a callback, to be invoked at a much later time, does one ever want to capture the enclosing semantics of return? I don't think so, but I will find out soon enough if I am wrong...

It would make sense to differentiate these use cases.

In a control abstraction, the programmer doesn't write an explicit closure; the compiler puts one together from a parameter list and a block. Conversely, when passing a closure as a callback, the programmer does the { ... => ... } thing. So, we can tell them apart.

In the first case, the block can contain return, break, continue, labeled break, etc. Pile it on!

In the second case, none of them should be allowed. It's just a syntax error. That should take care of the line outside my office. Students can wrestle with the compiler—what gets them is code that compiles and does the wrong thing.

This is almost the same as the RestrictedFunction interface, except that you can capture non-final locals. It is also somewhat related to the distinction in BGGA control abstractions. The for control abstractions allow break and continue, whereas other control abstractions don't.

If I understand the FCM/JCA proposal correctly, they have essentially the same solution. But the meaning of return changes from one use case to the other. I am not sure that's such a good idea. But again, it's just syntax.

I am a total amateur at this, of course, as I and the world are sure to find out from the blog comments in a few hours. But it seems to me that after the tweaks that are sure to come, BGGA will differ very little from FCM/JCA, except for the issue of method literals. (These may be nice to have for other purposes. I'll warm up to them if someone can show me how they solve my pet peeve: property boilerplate.)