In Praise of Language Specs

I give an example of why having a language spec builds confidence in a situation that would induce fear and trembling in a seat-of-the-pants programming language. I boldly generalize to posit that it is good to have multiple implementations of a spec, and that there should be more than one implementation of the Java platform.

When I see a language specification like this, I run for the hills. I am a firm believer in specifications and multiple implementations. Here's a case in point. I put together an example for implicit conversions in my upcoming “Scala for the Impatient” book.

object FractionConversions {
  implicit def int2Fraction(n: Int): Fraction = new Fraction(n, 1)
  implicit def fraction2Double(f: Fraction): Double = f.num * 1.0 / f.den
}

class Fraction(n: Int, d: Int) {
  val num: Int = if (d == 0) 1 else n / gcd(n, d)
  val den: Int = if (d == 0) 0 else d / gcd(n, d)
  private def gcd(a: Int, b: Int): Int = if (b == 0) a else gcd(b, a % b)
  override def toString = num + "/" + den
  def *(other: Fraction) = new Fraction(num * other.num, den * other.den)
  // other operators...
}

One always worries when there are too many conversions in and out of a particular type. If you translate the code above to C++, with two conversions

Fraction::Fraction(int n)
Fraction::operator double() const

you run into ambiguities that make the class essentially impossible to use.

Of course, in Scala, you can turn off unhelpful conversions by importing only the ones you want:

import FractionConversions.int2Fraction

or excluding the ones you don't want:

import FractionConversions.{fraction2Double => _, _} 
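To make the effect of the two import forms concrete, here is a minimal sketch. It assumes a Scala 2 compiler and uses a stripped-down Fraction (no gcd reduction); the wrapper objects ImportDemo, OnlyToFraction, AllButToDouble and Everything are hypothetical names added only for illustration.

```scala
import scala.language.implicitConversions

object ImportDemo {
  // Simplified stand-in for the Fraction class above (assumption: no reduction to lowest terms)
  class Fraction(val num: Int, val den: Int) {
    def *(other: Fraction): Fraction = new Fraction(num * other.num, den * other.den)
    override def toString = s"$num/$den"
  }

  object FractionConversions {
    implicit def int2Fraction(n: Int): Fraction = new Fraction(n, 1)
    implicit def fraction2Double(f: Fraction): Double = f.num * 1.0 / f.den
  }

  object OnlyToFraction {
    // Only Int => Fraction is visible in this scope
    import FractionConversions.int2Fraction
    val product: Fraction = new Fraction(3, 4) * 5
  }

  object AllButToDouble {
    // Everything except fraction2Double is visible in this scope
    import FractionConversions.{fraction2Double => _, _}
    val product: Fraction = new Fraction(3, 4) * 5
  }

  object Everything {
    // Both conversions are visible, so a Fraction can be used where a Double is expected
    import FractionConversions._
    val asDouble: Double = new Fraction(3, 4)
  }
}
```

In the first two scopes, writing val d: Double = new Fraction(3, 4) should be rejected, because fraction2Double is not visible there.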

I also found reassurance in the following quote from the Odersky/Spoon/Venners book:

An implicit conversion is only inserted if there is no other possible conversion to insert. If the compiler has two options to fix x * y, say using either convert1(x) * y or convert2(x) * y, then it will report an error and refuse to choose between them. It would be possible to define some kind of “best match” rule that prefers some conversions over others. However, such choices lead to really obscure code. Imagine the compiler chooses convert2, but you are new to the file and are only aware of convert1—you could spend a lot of time thinking a different conversion had been applied!

That's great. And it is technically, literally true. But it doesn't mean you won't ever spend a lot of time thinking which conversion has been applied. Look at this:

val f = new Fraction(3, 4)
f * 5

What is it?

f * int2Fraction(5)

or

fraction2Double(f) * 5

It could be either, right? So it's ambiguous. So it won't compile, right? Except it does.

The compiler sees a * method applied to a Fraction. The parameter type is wrong, but it can be patched up, so that's what it does: f * int2Fraction(5)

Now look at the opposite:

5 * f

The compiler sees a * method applied to an Int. The parameter type is wrong, but it can be patched up, so that's what it does: 5 * fraction2Double(f).
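The asymmetry is easy to check by pinning down the static types. The following is a sketch under the assumption of a Scala 2 compiler; the wrapper object ConversionDemo and the names left and right are hypothetical, and the conversions are placed directly in the enclosing object so that both are in scope.

```scala
import scala.language.implicitConversions

object ConversionDemo {
  class Fraction(n: Int, d: Int) {
    private def gcd(a: Int, b: Int): Int = if (b == 0) a else gcd(b, a % b)
    val num: Int = if (d == 0) 1 else n / gcd(n, d)
    val den: Int = if (d == 0) 0 else d / gcd(n, d)
    override def toString = s"$num/$den"
    def *(other: Fraction): Fraction = new Fraction(num * other.num, den * other.den)
  }

  implicit def int2Fraction(n: Int): Fraction = new Fraction(n, 1)
  implicit def fraction2Double(f: Fraction): Double = f.num * 1.0 / f.den

  val f = new Fraction(3, 4)

  // Fraction has a * method, so the argument is patched up: f * int2Fraction(5)
  val left: Fraction = f * 5

  // Int has a * method taking a Double, so the argument is patched up: 5 * fraction2Double(f)
  val right: Double = 5 * f
}
```

The type annotations on left and right are the point of the exercise: swapping them should make the code fail to compile.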

How can I make it ambiguous? Like this, surely:

def mul(a: Double, b: Double) = a * b
def mul(a: Fraction, b: Fraction) = a * b

Nope. mul(f, 5) yields a Fraction(15, 4) without a murmur. Huh? Aren't there two possible conversions? mul(fraction2Double(f), 5.toDouble) and mul(f, int2Fraction(5)). Shouldn't it be ambiguous? I had a hard time reading the spec (the infamous section 6.26), so I started an email thread. People had various conflicting theories. None of them were able to account for the fact that the seemingly identical

def mul(a: Double, b: Float) = a * b
def mul(a: Fraction, b: Fraction) = a * b

is ambiguous.

Daniel Sobral figured it out, not by reasoning from experience or common sense, but by reading the spec. When choosing among overloaded methods, Scala prefers the most specific.

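Why is mul(Fraction, Fraction) more specific than mul(Double, Double)? Consider a call mul(0.5, 0.5). You can't use mul(Fraction, Fraction): there is no Double → Fraction conversion. But with mul(f, f), either one will work, since fraction2Double can convert both arguments. So mul(Double, Double) is more general: it works in strictly more cases. mul(Fraction, Fraction) is more specific, and more specific is considered better. With mul(Double, Float), the argument breaks down: mul(f, f) can't use it, because there is no Fraction → Float conversion. Neither method is more specific than the other, and the call is ambiguous.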
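Here is the overloading experiment as a compilable sketch. It assumes a Scala 2 compiler and a simplified Fraction without gcd reduction; the wrapper object OverloadDemo and the names viaFraction and viaDouble are hypothetical.

```scala
import scala.language.implicitConversions

object OverloadDemo {
  // Simplified stand-in for Fraction (assumption: no reduction to lowest terms)
  class Fraction(val num: Int, val den: Int) {
    def *(other: Fraction): Fraction = new Fraction(num * other.num, den * other.den)
    override def toString = s"$num/$den"
  }

  implicit def int2Fraction(n: Int): Fraction = new Fraction(n, 1)
  implicit def fraction2Double(f: Fraction): Double = f.num * 1.0 / f.den

  def mul(a: Double, b: Double): Double = a * b
  def mul(a: Fraction, b: Fraction): Fraction = a * b

  val f = new Fraction(3, 4)

  // Both overloads are applicable; mul(Fraction, Fraction) is chosen as the more specific one
  val viaFraction = mul(f, 5)

  // Only mul(Double, Double) is applicable: there is no Double => Fraction conversion
  val viaDouble = mul(0.5, 0.5)

  // Changing the first overload to mul(a: Double, b: Float) should make mul(f, 5)
  // ambiguous, as described in the text.
}
```

No result type is given for viaFraction on purpose. An expected type of Fraction would rule out mul(Double, Double) by itself, and the specificity rule would never get exercised.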

That's perhaps more intuitive in the case of inheritance. You'd want to prefer a fun(Person) over a fun(Object) when the parameter is a Person or Student. Specific is better.
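The inheritance case looks like this in Scala, with AnyRef playing the role of Java's Object. This is only a sketch; Person, Student and fun follow the names in the paragraph above, and the wrapper object SpecificityDemo is a hypothetical name.

```scala
object SpecificityDemo {
  class Person(val name: String)
  class Student(name: String) extends Person(name)

  def fun(x: AnyRef): String = "fun(AnyRef)"
  def fun(p: Person): String = "fun(Person)"

  // Both overloads accept a Student; fun(Person) is more specific and wins
  val forStudent = fun(new Student("Fred"))

  // A String is not a Person, so only fun(AnyRef) is applicable
  val forString = fun("Fred")
}
```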

If you think this is yet another proof that Scala is more complex than Java, look up the corresponding rules in the Java Language Specification and weep. The Java spec is strictly more complex in this regard.

What's the point? In a language that isn't formally specified, these rules can and do change on a whim, as implementors fine-tune the compiler to achieve this or that pretty effect. You have no recourse if your code breaks as a result.

In Java and Scala, there is a language specification, and the behavior isn't likely to change, except by a conscious effort. If something doesn't work according to spec, I can file a bug, and there is no discussion whether it is a bug or not.

Some time ago, there was a discussion on the Java Champions mailing list about whether there was any value in having multiple implementations of the JDK. Some people thought it was fine that the open-source implementors had only one choice: take OpenJDK and tweak it. Me, not so much. I am a huge fan of multiple implementations. Having them puts pressure on the spec authors to separate essential and ephemeral complexity, and, of course, it contributes to specs that are comprehensible and implementable. Nobody would dream of having just one implementation of HTML or C++, so why be satisfied with one implementation of the Java platform?