Lessons from Advent of Code - Part 3: In Praise of Records


Records were introduced in Java 16, and to show it can be done, there is a UnixDomainPrincipal record in the JDK. But that's it for the public part of the JDK. Elsewhere, I have not seen a profusion of record types either. Yet, I found records immensely useful in smaller programs, and I wonder if they are under-appreciated.

Right now, I am thinking of records every day because I am solving the 2024 Advent of Code problems in Java. It seems that most participants use Python, but I enjoy working the problems in Java: good autocompletion in the IDE, and a powerful debugger at your fingertips when you need it. Also, with JEP 495, Java isn't as verbose as it used to be. And I use records all the time.

Moreover, this year I decided to use Java for smaller programming tasks, whenever a shell script became unwieldy. Again, records were my friends surprisingly often.

Why don't records get more love? Quite possibly, when Java programmers see

record UnixDomainPrincipal(UserPrincipal user, GroupPrincipal group) {}
record Point(int x, int y) {}

they think, meh, pairs or tuples. But there is real value.

Standard Methods for Free

With a record, you get a bunch of methods for free: toString, equals, hashCode, and an accessor for each component.

What's the use? With smaller programs, you are more likely to debug simple problems with print statements, so it is nice that toString does the sensible thing without you having to provide it. And of course, in larger programs, you may still need to log the occasional object, so it's nice not to see com.acme.MyClass@4fca772d.

Having equals properly defined means that it is easy to find instances. For example, in Day 6, you had to decide whether a walk in a maze contains a cycle. For that, you have to save your previous locations, as well as the directions you were facing. If you get back to the exact same ones, you are in a cycle.

Simply define

record Position(int row, int col, int angle) {}

and keep a List<Position>. Then you can call previousPositions.contains(currentPosition) and it just works, because List<T>.contains uses T.equals for comparison, and Position.equals is defined properly.

Or better, use a HashSet<Position>. Since Position.hashCode is also properly defined, a hash set just works. Or a HashMap<Position, V> if that's what you need for your problem.
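Here is a minimal sketch of what the free methods buy you. The class name RecordBasics and the helper hasCycle are made up for illustration; this is not the actual Day 6 solution.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class RecordBasics {
    // Hypothetical helper: true if the walk revisits a position with the same heading.
    static boolean hasCycle(Iterable<Position> walk) {
        Set<Position> seen = new HashSet<>();
        for (Position p : walk) {
            // add returns false when an equal element is already in the set
            if (!seen.add(p)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        Position a = new Position(3, 4, 90);
        Position b = new Position(3, 4, 90);
        System.out.println(a);                             // Position[row=3, col=4, angle=90]
        System.out.println(a.equals(b));                   // true
        System.out.println(a.hashCode() == b.hashCode());  // true
        System.out.println(hasCycle(List.of(a, new Position(3, 5, 90), b))); // true
    }
}

record Position(int row, int col, int angle) {}
```

Nothing in the record body had to be written by hand to make the set lookup work.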

Finally, the component accessors come in handy as method references, for example in stream pipelines.

In Day 7, you had to figure out whether certain equations had solutions. An equation has a result and a list of operands, such as

292: 11 6 16 20

Can you solve it with the + and * operators? It turns out you can: 11 + 6 * 16 + 20. The operators go left to right, not with the usual “multiplication wins over addition” rule, i.e. ((11 + 6) * 16) + 20 is 292.

The problem input is a sequence of such equations, so it makes sense to define

record Equation(long result, List<Long> operands) {}

Then one can save them all in a

List<Equation> equations;

That's better than two parallel lists:

List<Long> results;
List<List<Long>> operands;

This may seem like a weak reason to introduce a new type, but bear with me.

The problem asks to find all results that can be obtained by combining the operands with some operators. Or rather, the sum of all of the solvable results, because that's how Advent of Code works. You can only report a single value as proof of accomplishing the task.

Here is the computation. Note the method reference:

long sum = equations.stream().filter(...).mapToLong(Equation::result).sum();

As you can see, those automatically generated methods are nice to have. Better than public fields.

A Home for Methods

It never ceases to amaze me how, once I group related data into a record, that record type becomes a natural place to add methods.

I always think, whatever, it's just some data put together, and immutable, how interesting can it be? But the behavior usually emerges.

Consider those Equation instances in our list. We need to determine which of them have a solution.

That's naturally a method of Equation:

record Equation(long result, List<Long> operands) {
    boolean hasSolution(List<LongBinaryOperator> operators) { ... }
    ...
}

There is a fairly straightforward recursive way of deciding whether there is a solution. See https://github.com/cayhorstmann/adventofcode2024/blob/main/Day7.java

Have a look at the overall organization of that file.

Using a record drives you to a coherent code organization. Behavior is placed with the record, and you use it abstractly elsewhere.
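To make this concrete, here is one possible sketch of such a record, together with the stream pipeline that uses it. It is not the code from the linked file. The private helper search, the class EquationDemo, and the second and third equations are assumptions chosen for illustration.

```java
import java.util.List;
import java.util.function.LongBinaryOperator;

class EquationDemo {
    public static void main(String[] args) {
        List<LongBinaryOperator> ops = List.of(Long::sum, (a, b) -> a * b);
        List<Equation> equations = List.of(
            new Equation(292, List.of(11L, 6L, 16L, 20L)), // ((11 + 6) * 16) + 20
            new Equation(190, List.of(10L, 19L)),          // 10 * 19
            new Equation(83, List.of(17L, 5L)));           // neither 22 nor 85
        long sum = equations.stream()
            .filter(e -> e.hasSolution(ops))
            .mapToLong(Equation::result)
            .sum();
        System.out.println(sum); // 482
    }
}

record Equation(long result, List<Long> operands) {
    boolean hasSolution(List<LongBinaryOperator> operators) {
        return !operands.isEmpty() && search(operands.get(0), 1, operators);
    }

    // Hypothetical helper: tries every operator at position index,
    // carrying along the value computed left to right so far.
    private boolean search(long soFar, int index, List<LongBinaryOperator> operators) {
        if (index == operands.size()) return soFar == result;
        for (LongBinaryOperator op : operators) {
            long next = op.applyAsLong(soFar, operands.get(index));
            if (search(next, index + 1, operators)) return true;
        }
        return false;
    }
}
```

The caller never looks inside an equation. It only asks whether there is a solution and reads the result.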

What about mutability? In this example, you might not feel an urge to mutate the data structure. But consider a more basic example. With the maze traversal, I use a CharGrid class that itself uses a record:

record Location(int row, int col) {}

For maze traversals, you want to move along in a given direction. The traditional object-oriented approach would be a method

void moveBy(int angle)

that updates the location. But of course, records are immutable, requiring a slightly different method

Location moved(int angle)

That method yields a new Location object instead of mutating the current one.

It might seem inefficient to produce new objects instead of mutating existing ones, but the JVM is pretty good at optimizing the performance of locally used objects. Once we have value classes, it will be even better.
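A sketch of the immutable variant might look like this. I am assuming here that the angle is given in degrees clockwise from north, in multiples of 90, and that rows grow downward. Neither assumption comes from the CharGrid class itself.

```java
class LocationDemo {
    public static void main(String[] args) {
        Location start = new Location(2, 2);
        Location next = start.moved(90);  // one step east
        System.out.println(start);        // Location[row=2, col=2], unchanged
        System.out.println(next);         // Location[row=2, col=3]
    }
}

record Location(int row, int col) {
    // Assumption: angle is in degrees clockwise from north, a multiple of 90.
    // Returns a new Location; this one is never modified.
    Location moved(int angle) {
        return switch (Math.floorMod(angle, 360)) {
            case 0 -> new Location(row - 1, col);
            case 90 -> new Location(row, col + 1);
            case 180 -> new Location(row + 1, col);
            case 270 -> new Location(row, col - 1);
            default -> throw new IllegalArgumentException("Unsupported angle: " + angle);
        };
    }
}
```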

Enums Are Cool Too


Actually, in the preceding example of moving in a grid, you don't move along an arbitrary angle. You only move in the eight compass directions N, NE, E, SE, S, SW, W, NW.

In Python, you would probably represent them ad-hoc as 0, 45, 90, 135, 180, ..., or 0, 1, 2, 3, ...

But in Java, it is a cheap thrill to declare

enum Direction { N, NE, E, SE, S, SW, W, NW }

The point is again that this is (a) more readable and (b) a natural place to attach methods. For example:

public Direction turn(int eightsClockwise) {
    return Direction.values()[Math.floorMod(ordinal() + eightsClockwise, 8)];
}

Here is how you use it:

Direction in = ...;
Direction out = in.turn(2);

This code yields a 90° turn, which the problem asks for if the moved location would run into an obstacle.
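Putting the pieces together, a self-contained sketch could look as follows. The row and column deltas, and the moved method that accepts a Direction, are my assumptions for illustration (rows grow downward).

```java
class DirectionDemo {
    public static void main(String[] args) {
        Direction in = Direction.N;
        Direction out = in.turn(2);
        System.out.println(out);                    // E
        System.out.println(Direction.NW.turn(2));   // NE, wraps around
        System.out.println(Direction.N.turn(-1));   // NW, counterclockwise
        System.out.println(new Location(2, 2).moved(out)); // Location[row=2, col=3]
    }
}

enum Direction {
    // Assumed (rowDelta, colDelta) pairs, with rows growing downward
    N(-1, 0), NE(-1, 1), E(0, 1), SE(1, 1), S(1, 0), SW(1, -1), W(0, -1), NW(-1, -1);

    private final int rowDelta;
    private final int colDelta;

    Direction(int rowDelta, int colDelta) {
        this.rowDelta = rowDelta;
        this.colDelta = colDelta;
    }

    int rowDelta() { return rowDelta; }
    int colDelta() { return colDelta; }

    // floorMod keeps the index in 0..7 even for negative turns
    public Direction turn(int eightsClockwise) {
        return values()[Math.floorMod(ordinal() + eightsClockwise, 8)];
    }
}

record Location(int row, int col) {
    Location moved(Direction d) {
        return new Location(row + d.rowDelta(), col + d.colDelta());
    }
}
```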

What About Data-Oriented Programming?

Brian Goetz and Nicolai Parlog have written about Data-Oriented Programming, which uses sealed hierarchies of immutable classes and pattern matching.

That is very effective for certain data types: immutable lists and binary trees, JSON, algebraic expressions, parse trees, successful vs. exceptional completion. See Brian's article for these examples. There are some characteristics that are not present in my situation:

When it works, it works great. However, I am not convinced it is a general strategy for modeling domain data. Take Nicolai's example:

sealed interface Item permits Book, Furniture, ElectronicItem {}

It seems pretty unlikely that shopping software would use such a model. I suspect that useful sealed hierarchies may not be all that common in business logic. Or in coding puzzles.

For Day 9, I tried a data-oriented model for blocks in a file system:

sealed interface Block {
    int length();
}

record FileBlock(int id, int length) implements Block {}
record FreeBlock(int length) implements Block  {}

It felt unnecessarily complex, and I preferred using a single record type:

record Block(int id, int start, int length, int free) {}
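For comparison, here is a sketch of how the sealed variant would be consumed with pattern matching. It needs Java 21 or later. The helper usedLength and the class BlockDemo are hypothetical, and the computation is not the one the puzzle asks for.

```java
import java.util.List;

class BlockDemo {
    // Hypothetical helper: total length occupied by files, ignoring free space.
    static int usedLength(List<Block> blocks) {
        int total = 0;
        for (Block b : blocks) {
            // The switch is exhaustive because Block is sealed, so no default is needed
            total += switch (b) {
                case FileBlock file -> file.length();
                case FreeBlock free -> 0;
            };
        }
        return total;
    }

    public static void main(String[] args) {
        List<Block> disk = List.of(new FileBlock(0, 2), new FreeBlock(3), new FileBlock(1, 4));
        System.out.println(usedLength(disk)); // 6
    }
}

sealed interface Block {
    int length();
}

record FileBlock(int id, int length) implements Block {}
record FreeBlock(int length) implements Block {}
```

Every operation that distinguishes files from free space needs such a switch, which is part of the overhead compared with a single record type.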

Conclusion

So, that's my pitch for records and enumerations. You get two things:

Use them to make your code easier to read and maintain!

Frankly, when I compare Advent of Code solutions in Python and Java, the big difference is readability. In Python, data structures are usually modeled with ad-hoc lists, tuples, or dictionaries (i.e. maps). And ad-hoc constants instead of enums. It's concise, but not so easy to understand. There is real value in Java in giving names to those types. And in having a natural home for the operations on them.
