Checking Code Samples


In this blog post, I describe the process that I use to extract sample programs from my latest book to make sure that the code works. Read on if you want to create an automated workflow for checking code that you describe in your blogs, slides, and so on.

Authors—How do you handle code samples?

Ben Evans asked on the Java Champions mailing list how professional authors handle code samples. How do you get the code from the repo into your article or book? How do you handle editing for brevity—shorter names, dropped modifiers/annotations (final, @Override), elided code?

Respondents chimed in about putting the code in a repo for readers to clone, pulling in code snippets with Asciidoctor, unit testing, and continuous integration. Only one person admitted to copying code manually into a Word document.

I felt thoroughly inadequate. I have written programming books for almost 30 years. I never came up with a process for ensuring that every snippet of code in my books compiles without error.

For a long time, I had an excuse. The books were produced with automation-hostile technology (Word, FrameMaker, InDesign). But nowadays, all my books are in XHTML, so it is time to step up my game.

There is another problem—scale. My latest book, Modern JavaScript for the Impatient, has over 200 sections, each deserving of a sample program, and well over 1000 code snippets. And that is one of my shorter books.

I had already written the book when I read this message thread, so I decided to go the unconventional route—from the book to runnable code.

Code to Book

Suppose you have a bunch of source code that compiles and passes tests. Copying and pasting is fine for a blog, but it doesn't scale to a book. You need to automate the process. For example, in AsciiDoc, you can tag source regions with comments and pull them into your document. The Asciidoctor documentation has instructions.
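To make the tagging idea concrete, here is a minimal sketch of pulling a tagged region out of a source file. The `// tag::name[]` and `// end::name[]` comment markers follow the AsciiDoc convention; the function name `extractTaggedRegion` and the sample source are made up for illustration, and a real toolchain would let Asciidoctor do this work.

```javascript
// Hypothetical helper: returns the lines between the
// `// tag::name[]` and `// end::name[]` comment markers.
function extractTaggedRegion(source, tag) {
  const lines = source.split('\n')
  const start = lines.findIndex(line => line.trim() === `// tag::${tag}[]`)
  const end = lines.findIndex(line => line.trim() === `// end::${tag}[]`)
  if (start < 0 || end < start) throw new Error(`Tag ${tag} not found`)
  return lines.slice(start + 1, end).join('\n')
}

// Made-up sample source with one tagged region
const source = [
  '// Setup code that the book does not show',
  '// tag::pi[]',
  'const PI = 3.141592653589793',
  '// end::pi[]',
  'console.log(PI)'
].join('\n')

console.log(extractTaggedRegion(source, 'pi'))
```

Note that the extracted region is plain text. Any highlighting or font changes would have to be layered on afterwards.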

I have a problem with that approach. In a book, computer code isn't (or shouldn't be) a block of black or syntax-colored monospaced letters.

Sometimes I want to highlight a part of code:

const PI = 3.141592653589793;

Sometimes I want to gray out unimportant detail:

const PI = 3.141592653589793;

And consider comments:

const PI = 3.14; // You can't change this sucker even though you'd want to

Comments are text, so they should appear in a font that is suitable for reading text. You, dear reader, would not enjoy reading this blog post in its entirety in a monospaced font, so why accept it for comments?

Of course, if the comment contains computer code, that should be monospaced:

const PI = 3.14; // Don't do this—just use Math.PI

Another issue is eliding code. In a book, you might write:

class Duck {
  walk() { . . . }
  quack() { . . . }
}

In your testable code, there would be some actual code instead of the . . . ellipses.

It is possible to come up with a tagging system that injects such markup, but it would be tedious to use, and it is not intrinsically error-free. If you mess up those directives, you might be creating the wrong document. The hope is that reviewers would notice. But they might not.

Book to Code

With my JavaScript book, I went the other way around. It was easy to copy code snippets out of the book, dropping the visual adornments. I just used an HTML parser and fetched the text content. Had I written the book in Markdown or AsciiDoc, it might have been a bit harder to do that.
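As a rough illustration of "dropping the visual adornments", the sketch below reduces a marked-up snippet to its text content. It is a simplified stand-in for what an HTML parser provides, not the parser I used: the function name `textContent`, the sample markup, and the `sec1-1` ID are assumptions, and the regular expressions only cope with well-formed markup that has no comments, CDATA sections, or `>` characters inside attribute values.

```javascript
// Simplified stand-in for a parser's text content: strips tags and
// decodes the five predefined XML entities in a single pass.
function textContent(fragment) {
  const entities = {
    '&lt;': '<', '&gt;': '>', '&quot;': '"', '&apos;': "'", '&amp;': '&'
  }
  return fragment
    .replace(/<[^>]*>/g, '')
    .replace(/&(lt|gt|quot|apos|amp);/g, entity => entities[entity])
}

// Hypothetical book markup with highlighting and a comment in a text font
const snippet = '<pre id="sec1-1">const <b>PI</b> = 3.14 ' +
  '<span class="comment">// Don&apos;t do this</span></pre>'

console.log(textContent(snippet))
```

Looking up the element with a given ID is best left to a real parser.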

To make “Book to Code” work, you have to provide directives for assembling the code from book fragments and additional code snippets (for printing test messages, or for filling in the . . . ellipses). I chose to make a file for each sample program, with directives such as

//INS sec2-2
//INS sec2-3
//CAP Creating two employees
const harry = createEmployee('Harry Smith', 90000)
const sally = createEmployee('Sally Lopez', 100000)
//LOG harry, sally

My quick and dirty tool produces a file with the text contents of the elements with ID sec2-2 and sec2-3, followed by the given lines of code and instructions to show the contents of the variables:

const employeePrototype = {
  raiseSalary: function(percent) {
    this.salary *= 1 + percent / 100
  }
}
function createEmployee(name, salary) {
  const result = { name, salary }
  Object.setPrototypeOf(result, employeePrototype)
  return result
}
console.log('// Creating two employees')
const harry = createEmployee('Harry Smith', 90000)
const sally = createEmployee('Sally Lopez', 100000)
console.log('harry:', harry) // { name: 'Harry Smith', salary: 90000 }
console.log('sally:', sally) // { name: 'Sally Lopez', salary: 100000 }

The last two lines are the result of the directive LOG harry, sally.
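The expansion of the INS, CAP, and LOG directives can be sketched in a few lines. This is not my actual tool, just a minimal version under some assumptions: the snippets are already available in a `Map` keyed by element ID, the function name `assemble` is made up, and the generated calls use double-quoted strings because they come from `JSON.stringify`.

```javascript
// Expands //INS, //CAP, and //LOG directives; all other lines are copied as-is.
function assemble(directiveFile, snippets) {
  const out = []
  for (const line of directiveFile.split('\n')) {
    const match = line.match(/^\/\/(INS|CAP|LOG)\s+(.*)$/)
    if (!match) { out.push(line); continue }
    const directive = match[1]
    const arg = match[2].trim()
    if (directive === 'INS') {
      if (!snippets.has(arg)) throw new Error(`No snippet with ID ${arg}`)
      out.push(snippets.get(arg))
    } else if (directive === 'CAP') {
      out.push(`console.log(${JSON.stringify('// ' + arg)})`)
    } else { // LOG: one logging call per comma-separated variable name
      for (const name of arg.split(',').map(s => s.trim())) {
        out.push(`console.log(${JSON.stringify(name + ':')}, ${name})`)
      }
    }
  }
  return out.join('\n')
}

// Hypothetical snippet store standing in for the book's content
const snippets = new Map([
  ['sec2-2', [
    'const employeePrototype = {',
    '  raiseSalary: function(percent) {',
    '    this.salary *= 1 + percent / 100',
    '  }',
    '}'
  ].join('\n')],
  ['sec2-3', [
    'function createEmployee(name, salary) {',
    '  const result = { name, salary }',
    '  Object.setPrototypeOf(result, employeePrototype)',
    '  return result',
    '}'
  ].join('\n')]
])

const directives = [
  '//INS sec2-2',
  '//INS sec2-3',
  '//CAP Creating two employees',
  "const harry = createEmployee('Harry Smith', 90000)",
  "const sally = createEmployee('Sally Lopez', 100000)",
  '//LOG harry, sally'
].join('\n')

const program = assemble(directives, snippets)
console.log(program)
```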

Running that program yields the output:

// Creating two employees
harry:  { name: 'Harry Smith', salary: 90000 }
sally:  { name: 'Sally Lopez', salary: 100000 }

The CAP and LOG instructions add a bit of visual guidance to the output, which can otherwise look pretty tedious.

Also note that the generated sample program shows the actual output as comments. I find that this makes the program easier to read. My tool runs the program, captures the output, and then injects it into the final program version.
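Here is one way the "run, capture, inject" step might look, as a sketch rather than a description of my tool. It assumes that every logging call sits on its own line, starts with `console.log(`, and executes exactly once in order (so no loops or functions that log). The names `annotateWithOutput` and `recordingConsole` are made up. Values are converted with `String`, which is fine for numbers and strings; for objects, something like Node's `util.inspect` would give more readable output.

```javascript
// Runs the program with a recording console, then appends each logged value
// as a comment to the console.log line that produced it.
function annotateWithOutput(program) {
  const outputs = []
  const recordingConsole = {
    // The first argument is treated as a label; only the values are recorded
    log: (label, ...values) => outputs.push(values.map(String).join(' '))
  }
  new Function('console', program)(recordingConsole)
  let next = 0
  return program.split('\n').map(line => {
    if (!line.startsWith('console.log(')) return line
    const output = outputs[next++]
    return output ? `${line} // ${output}` : line
  }).join('\n')
}

const program = [
  "console.log('// Computing the answer')",
  'const result = 6 * 7',
  "console.log('result:', result)"
].join('\n')

console.log(annotateWithOutput(program))
```

The caption line has no values after its label, so it stays as it is, and the last line becomes `console.log('result:', result) // 42`.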

Unit Tests

One reviewer suggested that I produce unit test files for all program code, but I ended up not doing that. A complete beginner in a new language may not want to figure out how to install a unit test library before reading the first chapter. And unit tests can be verbose.

Instead, I made the LOG directive smarter. Optionally, one can add the expected output:

//LOG result // 42

If the program doesn't produce 42 when logging result, my tool generates an error, and I get busy fixing it.
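A check of this kind could be sketched as follows. The directive syntax is the one shown above; everything else (the function names, passing variables in a `scope` object, comparing string representations) is an assumption made for the example, and an expression that itself contains ` // ` would confuse the pattern.

```javascript
// Splits "//LOG expression // expected" into its parts; expected is optional.
function parseLogDirective(line) {
  const match = line.match(/^\/\/LOG\s+(.*?)(?:\s+\/\/\s*(.*))?$/)
  if (!match) throw new Error(`Not a LOG directive: ${line}`)
  return { expression: match[1], expected: match[2] }
}

// Evaluates the expression with the given variables and compares the
// string form of the result with the expected output, if there is one.
function checkLog(directive, scope) {
  const { expression, expected } = parseLogDirective(directive)
  const evaluate = new Function(...Object.keys(scope), `return (${expression})`)
  const actual = String(evaluate(...Object.values(scope)))
  if (expected !== undefined && actual !== expected) {
    throw new Error(`${expression}: expected ${expected}, got ${actual}`)
  }
  return actual
}

checkLog('//LOG result // 42', { result: 6 * 7 }) // passes silently
try {
  checkLog('//LOG result // 43', { result: 6 * 7 })
} catch (error) {
  console.log(error.message)
}
```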

I didn't even have to put in so many expected outputs. The book contains a good number of examples like this:

'42' < 5 // false: '42' is converted to the number 42
'' < 5 // true: '' is converted to the number 0
'Hello' <= 5 // false: 'Hello' is converted to NaN
5 <= 'Hello' // false: 'Hello' is converted to NaN

The directive

//LOG sec6-1

yields a call to print each expression, and the value is compared to the value after the comment. That seemed a pretty effective alternative to unit tests.
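To show the idea, here is a sketch that checks every line of such a snippet. It assumes that each line has the form "expression // value explanation" and that the expected value is the first word after the comment marker. The function name `checkSnippet` is hypothetical, and the sample lines are written with a colon after the value so that the value is easy to pick out.

```javascript
// Evaluates the expression on each line and compares the result with the
// first word of the trailing comment. Returns a list of mismatches.
function checkSnippet(snippet) {
  const failures = []
  for (const line of snippet.split('\n')) {
    const match = line.match(/^(.*?)\s*\/\/\s*(\w+)/)
    if (!match) continue // no comment, nothing to check
    const [, expression, expected] = match
    const actual = String(new Function(`return (${expression})`)())
    if (actual !== expected) {
      failures.push(`${expression}: expected ${expected}, got ${actual}`)
    }
  }
  return failures
}

const comparisons = [
  "'42' < 5 // false: '42' is converted to the number 42",
  "'' < 5 // true: '' is converted to the number 0",
  "'Hello' <= 5 // false: 'Hello' is converted to NaN",
  "5 <= 'Hello' // false: 'Hello' is converted to NaN"
].join('\n')

console.log(checkSnippet(comparisons)) // an empty array means all lines agree
```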

Intentionally Wrong Code

Every once in a while, the book has an example of code that doesn't work, such as the following JavaScript snippet:

x = a
(console.log(6 * 7))

The directive

//ERR sec2-5

produces code

try {
  x = a
  (console.log(6 * 7))
} catch (exception) {
  console.log('Error:', exception.message) // a is not a function
}

The try/catch block and the comment that shows the actual exception are added automatically.
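The wrapping itself is simple string assembly. The sketch below shows one possible version and then runs the result to see which message comes out. The function name `wrapInTryCatch`, the recording console, and the choice of a string as the value of `a` are assumptions for the demo; the exact wording of the error message depends on the JavaScript engine.

```javascript
// Indents the snippet and surrounds it with a try/catch that logs the message.
function wrapInTryCatch(snippet) {
  const indented = snippet.split('\n').map(line => '  ' + line).join('\n')
  return [
    'try {',
    indented,
    '} catch (exception) {',
    "  console.log('Error:', exception.message)",
    '}'
  ].join('\n')
}

const wrapped = wrapInTryCatch('x = a\n(console.log(6 * 7))')
console.log(wrapped)

// Run the wrapped code with a recording console and a value for a
// that is not a function.
const logged = []
const recordingConsole = { log: (...args) => logged.push(args.join(' ')) }
new Function('console', 'a', 'let x\n' + wrapped)(recordingConsole, 'not callable')
console.log(logged)
```

Because no semicolon is inserted after the first line, the snippet is parsed as `x = a(console.log(6 * 7))`. The argument is evaluated first, so 42 is logged before the call fails.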

In JavaScript, almost all errors are runtime errors, and this approach worked well. For other languages, one would also need to catch compiler messages.

Did it Work?

I generated 213 sample programs that contained 1112 code snippets from the book. Each sample program has extracted code snippets, together with variable initializations, logging calls, and comments to orient the reader. I felt that was less painful than going the other direction—writing 213 sample programs with directives for extracting 1112 snippets. And it was infinitely better than copying and pasting 1112 snippets by hand.

I found 51 syntax errors—mostly mundane stuff such as toUppercase instead of toUpperCase, forgotten extends clauses, or inconsistently named variables. A couple of the issues were more serious, and it would have been embarrassing to have them appear in the final book.

The unit testing feature found one error.

Overall, I found it a worthwhile investment. Contact me if you need to do something similar and would like my tool as a starting point.
