JavaDoc Code Snippets and Friends

It is easy to include buggy code snippets in code documentation. Java 18 introduces an @snippet tag with which you can pull snippets from (hopefully) working source code. I explain how that works, and also look into a couple of alternative approaches for other forms of technical documentation.
This article also appeared in the JVM Programming Advent Calendar.

“Inline” Snippets

When you include code in a document, it can be inline /* like this */ or, in typesetting speak, displayed:

/* like this */

CSS calls these inline and block display.

In HTML, you use the code tag for inline code and pre for a code block. In JavaDoc, the {@code ...} tag produces <code>...</code> in HTML, with an added benefit: The contents of the tag can contain left angle brackets < and ampersands &. You don't have to escape them as &lt; and &amp;.

For code blocks in JavaDoc, the preferred approach, up to now, has been to use

/**
 * ...
 * <pre>{@code
 *   ...
 *   ...
 * }</pre>
 */

That way, you also don't have to escape < and &.
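Here is a small, self-contained illustration of that pattern. The class and method names are made up for this example. Note that the generic type List<String> and the && operator appear unescaped inside the comment.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical class, used only to show a <pre>{@code ...}</pre> block in context.
final class Words {
    /**
     * Returns the words that are non-blank and longer than three characters.
     * <pre>{@code
     *   List<String> words = List.of("a", "code", "snippet");
     *   List<String> result = Words.longWords(words); // [code, snippet]
     *   boolean ok = result.size() == 2 && result.contains("code");
     * }</pre>
     */
    static List<String> longWords(List<String> words) {
        return words.stream()
            .filter(w -> !w.isBlank() && w.length() > 3)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(longWords(List.of("a", "code", "snippet")));
    }
}
```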

As of Java 18, you can instead use the @snippet tag:

/**
 * ...
 * {@snippet :
 *   ...
 *   ...
 * }
 */

The text between the newline following the colon and the closing brace is enclosed in a pre tag. As with @code, leading whitespace and * are removed. (You don't need to start each line with a *, but IDEs typically provide them automatically.) The remaining multi-line text has its common indent stripped, exactly like a text block.
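As I read JEP 413, the indent removal works the same way as String.stripIndent, the method behind text blocks. The following sketch applies that method to a snippet body, so you can see what "common indent" means in practice. The class name and the sample body are assumptions for this example, and javadoc's own implementation may differ in details.

```java
// Hypothetical demo class; only String.stripIndent is a real JDK API (Java 15+).
final class IndentDemo {
    // A snippet body as it might look after the leading " * " has been removed.
    static final String BODY = String.join("\n",
        "      if (name != null) {",
        "         builder.code(name);",
        "      }");

    /** Removes the indent that all lines have in common, keeping relative indentation. */
    static String stripCommonIndent(String body) {
        return body.stripIndent();
    }

    public static void main(String[] args) {
        System.out.println(stripCommonIndent(BODY));
    }
}
```

The six spaces shared by all three lines disappear, and the extra three spaces of the middle line survive.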

A couple of caveats: In @snippet, just like in @code, braces { } must balance. As with all JavaDoc, the snippet cannot contain /* */ comments because the */ would end the enclosing /** ... */ JavaDoc comment. In Java, comments do not nest, and JavaDoc cannot change the processing of comments.

If you are familiar with the typesetting terminology, you may be irritated that JEP 413 refers to these as “inline snippets”, even though they yield code blocks, not inline code.

Even more confusingly, the JavaDoc documentation uses “inline” in a different way. It distinguishes between block tags such as @param, which appear at the start of a line, and inline tags such as {@code ...}, which can appear anywhere. In that sense, @snippet is an inline tag. If it contains code (and not a reference to external code), it is inline in both senses, as a tag and a snippet.

So far, this is not a monumental advance. The real power of snippets is to pull in code from external source files.

External Snippets and Regions

Now on to the interesting part. Suppose you have a code example that you want to include in your documentation. To make sure it is correct, you first make a unit test:

public class MarkdownBuilderTest {
   @Test void testCodeDelimitedByBackticks() {
      var builder = new MarkdownBuilder();
      builder.code("`Hello, ${name}!`");
      var result = builder.toString(); // `` `Hello, ${name}!` ``
         // Why this madness? See
         // https://meta.stackexchange.com/questions/82718/how-do-i-escape-a-backtick-within-in-line-code-in-markdown
      assertEquals("`` `Hello, ${name}!` ``", result);
   }
   ...
}

Of course, you don't want to copy and paste the code into your JavaDoc. What if the API changes later? Instead, a snippet can read a region from a file:

/**
 * Adding an inline code element:
 * {@snippet file=MarkdownBuilderTest.java region=code-with-backticks}
 */

To mark the snippet in the source file, use markup comments:

public class MarkdownBuilderTest {
   @Test void testCodeDelimitedByBackticks() {
      // @start region=code-with-backticks
      var builder = new MarkdownBuilder();
      builder.code("`Hello, ${name}!`");
      var result = builder.toString(); // `` `Hello, ${name}!` ``
      // @end
         // Why this madness? See https://meta.stackexchange.com/questions/82718/how-do-i-escape-a-backtick-within-in-line-code-in-markdown
      assertEquals("`` `Hello, ${name}!` ``", result);
   }
   ...
}

The JavaDoc tool locates the file, copies the region, and strips the common indent.

The region can contain arbitrary code. The braces don't have to match, and there can be /* ... */ comments.

Ok, not entirely arbitrary. The code cannot contain a comment of the form // @end 🙂

A file can have any number of regions. You can even have overlapping regions. Then the end comments must have the form

// @end region=regionName
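To make the mechanics concrete, here is a simplified sketch of region extraction. It is not javadoc's implementation: the class and method names are made up, it looks only at markup comments that stand on their own line, and it ignores nesting and overlapping regions by treating the first plain // @end as the end of the requested region.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified sketch, not javadoc's implementation: it handles one named
// region at a time and only markup comments that stand on their own line.
final class RegionExtractor {
    static final String SAMPLE = String.join("\n",
        "public class MarkdownBuilderTest {",
        "   @Test void testCodeDelimitedByBackticks() {",
        "      // @start region=code-with-backticks",
        "      var builder = new MarkdownBuilder();",
        "      builder.code(\"`Hello, ${name}!`\");",
        "      // @end",
        "   }",
        "}");

    /** Returns the lines of the named region, with the common indent removed. */
    static String extract(String source, String region) {
        List<String> body = new ArrayList<>();
        boolean inside = false;
        for (String line : source.split("\n")) {
            String trimmed = line.trim();
            if (!inside) {
                // Skip everything until the start marker of the requested region.
                inside = trimmed.equals("// @start region=" + region);
            } else if (trimmed.equals("// @end")
                    || trimmed.equals("// @end region=" + region)) {
                return String.join("\n", body).stripIndent();
            } else {
                body.add(line);
            }
        }
        throw new IllegalArgumentException("No complete region named " + region);
    }

    public static void main(String[] args) {
        System.out.println(extract(SAMPLE, "code-with-backticks"));
    }
}
```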

If you want to include an entire source file in a snippet, you don't specify a region:

{@snippet file=MarkdownBuilderTest.java}

Instead of the file attribute, you can use a class attribute and specify the class name:

{@snippet class=com.horstmann.test.MarkdownBuilderTest}

Often, the files containing the snippets are not a part of the code that implements the features that you are documenting. In that case, you have two choices. You can place the files in a subdirectory named snippet-files of the package to which the class belongs. (Note that due to the hyphen, snippet-files cannot be a part of a package name.) Or you can place them elsewhere, and invoke javadoc with the --snippet-path command-line argument, passing a list of directories separated by the platform path separator.

Highlighting

Sometimes you want to emphasize or highlight a part of your snippet. Use a @highlight markup comment to specify the highlighted parts. For example:

builder.code("`Hello, ${name}!`"); // @highlight substring=code

Any matching substrings are displayed in bold:

builder.code("`Hello, ${name}!`");

This directive affects only the line on which the comment appears. Alternatively, you can place the comment above the affected line:

// @highlight substring=code :
builder.code("`Hello, ${name}!`");

Note the colon at the end of the comment line. This form is useful if the subsequent line is long, and it may be necessary if the subsequent line cannot have a trailing comment, such as a text block.

To apply a @highlight directive to multiple lines, define a region, like this:

// @highlight region substring=code
...
// @end

You can name the region if you like:

// @highlight region=highlight-code substring=code
...
// @end region=highlight-code

Caution: You cannot use this name to import the region as an external snippet. Only regions marked up with @start region=regionName can be imported.

If the substring (or in general, any attribute value of a markup comment) contains spaces, enclose it in single or double quotes. Here we need to use single quotes because the string contains double quotes.

builder.code("`Hello, ${name}!`"); // @highlight substring='"`Hello, ${name}!`"'

Now the string "`Hello, ${name}!`" will be emphasized, including the quotation marks.

What if the substring to match contains both single and double quotes?

builder.code("`G'day, ${name}!`");

Then you can no longer use substring, but you can match with a regular expression. See the following section.

You can choose among three types of emphasis: type=bold (the default), type=italic, or type=highlighted. The type name becomes the class attribute of the generated HTML. You can change the appearance by adding a stylesheet or by replacing the standard one (with the --add-stylesheet or --main-stylesheet option of the javadoc tool).

Regular Expressions

To select a regular expression, use the regex attribute instead of substring:

// @highlight regex=\bcode\b

This matches all occurrences of code with word boundaries. If the string code occurs in a larger word, such as encode, the initial \b prevents a match.

Note that you do not escape the backslashes, even if the regular expression is enclosed in quotes:

// @highlight regex="\bcode\b"
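If you want to check what such a pattern matches before you put it into a markup comment, you can try it with java.util.regex. The following sketch uses a made-up class name and sample line. Unlike in the markup comment, each backslash must be doubled inside a Java string literal.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class for trying out a highlight pattern.
final class WordBoundaryDemo {
    static final String LINE = "encoder.encode(builder.code(text));";

    /** Counts the non-overlapping matches of the regex in the line. */
    static int countMatches(String regex, String line) {
        Matcher m = Pattern.compile(regex).matcher(line);
        int count = 0;
        while (m.find()) count++;
        return count;
    }

    public static void main(String[] args) {
        // Without word boundaries: encoder, encode, and code all match.
        System.out.println(countMatches("code", LINE));
        // With word boundaries: only the method name code matches.
        System.out.println(countMatches("\\bcode\\b", LINE));
    }
}
```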

As another example, we might want to match quoted strings:

// @highlight regex='"[^"]*"'

Here I had to enclose the regular expression in single quotes because it contains double quotes.

I know...if the quoted string contains \" escapes, that regular expression isn't good enough. See this SO discussion for delightful improvements.

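In a regular expression, you can use Unicode escapes. In particular, you can use \u0027 or \u0022 to denote single or double quotes. Keep in mind that $ and { are regex metacharacters, so you must escape them with a backslash to match them literally. To match the pesky string from the preceding section, use

// @highlight regex='"`G\u0027day, \$\{name\}!`"'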

Caution: Both javac and javadoc process Unicode escapes in source files before lexical analysis. However, javadoc does not process Unicode escapes in external snippets. In the preceding example, the six-character sequence \u0027 is a part of the regular expression.
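To see how the regex engine treats the escape, you can try the pattern with java.util.regex directly. This is a sketch with made-up class and method names. It writes the $ and { of ${name} with backslashes because both are metacharacters in java.util.regex. The second check shows that the unescaped text ${name} is rejected as a pattern.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical demo class for trying out a pattern with a Unicode escape.
final class UnicodeEscapeDemo {
    // Each backslash is doubled in the Java literal, so the pattern text holds
    // the six characters backslash, u, 0, 0, 2, 7. The regex engine, not the
    // compiler, turns them into a single quote.
    static final String REGEX = "\"`G\\u0027day, \\$\\{name\\}!`\"";

    static final String LINE = "builder.code(\"`G'day, ${name}!`\");";

    /** Returns the first match of the regex in the line, or null if there is none. */
    static String firstMatch(String regex, String line) {
        Matcher m = Pattern.compile(regex).matcher(line);
        return m.find() ? m.group() : null;
    }

    /** Reports whether java.util.regex accepts the given pattern text. */
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(firstMatch(REGEX, LINE));
        System.out.println(compiles("${name}"));
    }
}
```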

Linking and Replacing

Use a @link markup comment to add links to the generated JavaDoc:

builder.code("`Hello, ${name}!`"); // @link substring=code target=MarkdownBuilder#code

The target attribute value uses the same format as the {@link ...} tag for links outside snippets.

Instead of the substring attribute, you can use a regex attribute to specify the range of the link.

Sometimes, you want to elide inessential detail in the JavaDoc. This is achieved with the @replace markup comment:

builder.code("`Hello, ${name}!`"); // @replace regex='".*"' replacement='"..."'

If the regular expression has groups, you can reference the group matches in the replacement text, using the general rules for Java regex replacement:

System.out.println("Hello, World!"); // @replace regex='"(.{3}).*(.{3})"' replacement='"$1...$2"'

The replacement contains the first and last three characters in the string, yielding:

System.out.println("Hel...ld!");

Conversely, to include a literal $ or \ in the replacement text, escape it with a backslash.
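Since the group references and escapes follow the java.util.regex rules, you can try out a replacement with String.replaceAll before committing it to a markup comment. The class and method names below are made up for this sketch, and the second method is just an illustration of an escaped dollar sign.

```java
// Hypothetical demo class for trying out replacement texts.
final class ReplaceDemo {
    /** Keeps the first and last three characters of a quoted string. */
    static String elide(String line) {
        return line.replaceAll("\"(.{3}).*(.{3})\"", "\"$1...$2\"");
    }

    /** Replaces every run of digits with a literal dollar sign followed by "amount". */
    static String maskAmounts(String line) {
        // The replacement text is \$amount; the backslash makes the $ literal.
        return line.replaceAll("[0-9]+", "\\$amount");
    }

    public static void main(String[] args) {
        System.out.println(elide("System.out.println(\"Hello, World!\");"));
        System.out.println(maskAmounts("price = 1995;"));
    }
}
```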

As with @highlight, you can scope @link and @replace to a region:

// @link region=link-code substring=code target=MarkdownBuilder#code

Other File Types

Snippets don't have to be Java code. The standard doclet also supports the properties format. Here is an inline snippet:

/**
 * ...
 * This program writes a file such as the following:
 * {@snippet  :
 *    #Program Properties
 *    #Sun Dec 1 12:54:19 PST 2022
 *    top=227.0
 *    left=1286.0
 *    width=423.0
 *    height=547.0
 *    filename=/home/cay/books/cj12/code/v1ch09/raven.html
 * }
 */

You can import an external properties file:

/**
 * ...
 * {@snippet file=sample.properties}
 */

In the properties file, you can use regions, highlights, links, and replacements, using markup comments that start with a # symbol:

# @highlight regex=[0-9]+\.[0-9]* :

Caution: Properties files don't allow trailing comments. Comments must span an entire line. Either use the colon syntax, applying the markup comment to the next line, or use a region.
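You can convince yourself of both points with java.util.Properties. In this sketch (class name and sample text are assumptions), the markup comment line is skipped by the loader like any other comment, while a trailing # on a key/value line simply becomes part of the value.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical demo class; Properties.load is the standard JDK loader.
final class PropertiesMarkupDemo {
    static final String SAMPLE = String.join("\n",
        "# @highlight regex=[0-9]+\\.[0-9]* :",
        "top=227.0",
        "left=1286.0 # not a comment");

    /** Loads properties from the given text. */
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties props = load(SAMPLE);
        System.out.println(props.getProperty("top"));
        System.out.println(props.getProperty("left"));
    }
}
```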

Hybrid Snippets

Inline snippets are convenient for the JavaDoc author because the code is right there to see, in the source file. But they are only good for eye candy—highlighting and linking. External snippets allow for testing the snippet code. But they don't show up in the source file that is being documented. Personally, I don't see that as a huge problem. You can always generate the JavaDoc and look at that. But for those JavaDoc authors who would like to see the imported code, there are hybrid snippets.

A hybrid snippet is both inline and external:

/**
 * Adding an inline code element:
 * {@snippet file=MarkdownBuilderTest.java region=code-with-backticks :
 * var builder = new MarkdownBuilder();
 * builder.code("`Hello, ${name}!`");
 * var result = builder.toString(); // `` `Hello, ${name}!` ``
 * }
 */

Note that the @snippet tag has both an external reference and a body (between the colon and the closing brace).

If the inline and external code do not match, javadoc reports an error.

The Compiler Tree API

So far, I have focused on a workflow where the snippet code is contained in external source files that are tested independently. Suppose conversely that a JavaDoc author prefers inline snippets. Then it would be useful to have a tool that extracts the inline snippets and injects them into external code, such as unit tests. The JDK doesn't contain such a tool, but it could be built with the snippet support in the Compiler Tree API. Java 18 adds a SnippetTree node (https://docs.oracle.com/en/java/javase/18/docs/api/jdk.compiler/com/sun/source/doctree/SnippetTree.html) to the Compiler Tree API. If you are interested in exploring this, have a look at this overview.

Scala mdoc

The snippet support is a good solution for pulling in and formatting code into JavaDoc. It does not by itself run or verify the code. The Scala mdoc tool uses a different approach. It executes code snippets and pastes their results into the documentation.

Scala mdoc is based on Markdown, not HTML, which is just a sign of our modern times, not a fundamental difference. (When JavaDoc was first conceived, the birth of Markdown was still a decade away.)

Scala mdoc is not limited to producing API docs, but it can be used for arbitrarily structured documentation such as tutorials. I don't want to dwell on the details. Here is the one interesting point: mdoc runs code snippets and splices the results into the documentation. If you write:

```scala mdoc
val x = 1
List(x, x)
```

the Markdown snippet is transformed to

```scala
val x = 1
// x: Int = 1
List(x, x)
// res0: List[Int] = List(1, 1)
```

This is similar to running the code in the Scala REPL (with some subtle differences).

In Scala, this can be very useful. One can often describe an API with well-chosen examples of method invocations and their outputs. By using mdoc, one knows that the examples compile and that the documented results are always up to date.

There is also support for code snippets that shouldn't compile or that throw an exception. You can hide setup code that is necessary for the code snippets to run.

Would something similar be useful for Java documentation? Code snippets could be piped into JShell and their values could be integrated into the documentation. It is certainly something to think about.

Articles and Books

When writing an article or book about programming, one is faced with the same problem as the API doc authors—to make sure that the code snippets actually work.

AsciiDoc has support for importing regions from external code files that is similar to JavaDoc snippets.

GitHub-flavored Markdown lets you import regions by line number, provided the file is stored on GitHub. That is of course more limiting since the line numbers are not stable.

When I write books, I follow the inline snippet approach. I include the code in the manuscript, so that I can easily read and edit it. I use a very simple tool to extract the snippets into source files. For each source file, I have a template that splices the snippets into the source file. Here is an example from the third edition of Scala for the Impatient:

//SRC ../html/ch02.html
//OUT ch2/src/main/scala/section4.worksheet.sc

// print is like println, but doesn't add a newline: 
//INS #io-1

// You can use string concatenation for complex outputs:
//INS #io-2

// For formatted output, use the f interpolator: 
val name = "Fred"
val age = 42
//INS #io-4

// The raw interpolator:
//INS #io-4b

// Double an actual dollar in an interpolated string: 
val price = 19.95
//INS #io-4c

I write in HTML and the snippets are identified by their id attributes. In Markdown, one could use a fence label such as ```scala snippet #io-1. Such a template processor is easy to write. I describe the design in more detail in this blog.
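To give an idea of how little is needed, here is a minimal sketch of such a processor in Java. It is not my actual tool. The class and method names are made up, the snippets are assumed to have been collected into a map from id to code already, and the //SRC and //OUT lines are simply dropped instead of being used to locate the input and output files.

```java
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch of a template processor for //INS directives.
final class SnippetTemplate {
    private static final String INSERT = "//INS #";

    /** Replaces each "//INS #id" line with the snippet that has the given id. */
    static String expand(String template, Map<String, String> snippets) {
        return template.lines()
            // Assumption: //SRC and //OUT only name files and produce no output.
            .filter(line -> !line.startsWith("//SRC") && !line.startsWith("//OUT"))
            .map(line -> {
                if (!line.startsWith(INSERT)) return line;
                String id = line.substring(INSERT.length()).trim();
                String code = snippets.get(id);
                if (code == null) {
                    throw new IllegalArgumentException("Unknown snippet id: " + id);
                }
                return code;
            })
            .collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        String template = String.join("\n",
            "//SRC ../html/ch02.html",
            "//OUT ch2/src/main/scala/section4.worksheet.sc",
            "",
            "// print is like println, but doesn't add a newline:",
            "//INS #io-1");
        System.out.println(expand(template, Map.of("io-1", "print(\"Answer: \")")));
    }
}
```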

You now know how to improve your JavaDoc with code snippets. Hopefully I have given you some food for thought on how to use a similar approach for other technical writing, and a glimpse of what may become possible in the future.