In the 13th edition of Core Java, the sample web sites that I used for form posts no longer exist, so I decided to provide my own, and also support file upload. But my file upload example failed mysteriously. Here is what I learned about HTTP and the standard Java HttpClient.
HttpClient API
Ever since Java 1.0, the Java API has had a URLConnection class for making HTTP requests. It requires an oddball sequence of calls to make a POST request:
url.openConnection() to get a connection
setDoOutput(true) on the connection
getOutputStream()
To post a form or file upload, you have to manually encode the request body.
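For comparison, here is a minimal sketch of that call sequence. The class name UrlConnectionPost and the helper method post are made up for illustration, and the body is assumed to be already encoded by the caller.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URI;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of any library.
class UrlConnectionPost {
    // Posts an already encoded body and returns the response body as a string.
    static String post(String urlString, String contentType, String body) throws IOException {
        URLConnection connection = URI.create(urlString).toURL().openConnection();
        connection.setDoOutput(true); // must come before getOutputStream()
        connection.setRequestProperty("Content-Type", contentType);
        try (OutputStream out = connection.getOutputStream()) {
            out.write(body.getBytes(StandardCharsets.UTF_8));
        }
        // Asking for the input stream is what actually sends the request.
        try (InputStream in = connection.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }
}
```

Note that nothing here helps with the encoding itself. The caller still has to produce the form or multipart body by hand.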
No wonder that many programmers use a more convenient library such as Apache HttpComponents.
In Java 11, the HttpClient class promised three improvements:
The modern HTTP client lives up to the first two promises. What about the third? Let's look at the API.
HttpClient client = HttpClient.newHttpClient();
HttpRequest request = HttpRequest.newBuilder()
    .uri(new URI(urlString))
    .header("Content-Type", "application/json")
    .POST(HttpRequest.BodyPublishers.ofString(jsonString))
    .build();
HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
Read the response:
String bodyString = response.body();
All good and well, except, what's the deal with those body publishers and handlers???
The request content might not be in a string, but it could be in a file or an input stream or a byte array. That's where the BodyPublisher comes in. The BodyPublishers helper class lets you wrap a string, a file, an input stream, a byte array, or a sequence of byte arrays, or a sequence of body publishers, into a BodyPublisher.
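A small sketch of a few of these factory methods follows. The class PublisherSamples and its method names are invented for the example. It also prints contentLength(), which the API documents as negative when the length is not known up front.

```java
import java.io.ByteArrayInputStream;
import java.net.http.HttpRequest.BodyPublisher;
import java.net.http.HttpRequest.BodyPublishers;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical demo class; only the BodyPublishers calls are from the JDK.
class PublisherSamples {
    static final byte[] HELLO = "Hello".getBytes(StandardCharsets.UTF_8);

    static BodyPublisher fromString() {
        return BodyPublishers.ofString("Hello"); // encoded as UTF-8
    }

    static BodyPublisher fromByteArray() {
        return BodyPublishers.ofByteArray(HELLO);
    }

    static BodyPublisher fromByteArrays() {
        return BodyPublishers.ofByteArrays(List.of(HELLO, HELLO));
    }

    static BodyPublisher fromInputStream() {
        // The supplier is called each time the body is published.
        return BodyPublishers.ofInputStream(() -> new ByteArrayInputStream(HELLO));
    }

    public static void main(String[] args) {
        System.out.println(fromString().contentLength());      // known length
        System.out.println(fromByteArray().contentLength());   // known length
        System.out.println(fromByteArrays().contentLength());  // negative: unknown
        System.out.println(fromInputStream().contentLength()); // negative: unknown
    }
}
```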
What about form posts or file uploads? No, you still have to format those requests by hand.
In many years of web programming, I have exactly never had a request body that was an input stream or byte array. It's always been a JSON post, a classic form post or a file upload.
The Java API doesn't have support for JSON, perhaps because we are still waiting to do this efficiently with value objects. And it doesn't provide a BodyPublisher for form posts or file uploads either. I don't know why, and I added implementations in Core Java.
With body handlers, it's not so bad, but maybe strings could have been the default? A handler with a JSON result would also be nice, but the Java API doesn't yet have support for JSON.
In the Java 17 edition of Core Java, I provided three examples:
So I decided to take matters into my own hands and added a simple service to the CodeCheck autograder. It runs a submitted Java program with given input and returns the output. You can submit the program and the input via JSON, a form post, or a file upload.
It is easy enough to make a body publisher for the application/x-www-form-urlencoded content type:
public static BodyPublisher ofFormData(Map<Object, Object> data) {
    boolean first = true;
    var builder = new StringBuilder();
    for (Map.Entry<Object, Object> entry : data.entrySet()) {
        if (first) first = false;
        else builder.append("&");
        builder.append(URLEncoder.encode(entry.getKey().toString(), StandardCharsets.UTF_8));
        builder.append("=");
        builder.append(URLEncoder.encode(entry.getValue().toString(), StandardCharsets.UTF_8));
    }
    return BodyPublishers.ofString(builder.toString());
}
Which makes it all the more surprising that the Java API doesn't do it.
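To see what such an encoded body looks like, here is a self-contained sketch that returns the string instead of a publisher. The class FormEncodingDemo and the sample field names are made up. The encoding rule is the same URLEncoder call as above.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical demo class for inspecting the encoded form body.
class FormEncodingDemo {
    // Encodes each key and value, joins pairs with '&'.
    static String encode(Map<String, String> data) {
        return data.entrySet().stream()
            .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
            .collect(Collectors.joining("&"));
    }

    public static void main(String[] args) {
        var data = new LinkedHashMap<String, String>(); // keeps insertion order
        data.put("name", "Ada Lovelace");
        data.put("expr", "a&b=c");
        System.out.println(encode(data)); // name=Ada+Lovelace&expr=a%26b%3Dc
    }
}
```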
Tangential note: The StandardCharsets.UTF_8 is still there since URLEncoder didn't get the memo on JEP 400: UTF-8 by Default.
File upload turned out to be fiddlier. I made a subtle mistake in the previous edition of Core Java, which that data URI service forgave. But not the framework that my CodeCheck server uses. It came back with
<h1>Bad Request</h1>
<p id="detail">
    For request 'POST /run' [Unexpected end of input]
</p>
What was bad? The server was coy about it. And it could not show the body of an incoming request that had a parse error.
Perhaps the server implementors had thought that malformed requests would be rare. They may have never contemplated that some HTTP client APIs force their users to handcraft the fiddly details of file upload.
What about the client API? Can it show the request body? Nope, only the headers. A comment in this StackOverflow post says it all: “Bang up job, Oracle!”
There is a reason for this. Someone would have to subscribe to a flow and figure out how to turn the flow messages into a comprehensible log. That Flow abstraction is there for some technical optimization, not for the benefit of the API users. And perhaps the implementors found it tedious too, or they would have done the logging.
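To give an idea of what "subscribing to a flow" means in practice, here is a rough sketch that drains a BodyPublisher into a byte array. The class BodyCapture and the method capture are hypothetical. The sketch assumes a publisher that can be subscribed to more than once, which holds for the ones that BodyPublishers creates from strings and byte arrays.

```java
import java.io.ByteArrayOutputStream;
import java.net.http.HttpRequest.BodyPublisher;
import java.nio.ByteBuffer;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Flow;

// Hypothetical debugging helper, not part of the JDK.
class BodyCapture {
    // Subscribes to the publisher and collects all published buffers.
    static byte[] capture(BodyPublisher publisher) {
        var result = new CompletableFuture<byte[]>();
        var out = new ByteArrayOutputStream();
        publisher.subscribe(new Flow.Subscriber<ByteBuffer>() {
            public void onSubscribe(Flow.Subscription subscription) {
                subscription.request(Long.MAX_VALUE); // ask for everything
            }
            public void onNext(ByteBuffer buffer) {
                byte[] bytes = new byte[buffer.remaining()];
                buffer.get(bytes);
                out.writeBytes(bytes);
            }
            public void onError(Throwable throwable) {
                result.completeExceptionally(throwable);
            }
            public void onComplete() {
                result.complete(out.toByteArray());
            }
        });
        return result.join(); // waits until onComplete or onError was called
    }
}
```

With a built request, one could call capture(request.bodyPublisher().get()) and print the result before sending. That is a lot of ceremony for a log line.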
The next step in debugging is to use some simple local HTTP server that prints out the requests. Sadly not the JEP 408: Simple Web Server, which can only handle GET requests. The JavaScript universe has many choices, such as this one. Or you can just use netcat:
nc -kl 8888
I did that. And was very surprised about the results.
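If netcat is not at hand, a few lines of Java can play the same role. This is only a sketch under the assumption that raw output is all you want: the class RequestDump and the method dumpOne are invented names, and like netcat, it never sends a response, so the client will wait until you stop it.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical stand-in for "nc -kl 8888": prints raw request bytes.
class RequestDump {
    // Accepts one connection and copies everything the client sends to the sink.
    static void dumpOne(ServerSocket server, OutputStream sink) throws IOException {
        try (Socket socket = server.accept();
             InputStream in = socket.getInputStream()) {
            in.transferTo(sink); // returns when the client closes the connection
            sink.flush();
        }
    }

    public static void main(String[] args) throws IOException {
        try (var server = new ServerSocket(8888)) {
            while (true) dumpOne(server, System.out);
        }
    }
}
```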
I used a BodyPublishers.ofByteArrays with a list of byte[] for the bits and pieces of the file upload protocol. The netcat output looked like this:
POST /run HTTP/1.1
Connection: Upgrade, HTTP2-Settings
Host: localhost:8888
HTTP2-Settings: AAEAAEAAAAIAAAABAAMAAABkAAQBAAAAAAUAAEAA
Transfer-encoding: chunked
Upgrade: h2c
User-Agent: Java-http-client/17.0.5
Content-Type: multipart/form-data; boundary=7693dafd05a2418c80cbb970ef8d8ec6

49
--7693dafd05a2418c80cbb970ef8d8ec6
Content-Disposition: form-data; name=
36
"Input"; filename="Input"
Content-Type: text/plain

1

2

49
--7693dafd05a2418c80cbb970ef8d8ec6
Content-Disposition: form-data; name=
4b
"HelloWorld.java"; filename="HelloWorld.java"
Content-Type: text/x-java

97
// Our first Java program
public class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Hello, World!");
    }
}
2

26
--7693dafd05a2418c80cbb970ef8d8ec6--
There were the bits and pieces of the multipart form-data protocol, with its characteristic boundaries. But what about those numbers, pretty clearly the length of the fragments? I had never seen this before, but it is perfectly legitimate. It is called chunked transfer encoding.
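The framing is simple enough to sketch. Each chunk is preceded by its length in hexadecimal and a CRLF, followed by another CRLF, and a chunk of length zero ends the body. The class ChunkedEncoding below is a made-up illustration of that wire format, not what HttpClient does internally.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical illustration of chunked transfer encoding framing.
class ChunkedEncoding {
    private static final byte[] CRLF = "\r\n".getBytes(StandardCharsets.US_ASCII);

    static byte[] encode(List<byte[]> chunks) {
        var out = new ByteArrayOutputStream();
        for (byte[] chunk : chunks) {
            if (chunk.length == 0) continue; // a zero-length chunk would end the body
            // Chunk size in hex, for example 73 bytes becomes "49"
            out.writeBytes(Integer.toHexString(chunk.length).getBytes(StandardCharsets.US_ASCII));
            out.writeBytes(CRLF);
            out.writeBytes(chunk);
            out.writeBytes(CRLF);
        }
        // Terminating chunk of size zero, no trailers
        out.writeBytes("0\r\n\r\n".getBytes(StandardCharsets.US_ASCII));
        return out.toByteArray();
    }
}
```

That explains why the numbers in the dump are hexadecimal: 49 is a fragment of 73 bytes, and 97 is one of 151 bytes.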
Which explains why my implementation worked with some web servers. But why not with mine?
I lamented to my long-suffering wife that I tried everything, looked at every single boundary and counted every last newline. She was annoyed.
But wait. Newline? This is HTTP. One of the last places where lines end in CRLF. Sure enough, when I changed all my \n to \r\n, my server was happy.
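Here is a minimal sketch of a multipart/form-data body with CRLF line endings throughout. It is not the Core Java implementation. The class MultipartSketch and its methods are hypothetical, every part is assumed to be a text file whose file name equals its field name, and names are not escaped.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Hypothetical sketch; assumes text/plain parts and names without quotes.
class MultipartSketch {
    private static final String CRLF = "\r\n"; // HTTP wants CRLF, not \n

    // A random boundary that is unlikely to occur in the content.
    static String newBoundary() {
        return UUID.randomUUID().toString().replace("-", "");
    }

    // Returns the pieces of the body, in order.
    static List<byte[]> encode(Map<String, String> files, String boundary) {
        var parts = new ArrayList<byte[]>();
        for (Map.Entry<String, String> entry : files.entrySet()) {
            String name = entry.getKey();
            String header = "--" + boundary + CRLF
                + "Content-Disposition: form-data; name=\"" + name
                + "\"; filename=\"" + name + "\"" + CRLF
                + "Content-Type: text/plain" + CRLF
                + CRLF; // blank line between part headers and content
            parts.add(header.getBytes(StandardCharsets.UTF_8));
            parts.add(entry.getValue().getBytes(StandardCharsets.UTF_8));
            parts.add(CRLF.getBytes(StandardCharsets.UTF_8));
        }
        // The closing boundary has two extra dashes at the end.
        parts.add(("--" + boundary + "--" + CRLF).getBytes(StandardCharsets.UTF_8));
        return parts;
    }
}
```

The resulting list can be handed to BodyPublishers.ofByteArrays, together with a Content-Type header of multipart/form-data; boundary= followed by the same boundary string.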
If you use HttpClient to send and receive JSON, it's a minor nuisance that you have to deal with the string body publisher and handler.
If you need a form post (application/x-www-form-urlencoded) or file upload (multipart/form-data), you need to manually encode the request body. What a pain. The Java 21 edition of Core Java shows you how.
When you debug HttpClient traffic, it is nice that you can log headers. As for request bodies, use an echo server, or simply netcat.
If you design an API with abstractions such as Flow that are of no interest to API users, maybe you are on the wrong track? And if you make common use cases tedious, that's not so good either.
I realize that this may come across as unkind to the Java API designers. But still, the good news is that there is a standard API in the platform. In other ecosystems, I have found myself pondering the relative merits of a number of half-baked and poorly supported libraries.