Happstack and Streaming: Part 4: The Flaw

The Fatal Flaw

All three approaches to generating a lazy ByteString from the IO monad actually do work, as you can verify by loading the source code into ghci and invoking them manually. However, if you go through the web server and visit one of the finite stream paths, no output will appear until the stream has finished being generated, at which point the entire set of output from the server will arrive all at once, like so:

paul@queeg:~/tmp$ GET -es http://localhost:8000/pipe/limited

(nothing happens for a while, and then…)
200 OK
Connection: close
Date: Mon, 18 Jan 2010 13:30:55 GMT
Server: Happstack/0.4.1
Content-Type: text/html; charset=utf-8
Client-Date: Mon, 18 Jan 2010 13:30:58 GMT
Client-Response-Num: 1
2010-01-18 13:30:55.342444 UTC
2010-01-18 13:30:55.443235 UTC
2010-01-18 13:30:55.544115 UTC
2010-01-18 13:30:55.644934 UTC
2010-01-18 13:30:55.745514 UTC
2010-01-18 13:30:55.846283 UTC
2010-01-18 13:30:55.947581 UTC
2010-01-18 13:30:56.048834 UTC
2010-01-18 13:30:56.150068 UTC
2010-01-18 13:30:56.251281 UTC
2010-01-18 13:30:56.352521 UTC
2010-01-18 13:30:56.454423 UTC
2010-01-18 13:30:56.555816 UTC
2010-01-18 13:30:56.657179 UTC
2010-01-18 13:30:56.758547 UTC
2010-01-18 13:30:56.859939 UTC
2010-01-18 13:30:56.961296 UTC
2010-01-18 13:30:57.062581 UTC
2010-01-18 13:30:57.163847 UTC
2010-01-18 13:30:57.265212 UTC
2010-01-18 13:30:57.366438 UTC
2010-01-18 13:30:57.4677 UTC
2010-01-18 13:30:57.569059 UTC
2010-01-18 13:30:57.670414 UTC
2010-01-18 13:30:57.772045 UTC
2010-01-18 13:30:57.873404 UTC
2010-01-18 13:30:57.974761 UTC
2010-01-18 13:30:58.076117 UTC
2010-01-18 13:30:58.177485 UTC
2010-01-18 13:30:58.278847 UTC

The point where the delay occurs reveals what’s going on: not even the headers are sent until the entire response has been generated. That’s not Happstack’s doing; it’s the buffering happening inside the networking library. In the absence of any command to send data out immediately, the library waits until it has a large chunk it can send all at once. Sending one large packet instead of lots of small packets makes more efficient use of network bandwidth, since each packet carries its own overhead. And since Happstack wasn’t written with streaming in mind, it doesn’t flush the buffer until it’s written out the complete response.

As further evidence, the infinite streams do stream, kind of. Once the buffer fills up, the networking library will push it out to make room for more data. As a result, nothing will arrive for a while, then all of a sudden lots of data will arrive, then another long pause as the buffer fills back up, then another big chunk of data, and so on.
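To make that burst pattern concrete, here is a toy model of a block buffer. It is only a sketch for illustration: the name simulate and the capacity parameter are made up here, and the real networking library certainly doesn’t work on Strings like this. The point is just that small writes pile up unseen until the threshold is reached.

```haskell
-- | Toy model of a block buffer (an illustration, not the real library).
-- Writes accumulate until at least 'capacity' bytes are pending; only
-- then does a packet go out. Returns the packets that were sent, plus
-- whatever is still sitting in the buffer at the end.
simulate :: Int -> [String] -> ([String], String)
simulate capacity = go ""
  where
    go pending [] = ([], pending)
    go pending (w:ws)
      | length pending' >= capacity =
          -- threshold reached: everything pending leaves as one packet
          let (sent, rest) = go "" ws in (pending' : sent, rest)
      | otherwise = go pending' ws
      where pending' = pending ++ w

main :: IO ()
main = print (simulate 8 ["ab", "cd", "ef", "gh", "ij"])
```

With a capacity of 8, the first four two-byte writes go out together as a single packet, and the fifth stays stuck in the buffer, which is the same behavior the finite streams show when their whole output fits below the threshold.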

This alone shows why, despite our best efforts at cleverly creating the response, it’s all for nothing unless we can control the buffering behavior down in the network library, which Happstack doesn’t provide any access to. The only exception would be if we’re trying to stream data quickly enough to rapidly fill up the buffer, but since there’s also no way to control the size of the buffer, that “solution” isn’t reliable, and certainly not applicable if we’re only trying to stream a relative trickle of information.

Buffering Strikes Back

Buffering introduces additional problems that, while they don’t kill the solution outright, add some significant inefficiencies. These are easiest to see with the infinite streams, which continue until the browser closes the connection. (They’d also arise any time the browser closes a finite stream before receiving all the data.)

First, Happstack only detects that the connection to the browser has been closed once the network library tries to send out data and returns an error. Between the time when the connection actually closes (i.e. when the browser sends a TCP FIN packet) and the time when Happstack notices, the program continues generating new data to send. All this effort is wasted, since the data will never get sent out and will be thrown away. The app server winds up doing a lot of useless work as a result.

Being able to control network-level buffering would largely deal with this problem too: since a closed connection is detected when trying to send data, sending each chunk out immediately would allow the app server to stop generating the stream much more promptly. If that were the case, the approach of manually using unsafeInterleaveIO, despite being the most difficult of the three, would work fairly well. The other two, however, have their own buffering problems, independent of what’s happening at the network level.
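As a rough sketch of what “send each chunk immediately and stop on the first error” could look like, assume we had direct access to the connection as an ordinary Handle (which is exactly what Happstack doesn’t give us). The function name streamUntilError is hypothetical; the System.IO and Control.Exception calls are the standard ones from base.

```haskell
import Control.Exception (IOException, try)
import System.IO

-- | Write and flush one chunk at a time, stopping at the first I/O error.
-- Returns how many chunks were sent successfully. Because the list is
-- consumed lazily, chunks after the failure are never generated.
streamUntilError :: Handle -> [String] -> IO Int
streamUntilError h = go 0
  where
    go n []     = return n
    go n (c:cs) = do
      -- hFlush pushes the chunk out now instead of waiting for a full buffer
      r <- try (hPutStr h c >> hFlush h) :: IO (Either IOException ())
      case r of
        Left _   -> return n   -- write failed (e.g. peer closed): stop here
        Right () -> let n' = n + 1 in n' `seq` go n' cs

main :: IO ()
main = do
  sent <- streamUntilError stdout (take 3 (repeat "tick\n"))
  hPutStrLn stderr ("sent " ++ show sent ++ " chunks")
```

Since the error surfaces on the very next write, at most one chunk’s worth of work is wasted after the connection goes away, even when the list of chunks is infinite.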

For example, what is a pipe but a buffer being managed by the operating system? Since a separate thread is writing data to the pipe independently of the thread reading from it, the generation thread will keep on going even if the network connection closes, until the pipe fills up. In theory the OS should cause writes to fail as soon as the read end of the pipe is closed, but using lazy IO to read from it seems to keep this from happening promptly. The generation thread will keep writing more data to the pipe until it too gets a write error and stops.

Using channels is even worse. The network buffer and the pipe at least have the benefit of being of finite size; once they fill up, further attempts to write will block until something reads the data back out or an error is detected and the buffer is destroyed. Channels, however, are unbounded. They never fill up; they just keep growing to make room for new data as it’s written. As a result, in the event of the browser closing the connection prematurely, the size of the channel will grow and grow until the thread writing data to it decides to stop. This is a problem for the infinite streams, since they never stop; eventually the channel will grow to consume all available memory on the system until the OS kills the app server entirely. This is also a problem for finite streams, of course, since those channels won’t get thrown away until they grow to the size of the unconsumed portion of the stream, which would be a big problem if the app server is generating lots of streams.
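For contrast, here is a minimal sketch of a channel with a fixed capacity, built from the standard Chan and QSem in base. This is not something the code from the earlier parts does; the names BoundedChan, newBoundedChan, writeBounded and readBounded are invented for this illustration, and the sketch ignores exception safety (an asynchronous exception between taking a slot and writing would leak that slot).

```haskell
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Concurrent.QSem (QSem, newQSem, signalQSem, waitQSem)
import System.Timeout (timeout)

-- | A channel that holds at most a fixed number of unread items.
data BoundedChan a = BoundedChan QSem (Chan a)

newBoundedChan :: Int -> IO (BoundedChan a)
newBoundedChan capacity = do
  slots <- newQSem capacity   -- one unit of the semaphore per free slot
  chan  <- newChan
  return (BoundedChan slots chan)

-- | Blocks while the channel is full.
writeBounded :: BoundedChan a -> a -> IO ()
writeBounded (BoundedChan slots chan) x = do
  waitQSem slots              -- claim a free slot, or wait for one
  writeChan chan x

-- | Blocks while the channel is empty; frees a slot after reading.
readBounded :: BoundedChan a -> IO a
readBounded (BoundedChan slots chan) = do
  x <- readChan chan
  signalQSem slots
  return x

main :: IO ()
main = do
  ch <- newBoundedChan 2
  writeBounded ch "a"
  writeBounded ch "b"
  -- The channel is now full, so a third write should block until the
  -- 0.1 second timeout gives up on it.
  third <- timeout 100000 (writeBounded ch "c")
  print third
  readBounded ch >>= putStrLn
```

A bound like this caps the memory a dead stream can consume, but it doesn’t solve the whole problem: a writer blocked on a full channel still has to be told somehow that the reader has gone away.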

Playing Nice

But even if all the buffering problems can be dealt with, our solution is still far from ideal. While slowly trickling out the stream’s data as it becomes available is permitted by the HTTP specification, it’s really not the proper way to go about it.

Remember how we had to suppress the Content-Length header? Browsers use that header to know when they can stop reading the response from the server. Without it, the only way they can tell they’ve received all the data is when the server closes the connection. That means every streamed response costs us the connection, which is a shame, because leaving the connection open has the advantage that the browser can re-use it for the next request it sends to the server, instead of creating a new one. Establishing a new connection requires doing the TCP three-way handshake again, which involves a round trip to the server that doesn’t carry any data. Being able to reuse the connection is faster since this extra round trip is eliminated. It might not seem like much, but consider a web page that has lots of small graphics on it; without being able to reuse a connection, a new round trip is needed every time the browser tries to download another image. All those little delays add up.

It turns out HTTP does have a way to stream data while still telling the browser how much data to expect: chunked transfer encoding. Basically, the server’s response gets split up into separate chunks, and each chunk carries its own length information. The end of the stream is indicated by a zero-length chunk. With chunked transfer encoding, the browser knows when it’s finished receiving the data, even though the server doesn’t necessarily know how much data will be sent beforehand.
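The framing itself is simple enough to sketch. In HTTP/1.1 each chunk is sent as its size in hexadecimal, a CRLF, the data, and another CRLF, and the body ends with a chunk of size zero. The functions below are hypothetical helpers written for this post, not part of Happstack, and they leave out optional extras like chunk extensions and trailers. A real response would also need a “Transfer-Encoding: chunked” header.

```haskell
import qualified Data.ByteString.Char8 as B
import qualified Data.ByteString.Lazy.Char8 as L
import Numeric (showHex)

-- | Frame one piece of the body as a chunk:
-- hexadecimal size, CRLF, the data itself, CRLF.
encodeChunk :: B.ByteString -> L.ByteString
encodeChunk piece = L.fromChunks
  [ B.pack (showHex (B.length piece) "")
  , crlf
  , piece
  , crlf
  ]
  where crlf = B.pack "\r\n"

-- | Frame a lazy body piece by piece, then add the zero-length chunk
-- that marks the end. Empty pieces are skipped, because a zero-length
-- chunk in the middle would end the stream early.
encodeChunked :: L.ByteString -> L.ByteString
encodeChunked body =
  L.concat (map encodeChunk pieces ++ [L.pack "0\r\n\r\n"])
  where pieces = filter (not . B.null) (L.toChunks body)

main :: IO ()
main = L.putStr (encodeChunked (L.fromChunks [B.pack "hello", B.pack "world!"]))
```

Because the input is a lazy ByteString, this should fit the generators from the earlier parts: each strict piece of the stream becomes one chunk on the wire as it is produced, and the terminating chunk is only emitted once the stream ends.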

Chunked transfer encoding is what we’d want the server to be able to do. Of course, Happstack would need to be modified to support it, since it too needs to know when the stream has ended so it can reuse the connection for the next request from the browser.

In Part 5, we’ll look at just what sort of modifications we might try to make.
