Happstack and Streaming: Part 5: Modifying Happstack

Modifying Happstack

So now that we’ve established that changes to Happstack are needed to support streaming, what should those changes look like? Abstractly, there needs to be a way to give it a series of chunks that it can send to the browser individually via chunked transfer encoding, flushing the network buffer after each one to make sure they’re sent promptly.

The most obvious way to do this is to add a third data constructor to the Response data type, for use with streams. The main difference would be that, instead of accepting a single ByteString as the response to the browser, it would somehow get ahold of several.

What would that “somehow” be, though? Looking back to our original motivation (the real-time multiplayer game being played in a web browser), the entire contents of the stream aren’t known at the time it starts, and its contents will be determined by the actions of the various players — i.e., things happening in the IO monad. This means a simple list of ByteStrings isn’t the way to go. Although we did demonstrate such a list could be built anyway using some trickery, ideally we’d want something a bit more elegant, or at least one that doesn’t require actively subverting the type system via unsafeInterleaveIO.

Therefore we’d want to put an IO action in the response that could be used to generate new ByteStrings on demand for each chunk. The simplest way would be to use an IO ByteString:

data Response = {- Response and SendFile data constructors... -}
              | Chunked { rsCode      :: Int,
                          rsHeaders   :: Headers,
                          rsFlags     :: RsFlags,
                          rsValidator :: Maybe (Response -> IO Response),
                          chGenerator :: IO L.ByteString
                        }

Happstack could execute that action repeatedly to generate successive chunks, and we could even denote end-of-stream by having it produce an empty ByteString, which neatly parallels how chunked transfer encoding works.

An alternative would be something that includes an explicit state parameter that gets chained from one call to the IO action to the next. Without something like that, it would be awkward (albeit not impossible) for the action to keep track of where it’s at in the stream and what it should generate next.

data Response a = {- Response and SendFile data constructors... -}
                | Chunked { rsCode         :: Int,
                            rsHeaders      :: Headers,
                            rsFlags        :: RsFlags,
                            rsValidator    :: Maybe (Response a -> IO (Response a)),
                            chGenerator    :: a -> (a, IO L.ByteString),
                            chInitialState :: a
                          }

Unfortunately, that changes the kind of Response from * to * -> * in order to account for the new type parameter a representing whatever state the stream wants to keep track of. Since Response is used throughout Happstack and programs built on it, breaking API compatibility like that really ought to be avoided if we can help it, especially when it’s just for the sake of what would be an infrequently used feature.

Instead, what if we turn things around and put the generator in the driver’s seat: pass it an IO action to call whenever it has a new chunk ready to be sent:

data Response = {- Response and SendFile data constructors... -}
              | Chunked { rsCode         :: Int,
                          rsHeaders      :: Headers,
                          rsFlags        :: RsFlags,
                          rsValidator    :: Maybe (Response -> IO Response),
                          chGenerator    :: (L.ByteString -> IO ()) -> IO ()
                        }

Happstack would invoke chGenerator with a function that handles writing a chunk to the network and doing whatever it needs to do when the stream is over. The last thing chGenerator would do is call that function with an empty ByteString to signal end-of-stream. chGenerator would be responsible for chaining any state information from one step to the next. It would actually look rather like the pipe example from earlier; the function provided by Happstack would be used in place of hPrint and hClose, but other than that it’s the same basic idea.

There’s still the issue of signaling to the generator when it should stop because the network connection closed. But hey, we’ve got a perfectly good return value from the Happstack-provided function that we’re not using. Let’s use it:

data Response = {- Response and SendFile data constructors... -}
              | Chunked { rsCode         :: Int,
                          rsHeaders      :: Headers,
                          rsFlags        :: RsFlags,
                          rsValidator    :: Maybe (Response -> IO Response),
                          chGenerator    :: (L.ByteString -> IO Bool) -> IO ()
                        }

The Happstack-provided function returns True if more data should be generated, or False if it should be aborted.

That ought to work pretty well. It addresses all the problems identified with our attempts to stream without changing Happstack. The use of Chunked as the data constructor for the Response object will tell Happstack to suppress the Content-Length header, use chunked transfer encoding, and flush the network buffer after each chunk. New data to stream is only generated as needed, without using additional threads or large buffers. API compatibility with existing code is preserved; we’re adding a new interface and leaving existing ones untouched. Even better, there’s no need to use any trickery to achieve lazy IO; with Happstack’s cooperation, the usual kind of IO works just fine.

Mind you, I haven’t written a patch that implements this proposal. It’s just an idea. At the very least, I’d want to have one realistic application (i.e., not a simple proof-of-concept like from earlier) that demonstrates how it can be used and to verify that it does indeed work correctly before actually submitting a patch. After all, as any good programmer can tell you, sometimes flaws in a design don’t become apparent until you actually try to implement them. But it seems like this ought to work.

Now I just have to get around to writing the application whose idea motivated me to investigate the feasibility of streaming data from Happstack to begin with.