<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:creativeCommons="http://backend.userland.com/creativeCommonsRssModule">

<channel>
	<title>Paul Kuliniewicz &#187; happstack</title>
	<atom:link href="http://www.kuliniewicz.org/blog/archives/tag/happstack/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.kuliniewicz.org/blog</link>
	<description>After all, it could only cost you your life, and you got that for free.</description>
	<lastBuildDate>Wed, 18 Jan 2012 04:01:40 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.4</generator>
<creativeCommons:license>http://creativecommons.org/licenses/by-nc-nd/3.0/us/</creativeCommons:license>		<item>
		<title>Happstack and Streaming: Part 5: Modifying Happstack</title>
		<link>http://www.kuliniewicz.org/blog/archives/2010/01/29/happstack-and-streaming-part-5-modifying-happstack/</link>
		<comments>http://www.kuliniewicz.org/blog/archives/2010/01/29/happstack-and-streaming-part-5-modifying-happstack/#comments</comments>
		<pubDate>Fri, 29 Jan 2010 15:00:25 +0000</pubDate>
		<dc:creator>Paul Kuliniewicz</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[happstack]]></category>
		<category><![CDATA[haskell]]></category>
		<category><![CDATA[streaming]]></category>

		<guid isPermaLink="false">http://www.kuliniewicz.org/blog/?p=1745</guid>
		<description><![CDATA[Modifying Happstack So now that we&#8217;ve established that changes to Happstack are needed to support streaming, what should those changes look like? Abstractly, there needs to be a way to give it a series of chunks that it can send to the browser individually via chunked transfer encoding, flushing the network buffer after each one [...]]]></description>
			<content:encoded><![CDATA[<h3>Modifying Happstack</h3>
<p>So now that <a href="http://www.kuliniewicz.org/blog/archives/2010/01/27/happstack-and-streaming-part-4-the-flaw/">we&#8217;ve established that changes to Happstack are needed to support streaming</a>, what should those changes look like?  Abstractly, there needs to be a way to give it a series of chunks that it can send to the browser individually via chunked transfer encoding, flushing the network buffer after each one to make sure they&#8217;re sent promptly.</p>
<p>The most obvious way to do this is to add a third data constructor to the <code>Response</code> data type, for use with streams.  The main difference would be that, instead of accepting a single <code>ByteString</code> as the response to the browser, it would somehow get ahold of several.</p>
<p>What would that &#8220;somehow&#8221; be, though?  Looking back to our original motivation (the real-time multiplayer game being played in a web browser), the entire contents of the stream aren&#8217;t known at the time it starts, and its contents will be determined by the actions of the various players &#8212; i.e., things happening in the <code>IO</code> monad.  This means a simple list of <code>ByteString</code>s isn&#8217;t the way to go.  Although we did demonstrate such a list could be built anyway using some trickery, ideally we&#8217;d want something a bit more elegant, or at least one that doesn&#8217;t require actively subverting the type system via <code>unsafeInterleaveIO</code>.</p>
<p>Therefore we&#8217;d want to put an <code>IO</code> action in the response that could be used to generate new <code>ByteString</code>s on demand for each chunk.  The simplest way would be to use an <code>IO&nbsp;ByteString</code>:</p>
<div class="vim"><code><span class="Type">data</span> Response <span class="Statement">=</span> <span class="Comment">{&#0045; Response and SendFile data constructors&#0046;&#0046;&#0046; &#0045;}</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="Statement">|</span> Chunked { rsCode&nbsp; &nbsp; &nbsp; <span class="Statement">::</span> Int,<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; rsHeaders&nbsp;&nbsp; <span class="Statement">::</span> Headers,<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; rsFlags&nbsp; &nbsp;&nbsp; <span class="Statement">::</span> RsFlags,<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; rsValidator <span class="Statement">::</span> Maybe (Response <span class="Statement">&#0045;&gt;</span> IO Response),<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; chGenerator <span class="Statement">::</span> IO L&#0046;ByteString<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; }</code></div>
<p>Happstack could execute that action repeatedly to generate successive chunks, and we could even denote end-of-stream by having it produce an empty <code>ByteString</code>, which neatly parallels how chunked transfer encoding works.</p>
<p>An alternative would be something that includes an explicit state parameter that gets chained from one call to the <code>IO</code> action to the next.  Without something like that, it would be awkward (albeit not impossible) for the action to keep track of where it&#8217;s at in the stream and what it should generate next.</p>
<div class="vim"><code><span class="Type">data</span> Response a <span class="Statement">=</span> <span class="Comment">{&#0045; Response and SendFile data constructors&#0046;&#0046;&#0046; &#0045;}</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="Statement">|</span> Chunked { rsCode&nbsp; &nbsp; &nbsp; &nbsp;&nbsp; <span class="Statement">::</span> Int,<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; rsHeaders&nbsp; &nbsp; &nbsp; <span class="Statement">::</span> Headers,<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; rsFlags&nbsp; &nbsp; &nbsp; &nbsp; <span class="Statement">::</span> RsFlags,<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; rsValidator&nbsp; &nbsp; <span class="Statement">::</span> Maybe (Response a <span class="Statement">&#0045;&gt;</span> IO (Response a)),<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; chGenerator&nbsp; &nbsp; <span class="Statement">::</span> a <span class="Statement">&#0045;&gt;</span> (a, IO L&#0046;ByteString),<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; chInitialState <span class="Statement">::</span> a<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; }</code></div>
<p>Unfortunately, that changes the <a href="http://en.wikipedia.org/wiki/Kind_%28type_theory%29">kind</a> of <code>Response</code> from <code>*</code> to <code>* &#0045;&gt; *</code> in order to account for the new type parameter <code>a</code> representing whatever state the stream wants to keep track of.  Since <code>Response</code> is used throughout Happstack and programs built on it, breaking <a href="http://en.wikipedia.org/wiki/Application_programming_interface">API</a> compatibility like that really ought to be avoided if we can help it, especially when it&#8217;s just for the sake of what would be an infrequently used feature.</p>
<p>Instead, what if we turn things around and put the generator in the driver&#8217;s seat: pass <em>it</em> an <code>IO</code> action to call whenever it has a new chunk ready to be sent:</p>
<div class="vim"><code><span class="Type">data</span> Response <span class="Statement">=</span> <span class="Comment">{&#0045; Response and SendFile data constructors&#0046;&#0046;&#0046; &#0045;}</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="Statement">|</span> Chunked { rsCode&nbsp; &nbsp; &nbsp; &nbsp;&nbsp; <span class="Statement">::</span> Int,<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; rsHeaders&nbsp; &nbsp; &nbsp; <span class="Statement">::</span> Headers,<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; rsFlags&nbsp; &nbsp; &nbsp; &nbsp; <span class="Statement">::</span> RsFlags,<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; rsValidator&nbsp; &nbsp; <span class="Statement">::</span> Maybe (Response <span class="Statement">&#0045;&gt;</span> IO Response),<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; chGenerator&nbsp; &nbsp; <span class="Statement">::</span> (L&#0046;ByteString <span class="Statement">&#0045;&gt;</span> IO ()) <span class="Statement">&#0045;&gt;</span> IO ()<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; }</code></div>
<p>Happstack would invoke <code>chGenerator</code> with a function that handles writing a chunk to the network and doing whatever it needs to do when the stream is over.  The last thing <code>chGenerator</code> would do is call that function with an empty <code>ByteString</code> to signal end-of-stream.  <code>chGenerator</code> would be responsible for chaining any state information from one step to the next.  It would actually look rather like the pipe example from earlier; the function provided by Happstack would be used in place of <code>hPrint</code> and <code>hClose</code>, but other than that it&#8217;s the same basic idea.</p>
<p>There&#8217;s still the issue of signaling to the generator when it should stop because the network connection closed.  But hey, we&#8217;ve got a perfectly good return value from the Happstack-provided function that we&#8217;re not using.  Let&#8217;s use it:</p>
<div class="vim"><code><span class="Type">data</span> Response <span class="Statement">=</span> <span class="Comment">{&#0045; Response and SendFile data constructors&#0046;&#0046;&#0046; &#0045;}</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="Statement">|</span> Chunked { rsCode&nbsp; &nbsp; &nbsp; &nbsp;&nbsp; <span class="Statement">::</span> Int,<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; rsHeaders&nbsp; &nbsp; &nbsp; <span class="Statement">::</span> Headers,<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; rsFlags&nbsp; &nbsp; &nbsp; &nbsp; <span class="Statement">::</span> RsFlags,<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; rsValidator&nbsp; &nbsp; <span class="Statement">::</span> Maybe (Response <span class="Statement">&#0045;&gt;</span> IO Response),<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; chGenerator&nbsp; &nbsp; <span class="Statement">::</span> (L&#0046;ByteString <span class="Statement">&#0045;&gt;</span> IO Bool) <span class="Statement">&#0045;&gt;</span> IO ()<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; }</code></div>
<p>The Happstack-provided function returns <code>True</code> if more data should be generated, or <code>False</code> if it should be aborted.</p>
<p>That ought to work pretty well.  It addresses all the problems identified with our attempts to stream without changing Happstack.  The use of <code>Chunked</code> as the data constructor for the <code>Response</code> object will tell Happstack to suppress the <tt>Content-Length</tt> header, use chunked transfer encoding, and flush the network buffer after each chunk.  New data to stream is only generated as needed, without using additional threads or large buffers.  API compatibility with existing code is preserved; we&#8217;re adding a new interface and leaving existing ones untouched.  Even better, there&#8217;s no need to use any trickery to achieve lazy IO; with Happstack&#8217;s cooperation, the usual kind of IO works just fine.</p>
<p>Mind you, I haven&#8217;t written a patch that implements this proposal.  It&#8217;s just an idea.  At the very least, I&#8217;d want to have one realistic application (i.e., not a simple proof-of-concept like from earlier) that demonstrates how it can be used and to verify that it does indeed work correctly before actually submitting a patch.  After all, as any good programmer can tell you, sometimes flaws in a design don&#8217;t become apparent until you actually try to implement them.  But it <em>seems</em> like this ought to work.</p>
<p>Now I just have to get around to writing the application whose idea motivated me to investigate the feasibility of streaming data from Happstack to begin with.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kuliniewicz.org/blog/archives/2010/01/29/happstack-and-streaming-part-5-modifying-happstack/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Happstack and Streaming: Part 4: The Flaw</title>
		<link>http://www.kuliniewicz.org/blog/archives/2010/01/27/happstack-and-streaming-part-4-the-flaw/</link>
		<comments>http://www.kuliniewicz.org/blog/archives/2010/01/27/happstack-and-streaming-part-4-the-flaw/#comments</comments>
		<pubDate>Wed, 27 Jan 2010 15:00:47 +0000</pubDate>
		<dc:creator>Paul Kuliniewicz</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[happstack]]></category>
		<category><![CDATA[haskell]]></category>
		<category><![CDATA[streaming]]></category>

		<guid isPermaLink="false">http://www.kuliniewicz.org/blog/?p=1741</guid>
		<description><![CDATA[The Fatal Flaw All three approaches to generating a lazy ByteString from the IO monad actually do work, as you can verify by loading the source code into ghci and invoking them manually. However, if you go through the web server and visit one of the finite stream paths, no output will appear until the [...]]]></description>
			<content:encoded><![CDATA[<h3>The Fatal Flaw</h3>
<p>All <a href="http://www.kuliniewicz.org/blog/archives/2010/01/25/happstack-and-streaming-part-3-implementation-sort-of/">three approaches</a> to generating a lazy <code>ByteString</code> from the <code>IO</code> monad actually do work, as you can verify by loading the source code into <tt>ghci</tt> and invoking them manually.  However, if you go through the web server and visit one of the finite stream paths, no output will appear until the stream has finished being generated, at which point the entire set of output from the server will arrive all at once, like so:</p>
<blockquote><p><tt>paul@queeg:~/tmp$ GET &#0045;es http://localhost:8000/pipe/limited<br /></tt><br />
<i>(nothing happens for a while, and then&#8230;)</i><br />
<tt>200 OK<br />
Connection: close<br />
Date: Mon, 18 Jan 2010 13:30:55 GMT<br />
Server: Happstack/0&#0046;4&#0046;1<br />
Content&#0045;Type: te&#120;t/html; charset=utf&#0045;8<br />
Client&#0045;Date: Mon, 18 Jan 2010 13:30:58 GMT<br />
Client&#0045;Peer: 127&#0046;0&#0046;0&#0046;1:8000<br />
Client&#0045;Response&#0045;Num: 1<br />
&nbsp;<br />
2010&#0045;01&#0045;18 13:30:55&#0046;342444 UTC<br />
2010&#0045;01&#0045;18 13:30:55&#0046;443235 UTC<br />
2010&#0045;01&#0045;18 13:30:55&#0046;544115 UTC<br />
2010&#0045;01&#0045;18 13:30:55&#0046;644934 UTC<br />
2010&#0045;01&#0045;18 13:30:55&#0046;745514 UTC<br />
2010&#0045;01&#0045;18 13:30:55&#0046;846283 UTC<br />
2010&#0045;01&#0045;18 13:30:55&#0046;947581 UTC<br />
2010&#0045;01&#0045;18 13:30:56&#0046;048834 UTC<br />
2010&#0045;01&#0045;18 13:30:56&#0046;150068 UTC<br />
2010&#0045;01&#0045;18 13:30:56&#0046;251281 UTC<br />
2010&#0045;01&#0045;18 13:30:56&#0046;352521 UTC<br />
2010&#0045;01&#0045;18 13:30:56&#0046;454423 UTC<br />
2010&#0045;01&#0045;18 13:30:56&#0046;555816 UTC<br />
2010&#0045;01&#0045;18 13:30:56&#0046;657179 UTC<br />
2010&#0045;01&#0045;18 13:30:56&#0046;758547 UTC<br />
2010&#0045;01&#0045;18 13:30:56&#0046;859939 UTC<br />
2010&#0045;01&#0045;18 13:30:56&#0046;961296 UTC<br />
2010&#0045;01&#0045;18 13:30:57&#0046;062581 UTC<br />
2010&#0045;01&#0045;18 13:30:57&#0046;163847 UTC<br />
2010&#0045;01&#0045;18 13:30:57&#0046;265212 UTC<br />
2010&#0045;01&#0045;18 13:30:57&#0046;366438 UTC<br />
2010&#0045;01&#0045;18 13:30:57&#0046;4677 UTC<br />
2010&#0045;01&#0045;18 13:30:57&#0046;569059 UTC<br />
2010&#0045;01&#0045;18 13:30:57&#0046;670414 UTC<br />
2010&#0045;01&#0045;18 13:30:57&#0046;772045 UTC<br />
2010&#0045;01&#0045;18 13:30:57&#0046;873404 UTC<br />
2010&#0045;01&#0045;18 13:30:57&#0046;974761 UTC<br />
2010&#0045;01&#0045;18 13:30:58&#0046;076117 UTC<br />
2010&#0045;01&#0045;18 13:30:58&#0046;177485 UTC<br />
2010&#0045;01&#0045;18 13:30:58&#0046;278847 UTC<br />
</tt></p></blockquote>
<p>The point where the delay occurs reveals what&#8217;s going on &#8212; not even the headers are getting sent out until the entire response has been generated.  That&#8217;s not Happstack&#8217;s doing; it&#8217;s the buffering happening inside the networking library.  In the absence of any command to send data out immediately, it&#8217;s going to wait until it has a large chunk it can send out immediately.  Sending one large packet instead of lots of small packets makes more efficient use of network bandwidth, since each packet carries its own overhead.  And since Happstack wasn&#8217;t written with streaming in mind, it doesn&#8217;t flush the buffer until it&#8217;s written out the complete response.</p>
<p>As further evidence, the infinite streams do stream, kind of.  Once the buffer fills up, the networking library will push it out to make room for more data.  As a result, nothing will arrive for a while, then all of a sudden lots of data will arrive, then another long pause as the buffer fills back up, then another big chunk of data, and so on.</p>
<p>This alone shows why, despite our best efforts at cleverly creating the response, it&#8217;s all for nothing unless we can control the buffering behavior down in the network library, which Happstack doesn&#8217;t provide any access to.  The only exception would be if we&#8217;re trying to stream data quickly enough to rapidly fill up the buffer, but since there&#8217;s also no way to control the size of the buffer, that &#8220;solution&#8221; isn&#8217;t reliable, and certainly not applicable if we&#8217;re only trying to stream a relative trickle of information.</p>
<h3>Buffering Strikes Back</h3>
<p>Buffering introduces additional problems that, while they don&#8217;t kill the solution outright, adds some significant inefficiencies.  These are easiest to see with the infinite streams, which continue until the browser closes the connection.  (They&#8217;d also arise any time the browser closes a finite stream before receiving all the data.)</p>
<p>First, Happstack only detects that the connection to the browser has been closed once the network library tries to send out data and returns an error.  Between the time when the connection actually closes (i.e. when the browser sends a TCP FIN packet) and the time when Happstack notices, the program continues generating new data to send.  All this effort is wasted, since the data will never get sent out and will be thrown away.  The app server winds up doing a lot of useless work as a result.</p>
<p>Being able to control network-level buffering would largely deal with this problem too: since a closed connection is detected when trying to send data, sending each chunk out immediately would allow the app server to stop generating the stream much more promptly.  If that were the case, the approach of manually using <code>unsafeInterleaveIO</code>, despite being the most difficult of the three, would work fairly well.  The other two, however, have their own buffering problems, independent of what&#8217;s happening at the network level.</p>
<p>For example, what is a pipe but a buffer being managed by the operating system?  Since a separate thread is writing data to the pipe independently of the thread reading from it, the generation thread will keep on going even if the network connection closes, until the pipe fills up.  In theory the OS should cause writes to fail as soon as the read end of the pipe is closed, but using lazy IO to read from it seems to keep this from happening promptly.  The generation thread will keep writing more data to the pipe until it too gets a write error and stops.</p>
<p>Using channels is even worse.  The network buffer and the pipe at least have the benefit of being of finite size; once they fill up, further attempts to write will block until something reads the data back out or an error is detected and the buffer is destroyed.  Channels, however, are <em>unbounded</em>.  They never fill up; they just keep growing to make room for new data as it&#8217;s written.  As a result, in the event of the browser closing the connection prematurely, the size of the channel will grow and grow until the thread writing data to it decides to stop.  This is a problem for the infinite streams, since they <em>never</em> stop; eventually the channel will grow to consume all available memory on the system until the OS kills the app server entirely.  This is also a problem for finite streams, of course, since those channels won&#8217;t get thrown away until they grow to the size of the unconsumed portion of the stream, which would be a big problem if the app server is generating lots of streams.</p>
<h3>Playing Nice</h3>
<p>But even if all the buffering problems can be dealt with, our solution <em>still</em> is far less than ideal.  While the idea of slowly trickling out the stream&#8217;s data as it becomes available is legal according to the definition of HTTP, it&#8217;s really not the proper way to go about it.</p>
<p>Remember how we had to suppress the <tt>Content-Length</tt> header?  Browsers use that header to know when they can stop reading the response from the server.  Without it, the only way they can tell they&#8217;ve received all the data is when the server closes the connection.  Leaving the connection open has the advantage that the browser can re-use it for the next request it sends to the server, instead of creating a new connection.  Establishing a new connection requires doing the <a href="http://en.wikipedia.org/wiki/Transmission_Control_Protocol#Connection_establishment">TCP three-way handshake</a> again, which involves a round trip to the server that doesn&#8217;t carry any data.  Being able to reuse the connection is faster since this extra round trip is eliminated.  It might not seem like much, but consider a web page that has lots of small graphics on it; without being able to reuse a connection, a new round trip is needed <em>every time</em> the browser tries to download another image.  All those little delays add up.</p>
<p>It turns out HTTP does have a way to stream data while still telling the browser how much data to expect: <a href="http://en.wikipedia.org/wiki/Chunked_transfer_encoding">chunked transfer encoding</a>.  Basically, the server&#8217;s response gets split up into separate chunks, and each chunks carries its own length information.  The end of the stream is indicated by a zero-length chunk.  With chunked transfer encoding, the browser knows when it&#8217;s finished receiving the data, even though the server doesn&#8217;t necessarily know how much data will be sent beforehand.</p>
<p>Chunked transfer encoding is what we&#8217;d want the server to be able to do.  Of course, Happstack would need to be modified to support it, since it too needs to know when the stream has ended so it can reuse the connection for the next request from the browser.</p>
<p>In Part 5, we&#8217;ll look at just what sort of modifications we might try to make.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kuliniewicz.org/blog/archives/2010/01/27/happstack-and-streaming-part-4-the-flaw/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Happstack and Streaming: Part 3: Implementation (Sort Of)</title>
		<link>http://www.kuliniewicz.org/blog/archives/2010/01/25/happstack-and-streaming-part-3-implementation-sort-of/</link>
		<comments>http://www.kuliniewicz.org/blog/archives/2010/01/25/happstack-and-streaming-part-3-implementation-sort-of/#comments</comments>
		<pubDate>Mon, 25 Jan 2010 15:00:27 +0000</pubDate>
		<dc:creator>Paul Kuliniewicz</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[happstack]]></category>
		<category><![CDATA[haskell]]></category>
		<category><![CDATA[streaming]]></category>

		<guid isPermaLink="false">http://www.kuliniewicz.org/blog/?p=1732</guid>
		<description><![CDATA[Implementation For our proof-of-concept for trying to do streaming with Happstack, here&#8217;s a simple web application that implements each of the three possible approaches discussed earlier. To keep things simple, the data we&#8217;ll be streaming is a series of timestamps taken from the system clock at regular intervals. Not a particularly useful application, but it [...]]]></description>
			<content:encoded><![CDATA[<h3>Implementation</h3>
<p>For our <a href="http://www.kuliniewicz.org/blog/archives/2010/01/22/happstack-and-streaming-part-2-lazy-io/">proof-of-concept for trying to do streaming with Happstack</a>, here&#8217;s a simple web application that implements each of the three possible approaches discussed earlier.  To keep things simple, the data we&#8217;ll be streaming is a series of timestamps taken from the system clock at regular intervals.  Not a particularly useful application, but it is something simple that uses IO, which is all we&#8217;re looking for here.  To make things a bit more interesting, the web app will support each approach two different ways: first, streaming a finite number of timestamp values before ending the stream, and second, streaming an endless series of timestamp values until the browser closes the connection.</p>
<p>Specifically, the web app will support the following six paths:</p>
<dl>
<dt><tt>/pipe/limited</tt></dt>
<dd>A finite number of timestamps, using a OS-level pipe.</dd>
<dt><tt>/pipe/infinite</tt></dt>
<dd>An infinite number of timestamps, using an OS-level pipe.</dd>
<dt><tt>/chan/limited</tt></dt>
<dd>A finite number of timestamps, using a <code>Chan</code>.</dd>
<dt><tt>/chan/infinite</tt></dt>
<dd>An infinite number of timestamps, using a <code>Chan</code>.</dd>
<dt><tt>/manual/limited</tt></dt>
<dd>A finite number of timestamps, using <code>unsafeInterleaveIO</code> manually.</dd>
<dt><tt>/manual/infinite</tt></dt>
<dd>An infinite number of timestamps, using <code>unsafeInterleaveIO</code> manually.</dd>
</dl>
<p>First, let&#8217;s get all the module imports out of the way.  There&#8217;s nothing particularly interesting about any of them, so I won&#8217;t comment on them further.</p>
<div class="vim"><code><span class="Special">{&#0045;# LANGUAGE Fle&#120;ibleConte&#120;ts #&#0045;}</span><br />
&nbsp;<br />
<span class="Type">module</span> Main (main) <span class="Type">where</span><br />
&nbsp;<br />
<span class="PreProc">import</span> Control&#0046;Concurrent (forkIO, threadDelay)<br />
<span class="PreProc">import</span> Control&#0046;Concurrent&#0046;Chan (Chan, getChanContents, newChan, writeChan)<br />
<span class="PreProc">import</span> Control&#0046;Monad (liftM, MonadPlus, msum, when)<br />
<span class="PreProc">import</span> Control&#0046;Monad&#0046;Trans (liftIO, MonadIO)<br />
<span class="PreProc">import</span> Data&#0046;Time&#0046;Clock (diffUTCTime, getCurrentTime)<br />
<span class="PreProc">import</span> Happstack&#0046;Server&#0046;HTTP&#0046;Types (Method (&#0046;&#0046;), noContentLength, nullConf, Response, resultBS)<br />
<span class="PreProc">import</span> Happstack&#0046;Server&#0046;SimpleHTTP (dir, FilterMonad, internalServerError, methodSP, nullDir,<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; ServerMonad, simpleHTTP, toResponse)<br />
<span class="PreProc">import</span> System&#0046;IO (Handle, hClose, hFlush, hPrint, hPutStrLn, stderr)<br />
<span class="PreProc">import</span> System&#0046;IO&#0046;Unsafe (unsafeInterleaveIO)<br />
<span class="PreProc">import</span> System&#0046;Posi&#120;&#0046;IO (createPipe, fdToHandle)<br />
&nbsp;<br />
<span class="PreProc">import</span> <span class="PreProc">qualified</span> Data&#0046;ByteString&#0046;Char8 <span class="PreProc">as</span> S<br />
<span class="PreProc">import</span> <span class="PreProc">qualified</span> Data&#0046;ByteString&#0046;Lazy&#0046;Char8 <span class="PreProc">as</span> L</code></div>
<p>Next is the code that sets up the Happstack server with the six paths mentioned above.</p>
<div class="vim"><code>main <span class="Statement">::</span> IO ()<br />
main <span class="Statement">=</span> simpleHTTP nullConf root<br />
&nbsp;<br />
root <span class="Statement">::</span> (ServerMonad m, MonadPlus m, MonadIO m, FilterMonad Response m) <span class="Statement">=&gt;</span> m Response<br />
root <span class="Statement">=</span> msum [ dir <span class="Constant">&quot;pipe&quot;</span>&nbsp;&nbsp; <span class="Statement">$</span> subdir outputPipe<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; , dir <span class="Constant">&quot;chan&quot;</span>&nbsp;&nbsp; <span class="Statement">$</span> subdir outputChan<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; , dir <span class="Constant">&quot;manual&quot;</span> <span class="Statement">$</span> subdir outputManual<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; ]<br />
&nbsp;<br />
subdir <span class="Statement">::</span> (ServerMonad m, MonadPlus m, MonadIO m) <span class="Statement">=&gt;</span> ((Int <span class="Statement">&#0045;&gt;</span> Int) <span class="Statement">&#0045;&gt;</span> IO Response) <span class="Statement">&#0045;&gt;</span> m Response<br />
subdir output <span class="Statement">=</span> msum [ dir <span class="Constant">&quot;limited&quot;</span>&nbsp; <span class="Statement">$</span> nullDir <span class="Statement">&gt;&gt;</span> methodSP GET (liftIO <span class="Statement">$</span> output decr)<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; , dir <span class="Constant">&quot;infinite&quot;</span> <span class="Statement">$</span> nullDir <span class="Statement">&gt;&gt;</span> methodSP GET (liftIO <span class="Statement">$</span> output id)<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; ]<br />
&nbsp; &nbsp; <span class="Type">where</span> decr n <span class="Statement">=</span> n <span class="Statement">&#0045;</span> <span class="Constant">1</span></code></div>
<p>To support both finite and infinite streams, each stream generator takes an initial counter and a decrement function.  After generating each timestamp value, it applies the decrement function to the counter, and stops if the counter reaches zero.  For finite streams, the decrement function subtracts one from the counter.  For infinite streams, it does nothing, so that the counter never reaches zero.</p>
<p>Yeah, it&#8217;s kind of an ugly hack, but it&#8217;s good enough for this.</p>
<p>Here&#8217;s the code for generating a stream using OS-level pipes:</p>
<div class="vim"><code>outputPipe <span class="Statement">::</span> (Int <span class="Statement">&#0045;&gt;</span> Int) <span class="Statement">&#0045;&gt;</span> IO Response<br />
outputPipe decr <span class="Statement">=</span> <span class="Statement">do</span> h <span class="Statement">&lt;&#0045;</span> pipeClock decr limitedCount<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; bs <span class="Statement">&lt;&#0045;</span> L&#0046;hGetContents h<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; return <span class="Statement">$</span> streamBS bs<br />
&nbsp;<br />
pipeClock <span class="Statement">::</span> (Int <span class="Statement">&#0045;&gt;</span> Int) <span class="Statement">&#0045;&gt;</span> Int <span class="Statement">&#0045;&gt;</span> IO Handle<br />
pipeClock decr n <span class="Statement">=</span> <span class="Statement">do</span> (readFd, writeFd) <span class="Statement">&lt;&#0045;</span> createPipe<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; writeH <span class="Statement">&lt;&#0045;</span> fdToHandle writeFd<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; forkIO <span class="Statement">$</span> output writeH n <span class="Statement">`catch`</span> abort writeH<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; fdToHandle readFd<br />
&nbsp; &nbsp; <span class="Type">where</span> output h <span class="Constant">0</span> <span class="Statement">=</span> <span class="Statement">do</span> hPutStrLn stderr <span class="Constant">&quot;closing pipe&quot;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; hClose h<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; output h n <span class="Statement">=</span> <span class="Statement">do</span> now <span class="Statement">&lt;&#0045;</span> getCurrentTime<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; hPrint h now<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; hFlush h<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; tick now<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; threadDelay interval<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; output h (decr n)<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; abort h e <span class="Statement">=</span> <span class="Statement">do</span> hPutStrLn stderr <span class="Statement">$</span> <span class="Constant">&quot;caught error: &quot;</span> <span class="Statement">++</span> show e<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; hClose h</code></div>
<p>The above code creates a pipe.  A new thread is forked off to repeatedly write timestamps into the pipe.  For a finite stream, the new thread closes its end of the pipe when it&#8217;s done writing.  Meanwhile, the original thread uses lazy IO to read the entire contents of the pipe into a lazy <code>ByteString</code>, which gets passed back down to Happstack for sending to the browser.</p>
<p>Here&#8217;s the code for generating a stream using an IO channel:</p>
<div class="vim"><code>outputChan <span class="Statement">::</span> (Int <span class="Statement">&#0045;&gt;</span> Int) <span class="Statement">&#0045;&gt;</span> IO Response<br />
outputChan decr <span class="Statement">=</span> <span class="Statement">do</span> ch <span class="Statement">&lt;&#0045;</span> chanClock decr limitedCount<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; chunks <span class="Statement">&lt;&#0045;</span> getChanContents ch<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; <span class="Statement">let</span> bs <span class="Statement">=</span> L&#0046;fromChunks <span class="Statement">$</span> takeWhile (not <span class="Statement">&#0046;</span> S&#0046;null) chunks<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; return <span class="Statement">$</span> streamBS bs<br />
&nbsp;<br />
chanClock <span class="Statement">::</span> (Int <span class="Statement">&#0045;&gt;</span> Int) <span class="Statement">&#0045;&gt;</span> Int <span class="Statement">&#0045;&gt;</span> IO (Chan S&#0046;ByteString)<br />
chanClock decr n <span class="Statement">=</span> <span class="Statement">do</span> ch <span class="Statement">&lt;&#0045;</span> newChan<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; forkIO <span class="Statement">$</span> output ch n<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; return ch<br />
&nbsp; &nbsp; <span class="Type">where</span> output ch <span class="Constant">0</span> <span class="Statement">=</span> <span class="Statement">do</span> hPutStrLn stderr <span class="Constant">&quot;done writing to channel&quot;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; writeChan ch S&#0046;empty<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; output ch n <span class="Statement">=</span> <span class="Statement">do</span> now <span class="Statement">&lt;&#0045;</span> getCurrentTime<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; writeChan ch <span class="Statement">$</span> chunkify now<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; tick now<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; threadDelay interval<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; output ch (decr n)</code></div>
<p>The above code does the same basic thing, just with a channel of strict <code>ByteString</code>s instead of a pipe.  A new thread is forked off to write timestamp values to the channel.  For a finite stream, the new thread writes an empty <code>ByteString</code> to indicate that it&#8217;s done.  Meanwhile, the orginal thread uses lazy IO to read the entire contents of the channel into a (lazy) list, up until an empty <code>ByteString</code>.  It then lazily combines the individual strict <code>ByteString</code>s into a single lazy <code>ByteString</code> that it then passes down to Happstack.</p>
<p>The two above approaches each use a new thread that writes a timestamp, waits for the specified interval, writes another timestamp, and so on.  Thus, the two code snippets look pretty similar in basic structure.  Not so for using <code>unsafeInterleaveIO</code> directly:</p>
<div class="vim"><code>outputManual <span class="Statement">::</span> (Int <span class="Statement">&#0045;&gt;</span> Int) <span class="Statement">&#0045;&gt;</span> IO Response<br />
outputManual decr <span class="Statement">=</span> <span class="Statement">do</span> bs <span class="Statement">&lt;&#0045;</span> manualClock decr limitedCount<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; return <span class="Statement">$</span> streamBS bs<br />
&nbsp;<br />
manualClock <span class="Statement">::</span> (Int <span class="Statement">&#0045;&gt;</span> Int) <span class="Statement">&#0045;&gt;</span> Int <span class="Statement">&#0045;&gt;</span> IO L&#0046;ByteString<br />
manualClock decr n <span class="Statement">=</span> <span class="Statement">do</span> now <span class="Statement">&lt;&#0045;</span> getCurrentTime<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; tick now<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; allFutureChunks <span class="Statement">&lt;&#0045;</span> unsafeInterleaveIO <span class="Statement">$</span> ticksAfter now<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="Statement">let</span> futureChunks <span class="Statement">=</span> map fst <span class="Statement">$</span> zip allFutureChunks countdown<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; return <span class="Statement">&#0046;</span> L&#0046;fromChunks <span class="Statement">$</span> chunkify now <span class="Statement">:</span> futureChunks<br />
&nbsp; &nbsp; <span class="Type">where</span> ticksAfter since <span class="Statement">=</span> <span class="Statement">do</span> now <span class="Statement">&lt;&#0045;</span> getCurrentTime<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="Statement">let</span> delta <span class="Statement">=</span> diffUTCTime now since<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; when (delta <span class="Statement">&lt;</span> interval <span class="Statement">/</span> <span class="Constant">1000000</span>) <span class="Statement">$</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; threadDelay (round (interval <span class="Statement">&#0045;</span> delta <span class="Statement">*</span> <span class="Constant">1000000</span>))<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; now' <span class="Statement">&lt;&#0045;</span> getCurrentTime<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; tick now'<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; futureChunks <span class="Statement">&lt;&#0045;</span> unsafeInterleaveIO <span class="Statement">$</span> ticksAfter now'<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; return <span class="Statement">$</span> chunkify now' <span class="Statement">:</span> futureChunks<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; countdown <span class="Statement">=</span> takeWhile (<span class="Statement">&gt;</span> <span class="Constant">0</span>) <span class="Statement">$</span> iterate decr (decr n)</code></div>
<p>Here, all the action takes place in a single thread.  Again, it takes the strategy of combining a bunch of strict <code>ByteString</code>s, one per timestamp, into a lazy <code>ByteString</code> with everything.  It generates the first timestamp immediately, but defers building the rest of the list until it&#8217;s needed.  When it is needed, it again generates the first timestamp of the rest of the list immediately, and defers building the rest.</p>
<p>It&#8217;s worth noting that this approach is the most difficult to write, since putting <code>unsafeInterleaveIO</code> in the right place is critical to making it work.  Also, since everything is happening in a single thread, strictly speaking it&#8217;s no longer good enough to just delay for the desired interval between timestamps, since there&#8217;s no telling how much time has been spent in other parts of the code.  Instead, it needs to check the clock <em>twice</em> each time around: first to figure out how long to delay, if any, and second to actually get the timestamp.  Not surprisingly, the code is also the hardest to read of the three.</p>
<p>Finally, there are some odds and ends that bear mentioning.  Here&#8217;s a simple function that converts a time value into a strict <code>ByteString</code>:</p>
<div class="vim"><code>chunkify <span class="Statement">::</span> Show a <span class="Statement">=&gt;</span> a <span class="Statement">&#0045;&gt;</span> S&#0046;ByteString<br />
chunkify <span class="Statement">=</span> S&#0046;pack <span class="Statement">&#0046;</span> (<span class="Statement">++</span> <span class="Constant">&quot;</span><span class="Special">&#0092;n</span><span class="Constant">&quot;</span>) <span class="Statement">&#0046;</span> show</code></div>
<p>Here&#8217;s a helper function that prints out a message whenever a new timestamp is generated, so we can watch the app server&#8217;s progress:</p>
<div class="vim"><code>tick <span class="Statement">::</span> Show a <span class="Statement">=&gt;</span> a <span class="Statement">&#0045;&gt;</span> IO ()<br />
tick &#120; <span class="Statement">=</span> hPutStrLn stderr <span class="Statement">$</span> <span class="Constant">&quot;tick &quot;</span> <span class="Statement">++</span> show &#120;</code></div>
<p>As you can see from the type signatures of those two utility functions, there&#8217;s nothing that makes them unique to time values; <em>any</em> value that can be converted to a string (i.e., anything belonging to the <code>Show</code> class), works.  We just happen to only use them with <code>UTCTime</code>s.</p>
<p>Anyway, here&#8217;s a simple function for converting a lazy <code>ByteString</code> into the <code>Response</code> object that Happstack ultimately wants:</p>
<div class="vim"><code>streamBS <span class="Statement">::</span> L&#0046;ByteString <span class="Statement">&#0045;&gt;</span> Response<br />
streamBS <span class="Statement">=</span> noContentLength <span class="Statement">&#0046;</span> resultBS <span class="Constant">200</span></code></div>
<p>We explicitly tell Happstack not to add the <tt>Content-Length</tt> header, since to do that it would need to measure the length of the entire response, which would mean it can&#8217;t send anything until it sees the entire response, which defeats the entire point of what we&#8217;re trying to accomplish.</p>
<p>Last, and also least, here are the constants specifying the time interval between timestamps, and how many timestamps to generate in a finite stream before stopping:</p>
<div class="vim"><code>interval <span class="Statement">::</span> Num a <span class="Statement">=&gt;</span> a<br />
interval <span class="Statement">=</span> <span class="Constant">100000</span>&nbsp; &nbsp; &nbsp; &nbsp; <span class="Comment">&#0045;&#0045; microseconds</span><br />
&nbsp;<br />
limitedCount <span class="Statement">::</span> Num a <span class="Statement">=&gt;</span> a<br />
limitedCount <span class="Statement">=</span> <span class="Constant">30</span></code></div>
<p>That&#8217;s the entirety of the code.  If you have <a href="http://www.haskell.org/ghc/">GHC</a> installed, you can go ahead and compile the program and play around with it to find out whether or not any of the approaches we&#8217;ve tried actually work.  Since we didn&#8217;t pass any configuration parameters to Happstack, it will start a web server on port 8000 that you can point your favorite browser to, at any of the six paths we defined.</p>
<p>Stay tuned for <a href="http://www.kuliniewicz.org/blog/archives/2010/01/27/happstack-and-streaming-part-4-the-flaw/">Part 4</a>, where we&#8217;ll see why (spoiler alert!) none of the three approaches actually work.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kuliniewicz.org/blog/archives/2010/01/25/happstack-and-streaming-part-3-implementation-sort-of/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Happstack and Streaming: Part 2: Lazy IO</title>
		<link>http://www.kuliniewicz.org/blog/archives/2010/01/22/happstack-and-streaming-part-2-lazy-io/</link>
		<comments>http://www.kuliniewicz.org/blog/archives/2010/01/22/happstack-and-streaming-part-2-lazy-io/#comments</comments>
		<pubDate>Fri, 22 Jan 2010 15:00:31 +0000</pubDate>
		<dc:creator>Paul Kuliniewicz</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[happstack]]></category>
		<category><![CDATA[haskell]]></category>
		<category><![CDATA[streaming]]></category>

		<guid isPermaLink="false">http://www.kuliniewicz.org/blog/?p=1728</guid>
		<description><![CDATA[Two Haskell features complicate our attempt at streaming data from Happstack: lazy evaluation, and its handling of IO. Laziness Unlike more mainstream languages, Haskell evaluates expressions lazily; it doesn&#8217;t actually compute the value until it&#8217;s actually used. This has some interesting benefits. For example, it&#8217;s quite easy to create infinitely long lists without requiring infinite [...]]]></description>
			<content:encoded><![CDATA[<p>Two Haskell features complicate <a href="http://www.kuliniewicz.org/blog/archives/2010/01/20/happstack-and-streaming-part-1-introduction/">our attempt at streaming data from Happstack</a>: lazy evaluation, and its handling of IO.</p>
<h3>Laziness</h3>
<p>Unlike more mainstream languages, Haskell <a href="http://en.wikipedia.org/wiki/Lazy_evaluation">evaluates expressions lazily</a>; it doesn&#8217;t actually compute the value until it&#8217;s actually used.  This has some interesting benefits.  For example, it&#8217;s quite easy to create infinitely long lists without requiring infinite amounts of memory.  For example, the expression <code>[1&nbsp;..]</code> is a list of <em>all</em> positive integers.  Haskell code can pass that infinite list around like any other value, and as long as we don&#8217;t do something that requires actually trying to evaluate the entire list (such as trying to compute its <code>length</code>), we&#8217;re perfectly safe.</p>
<p>We can even do computations where we create multiple infinitely long lists and do operations on the entire thing, as long as we never try to use the entire result.  For example, here we take the aforementioned infinitely long list, split it up into evens and odds, add the two infinitely long lists pairwise, and look for the first element greater than 5,000:</p>
<div class="vim"><code>foo <span class="Statement">::</span> Integer<br />
foo <span class="Statement">=</span> head greaterThan5000<br />
&nbsp; &nbsp; <span class="Type">where</span> greaterThan5000 <span class="Statement">=</span> filter (<span class="Statement">&gt;</span> <span class="Constant">5000</span>) sums<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; sums&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="Statement">=</span> zipWith (<span class="Statement">+</span>) odds evens<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; odds&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="Statement">=</span> filter odd positives<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; evens&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; <span class="Statement">=</span> filter even positives<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; positives&nbsp; &nbsp; &nbsp;&nbsp; <span class="Statement">=</span> [<span class="Constant">1</span> <span class="Statement">&#0046;&#0046;</span>]</code></div>
<p>The five lists defined in the <code>where</code> clause above are all infinitely long, but that&#8217;s OK because the program never needs to evaluate more than a finite part of any of them to compute the value of <code>foo</code>.  (For the record, it&#8217;s 5003.)</p>
<p>So, problem solved, right?  The application server just needs to give Happstack a <code>ByteString</code><code>, which after all is just a compacted list of <a href="http://www.haskell.org/ghc/docs/latest/html/libraries/base-4.2.0.0/Data-Word.html#t%3AWord8">Word8</a></code>s or <a href="http://www.haskell.org/ghc/docs/latest/html/libraries/base-4.2.0.0/Data-Char.html#t%3AChar">Char</a>s, and that list can evaluate to the data we eventually want to send to the browser, once we figure out in the future what it needs to be.</p>
<p>Sadly, it&#8217;s not <em>that</em> easy; you&#8217;re forgetting another key property of Haskell that lets lazy evaluation work.</p>
<h3>Pure Functional IO</h3>
<p>Lazy evaluation works because Haskell is a <a href="http://en.wikipedia.org/wiki/Purely_functional">purely functional</a> language: expressions do not have <a href="http://en.wikipedia.org/wiki/Side_effect_%28computer_science%29">side effects</a>.  As a result, <a href="http://en.wikipedia.org/wiki/Function_%28computer_science%29">functions in Haskell</a> are much like <a href="http://en.wikipedia.org/wiki/Function_%28mathematics%29">functions in mathematics</a>: their output is entirely determined by their input parameters, and their only result is producing a new value.  Haskell functions can&#8217;t reference any values whose value might change, since values in a Haskell program <em>never</em> change.  This is why lazy evaluation works: it doesn&#8217;t matter <em>when</em> the program gets around to evaluating an expression, if ever, since its result will always be exactly the same.</p>
<p>However, this seems to prevent a Haskell program from interacting with the outside world, since the system running a Haskell program, much like the rest of the universe, is not purely functional.  Any operation that interacts with the world outside the Haskell program could be affected by whatever happens to be going on at the time the operation is run.</p>
<p>As a simple example, consider the <code>time()</code> function in <a href="http://en.wikipedia.org/wiki/C_%28programming_language%29">C</a> programming on Linux.  <code>time()</code> returns the current time on the system, and will obviously return a different result depending on what time it is when it gets called.  Reading from a file is similar; the result returned will depend on what&#8217;s stored in the file at the time it&#8217;s read, which could change if something else writes to the file.</p>
<p>Interaction with the outside world is needed for a program to do anything useful, so how does Haskell get around this?  Via a bit of trickery known as the <code>IO</code> monad.  <a href="http://www.haskell.org/all_about_monads/html/index.html">Monads</a> can be a bit tricky to get your head around initially, but basically they&#8217;re just a way to sequence operations, with the monad doing something to take the output on one operation and give it as input to the next.  The particular monad being used gets to decide what &#8220;sequencing operations&#8221; means, as though you could redefine what the semicolon means in C.  Although most monads have a way to both put values into and take values out of a monad, the <code>IO</code> monad only lets you put values in.  Although you can also run a function on the value inside the <code>IO</code> monad, the result will itself also be in the <code>IO</code> monad.  There is no escape from the <code>IO</code> monad.</p>
<p>What&#8217;s the point?  Conceptually, the <code>IO</code> monad is just another type of <a href="http://www.haskell.org/all_about_monads/html/statemonad.html">state monad</a>, which carries another value (the state) from one operation to the next.  In the <code>IO</code> monad&#8217;s case, that state is merely the state of the entire universe outside the Haskell program.  Anything needing to interact with the outside world runs inside the <code>IO</code> monad, which as a result orders those operations into a particular sequence.</p>
<p>What does this have to do with anything?  Recall from earlier that our use case is streaming the current state of an interactive game as it changes over time in response to input from the players over a network.  That input-over-a-network is IO and thus runs inside the <code>IO</code> monad, and thus too must anything that uses the result.  So really, our hypothetical function that creates the data to stream back to the browser doesn&#8217;t &#8212; <em>can&#8217;t</em> &#8212; just return a result of type <code>ByteString</code>.  No, it has to return a result of type <code>IO ByteString</code>.</p>
<p>The good news is, the function the application server implements for Happstack to create a <code>Response</code> runs in the <code>IO</code> monad, so this is legal.  The bad news is that the <code>IO</code> monad truly <em>sequences</em> operations: the entirety of our hypothetical result-creating function has to execute <em>before</em> the result can be given to Happstack to send it to the browser.  Either the result-creating function returns right away, and thus can never see the result of other players&#8217; actions later, or it waits until those are handled, and can&#8217;t return anything until the game is over.</p>
<p>It seems that the rules of the <code>IO</code> monad prevent us from making this work.  </p>
<p><a href="http://www.youtube.com/watch?v=FbS4EvZuw1M#t=2m27s">Screw the rules, I have <strike>money</strike></a> lazy IO!</p>
<h3>Lazy IO</h3>
<p>Lazy IO lets a program bend the rules of lazy evaluation and <code>IO</code> sequencing a bit.  For example, consider the <a href="http://www.haskell.org/ghc/docs/latest/html/libraries/base-4.2.0.0/Prelude.html#v%3AreadFile"><code>readFile</code> function</a> in the Haskell standard library, whose type is the following:</p>
<div class="vim"><code>readFile <span class="Statement">::</span> FilePath <span class="Statement">&#0045;&gt;</span> IO String</code></div>
<p>Superficially, this seems to read the entire file in memory before returning, per the rules of the <code>IO</code> monad.  Which would make the following program extremely ill-advised:</p>
<div class="vim"><code><span class="PreProc">import</span> Data&#0046;Char (ord)<br />
&nbsp;<br />
main <span class="Statement">::</span> IO ()<br />
main <span class="Statement">=</span> <span class="Statement">do</span> zeroes <span class="Statement">&lt;&#0045;</span> readFile <span class="Constant">&quot;/dev/zero&quot;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; print <span class="Statement">&#0046;</span> take <span class="Constant">20</span> <span class="Statement">$</span> map ord zeroes</code></div>
<p>It reads in the contents of the file <a href="http://en.wikipedia.org/wiki//dev/zero"><code>/dev/zero</code></a>, converts the characters to their <a href="http://en.wikipedia.org/wiki/Unicode">Unicode</a> code point values, and prints the first 20 of them.  However, on any Unix-ish system, <code>/dev/zero</code> is a file that contains an <em>infinite</em> number of zero bytes.  A program can read from it as long as it wants, and never reach the end.</p>
<p>The Haskell program, of course, doesn&#8217;t know about this property of <code>/dev/zero</code>, yet <code>readFile</code> doesn&#8217;t try to read an endless series of bytes into memory.  Why not?  Because <code>readFile</code> is a bit special; it does its IO lazily.</p>
<p><code>readFile</code> isn&#8217;t alone.  The <a href="http://www.haskell.org/ghc/docs/latest/html/libraries/base-4.2.0.0/Control-Concurrent-Chan.html#v%3AgetChanContents"><code>getChanContents</code></a> function is similar:</p>
<div class="vim"><code>getChanContents <span class="Statement">::</span> Chan a <span class="Statement">&#0045;&gt;</span> IO [a]</code></div>
<p>It takes an object of type <code>Chan a</code> &#8212; a thread-safe unbounded queue of objects of an arbitrary type &#8212; and returns an infinite list of all items that are currently in the channel, as well as all items which will ever be written to the channel <em>in the future</em>.  It, too, does lazy IO.</p>
<p>How can this be?  If you dig into the source code of how these and similar functions are implemented (thanks to <a href="http://haskell.org/ghc/license.html">the Glasgow Haskell Compiler&#8217;s open-source license</a>, you can easily do this), and trace through the calls they make, you ultimately come to this interesting little function:</p>
<div class="vim"><code>unsafeInterleaveIO <span class="Statement">::</span> IO a <span class="Statement">&#0045;&gt;</span> IO a</code></div>
<p>The <a href="http://www.haskell.org/ghc/docs/latest/html/libraries/base-4.2.0.0/System-IO-Unsafe.html#v%3AunsafeInterleaveIO"><code>unsafeInterleaveIO</code></a> function converts any normal <code>IO</code> computation into a lazy one: one that executes not when the <code>IO</code> action would normally run, but instead when its value is actually used.  It is implemented using the <a href="http://en.wikipedia.org/wiki/Deep_magic">deeply magic</a> function named <a href="http://www.haskell.org/ghc/docs/latest/html/libraries/base-4.2.0.0/System-IO-Unsafe.html#v%3AunsafePerformIO"><code>unsafePerformIO</code></a>, which takes that &#8220;nothing ever escapes the <code>IO</code> monad&#8221; rule and punches it in the face:</p>
<div class="vim"><code>unsafePerformIO <span class="Statement">::</span> IO a <span class="Statement">&#0045;&gt;</span> a</code></div>
<p>As you might guess from the fact that their names both start with the word &#8220;unsafe&#8221;, and that they&#8217;re in the module named <a href="http://www.haskell.org/ghc/docs/latest/html/libraries/base-4.2.0.0/System-IO-Unsafe.html"><code>System.IO.Unsafe</code></a>, these functions are dangerous, since they let you bypass Haskell&#8217;s usual efforts to make lazy evaluation and IO not stab each other in the back.  In certain cases, lazy IO can be mostly safe.  For example, with <code>readFile</code>, you&#8217;re usually in trouble anyway if program A is reading a file while program B is writing it.  As long as nothing is actively writing to a file, it doesn&#8217;t matter whether it gets read <a href="http://en.wikipedia.org/wiki/Eager_evaluation">eagerly</a> or lazily, since the data will be the same either way.</p>
<p>Of course, you&#8217;re better off using the functions that use <code>unsafeInterleaveIO</code> and friends rather than using them directly.  As a general rule, you&#8217;re taking matter into your own hands when you use functions prefixed by the word &#8220;unsafe&#8221;.  As the saying goes, if it breaks, you get to keep the pieces.</p>
<p>However, the existence of lazy IO offers a few possibilities for how we might try to make streaming work without modifying Happstack, thanks to lazy IO.</p>
<ol>
<li>Use an OS-level pipe.  Pipes have two ends: one for writing data into it, and one for reading data out of it.  Once they&#8217;re created, each end of a pipe can be treated like any other file.  Fork a thread to write data to the pipe, which the original thread lazily reads back out as a <code>ByteString</code>.</li>
<li>Use a <code>Chan</code>.  Fork a thread to write data to the channel, which the original thread lazily reads and builds a <code>ByteString</code> from.</li>
<li>Use <code>unsafeInterleaveIO</code> directly to lazily generate the <code>ByteString</code> as needed.</li>
</ol>
<p>You know, this is starting to look like it might actually work.  I won&#8217;t mention the crippling flaw shared by each of these options, however, at least not just yet.  (They also each share a second, non-crippling but still significant flaw; a careful reading of <a href="http://tools.ietf.org/html/rfc2616">RFC 2616</a> might give you a clue what it is, if you can&#8217;t bear the suspense.)  It&#8217;s better if we try implementing them and experience how and why they each fail, as will any approach that doesn&#8217;t involve modifying Happstack somehow.</p>
<p>We&#8217;ll start doing precisely that in <a href="http://www.kuliniewicz.org/blog/archives/2010/01/25/happstack-and-streaming-part-3-implementation-sort-of/">Part 3</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kuliniewicz.org/blog/archives/2010/01/22/happstack-and-streaming-part-2-lazy-io/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Happstack and Streaming: Part 1: Introduction</title>
		<link>http://www.kuliniewicz.org/blog/archives/2010/01/20/happstack-and-streaming-part-1-introduction/</link>
		<comments>http://www.kuliniewicz.org/blog/archives/2010/01/20/happstack-and-streaming-part-1-introduction/#comments</comments>
		<pubDate>Thu, 21 Jan 2010 03:16:24 +0000</pubDate>
		<dc:creator>Paul Kuliniewicz</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[happstack]]></category>
		<category><![CDATA[haskell]]></category>
		<category><![CDATA[streaming]]></category>

		<guid isPermaLink="false">http://www.kuliniewicz.org/blog/?p=1721</guid>
		<description><![CDATA[Introduction The question: is it possible to use Happstack to serve streaming data? Spoiler alert: the answer is &#8220;no&#8221;. At least, not with the current version of Happstack (0.4.1). However, exploring the reasons why it isn&#8217;t possible sheds some light on what changes you&#8217;d need to make to Happstack to make it work. This five-part [...]]]></description>
			<content:encoded><![CDATA[<h3>Introduction</h3>
<p>The question: is it possible to use <a href="http://happstack.com/">Happstack</a> to serve streaming data?</p>
<p>Spoiler alert: the answer is &#8220;no&#8221;.  At least, not with the current version of Happstack (0.4.1).  However, exploring the reasons <em>why</em> it isn&#8217;t possible sheds some light on what changes you&#8217;d need to make to Happstack to make it work.  This five-part series will explore that topic in some detail, including code that almost-but-not-quite does what we want.</p>
<h3>What is streaming, anyway?</h3>
<p>For this discussion, I&#8217;m talking about generating a series of bytes in real time from a web application server and sending those bytes to a web browser as part of a single <a href="http://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol">HTTP</a> response message.</p>
<p>This is different from simply sending a large file to the browser.  In that case, file already exists on the disk, and can simply be read from and sent across the network as quickly as the client can keep up.  <a href="http://en.wikipedia.org/wiki/Transmission_Control_Protocol">TCP</a> automatically takes care of all the ugly details of how you do that reliably; from the application server&#8217;s perspective, you send the whole thing in one giant chunk and let the network worry about the rest.</p>
<p>Here, however, the bytes we want to send <em>aren&#8217;t</em> known ahead of time.  The <a href="http://en.wikipedia.org/wiki/Use_case">use case</a> I have in mind is a browser-based game where the current game state changes continuously.  It&#8217;d be nice to have the server send all those updates as part of a single response, with the browser acting on updates as they are received.  The alternative is to have the browser repeatedly poll for updates, sending a brand new request each time it wants the latest information.  Polling is more complicated to implement because the browser now needs to worry about timing its update requests, instead of just letting the server send them when an update is ready.  (Recall that HTTP uses a <a href="http://en.wikipedia.org/wiki/Client-server">client/server</a>, <a href="http://en.wikipedia.org/wiki/Request-response">request/response</a> model, preventing the server from sending data to the browser without the browser first asking for it.)</p>
<p>The game scenario is what prevents the server from knowing what bytes will be sent to the browser beforehand, since the game state depend on what moves the players make as the game progresses.  To stream updates, the server needs to start sending the response before it knows what all the data that will be included in it are.</p>
<p>Thus the question: if we use Happstack to write the web application server implementing the game, is streaming updates over a single response even possible?</p>
<h3>Happstack model</h3>
<p>The core part of any application written for Happstack is a function that takes the request from the browser and returns a <a href="http://happstack.com/docs/0.4/happstack-server/Happstack-Server-HTTP-Types.html#t%3AResponse"><code>Response</code> object</a> that gets sent back to the browser.  (Granted, that&#8217;s a very simplified version of what you pass to <a href="http://happstack.com/docs/0.4/happstack-server/Happstack-Server-SimpleHTTP.html#v%3AsimpleHTTP"><code>simpleHTTP</code></a>, but it&#8217;s good enough for our discussion here.)  Happstack takes care of all the details of actually sending and receiving data over the network, converting things into <a href="http://www.haskell.org/">Haskell</a> objects, and the like.</p>
<p>In Happstack 0.4.1, there are two kinds of <code>Response</code>s, only the first of which is of concern to us.  (The <a href="http://happstack.com/docs/0.4/happstack-server/Happstack-Server-HTTP-Types.html#v%3ASendFile">second kind</a> is optimized for sending existing a preexisting file to the browser, which precisely not what we want.)  A <code>Response</code> is really just a <a href="http://www.haskell.org/ghc/docs/latest/html/libraries/bytestring-0.9.1.5/Data-ByteString-Lazy.html">(lazy) <code>ByteString</code></a>, along with HTTP metadata like a <a href="http://en.wikipedia.org/wiki/List_of_HTTP_status_codes">status code</a> and a set of <a href="http://en.wikipedia.org/wiki/List_of_HTTP_headers">headers</a>.  Happstack provides a lot of tools to help set it all up, but ultimately the application is responsible for providing the <code>ByteString</code> to be sent to the browser.  Once we give Happstack that <code>ByteString</code>, it&#8217;s out of the application&#8217;s hands.</p>
<p>At first glance, this seems to preclude us from returning any kind of real-time stream.  After all, how can the application possibly return a value that it can&#8217;t compute until some point in the <em>future</em>?  That doesn&#8217;t just sound difficult; it sounds physically <em>impossible</em>.</p>
<p>Actually, it turns out that&#8217;s the easy part, as we&#8217;ll see in <a href="http://www.kuliniewicz.org/blog/archives/2010/01/22/happstack-and-streaming-part-2-lazy-io/">Part 2</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kuliniewicz.org/blog/archives/2010/01/20/happstack-and-streaming-part-1-introduction/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Error monads for fun and profit</title>
		<link>http://www.kuliniewicz.org/blog/archives/2009/04/21/error-monads-for-fun-and-profit/</link>
		<comments>http://www.kuliniewicz.org/blog/archives/2009/04/21/error-monads-for-fun-and-profit/#comments</comments>
		<pubDate>Wed, 22 Apr 2009 03:18:04 +0000</pubDate>
		<dc:creator>Paul Kuliniewicz</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[happstack]]></category>
		<category><![CDATA[haskell]]></category>
		<category><![CDATA[monads]]></category>

		<guid isPermaLink="false">http://www.kuliniewicz.org/blog/?p=1324</guid>
		<description><![CDATA[Last time, we corrected the security flaws in our simple Happstack.State demo program, but were left with some stinky error propagation logic. In particular: Using Maybe to carry error information, instead of using it to carry the result of a successful computation, violating the usual convention with its use. Unwieldy tangles of conditionals to check [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.kuliniewicz.org/blog/archives/2009/04/18/data-migration-in-happstackstate/">Last time</a>, we corrected the security flaws in our simple <a href="http://happstack.com/">Happstack</a>.State demo program, but were left with some <a href="http://en.wikipedia.org/wiki/Code_smell">stinky</a> error propagation logic.  In particular:</p>
<ol>
<li>Using <code>Maybe</code> to carry error information, instead of using it to carry the result of a successful computation, violating the usual convention with its use.</li>
<li>Unwieldy tangles of conditionals to check for each possible error, obfuscating the normal path of execution.</li>
</ol>
<p>Neither of these is insurmountable.  For the first, Haskell already provides a type for results that may contain detailed error information: <code>Either <var>a</var> <var>b</var></code>.  As you might guess, a value of that type is either something of type <var>a</var> or something of type <var>b</var>.  By convention, <code><var>a</var></code> is the error type and <code><var>b</var></code> is the result type.  The mnemonic is that the right type is what you get if everything goes right.</p>
<p>In fact, our previous code used <code>Either</code> to return either an error message or a meaningful result from <code>hashPasswordFor</code>:</p>
<div class="vim"><code>hashPasswordFor <span class="Statement">::</span> MonadReader UserDirectory m <span class="Statement">=&gt;</span> String <span class="Statement">&#0045;&gt;</span> String <span class="Statement">&#0045;&gt;</span> m (Either UserError PasswordHash)<br />
hashPasswordFor name pass <span class="Statement">=</span> <span class="Statement">do</span><br />
&nbsp; &nbsp; UserDirectory dir <span class="Statement">&lt;&#0045;</span> ask<br />
&nbsp; &nbsp; return <span class="Statement">$</span> <span class="Statement">case</span> M&#0046;lookup name dir <span class="Statement">of</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; Nothing&nbsp;&nbsp; <span class="Statement">&#0045;&gt;</span> Left NoSuchUser<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; Just user <span class="Statement">&#0045;&gt;</span> Right <span class="Statement">$</span> hashPassword (pwSalt <span class="Statement">$</span> usPassword user) pass</code></div>
<p>If successful, this function returns a <code>PasswordHash</code> via <code>Either</code>&#8216;s <code>Right</code> type constructor.  If the user couldn&#8217;t be found, it returns a <code>UserError</code> via <code>Either</code>&#8216;s <code>Left</code> type constructor.</p>
<p>You might think that we could use <code>Either UserError</code> as a monad in much the same way we could use <code>Maybe</code> as a monad: executing a series of computations until the first error.  Sadly, <code>Either <var>a</var></code> isn&#8217;t defined to be a monad, so this doesn&#8217;t work.</p>
<p>Fortunately, the <a href="http://hackage.haskell.org/cgi-bin/hackage-scripts/package/mtl">Monad Transformer Library</a> has the next best thing: the <code>ErrorT</code> monadic transform.  In a nutshell, <code>ErrorT</code> lets us transform any monad into something that adds error propagation.  Specifically, any computation within the transformed monad can throw an error, skipping the remaining computations; the code using the transformed monad can then get an <code>Either <var>a</var> <var>b</var></code> out of it with either the result of successful computation (of type <code><var>b</var></code>), or an error (of type <code><var>a</var></code>).</p>
<p>If this sounds a lot like <a href="http://en.wikipedia.org/wiki/Exception_handling_syntax#C.23">try/catch-style exception handling</a>, that&#8217;s sort of the idea.  And in case you cleverly scrolled down to the <a href="http://en.wikipedia.org/wiki/Exception_handling_syntax#Haskell">Haskell section of that Wikipedia page</a> to see that Haskell has some support for this via the <code>IO</code> monad, that&#8217;s true, but <code>ErrorT</code> is a lot more powerful, not the least of which because there&#8217;s no need to use the <code>IO</code> monad at all.</p>
<p>This might be clearer in an example.  To use <code>ErrorT</code>, we merely have to declare whatever error type we wish to use as an instance of the typeclass <code>Error</code>, like so:</p>
<div class="vim"><code><span class="Type">instance</span> Error UserError</code></div>
<p>Looking up a user in the user directory is something our code does all the time, and each time we run the risk of failure if the user we&#8217;re looking for doesn&#8217;t exist.  Let&#8217;s make a <code>lookupUser</code> function that tries to get the <code>UserInfo</code> for a user, or throws a <code>UserError</code> if it failed:</p>
<div class="vim"><code>lookupUser <span class="Statement">::</span> Monad m <span class="Statement">=&gt;</span> String <span class="Statement">&#0045;&gt;</span> UserDirectory <span class="Statement">&#0045;&gt;</span> ErrorT UserError m UserInfo<br />
lookupUser name (UserDirectory dir) <span class="Statement">=</span> maybe (throwError NoSuchUser) return <span class="Statement">$</span> M&#0046;lookup name dir</code></div>
<p>Let&#8217;s unpack that a bit.  The return type is <code>Monad m => ErrorT UserError m UserInfo</code>.  <code>UserError</code> is the type of errors that could get thrown, and <code>UserInfo</code> is the type of a successful result.  <code>m</code> is a type variable for the monad that <code>ErrorT</code> is transforming; here, we don&#8217;t care what kind of monad <code>m</code> is, as long as it&#8217;s a monad.  The function just does a lookup in the <code>Map</code>.  If the result of the lookup is <code>Just something</code>, that <code>something</code> is <code>return</code>ed (i.e., wrapped in) the monad.  Otherwise, if the result of the lookup is <code>Nothing</code>, we throw <code>NoSuchUser</code>, which is of type <code>UserError</code>.</p>
<p>Now let&#8217;s rewrite <code>hashPasswordFor</code> to make use of <code>lookupUser</code>:</p>
<div class="vim"><code>hashPasswordFor <span class="Statement">::</span> MonadReader UserDirectory m <span class="Statement">=&gt;</span> String <span class="Statement">&#0045;&gt;</span> String <span class="Statement">&#0045;&gt;</span> m (Either UserError PasswordHash)<br />
hashPasswordFor name pass <span class="Statement">=</span> runErrorT <span class="Statement">$</span> <span class="Statement">do</span><br />
&nbsp; &nbsp; dir <span class="Statement">&lt;&#0045;</span> ask<br />
&nbsp; &nbsp; user <span class="Statement">&lt;&#0045;</span> lookupUser name dir<br />
&nbsp; &nbsp; return <span class="Statement">$</span> hashPassword (pwSalt <span class="Statement">$</span> usPassword user) pass</code></div>
<p>The main body of the function is free to ignore errors &#8212; there&#8217;s no more conditional check to see if the user lookup failed.  Note, though, that the <code>do</code> block that specifies the monadic computation is now an argument to <code>runErrorT</code>.  <code>runErrorT</code> has a type signature of:</p>
<div class="vim"><code>runErrorT <span class="Statement">::</span> ErrorT e m a <span class="Statement">&#0045;&gt;</span> m (Either e a)</code></div>
<p>As you can see from the type signature, it takes a computation in an <code>ErrorT</code>-produced monad and converts it back into the original monad of type <code>m</code>, with the result inside <code>m</code> an <code>Either e a</code>.  In other words, it converts a computation where we can throw errors into one that returns <code>Either</code> an error or the computation result.</p>
<p>You might wonder why we don&#8217;t just propagate errors out of <code>hashPasswordFor</code> using <code>ErrorT</code> like we did for <code>lookupUser</code>, like this:</p>
<div class="vim"><code><span class="Comment">&#0045;&#0045; This doesn't work!</span><br />
hashPasswordFor <span class="Statement">::</span> MonadReader UserDirectory m <span class="Statement">=&gt;</span> String <span class="Statement">&#0045;&gt;</span> String <span class="Statement">&#0045;&gt;</span> ErrorT UserError m PasswordHash<br />
hashPasswordFor name pass <span class="Statement">=</span> <span class="Statement">do</span><br />
&nbsp; &nbsp; dir <span class="Statement">&lt;&#0045;</span> ask<br />
&nbsp; &nbsp; user <span class="Statement">&lt;&#0045;</span> lookupUser name dir<br />
&nbsp; &nbsp; return <span class="Statement">$</span> hashPassword (pwSalt <span class="Statement">$</span> usPassword user) pass</code></div>
<p>There&#8217;s a very pragmatic reason why not: it doesn&#8217;t work.  Recall that <code>hashPasswordFor</code> is used to generate the <code>HashPasswordFor</code> query operation in our MACID store.  Happstack.State&#8217;s template magic crashes and burns if we try to return a computation involving <code>ErrorT</code>:</p>
<blockquote><p><tt>Users&#0046;hs:1:0:<br />
&nbsp; &nbsp; E&#120;ception when trying to run compile&#0045;time code:<br />
&nbsp; &nbsp; &nbsp; Une&#120;pected method type: Control&#0046;Monad&#0046;Error&#0046;ErrorT Users&#0046;UserError m_0 Users&#0046;PasswordHash<br />
&nbsp; &nbsp; &nbsp; Code: mkMethods<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &#39;UserDirectory<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; [&#39;addUser, &#39;hashPasswordFor, &#39;authenticateUser, &#39;listUsers]</tt></p></blockquote>
<p>This is unfortunate, since we&#8217;re ultimately trying to use <code>HashPasswordFor</code> and <code>AuthenticateUser</code> &#8212; each of which can fail &#8212; in our implementation of <code>loginUser</code>, and our whole goal is to wait until the very end to convert the result of the computation into an <code>Either</code>.  The workaround is to do the opposite of <code>runErrorT</code> after we invoke <code>HashPasswordFor</code>, converting the <code>m (Either UserError PasswordHash)</code> back into a <code>ErrorT UserError m PasswordHash</code>.  Luckily, it&#8217;s pretty straightforward:</p>
<div class="vim"><code>rethrowError <span class="Statement">::</span> (Error e, Monad m) <span class="Statement">=&gt;</span> Either e a <span class="Statement">&#0045;&gt;</span> ErrorT e m a<br />
rethrowError (Left error)&nbsp;&nbsp; <span class="Statement">=</span> throwError error<br />
rethrowError (Right result) <span class="Statement">=</span> return result</code></div>
<p>Now we just need to feed the result of <code>HashPasswordFor</code> and <code>AuthenticateUser</code> into <code>rethrowError</code> inside <code>loginUser</code>:</p>
<div class="vim"><code>loginUser <span class="Statement">::</span> MonadIO m <span class="Statement">=&gt;</span> String <span class="Statement">&#0045;&gt;</span> String <span class="Statement">&#0045;&gt;</span> m (Either UserError ())<br />
loginUser name pass <span class="Statement">=</span> runErrorT <span class="Statement">$</span> <span class="Statement">do</span><br />
&nbsp; &nbsp; passHash <span class="Statement">&lt;&#0045;</span> rethrowError <span class="Statement">=&lt;&lt;</span> (query <span class="Statement">$</span> HashPasswordFor name pass)<br />
&nbsp; &nbsp; now <span class="Statement">&lt;&#0045;</span> liftIO getClockTime<br />
&nbsp; &nbsp; rethrowError <span class="Statement">=&lt;&lt;</span> (update <span class="Statement">$</span> AuthenticateUser name passHash now)</code></div>
<p>Aside from the minor hassle of needing to use <code>rethrowError</code>, this works quite nicely.  Any <code>UserError</code> that gets thrown, regardless of where it happens, get caught by <code>runErrorT</code> and converted into <code>Either UserError ()</code> for the result of the monadic computation.  The code inside the <code>do</code> block doesn&#8217;t have to worry about error checking; <code>ErrorT</code> handles that for us.</p>
<p><code>hashPasswordFor</code> was a trivial example, but remember this ugly nastiness from the previous post?</p>
<div class="vim"><code>authenticateUser <span class="Statement">::</span> MonadState UserDirectory m <span class="Statement">=&gt;</span> String <span class="Statement">&#0045;&gt;</span> PasswordHash <span class="Statement">&#0045;&gt;</span> ClockTime <span class="Statement">&#0045;&gt;</span> m (Maybe UserError)<br />
authenticateUser name passHash when <span class="Statement">=</span> <span class="Statement">do</span><br />
&nbsp; &nbsp; UserDirectory dir <span class="Statement">&lt;&#0045;</span> get<br />
&nbsp; &nbsp; <span class="Statement">case</span> M&#0046;lookup name dir <span class="Statement">of</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; Nothing <span class="Statement">&#0045;&gt;</span> return <span class="Statement">$</span> Just NoSuchUser<br />
&nbsp; &nbsp; &nbsp; &nbsp; Just user <span class="Statement">&#0045;&gt;</span> <span class="Statement">if</span> isLocked when user<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="Statement">then</span> return <span class="Statement">$</span> fmap AccountLocked <span class="Statement">$</span> usLocked user<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="Statement">else</span> <span class="Statement">if</span> passHash <span class="Statement">==</span> usPassword user<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="Statement">then</span> <span class="Statement">do</span> put <span class="Statement">$</span> UserDirectory <span class="Statement">$</span> M&#0046;insert name (unlockUser user) dir<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; return Nothing<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="Statement">else</span> <span class="Statement">do</span> put <span class="Statement">$</span> UserDirectory <span class="Statement">$</span> M&#0046;insert name (failUser when user) dir<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; return <span class="Statement">$</span> Just PasswordMismatch</code></div>
<p>Here&#8217;s what a <code>ErrorT</code> magic lets us replace that with:</p>
<div class="vim"><code>authenticateUser <span class="Statement">::</span> MonadState UserDirectory m <span class="Statement">=&gt;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; String <span class="Statement">&#0045;&gt;</span> PasswordHash <span class="Statement">&#0045;&gt;</span> ClockTime <span class="Statement">&#0045;&gt;</span> m (Either UserError ())<br />
authenticateUser name passHash when <span class="Statement">=</span> runErrorT <span class="Statement">$</span> <span class="Statement">do</span><br />
&nbsp; &nbsp; dir <span class="Statement">&lt;&#0045;</span> get<br />
&nbsp; &nbsp; user <span class="Statement">&lt;&#0045;</span> lookupUser name dir<br />
&nbsp; &nbsp; checkUnlocked when user<br />
&nbsp; &nbsp; <span class="Statement">if</span> passHash <span class="Statement">==</span> usPassword user<br />
&nbsp; &nbsp; &nbsp;&nbsp; <span class="Statement">then</span> <span class="Statement">do</span> insertUser name (unlockUser user)<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; return ()<br />
&nbsp; &nbsp; &nbsp;&nbsp; <span class="Statement">else</span> <span class="Statement">do</span> insertUser name (failUser when user)<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; throwError PasswordMismatch</code></div>
<p>Suddenly it&#8217;s much easier to see what the code&#8217;s supposed to do!  Goodbye deep nesting of conditionals; if not for the fact that we need to update the data store differently based on whether we see a password mismatch, we wouldn&#8217;t even need the one still there.</p>
<p>Just for completeness&#8217;s sake, here&#8217;s <code>checkUnlocked</code>, which replaces <code>isLocked</code> from the previous code:</p>
<div class="vim"><code>checkUnlocked <span class="Statement">::</span> Monad m <span class="Statement">=&gt;</span> ClockTime <span class="Statement">&#0045;&gt;</span> UserInfo <span class="Statement">&#0045;&gt;</span> ErrorT UserError m ()<br />
checkUnlocked asOf user <span class="Statement">=</span> <span class="Statement">case</span> usLocked user <span class="Statement">of</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; Just until <span class="Statement">&#0045;&gt;</span> when (asOf <span class="Statement">&lt;</span> until) (throwError <span class="Statement">$</span> AccountLocked until)<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; Nothing&nbsp; &nbsp; <span class="Statement">&#0045;&gt;</span> return ()</code></div>
<p>Technically we could&#8217;ve used <code>maybe</code> instead of pattern-matching to turn <code>checkUnlocked</code> into a one-liner like <code>isLocked</code> was, but I think the code becomes too difficult to read in that case, which defeats the whole rationale behind using <code>ErrorT</code> throughout our code in the first place.</p>
<p>As always, <a href='http://www.kuliniewicz.org/blog/wp-content/uploads/2009/04/v4tar.gz'>here&#8217;s the complete program with these changes</a>.  The code behaves the exact same way as the previous version, but the implementation is now much easier on the eyes.</p>
<p>Let this be a lesson to you: it&#8217;s often said that most programming problems can be simplified by adding another level of indirection.  In Haskell, I suspect the equivalent is adding another monad.  Monads are a little tricky to get your head around at first, but you can do some neat things with them.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kuliniewicz.org/blog/archives/2009/04/21/error-monads-for-fun-and-profit/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Data migration in Happstack.State</title>
		<link>http://www.kuliniewicz.org/blog/archives/2009/04/18/data-migration-in-happstackstate/</link>
		<comments>http://www.kuliniewicz.org/blog/archives/2009/04/18/data-migration-in-happstackstate/#comments</comments>
		<pubDate>Sat, 18 Apr 2009 23:13:25 +0000</pubDate>
		<dc:creator>Paul Kuliniewicz</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[happstack]]></category>
		<category><![CDATA[haskell]]></category>
		<category><![CDATA[state]]></category>

		<guid isPermaLink="false">http://www.kuliniewicz.org/blog/?p=1310</guid>
		<description><![CDATA[Last time (modulo a distracting interlude), we looked at fixing a password disclosure vulnerability in our simple Happstack.State demo program. However, there are still some security flaws that leave passwords open to attack: There are no checks for password strength &#8212; users are free to pick trivially weak passwords. There is nothing preventing an online [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.kuliniewicz.org/blog/archives/2009/04/11/protecting-passwords-for-fun-and-profit/">Last time</a> (modulo a <a href="http://www.kuliniewicz.org/blog/archives/2009/04/16/your-daily-dose-of-distraction/">distracting interlude</a>), we looked at fixing a password disclosure vulnerability in our simple <a href="http://happstack.com/">Happstack</a>.State demo program.  However, there are still some security flaws that leave passwords open to attack:</p>
<ul>
<li>There are no checks for password strength &#8212; users are free to pick trivially weak passwords.</li>
<li>There is nothing preventing an online brute force attack against an account, where an attacker repeatedly guesses passwords and tries to log in until he finds the right one.</li>
</ul>
<p>These two vulnerabilities are hardly theoretical.  As I mentioned in an earlier post, not too long ago <a href="http://blog.wired.com/27bstroke6/2009/01/professed-twitt.html">an administrator&#8217;s account on Twitter was broken into by someone running a dictionary attack</a>.  The number of login attempts per second in such an attack is only bounded by the network bandwidth between the attacker and the server, since any halfway competent attacker is going to use a script to do the work for him &#8212; it&#8217;s not like he&#8217;s sitting there trying in passwords as quickly as he can.  And people being lazy, they&#8217;re liable to pick simple, easily guessed passwords unless we do something to stop them.</p>
<p>So let&#8217;s modify our program from last time to add some countermeasures to these attacks:</p>
<ul>
<li>When creating an account, check whether the password is sufficiently strong, and refuse to create the account if it isn&#8217;t.</li>
<li>If we see repeated failed logins for an account, temporarily lock the account, limiting how quickly an attacker can guess passwords.</li>
</ul>
<p>First, since we&#8217;re introducing even more ways operations can fail, it&#8217;d be nice if our program provided more feedback to the user about <em>why</em> an operation failed.  A simple <a href="http://en.wikipedia.org/wiki/Algebraic_data_type">algebraic sum type</a> will suffice:</p>
<div class="vim"><code><span class="Type">data</span> UserError <span class="Statement">=</span> UserE&#120;ists<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; <span class="Statement">|</span> NoSuchUser<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; <span class="Statement">|</span> PasswordMismatch<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; <span class="Statement">|</span> AccountLocked ClockTime<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; <span class="Statement">|</span> PasswordTooShort Int<br />
&nbsp; &nbsp; <span class="Type">deriving</span> (Eq, Ord, Typeable, Data)<br />
&nbsp;<br />
<span class="Type">instance</span> Version UserError<br />
<span class="Statement">$</span>(deriveSerialize &apos;&apos;UserError)<br />
&nbsp;<br />
<span class="Type">instance</span> Show UserError <span class="Type">where</span><br />
&nbsp; &nbsp; show UserE&#120;ists&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; <span class="Statement">=</span> <span class="Constant">&quot;A user by that name already e&#120;ists&#0046;&quot;</span><br />
&nbsp; &nbsp; show NoSuchUser&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; <span class="Statement">=</span> <span class="Constant">&quot;No user by that name e&#120;ists&#0046;&quot;</span><br />
&nbsp; &nbsp; show PasswordMismatch&nbsp; &nbsp; &nbsp;&nbsp; <span class="Statement">=</span> <span class="Constant">&quot;Incorrect password&#0046;&quot;</span><br />
&nbsp; &nbsp; show (AccountLocked until)&nbsp; <span class="Statement">=</span> <span class="Constant">&quot;Account is locked until &quot;</span> <span class="Statement">++</span> show until <span class="Statement">++</span> <span class="Constant">&quot;&#0046;&quot;</span><br />
&nbsp; &nbsp; show (PasswordTooShort min) <span class="Statement">=</span> <span class="Constant">&quot;Password must be at least &quot;</span> <span class="Statement">++</span> show min <span class="Statement">++</span> <span class="Constant">&quot; characters long&#0046;&quot;</span></code></div>
<p>Even if you don&#8217;t know Haskell, it should be fairly self-evident what&#8217;s going on.  I&#8217;ll just note two things.  First, we still need to implement the <code>Version</code> class and derive a serialization function via <code>deriveSerialize</code> even though we don&#8217;t plan to save a <code>UserError</code> in our MACID store, because it&#8217;s a requirement for anything we pass in or out of a query or update.  Second, we implement <code>Show</code> ourselves instead of letting the compiler do it for us, so we can provide human-readable versions of each error.</p>
<p>The core of checking password strength is almost trivial: a function that takes a proposed password and returns a <code>UserError</code> if it doesn&#8217;t meet our exacting standards.  Our code that actually creates the account will then call this function to check the strength of the user&#8217;s proposed password:</p>
<div class="vim"><code>checkPasswordStrength <span class="Statement">::</span> String <span class="Statement">&#0045;&gt;</span> Maybe UserError<br />
checkPasswordStrength pass <span class="Statement">=</span> <span class="Statement">if</span> length pass <span class="Statement">&lt;</span> <span class="Constant">8</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="Statement">then</span> Just <span class="Statement">$</span> PasswordTooShort <span class="Constant">8</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="Statement">else</span> Nothing</code></div>
<p>Admittedly, this isn&#8217;t <em>much</em> of a strength check, since it&#8217;s only looking at the length of the password.  Heck, even the classic bad password &#8220;password&#8221; passes with flying colors.  But this is good enough for demonstration purposes here; it&#8217;s trivial to modify the function to check more things, and presumably add new constructors to <code>UserError</code> accordingly.</p>
<p>The more interesting issue is that of temporarily locking accounts after some number of consecutive login failures.  Clearly, for each account we need to keep track of how many failures there have been since the last successful login, whether the account is locked, and if so, when the lock will expire.  Adding that information to <code>UserInfo</code> looks easy enough:</p>
<div class="vim"><code><span class="Type">data</span> UserInfo <span class="Statement">=</span> UserInfo { usPassword <span class="Statement">::</span> PasswordHash<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; , usJoined&nbsp;&nbsp; <span class="Statement">::</span> ClockTime<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; , usFailures <span class="Statement">::</span> Int&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="Comment">&#0045;&#0045; new</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; , usLocked&nbsp;&nbsp; <span class="Statement">::</span> Maybe ClockTime&nbsp; &nbsp; <span class="Comment">&#0045;&#0045; new</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; }<br />
&nbsp; &nbsp; <span class="Type">deriving</span> (Typeable, Data)</code></div>
<p>There&#8217;s a problem, though.  Our MACID store contains records using the <em>original</em> definition of <code>UserInfo</code>, which lacks the last two fields.  If that&#8217;s all we do, suddenly we&#8217;ll be unable to load our old data.  That&#8217;s bad.</p>
<p>Fortunately, Happstack.State is one step ahead of us.  Remember that <code>Version</code> typeclass that <code>UserInfo</code>, and all the other types we use with the MACID store, has to implement?  That&#8217;s what provides our data migration path.  First, we&#8217;ll need to keep the definition of the old version of <code>UserInfo</code> around, but with a different name.  Ordinarily, we&#8217;d create a separate module to contain the old definitions, but here let&#8217;s just stick <code>_0</code> to the names of everything, to denote &#8220;version 0&#8243; to ourselves:</p>
<div class="vim"><code><span class="Type">data</span> UserInfo_0 <span class="Statement">=</span> UserInfo_0 { usPassword_0 <span class="Statement">::</span> PasswordHash<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; , usJoined_0&nbsp;&nbsp; <span class="Statement">::</span> ClockTime<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; }<br />
&nbsp; &nbsp; <span class="Type">deriving</span> (Typeable, Data)<br />
&nbsp;<br />
<span class="Type">instance</span> Version UserInfo_0<br />
<span class="Statement">$</span>(deriveSerialize &apos;&apos;UserInfo_0)</code></div>
<p>The penultimate line there declares that <code>UserInfo_0</code> is an instance of <code>Version</code>.  The default implementation of <code>Version</code>, which we&#8217;ve used up until now, says the type is the first version (version 0).  Our new and improved <code>UserInfo</code> is version 1 of this type, so its implementation of <code>Version</code> needs to state this explicitly:</p>
<div class="vim"><code><span class="Type">instance</span> Version UserInfo <span class="Type">where</span><br />
&nbsp; &nbsp; mode <span class="Statement">=</span> e&#120;tension <span class="Constant">1</span> (Pro&#120;y <span class="Statement">::</span> Pro&#120;y UserInfo_0)</code></div>
<p>This just says that <code>UserInfo</code> is version 1 of the type, and the previous version is what we&#8217;re now calling <code>UserInfo_0</code>.  (The <code>Proxy</code> object is essentially no different than <a href="http://en.wikipedia.org/wiki/Unit_type">unit</a>, but we can associate a particular type with it.  Happstack.State uses it to pass type information to a function without needing to pass an instance of that type.)</p>
<p>OK, so <code>UserInfo</code> is the successor to <code>UserInfo_0</code>, but we still need to say how to migrate from the old version to the new version.  That&#8217;s what the <code>Migrate</code> typeclass is for:</p>
<div class="vim"><code><span class="Type">instance</span> Migrate UserInfo_0 UserInfo <span class="Type">where</span><br />
&nbsp; &nbsp; migrate (UserInfo_0 password joined) <span class="Statement">=</span> UserInfo password joined <span class="Constant">0</span> Nothing</code></div>
<p>In other words, to migrate from <code>UserInfo_0</code> to <code>UserInfo</code>, copy the existing data over, set the login failure count to 0, and note that the account is not locked.</p>
<p>Now our new program will be able to read the data saved from the old one.  When it tries to read a <code>UserInfo</code> object (version 1) but sees a <code>UserInfo_0</code> object (version 0) instead, Happstack.State can automatically figure out how to perform the conversion via the <code>migrate</code> function we defined between the two types.  For the curious, <a href="http://nhlab.blogspot.com/2008/12/data-migration-with-happs-data.html">this page describes what happens behind the scenes to make this work</a>.</p>
<p>Our data migration worries solved, we can turn our attention to implementing our account locking and unlocking.  The easiest way forward is to define a bunch of simple helper functions for doing the basic steps, and then using those when implementing the login operation.</p>
<p>Checking if an account is locked is just a matter of seeing if the locked-until time, if it exists, is on or after than the current time:</p>
<div class="vim"><code>isLocked <span class="Statement">::</span> ClockTime <span class="Statement">&#0045;&gt;</span> UserInfo <span class="Statement">&#0045;&gt;</span> Bool<br />
isLocked asOf user <span class="Statement">=</span> maybe False (<span class="Statement">&gt;=</span> asOf) <span class="Statement">$</span> usLocked user</code></div>
<p>Unlocking an account just means resetting the failure counter and clearing the locked-until time.  Just note that, this being Haskell, we return a new <code>UserInfo</code> object instead of trying to modify the existing one (since we couldn&#8217;t modify it even if we tried):</p>
<div class="vim"><code>unlockUser <span class="Statement">::</span> UserInfo <span class="Statement">&#0045;&gt;</span> UserInfo<br />
unlockUser user <span class="Statement">=</span> user { usFailures <span class="Statement">=</span> <span class="Constant">0</span>, usLocked <span class="Statement">=</span> Nothing }</code></div>
<p>Incrementing the failure count is slightly complicated by needing to lock the account if the failure count exceeds the limit.  Here, if there&#8217;s been three or more consecutive failures, we lock the account for one minute:</p>
<div class="vim"><code>failUser <span class="Statement">::</span> ClockTime <span class="Statement">&#0045;&gt;</span> UserInfo <span class="Statement">&#0045;&gt;</span> UserInfo<br />
failUser when user <span class="Statement">=</span> <span class="Statement">let</span> newFailures <span class="Statement">=</span> usFailures user <span class="Statement">+</span> <span class="Constant">1</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; lockedUntil <span class="Statement">=</span> <span class="Statement">if</span> newFailures <span class="Statement">&gt;=</span> failureThreshold<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="Statement">then</span> Just <span class="Statement">$</span> addToClockTime lockPeriod when<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="Statement">else</span> Nothing<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; <span class="Statement">in</span>&nbsp; user { usFailures <span class="Statement">=</span> newFailures, usLocked <span class="Statement">=</span> lockedUntil }<br />
&nbsp; &nbsp; <span class="Type">where</span> failureThreshold <span class="Statement">=</span> <span class="Constant">3</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; lockPeriod <span class="Statement">=</span> noTimeDiff { tdMin <span class="Statement">=</span> <span class="Constant">1</span> }</code></div>
<p>Finally, we can turn our attention to the actual login process.  Login is now an update, since a login attempt, successful or unsuccessful, needs to update the failure count in the <code>UserInfo</code>.  This means we need to beware of the no-plaintext-passwords-as-update-parameters rule we discovered previously.  The workaround is the same as last time.  First, an update operation that does the work:</p>
<div class="vim"><code>authenticateUser <span class="Statement">::</span> MonadState UserDirectory m <span class="Statement">=&gt;</span> String <span class="Statement">&#0045;&gt;</span> PasswordHash <span class="Statement">&#0045;&gt;</span> ClockTime <span class="Statement">&#0045;&gt;</span> m (Maybe UserError)<br />
authenticateUser name passHash when <span class="Statement">=</span> <span class="Statement">do</span><br />
&nbsp; &nbsp; UserDirectory dir <span class="Statement">&lt;&#0045;</span> get<br />
&nbsp; &nbsp; <span class="Statement">case</span> M&#0046;lookup name dir <span class="Statement">of</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; Nothing <span class="Statement">&#0045;&gt;</span> return <span class="Statement">$</span> Just NoSuchUser<br />
&nbsp; &nbsp; &nbsp; &nbsp; Just user <span class="Statement">&#0045;&gt;</span> <span class="Statement">if</span> isLocked when user<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="Statement">then</span> return <span class="Statement">$</span> fmap AccountLocked <span class="Statement">$</span> usLocked user<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="Statement">else</span> <span class="Statement">if</span> passHash <span class="Statement">==</span> usPassword user<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="Statement">then</span> <span class="Statement">do</span> put <span class="Statement">$</span> UserDirectory <span class="Statement">$</span> M&#0046;insert name (unlockUser user) dir<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; return Nothing<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="Statement">else</span> <span class="Statement">do</span> put <span class="Statement">$</span> UserDirectory <span class="Statement">$</span> M&#0046;insert name (failUser when user) dir<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; return <span class="Statement">$</span> Just PasswordMismatch</code></div>
<p><code>AuthenticateUser</code> needs a <code>PasswordHash</code>, so here&#8217;s a query that takes a plaintext password and hashes it using the salt for a particular user account:</p>
<div class="vim"><code>hashPasswordFor <span class="Statement">::</span> MonadReader UserDirectory m <span class="Statement">=&gt;</span> String <span class="Statement">&#0045;&gt;</span> String <span class="Statement">&#0045;&gt;</span> m (Either UserError PasswordHash)<br />
hashPasswordFor name pass <span class="Statement">=</span> <span class="Statement">do</span><br />
&nbsp; &nbsp; UserDirectory dir <span class="Statement">&lt;&#0045;</span> ask<br />
&nbsp; &nbsp; return <span class="Statement">$</span> <span class="Statement">case</span> M&#0046;lookup name dir <span class="Statement">of</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; Nothing&nbsp;&nbsp; <span class="Statement">&#0045;&gt;</span> Left NoSuchUser<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; Just user <span class="Statement">&#0045;&gt;</span> Right <span class="Statement">$</span> hashPassword (pwSalt <span class="Statement">$</span> usPassword user) pass</code></div>
<p>Finally, the <code>loginUser</code> function hashes the password using the <code>HashPasswordFor</code> query, and if it succeeds (meaning the user does indeed exist), runs an <code>AuthenticateUser</code> update using the result:</p>
<div class="vim"><code>loginUser <span class="Statement">::</span> MonadIO m <span class="Statement">=&gt;</span> String <span class="Statement">&#0045;&gt;</span> String <span class="Statement">&#0045;&gt;</span> m (Maybe UserError)<br />
loginUser name pass <span class="Statement">=</span> <span class="Statement">do</span><br />
&nbsp; &nbsp; hashResult <span class="Statement">&lt;&#0045;</span> query <span class="Statement">$</span> HashPasswordFor name pass<br />
&nbsp; &nbsp; <span class="Statement">case</span> hashResult <span class="Statement">of</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; Left error&nbsp; &nbsp;&nbsp; <span class="Statement">&#0045;&gt;</span> return <span class="Statement">$</span> Just error<br />
&nbsp; &nbsp; &nbsp; &nbsp; Right passHash <span class="Statement">&#0045;&gt;</span> <span class="Statement">do</span> now <span class="Statement">&lt;&#0045;</span> liftIO getClockTime<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; update <span class="Statement">$</span> AuthenticateUser name passHash now</code></div>
<p>And let&#8217;s not forget to update <code>createUser</code> to check the strength of a password:</p>
<div class="vim"><code>createUser <span class="Statement">::</span> MonadIO m <span class="Statement">=&gt;</span> String <span class="Statement">&#0045;&gt;</span> String <span class="Statement">&#0045;&gt;</span> m (Maybe UserError)<br />
createUser name pass <span class="Statement">=</span><br />
&nbsp; &nbsp; <span class="Statement">case</span> checkPasswordStrength pass <span class="Statement">of</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; Nothing <span class="Statement">&#0045;&gt;</span> <span class="Statement">do</span> salt <span class="Statement">&lt;&#0045;</span> liftIO newSalt<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; now <span class="Statement">&lt;&#0045;</span> liftIO getClockTime<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; update <span class="Statement">$</span> AddUser name (hashPassword salt pass) now<br />
&nbsp; &nbsp; &nbsp; &nbsp; e&#120;cuse&nbsp; <span class="Statement">&#0045;&gt;</span> return e&#120;cuse</code></div>
<p>All that&#8217;s left is to adjust the command loop to expect to maybe get a <code>UserError</code> (or rather, definitely get a <code>Maybe UserError</code>), and print the error message if the command failed:</p>
<div class="vim"><code><span class="Comment">&#0045;&#0045; in commandLoop:</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; processCommand state [<span class="Constant">&quot;add&quot;</span>, user, pass] <span class="Statement">=</span> <span class="Statement">do</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; result <span class="Statement">&lt;&#0045;</span> createUser user pass<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; putStrLn <span class="Statement">$</span> maybe <span class="Constant">&quot;Success&quot;</span> show result<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; commandLoop state<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; processCommand state [<span class="Constant">&quot;login&quot;</span>, user, pass] <span class="Statement">=</span> <span class="Statement">do</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; result <span class="Statement">&lt;&#0045;</span> loginUser user pass<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; putStrLn <span class="Statement">$</span> maybe <span class="Constant">&quot;Success&quot;</span> show result<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; commandLoop state</code></div>
<p><a href='http://www.kuliniewicz.org/blog/wp-content/uploads/2009/04/v3tar.gz'>Here&#8217;s the complete program</a>.  Let&#8217;s try it out, starting with the result of the run with the old version, to demonstrate that the migration worked:</p>
<blockquote><p><tt>&gt; list<br />
bobby (joined Sat Apr 11 15:31:54 EDT 2009)<br />
cowboy (joined Sat Apr 11 14:57:28 EDT 2009)<br />
pmk (joined Sat Apr 11 14:57:12 EDT 2009)<br />
&gt; login pmk notthepassword<br />
Incorrect password&#0046;<br />
&gt; login pmk alsonotthepassword<br />
Incorrect password&#0046;<br />
&gt; login pmk maybethisisit<br />
Incorrect password&#0046;<br />
&gt; login pmk keeptryinganyway<br />
Account is locked until Sat Apr 18 18:29:57 EDT 2009&#0046;<br />
&gt; time&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; <br />
Sat Apr 18 18:29:08 EDT 2009<br />
&gt; login pmk swordfish<br />
Account is locked until Sat Apr 18 18:29:57 EDT 2009&#0046;<br />
&gt; time<br />
Sat Apr 18 18:30:10 EDT 2009<br />
&gt; login pmk swordfish<br />
Success<br />
&gt; add bobby letstryagain<br />
A user by that name already e&#120;ists&#0046;<br />
&gt; login alice &#120;yzzy<br />
No user by that name e&#120;ists&#0046;<br />
&gt; add alice &#120;yzzy<br />
Password must be at least 8 characters long&#0046;<br />
&gt; add alice aaaaaaaa<br />
Success<br />
&gt; list<br />
alice (joined Sat Apr 18 18:31:05 EDT 2009)<br />
bobby (joined Sat Apr 11 15:31:54 EDT 2009)<br />
cowboy (joined Sat Apr 11 14:57:28 EDT 2009)<br />
pmk (joined Sat Apr 11 14:57:12 EDT 2009)<br />
&gt; checkpoint<br />
&gt; quit</tt></p></blockquote>
<p>Everything works as expected.  Trying to brute-force a password temporarily locks the account, preventing even legitimate logins until the lock times out.  New accounts are forced to have a sufficiently long password.  Meaningful error messages are displayed.  And we didn&#8217;t lose any of the data from the old version of the program.</p>
<p>Are we done?  Everything does indeed work like we want it to, but it&#8217;s still not ideal.  Let&#8217;s take another look at the definition of <code>authenticateUser</code>:</p>
<div class="vim"><code>authenticateUser <span class="Statement">::</span> MonadState UserDirectory m <span class="Statement">=&gt;</span> String <span class="Statement">&#0045;&gt;</span> PasswordHash <span class="Statement">&#0045;&gt;</span> ClockTime <span class="Statement">&#0045;&gt;</span> m (Maybe UserError)<br />
authenticateUser name passHash when <span class="Statement">=</span> <span class="Statement">do</span><br />
&nbsp; &nbsp; UserDirectory dir <span class="Statement">&lt;&#0045;</span> get<br />
&nbsp; &nbsp; <span class="Statement">case</span> M&#0046;lookup name dir <span class="Statement">of</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; Nothing <span class="Statement">&#0045;&gt;</span> return <span class="Statement">$</span> Just NoSuchUser<br />
&nbsp; &nbsp; &nbsp; &nbsp; Just user <span class="Statement">&#0045;&gt;</span> <span class="Statement">if</span> isLocked when user<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="Statement">then</span> return <span class="Statement">$</span> fmap AccountLocked <span class="Statement">$</span> usLocked user<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="Statement">else</span> <span class="Statement">if</span> passHash <span class="Statement">==</span> usPassword user<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="Statement">then</span> <span class="Statement">do</span> put <span class="Statement">$</span> UserDirectory <span class="Statement">$</span> M&#0046;insert name (unlockUser user) dir<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; return Nothing<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="Statement">else</span> <span class="Statement">do</span> put <span class="Statement">$</span> UserDirectory <span class="Statement">$</span> M&#0046;insert name (failUser when user) dir<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; return <span class="Statement">$</span> Just PasswordMismatch</code></div>
<p>It <em>works</em>, but even if you don&#8217;t know Haskell, it still <a href="http://en.wikipedia.org/wiki/Code_smell">smells bad</a>.  There&#8217;s a lot of nesting going on, as seen by the ever-increasing level of indentation.  In fact, every time there&#8217;s a way to fail (no such user exists, the account is locked, etc.), we need another conditional to handle the error case, forcing the rest of the computation another level deeper.</p>
<p>Less obvious is the bad-smelling use of <code>Maybe UserError</code>.  Typically, the <code>Maybe</code> monad is used to carry the result of a <em>successful</em> computation &#8212; in fact, <a href="http://en.wikipedia.org/wiki/Monad_(functional_programming)#Examples">when <code>Maybe</code> is used as a monad</a>, that&#8217;s precisely what it does.  This is the opposite of how we&#8217;re using it: to carry the result of an <em>unsuccessful</em> computation.  This code is violating the <a href="http://en.wikipedia.org/wiki/Principle_of_least_astonishment">principle of least astonishment</a> and is liable to confuse anyone expecting <code>Maybe</code> to be used the way it normally is.</p>
<p>Still, we want to carry detailed information about errors so the user can be informed.  Is there a better way to handle errors?</p>
<p>Of course there is.  That&#8217;s the subject of the next post.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kuliniewicz.org/blog/archives/2009/04/18/data-migration-in-happstackstate/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Protecting passwords for fun and profit</title>
		<link>http://www.kuliniewicz.org/blog/archives/2009/04/11/protecting-passwords-for-fun-and-profit/</link>
		<comments>http://www.kuliniewicz.org/blog/archives/2009/04/11/protecting-passwords-for-fun-and-profit/#comments</comments>
		<pubDate>Sat, 11 Apr 2009 20:04:44 +0000</pubDate>
		<dc:creator>Paul Kuliniewicz</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[happstack]]></category>
		<category><![CDATA[haskell]]></category>
		<category><![CDATA[password]]></category>

		<guid isPermaLink="false">http://www.kuliniewicz.org/blog/?p=1287</guid>
		<description><![CDATA[Last time, we looked at a simple way to use Happstack to store a bunch of user names and passwords. In fact, it was a bit too simple &#8212; an attacker with access to the disk we&#8217;re storing the data on can trivially recover each user&#8217;s password! How can this be, when we were so [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.kuliniewicz.org/blog/archives/2009/04/05/happstackstate-the-basics/">Last time</a>, we looked at a simple way to use <a href="http://happstack.com/">Happstack</a> to store a bunch of user names and passwords.  In fact, it was a bit <em>too</em> simple &#8212; an attacker with access to the disk we&#8217;re storing the data on can trivially recover each user&#8217;s password!  How can this be, when we were so careful to only store hashes of passwords in our user directory, instead of the passwords themselves?</p>
<p>Before I get to that, let&#8217;s spend a modicum of time <a href="http://en.wikipedia.org/wiki/Code_refactoring">refactoring</a> the code we&#8217;re working with into two modules: one that <em>implements</em> our user directory and one that <em>uses</em> our user directory.  The <code>Users</code> module will only expose the interface for the operations we&#8217;re providing, hiding the implementation details:</p>
<div class="vim"><code><span class="Type">module</span> Users ( UserDirectory<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; , AddUser (<span class="Statement">&#0046;&#0046;</span>)<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; , CheckPassword (<span class="Statement">&#0046;&#0046;</span>)<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; , ListUsers (<span class="Statement">&#0046;&#0046;</span>)<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; ) <span class="Type">where</span></code></div>
<p>Here, the interface only consists of the queries and updates that clients can make, as well as the top-level type of the data being saved in the MACID store.  The <code>Main</code> module implements our command loop and imports <code>Users</code> to do the actual work:</p>
<div class="vim"><code><span class="Type">module</span> Main ( main<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; ) <span class="Type">where</span><br />
&nbsp;<br />
<span class="PreProc">import</span> Users</code></div>
<p>One last change: to make it easier for successive versions of the program to share the same files backing our MACID store, we&#8217;ll explicitly set the name of the program &#8212; Happstack.State names the directory where it stores the data according to this name:</p>
<div class="vim"><code>main <span class="Statement">::</span> IO ()<br />
main <span class="Statement">=</span> withProgName <span class="Constant">&quot;state&#0045;demo&quot;</span> <span class="Statement">$</span> <span class="Statement">do</span><br />
&nbsp; &nbsp; state <span class="Statement">&lt;&#0045;</span> startSystemState macidPro&#120;y<br />
&nbsp; &nbsp; commandLoop state</code></div>
<p>For convenience, <a href='http://www.kuliniewicz.org/blog/wp-content/uploads/2009/04/v1tar.gz'>here&#8217;s a tarball containing the complete program</a>.  Let&#8217;s give it a try:</p>
<blockquote><p><tt>&gt; list<br />
&gt; add pmk swordfish<br />
User added<br />
&gt; list<br />
pmk (joined Sat Apr 11 14:57:12 EDT 2009)<br />
&gt; add cowboy sourMilk7<br />
User added<br />
&gt; list<br />
cowboy (joined Sat Apr 11 14:57:28 EDT 2009)<br />
pmk (joined Sat Apr 11 14:57:12 EDT 2009)<br />
&gt; login cowboy swordfish<br />
Bad account or password<br />
&gt; login pmk whatever<br />
Bad account or password<br />
&gt; login pmk swordfish<br />
Success<br />
&gt; checkpoint<br />
&gt; quit</tt></p></blockquote>
<p>Since we named our program <code>state-demo</code>, Happstack.State stores its files in the directory <tt>_local/state-demo_state/</tt> under the current directory.  If we look at the checkpoint we made, which will contain the contents of the <code>UserDirectory</code> we created, we can see there aren&#8217;t any passwords in it, just as we expected:</p>
<blockquote><p><tt>0000000: 0000 0000 0000 0001 0000 0000 0000 0013&nbsp; &#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;<br />
0000010: 5573 6572 732e 5573 6572 4469 7265 6374&nbsp; Users&#0046;UserDirect<br />
0000020: 6f72 7900 0000 0000 0001 8100 0000 0000&nbsp; ory&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;<br />
0000030: 0000 0000 0000 0000 0000 0004 a57b 6cd0&nbsp; &#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;{l&#0046;<br />
0000040: 6cbc 5d4d 0000 0120 968a e2ad 0000 0000&nbsp; l&#0046;]M&#0046;&#0046;&#0046; &#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;<br />
0000050: 0000 0000 0000 0000 0000 0014 3139 3238&nbsp; &#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;1928<br />
0000060: 3338 3631 3537 2035 3739 3432 3634 3530&nbsp; 386157 579426450<br />
0000070: 0000 0000 0000 0000 0000 0000 0000 0000&nbsp; &#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;<br />
0000080: 0200 0000 0000 0000 0663 6f77 626f 7900&nbsp; &#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;cowboy&#0046;<br />
0000090: 0000 0000 0000 0000 0000 0000 0000 0000&nbsp; &#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;<br />
00000a0: 0000 0000 0000 0000 402f 2709 fa39 c1a1&nbsp; &#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;@/&#39;&#0046;&#0046;9&#0046;&#0046;<br />
00000b0: 7f91 1c05 4ae1 f1b3 3260 e5ee 28b8 1963&nbsp; &#0046;&#0046;&#0046;&#0046;J&#0046;&#0046;&#0046;2`&#0046;&#0046;(&#0046;&#0046;c<br />
00000c0: a455 122d f04a 8235 1147 fe5c 92a3 0b4f&nbsp; &#0046;U&#0046;&#0045;&#0046;J&#0046;5&#0046;G&#0046;&#0092;&#0046;&#0046;&#0046;O<br />
00000d0: 7540 0754 3be1 939f 2d66 272f 0f09 ffdf&nbsp; u@&#0046;T;&#0046;&#0046;&#0046;&#0045;f&#39;/&#0046;&#0046;&#0046;&#0046;<br />
00000e0: 8c9d 05e2 3b8e b5b4 db00 0000 0000 0000&nbsp; &#0046;&#0046;&#0046;&#0046;;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;<br />
00000f0: 0a23 cde5 ac26 17c4 6019 a700 0000 0000&nbsp; &#0046;#&#0046;&#0046;&#0046;&amp;&#0046;&#0046;`&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;<br />
0000100: 0000 0000 0049 e0e8 1801 0100 0000 0000&nbsp; &#0046;&#0046;&#0046;&#0046;&#0046;I&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;<br />
0000110: 0000 0500 3cb8 192e 0000 0000 0000 0003&nbsp; &#0046;&#0046;&#0046;&#0046;&lt;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;<br />
0000120: 706d 6b00 0000 0000 0000 0000 0000 0000&nbsp; pmk&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;<br />
0000130: 0000 0000 0000 0000 0000 0000 4035 6924&nbsp; &#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;@5i$<br />
0000140: cb51 f37b f7ae af8d 1953 6d44 e90b 5d4a&nbsp; &#0046;Q&#0046;{&#0046;&#0046;&#0046;&#0046;&#0046;SmD&#0046;&#0046;]J<br />
0000150: 200d 2925 8e2d 4ed7 9aa4 7b59 9d47 ed5d&nbsp;&nbsp; &#0046;)%&#0046;&#0045;N&#0046;&#0046;&#0046;{Y&#0046;G&#0046;]<br />
0000160: 8626 bef6 1e8d 6e9b 1bf7 7689 daeb facb&nbsp; &#0046;&amp;&#0046;&#0046;&#0046;&#0046;n&#0046;&#0046;&#0046;v&#0046;&#0046;&#0046;&#0046;&#0046;<br />
0000170: a9c0 ab09 ae76 272c 3b26 ce22 de00 0000&nbsp; &#0046;&#0046;&#0046;&#0046;&#0046;v&#39;,;&amp;&#0046;&#34;&#0046;&#0046;&#0046;&#0046;<br />
0000180: 0000 0000 0af1 4a1c a2dd ef72 1ea8 8e00&nbsp; &#0046;&#0046;&#0046;&#0046;&#0046;&#0046;J&#0046;&#0046;&#0046;&#0046;r&#0046;&#0046;&#0046;&#0046;<br />
0000190: 0000 0000 0000 0000 0049 e0e8 0801 0100&nbsp; &#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;I&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;<br />
00001a0: 0000 0000 0000 0500 9c69 30dd&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;i0&#0046</tt></p></blockquote>
<p>However, when we look at the transaction log from before we made the checkpoint, we&#8217;re in for a nasty surprise:</p>
<blockquote><p><tt>0000000: 0000 0000 0000 0000 0000 0000 0000 0000&nbsp; &#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;<br />
0000010: 0000 0000 0000 0000 023f cf48 bc4d aa07&nbsp; &#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;?&#0046;H&#0046;M&#0046;&#0046;<br />
0000020: 1400 0001 2096 8a62 f600 0000 0000 0000&nbsp; &#0046;&#0046;&#0046;&#0046; &#0046;&#0046;b&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;<br />
0000030: 0000 0000 0000 0000 1531 3438 3835 3032&nbsp; &#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;1488502<br />
0000040: 3432 3820 3138 3234 3336 3735 3531 0000&nbsp; 428 1824367551&#0046;&#0046;<br />
0000050: 0000 0000 0000 0000 0000 0000 000d 5573&nbsp; &#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;Us<br />
0000060: 6572 732e 4164 6455 7365 7200 0000 0000&nbsp; ers&#0046;AddUser&#0046;&#0046;&#0046;&#0046;&#0046;<br />
0000070: 0000 2400 0000 0000 0000 0000 0000 0000&nbsp; &#0046;&#0046;$&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;<br />
0000080: 0000 0370 6d6b 0000 0000 0000 0009 7377&nbsp; &#0046;&#0046;&#0046;pmk&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;<b>sw</b><br />
0000090: 6f72 6466 6973 6800 0000 0000 0000 0000&nbsp; <b>ordfish</b>&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;<br />
00000a0: 0000 0000 0000 0000 0000 0000 0000 0004&nbsp; &#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;<br />
00000b0: a57b 6cd0 6cbc 5d4d 0000 0120 968a 9e86&nbsp; &#0046;{l&#0046;l&#0046;]M&#0046;&#0046;&#0046; &#0046;&#0046;&#0046;&#0046;<br />
00000c0: 0000 0000 0000 0000 0000 0000 0000 0014&nbsp; &#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;<br />
00000d0: 3139 3238 3338 3631 3537 2035 3739 3432&nbsp; 1928386157 57942<br />
00000e0: 3634 3530 0000 0000 0000 0000 0000 0000&nbsp; 6450&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;<br />
00000f0: 0000 000d 5573 6572 732e 4164 6455 7365&nbsp; &#0046;&#0046;&#0046;&#0046;Users&#0046;AddUse<br />
0000100: 7200 0000 0000 0000 2700 0000 0000 0000&nbsp; r&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#39;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;<br />
0000110: 0000 0000 0000 0000 0663 6f77 626f 7900&nbsp; &#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;cowboy&#0046;<br />
0000120: 0000 0000 0000 0973 6f75 724d 696c 6b37&nbsp; &#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;<b>sourMilk7</b></tt></p></blockquote>
<p>The passwords, in the clear where anyone can see them!  What happened?</p>
<p>In retrospect, it&#8217;s obvious.  Between checkpoints, Happstack.State saves a transaction log of all updates, so that if the program quits prematurely it can recover and not lose any data.  While we were careful to hash the passwords in the MACID store itself, the <code>AddUser</code> update operation <em>didn&#8217;t</em>, so when those are saved to the transaction log, so are the passwords.</p>
<p>The moral is this: <strong>Never pass any arguments to an update that you don&#8217;t mind being written to disk.</strong></p>
<p>Now that we&#8217;ve learned our lesson, let&#8217;s rewrite <code>addUser</code> so that it takes a <code>PasswordHash</code> as an argument, instead of the password itself:</p>
<div class="vim"><code>addUser <span class="Statement">::</span> MonadState UserDirectory m <span class="Statement">=&gt;</span> String <span class="Statement">&#0045;&gt;</span> PasswordHash <span class="Statement">&#0045;&gt;</span> ClockTime <span class="Statement">&#0045;&gt;</span> m Bool<br />
addUser name passHash when <span class="Statement">=</span> <span class="Statement">do</span><br />
&nbsp; &nbsp; UserDirectory dir <span class="Statement">&lt;&#0045;</span> get<br />
&nbsp; &nbsp; <span class="Statement">if</span> M&#0046;member name dir<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span class="Statement">then</span> return False<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span class="Statement">else</span> <span class="Statement">do</span> put <span class="Statement">$</span> UserDirectory <span class="Statement">$</span> M&#0046;insert name (UserInfo passHash when) dir<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; return True</code></div>
<p>That&#8217;ll work, but we&#8217;d rather not force our client to know how passwords get hashed &#8212; <code>PasswordHash</code> should be an implementation detail, not part of the interface.  So let&#8217;s make a function that hashes the password, and <em>then</em> calls the update:</p>
<div class="vim"><code>createUser <span class="Statement">::</span> MonadIO m <span class="Statement">=&gt;</span> String <span class="Statement">&#0045;&gt;</span> String <span class="Statement">&#0045;&gt;</span> m Bool<br />
createUser name pass <span class="Statement">=</span> <span class="Statement">do</span><br />
&nbsp; &nbsp; salt <span class="Statement">&lt;&#0045;</span> liftIO newSalt<br />
&nbsp; &nbsp; now <span class="Statement">&lt;&#0045;</span> liftIO getClockTime<br />
&nbsp; &nbsp; update <span class="Statement">$</span> AddUser name (hashPassword salt pass) now</code></div>
<p>But wait, since <code>createUser</code> is taking the password as an argument, doesn&#8217;t it have the same saving-passwords-to-the-transaction-log flaw that we just tried to fix in <code>AddUser</code>?  No, because <code>createUser</code> <em>isn&#8217;t</em> an update operation &#8212; it&#8217;s just a function that <em>calls</em> the update operation.  More specifically, it sets up the arguments needed for <code>AddUser</code> (hashing the password and getting the current time), and then invokes <code>AddUser</code> with those arguments.  <em>Those</em> are the arguments that get saved to the transaction log, not the ones that <code>createUser</code> gets.</p>
<p>If you aren&#8217;t convinced, look at the type signature for <code>createUser</code> again, as compared to the one for <code>addUser</code>:</p>
<div class="vim"><code>createUser <span class="Statement">::</span> MonadIO m <span class="Statement">=&gt;</span> String <span class="Statement">&#0045;&gt;</span> String <span class="Statement">&#0045;&gt;</span> m Bool<br />
&nbsp;<br />
addUser <span class="Statement">::</span> MonadState UserDirectory m <span class="Statement">=&gt;</span> String <span class="Statement">&#0045;&gt;</span> PasswordHash <span class="Statement">&#0045;&gt;</span> ClockTime <span class="Statement">&#0045;&gt;</span> m Bool</code></div>
<p><code>addUser</code> operates in a monad that implements <code>MonadState UserDirectory</code>, which is what makes it an update operation.  Our original code used <code>Update UserDirectory</code> as the monad, which is the concrete type of the monad that Happstack.State uses.  Really, though, our code doesn&#8217;t care <em>what</em> the specific monad is, as long as it lets us read and write a state value of type <code>UserDirectory</code>, which is what <code>MonadState UserDirectory</code> guarantees.</p>
<p><code>createUser</code>, on the other hand, operates in a monad that implements <code>MonadIO</code>, a generalization of the <code>IO</code> monad that lets Haskell programs interact with the outside world.  This is why <code>createUser</code> can call <code>IO</code> operations like <code>getClockTime</code>, which would be illegal inside a query or update, since the <code>Update UserDirectory</code> monad doesn&#8217;t implement <code>MonadIO</code>.</p>
<p>And if you <em>still</em> don&#8217;t believe me, observe that we aren&#8217;t creating any query or update operations using the template deep magic:</p>
<div class="vim"><code><span class="Statement">$</span>(mkMethods ''UserDirectory ['addUser, 'checkPassword, 'listUsers])</code></div>
<p>See?  <code>createUser</code> doesn&#8217;t appear anywhere in there; it&#8217;s just an ordinary function.</p>
<p>Since we want clients to call <code>createUser</code> instead of using <code>AddUser</code>, let&#8217;s change the list of exports from our module:</p>
<div class="vim"><code><span class="Type">module</span> Users ( UserDirectory<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; , createUser<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; , CheckPassword (<span class="Statement">&#0046;&#0046;</span>)<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; , ListUsers (<span class="Statement">&#0046;&#0046;</span>)<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; ) <span class="Type">where</span></code></div>
<p>And, since we changed the interface, we&#8217;ll need to update part of the <code>Main</code> module that uses it:</p>
<div class="vim"><code><span class="Comment">&#0045;&#0045; inside commandLoop:</span><br />
&nbsp; &nbsp; &nbsp; processCommand state [<span class="Constant">&quot;add&quot;</span>, user, pass] <span class="Statement">=</span> <span class="Statement">do</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; success <span class="Statement">&lt;&#0045;</span> createUser user pass<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; putStrLn <span class="Statement">$</span> <span class="Statement">if</span> success <span class="Statement">then</span> <span class="Constant">&quot;User added&quot;</span> <span class="Statement">else</span> <span class="Constant">&quot;User already e&#120;ists&quot;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; commandLoop state</code></div>
<p>Clients can be blissfully unaware of the hashing that&#8217;s going on behind the scenes inside <code>createUser</code>.</p>
<p><a href='http://www.kuliniewicz.org/blog/wp-content/uploads/2009/04/v2tar.gz'>Here&#8217;s a tarball of version 2 of our program</a>, with all the changes above (and a few others) applied to it.  Starting with the same state from version 1, let&#8217;s try creating a new user and prove once and for all we&#8217;ve abolished plaintext passwords:</p>
<blockquote><p><tt>&gt; list<br />
cowboy (joined Sat Apr 11 14:57:28 EDT 2009)<br />
pmk (joined Sat Apr 11 14:57:12 EDT 2009)<br />
&gt; add bobby nowsecure<br />
User added<br />
&gt; list<br />
bobby (joined Sat Apr 11 15:31:54 EDT 2009)<br />
cowboy (joined Sat Apr 11 14:57:28 EDT 2009)<br />
pmk (joined Sat Apr 11 14:57:12 EDT 2009)<br />
&gt; login bobby nowsecure<br />
Success<br />
&gt; checkpoint<br />
&gt; quit</tt></p></blockquote>
<p>And the transaction log from this session:</p>
<blockquote><p><tt>0000000: 0000 0000 0000 0000 0000 0000 0000 0000&nbsp; &#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;<br />
0000010: 0000 0000 0000 0000 0656 a5f9 68e9 e949&nbsp; &#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;V&#0046;&#0046;h&#0046;&#0046;I<br />
0000020: 8300 0001 2096 aa27 f300 0000 0000 0000&nbsp; &#0046;&#0046;&#0046;&#0046; &#0046;&#0046;&#39;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;<br />
0000030: 0000 0000 0000 0000 1531 3839 3231 3631&nbsp; &#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;1892161<br />
0000040: 3433 3820 3230 3531 3836 3033 3930 0000&nbsp; 438 2051860390&#0046;&#0046;<br />
0000050: 0000 0000 0000 0000 0000 0000 000d 5573&nbsp; &#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;Us<br />
0000060: 6572 732e 4164 6455 7365 7200 0000 0000&nbsp; ers&#0046;AddUser&#0046;&#0046;&#0046;&#0046;&#0046;<br />
0000070: 0000 9500 0000 0000 0000 0000 0000 0000&nbsp; &#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;<br />
0000080: 0000 0562 6f62 6279 0000 0000 0000 0000&nbsp; &#0046;&#0046;&#0046;bobby&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;<br />
0000090: 0000 0000 0000 0000 40c4 a004 6a7d 0468&nbsp; &#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;@&#0046;&#0046;&#0046;j}&#0046;h<br />
00000a0: dd6a aef6 f047 b48d fe4d 03b9 6d42 daf2&nbsp; &#0046;j&#0046;&#0046;&#0046;G&#0046;&#0046;&#0046;M&#0046;&#0046;mB&#0046;&#0046;<br />
00000b0: d647 fe57 ba4a c04a c67d a008 5abc 44fe&nbsp; &#0046;G&#0046;W&#0046;J&#0046;J&#0046;}&#0046;&#0046;Z&#0046;D&#0046;<br />
00000c0: 1ee0 d21e 7140 e619 7a13 db8c 0060 4ee2&nbsp; &#0046;&#0046;&#0046;&#0046;q@&#0046;&#0046;z&#0046;&#0046;&#0046;&#0046;`N&#0046;<br />
00000d0: 6d05 dd16 7f0f 4493 0000 0000 0000 0000&nbsp; m&#0046;&#0046;&#0046;&#0046;&#0046;D&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;<br />
00000e0: 0a36 b95d 8dd4 1919 9fcf c200 0000 0000&nbsp; &#0046;6&#0046;]&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;<br />
00000f0: 0000 0000 0049 e0f0 2a01 0100 0000 0000&nbsp; &#0046;&#0046;&#0046;&#0046;&#0046;I&#0046;&#0046;*&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;&#0046;<br />
0000100: 0000 0540 e032 c9e7&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &#0046;&#0046;&#0046;@&#0046;2&#0046;&#0046;</tt></p></blockquote>
<p>Success!  We can clearly see the <code>AddUser</code> for bobby, but can&#8217;t see his password.  Even the transaction log only stores a hash of the password.</p>
<p>By the way, if you&#8217;re wondering why we never had to make any changes to <code>checkPassword</code>, which also takes a plaintext password as an argument, that&#8217;s because <code>checkPassword</code> is a query, and queries don&#8217;t get saved to the transaction log.  That&#8217;s because queries are guaranteed not to change the contents of the MACID store, and thus can be ignored for recovery purposes.  It&#8217;s safe to pass sensitive information as the arguments to a query, but not to an update.</p>
<p>Unfortunately, the no-sensitive-information-in-updates rule is something I only learned <em>after</em> writing <a href="http://www.kuliniewicz.org/blog/archives/2009/04/01/the-button-now-in-open-beta/">The Button</a>, so if you signed up for an account, the password you chose was exposed in the transaction log.  I&#8217;ve taken the liberty of <a href="http://lambda-diode.com/software/wipe">wiping</a> those files, but in the unlikely event that you specified a password you use for other accounts too, I highly suggest you change those passwords, unless for some reason you trust me not to do anything nefarious with the passwords I at one time had access to.</p>
<p>Of course, there are still ways we could improve our program&#8217;s security, namely by enforcing that users pick <a href="http://en.wikipedia.org/wiki/Password_strength">strong passwords</a> and preventing online brute force attacks.  That will lead us into exploring the data migration features of Happstack.State.  Stay tuned.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kuliniewicz.org/blog/archives/2009/04/11/protecting-passwords-for-fun-and-profit/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Happstack.State &#8211; the basics</title>
		<link>http://www.kuliniewicz.org/blog/archives/2009/04/05/happstackstate-the-basics/</link>
		<comments>http://www.kuliniewicz.org/blog/archives/2009/04/05/happstackstate-the-basics/#comments</comments>
		<pubDate>Mon, 06 Apr 2009 01:04:32 +0000</pubDate>
		<dc:creator>Paul Kuliniewicz</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[happstack]]></category>
		<category><![CDATA[haskell]]></category>
		<category><![CDATA[state]]></category>

		<guid isPermaLink="false">http://www.kuliniewicz.org/blog/?p=1271</guid>
		<description><![CDATA[One of the surprising things about using Happstack for developing web apps is that, unlike most other frameworks out there, it doesn&#8217;t use a conventional database to save information. Instead, it implements something it calls a MACID store: an in-memory user-defined data store that provides ACID guarantees (atomicity, consistency, isolation, and durability). There are several [...]]]></description>
			<content:encoded><![CDATA[<p>One of the surprising things about using <a href="http://happstack.com/">Happstack</a> for developing web apps is that, unlike most other frameworks out there, it doesn&#8217;t use a conventional database to save information.  Instead, it implements something it calls a <a href="http://tutorial.happstack.com/tutorial/introductiontomacid">MACID store</a>: an in-memory user-defined data store that provides <a href="http://en.wikipedia.org/wiki/ACID">ACID guarantees</a> (<a href="http://en.wikipedia.org/wiki/Atomicity_(database_systems)">atomicity</a>, <a href="http://en.wikipedia.org/wiki/Database_consistency">consistency</a>, <a href="http://en.wikipedia.org/wiki/Isolation_(database_systems)">isolation</a>, and <a href="http://en.wikipedia.org/wiki/Durability_(computer_science)">durability</a>).</p>
<p>There are several advantages to using MACID instead of a <a href="http://en.wikipedia.org/wiki/Relational_database_management_system">relational database</a> with perhaps an <a href="http://en.wikipedia.org/wiki/Object-relational_mapping">object-relational mapper</a> on top of it.  The biggest is that the program can use whatever data structures and data types are most convenient, instead of being forced to stick with things that map nicely to database tables.  It also simplifies setup of the application: there&#8217;s no need to configure a separate database server to let the application talk to it.</p>
<p>As an example of using Happstack&#8217;s MACID store, implemented in the <a href="http://hackage.haskell.org/cgi-bin/hackage-scripts/package/happstack-state">Happstack.State</a> module, the following is a simplified version of the user account directory that I wrote as part of The Button.</p>
<p>Before we get to that, though, let&#8217;s take a moment to consider how the program is going to store passwords.  Being security-conscious, we aren&#8217;t going to store the passwords themselves &#8212; if an attacker were to get access to the files backing the MACID store, he would learn the passwords for all users in the system.  Instead, we&#8217;ll store a <a href="http://en.wikipedia.org/wiki/Cryptographic_hash_function">one-way hash</a> of each user&#8217;s password.  Given a password, it&#8217;s easy to compute the corresponding hash, but extremely difficult to take a hash and compute the password it was derived from.  That way, the program can check if a user&#8217;s password matches the one for the account, without ever having to remember what the password actually is.</p>
<p>Actually, it&#8217;s not quite that simple.  Just using a hash still lets an attacker perform a <a href="http://en.wikipedia.org/wiki/Dictionary_attack">dictionary attack</a>, perhaps using a <a href="http://en.wikipedia.org/wiki/Rainbow_table">rainbow table</a>, if he gets a copy of the hashed passwords.  The basic idea is that the attacker would guess a password, hash it, and see if the hash matches the password hash for any users.  To foil this, we <a href="http://en.wikipedia.org/wiki/Salt_(cryptography)">salt</a> passwords before hashing them, appending some random data before running the hash function.  The salt values don&#8217;t need to be secret; they just need to be different for each user, to guarantee that even if two users have the same password, the hashes will be different.</p>
<p>And just to ruin the attacker&#8217;s day even more, we do <a href="http://en.wikipedia.org/wiki/Key_strengthening">key strengthening</a> by running the hash function over and over instead of just once, to increase the amount of work an attacker needs to do to brute-force a password.  Since a hash can be computed fairly quickly, the user doesn&#8217;t notice if we run the function one time or 100 times.  The extra work only becomes noticeable if you&#8217;re trying to guess lots of passwords, which slows down an attacker.</p>
<p>OK, enough about password security.  Here&#8217;s how we&#8217;ll represent a hashed password in our program:</p>
<div class="vim"><code><span class="Type">data</span> PasswordHash <span class="Statement">=</span> PasswordHash { pwHash <span class="Statement">::</span> [Word8]<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; , pwSalt <span class="Statement">::</span> [Word8]<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; }<br />
&nbsp; &nbsp; <span class="Type">deriving</span> (Typeable, Data)<br />
&nbsp;<br />
<span class="Type">instance</span> Version PasswordHash<br />
<span class="Statement">$</span>(deriveSerialize &#39;&#39;PasswordHash)</code></div>
<p>A <code>PasswordHash</code> is the hash itself, along with the salt used when computing it.  The <code>deriving (Typeable, Data)</code> and the rest is part of Happstack.State&#8217;s <a href="http://en.wikipedia.org/wiki/Deep_magic_(programming)">deep magic</a>, which we don&#8217;t need to worry about for this example.</p>
<p>The actual hashing of a password is performed by this function, which takes a salt value and the password, and computes its hash:</p>
<div class="vim"><code>hashPassword <span class="Statement">::</span> [Word8] <span class="Statement">&#0045;&gt;</span> String <span class="Statement">&#0045;&gt;</span> [Word8]<br />
hashPassword salt password <span class="Statement">=</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span class="Statement">let</span> passBytes <span class="Statement">=</span> listToOctets <span class="Statement">$</span> map ord password<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span class="Statement">in</span>&nbsp; (iterate step passBytes) <span class="Statement">!!</span> iterationCount<br />
&nbsp; &nbsp; <span class="Type">where</span> iterationCount <span class="Statement">=</span> <span class="Constant">100</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; step chain <span class="Statement">=</span> SHA512&#0046;hash (chain <span class="Statement">++</span> salt)</code></div>
<p>That function converts the password into a series of bytes and repeatedly appends the salt and runs it through the <a href="http://en.wikipedia.org/wiki/SHA_hash_functions#SHA-2_family">SHA-512</a> hash function, returning the result after doing that 100 times.</p>
<p>That&#8217;s enough about passwords for now.  Next, let&#8217;s consider what information we want to store about each user.  For this example, we&#8217;ll only care about the user&#8217;s password (hashed, of course), along with the time they first registered an account:</p>
<div class="vim"><code><span class="Type">data</span> UserInfo <span class="Statement">=</span> UserInfo { usPassword <span class="Statement">::</span> PasswordHash<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; , usJoined&nbsp;&nbsp; <span class="Statement">::</span> ClockTime<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; }<br />
&nbsp; &nbsp; <span class="Type">deriving</span> (Typeable, Data)<br />
&nbsp;<br />
<span class="Type">instance</span> Version UserInfo<br />
<span class="Statement">$</span>(deriveSerialize &#39;&#39;UserInfo)</code></div>
<p>The user directory itself will just be a <a href="http://en.wikipedia.org/wiki/Associative_array">map</a> from user names to the information for each user:</p>
<div class="vim"><code><span class="Type">newtype</span> UserDirectory <span class="Statement">=</span> UserDirectory (M&#0046;Map String UserInfo)<br />
&nbsp; &nbsp; <span class="Type">deriving</span> (Typeable, Data)<br />
&nbsp;<br />
<span class="Type">instance</span> Version UserDirectory<br />
<span class="Statement">$</span>(deriveSerialize &#39;&#39;UserDirectory)<br />
&nbsp;<br />
<span class="Type">instance</span> Component UserDirectory <span class="Type">where</span><br />
&nbsp; &nbsp; <span class="Type">type</span> Dependencies UserDirectory <span class="Statement">=</span> End<br />
&nbsp; &nbsp; initialValue <span class="Statement">=</span> UserDirectory M&#0046;empty</code></div>
<p>Since the user directory is the thing we&#8217;re actually going to put into the MACID store, there&#8217;s a few extra lines of boilerplate at the bottom to say what other MACID stores it depends on (none) and what the contents of a brand-new store is (empty).</p>
<p>That&#8217;s about all there is to defining what goes into the MACID store itself.  Next, we need to define the operations that either query the store without changing it, or update the store to a new value.</p>
<p>Let&#8217;s start with an easy one: a query that returns a list of users, paired with the date their account was created:</p>
<div class="vim"><code>listUsers <span class="Statement">::</span> Query UserDirectory [(String, ClockTime)]<br />
listUsers <span class="Statement">=</span> <span class="Statement">do</span><br />
&nbsp; &nbsp; UserDirectory dir <span class="Statement">&lt;&#0045;</span> ask<br />
&nbsp; &nbsp; return <span class="Statement">$</span> M&#0046;toList <span class="Statement">$</span> M&#0046;map usJoined dir</code></div>
<p>The type signature is perhaps the most interesting part: <code>Query UserDirectory [(String, ClockTime)]</code>.  This is a read-only query (<code>Query</code>) that operates on a <code>UserDirectory</code> and returns a list of <code>(String, ClockTime)</code> pairs.  The implementation of the function is simple: get the current value of the store using <code>ask</code>, then iterate over the map to get the user names and dates.</p>
<p>Here&#8217;s a more sophisticated query, one that takes a user name and password and sees if the password is correct:</p>
<div class="vim"><code>checkPassword <span class="Statement">::</span> String <span class="Statement">&#0045;&gt;</span> String <span class="Statement">&#0045;&gt;</span> Query UserDirectory Bool<br />
checkPassword name pass <span class="Statement">=</span> <span class="Statement">do</span><br />
&nbsp; &nbsp; UserDirectory dir <span class="Statement">&lt;&#0045;</span> ask<br />
&nbsp; &nbsp; <span class="Statement">case</span> M&#0046;lookup name dir <span class="Statement">of</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; Nothing&nbsp; &nbsp; &nbsp;&nbsp; <span class="Statement">&#0045;&gt;</span> return False<br />
&nbsp; &nbsp; &nbsp; &nbsp; Just userInfo <span class="Statement">&#0045;&gt;</span> <span class="Statement">let</span> PasswordHash hash salt <span class="Statement">=</span> usPassword userInfo<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; <span class="Statement">in</span>&nbsp; return <span class="Statement">$</span> hash <span class="Statement">==</span> hashPassword salt pass</code></div>
<p>Again, the type signature (<code>String -> String -> Query UserDirectory Bool</code>) shows this is a query on our <code>UserDirectory</code>, this time taking two strings (user name and password) as arguments and returning a boolean value.  The code looks up the <code>UserInfo</code>, if any, associated with the user, gets the salt, hashes the password provided as an argument to the query, and returns whether the two match.</p>
<p>Of course, it would help if we had a way to add users to the store.  That&#8217;s what this update operation does:</p>
<div class="vim"><code>addUser <span class="Statement">::</span> String <span class="Statement">&#0045;&gt;</span> String <span class="Statement">&#0045;&gt;</span> Update UserDirectory Bool<br />
addUser name pass <span class="Statement">=</span> <span class="Statement">do</span><br />
&nbsp; &nbsp; UserDirectory dir <span class="Statement">&lt;&#0045;</span> get<br />
&nbsp; &nbsp; <span class="Statement">if</span> M&#0046;member name dir<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span class="Statement">then</span> return False<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span class="Statement">else</span> <span class="Statement">do</span> salt <span class="Statement">&lt;&#0045;</span> newSalt<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; now <span class="Statement">&lt;&#0045;</span> liftM fi&#120;EventClockTime getEventClockTime<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="Statement">let</span> passwordHash <span class="Statement">=</span> PasswordHash (hashPassword salt pass) salt<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span class="Statement">let</span> userInfo <span class="Statement">=</span> UserInfo passwordHash now<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; put <span class="Statement">$</span> UserDirectory <span class="Statement">$</span> M&#0046;insert name userInfo dir<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; return True</code></div>
<p>Since this is an update, the type signature has <code>Update</code> in it instead of <code>Query</code>.  This means instead of using <code>ask</code> to get the current state, we use <code>get</code> to get the current state and <code>put</code> to set it to a new value.  The code itself first checks if a user by the name already exists, and if not, creates a new <code>UserInfo</code> structure with the user&#8217;s hashed password and the time that the update was made.</p>
<p>A couple things are worth pointing out.  First, inside a <code>Query</code> or <code>Update</code>, we can use the MACID store as a <a href="http://en.wikipedia.org/wiki/Random_number_generation">random number generator</a>, which is how a new salt is generated for the user:</p>
<div class="vim"><code><span class="Type">instance</span> Random Word8 <span class="Type">where</span><br />
&nbsp; &nbsp; randomR (lo, hi) rng <span class="Statement">=</span> <span class="Statement">let</span> (val, rng') <span class="Statement">=</span> randomR (fromIntegral lo, fromIntegral hi) rng<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; val <span class="Statement">::</span> Int<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;&nbsp; <span class="Statement">in</span>&nbsp; (fromIntegral val, rng')<br />
&nbsp; &nbsp; random rng <span class="Statement">=</span> randomR (minBound, ma&#120;Bound) rng<br />
&nbsp;<br />
newSalt <span class="Statement">::</span> AnyEv [Word8]<br />
newSalt <span class="Statement">=</span> sequence <span class="Statement">$</span> take saltLength <span class="Statement">$</span> repeat getRandom<br />
&nbsp; &nbsp; <span class="Type">where</span> saltLength <span class="Statement">=</span> <span class="Constant">10</span></code></div>
<p><code>AnyEv</code> is a <a href="http://en.wikipedia.org/wiki/Monad_(functional_programming)">monad</a> that both <code>Query</code> and <code>Update</code> work with.  The <code>newSalt</code> function generates 10 random bytes by first generating an infinite list of monadic computations that return random numbers (<code>repeat getRandom</code>), throwing away all but the first ten of them (<code>take saltLength</code>), and finally combining them into a single monadic computation that returns a list of random bytes (<code>sequence</code>) (instead of a list of monadic computations that each returns one random byte).</p>
<p>Annoyingly, Haskell doesn&#8217;t provide a direct way to generate random bytes (of type <code>Word8</code>), so we have to manually specify how <code>Word8</code> implements the <code>Random</code> typeclass.  The implementation is straightforward: to generate a random byte, first generate a random integer, and then convert it to a byte.</p>
<p>Returning to the update function, it&#8217;s also worth noting that <code>getEventClockTime</code> returns the time at which the update was logged, <em>not</em> the time the update is executing!  This is because the MACID store doesn&#8217;t save a new copy of the state to disk every time it changes.  Instead, it saves a log of what updates were made.  If the program terminates abnormally before the state can be checkpointed, the MACID store recovers by loading the most recent checkpoint and re-executing all the updates since then.  This is how the MACID store provides durability without completely thrashing the disk.</p>
<p>The astute reader will note that, if we ever do need to replay the <code>addUser</code> update, <code>newSalt</code> will be executed again!  Does this mean that an update that uses the RNG could get different values during recovery?  As it turns out, the MACID store also saves the state of the RNG, so replaying the update during recovery will generate the same random numbers that were obtained from the original execution of the update.  The particularly astute reader will conclude that the MACID store&#8217;s RNG is unsuitable for generating cryptographic keys, since an attacker with access to the MACID store&#8217;s backing files can predict what the RNG will return.  Fortunately, we don&#8217;t care if the salt is predictable, just that it&#8217;s different for each user.</p>
<p>OK, enough of that.  There&#8217;s just one little piece of deep magic we need to convert those functions into honest-to-goodness transaction handlers for the MACID store:</p>
<div class="vim"><code><span class="Statement">$</span>(mkMethods &#39;&#39;UserDirectory [&#39;addUser, &#39;checkPassword, &#39;listUsers])</code></div>
<p>That bit of magic generates a type constructor for each transaction function.  When invoking a transaction, instead of calling the functions themselves, we create an object to represent the request, and then pass it to <code>query</code> or <code>update</code> as appropriate.</p>
<p>Here&#8217;s an example of using our new MACID store.  The following implements a simple command interpreter that lets us manipulate our user directory:</p>
<div class="vim"><code>commandLoop <span class="Statement">::</span> MVar T&#120;Control <span class="Statement">&#0045;&gt;</span> IO ()<br />
commandLoop state <span class="Statement">=</span> <span class="Statement">do</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; putStr <span class="Constant">&quot;&gt; &quot;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; hFlush stdout<br />
&nbsp; &nbsp; &nbsp; &nbsp; command <span class="Statement">&lt;&#0045;</span> liftM words getLine<br />
&nbsp; &nbsp; &nbsp; &nbsp; processCommand state command<br />
&nbsp; &nbsp; <span class="Type">where</span> processCommand state [<span class="Constant">&quot;list&quot;</span>] <span class="Statement">=</span> <span class="Statement">do</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; people <span class="Statement">&lt;&#0045;</span> query ListUsers<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; mapM_ (putStrLn <span class="Statement">&#0046;</span> showUser) people<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; commandLoop state<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; processCommand state [<span class="Constant">&quot;add&quot;</span>, user, pass] <span class="Statement">=</span> <span class="Statement">do</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; success <span class="Statement">&lt;&#0045;</span> update <span class="Statement">$</span> AddUser user pass<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; putStrLn <span class="Statement">$</span> <span class="Statement">if</span> success <span class="Statement">then</span> <span class="Constant">&quot;User added&quot;</span> <span class="Statement">else</span> <span class="Constant">&quot;User already e&#120;ists&quot;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; commandLoop state<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; processCommand state [<span class="Constant">&quot;login&quot;</span>, user, pass] <span class="Statement">=</span> <span class="Statement">do</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; success <span class="Statement">&lt;&#0045;</span> query <span class="Statement">$</span> CheckPassword user pass<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; putStrLn <span class="Statement">$</span> <span class="Statement">if</span> success <span class="Statement">then</span> <span class="Constant">&quot;Success&quot;</span> <span class="Statement">else</span> <span class="Constant">&quot;Bad account or password&quot;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; commandLoop state<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; processCommand state [<span class="Constant">&quot;checkpoint&quot;</span>] <span class="Statement">=</span> <span class="Statement">do</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; createCheckpoint state<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; commandLoop state<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; processCommand _&nbsp; &nbsp;&nbsp; [<span class="Constant">&quot;quit&quot;</span>] <span class="Statement">=</span> return ()<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; processCommand state _&nbsp; &nbsp; &nbsp; &nbsp; <span class="Statement">=</span> <span class="Statement">do</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; putStrLn <span class="Constant">&quot;Unrecognized command&quot;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; commandLoop state<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; showUser (name, joined) <span class="Statement">=</span> name <span class="Statement">++</span> <span class="Constant">&quot; (joined &quot;</span> <span class="Statement">++</span> show joined <span class="Statement">++</span> <span class="Constant">&quot;)&quot;</span><br />
&nbsp;<br />
macidPro&#120;y <span class="Statement">::</span> Pro&#120;y UserDirectory<br />
macidPro&#120;y <span class="Statement">=</span> Pro&#120;y<br />
&nbsp;<br />
main <span class="Statement">::</span> IO ()<br />
main <span class="Statement">=</span> startSystemState macidPro&#120;y <span class="Statement">&gt;&gt;=</span> commandLoop</code></div>
<p>Note how, for example, instead of calling <code>addUser user pass</code>, we do <code>update $ AddUser user pass</code>, where the <code>AddUser</code> type constructor was generated automatically and takes the same arguments as <code>addUser</code>.  The <code>update</code> function does all the work of invoking <code>addUser</code> and making sure the result gets saved to the MACID store which enforcing all the ACID guarantees.</p>
<p>Instead of walking through how the command interpreter works, here&#8217;s a sample session, which should give you the idea:</p>
<blockquote><p><tt>&gt; list<br />
&gt; add paul <a href="http://tvtropes.org/pmwiki/pmwiki.php/Main/ThePasswordIsAlwaysSwordfish">swordfish</a><br />
User added<br />
&gt; list<br />
paul (joined Sun Apr&nbsp; 5 20:00:01 EDT 2009)<br />
&gt; add cowboy sourMilk7<br />
User added<br />
&gt; list<br />
cowboy (joined Sun Apr&nbsp; 5 20:00:13 EDT 2009)<br />
paul (joined Sun Apr&nbsp; 5 20:00:01 EDT 2009)<br />
&gt; login paul notHisPassword<br />
Bad account or password<br />
&gt; login nobody foobar<br />
Bad account or password<br />
&gt; login paul swordfish<br />
Success<br />
&gt; quit</tt></p></blockquote>
<p>The files backing the MACID store are, by default, in the <tt>_local</tt> directory under where you ran the program from: <tt>current-0000000000</tt>, the initial checkpoint of the (empty) user directory we started with; and <tt>events-0000000000</tt>, the log of all the updates that were applied.  If we had created a checkpoint, it would be in the directory too.</p>
<p>Now, quiz time.  Not counting <a href="http://www.kuliniewicz.org/blog/archives/2009/04/03/the-button-is-down/">all the security problems I already mentioned in The Button</a> that apply equally well to this example, there is <strong>an additional security flaw</strong> in the above code that could let an attacker steal passwords with little effort on his part.  Your quiz has two questions:</p>
<ol>
<li>How can the attacker steal passwords from the MACID store?</li>
<li>How can we change the program to prevent him from doing so?</li>
</ol>
<p>For your convenience, here is <a href='http://www.kuliniewicz.org/blog/wp-content/uploads/2009/04/vulnerable.hs'>Vulnerable.hs, the complete example described in the above post</a>.  Assuming you&#8217;ve <a href="http://www.kuliniewicz.org/blog/archives/2009/03/24/installing-ghc-610-on-ubuntu-intrepid/">installed GHC 6.10 and Happstack on your computer</a>, just do &#8220;<tt>runghc Vulnerable.hs</tt>&#8221; to try the program out yourself.</p>
<p>Next time: the answers to the quiz.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kuliniewicz.org/blog/archives/2009/04/05/happstackstate-the-basics/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Button is down</title>
		<link>http://www.kuliniewicz.org/blog/archives/2009/04/03/the-button-is-down/</link>
		<comments>http://www.kuliniewicz.org/blog/archives/2009/04/03/the-button-is-down/#comments</comments>
		<pubDate>Sat, 04 Apr 2009 02:42:30 +0000</pubDate>
		<dc:creator>Paul Kuliniewicz</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[happstack]]></category>
		<category><![CDATA[password]]></category>
		<category><![CDATA[security]]></category>
		<category><![CDATA[the button]]></category>

		<guid isPermaLink="false">http://www.kuliniewicz.org/blog/?p=1267</guid>
		<description><![CDATA[As some of you have already noticed, The Button is down and not coming back anytime soon, for several reasons. First, and most obviously, it was in part an April Fools&#8217; Day prank making fun of Twitter: a microblogging platform with a zero-character limit. I was hoping that &#8220;femtoblogging&#8221; would be a unique name, but [...]]]></description>
			<content:encoded><![CDATA[<p>As <a href="http://www.kuliniewicz.org/blog/archives/2009/04/01/the-button-now-in-open-beta/#comment-2899">some of you have already noticed</a>, <a href="http://www.kuliniewicz.org/button.html">The Button</a> is down and not coming back anytime soon, for several reasons.</p>
<p>First, and most obviously, it was in part an <a href="http://en.wikipedia.org/wiki/April_Fools%27_Day">April Fools&#8217; Day</a> prank making fun of Twitter: a microblogging platform with a zero-character limit.  I was hoping that &#8220;femtoblogging&#8221; would be a unique name, but as it turns out for each of the sub-micro <a href="http://en.wikipedia.org/wiki/SI_prefixes#List_of_SI_prefixes">SI prefixes</a>, there are plenty of hits for <var>prefix</var>blog, from <a href="http://www.google.com/search?hl=en&#038;q=nanoblog">nanoblog</a> down to <a href="http://www.google.com/search?hl=en&#038;q=yoctoblog">yoctoblog</a>.</p>
<p>Second, and primarily, I wrote it to get a feel for developing web apps in <a href="http://happstack.com/">Happstack</a>, a <a href="http://www.haskell.org/">Haskell</a>-based application server development framework.  The Button was trivial enough to be implemented over a weekend (plus a little polishing the following Monday evening), but nontrivial enough to let me play around with several features and get a better understanding of why <a href="http://tutorial.happstack.com/">the examples</a> are organized the way they are.  I definitely learned some things in writing The Button, which I&#8217;ll regale you with in the next few posts.</p>
<p>Third, and most pragmatically, the actual hosting of The Button was an ugly ugly hack.  Happstack requires <a href="http://haskell.org/ghc/">GHC</a> 6.10 (the Haskell <a href="http://en.wikipedia.org/wiki/Compiler">compiler</a>), and although my web host does indeed have GHC pre-installed, it only had version 6.6.  I tried compiling the latest version of GHC from source, but that failed once it exhausted the remaining 200 MB of my disk quota.  Downloading precompiled binaries was also impossible since the unpacked tarballs for those also required more than 200 MB of disk.  While I don&#8217;t foresee any issues in getting my quota increased, since I was trying to do this the evening of March 31, I couldn&#8217;t count on the turnaround time of the request being quick enough.</p>
<p>In short, The Button was running off queeg, my laptop.  The domain button.kuliniewicz.org was pointing to my home connection.  (It doesn&#8217;t anymore; it&#8217;s currently acting as a synonym for www.kuliniewicz.org, which I need to fix.)  In case you were wondering why it was running on a high-numbered <a href="http://en.wikipedia.org/wiki/TCP_and_UDP_port">port</a>, that&#8217;s why &#8212; there&#8217;s no way I was going to run a largely untested server as root on my home machine and open it to the world!  Naturally, using my laptop as a web server was hardly a long-term solution, so once April 1 passed I took it down.</p>
<p>Fourth, and most security-consciously, The Button&#8217;s password security was a joke.  Other than storing them with <a href="http://en.wikipedia.org/wiki/SHA_hash_functions#SHA-2_family">strong</a>, randomly <a href="http://en.wikipedia.org/wiki/Salt_(cryptography)">salted</a>, <a href="http://en.wikipedia.org/wiki/Key_strengthening">strengthened</a> hashes, it was bad.  Passwords were transmitted to the server in the clear.  There were no checks whatsoever for <a href="http://en.wikipedia.org/wiki/Password_strength">strong passwords</a>.  Nor was there any protection against online <a href="http://en.wikipedia.org/wiki/Brute_force_attack">brute force attacks</a> (which, incidentally, <a href="http://blog.wired.com/27bstroke6/2009/01/professed-twitt.html">Twitter fell victim to earlier this year</a>, with little &#8220;happiness&#8221; to be had by that compromised admin account).</p>
<p>So, I hope those of you who did actually register accounts with The Button didn&#8217;t use the same password you use for anything important.</p>
<p>If I had had more than a weekend to work on The Button, I would&#8217;ve addressed those issues, but I simply ran out of time.  I couldn&#8217;t in good conscience continue running a server with that many security vulnerabilities once the joke had passed.  That&#8217;s also why I&#8217;m reluctant to post the code that implemented The Button unless someone <em>really</em> wants to see it.  It&#8217;s not of good enough quality for someone to use as the basis of something real.</p>
<p>If for some reason you actually think The Button, or femtoblogging in general, actually has potential (I can actually think of one or two legitimate use cases for it, though I can also come up with better solutions for those use cases), I&#8217;ve demonstrated you can implement the core functionality over a weekend, even if you aren&#8217;t particularly well-versed in the framework or the language you&#8217;re using.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kuliniewicz.org/blog/archives/2009/04/03/the-button-is-down/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>

