Reading is dangerous (if you’re writing Haskell)

In Haskell, the read function is the usual, simple way to parse a String into a value of some other type:

ghci> :t read
read :: Read a => String -> a

read can parse anything that implements the aptly-named Read class. All the standard numeric types implement Read:

ghci> read "42" :: Int
42

Actually, it turns out the Read instance for integral types like Int is a bit too clever for its own good. Did you know (in GHC, at least) it can handle floating-point-style scientific notation, as long as the mantissa significand is an integer and the exponent is nonnegative?

ghci> read "42e2" :: Int
4200

This is nifty, but if the exponent is large, you can easily eat up all of GHC’s memory and crash the program:

ghci> read "1e1000000000" :: Int
<interactive>: out of memory (requested 45088768 bytes)
$

This is bad if the argument to read comes from an untrusted source. This was the subject of a recent security fix to happstack-server, where the entire server application could be brought down by sending it a request that used something like 1e1000000000 as a request parameter that would be parsed as an integral type. Of course, the vulnerability isn’t specific to happstack-server, but anything that tries to read an integer from untrusted input.

I don’t think Neutronium is currently vulnerable to this (other than being built against a vulnerable version of happstack-server at the moment), but that’s mostly because there isn’t yet anything that’s using read in this manner. One of the next changes I was planning to make was in how room membership is tracked, and part of that would be assigning each member of the room an integral identifier that is sent as a parameter to each room-related Ajax request. (This will support identifying who is who in the room, and solve the problem of having the same room open in multiple tabs without getting things all confused.) So, even if Neutronium isn’t vulnerable to this denial of service attack yet, it would be soon, and without having seen that posting, I don’t know if I would have even thought to worry about a seemingly simple built-in function like read for Int causing problems like that.

I can’t help but wonder if there are static source code analysis tools out there for Haskell that could find security-relevant flaws like this, like there are for more mainstream languages like C, C++, or Java. My gut tells me it’d be easier to write an analyzer for Haskell than for a language like C, but I don’t know if anyone has ever actually sat down to do it yet.

One Response

  1. Update: it appears that this is a bug in GHC; according to the Haskell Report (i.e. the language specification), reading integers isn’t supposed to support exponential notation at all.

Comments are closed.