Data migration in Happstack.State

Last time (modulo a distracting interlude), we looked at fixing a password disclosure vulnerability in our simple Happstack.State demo program. However, there are still some security flaws that leave passwords open to attack:

  • There are no checks for password strength — users are free to pick trivially weak passwords.
  • There is nothing preventing an online brute force attack against an account, where an attacker repeatedly guesses passwords and tries to log in until he finds the right one.

These two vulnerabilities are hardly theoretical. As I mentioned in an earlier post, not too long ago an administrator’s account on Twitter was broken into by someone running a dictionary attack. The number of login attempts per second in such an attack is only bounded by the network bandwidth between the attacker and the server, since any halfway competent attacker is going to use a script to do the work for him — it’s not like he’s sitting there trying in passwords as quickly as he can. And people being lazy, they’re liable to pick simple, easily guessed passwords unless we do something to stop them.

So let’s modify our program from last time to add some countermeasures to these attacks:

  • When creating an account, check whether the password is sufficiently strong, and refuse to create the account if it isn’t.
  • If we see repeated failed logins for an account, temporarily lock the account, limiting how quickly an attacker can guess passwords.

First, since we’re introducing even more ways operations can fail, it’d be nice if our program provided more feedback to the user about why an operation failed. A simple algebraic sum type will suffice:

data UserError = UserExists
               | NoSuchUser
               | PasswordMismatch
               | AccountLocked ClockTime
               | PasswordTooShort Int
    deriving (Eq, Ord, Typeable, Data)
 
instance Version UserError
$(deriveSerialize ''UserError)
 
instance Show UserError where
    show UserExists             = "A user by that name already exists."
    show NoSuchUser             = "No user by that name exists."
    show PasswordMismatch       = "Incorrect password."
    show (AccountLocked until)  = "Account is locked until " ++ show until ++ "."
    show (PasswordTooShort min) = "Password must be at least " ++ show min ++ " characters long."

Even if you don’t know Haskell, it should be fairly self-evident what’s going on. I’ll just note two things. First, we still need to implement the Version class and derive a serialization function via deriveSerialize even though we don’t plan to save a UserError in our MACID store, because it’s a requirement for anything we pass in or out of a query or update. Second, we implement Show ourselves instead of letting the compiler do it for us, so we can provide human-readable versions of each error.

The core of checking password strength is almost trivial: a function that takes a proposed password and returns a UserError if it doesn’t meet our exacting standards. Our code that actually creates the account will then call this function to check the strength of the user’s proposed password:

checkPasswordStrength :: String -> Maybe UserError
checkPasswordStrength pass = if length pass < 8
                                then Just $ PasswordTooShort 8
                                else Nothing

Admittedly, this isn’t much of a strength check, since it’s only looking at the length of the password. Heck, even the classic bad password “password” passes with flying colors. But this is good enough for demonstration purposes here; it’s trivial to modify the function to check more things, and presumably add new constructors to UserError accordingly.

The more interesting issue is that of temporarily locking accounts after some number of consecutive login failures. Clearly, for each account we need to keep track of how many failures there have been since the last successful login, whether the account is locked, and if so, when the lock will expire. Adding that information to UserInfo looks easy enough:

data UserInfo = UserInfo { usPassword :: PasswordHash
                         , usJoined   :: ClockTime
                         , usFailures :: Int                -- new
                         , usLocked   :: Maybe ClockTime    -- new
                         }
    deriving (Typeable, Data)

There’s a problem, though. Our MACID store contains records using the original definition of UserInfo, which lacks the last two fields. If that’s all we do, suddenly we’ll be unable to load our old data. That’s bad.

Fortunately, Happstack.State is one step ahead of us. Remember that Version typeclass that UserInfo, and all the other types we use with the MACID store, has to implement? That’s what provides our data migration path. First, we’ll need to keep the definition of the old version of UserInfo around, but with a different name. Ordinarily, we’d create a separate module to contain the old definitions, but here let’s just stick _0 to the names of everything, to denote “version 0″ to ourselves:

data UserInfo_0 = UserInfo_0 { usPassword_0 :: PasswordHash
                             , usJoined_0   :: ClockTime
                             }
    deriving (Typeable, Data)
 
instance Version UserInfo_0
$(deriveSerialize ''UserInfo_0)

The penultimate line there declares that UserInfo_0 is an instance of Version. The default implementation of Version, which we’ve used up until now, says the type is the first version (version 0). Our new and improved UserInfo is version 1 of this type, so its implementation of Version needs to state this explicitly:

instance Version UserInfo where
    mode = extension 1 (Proxy :: Proxy UserInfo_0)

This just says that UserInfo is version 1 of the type, and the previous version is what we’re now calling UserInfo_0. (The Proxy object is essentially no different than unit, but we can associate a particular type with it. Happstack.State uses it to pass type information to a function without needing to pass an instance of that type.)

OK, so UserInfo is the successor to UserInfo_0, but we still need to say how to migrate from the old version to the new version. That’s what the Migrate typeclass is for:

instance Migrate UserInfo_0 UserInfo where
    migrate (UserInfo_0 password joined) = UserInfo password joined 0 Nothing

In other words, to migrate from UserInfo_0 to UserInfo, copy the existing data over, set the login failure count to 0, and note that the account is not locked.

Now our new program will be able to read the data saved from the old one. When it tries to read a UserInfo object (version 1) but sees a UserInfo_0 object (version 0) instead, Happstack.State can automatically figure out how to perform the conversion via the migrate function we defined between the two types. For the curious, this page describes what happens behind the scenes to make this work.

Our data migration worries solved, we can turn our attention to implementing our account locking and unlocking. The easiest way forward is to define a bunch of simple helper functions for doing the basic steps, and then using those when implementing the login operation.

Checking if an account is locked is just a matter of seeing if the locked-until time, if it exists, is on or after than the current time:

isLocked :: ClockTime -> UserInfo -> Bool
isLocked asOf user = maybe False (>= asOf) $ usLocked user

Unlocking an account just means resetting the failure counter and clearing the locked-until time. Just note that, this being Haskell, we return a new UserInfo object instead of trying to modify the existing one (since we couldn’t modify it even if we tried):

unlockUser :: UserInfo -> UserInfo
unlockUser user = user { usFailures = 0, usLocked = Nothing }

Incrementing the failure count is slightly complicated by needing to lock the account if the failure count exceeds the limit. Here, if there’s been three or more consecutive failures, we lock the account for one minute:

failUser :: ClockTime -> UserInfo -> UserInfo
failUser when user = let newFailures = usFailures user + 1
                         lockedUntil = if newFailures >= failureThreshold
                                            then Just $ addToClockTime lockPeriod when
                                            else Nothing
                     in  user { usFailures = newFailures, usLocked = lockedUntil }
    where failureThreshold = 3
          lockPeriod = noTimeDiff { tdMin = 1 }

Finally, we can turn our attention to the actual login process. Login is now an update, since a login attempt, successful or unsuccessful, needs to update the failure count in the UserInfo. This means we need to beware of the no-plaintext-passwords-as-update-parameters rule we discovered previously. The workaround is the same as last time. First, an update operation that does the work:

authenticateUser :: MonadState UserDirectory m => String -> PasswordHash -> ClockTime -> m (Maybe UserError)
authenticateUser name passHash when = do
    UserDirectory dir <- get
    case M.lookup name dir of
        Nothing -> return $ Just NoSuchUser
        Just user -> if isLocked when user
                        then return $ fmap AccountLocked $ usLocked user
                        else if passHash == usPassword user
                                then do put $ UserDirectory $ M.insert name (unlockUser user) dir
                                        return Nothing
                                else do put $ UserDirectory $ M.insert name (failUser when user) dir
                                        return $ Just PasswordMismatch

AuthenticateUser needs a PasswordHash, so here’s a query that takes a plaintext password and hashes it using the salt for a particular user account:

hashPasswordFor :: MonadReader UserDirectory m => String -> String -> m (Either UserError PasswordHash)
hashPasswordFor name pass = do
    UserDirectory dir <- ask
    return $ case M.lookup name dir of
                Nothing   -> Left NoSuchUser
                Just user -> Right $ hashPassword (pwSalt $ usPassword user) pass

Finally, the loginUser function hashes the password using the HashPasswordFor query, and if it succeeds (meaning the user does indeed exist), runs an AuthenticateUser update using the result:

loginUser :: MonadIO m => String -> String -> m (Maybe UserError)
loginUser name pass = do
    hashResult <- query $ HashPasswordFor name pass
    case hashResult of
        Left error     -> return $ Just error
        Right passHash -> do now <- liftIO getClockTime
                             update $ AuthenticateUser name passHash now

And let’s not forget to update createUser to check the strength of a password:

createUser :: MonadIO m => String -> String -> m (Maybe UserError)
createUser name pass =
    case checkPasswordStrength pass of
        Nothing -> do salt <- liftIO newSalt
                      now <- liftIO getClockTime
                      update $ AddUser name (hashPassword salt pass) now
        excuse  -> return excuse

All that’s left is to adjust the command loop to expect to maybe get a UserError (or rather, definitely get a Maybe UserError), and print the error message if the command failed:

-- in commandLoop:
          processCommand state ["add", user, pass] = do
                    result <- createUser user pass
                    putStrLn $ maybe "Success" show result
                    commandLoop state
          processCommand state ["login", user, pass] = do
                    result <- loginUser user pass
                    putStrLn $ maybe "Success" show result
                    commandLoop state

Here’s the complete program. Let’s try it out, starting with the result of the run with the old version, to demonstrate that the migration worked:

> list
bobby (joined Sat Apr 11 15:31:54 EDT 2009)
cowboy (joined Sat Apr 11 14:57:28 EDT 2009)
pmk (joined Sat Apr 11 14:57:12 EDT 2009)
> login pmk notthepassword
Incorrect password.
> login pmk alsonotthepassword
Incorrect password.
> login pmk maybethisisit
Incorrect password.
> login pmk keeptryinganyway
Account is locked until Sat Apr 18 18:29:57 EDT 2009.
> time              
Sat Apr 18 18:29:08 EDT 2009
> login pmk swordfish
Account is locked until Sat Apr 18 18:29:57 EDT 2009.
> time
Sat Apr 18 18:30:10 EDT 2009
> login pmk swordfish
Success
> add bobby letstryagain
A user by that name already exists.
> login alice xyzzy
No user by that name exists.
> add alice xyzzy
Password must be at least 8 characters long.
> add alice aaaaaaaa
Success
> list
alice (joined Sat Apr 18 18:31:05 EDT 2009)
bobby (joined Sat Apr 11 15:31:54 EDT 2009)
cowboy (joined Sat Apr 11 14:57:28 EDT 2009)
pmk (joined Sat Apr 11 14:57:12 EDT 2009)
> checkpoint
> quit

Everything works as expected. Trying to brute-force a password temporarily locks the account, preventing even legitimate logins until the lock times out. New accounts are forced to have a sufficiently long password. Meaningful error messages are displayed. And we didn’t lose any of the data from the old version of the program.

Are we done? Everything does indeed work like we want it to, but it’s still not ideal. Let’s take another look at the definition of authenticateUser:

authenticateUser :: MonadState UserDirectory m => String -> PasswordHash -> ClockTime -> m (Maybe UserError)
authenticateUser name passHash when = do
    UserDirectory dir <- get
    case M.lookup name dir of
        Nothing -> return $ Just NoSuchUser
        Just user -> if isLocked when user
                        then return $ fmap AccountLocked $ usLocked user
                        else if passHash == usPassword user
                                then do put $ UserDirectory $ M.insert name (unlockUser user) dir
                                        return Nothing
                                else do put $ UserDirectory $ M.insert name (failUser when user) dir
                                        return $ Just PasswordMismatch

It works, but even if you don’t know Haskell, it still smells bad. There’s a lot of nesting going on, as seen by the ever-increasing level of indentation. In fact, every time there’s a way to fail (no such user exists, the account is locked, etc.), we need another conditional to handle the error case, forcing the rest of the computation another level deeper.

Less obvious is the bad-smelling use of Maybe UserError. Typically, the Maybe monad is used to carry the result of a successful computation — in fact, when Maybe is used as a monad, that’s precisely what it does. This is the opposite of how we’re using it: to carry the result of an unsuccessful computation. This code is violating the principle of least astonishment and is liable to confuse anyone expecting Maybe to be used the way it normally is.

Still, we want to carry detailed information about errors so the user can be informed. Is there a better way to handle errors?

Of course there is. That’s the subject of the next post.

Comments are closed.