Super Mario Bros. is NP-complete (sort of)

I recently stumbled across (via Scott Aaronson’s blog) an academic computer science paper that proved that Super Mario Bros. is NP-complete, and so are Donkey Kong Country, The Legend of Zelda, Metroid, and Pokémon. Or, to be more precise, the decision problem of whether a level in any of those games is winnable is NP-complete.

Don’t get too excited about this result, however. It doesn’t mean you’re going to be able to pick up a NES controller and suddenly be able to solve intractable problems with ease.

Informally, an NP-complete problem is the hardest type of question where there is no known way to figure out the answer quickly, but it is possible to quickly verify that an alleged answer is true. The Boolean satisfiability problem is one well-known NP-complete problem: given an arbitrary Boolean formula, is there some way to assign values to each variable in it to make the formula true? If I say I know the answer for a particular formula, it’s easy to see if I’m right: plug in the values I give you for each variable, and see if the formula evaluates to true or false. However, there’s no known way to figure out in general what those values should be, if they even exist at all.

NP-complete problems are a subset of the NP problems, which are all problems where it’s possible to quickly check whether an alleged answer is true. NP-complete problems are the hardest NP problems.

What does that mean? NP-complete problems have the property that if you can find a fast way to solve one of them, you automatically know how to solve all NP-complete problems (and all NP problems in general) quickly too. This works because it’s possible to quickly transform any NP problem into any NP-complete problem, in such a way that if you can solve the NP-complete problem, you can quickly transform that answer back into an answer for the original NP problem.

Whether or not there’s a fast way to solve NP-complete problems is one of the big unsolved problems in computer science. Despite several decades’ worth of people working on it, no one has come up with a fast way to solve any of the thousands of known NP-complete problems in general, nor come up with a proof that it’s not possible.

So, what does this have to do with Super Mario Bros.? The authors of the paper found that the question of whether a stage in generalized Super Mario Bros. is winnable or not is an NP-complete problem. Here, “generalized” means that unlike in the original game, the stage can be more than one screen high, and there is no limit (in physical size or in time) on the size of the stage. Their proof of NP-completeness shows that given an instance of the 3-SAT problem (a special case of the Boolean satisfiability problem that is also NP-complete), it’s possible to construct a generalized Super Mario Bros. stage that is winnable only if a solution exists for the Boolean formula, and where the path Mario takes from the start to the finish directly corresponds to how to set the variables in the Boolean formula to make it true. If you can find a path from the beginning to the end of the stage, you can find a solution to the 3-SAT problem, and so the winnability problem is also NP-complete.

The construction effectively turns Super Mario Bros. into a puzzle game, where the challenge is to find a way to the flagpole that doesn’t result in getting Mario trapped in a dead-end. The idea behind the construction is that Mario has to kick the right Koopa shells to break bricks that block a corridor leading to the flagpole. Forks in the path correspond to each variable in the Boolean formula, and different paths lead to different Koopas positioned to break different blocks. Most of the rest of the construction is contrived to prevent Mario from backtracking through the stage, and contriving a way to have two paths cross without letting Mario change which path he’s on.

The authors use the same basic approach to show that the question of winnability of Donkey Kong Country and Metroid stages is also NP-complete. The details of the construction of course vary based on the game mechanics, but use the same basic idea. The authors also prove NP-completeness of the question of winnability of Legend of Zelda and Pokémon dungeons using a different style of construction. In all cases, each game’s mechanics are used to effectively turn it into a puzzle game.

It might be interesting to come up with something to let someone actually “play” a stage derived from an instance of 3-SAT, although playing through such a stage wouldn’t be very fun. The stage would clearly be composed of repetitive elements, and it’d often be difficult to figure out how to get through it successfully. That is the idea, after all. Such a game would just be a novelty to demonstrate the concept; it wouldn’t be an effective way to try to solve NP-complete problems, as amusing as it might be to see a compiler challenge a programmer with a Legend of Zelda dungeon when it needs to perform register allocation (which is equivalent to graph coloring, which is NP-complete).

Neutronium update – Dec 12, 2011

I’ve continued working on Neutronium daily, but for the past week or so I’ve been focusing on refactoring some of the code I wrote earlier to facilitate actual new features. The biggest change has been putting most of the room-related logic in the STM monad, instead of having parts of it in STM and other parts using IO-based synchronization, particularly MVars. By putting all if it in STM, I avoid having the room directory MVar acting like a global mutual exclusion lock, and let room-related operations behave atomically, which could save some headaches down the time. I’ve also cleaned up various other bits of the code to simplify things and make things more easily testable, but nothing that warrants much discussion here.

Now I’m working on fixing how joining and leaving rooms work. Before, a member of a room was identified by the session identifier of the user’s session, but that’s very problematic. First, it means that opening the same room up in multiple tabs would result in confusing behavior, since the server wouldn’t have any way of distinguishing each tab. Second, it makes distinguishing one member from another difficult on the client side, since session IDs are sensitive information and can’t be shared, lest session hijacking result.

My solution is to assign yet another unique identifier to represent each member joining a room. Since member IDs aren’t sensitive, they can be freely communicated to everyone else in the room. Since it’s decoupled from the session cookie, each tab can be given its own member ID, should someone open the same room up multiple times for some reason. On the server side, each member ID is still bound to a particular session ID, preventing one member from trying to impersonate someone else; if the member ID sent in the request doesn’t match the session ID the request is made under, the request is ignored.

At least, that’s how it will work when I’m finished making those changes.

Comments Off

Reading is dangerous (if you’re writing Haskell)

In Haskell, the read function is the usual, simple way to parse a String into a value of some other type:

ghci> :t read
read :: Read a => String -> a

read can parse anything that implements the aptly-named Read class. All the standard numeric types implement Read:

ghci> read "42" :: Int

Actually, it turns out the Read instance for integral types like Int is a bit too clever for its own good. Did you know (in GHC, at least) it can handle floating-point-style scientific notation, as long as the mantissa significand is an integer and the exponent is nonnegative?

ghci> read "42e2" :: Int

This is nifty, but if the exponent is large, you can easily eat up all of GHC’s memory and crash the program:

ghci> read "1e1000000000" :: Int
<interactive>: out of memory (requested 45088768 bytes)

This is bad if the argument to read comes from an untrusted source. This was the subject of a recent security fix to happstack-server, where the entire server application could be brought down by sending it a request that used something like 1e1000000000 as a request parameter that would be parsed as an integral type. Of course, the vulnerability isn’t specific to happstack-server, but anything that tries to read an integer from untrusted input.

I don’t think Neutronium is currently vulnerable to this (other than being built against a vulnerable version of happstack-server at the moment), but that’s mostly because there isn’t yet anything that’s using read in this manner. One of the next changes I was planning to make was in how room membership is tracked, and part of that would be assigning each member of the room an integral identifier that is sent as a parameter to each room-related Ajax request. (This will support identifying who is who in the room, and solve the problem of having the same room open in multiple tabs without getting things all confused.) So, even if Neutronium isn’t vulnerable to this denial of service attack yet, it would be soon, and without having seen that posting, I don’t know if I would have even thought to worry about a seemingly simple built-in function like read for Int causing problems like that.

I can’t help but wonder if there are static source code analysis tools out there for Haskell that could find security-relevant flaws like this, like there are for more mainstream languages like C, C++, or Java. My gut tells me it’d be easier to write an analyzer for Haskell than for a language like C, but I don’t know if anyone has ever actually sat down to do it yet.

Neutronium Demo Video

With Renee’s help, I recorded a six-minute-or-so demo of Neutronium, the game I’ve been working on since the beginning of November. You’ll probably want to play it at 720p (HD) and full-screen in order to be able to read the text.

Neutronium is a web-based game inspired by a game inspired by the light-cycle game from the movie Tron. Currently, only the fundamental gameplay and a rudimentary chat function are implemented. Each player controls a constantly-growing line. They can’t speed up; they can’t slow down. All they can do (currently) is change direction. Once a player’s line collides with something, it stops and the player loses. The last player left alive wins.

(Note that the game doesn’t actually detect “winning” yet, so a game lasts until everyone has run into something.)

The video above was recorded from my machine. I am the blue line, and Renee is the red line. The chat window doesn’t yet have a way to identify who says what, but in this case you can tell my messages apart from Renee’s because you can watch me type them.

Working on a video

By popular request, I’m working out a way to record a quick video of my game in action. Before I do that, though, I want add an option to select the size of the game area. Right now it’s a hardcoded size, and it’s large enough to be awkward for an embeddable video. Something smaller would probably look better.

Comments Off

LoCo Day 30

Today I added support for command-line options when starting the web server. Specifically, it’s now possible to specify which port the server will listen on, and the hostname that will be used when showing type-safe URLs. Previously, both of these were hard-coded in such a way that made it difficult to use the server on anything but localhost. I also increased the server-side timeout, so that I can go increase the long-poll duration.

Now that November’s over, I haven’t decided whether I’ll keep posting daily updates about progress on the game. I would like to keep working on it and get it to the point where I can actually put it on the Internet — though I did succeed in my goal of having a playable game by the end of the month, there’s a lot of polishing that needs to be done, to say nothing of the important-but-nonessential functionality I skipped over to focus on the core gameplay. I have no idea how long it’ll take me to get all that done, though.

LoCo Day 29

Two small but welcome changes today:

First, I reduced a lot of the lag during game play by excising a bit of debugging code on the server. The server had been printing out to the console the session context for each request before processing it further. Since I had gotten sessions working quite some time ago, all this did in practice was add a slight delay in processing each request, which wasn’t noticeable until you start trying to play a real-time game where each move sends a new request to the server. There’s still a bit of noticeable lag during the game, but it feels much smaller now.

Second, colors are now used to distinguish you from other players: you are blue, and everyone else is red. This way, you can tell right from the beginning who you are in the game. This is currently implemented by having the server reply to the “I’m ready to play” request by sending the client the ID used to identify that player during the game. There’s still no indication of who is who in the chat window, or is information about who dies when presented to the player (other than saying “someone died”), but the information is there for whenever I get around to implementing it.

Also! I discovered that part of the Conf structure passed into Happstack to start the web server is indeed a timeout for how long to let a request sit idle before killing it. The timeout defaults to 30 seconds, which explains why my initial attempt to use a 60-second interval for long polling kept resulting in server-generated errors. One of the next things on my to-do list is to make the host and port the web server listens on configurable instead of hard-coded, so while I’m at it I might as well lengthen the timeout.

LoCo Day 28

Today I tried to solve the problem of the “ready to play” button swallowing keyboard input when the game starts, even though the button was disabled. Apparently jQuery doesn’t let you put the input focus on the document itself, which is where the keyboard input handler is attached. I ultimately settled on hiding the button entirely instead of disabling it after it’s been pressed, which has the result I wanted: putting the input focus on nothing in particular, so that the document sees the key presses.

This probably won’t work well if you start using the chat window again before the game starts, since the focus would still be in the text entry box, and arrow keys there will move the text cursor instead of moving your player avatar in the game, leading to much unhappiness as you get killed off before realizing what’s going on. So, this is more of a quick hack instead of an actual fix.

LoCo Discontinuity

Apparently my posts for Friday and Saturday never made it out of draft form, and so didn’t appear on the blog until now. Just in case you were wondering what I was up to after Thanksgiving.

Comments Off

LoCo Day 27

Success? The basic gameplay in the browser is working now, so for the first time ever it’s actually possible to play the game. My loving girlfriend Renee helped me test that it does indeed work with multiple players, and she even beat me in the first real match we played against each other. For posterity’s sake, I will note that the name of the room she created for that first fateful match was named “Stupid time for uncool animals”. (Although I did win against her in some of the subsequent matches, she is quick to point out that there is some lag between changing your player’s direction and when it takes effect on screen. I’m not sure what the precise cause of that is yet, since lag over a LAN should be minimal.)

Although the game is playable in the most basic sense, there is a huge amount of work to be done to make this something that could be exposed to the public. Heck, right now the game doesn’t even provide a way to distinguish your player from the others on screen, so the beginning of each match usually involves each player frantically trying to figure out which one they control before they run into something and die. That needs to be fixed, obviously. There’s still no concept of user accounts, or even user identities, so all status messages refer to everyone as “someone”.

Also, Renee wants more explosions when collisions occur. The baseline number of explosions is zero, currently.

Sometime soon I’ll try to post a screenshot or two of what this game actually looks like, now that it’s in a state where stuff actually shows up on screen instead of living hidden in the server’s memory.

LoCo Day 26

Continued work on client-side support for the game. I realize the JSON format I had defined for sending game state to the browser wasn’t very useful for what the client-side JavaScript needs to do to display the state of the game on the page. Instead of doing additional processing on the browser, I decided to do the conversion on the server side, instead of generating JSON that just wraps how game data is modeled on the server. Unfortunately, it’s taking me longer than I anticipated to get my test cases working again following the changes, which suggests there’s some corner cases my existing code isn’t handling properly.

Comments Off

LoCo Day 25

Now the server actually runs the game when everyone in a room is ready to play, instead of just waiting for a few seconds and then claiming the game is over.

At the moment, however, there’s nothing on the client side that understands the game, so the next chunk of code to write is the JavaScript (ugh) that implements the game logic on the browser side. When the game is actually running, the client and server are running the game logic relatively independently of one another, with messages being send back and forth only to update each other when a player changes direction. This way, there won’t be a bunch of unnecessary traffic crossing the network.

I also think at some point I’m going to need to change how the server maintains the state of all the rooms. Right now, it’s a Map stored inside an MVar, but that means the MVar effectively acts as a global exclusive lock for anything that wants to do anything with a room. Not a problem for testing, but I bet that’ll be a big problem with even just a moderate load. I think I’ll replace that with a TVar for the Map itself, and another TVar for each room. Using software transactional memory instead of exclusive locks will allow multiple threads (i.e., multiple requests) to operate on pieces of the state at the same time. This matches the actual use much more nicely:

  • Most operations are just sending and receiving messages via a room’s channel, so the room and Map themselves aren’t being modified. Read-only operations are nice for STM.
  • Less commonly, an existing room is being modified (e.g. someone entering or leaving, or games being started up). This modifies an individual room but leaves everything else alone, and with each room having its own TVar, these within-room writes can be isolated from other write operations.
  • More rarely, rooms are created or destroyed, which is the only time when the structure of the overall Map itself needs to change.

Using TVars instead of an MVar also gives a bit more consistency with the message-passing bits of code, which are already using STM objects (namely TChans and TVars). Perhaps each request could be treated as a single STM transaction, instead of separate bits and pieces being run separately and intermixed with other IO operations. But since the next chunk of functionality will almost be all client-side, I probably won’t get around to migrating more stuff to STM for a little while.

Comments Off

LoCo Day 24

Today I made a few small steps towards integrating the game logic with the web interface.

Comments Off

LoCo Day 23

Today I finished the core server-side game logic, at least to the point of it passing the set of test cases I prepared earlier. Now the next big task is to connect the game to the web interface, so that it’s possible to actually play and see what’s going on.

Comments Off

LoCo Day 22

Today I implemented generation of the starting configurations of games. For now, all initial player positions are aligned along an invisible grid across the play area. It’d be nice to have more randomness in there, but that complicates things a bit more, since I still want to ensure players don’t start off too close to each other.

QuickCheck is great, but sometimes you have to be a bit careful with the set of function inputs you let it generate. In this case, I had to make the dimensions and grid spacing sized to prevent them from getting obnoxiously, impractically huge. I’m never going to try to create play areas measured in millions or billions of pixels on a side, or that have a grid spacing of similar magnitude. My starting configuration generator doesn’t work well in those extremes (by which I mean, it runs in to integer overflow problems), so it’s not worth testing them, especially if I’d have to contort my code to deal with those kinds of inputs.

Comments Off