About Smegging Time (or, Wallace 0.99.0 is out!)

I’ll let the following link speak for itself:

Download Wallace 0.99.0 (source tarball, 713 KB)

OK, maybe I can say a little more.

I’ve got Wallace, everyone’s favorite NES emulator for Linux with a genetic programming tool bolted on, worked into a sufficiently functional state to let the outside world play with it. Thet there link will get you a copy of the source code, which you can do the usual “./configure, make, make install” thing on to get a working copy of it.

A couple caveats to keep in mind before diving in, though. First, consider this a tech demo, or pre-alpha software at best, not a polished, finished product. It’ll probably work OK if you don’t poke or prod it too hard, though if you want to crash it or send it into undefined behavior I can think of three ways off the top of my head to do so. Also, I make no guarantees that this release will be compatible with future ones, so don’t come crying to me if the evolver you’ve been running for weeks stops working when 0.99.1 (or whatever) comes out.

And of course, I’ve only tested this on my laptop running Debian unstable. Getting it to run on some other Linux system shouldn’t be too hard. With work you might even be able to get it working on some other platform GNOME has been ported to, but I wouldn’t bet too much on it.

Now that all that’s out of the way, what’s new since the last status update? Plenty! The VM that the candidate solutions run on is now register-based (loosely based on the MIPS architecture), so solutions only crash inside the VM if they try to divide by zero nowadays. You can also save your progress in the evolver tool, it executes as quickly as possible (instead of at normal emulation speed), and you can watch and encode candidate solutions at your leisure.

So what does it take to get a little genetic programming action started? Let’s say you wanted to generate something to play Dr. Mario (since score is much more closely correlated with progress there than in Super Mario Bros.). First, you tell the Evolver what ROM image to use and what the initial state will be — here, we pick a state slot that begins on Level 10 on High speed:

Game tab on the Evolver

Next, we define our fitness function by specifying a set of metrics to measure and how to modify them. Here, our fitness will be the score plus the number of seconds before losing. Note the number of conditions we look for for when to stop evaluating a solution: the virus count dropping to zero (win!), the Game Over screen (lose!), the Pause screen (programs don’t need breaks!), and the Main Menu screen (ever try hitting A+B+Select+Start?):

Fitness tab on the Evolver

Next, we specify how we’ll be breeding our solutions. The Evolver has several knobs you can play with to control population size and how to generate new solutions from old ones. Lots more are certainly possible. For example, it’d be nice to control how important the fitness score is to surviving to the next generation — currently, the top n solutions make it and the rest are culled, instead of doing some random-by-weight sort of selection. Here’s the somewhat arbitrary settings I chose:

Breeding tab on the Evolver

Now all that’s left to do is switch over to the last tab, hit that nice green Run button, and watch the magic happen.

Population tab on the Evolver

So, how well does this work in practice? I haven’t done much intensive experimentation, but early results are encouraging. In the first generation (which are all purely randomly generated), most of the solutions either crash before getting very far or don’t do anything, letting the pills just stack up in the middle until the reach the top. A couple of solutions do exhibit slightly more interesting behavior, such as this one, which rotates like mad and shoves pills to the left as fast as it can:

Solution from Generation 1

The top performer from Generation 1, with a fitness of 16. (660 KB, 16 seconds)

Over the first few generations, the solutions that crash quickly die off, and the ones that just let the pills fall in the middle flourish. Then the ones that shove the pills to one side or another, buying a few more precious seconds and thus increased fitness, dominate. And then, in Generation 6, an exciting development: the first solution to kill a virus!

Solution from Generation 6

The top performer from Generation 6, with a fitness of 309. (383 KB, 9 seconds)

To be fair, that screenshot makes the solution look like it’s playing far more shrewdly than it really is. The video tells the whole story: it’s just shoving pills to the right, but does it at just the right rate to line up the red sections of the first three pills with the red virus.

And if you think that’s neat, check out what happens in Generation 7:

Solution from Generation 7

The top performer from Generation 7, with a fitness of 916. (700 KB, 16 seconds)

That solution pushes pills to the right and rotates them at just the right speed to knock out two virus with the first three pills. It is interesting how the end result after three pills is the same as what I’d do if I were behind the controller. I can’t help but wonder how an Intelligent Design proponent (that is to say, a creationist) would distinguish whether I or a seventh-generation evolved computer program were playing, given that screenshot.

Anyway, the next couple generations don’t turn up anything with a higher fitness than that, and that’s the point where I stopped it so I could work on pushing out this release. Nevertheless, Generation 9 did produce the longest-lived solution so far, clocking in at almost double the time any of its competitors does, and still managing to kill a virus along the way.

Solution from Generation 9

The longest-lived performer from Generation 9, with a fitness of 329. (1.2 MB, 29 seconds)

Not too shabby for a couple hours of evolution on a 1.7 GHz Pentium M, eh?

After all that I bet you’re dying to try this out for yourself, so I’ll reiterate:

Download Wallace 0.99.0 (source tarball, 713 KB)

8 Responses

  1. OK…I’ll admit I didn’t read this but…DR. MARIO!!

    I’m excited you’re coming in this week!

  2. So, I presume that some of the random opcodes involve reading CHR/SCR data, right? Or is that specified in the setup?

    I’m actually a bit surprised that your VM-based approach works that well. It’s possibly even better than an S-expression-based one. Interesting.

  3. I’ve largely been treating the emulation code itself as a black box, so I’m fuzzy on the details of NES emulation. What’s the difference between CHR and SCR data?

    The VM has a few opcodes for reading bytes and 2-byte words from game memory using the emulator’s FCEUI_MemSafePeek function, which takes a 16-bit address and returns the byte stored there. I’m pretty sure this gives you access to CHR, SCR, and everything else in the game’s address space. It’s the same technique that lets the fitness function look at the current score and virus count in the example in the main post.

    However, I’m doubtful any of the solutions I’ve generated so far make any productive use of those opcodes. From the videos, it very much looks like they each hit on a “dumb” strategy of repeating the same move over and over and over, which just happens to do something useful. One feature I’d like to add in the future is a way to examine just what’s going on inside the hood, without manually poking around inside the solution files and doing the disassembly by hand.

  4. Honestly, I’m totally absorbed by this project and I love to see it develop…

    but I’m not going to fiddle with it. I don’t run Linux, and I’d rather see what you do with it. :)

    More screenshots!

    P.S. I wonder if this game could possibly play Bubble Bobble?

  5. So, next version will have it automatically create and upload the generation-winning videos to youtube, right? And update a website that lets everyone view the winners, losers, right? Okay, maybe a bit much… but I’m thrilled that it is in working order!

  6. Oh, and just a specific comment on the fitness function for Dr. Mario; shouldn’t time be handled slightly differently? Isn’t this fitness function going to get you as close as possible to having an infinitely long play session that never completes the game?

  7. In the long run, it could be beneficial to remove time as a factor to prevent exactly that scenario, though to get that far you’d need something that’s really good at clearing pills — with the fitness function above, clearing just one virus is worth the same as 5 minutes of play.

    Of course, no one ever said you couldn’t change the fitness function whenever you feel like it….

  8. At first, I read that last line and I was adamant that no changes be made in the original metrics- that would change the initial parameters, and you wouldn’t have a baseline measurement, and it just wouldn’t be the same!

    But then I considered this as an analogue of genetics, and darwinism, and- well, what the heck! Measures of “good” or “best” change with time! People are no longer expected to know certain bits of information, or they are expected to know how to understand different concepts.

    Heck, you could change the metrics for each level of Mario Brothers and get a hybrid solution for each level.

Comments are closed.