Crossing the Streams

The latest update to Wallace, despite the delay since the last installment, doesn’t actually add any features directly related to its ultimate goal of using genetic algorithms to evolve something that can play Super Mario Bros.

Instead, it ports the audio and video code, from SDL and from drawing pixmaps on the window respectively, to the GStreamer framework.

One of the neat things about GStreamer’s design is that it uses pipelines of elements to process multimedia streams, sort of like how you’d put together a bunch of simple programs on a Unix command line. Under this model, doing different things with the data just involves using different elements to process it. For example, taking the emulator’s output and sending it to the screen and speakers involves this pipeline:

Output pipeline

The walemulatorsrc element is custom-made and acts as a source for the emulator's output audio and video streams. The two queues act as buffers. The gconfaudiosink and gconfvideosink elements output the audio and video streams, respectively, using the user's preferred output methods; this could, for example, let us render directly to video memory. But since the emulator has a very specific format for its audio and video streams, which may not match what those two sink elements accept, the streams are first passed through filters that can convert between different audio formats (audioconvert) and video formats (ffmpegcolorspace).
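In gst-launch notation (GStreamer 0.10 era, to match the element names above), the output pipeline might be sketched roughly like this. Treat it as a sketch, not something you can paste into a terminal: walemulatorsrc is a private element, and writing `src.` twice just asks gst-launch to pick whichever of its pads has compatible caps.

```
gst-launch-0.10 walemulatorsrc name=src \
    src. ! queue ! audioconvert ! gconfaudiosink \
    src. ! queue ! ffmpegcolorspace ! gconfvideosink
```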

The advantage of using a pipelined series of elements like this is what happens if, say, instead of drawing to the screen and playing to the speakers, you want to encode a video file instead. Well:

Encoder pipeline

Now, after converting the streams to a convenient format, we run the audio stream through a Vorbis encoder, run the video stream through a Theora encoder, multiplex the two streams into an Ogg media file, and write the result to disk. Of course, there's no reason you couldn't generate a video file in whatever your favorite format is, either; it all depends on what elements you plug into the pipeline.
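In the same sketchy gst-launch notation as before, the encoder pipeline would look something like this (the output filename is made up, and again walemulatorsrc is the custom element, so this is illustrative only):

```
gst-launch-0.10 walemulatorsrc name=src \
    src. ! queue ! audioconvert ! vorbisenc ! mux. \
    src. ! queue ! ffmpegcolorspace ! theoraenc ! mux. \
    oggmux name=mux ! filesink location=wallace.ogg
```

The only parts that changed from the playback version are the far ends of each branch, which is exactly the point of the pipeline model.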

And as proof that this works, you can watch me beat up Metal Man from Mega Man 2:

Mega Man v. Metal Man
Ogg Theora video + Vorbis audio, 2.8 MB, 40 seconds. Windows users might be able to find the necessary codecs here if they don’t have them already, I guess.

Also, using GStreamer elements to render video on-screen scales the video to fill the window for free, which is a nice side benefit.

As nifty as all this is, and as simple as it seems in principle, implementing it turned out to be trickier than I expected. The main source of headaches was the need to implement that walemulatorsrc element myself. No prior experience writing elements + non-trivial behavior (two source pads, fed asynchronously by the emulation core) = lots of aggravation. I’m sure anyone who groks GStreamer would shudder at the code inside that element, which, let’s be honest, has a fair amount of the proverbial duct tape and baling wire.

Also, I had expected outputting the video via GStreamer to be faster than drawing a pixmap image onto the window about 60 times a second, but it turns out the GStreamer solution takes more CPU time than the “ugly and wrong” way. A little tentative profiling with Sysprof suggests that 20% of the CPU time is being spent inside that ffmpegcolorspace element, converting the 8-bit RGB bitmaps generated by the emulation core into the YUV color space that elements like gconfvideosink and theoraenc seem to want. It may be worthwhile in the future to eschew ffmpegcolorspace and do the conversion within walemulatorsrc, but you know what they say about premature optimization.
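For the curious, the colorspace conversion that's eating that CPU time boils down to per-pixel arithmetic along these lines. This is a sketch using full-range BT.601 coefficients, which is an assumption on my part; what ffmpegcolorspace actually does depends on the caps the elements negotiate, and it's hand-optimized C rather than anything this naive:

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to YCbCr.

    Uses full-range BT.601 coefficients (an assumption; the real
    ffmpegcolorspace element negotiates the exact format itself).
    """
    y  = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    # Clamp and round so each component stays within 8 bits.
    clamp = lambda v: max(0, min(255, int(round(v))))
    return clamp(y), clamp(cb), clamp(cr)
```

At the NES's 256×240 output and 60 frames a second, that's on the order of 3.7 million pixels converted every second, which goes some way toward explaining why even an optimized implementation shows up so prominently in the profile.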

I promise, the next status update will actually further Wallace along to its ultimate goal. I can’t wait to see how what I have planned next stacks up.

[Editor's note: your homework exercise is to figure out why that last sentence is a combination clue and pun.]