Like Hell it Could

In case for some unfathomable reason you’re one of those people who thinks the Large Hadron Collider is going to destroy the planet (as though such a thing were that easy), there are no fewer than two websites you can check to see if the unthinkable (and impossible) has happened:

It’s always good to have multiple sources for this sort of thing, since there is some misinformation floating around out there. Plus, both of the sites above have RSS or Atom feeds you can subscribe to, so that you’ll be notified in case the current status of the planet changes.

WARNING: Those prone to worrying about things like the end of the universe might want to avoid reading the HTML source of the first of those two links, lest they should see this:

The possibility that we are living in a false vacuum has never been a cheering one to contemplate. Vacuum decay is the ultimate ecological catastrophe; in the new vacuum there are new constants of nature; after vacuum decay, not only is life as we know it impossible, so is chemistry as we know it. However, one could always draw stoic comfort from the possibility that perhaps in the course of time the new vacuum would sustain, if not life as we know it, at least some structures capable of knowing joy. This possibility has now been eliminated.

The second special case … applies if we are now living in the debris of a false vacuum … This case presents us with less interesting physics and with fewer occasions for rhetorical excess than the preceding one.

S. Coleman and F. De Luccia (1980). “Gravitational effects on and of vacuum decay”. Physical Review D 21: 3305.

Those prone to worrying about such things should probably also refrain from reading the above quote.

Please do not jump on the table

Last time, I mentioned one obstacle I ran into when writing a reverse engineering tool for Nintendo games. Namely, just because a piece of code is called using the JSR (Jump to SubRoutine) instruction doesn’t mean the code being called actually is a subroutine. Subroutines eventually execute an RTS (ReTurn from Subroutine) instruction to go back to the code immediately following the JSR. This bit of code in Super Mario Bros., however, doesn’t:

JumpEngine:
       asl          ;shift bit from contents of A
       tay
       pla          ;pull saved return address from stack
       sta $04      ;save to indirect
       pla
       sta $05
       iny
       lda ($04),y  ;load pointer from indirect
       sta $06      ;note that if an RTS is performed in next routine
       iny          ;it will return to the execution before the sub
       lda ($04),y  ;that called this routine
       sta $07
       jmp ($06)    ;jump to the address we loaded

As I mentioned before, the above code treats the “return address” that JSR pushed onto the call stack as the address of a jump table, with the index into the jump table passed in the A register.
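To make the address arithmetic concrete, here is a small Python sketch of what JumpEngine computes. It is only an illustration, not code from the tool: the function name, the dict standing in for ROM, and the addresses are all made up for the example. The one hardware detail it relies on is that the 6502's JSR pushes the address of its own last byte (one less than the true return address), which is why the routine does an INY before the first load.

```python
def jump_engine_target(memory, jsr_addr, index):
    """Return the address JumpEngine would jump to.

    memory   -- dict of address -> byte (hypothetical stand-in for ROM)
    jsr_addr -- address of the JSR JumpEngine instruction
    index    -- value of the A register when the JSR executes
    """
    # JSR pushes the address of its own third byte, so the two PLAs
    # recover a pointer to the byte just before the jump table.
    pointer = (jsr_addr + 2) & 0xFFFF
    y = (index << 1) & 0xFF            # asl / tay: two bytes per entry
    y = (y + 1) & 0xFF                 # iny: step past the off-by-one
    lo = memory[(pointer + y) & 0xFFFF]
    y = (y + 1) & 0xFF                 # iny
    hi = memory[(pointer + y) & 0xFFFF]
    return (hi << 8) | lo              # jmp ($06)


# Hypothetical layout: a JSR at $8000 followed by a three-entry table.
rom = {
    0x8003: 0x10, 0x8004: 0x90,   # entry 0 -> $9010
    0x8005: 0x20, 0x8006: 0x90,   # entry 1 -> $9020
    0x8007: 0x30, 0x8008: 0x90,   # entry 2 -> $9030
}
targets = [jump_engine_target(rom, 0x8000, i) for i in range(3)]
print([hex(t) for t in targets])   # ['0x9010', '0x9020', '0x9030']
```

Because the index is doubled in an 8-bit register, a table reached this way can hold at most 128 entries.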

The reverse engineering tool needs to be aware of this, since otherwise it won’t know about the jump table, and thus won’t find the executable code pointed to by the jump table’s entries.

To solve this problem, I added a little static analysis routine that inspects all the code called by a JSR, checking for stack-based shenanigans. A normal subroutine may push and pop values on and off the call stack, but it will leave the return address alone, and when the RTS is reached, the return address will be at the top of the stack. A jump table engine like the above will pop the “return address” off the stack and end not with an RTS but with an indirect JMP (JuMP). Truly weird code will do something else.
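The three cases can be sketched for straight-line code in a few lines of Python. This is a simplified illustration under stated assumptions, not the tool's actual routine: instructions are (mnemonic, operand) pairs, only PHA/PHP/PLA/PLP are treated as touching the stack (TXS and interrupts are ignored), and the `classify` function and its labels are invented for the example.

```python
# Net change in stack depth for the instructions this sketch knows about.
STACK_EFFECT = {"pha": +1, "php": +1, "pla": -1, "plp": -1}


def classify(instructions):
    """Classify a straight-line routine given as (mnemonic, operand) pairs."""
    depth = 0  # bytes sitting on top of the 2-byte return address
    for mnemonic, operand in instructions:
        if mnemonic == "rts":
            # A normal subroutine returns with the return address on top.
            return "subroutine" if depth == 0 else "other"
        if mnemonic == "jmp" and operand.startswith("("):
            # A jump table engine has pulled both return-address bytes.
            return "jump table engine" if depth == -2 else "other"
        depth += STACK_EFFECT.get(mnemonic, 0)
    return "other"  # ran off the end of the listing


jump_engine = [
    ("asl", ""), ("tay", ""), ("pla", ""), ("sta", "$04"),
    ("pla", ""), ("sta", "$05"), ("iny", ""), ("lda", "($04),y"),
    ("sta", "$06"), ("iny", ""), ("lda", "($04),y"), ("sta", "$07"),
    ("jmp", "($06)"),
]
ordinary = [("pha", ""), ("lda", "#$00"), ("pla", ""), ("rts", "")]

print(classify(jump_engine))   # jump table engine
print(classify(ordinary))      # subroutine
```

Note that this sketch lumps everything else into "other", including an indirect JMP taken with the return address still in place, which a real tool might want to treat separately.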

My program now uses the Boost Graph Library to build a control flow graph of the executable code it has located, showing all the possible paths of execution through the program. The analysis runs as a depth-first traversal of each alleged subroutine, checking what the depth of the call stack will be at each instruction.
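The traversal itself might look something like the following Python sketch. It uses a plain dict in place of a Boost graph, and the graph layout, the `stack_depths` name, and the sample addresses are assumptions made for illustration. Each node maps an address to its mnemonic and its successor addresses, and the walk records the stack depth on arrival at each instruction, flagging any instruction that different paths reach at different depths.

```python
def stack_depths(cfg, entry):
    """Depth-first walk of a control flow graph starting at `entry`.

    cfg maps address -> (mnemonic, [successor addresses]).
    Returns ({address: stack depth on arrival}, set of conflicting addresses).
    """
    effect = {"pha": 1, "php": 1, "pla": -1, "plp": -1}
    depths = {}
    conflicts = set()
    work = [(entry, 0)]            # explicit stack makes this depth-first
    while work:
        addr, depth = work.pop()
        if addr in depths:
            if depths[addr] != depth:
                conflicts.add(addr)    # two paths disagree about the depth
            continue
        depths[addr] = depth
        mnemonic, successors = cfg[addr]
        after = depth + effect.get(mnemonic, 0)
        for nxt in successors:
            work.append((nxt, after))
    return depths, conflicts


# Hypothetical routine: push A, branch, pull A on both paths, then return.
cfg = {
    0x8000: ("pha", [0x8001]),
    0x8001: ("beq", [0x8003, 0x8007]),
    0x8003: ("pla", [0x8004]),
    0x8004: ("jmp", [0x8008]),
    0x8007: ("pla", [0x8008]),
    0x8008: ("rts", []),
}
depths, conflicts = stack_depths(cfg, 0x8000)
print(depths[0x8008], conflicts)   # 0 set()
```

A depth of 0 at the RTS means the return address is back on top of the stack, which is what a normal subroutine should look like; a depth of -2 at an indirect JMP is the jump table engine signature.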

Naturally, the analysis is only a heuristic. It’s possible for the code to be a normal subroutine and look like a jump table engine if, say, it manually pops the return address off the call stack and does an indirect JMP to it instead of an RTS. I’m making the assumption that the code isn’t deliberately obfuscated like that.

During testing, I tried running the jump table engine detector against Mega Man to see what would happen, and look at what it found:

PRG4_1562: txa
PRG4_1563: asl A
PRG4_1564: tay
PRG4_1565: iny
PRG4_1566: pla
PRG4_1567: sta $F4
PRG4_1569: pla
PRG4_156A: sta $F5
PRG4_156C: lda ($F4),Y
PRG4_156E: tax
PRG4_156F: iny
PRG4_1570: lda ($F4),Y
PRG4_1572: sta $F5
PRG4_1574: stx $F4
PRG4_1576: jmp ($00F4)

The details of how it works are different (it passes the index in the X register instead of A, and it only uses two bytes of zero-page memory instead of four), but it does the same thing as the jump table engine in Super Mario Bros. The fact that my detector seems to work on two different games written by two different developers is a good sign, both that the detector works and that the jump-table-via-subroutine trick is likely to be used in many games. (The detector also found a few subroutines that do some other form of stack weirdness, but I haven’t looked yet at just what’s going on with those.)

The reverse engineering tool doesn’t actually make use of this new information yet, but that’s the next thing on the to-do list.
