
This is cool but seems to be "just" the output from IDA with the functions renamed in the output? If that's the case I'm wondering why we don't see this sort of thing more often (other than the effort required of course)



>It's not an ordinary decompilation generated by IDA. They actually rewrote all the functions from reading MIPS assembly and compiled it with the original compiler, adjusting the code until it produced identical output to a vanilla ROM.

Reverse engineering a reproducible build is quite a bit more than "just" output.


What more is needed?


A perfect decompiler would just do it, but writing the exact inverse of a compilation tool chain would be... difficult. You would need a different decompiler for each compiler version and for each tiny little difference between them.

These guys were capable of figuring out and naming what every function did and then rewriting each of them over and over and over to get the original tool chain to output bit for bit the exact same binary as they started with.

Something like taking x-rays of an unknown machine and being able to recreate perfect pixel replicas of the engineering drawings or listening to a piece of music and being able to exactly write out the score.
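
To make the "bit for bit" part concrete, here is a minimal sketch of the kind of check such a project could run after every rebuild. It is not the project's actual tooling: the function names and file paths are hypothetical, and it only assumes you have the rebuilt ROM and the original on disk.

```python
import hashlib
from pathlib import Path


def sha1_of(path):
    """Return the hex SHA-1 digest of the file at `path`."""
    return hashlib.sha1(Path(path).read_bytes()).hexdigest()


def first_mismatch(built, original):
    """Return the offset of the first differing byte, or None if identical."""
    for offset, (a, b) in enumerate(zip(built, original)):
        if a != b:
            return offset
    # Same common prefix but different lengths: they diverge where the shorter one ends.
    if len(built) != len(original):
        return min(len(built), len(original))
    return None


def is_matching_build(built_path, original_path):
    """True only when the rebuilt ROM is byte-identical to the original."""
    return sha1_of(built_path) == sha1_of(original_path)
```

When `is_matching_build` returns False, `first_mismatch` on the two files' bytes points at where the output first diverges, which is a hint about which rewritten function still needs adjusting.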


As the old saying goes: "that would be impossible, just like attempting to get the original cow by putting together ten thousand hamburgers".


Sounds like a job for a genetic algorithm or machine learning.


I'd recommend watching this CppCon talk [1] about compiler optimization. It isn't just that decompilation is hard, or that it is difficult for humans to do. At a fundamental level, the information is not there at all, because the compiler can make very impressive optimizations that discard it. Machine learning can deal with weak signals, but it can't deal with no signal.

[1] https://www.youtube.com/watch?v=nLv_INgaLq8
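
A toy illustration of the "no signal" point, with the caveat that this is a made-up constant folder, not how any real compiler works. Several different "sources" (hypothetical nested tuples here) collapse to the same output, so the output alone cannot tell you which one was written.

```python
def fold(expr):
    """Toy 'optimizer': collapse a nested (op, left, right) tuple into one integer."""
    if isinstance(expr, int):
        return expr
    op, left, right = expr
    a, b = fold(left), fold(right)
    if op == "add":
        return a + b
    if op == "mul":
        return a * b
    raise ValueError(f"unknown op: {op}")


# Three different hypothetical "source programs".
source_a = ("mul", 2, 3)
source_b = ("add", 4, 2)
source_c = 6

# After folding, all three produce the same single value.
outputs = {fold(source_a), fold(source_b), fold(source_c)}
```

Here `outputs` ends up holding a single value, so going backwards from it means picking one of many equally valid sources.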


Similar things have been done in the past, for example for Dune Legacy (Dune II) or REminiscence (Flashback).

Mario 64 is more recent, so it might be an order of magnitude more complex.

There are also other approaches, like a total re-implementation rather than a decompile/re-source. The ScummVM engine for Blade Runner, the adventure game from the '90s, is a recent and very impressive example of that.


I can’t believe ScummVM has re-implemented 72 classic game engines[1] from scratch. They also document their basic reverse-engineering strategies[2].

[1] https://github.com/scummvm/scummvm/tree/master/engines

[2] https://wiki.scummvm.org/index.php?title=HOWTO-Reverse_Engin...


The `src` folder seems to contain the reverse-engineered C code. Most of the assembly-only files left look to me like they are just describing data.


They mention in the thread that the binary seemed to have been compiled without optimisation, which means it's a lot easier to translate back to source than most games would be.



