An opinionated guide on how to reverse engineer software (margin.re)
375 points by withzombies on Nov 3, 2021 | 44 comments



I am still regularly cracking the banking app from my bank to be able to run it on my phone.

They have decided to wage war against their users by checking for root, SafetyNet, and whether the Google spyware framework is installed on the Android phone. Suddenly I wasn't able to get to my online banking, and since one company's software handles all the banks in my country (and I would rather stop using banks than start using the Google spyware ecosystem), they have forced me to remember my youth and patch the .so in the APKs.

It was fun, and I have to patch a new version every few releases, but apart from that, my online banking now works. :)


Any links for reading up on it with Android specifically in mind? Or maybe you can make a blog post about it? :)


Doesn't SafetyNet prevent any and all kinds of tampering? I really hate how companies can just enforce that on your own device and you just have to take it.

I remember some multiplayer Android games implementing the SafetyNet check client side and not passing it on to the server. I wouldn't expect a bank to make the same mistake.


They can check for a Google-signed "device integrity" response on the backend, and if they do, that's a game over. The "integrity" is checked by a TrustZone applet, which runs with higher privileges than the Android kernel and has access to the necessary keys.


Are there any “start here” guides for beginning reversing? Like a break it down Barney style? I’ve done some self study and shadowing teams at work but I need to fill in the gaps. Thanks!


Go to The Netherlands, go to Vrije Universiteit Amsterdam. Follow a course called "Binary and Malware Analysis" (if it's still called that). It's from the VUSec group. Follow: Hardware Security, Systems Security and Kernel Programming while you're at it ;-)

These were the hardest courses of my life.

That's all I know, I wish I had an easier answer. I happen to have lived there at the time. I was lucky in that regard for accidentally finding that course.


I can share what worked for me, although I am by no means a pro

I found the book "Reverse Engineering for beginners" quite useful. It's a bit tedious, but if you're serious about it and work through it, it'll give you some solid practice.

Also, write, compile, and reverse your own code snippets to get more intuition about how things work.

Finally, work on an actual target: not something crazy like Photoshop or Word, but at least something real, to get the practical experience. It might be a bit of a grind, but when you do manage to crack/hack whatever it is you're trying, it's a euphoric feeling.


I learned by jumping head first into a program that used Blowfish encryption to calculate a license, etc. The company was gone, so this was the only option.

There is definitely a "click" moment. It took me three weeks from zero to key generator. Absolutely time well spent.



"Becoming a full-stack reverse-engineer" [1] is a great high-level overview. It lays out a roadmap of three years.

1. https://www.youtube.com/watch?v=9vKG8-TnawY


Probably out of date but I believe the classic introduction was +ORC: http://www.textfiles.com/piracy/CRACKING/howtocrk.txt

Fravia was also popular, this seems to be a mirror of his site https://www.darkridge.com/~jpr5/mirror/fravia.org/aca400.htm


"Reversing With Lena" [0] is a great beginner walkthrough, including pictures.

[0] https://archive.org/details/lena151


As you can see in the other replies, there are many paths that lead to reverse engineering, but I feel like the best way to make the concepts stick is to have "reverse engineer X" be a problem standing in your way (where X is a piece of software, a protocol, etc.) When you have that problem and you need to solve it, you have a target to throw all the "darts" those tutorials give you, and this will probably be more effective than just reading the material and hoping you'll use some of it in the future.


In my experience, you can learn reversing in an empirical fashion. Download Ghidra or IDA Pro, and get cracking. Google every question you may have as you place breakpoints and modify control flow.

Or at least that's how I got started as a middle schooler with infinite time. Also try playing around with Cheat Engine. Don't let its name fool you: it's a very advanced debugger that's capable of doing way more than making you a bullet sponge :)


Sad to see Cutter[1] (based on Rizin[2]) not mentioned in the list of RE tools.

[1] https://cutter.re

[2] https://rizin.re


Tangential, but from the cutter website:

> No Java involved.

What an odd thing to advertise as a feature.


I consider that a feature, Java is always a pain since the applications always expect a system-wide JVM install but then are picky about which JVM they run on.


For context - it's talking about the Ghidra decompiler there. The decompiler is written in C++; Ghidra itself is written in Java.


In this context, a GUI application, it has some meaning. Java GUIs are known to be not enjoyable.


Guides like these are gold - the advice inside also neatly applies to understanding open-source libraries, and massive codebases effectively.


Thanks and you're absolutely right. It turns out tricks for reverse engineering to understand software help when you just want to understand software without reverse engineering also.


I had my fair share of reversing more than 10 years ago. Such fun times, learned a lot!

> Ideally, readers will have acquired an interactive disassembler such as Binary Ninja, IDA Pro, or Ghidra

I'm guessing OllyDbg is no longer an option.


OllyDbg is more a debugger than a disassembler (although obviously it will show you disassembly as part of that process). The tools listed are more focused on exploring a binary and have more features to that end.

OllyDbg seems to have been replaced by x64dbg for most people I know.


Yeah, I specifically mentioned interactive disassemblers here. There are only a few that let you fix up and annotate the disassembly or decompilation and they're leaps and bounds above the ones that don't.


I had my share 40 years ago. No such fancy tools. Printed out pages on a dot matrix printer and drew arrows with a pen. Learned a lot, yes!


Ah! Dead-listing!

I learned the term from this old blog post by Nate Lawson: https://rdist.root.org/2008/07/03/dead-listing-while-on-vaca...

“What do I do on long plane rides or while listening to the waves lap against the beach? Dead-listing.

Dead-listing is analyzing the raw disassembly of some target software and figuring it out using only pen and paper. This is great for vacations because your laptop won’t get sand in it. You usually have a long period of time to muse about the code in question without interruptions, something hard to find at home these days. And I’ve gotten some of my best ideas after setting aside my papers for a while and going for a long swim.”


I learned the term "dead-listing" from Fravia's Pages of Reverse Engineering in the 90s [https://en.wikipedia.org/wiki/Fravia].

I felt like such a hardcore hacker removing the nag screen from some bit of shareware!

[edit: uh oh, might have fallen down the Fravia rabbit hole again. Here's a mention of dead-listing from '97! https://www.darkridge.com/~jpr5/mirror/fravia.org/siuL.htm]

[edit 2 - ha ha, in the original link you posted, the term "dead listing" is hyperlinked... to Fravia!]


Using paper printouts to analyze code is a very old practice.

http://ars.userfriendly.org/cartoons/?id=20001002


Once upon a time there was SoftIce and SoftIce for Windows.

Oh, I miss them!


I loved SoftIce.

It was a royal pain to lock Windows up! It was clunky, and you had to get a feel for whether you were still in your program or had been dragged into the OS for things like Windows' actual mouse code/logic.

I almost can’t imagine how I did it using another PC for my notes and scratchings. Having to manually type out so many addresses and the code I was inspecting.

Still, for some reason I loved SoftIce. I remember it just working.


This along with ollydbg and w32dasm brings back memories of the old keygen and patch days. SoftIce blew my mind at the time. Man that was fun!


Old games are really gaining new life with techniques like these. Games from the 90s are still getting bug fixes and new content, such as "Caster of Magic" and "HoMM3 Horn of the Abyss".


This is a great guide. The “data is king” message is hard to overstate.

Some other bits and pieces not mentioned that could be helpful for beginners, from my own personal experience, that I hope show up in later parts:

• Know the calling convention[0] of the platform you are working on! If you don’t know where arguments come from or how values are returned from functions, you are not even going to be able to get started.

• In addition to working backwards from known system calls, embedded strings can also be a great place to start. They frequently contain filenames, error or debug messages, well-known magic values (PNG chunk identifiers, FourCC codes, file extensions, etc.), and other junk that can be used to identify potentially interesting functions. If you get really lucky and the binary uses some string-based message sending you will gain an enormous advantage this way.

• While I agree with the author that starting from `main` is mostly futile, it can still often be helpful to do a little bit there depending upon the kind of binary you are reverse engineering. For example, GUI apps will generally have some initialisation code, then an event loop, then some teardown code. Identifying these sections is useful when you start looking at deeper functions to know if you are hitting a function that is called by init (more useful) or teardown (less useful) to make smarter choices on where to spend effort. The event loop will also use documented OS messages and types, so you can use those to get an idea of which system events map to which functions and what data they receive from the system, which again helps give useful context for deeper functions later on.

• Depending on the compiler or optimiser that was used, code within translation units (TUs) may simply be emitted in sequence, so if you are disassembling some code and you can determine that a few functions next to each other are all member functions for the same C++ class, you can assume with high confidence that adjacent functions in other places also all operate on the same kind of object and create associations between groups of functions that way.

• Similar to the last point, looking for vtables in the data section of the binary can help to quickly learn about the size and fields of objects. If you see a sequence of function offsets, whatever function points to the start of that sequence is probably a constructor (and will likely use the offset like `mov [rax+0], offset <list of functions>`). The `malloc` call in the constructor or the constructor’s caller will tell the object size, and then reviewing all the vtable functions will let you fill out the object’s structure pretty easily.

• Notwithstanding “data is king”, knowing what code implementations of common data types look like can be very helpful. Once you know what it looks like to access a hash map, or traverse a linked list, or grow a vec, or get the length of a C-string, when you encounter that pattern, you immediately know what kind of object you’re working with.

• Understanding some of the optimisations that compilers use is also very helpful since they can look very weird, though this is more of a rote memorisation/exposure thing. Tools like Assembly x86 Emulator[1] can be super helpful when encountering confusing stuff since you can just paste the code in there, step through, and mess with the data/registers without needing to try to run a debugger on the original code.

• For whatever reason, the first few months I started reverse engineering I could not do it without using graph view. After that time, something shifted in my brain and now I can no longer work effectively without using a straight disassembly view most of the time. Pick what works for you, and just like an IDE, make sure to take time to configure your decompiler/disassembler with the visibility options that feel like they give you the strongest orientation.
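The embedded-strings tip above can be sketched in a few lines of Python. This is only a rough approximation of what tools like `strings` or a disassembler's string view do: it looks for runs of printable ASCII and ignores UTF-16 and other encodings. The `extract_strings` helper and the sample `blob` are made up for illustration.

```python
import re

def extract_strings(data: bytes, min_len: int = 4):
    """Yield (offset, text) for each run of printable ASCII at least min_len long."""
    # 0x20-0x7e is the printable ASCII range; shorter runs are usually noise.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.start(), match.group().decode("ascii")

# Hypothetical blob standing in for a binary's data section.
blob = b"\x00\x01\x89PNG\r\n\x1a\n\x00\x00IHDR\x00\xffcould not open %s\x00\x02ab\x00"

for offset, text in extract_strings(blob):
    print(f"{offset:#06x}  {text}")
```

The offsets are the useful part: searching for cross-references to the address of an error message is one way to land in the function that reports it. Note that the 3-byte `PNG` marker is skipped here because it is shorter than `min_len`, so a lower threshold can be worth trying when hunting for magic values.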

The basic approach and difficulty curve for reverse engineering a binary is essentially the same as solving a jigsaw puzzle. Finding the corner pieces is simple, finding the edge pieces is harder, and finding the first pieces to connect to the edges is the hardest part. As you start filling in the picture, though, it becomes exponentially easier, so just keep at it.

[0] https://en.wikipedia.org/wiki/Calling_convention

[1] https://carlosrafaelgn.com.br/Asm86/
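As one concrete instance of a compiler optimisation that looks weird in a disassembly: optimising compilers commonly replace unsigned division by a constant with a multiply by a "magic" number followed by a shift. The sketch below, in Python, assumes 32-bit unsigned inputs and division by 10; the function name is made up, and the exact instruction sequence a given compiler emits will differ.

```python
MAGIC = 0xCCCCCCCD  # ceil(2**35 / 10)
SHIFT = 35

def div10_like_compiler(x: int) -> int:
    """Divide a 32-bit unsigned value by 10 using only a multiply and a shift."""
    if not 0 <= x < 2**32:
        raise ValueError("x must fit in an unsigned 32-bit integer")
    return (x * MAGIC) >> SHIFT

# Spot-check against ordinary integer division, including the edge cases.
samples = [0, 1, 9, 10, 11, 99, 100, 12345, 2**31, 2**32 - 1]
print(all(div10_like_compiler(x) == x // 10 for x in samples))
```

Once you have seen `0xCCCCCCCD` next to a multiply a few times, you start reading it as "divide by 10" without stepping through it.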


What type of paid work would a reverse engineer usually pick up?


Most of them work in the security industry and do:

* malware analysis (how does this malware work, how can it be detected, can we somehow decrypt data affected by this ransomware, are there any leads to who wrote this malware, etc)

* vulnerability research (finding and exploiting vulnerabilities in closed source software)

* assessment of closed source software (how secure is it, how does it work, does it have undocumented apis and how can those be used, etc.)

There is likely also a small market for analysis of competitor software, but I haven't seen this openly advertised yet.


> assessment of closed source software

I saw a company making add-ons for Microsoft Office, a peculiar example of a reverse engineering job outside the security field: https://www.think-cell.com/en/career/jobs/reverse.html


Microsoft once had to release a binary patch for Office to fix a security vulnerability in an addon where the source code was lost: https://news.ycombinator.com/item?id=15720923


There's also a pretty active academic scene for reverse engineering. The big thing right now (at least in the subfield I'm familiar with) is symbolic execution, where you generate an expression that shows the value of a variable at some point in time.
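To give a flavour of that idea, here is a toy sketch in Python. It "runs" a straight-line program on symbols instead of numbers, so each variable ends up holding an expression in terms of the inputs. The instruction format and the `symbolic_run` helper are invented for illustration; real symbolic execution engines also have to deal with branches, memory, and constraint solving, none of which is attempted here.

```python
def symbolic_run(program, inputs):
    """Run a toy straight-line program on symbols and return the final state."""
    state = {name: name for name in inputs}  # each input stands for itself

    def value(operand):
        # Integer constants stay literal; names are looked up in the state.
        return str(operand) if isinstance(operand, int) else state[operand]

    for dest, op, left, right in program:
        state[dest] = f"({value(left)} {op} {value(right)})"
    return state

# Hypothetical three-instruction program over inputs x and y.
program = [
    ("t0", "+", "x", 1),
    ("t1", "*", "t0", "y"),
    ("x", "-", "t1", "x"),
]

final = symbolic_run(program, ["x", "y"])
print(final["x"])
```

The printed expression describes the final value of `x` for every possible input at once, which is what makes the approach attractive for questions like "which inputs reach this state?".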


Selling private online game cheats.

Not the traditional route, but some of the best reverse-engineers I met were bypassing anti-cheats for a living.


Pentesting? Cyber security, stuff like that. Maybe anti piracy for games etc.

I'm not a reverse engineer, but it seems to match, doesn't it?


A friend of mine at uni, who was a hobby cracker, went to work for Eset, the antivirus company.


I love the way this is written, can't wait for part 2!


Hate the font though. Makes your eyes stumble every time there is a t in a word.


Reader view is a nice Firefox feature (press F9)



