How Debuggers Work: Getting and Setting x86 Registers

dx87 · on Oct 23, 2020

Learning debugger internals is surprisingly frustrating. I think it was the GDB source code that didn't explain what anything was in its source files about interacting with registers; it just had a comment at the top that said something like "this header file is for GDB internals, you're probably looking for the man page". Then I found a post on Stack Overflow looking for the exact same info I was, but it was closed for "being too narrow and likely no use to anyone else".

jcranmer · on Oct 23, 2020

The least annoying thing about writing and working with Linux debuggers is that all of the communication layers (kernel<->debugger, debugger<->C library, debugger<->loader, debugger<->compiler [1]) are basically undocumented. I mean, it's annoying trying to track down where in the source code these things exist, but if you have experience working in large codebases, this kind of task comes up often enough that it's not difficult.

The real issue with debuggers is that ptrace is a pretty broken API. Supporting things like spawning threads, forking processes, fork+exec, etc. is difficult, and full of race conditions that are difficult to code correctly. Attaching to running multithreaded processes is another challenge. Writing a debugger that can correctly handle multithreaded applications is challenging, the documentation gives you zero insight into what the potential pitfalls are, and almost all examples are similarly uninformative, being too complex for their use case.

[1] Yes, there's DWARF. But if you're dealing with GNU extensions to DWARF, the documentation ceases...

saagarjha · on Oct 23, 2020

man 2 ptrace is really one of the worst man pages in existence, and it's made doubly bad because there are only two real clients that you can reference, one of which is pretty much the definition of an awful legacy codebase and the other which is over- and strangely-engineered and at times not even correct or complete. Actually, triply bad because ptrace(2) itself is the worst API and it bleeds into some other fairly reasonable APIs for signals and process notifications.

I guess the silver lining is that if you ever find yourself in the position of having to implement ptrace (like I did recently) you can get away with a surprisingly broken and incomplete API and GDB at least will take it mostly in stride, as long as you don't mess up a couple of the fundamental operations (your wait4(2) has to mostly be correct, a couple of the SIGTRAPs need the right codes). It's still really difficult to do–my implementation was designed by cross-checking against strace output and only supports basic single-task debugging–but you can return strange errors or "I don't support this" and GDB for the most part is OK with that, because not only is ptrace a broken API many of its implementations are broken themselves, or certain features are strangely missing in some places.

khuey · on Oct 24, 2020

I obviously don't know the details of what you were doing but it's generally easier to implement the gdbserver protocol than to implement ptrace itself.

krytarowski · on Oct 23, 2020

Thank you for your feedback.

> The real issue with debuggers is that ptrace is a pretty broken API.

Please note that this article focuses on NetBSD and FreeBSD first.

As your comment describes only one OS (Linux) please do not generalize as your comment seems to use truth sparingly.

The ptrace(2)/NetBSD API design and implementation is free from all of the difficulties you mentioned in your post.

> Supporting things like spawning threads, forking processes, fork+exec, etc. is difficult, and full of race conditions that are difficult to code correctly.

The difficulty of catching LWP creation events:

ptrace_event_t event = {0};
event.pe_set_event = PTRACE_LWP_CREATE;
ptrace(PT_SET_EVENT_MASK, child, &event, sizeof(event));

Then, whenever a debuggee creates a thread, the whole process is stopped (the so-called all-stop mode in GDB) and the event is reported to the debugger as a signal that it collects with wait(2).

Then, investigate the debuggee event through checking the signal passed (SIGTRAP) and investigating siginfo_t that contains new thread identifier.

Then, you can resume the whole process with a single PT_CONTINUE.
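The four steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: the helper name wait_for_lwp_create is hypothetical, the process is assumed to be already traced and stopped, and the new thread's ID is fetched with PT_GET_PROCESS_STATE after PT_GET_SIGINFO confirms the reason for the stop. The NetBSD calls are guarded by __NetBSD__; on any other system the function just reports ENOTSUP.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#ifdef __NetBSD__
#include <sys/ptrace.h>
#endif

/*
 * Hypothetical helper, not taken from the thread or from NetBSD itself.
 * `pid` must already be traced and stopped.
 * Returns the ID of the newly created LWP, or -1 with errno set.
 */
int wait_for_lwp_create(pid_t pid)
{
    assert(pid > 0);
#ifdef __NetBSD__
    /* 1. Ask the kernel to report LWP creation events. */
    ptrace_event_t event = { .pe_set_event = PTRACE_LWP_CREATE };
    if (ptrace(PT_SET_EVENT_MASK, pid, &event, sizeof(event)) == -1)
        return -1;

    /* 2. Resume the whole process; (void *)1 means "continue where it stopped". */
    if (ptrace(PT_CONTINUE, pid, (void *)1, 0) == -1)
        return -1;

    /* 3. Wait for the next stop and check that it is a SIGTRAP. */
    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    if (!WIFSTOPPED(status) || WSTOPSIG(status) != SIGTRAP) {
        errno = EINVAL; /* some other event; a real debugger would dispatch it */
        return -1;
    }

    /* 4. Confirm the reason through siginfo, then query the event details. */
    ptrace_siginfo_t info;
    if (ptrace(PT_GET_SIGINFO, pid, &info, sizeof(info)) == -1)
        return -1;
    if (info.psi_siginfo.si_code != TRAP_LWP) {
        errno = EINVAL;
        return -1;
    }

    ptrace_state_t state;
    if (ptrace(PT_GET_PROCESS_STATE, pid, &state, sizeof(state)) == -1)
        return -1;
    if (state.pe_report_event != PTRACE_LWP_CREATE) {
        errno = EINVAL; /* an LWP event, but not a creation (e.g. an exit) */
        return -1;
    }
    return (int)state.pe_lwp;
#else
    /* The event-mask API used above is NetBSD-specific. */
    errno = ENOTSUP;
    return -1;
#endif
}
```

When the function returns, the process is stopped again, so the caller can inspect the new thread and then issue another PT_CONTINUE.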

> forking processes

Same for forking, use PT_SET_EVENT_MASK+PTRACE_FORK. Fork events are reported for the forking parent and forked child. As you poll on events on a single PID only (for all events for all threads within a process), you have the deterministic order of reporting the forked parent first always, followed by polling for the forked child (you know its PID from SIGTRAP + siginfo_t submitted to the parent).

> fork+exec

This is a matter of catching EXEC and FORK events separately. All exec() events are reported as SIGTRAP + siginfo_t specifying TRAP_EXEC. No big deal.

> is difficult, and full of race conditions that are difficult to code correctly

I push this comment to the free market of opinions of the readers.

> Attaching to running multithreaded processes is another challenge.

It's always a one-liner:

ptrace(PT_ATTACH, pid, NULL, 0);

No matter whether this is a single-threaded or multi-threaded process.

> Writing a debugger that can correctly handle multithreaded applications is challenging,

Again, I defer this question to the free market of opinions.

> the documentation gives you zero insight into what the potential pitfalls are,

Please list the pitfalls so we can improve the documentation!

> and almost all examples are similarly uninformative, being too complex for their use case.

There are a few hundred ptrace programs in NetBSD, each exercising one small feature in minimal code. They are embedded into the regression test framework (ATF). This code can be reused (good license + simple) in 3rd party software.

For external examples, I recommend the most minimal event tracker of debuggers, that I wrote here:

https://github.com/krytarowski/picotrace

In particular, you can trace all events possible in all types of programs (at least in the current version of ptrace(2)) in around 300 LOC, as noted here:

https://github.com/krytarowski/picotrace/blob/master/common/...

FreeBSD has a distinct ptrace(2) API, but it is not far from NetBSD's; the two are comparable, and code ports quickly from one BSD to the other.

If you have got any more questions or comments, do not hesitate to ask!

saagarjha · on Oct 23, 2020

I'm speaking from a mostly Linux perspective (and macOS, but that API is crippled, so even though it's likely more similar I won't mention it much). While I'll take your word for it that NetBSD has a better API, I am still curious if the various edge cases are handled. Are there multiple stop kinds that are somewhat difficult to distinguish against and keep track of? How do you handle a child dying while stopped, or is this not possible? I don't think NetBSD has the same "tasks" model that Linux does, how do you distinguish between requests targeting threads and requests targeting the whole process? How do the rest of the OS APIs interact with a ptrace stopped process?

krytarowski · on Oct 24, 2020

> Are there multiple stop kinds that are somewhat difficult to distinguish against and keep track of?

Every type of event has a dedicated pair of SIGNAL + SI_CODE (in siginfo_t).

The types of events are as follows: regular signals (usually not interesting for a debugger - NetBSD can mask them with PT_SET_SIGPASS), crashes (SIGSEGV, SIGFPE, SIGILL, SIGBUS) and debugger related events (SIGTRAP). Then, each debugger related event is distinguishable by checking si_code inside siginfo_t (TRAP_TRACE, TRAP_BRKPT, TRAP_CHLD, TRAP_LWP, TRAP_DBREG, TRAP_SCE, TRAP_SCX) and in a few more cases with the additional ptrace(2) call PT_GET_PROCESS_STATE that can query additional information (spawned/exited thread; forked/vforked/spawned process).

Thus we always have the exact thread + type of event.

There is one tricky case that is harder to code. It's related to hardware assisted watchpoints, especially in multi-threaded processes with concurrent events of all kinds. We need to diligently handle the context of the x86 Debug Registers, which deliver the additional information about the fired hardware assisted watchpoint/breakpoint.

> How do you handle a child dying while stopped, or is this not possible?

A stopped and traced child cannot just die, but it could be killed with SIGKILL. Then further ptrace(2) calls fail on it.
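The wait(2) side of this case can be sketched with plain POSIX calls. Both function names below are hypothetical, and the example makes one simplifying assumption: the child is stopped with SIGSTOP and observed with WUNTRACED rather than being ptrace-stopped, so it shows only how the debugger would see the death in the wait status, not the failing ptrace(2) calls.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum tracee_state { TRACEE_STOPPED, TRACEE_EXITED, TRACEE_KILLED, TRACEE_UNKNOWN };

/* Decode a wait status; *detail receives the stop signal, the exit code,
 * or the terminating signal, depending on the returned state. */
enum tracee_state tracee_state_from_status(int status, int *detail)
{
    assert(detail != NULL);

    if (WIFSTOPPED(status)) {
        *detail = WSTOPSIG(status);
        return TRACEE_STOPPED;
    }
    if (WIFEXITED(status)) {
        *detail = WEXITSTATUS(status);
        return TRACEE_EXITED;
    }
    if (WIFSIGNALED(status)) {
        *detail = WTERMSIG(status);
        return TRACEE_KILLED;
    }
    *detail = 0;
    return TRACEE_UNKNOWN;
}

/* Fork a child that stops itself, kill it while it is stopped, and return
 * the final wait status (or -1 if fork or waitpid fails). */
int kill_stopped_child(void)
{
    pid_t child = fork();
    if (child == -1)
        return -1;
    if (child == 0) {
        raise(SIGSTOP);     /* stop until the parent acts */
        _exit(0);           /* never reached if the parent kills us */
    }

    int status;
    if (waitpid(child, &status, WUNTRACED) == -1)   /* observe the stop */
        return -1;
    kill(child, SIGKILL);                           /* kill it while stopped */
    if (waitpid(child, &status, 0) == -1)           /* reap it */
        return -1;
    return status;
}
```

After the second waitpid the process no longer exists, which is why any later request addressed to that PID has nothing left to act on.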

> I don't think NetBSD has the same "tasks" model that Linux does, how do you distinguish between requests targeting threads and requests targeting the whole process?

Generally, we have a pair of PID (process) + LWP (thread). We have got per-process + per-thread ptrace(2) operations. Whenever a thread is meaningful, like in the management of register contexts, we pass LWP as the 4th argument of the ptrace(2) call or embed in a structure transmitted from/to the kernel.

In Linux, the ptrace(2) call is per-thread only, which allows some flexibility (the GDB non-stop mode), but introduces the complexity of the management. The NetBSD kernel serializes the events inside the kernel and stops all the threads before returning to the debugger.
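For the Linux side of that comparison, here is a minimal sketch of reading one thread's general-purpose registers. It is Linux-only, the function name read_stopped_child_regs is hypothetical, and it assumes the sandbox permits PTRACE_TRACEME. The second argument of each ptrace call is a thread ID; for the single-threaded child used here it equals the PID.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <elf.h>          /* NT_PRSTATUS */
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/uio.h>      /* struct iovec */
#include <sys/user.h>     /* struct user_regs_struct */
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that stops itself, read its general-purpose registers into
 * *regs, then let it run to completion. Returns the number of register
 * bytes the kernel wrote, or -1 on error.
 */
long read_stopped_child_regs(struct user_regs_struct *regs)
{
    assert(regs != NULL);

    pid_t child = fork();
    if (child == -1)
        return -1;
    if (child == 0) {
        /* Child: ask to be traced, then stop so the parent can inspect us. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(1);
        raise(SIGSTOP);
        _exit(0);
    }

    int status;
    if (waitpid(child, &status, 0) == -1)
        return -1;
    if (!WIFSTOPPED(status))
        return -1;      /* child exited early, e.g. tracing was not permitted */

    /* PTRACE_GETREGSET takes the register set type and an iovec to fill. */
    struct iovec iov = { .iov_base = regs, .iov_len = sizeof(*regs) };
    long written = -1;
    if (ptrace(PTRACE_GETREGSET, child, (void *)(long)NT_PRSTATUS, &iov) != -1)
        written = (long)iov.iov_len;

    /* Resume without delivering the SIGSTOP, then reap the child. */
    ptrace(PTRACE_CONT, child, NULL, NULL);
    waitpid(child, &status, 0);
    return written;
}
```

On x86-64 the filled structure exposes fields such as rip and rsp; writing a modified copy back would use PTRACE_SETREGSET with the same iovec layout. A multithreaded debugger on Linux has to repeat this per thread, which is the management cost mentioned above.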

> How do the rest of the OS APIs interact with a ptrace stopped process?

It's an internal detail whether a process is stopped by a debugger, by the terminal or actively running. This is orthogonal to other system APIs. Generally we try to make being traced transparent to other applications; for example, we fake the parent PID (after the reparenting that happens on attach). There are some corner cases and real bugs in applications that are exposed under a debugger, such as missing EINTR handling.

Over the past few years the NetBSD Project has improved significantly in the domain of debuggers (GDB, LLDB) and developer-oriented tooling (sanitizers, compilers). Still, there is room for improvement!

jcranmer · on Oct 23, 2020

> Please note that this article focuses on NetBSD and FreeBSD first.

There's a reason I prefaced my comment with Linux debuggers. I haven't played much with BSD kernels to know how problematic debuggers are there.

jjoonathan · on Oct 23, 2020

Stackoverflow always closes good questions. It's almost a sign of legitimacy at this point.

star-trek-fleet · on Oct 23, 2020

At Pixie, we developed a feature to dynamically trace program execution context, for example the arguments and return values of a function [1].

That was built on top of eBPF [2], and in the process we studied how debuggers work, particularly how to pull rich context information out of the executable's file structure (symbols, ELF format) and DWARF information [3]. The other significant piece is Go's own implementation details, such as how interfaces are implemented.

It was a very refreshing learning experience. The end result is that we gained a deeper understanding of how a program presents itself to the operating system and the chip.

[1] https://docs.pixielabs.ai/using-pixie/code-tracing/
[2] https://www.iovisor.org/technology/ebpf
[3] https://en.wikipedia.org/wiki/DWARF

phendrenad2 · on Oct 24, 2020

This is good insight into how debuggers work on BSD/Linux, but they obviously work differently on other OSs.

person_of_color · on Oct 24, 2020

Anything from ARM??