Ask HN: How to learn proper systems programming?
336 points by systems_learn on June 21, 2021 | 93 comments
I have been a software engineer for the past 10 years, doing frontend and backend development. I am learning Rust at the moment and have the following books:

- "The Linux Programming Interface"

- "Systems Programming with Linux"

- "Advanced UNIX Programming"

What I struggle with: How to get exposure to projects to learn for a future job? I had a Rust job for around half a year, where people build web servers and came from a C and C++ background. Half of the stuff they wrote I didn't understand (flushing, opening another channel just for logs so we don't fill up the other ones etc. etc.).

Now I wonder how I can get access to this type of information, how to properly learn it?




Years ago I modified the Postgresql source code (8.xx).

It's something I was terrified of doing.

But once I got in there and started poking around, I realized it was just ordinary plain-vanilla C code. Not C++. Just C code.

With my local copy, I started to hack pg_dump to do something special that we wanted at the time. Even after 30 years of coding, I'm not that especially good of a programmer. But I ended up getting our own special version of pg_dump that did what we wanted at the time and it went into production dumping hundreds of gigs of data every day!

But what I'm not, is afraid. I'm not afraid to try anything.

And that's what it takes to do deep, systems level programming.

Don't be afraid.

Those bits are just bits. And it's just code... and most of it was not written by wizards. Just ordinary people like you and me. Don't be afraid man.

Clone the repo, set up a workable build environment, and start tinkering, compiling and running to see what happens.

You would be totally shocked to find out what you can actually achieve.

Those books will only go so far.


I basically came here to suggest the same thing. The biggest part is a mental shift: realize that if you find a bug (or feature you'd like to have) in an open source project that you have the ability to fix it.

The rabbit hole of reading code you go down when you do that is, I believe, the best real introduction to systems programming. Once you've been doing that for a while, you internalize the idioms. I'd recommend large projects because you're less likely to just end up copying some person or couple of people's quirks, and learning to read large codebases is a major skill in itself. The process of submitting patches often turns into ad hoc mentoring, where you'll be told what you didn't get right this go around. After you've done that for a while, you realize the same things apply to other large systems projects, and you end up exposed to a number of different styles of systems stuff.


> Those bits are just bits. And it's just code... and most of it was not written by wizards. Just ordinary people like you and me. Don't be afraid man.

Code is easy and doesn't scare me. I'm scared about the platform though: code runs on platforms (POSIX, mostly), which are incredibly dated, ill-designed and full of terrible corner cases[1]. There's so many ways to shoot yourself in the foot I'm afraid to do anything sensitive on my own.

«It is not UNIX’s job to stop you from shooting your foot. If you so choose to do so, then it is UNIX’s job to deliver Mr. Bullet to Mr Foot in the most efficient way it knows.»

[1]: Dan Luu's writing on files is a good illustration of that:

- http://danluu.com/file-consistency/

- https://danluu.com/deconstruct-files/

- https://danluu.com/filesystem-errors/


Most POSIX APIs are crap, but most significant systems projects don't code to them directly. Almost all large projects have their own internal (or third party) libraries that either wrap or reimplement the underlying POSIX APIs.

That said, at some point knowing the POSIX APIs does become necessary, but that's usually relevant a bit later when you start becoming interested in modifying the toolkit of large projects.

If you're wanting to start a project on your own rather than jump into an existing project, in a nutshell, if you don't have hard real-time requirements, if you're using C, use glib, if you're using C++, use Qt. There are other libs that are useful for real-time usage (basically things that don't ever have hidden memory allocation), but I'm not familiar enough with that space to recommend one.


How is that not just good old “application programming”?

My own definition of “system programming” is exactly “developing on top of the system” vs “developing in the comfort of a helper library” (and its limits). I consider myself a decent application programmer, but I'm not a system programmer (at least not yet ;).

The OP was talking about Postgres: if you don't know the pitfalls of write(2), or don't know how to use mmap(2), you're going to have trouble making a database on your own.
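
To make one of those write(2) pitfalls concrete: a single call may write fewer bytes than requested, or fail with EINTR before writing anything. Here is a minimal sketch of the usual retry loop. The name `write_all` is hypothetical, not from any library, and this only covers short writes and EINTR; it says nothing about durability, which additionally needs fsync(2).

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* write_all: hypothetical helper name, not a libc function.
 * Keeps calling write(2) until all len bytes are written.
 * Returns 0 on success, -1 on error (errno is left set by write). */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;          /* real error: give up */
        }
        p += n;                 /* short write: advance past what was written */
        len -= (size_t)n;
    }
    return 0;
}
```

Calling `write_all(fd, data, size)` in place of a bare `write` is the kind of thing most large projects bury inside their own I/O layer.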


Most people don't get into systems programming by writing a database completely on their own, and it's fair to say that the line between application and systems programming is fuzzy.

Here are a bunch of things I would broadly consider systems programming:

- Kernel and driver development

- Low level library development (pretty much anything involving bit wrangling)

- Platform abstraction libraries

- Database development (not usage)

- Message queuing systems

- Daemon / server development

Perhaps the recurring pattern there, and differentiated from application programming is that most of those are tools for other applications, rather than applications themselves. I don't think that all library / daemon development is systems programming, but a whole lot of it is.

I'm coming from a background of having done 5 of the 6 of those groups (though there could obviously be more listed there). For most significant projects, as noted, there will be internal APIs that wrap system APIs (or, in the case of the kernel, the system APIs are totally irrelevant except for the parts that implement the system calls). Generally someone first jumping into those projects isn't going to immediately start hacking on the internal libraries.

As it were, I have actually written a database from scratch [1]. That database uses Qt and Qt-like APIs internally where possible. There are places in there where you need to know the intricacies of mmap, fsync and similar, but they're compartmentalized to a couple of classes. I'd still call code that isn't in those couple of classes "systems programming".

[1] https://blog.directededge.com/2009/02/27/on-building-a-stupi...


I realize it's unlikely anyone will see this, but to amplify this point a little, I thought I'd poke through the Postgres source (which I'm not otherwise familiar with). By my count, mmap is only called in 4 places, in three files.

The implementation behind this file is one of them:

https://doxygen.postgresql.org/fd_8h.html

That is the actual file API that you'd use inside of Postgres. While the API there isn't going to win any beauty pageants, it's significantly more sane than raw POSIX.

For comparison, this is the interface to mmaped files within my company's internal database:

https://gist.github.com/scotchi/83609f5eb9b98ac3b4b4476ed621...

There's a grand total of 1 call to mmap in all of our source. Most of the time for writing code using mapped files here, all you'd see is the interface above.
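
As a rough illustration of that kind of compartmentalization (this is not the code from the gist or from Postgres; `struct mapped_file` and both function names are made up for the example), a wrapper like this keeps the one mmap call in a single place:

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical wrapper type: callers only ever see data and size. */
struct mapped_file {
    void  *data;
    size_t size;
};

/* Map an existing, non-empty file read/write.
 * Returns 0 on success, -1 on failure. */
int mapped_file_open(struct mapped_file *mf, const char *path)
{
    struct stat st;
    int fd = open(path, O_RDWR);

    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0 || st.st_size <= 0) {
        close(fd);
        return -1;
    }
    mf->size = (size_t)st.st_size;
    /* The only mmap call in the whole sketch. */
    mf->data = mmap(NULL, mf->size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);  /* the mapping stays valid after the descriptor is closed */
    if (mf->data == MAP_FAILED) {
        mf->data = NULL;
        return -1;
    }
    return 0;
}

/* Flush dirty pages back to the file, then unmap.
 * Returns 0 on success, -1 if either step failed. */
int mapped_file_close(struct mapped_file *mf)
{
    int rc = 0;

    if (msync(mf->data, mf->size, MS_SYNC) < 0)
        rc = -1;
    if (munmap(mf->data, mf->size) < 0)
        rc = -1;
    mf->data = NULL;
    mf->size = 0;
    return rc;
}
```

Everything outside these two functions just reads and writes through `mf.data`, which is roughly what "all you'd see is the interface" means in practice.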


I want to add to this. People might think they lack opportunities for this exercise because they don't have special requirements out of the open source packages they use. But realize that breaking a piece of code in a controlled manner also needs roughly the same level of understanding as adding a feature.

It could be as simple as a `sleep` in the right circumstances. Maybe your resulting patch is a boring handful like:

    bool cond1 = ...;
    bool cond2 = ...;
    if (cond1 || cond2) {
      sleep(1000);
    }
but you need to read and develop understanding of the code to define the conditions in terms of the program's state. And you also need the same understanding to know where exactly in the code to introduce this snippet. Not to mention compiling it; even with Makefiles, lower-level languages could still be tricky to build.

How this can be useful: A long time ago, we were observing odd behavior from Redis under extremely high load. Of course it did not reproduce consistently, and when it did, it was too brief for us to properly observe and form hypotheses. So I had the brilliant idea of installing a custom Redis binary in our test environment that tweaked the odds so that the behavior happened almost consistently, no need to thrash the server. I had to read through Redis source code to make it happen (and added a hell of a lot of logs too). Plus in the end, it's not quite compiling Linux but, boy, I did compile Redis from scratch!


This.

I can relate to this as well. I will share a couple of things to add to it.

I have ~8 YoE now. At my first job, I did not have a formal CS degree: I had completed a bachelor's in commerce and was struggling with a master's in computer applications (I had year drops). My first job was at a company started by ~7-8 ex-Veritas folks, all of them hardcore systems developers. I dreamed of being like them some day (yet to happen). About a year into this job I shared my aspiration to become a systems developer, and I was given a task: implement a persistent RAM mechanism, something that keeps data in RAM across a soft reboot without dumping it to the hard drive, using the Linux kernel. The "what to do" (the trick/technique) was told to me; the "how to do it" was left to me. It took me 4 days (2 weekends) to complete. Over the first weekend I learnt how to get the Linux source code, add a custom syscall, compile the kernel, etc. On the second weekend I actually went through the code, found places to add patches, tested, etc.

I was also afraid before starting, and my then boss said a few things like "because you think folks sitting in the west are something special, they are not ..." and "the whole thing is man-made. If one man can do it, so can you". It was a matter of going through the code and understanding it. Do things repeatedly without giving up. Spend long hours, take notes. Once you have the context and the code is running in your head, you get what to do!

I did something similar a few years after this. I was learning Go and wanted to do something better. I got into the Delve codebase (a debugger for Go) and patched it to work for cgo binaries. It's a small patch, but for it I had to learn Delve's architecture, what the DWARF standard is, and so on. The context was pretty huge compared to what I needed to patch. Same story though: I was a little afraid, thinking, omg, it's a freaking debugger, how am I going to understand all this well enough to make changes? But in fact they are just the same constructs.

Go ahead and jump into some project you use; that helps you understand the codebase faster. Read, read, read. Give yourself time to learn the codebase. You will definitely be able to contribute "properly" to that project. Don't be afraid :)


My perspective:

Anything that's been done by a lot of people, has been done well-enough by some stupid asshole.

I am a stupid asshole.

Therefore, I can probably figure out how to do [thing] well-enough, if it's been done by a lot of people.

It's worked out OK so far.


I have printed out a foreword of an old calculus book, and taped to my wardrobe door. It's a long-ish rant, but it ends with this:

    What a fool can do, another can.


    Considering how many fools can calculate, it is surprising that it should be thought either a difficult or a tedious task for any other fool to learn how to master the same tricks.

    Some calculus-tricks are quite easy. Some are enormously difficult. The fools who write the textbooks of advanced mathematics – and they are mostly clever fools – seldom take the trouble to show you how easy the easy calculations are. On the contrary, they seem to desire to impress you with their tremendous cleverness by going about it in the most difficult way.

    Being myself a remarkably stupid fellow, I have had to unteach myself the difficult, and now beg to present to my fellow fools the parts that are not hard. Master these thoroughly, and the rest will follow. What one fool can do, another can.

http://djm.cc/library/Calculus_Made_Easy_Thompson.pdf


Pretty sure that's Calculus Made Easy, a book which I, as a fool, certainly ought to recognize, and yeah, that's the gist of my approach to "scary" topics. "Have lots and lots of people done it? Then I'll likely be fine, because some of those people were assuredly at least as dumb as I am."


A small counterpoint: I once worked in a code base where small changes would regularly ripple out and cause bugs far far away. You had to traverse a 30-50 source file chain of logic to find what was happening. This was “normal.”

I’m not saying you have to be afraid, just that you sometimes have to tread a bit slower and do a lot of testing as you go until you learn the internals of whatever system you are working on

(Aside: this is what coupling does — you should fear coupling!)


My thoughts as well. Send patches in to FLOSS projects.

I sent in a patch to an experienced C programmer a few years ago and in his code review he said about my #ifdef flags - "Typically HAVE_X are about the environment and USE_X are optional features within that environment."
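
A minimal sketch of how that convention tends to look in code. `HAVE_UNISTD_H` follows the autoconf naming style and `USE_COLOR` is a made-up feature flag; both are hard-coded here purely for illustration, where a real build would set them from a configure step or -D flags. `color_enabled` is a hypothetical helper.

```c
/* Assumed for illustration: a real build would set these from a configure
 * step or -D compiler flags rather than hard-coding them here. */
#define HAVE_UNISTD_H 1   /* environment: <unistd.h> exists on this platform */
#define USE_COLOR     1   /* feature: the builder opted in to colored output */

#if HAVE_UNISTD_H
#include <unistd.h>
#endif

/* Hypothetical helper: returns 1 if colored output should be emitted on fd,
 * 0 otherwise. */
int color_enabled(int fd)
{
#if USE_COLOR && HAVE_UNISTD_H
    return isatty(fd) ? 1 : 0;   /* only color real terminals */
#else
    (void)fd;                    /* feature off, or no way to detect a tty */
    return 0;
#endif
}
```

The HAVE_ flag answers "can this platform do it?", the USE_ flag answers "did whoever built this want it?", and the code needs both to be true.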

Listening to Linux kernel developers online, they say a lot of people start kernel development with one small patch, or attempting to port some new device or chip to work with Linux. Once their patch gets merged, they sometimes continue to send in patches. The developers say that good contributors tend to not last long as independent contributors, and tend to get pulled into companies who need such programmers.

And there are open jobs for people with such skills ( https://us-redhat.icims.com/jobs/84882/senior-software-engin... )


There are two books that taught me how systems work.

- One system in isolation - Operating Systems: Three Easy Pieces. Covers persistence, virtualisation and concurrency. This book is available for free at https://pages.cs.wisc.edu/~remzi/OSTEP/

- Multiple systems, and how data flows through them - Designing Data Intensive Applications. Covers the low level details of how databases persist data to disk and how multiple nodes coordinate with each other. If you’ve heard of the “CAP theorem”, this is the source to learn it from. Worth every penny.

More on why these two books are worth reading at https://teachyourselfcs.com


I misread the first line of your bullets and thought the titles were:

- One system in isolation

- Multiple systems, and how data flows through them

Which would be really good names for a two part series.


I agree with the Operating Systems: Three Easy Pieces textbook. It was used in one of my courses in college and even as a bad dev then, it made me appreciate a lot of things I take for granted, as well as C. Although once I got to the scheduler, mutexes, etc. I lost interest haha. But making my own linked list was a very fun exercise.


On OSTEP, there's a course in educative.io from the authors of the book:

- https://www.educative.io/courses/operating-systems-virtualiz...

Which is worth checking as an introductory course.


Start using your operating system directly instead of relying on libraries to do things for you. Learn Linux system calls and use them for everything. A great way to do this is to compile your C code in freestanding mode and with no libc. You'll have to do everything yourself.

It is easy to get started. Here's a system call function for x86_64 Linux:

https://github.com/matheusmoreira/liblinux/blob/master/sourc...

With this single function, it is possible to do anything. You can rewrite the entire Linux user space with this.
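
For readers who don't want to click through, here is a rough sketch of what such a function can look like. It is not the code from the linked repository; `raw_syscall3` is a made-up name, it handles only three arguments, and the inline assembly path is specific to x86_64 Linux (other platforms fall back to libc's `syscall()` so the sketch still builds). Note the two paths report errors differently: the raw instruction returns a negative errno value, while the libc wrapper returns -1 and sets errno.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>

/* raw_syscall3: hypothetical name. Issues a system call with up to three
 * arguments. On x86_64 Linux this goes straight to the kernel. */
long raw_syscall3(long number, long a, long b, long c)
{
#if defined(__x86_64__) && defined(__linux__)
    long ret;

    /* x86_64 Linux convention: number in rax, arguments in rdi, rsi, rdx.
     * The syscall instruction clobbers rcx and r11. */
    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "a"(number), "D"(a), "S"(b), "d"(c)
                      : "rcx", "r11", "memory");
    return ret;
#else
    /* Fallback for other platforms: let libc issue the call. */
    return syscall(number, a, b, c);
#endif
}
```

For example, `raw_syscall3(SYS_write, 1, (long)"hi\n", 3)` writes three bytes to stdout without going through stdio or the libc `write` wrapper.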

The Linux kernel itself has a nolibc.h file that they use for their own freestanding tools:

https://github.com/torvalds/linux/blob/master/tools/include/...

It's even better than what I came up with. Permissively licensed. Supports lots of architectures. There's process startup code so you don't even need GCC's startfiles in order to have a normal main function. The only missing feature is the auxiliary vector. You can just include it and start using Linux immediately.

You can absolutely do this with Rust as well. Rust supports programming with no standard library. If I remember correctly, the inline assembly macros are still an unstable feature. It's likely to change in the future though.


What worked for me is trying to develop my own kernel. This might be too low-level for you, but it helps immensely to dispel the magic around how code gets executed by the CPU and the OS. You'll learn how the OS achieves protection from user programs, what are interrupts and how does the OS handle them, what are processes and threads and how does the OS schedule their execution, how virtual memory works (segmentation/paging), how the program is laid out in memory (code, data, heap, stack), how runtimes like the C runtime manages memory and links your code to OS syscalls, how dynamic linking works, what is an ABI, what is a calling convention, how does the compiler, the linker, and the OS know what each should do? How memory mapped file access works, etc. I can keep going, but you get the picture.

From there, you'll know where to go next based on what you've learned so far.

My recommended reading list is:

[1] Operating Systems: Three Easy Pieces https://pages.cs.wisc.edu/~remzi/OSTEP

[2] Intel Software Developer Manuals (especially volume 1 and 3A) https://software.intel.com/content/www/us/en/develop/article...

[3] OSDev wiki https://wiki.osdev.org


Pick an operating system, e.g. Haiku(BeOS), Plan9, Net/Open/Free-BSD. Download the source code and dig in. Read the source, the comments, commit logs and email threads. IMO Linux has gotten too big and complex due to having to support so many processors and devices.

Most existing systems programming is in C/C++. Rust is new and there isn't much battle hardened code out there.


Plan9 is known for having very accessible code.


There are a few good YouTube channels of strong system programmers doing live coding. Watching videos takes time, but you can pick up lots of techniques, big and small, by watching people work.

Since you're looking at rust, https://www.youtube.com/user/gamozolabs/videos could be a good fit.


Any channels you can recommend that have live system coding in C?


Hello there, I have quite a similar background (10 years of experience in backend and native Android development). Most of the comments already contain good advice. Systems engineering takes time. I followed the long route: I initially took some courses in digital logic and computer architecture, and then later took the following OS courses.

1. https://pdos.csail.mit.edu/6.S081/2020/schedule.html (check video links and do all the lab assignments)

2. https://www.youtube.com/watch?v=dEWplhfg4Dg&list=PLf3ZkSCyj1... (based on old MIT 6.828)

3. Networks Review (self study)

Somebody mentioned Intel manuals in the comment (those are really helpful)

These courses helped a lot. I would suggest taking some firmware and device driver development course as well. My conclusion is that these are tough skills, and the learning process can be accelerated if you can find an entry-level position in a small company which does this kind of work.


I just completed the xv6 projects for that course and they were damn difficult, but taught me a lot. I highly recommend it.

The only thing I missed was support for when I really ran into a wall, but perhaps I just don't know the right irc channel.


I totally agree on that. I tried the old version (Intel based). And it took me quite some time to do the projects. But I learned a lot in the process.


If you're interested in systems programming with Rust then I can recommend "Rust in Action" by Tim McNamara [1].

You've found some pretty good resources already by the looks of it though.

[1] https://www.manning.com/books/rust-in-action


Thanks for recommending my book Sam.

Author here - please feel free to ask any questions :)


First off, don't worry. You already have all the necessary tools you need.

Although the language you use to learn system & network programming doesn't matter much, it is better if you use C or C++ to practise and learn. This is because the kernel itself is written in C and exposes system calls that can be used directly from a C/C++ program. That said, "The Linux Programming Interface"(I am personally reading it) is a really good book. It talks a lot about how one should go about using system calls to get things done by the kernel. Make sure to read a little every day and try out the examples by writing C/C++ programs.

I recently realized that TLPI doesn't talk much about why things are the way they are (a very good example would be virtual memory and related topics). You should refer to a more theoretical book for this. I suggest you go with "Operating Systems" by Deitel & Choffnes.

Read man pages and practise using the libc/kernel APIs. For example, if you want to know about flushing, read 'man 3 fflush'. You need it when output buffered by the C library has to be pushed out before you wait for fresh input from stdin. For example, if prompts are buffered, you definitely don't want to "scanf" before you have flushed the buffers. If you want to learn network programming, read the chapters related to sockets and refer to 'man 2 socket'.
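
A minimal sketch of the prompt case, assuming a made-up helper called `prompt_line`. The point is the `fflush` between writing the prompt and reading: when the output stream is fully buffered (for example stdout redirected to a pipe or file), a prompt without a trailing newline can otherwise sit in the stdio buffer while the program is already blocked waiting for input.

```c
#include <stdio.h>
#include <string.h>

/* prompt_line: hypothetical helper. Writes prompt to out, flushes it, then
 * reads one line from in into buf (without the trailing newline).
 * Returns 0 on success, -1 on EOF or error. */
int prompt_line(FILE *in, FILE *out, const char *prompt, char *buf, size_t size)
{
    if (size == 0)
        return -1;
    if (fputs(prompt, out) == EOF)
        return -1;
    if (fflush(out) == EOF)           /* push the prompt out of the buffer now */
        return -1;
    if (fgets(buf, (int)size, in) == NULL)
        return -1;
    buf[strcspn(buf, "\n")] = '\0';   /* strip the trailing newline, if any */
    return 0;
}
```

Typical use would be `prompt_line(stdin, stdout, "name: ", name, sizeof name)`.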

You will eventually get to a point where you will be able to connect all the dots(APIs) and be able to figure out what exactly you will need to get some problem solved.

Finally, don't learn for a future job. Learn for yourself. This will help you in the long run.


I took the Computer Security and Internet Security courses from Professor Du at Syracuse University, years ago. Both courses had end projects that required extending a kernel and userspace to implement security functionality.

At that time, we had the option to work with MINIX. Here are the MINIX Role-based Access Control and Firewall Labs:

https://web.ecs.syr.edu/~wedu/seed/Labs_12.04/System/RBAC_Ca... https://web.ecs.syr.edu/~wedu/seed/Labs_12.04/Networking/Fir...

Professor Du's materials are also packaged for self-learners and other teachers to use, as the open source SEED project. A few of the current SEED projects are implementation-exercises similar to the above two labs.

https://seedsecuritylabs.org/

I highly recommend the above resources.


Back when I started coding, systems programming meant books like these:

https://www.atariarchives.org

https://archive.org/details/pcinternsystempr0000tisc

Granted, they are probably too old, but the concepts of what systems programming actually is are there. You can then get hold of an Arduino or Raspberry Pi-like device and write your own little OS or bare-metal game:

https://www.cl.cam.ac.uk/projects/raspberrypi/tutorials/os/o...

http://www.science.smith.edu/dftwiki/index.php/Tutorial:_Ass...

Or maybe trying your hand at compilers with https://www.nand2tetris.org or given the similarities of Rust with ML, maybe dive into Tiger Book (https://www.cs.princeton.edu/~appel/modern/ml).

Or maybe

https://www.manning.com/books/rust-in-action

https://www.apress.com/gp/book/9781484258590 (Rust for IoT)


KeithS from https://www.assemblytutorial.com/ https://www.youtube.com/c/ChibiAkumas/ does introductions to assembly programming on nearly every old microcomputer platform you can think of.


That is cool!


Thank you for mentioning Rust in Action.


From personal experience, what made low-level concepts "click" for me was looking at documentation for linkers and the various ELF tools. Once I could see the "ends", it gave me the inspiration to dig into the "means". Ymmv, of course.
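
A small way to start poking at those "ends" yourself, as a sketch: read the first 16 bytes of a file and check the ELF identification. `elf_class` is a made-up helper name; `<elf.h>` and the `EI_*`/`ELF*` constants are the ones glibc ships on Linux, so this assumes a Linux system.

```c
#include <elf.h>
#include <stdio.h>
#include <string.h>

/* elf_class: hypothetical helper. Returns 64 or 32 for the ELF class of the
 * file at path, 0 if the file is not ELF, -1 if it cannot be opened. */
int elf_class(const char *path)
{
    unsigned char ident[EI_NIDENT];
    FILE *f = fopen(path, "rb");
    size_t n;

    if (f == NULL)
        return -1;
    n = fread(ident, 1, sizeof ident, f);
    fclose(f);

    if (n != sizeof ident)
        return 0;                                /* too short to be ELF */
    if (memcmp(ident, ELFMAG, SELFMAG) != 0)
        return 0;                                /* magic bytes do not match */

    switch (ident[EI_CLASS]) {
    case ELFCLASS64: return 64;
    case ELFCLASS32: return 32;
    default:         return 0;                   /* invalid class byte */
    }
}
```

Pointing it at `/proc/self/exe` or anything under `/usr/bin` and then comparing with what `readelf -h` prints is a decent first exercise.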


Join the Handmade Network [0], which is a large online community of low-level programmers.

Later this year we're having our third conference [1], so it could prove useful* to meet up with systems programmers there.

* The usual self-plug warning (I organize these things.)

[0] https://handmade.network

[1] https://www.handmade-seattle.com


"Systems Programming" from an Application Developer's POV involves knowledge of:

  * Low-level languages, compiler runtimes, toolchains and libraries.
  * OS system call APIs.
  * Concurrency.
  * Networking.

You need books/papers which will teach and walk you through sample idioms and applications in the above domains. A prerequisite is fluency in the C language. With that in mind, the following are recommended (some are old books which you can buy used and cheap wherever possible):

* Computer Systems: A Programmer's Perspective by Bryant and O'Hallaron.

* The C Companion by Allen Holub. A gem of an oldie.

* ELF: From the Programmer's Perspective; a paper by Hongjiu Lu.

* UNIX Network Programming by Richard Stevens. Initially, get the old 1st edition since it contains TCP/IP, IPC etc. all in one volume.

* Advanced UNIX Programming by Marc Rochkind.

* Advanced Programming in the UNIX Environment by Richard Stevens.

* UNIX Systems Programming by Robbins and Robbins.

* Programming for the Real World, POSIX.4 by Bill Gallmeister.


I've also been trying to learn system programming for a while and what did the trick for me was finding a way to use the C interfaces that UNIX provides.

I've tried to learn C probably 4 times now and I just don't like it. But then I came across the LuaJIT FFI, which very easily lets you load whatever shared library and call whatever syscalls you want directly, and that was a game changer!

After that I decided to test ziglang, where interoperability with C is a big part of the design, and I'm in love with it! It really feels like anything I would need C for I can do in zig.

If Rust is your jam, find a way to call the Linux syscalls directly from Rust, not just using a cargo library, but actually importing the appropriate headers and successfully doing an epoll or something.
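
For reference, this is roughly the C-side shape of "doing an epoll" that you would be reproducing from Rust, Zig or LuaJIT. It is a minimal sketch, Linux only; `wait_readable` is a made-up name, and a real event loop would keep the epoll instance around instead of creating one per call.

```c
#include <sys/epoll.h>
#include <unistd.h>

/* wait_readable: hypothetical helper. Waits up to timeout_ms milliseconds
 * for fd to become readable.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
int wait_readable(int fd, int timeout_ms)
{
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };
    struct epoll_event ready;
    int n;
    int ep = epoll_create1(0);

    if (ep < 0)
        return -1;
    if (epoll_ctl(ep, EPOLL_CTL_ADD, fd, &ev) < 0) {   /* register interest */
        close(ep);
        return -1;
    }
    n = epoll_wait(ep, &ready, 1, timeout_ms);         /* block until event or timeout */
    close(ep);

    if (n < 0)
        return -1;
    return (n == 1 && (ready.events & EPOLLIN)) ? 1 : 0;
}
```

Three calls (`epoll_create1`, `epoll_ctl`, `epoll_wait`) and one struct are the whole surface you need to bind to get started.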

It will feel like suddenly the man (3) pages all make sense and are extremely useful!

Good luck!


> I had a Rust job for around half a year, where people build web servers and came from a C and C++ background. Half of the stuff they wrote I didn't understand (flushing, opening another channel just for logs so we don't fill up the other ones etc. etc.)

I think it's worth mentioning that the majority of topics involved in this type of webserver work wouldn't necessarily be covered by a systems programming book. There's obviously overlap between all of systems programming, networking, and concurrent/distributed systems, but if you plan to focus on web servers, I'd pick up texts on the other topics as well.


It's mentioned below, but for certain "kinds" of systems programming the family of books at: http://aosabook.org/en/index.html would probably be helpful.


Some books I learnt off and still seem to be around

Modern Operating Systems by Tanenbaum is a good theory book - this will probably answer your questions about flushing etc

for down and dirty:

Advanced Programming in the UNIX Environment

TCP/IP Illustrated, Volume 1 (2 and 3)


These books are good, especially the last one by Stevens.

Three things helped me a lot to learn more about systems programming:

(1) the reading of existing systems code, especially (i) from a book called Dr Dobb's C-Tools, which includes a C compiler, assembler and linker as well as many command line tools and (ii) the Minix source code. It was the code in this book rather than K&R or Stevens that let me "get" systems programming because I needed to see the bigger picture, and many books only show small code snippets.

(2) the study of other people's code; if you have access to a C guru, it's really helpful to just peek over their shoulders for a couple of hours as they implement a new module and then debug it (thanks, Gero and Rolf!) - thankfully, there is a new trend of people recording coding sessions and putting them on YouTube, so more people out there can benefit from experienced hackers e.g. https://www.youtube.com/watch?v=1-7VQwWo2Tg . And, of course,

(3) implementing a non-trivial low-level component. For me, this was having to implement the buffer management of a relational database management system in C from scratch as an exercise in my undergraduate degree (we were given 6 weeks, but not full-time, as lectures were going on at the same time). This course, Systems Programming II, was as beneficial as it was gruesome, but I'm grateful to excellent line-by-line pencilled feedback of one tutor that read the complete code and commented every missing return value check etc.


I learnt this the hard way. Textbooks do not do a good job of teaching what real-world code looks like. The best option is to read open source code. Start with sqlite or the openbsd userland tools. The code is very clean and the obsd style is very easy to understand, esp. the clever use of C macros. Sort github by the most popular rust, c++, go or D code and be on your merry way...


I can't give you specific advice, but in general it sounds like you are already on the right track. You are basically exposing yourself to this new stuff, it raises questions that you want to have answered and you reach out to a community to know more. The stuff they wrote that you don't understand YET is part of the journey, and if you continue to practice and work in that domain, you will be able to connect the dots. TL;DR My advice is, be fair to yourself, accept it's a new domain and be resilient and continue despite the difficulties. I hope a systems programmer will give you more specific advice.


A couple of books that are often recommended for understanding how to make robust software:

- Release It! (https://pragprog.com/titles/mnee2/release-it-second-edition/)

- Designing Data-Intensive Applications (https://dataintensive.net/)

I would suggest finding an open source project of interest and taking a deep dive into its code and documentation to understand how it works and why it was built that way.

Which reminds me, this should help with that: The Architecture of Open Source Applications (http://www.aosabook.org/en/index.html)


I would recommend that you go through university course pages for courses related to Operating Systems, Systems Programming, Computer Networks & Distributed Systems. Many of them have links to programming assignments and projects; try implementing those.

Advantages:

1. Problem will be well defined for you.

2. Better to implement eventual consistency between 3 nodes, a distributed file system or a single-user database than to try to figure out a bug in a large open source codebase.

You may find the following links helpful in finding some such courses:

https://github.com/Developer-Y/cs-video-courses

https://github.com/prakhar1989/awesome-courses


All the three books you've listed focus on APIs and code. It's also useful to know what's going on in the machine so that these APIs can be understood more easily.

The "Operating Systems - Three Easy Pieces" is one great book that has already been mentioned. I would also suggest "Computer Systems - A Programmer's Perspective" along the same lines (https://csapp.cs.cmu.edu/).

Computer Networking is another field you're likely to run into. "Computer Networks: A Systems Approach" is a good book (https://book.systemsapproach.org/)


Find the kind of system you'd be interested in working on and start building one yourself. For me it was a game engine + an editor for it, but it could be anything you find the motivation to build.

Start with a book or a few texts/tutorials on the subject and begin building. Along the way you'll find the questions and choices involved. Find the answers from the internet or books. This is when I'd recommend some open source projects (not earlier) to see how they solved the specific problems. If you just go into an open-source project you won't have an understanding of the problem, just the answers, so it won't help you nearly as much.


Follow-up question: how do I learn systems programming on Windows? I know it’s pretty easy to just set up a Linux VM or use WSL2, and I’m prepared to do that if I must, but it would be even easier if I could work in the same Windows environment I normally use. However, from what I’ve seen, practically all guides to systems programming seem to start off by assuming you’re using Linux.

(And by the way, thanks OP for asking this question! I’ve also been wanting to learn systems programming, but I haven’t gotten around to asking yet. And all the suggested resources look fascinating… there goes my university vacation!)


The best start would probably be both parts of Windows Internals:

https://www.amazon.com/Windows-Internals-Part-architecture-m...

https://www.amazon.com/Windows-Internals-Part-Developer-Refe...

These aren't about programming per se, but if you want to do systems programming it helps to have a detailed understanding of the system. :)

After that, specific reading probably depends on the exact task you want to perform, but MS has good documentation and tutorials in many areas. Writing drivers, for example:

https://docs.microsoft.com/en-us/windows-hardware/drivers/ge...


To add on, the windows internals books have exercises with the sysinternals tools. If sitting down and plowing through a doorstop of a book isn't your cup of tea, try picking a chapter or two which sound interesting, skimming them, and doing the exercises.


IMHO getting into the world of malware analysis and reverse engineering could be profitable for both your brain and your pocket. And they force you to go deep from day one.

I do not have the expertise to work in either field, but it is something I plan to do. The good part is that most malware targets Windows, so you get a lot of samples.


This is good advice.

Also, Windows is an excellent way to learn systems programming because the documentation and tooling are so good and things hardly ever change.

There is a wealth of documentation on MSDN for writing device drivers and such, and there are great tools for remote debugging, so you can set up a VM in Hyper-V and step through the code from the host system.

Do make sure if you're analyzing malware you do it in a VM on a machine you don't care about having to wipe, and isolate it from the rest of your network.


At the foundation of a Windows system are two layers: the Win32 calls and, beneath them as the bedrock, the Native API - https://en.wikipedia.org/wiki/Native_API

You can access them via assembler, C, etc.


Hack on OS161. The assignments are a great start and there’s scope to extend a simple working system to do whatever you want.

https://ops-class.org/


Maybe you can start with the classic exercise of writing your own shell. That’s a great way to learn most of the fundamentals of systems programming (and it's usually the go-to exercise for that class in CS courses).
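
For what it's worth, the core of a toy shell is the fork/exec/wait pattern. Here is a minimal sketch, assuming a POSIX system; run_command is a hypothetical helper name, and there is no parsing, piping or job control here:

```c
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run the program argv[0] with the NULL-terminated argument list argv,
 * wait for it to finish, and return its exit status. Returns -1 if the
 * child could not be created or did not exit normally.
 * (run_command is a hypothetical name used for illustration.) */
int run_command(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: replace this process image with the requested program. */
        execvp(argv[0], argv);
        /* execvp only returns on failure. */
        perror("execvp");
        _exit(127);
    }
    /* Parent: block until the child terminates, retrying if a signal
     * interrupts the wait. */
    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) {
            perror("waitpid");
            return -1;
        }
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A real shell would wrap this in a read-parse-run loop, and background jobs mean not waiting right away, which is where signals (SIGCHLD) come in.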


Yep. Write a grep and find (IO-loop, filesystem), a shell (child processes, signals), a simple nmap-replacement (network, DNS), a forking and a multithreaded webserver (more network, more children, synchronisation), a clock with seconds display and something that synthesizes and plays music on keypress (timing, waiting).

Everything just in the simplest sense, no fancy features needed: the grep just needs a string parameter to search for and some files, the shell doesn't need completion, scripting or variables, just execution and background jobs. For the webserver, just serve some static files from the URL, ignore security, concentrate on getting lots of clients served at the same time however. Bonus points if you make the main process/thread gather meaningful statistics/logs and not screw up concurrency. Make sure to learn the right synchronisation primitives and use them properly. For the clock, make sure to explore sleep-based, loop-based and signal-based approaches and compare them. Get all three second-ticks to be in sync with your pocket watch and not skip/delay/hang. The music exercise is similar, just event-based (you get to handle key input and buffer low or tick events), either with something like select/epoll or multiple threads. The music itself is not interesting, a simple sine or rectangle signal suffices. But of course reaction time should be low and sound should be glitch-free.
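
To give an idea of how small "simplest sense" can be, here is a rough sketch of the grep IO loop, assuming POSIX getline() is available. grep_stream is a made-up name, and it only does plain substring matching:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* ask for getline() */
#endif
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Copy every line of `in` that contains `needle` to `out` and return
 * the number of matching lines. (grep_stream is a hypothetical name.) */
long grep_stream(FILE *in, const char *needle, FILE *out)
{
    char *line = NULL;  /* getline() allocates and grows this buffer */
    size_t cap = 0;
    long matches = 0;

    while (getline(&line, &cap, in) != -1) {
        if (strstr(line, needle) != NULL) {
            fputs(line, out);
            matches++;
        }
    }
    free(line);
    return matches;
}
```

A main() around it would just fopen() each file named on the command line and call grep_stream(f, argv[1], stdout).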

What is missing from the above but important are the security aspects of systems programming, most of which are either problems with certain languages (learn how to avoid, recognize and exploit a simple buffer overflow, format string exploit), security aspects around file creation (especially temporary files, but also general symlink attacks), SUID-bits, permissions/ACLs/MAC and generally privilege separation. Those aren't easy to put into learning-by-doing exercises, because you would need an attacker to slap you over the head when you make a mistake there ;)
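
As one small illustration of the temporary file point, a sketch assuming POSIX mkstemp(): it picks an unused name and creates the file exclusively with owner-only permissions, which is what defeats the classic "attacker pre-creates a symlink at a predictable name" trick. write_private_temp is a hypothetical helper name:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* ask for mkstemp() */
#endif
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Create a private temporary file and write `data` into it.
 * `path_template` must be a writable buffer ending in "XXXXXX"; on
 * success it holds the real file name and 0 is returned, otherwise -1.
 * (write_private_temp is a hypothetical name.) */
int write_private_temp(char *path_template, const char *data)
{
    /* mkstemp() creates the file as if with O_CREAT | O_EXCL and mode
     * 0600, so it never opens a file or symlink that already exists. */
    int fd = mkstemp(path_template);
    if (fd < 0)
        return -1;

    size_t len = strlen(data);
    size_t off = 0;
    while (off < len) {
        ssize_t n = write(fd, data + off, len - off);
        if (n < 0) {
            if (errno == EINTR)
                continue;  /* interrupted by a signal: retry */
            close(fd);
            unlink(path_template);
            return -1;
        }
        off += (size_t)n;  /* handle short writes */
    }
    return close(fd) < 0 ? -1 : 0;
}
```

The caller passes something like char path[] = "/tmp/myprog_XXXXXX"; and is responsible for unlinking the file afterwards.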

As a programming language, for Linux/Unix I would strongly recommend using C, not C++, not anything else. Use libc or plain syscalls, nothing fancier. When you have mastered the above in C, you know how to appreciate other, better languages, but also know where those may be lacking. If you just do the above in Python, you learned nothing about systems programming and everything about Python lib idiosyncrasies.


> Now I wonder how I can get access to this type of information, how to properly learn it?

You've already got half the solution... "Advanced UNIX Programming" has been the go-to since it was written, and it has a tonne of industry knowledge, e.g. fork twice and close handles etc.

"The Linux Programming Interface" is kind of that book for the 21st century, and covers pretty much any topic you'll ever encounter on the systems end.

> Half of the stuff they wrote I didn't understand

Sounds like you bought the books but never read them... because it's all in there.


Like any other area you want to learn, you should go in with a goal that gives you some intrinsic motivation. At this point in history, Systems Programming is too broad for one person to ingest. Do you want to work on networking? Filesystems? Databases? If you want to get into networking for example (my area) I would recommend you just skip ahead and try to learn eBPF (which is clearly taking over, as Windows has now implemented it). If your interests lie elsewhere, really do some up-front research into what you should be learning.


I would also add Computer Systems: A Programmer's Perspective, 3rd Edition by Bryant and O'Hallaron and C Programming: A Modern Approach, 2nd Edition by K.N. King, to the list. I'm learning systems programming and theory at the moment as well. Both are venerable seminal works on their subjects for the self guided learner.

If anyone here wants a quality PDF of either, I can make that happen; I digitize textbooks as a hobby. I bought both books new from Amazon and they were well worth the price.


I’m interested in this. I live in a country where I can’t get the US editions without spending the equivalent of a big percentage of my monthly income. I have the international edition of CSAPP, but the booksite says it’s full of errors and even misordered pages, and the problem sets don’t match.


Take a look at the APUE course: https://stevens.netmeister.org/631/


If you want to learn Rust for systems programming, first learn C and write something non-trivial in it. Then run it through Valgrind [1] with your test suite. You can also try the sanitizers (AddressSanitizer, MemorySanitizer, UndefinedBehaviorSanitizer, ThreadSanitizer, etc.).

This will teach you what Rust is saving you from and help you understand how it does it.

[1]: https://www.valgrind.org/


The first book in particular is brilliant - I recommend reading it cover to cover.

I would suggest some kind of performance-oriented project, where syscall costs and concurrency issues matter. This will give you a 'feel' for what's going on, and why what's available is available.
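
A tiny experiment in that spirit, as a sketch only: push the same number of bytes through write() in different chunk sizes and count how many system calls it takes. write_in_chunks is a hypothetical name, and the 4096-byte source buffer is an arbitrary choice:

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Write `total` bytes to fd in pieces of at most `chunk` bytes and
 * return how many write() system calls completed, or -1 on error.
 * (write_in_chunks is a hypothetical name.) */
long write_in_chunks(int fd, size_t total, size_t chunk)
{
    static const char zeros[4096];  /* source data; contents don't matter */
    long calls = 0;

    if (chunk == 0 || chunk > sizeof zeros)
        return -1;
    while (total > 0) {
        size_t want = total < chunk ? total : chunk;
        ssize_t n = write(fd, zeros, want);
        if (n < 0) {
            if (errno == EINTR)
                continue;  /* interrupted by a signal: retry */
            return -1;
        }
        if (n == 0)
            return -1;     /* no progress; give up rather than spin */
        calls++;
        total -= (size_t)n;
    }
    return calls;
}
```

Time both variants against /dev/null with clock_gettime(CLOCK_MONOTONIC), or run the program under strace -c, and the per-call overhead shows up quickly. That overhead is the reason stdio buffers output and why flushing is something you have to think about.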

Finally, I suggest reading lwn, which will give you a very good idea of what's happening in the kernel:

http://lwn.net


If you know folks in your company are writing things you are interested in, reach out. Most of those people are happy that someone is taking an interest in what they are doing. The information they provide will be meaningful and valuable. Use what you learn from those conversations in your daily job to better position yourself for future roles/jobs/projects.


Come up with a project and write code. For example, you can just implement the web server you were mentioning. That is a good start. Build it and see how to scale it to handle millions of requests. Just creating the code to test it will be a good exercise. You can dig into a lot while writing a web server. After that you can pick up another project in the kernel.


Another interesting rabbit hole to explore is the compiler. Back in the day I wrote a toy compiler for a college course and used this textbook: "Compilers: Principles, Techniques, and Tools", a.k.a. "The Dragon Book", but I would look at some of the other books here like "Modern Operating Systems" before this.


The Dragon Book is literally a bad compiler book and you shouldn't read it. I don't know why people recommend it (not saying you did.) It explains things poorly and gets far too into the weeds on things that don't matter.

I recommend "Parsing Techniques: A Practical Guide" and Muchnick's compilers book - though there's probably something better by now.


I feel like dragon book focuses too much on formal math

this is great:

https://news.ycombinator.com/item?id=25386756


Just as an add on to this. Are there any self-taught programmers looking for lab partners for the OSTEP book projects?


If you want to get into kernel or compiler development, we have plenty of bugs that we could use help with: https://github.com/ClangBuiltLinux/linux/issues


You know, I don’t think there exists a good book on the subject of systems programming in the sense of building robust concurrent software. There is so much information you gain by experience, and it just hasn’t been put together in one place, if written down at all.


Get a job in the field. Some places might be willing to take a chance on you if you are a good generalist engineer.

Also, make the case that many things that you learn while doing frontend programming like asynchronous programming are skills that will port over just fine.


You could try writing a few kernel modules, then implement a basic os (there are loads of great tutorials out there). Other good projects in the same space are implementing a shell, a memory allocator, a DB, a buffer overflow attack...
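
For the memory allocator, the simplest possible starting point is a bump (arena) allocator. A rough sketch, assuming C11 and a caller-supplied buffer that is itself suitably aligned; the arena_* names are made up for illustration, and there is no per-object free:

```c
#include <stddef.h>
#include <stdint.h>

/* A bump allocator over a caller-supplied buffer.
 * (The arena type and arena_* functions are hypothetical names.) */
typedef struct {
    unsigned char *base;  /* start of the backing buffer */
    size_t size;          /* total bytes available */
    size_t used;          /* bytes handed out so far */
} arena;

void arena_init(arena *a, void *buf, size_t size)
{
    a->base = buf;
    a->size = size;
    a->used = 0;
}

/* Return n bytes at an offset aligned for any object type, or NULL if
 * the arena is exhausted. Assumes `base` itself is suitably aligned. */
void *arena_alloc(arena *a, size_t n)
{
    const size_t align = _Alignof(max_align_t);
    size_t start = (a->used + align - 1) & ~(align - 1);

    if (start < a->used || start > a->size || n > a->size - start)
        return NULL;
    a->used = start + n;
    return a->base + start;
}

/* Release everything at once; individual frees are not supported. */
void arena_reset(arena *a)
{
    a->used = 0;
}
```

From there you can grow it into a free-list allocator and start caring about fragmentation, which is where it gets interesting.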


CSAPP is an incredible book that goes into the lower-level details of computer systems. There are lectures available online as well. teachyourselfcs.com is a good compilation and guide to resources like these.


https://github.com/SerenityOS/serenity folks are building a C++ based OS and they have discord access


These two new books on OS and performance are probably worth exploring and studying:

[1] Operating Systems Foundations with Linux on the Raspberry Pi: Textbook

[2] Systems Performance (2nd Edition)


Make sure your body and mind are calm and alert. Turn off overhead lights at 9pm. Rise at a consistent time in the morning and view the outside world[1]. Eat nutritious food.

Make sure your desk is clean. Get yourself a nice big pad of dotted paper.

[1] https://youtu.be/nm1TxQj9IsQ


Just to check my own knowledge. Do you mean opening another socket for logs?

If not, what is a channel?


Pretty sure he was referring to files, see http://www.di.uevora.pt/~lmr/syscalls.html (search for channels)


Find an open source project.


These are some topics you might be interested in:

* C/C++

Learn C and its standard libs (stdio etc.), if you haven't already. Choose a good book for this because many tutorials etc. you find online are pretty incomplete. Then read one of your books about UNIX APIs.

It's also worth learning how to use a C debugger (gdb or the Visual Studio one).

* OS

You can look at the Linux source code but it would be daunting. I'd suggest starting with a teaching OS and accompanying book, e.g. xv6 or MINIX.

Try making modifications to it. Toy OSes generally come with such exercises.

* Assembly language

Learning C doesn't give you a complete picture of how machines work. Learn x86-64 assembly programming. You can inspect the assembly output of your C programs using godbolt's Compiler Explorer website. Assembly is a little boring, so don't try to memorize instructions. The mental model is the important thing.

* Basics of algorithms and data structures

Maybe you already know these, because you worked on back end. But if you're not familiar with a few data structures like hash tables, B-trees etc., it might be worth familiarising yourself.

Some miscellaneous topics you might get interested in: linkers / executable formats, OS level stuff related to computer networks, multi threading, file systems, SIMD / vector instructions in assembly.

As others said, write some code when learning. You don't need to do entire projects yourself, you can also play around with established projects.


>Choose a good book for this because many tutorials etc.. you find online are pretty incomplete.

As an aside, the C Programming Language by Kernighan and Ritchie is still the best C book I can think of.

And Hacker's Delight is always a fun book to play with algorithms.


> I had a Rust job for around half a year, where people build web servers and came from a C and C++ background

I don't think this is system programming, but rather network programming.

You should try embedded, or try to write drivers.

Better yet, write a mini OS of your own that interfaces with hardware.

Edit: Wow, I'm getting downvoted. I guess political correctness.


> I don't think this is system programming rather network programming.

This is probably why you are getting downvoted. Squabbling about semantics / gatekeeping what counts as systems programming isn't useful. It probably doesn't answer OP's question, and it is rather rude to assume that what OP meant by "systems programming" is wrong just because it is different from what you mean by "systems programming".

edit: maybe it was your dead comment that caused the downvotes.


I think a good approach to learning systems programming would be to find a piece of hardware, read through its datasheet, and attempt to implement its driver. Or program a microcontroller in Rust. Or try to get "hello world" to display on a PC running without an OS.

There are a lot of neat, and not-too-intimidating things that will give you a good appreciation of the techniques used in systems programming that you don't see in other areas. E.g. reading technical documentation (datasheets, reference manuals etc.), reading and writing registers, familiarizing yourself with memory layout, and communications protocols. Writing a HAL to see how you can abstract over register writes to make high-level code.
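
To make the register/HAL point concrete, here is a sketch in C for an imaginary GPIO block. The register layout and the gpio_* names are invented for illustration; a real part's datasheet defines the actual offsets and bit meanings:

```c
#include <stdint.h>

/* Hypothetical register layout for an imaginary GPIO peripheral.
 * `volatile` tells the compiler every access has a side effect, so
 * reads and writes are not cached or optimized away. */
typedef struct {
    volatile uint32_t mode;  /* one bit per pin: 1 = output */
    volatile uint32_t out;   /* output level per pin */
    volatile uint32_t in;    /* input level per pin (read-only) */
} gpio_regs;

/* Configure `pin` as an output (read-modify-write of the mode register).
 * Pin numbers are masked to 0..31 to keep the shift well defined. */
void gpio_set_output(gpio_regs *g, unsigned pin)
{
    g->mode |= UINT32_C(1) << (pin & 31u);
}

/* Drive `pin` high (level != 0) or low (level == 0). */
void gpio_write(gpio_regs *g, unsigned pin, int level)
{
    uint32_t bit = UINT32_C(1) << (pin & 31u);
    if (level)
        g->out |= bit;
    else
        g->out &= ~bit;
}

/* Return 1 if `pin` currently reads high, else 0. */
int gpio_read(gpio_regs *g, unsigned pin)
{
    return (int)((g->in >> (pin & 31u)) & 1u);
}
```

On real hardware you'd point a gpio_regs * at the base address from the reference manual; on a PC you can point it at an ordinary struct in RAM and unit test the bit twiddling.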

In other words, I agree, and don't think this post's parent was gatekeeping; it's the definition I had assumed as well.


Web servers would be counted as systems software by most people. I've been a 'systems' person my whole career, part of the systems community, publishing at systems conferences and doing work in a systems research group, and I'd count web servers as systems and so would most people I know.


Drivers are not the only system software there is.


I meant hardware drivers, like display drivers.

The OP asked about System Programming not System software.

Driver is an interface and you can write application software on top of it.


> The OP asked about System Programming not System software.

System programming obviously involves writing system software.


In what world could the negative response to your pedantic and gatekeeping comment be considered "political correctness"?



