How to Make a Computer Operating System in C/C++ (github.com/samypesse)
235 points by SamyPesse on Dec 1, 2013 | hide | past | favorite | 85 comments



I am a graduate student, currently working on building an x86_64 Unix-like preemptive kernel from scratch as part of a course. Most OS dev guides and books focus on 32-bit archs, and I haven't found a single guide so far that is based on a 64-bit arch. Since this guide seems to be in its inception, I hope someone (hopefully me) will send a pull request for a 64-bit tutorial.

Building an OS has been on my bucket list for a long time, and it has been one heck of an experience so far.

Here are some good step-by-step OS-from-scratch programming guides

http://jamesmolloy.co.uk/tutorial_html/index.html

http://www.osdever.net/bkerndev/Docs/title.htm

http://www.ijack.org.uk/

http://www.brokenthorn.com/Resources/OSDevIndex.html

http://viralpatel.net/taj/operating-system-tutorial.php

and here's a good wiki for reference

http://wiki.osdev.org/Expanded_Main_Page


I think the AMD Developer Guides & Manuals [1] are very good resources for x86_64 system programming (better than the Intel manuals). They are better explained and focus on the x86_64 arch.

[1]: http://developer.amd.com/resources/documentation-articles/de...


I still have a soft spot for the Intel manuals... probably because I managed to get print copies right before they stopped giving them out :)


There's definitely a void when it comes to full amd64 assembly (and associated low-level) code. It's a shame, because amd64 is actually rather pleasant (as opposed to x86) -- and now rather ubiquitous.

I haven't really looked at the code, but I guess the following at least contains a bit of example code:

http://www.returninfinity.com/baremetal.html

MenuetOS also looks rather interesting; unfortunately, as I recall, the license for the 64-bit version puts the code in a bit of a limbo as far as how useful it actually is. There's a fork, but not for amd64, AFAIK.

http://menuetos.net/

http://kolibrios.org/en/


My hobby OS is x86_64 only, straight C. It's GPLv3. I wrote it with the intention of writing a tutorial, partly because of this void, but still have not written the tutorial :)

https://github.com/nwg/nanos

x86_64 bootup / structures

https://github.com/nwg/nanos/blob/master/src/boot.asm

https://github.com/nwg/nanos/blob/master/src/kernel_init.asm


Building OSes is fun. I wish I could spend all day doing it but I've been pushed so far up the stack that I spend most of the time arguing with analysts and preparing documentation...

I've built two so far, which were (and I think still were until recently) used in production equipment: one Forth system (all 8k of it) that ran a PLC, and a tiny (16k) kernel for an M68k system that was a router for a Modbus-like protocol over RS485.

Nothing x86_64 but nevertheless the most fun I've ever had with a computer.


I've been a web developer my entire professional career, and for some ungodly reason I have this need to go work in the embedded sector instead. Am I nuts? ;)


Not at all!


Could you provide some pointers? I'm kinda in the same situation.


You really need an EE or CompSci degree these days. That's the first port of call.

I don't have either but it was a little different when I started. I managed to build a reputation up on the side and go from there. Eventually people stop asking about qualifications.

EE is probably more important than a comp sci education for embedded work. It's a fairly complicated field which deals with things far below the machine abstraction.

If anyone mentions the word Arduino though, stop listening there and find someone else. They're assumed to be like the real world but are faaaar from it.


I think that was pretty bad advice. If someone wants to start doing embedded, you tell them to go get a CompSci degree (i.e. study algorithms and logic)?

And disrespecting Arduino like that does not seem good either. Doing Arduino will actually teach you about both microcontrollers and electronics, and there are a lot of nice modules available. Of course it is not like writing your own OS, but it's a start, and you can always use the hardware and ignore the Arduino software.

The best way to get started in embedded is to begin on a project where you actually build something that you want to build. That will help you with requirements on the platform you choose, be it Arduino, Raspberry Pi, FPGA or whatever.


I disagree. CompSci is mainly userland.

Most EE courses these days have the majority of a comp sci course built in; MIT actually blurs the line between the two. Pure comp sci will not help you understand why your RS485 bus drops have noise, why your serial won't sync, or how to get stuff off an SPI bus by bit-banging. It's all analogue at the bottom of the stack as well, which is why embedded systems people still have scopes on their benches.

Comp sci only people shit a brick when you lug the scope onto the bench and crack out the probes.

The Arduino teaches you to do so many things wrong that it's almost dangerous and requires a lot of unlearning. Building an AVR programmer and using gcc is fine, but lose all the Arduino crap over the top and you gain JTAG debugging, single-stepping, more memory (!), full IRQ control, timers, etc. that aren't buried under a layer of crud. Oh, and an ISR that runs in under an hour as well, CMT. I could go on for a week.

For ref, you won't find a job writing software for or using an Arduino or a Pi either. FPGA yes, ARM SoC yes, AVR maybe, PIC maybe, x86/PC104 yes.


You misunderstood me completely. I was not saying one should study CompSci, that was you (even though you said EE was preferable).

I was merely saying that for someone wanting to _start_ doing embedded work saying "go get a ___ degree" is certainly not a very helpful answer.

I'm an EE doing both embedded HW (FPGA/ASIC) and SW for a living. I still have no problem saying to a beginner that they can play around with Arduino to get a feel for things.


There's some stuff about 64 bit over at http://xomb.org


It's not C/C++. C/C++ is not a language. The project's goal is to write a very simple UNIX-based operating system in C++ (does anyone else smell a contradiction?). The code is predominantly C++.


> write a very simple UNIX-based operating system in C++ (does anyone else smell a contradiction?)

I'm going to use this as an opportunity to champion C++ for systems programming despite the fact that it wasn't really your argument. I feel obligated to do this because I agreed with you for a long time on this but have changed my mind over the past year or two.

The argument against C++ as a systems programming language has always been that it gives you too much freedom to hide complexity, but I've written systems code in both C and C++ professionally and it's been my experience that the extra burden of having to write every damn thing over again in C (or use a relatively poorly tested container library, as compared to something like the STL) is not worth the "obvious performance characteristics."

I'd further contend that in non-trivial software all perf characteristics are non-trivial, so even though a + b might be doing all kinds of crazy things in C++, you're much more likely to run into problems invoking store_value(), which might interact with a number of underlying libraries and system calls. At the end of the day you're hopelessly screwed without profiling, and the perf utility works just as well on C++ programs as it does on C programs.

Finally, the majority of code in my experience is either slow path or at least not the bottleneck anyway. You can almost always optimize a systems program by making smarter system calls (or hardware invocations), while optimizing CPU is rarely worth it. Even if you're CPU bound, it's likely that it's in doing something like SSL, wherein you're almost definitely wrong to try to jump into a project like OpenSSL with micro-optimizations. With that in mind, why not allow yourself to move faster by creating a std::vector? When you use the proper conventions, it's generally as fast anyway.

Just because people can (and have, a lot) misused C++ and created way too much complexity with it doesn't mean that it's not the preferred language in the hands of someone who knows what they're doing. (And yes, it's much harder to learn how C++ works in a meaningful way to avoid these pitfalls, but if it's your job, learn it anyway).


You can have generalized containers in C, using the intrusive pattern.

For example, Linux's list.h has an intrusive list that is far better than the abysmal std::list from the STL.


Intrusive data structures were the thing I missed most from C, but boost::intrusive satisfies my desires when it's absolutely necessary.

In general, I'm anti-boost, but I think it's a personal bias and we use the hell out of it at work to great effect. The one thing I'll give boost::intrusive over sys/queue.h is that the type system helps you a lot more to catch issues and the common case is a bit simpler (a struct that exists in a single linked list and a single hash table, for example).

I've been burned many times by something like the following

  struct foo {
    TAILQ_ENTRY(foo) lru_entry;
    TAILQ_ENTRY(foo) hash_entry;
  };
  ...
  void some_magic_func(struct foo *f) {
    TAILQ_REMOVE(static_tailq, f, hash_entry); // oops, should be lru_entry!
  }
Boost intrusive templatizes on the member as well (IIRC) and eliminates this whole class of problem.


In C, you can use a bit of macro trickery to get the same safety, e.g. by defining a 0-sized array of the same type alongside the intrusive node. Then the macros that do the "cast down" from the intrusive node to the container element can also do a type comparison (using some trickery) between the anchor's array and the container type.

I've seen it implemented, but everyone uses the typical non-type-safe one anyway :)

I don't think I've ever had such bugs in a lot of code though, since I tend to wrap the "containerof" call with a little function like: foo_of_lru_entry and foo_of_hash_entry. A bit of boilerplate for each data structure, but worth it.


You might find http://www.locklessinc.com/articles/dlist/ interesting: it describes a (perhaps excessively) clever way to do intrusive lists that gives type checking.

(Unfortunately it does rely on `typeof`, an extension.)


Any C++ programmer knows not to use std::list anyway.


The rest of the data structures in the STL follow the same (anti-)pattern.

They all either take ownership of your data, or point at your data unidirectionally.

This means that, given a pointer to your data, you cannot, for example, delete it from the multiple data structures it is contained within without going from those structures' roots to re-find the pointers to your data.

Whereas with intrusive data structures (the way it is done in the Linux kernel and other "advanced" C projects), you can easily embed your structure in multiple data structures, such that you can do very quick deletion or modification of the data without re-finding it.


I think Boost has something like this.


How about: Any C or C++ programmer knows not to use linked lists anyway.


If you really think that, you've missed CS 101. Or maybe std::list is the only linked list implementation you've seen. In that case, I agree, one should never use std::list.

Linux's list.h is extremely useful, and for a wide variety of circumstances, is the most efficient way to manage your data.


Ok, I'll elaborate, especially since my view is at odds with your statement "for a wide variety of circumstances".

As I see it, the only use case where linked lists are superior to other types of lists, like, perhaps, ArrayList in java or vector/deque in C++, is if the following conditions are met:

1: you care about ordering - often you don't care about ordering and in that case, there is no need for a linked list because you can achieve O(1) insertion & removal then too: insertion can always be at the end, removal can be a "swap with end element, remove end element" operation.

2: inserting and/or removing from the middle of the list is a common operation; if it is not, then the added cost of doing so with a vector may still be outweighed by a vector's other advantages

3: you do not require random access - lists do not provide random access, and lookup is O(n). At the expense of additional implementation complexity and more memory overhead, you could, on the other hand, reduce lookup to O(log n) by using a skip-list

4: you do not iterate through the list often - if you do, you are likely going to blow the cache and mess up prefetching due to poor cache locality. Iterating through an array-based data structure can be much faster in this case.

I would say that for a list to make sense, you MUST have 1 and 2 and probably should have 3. 4 is optional, but if true, should make you consider if there might not be a more suitable data structure. In my own personal experience, this is rare. In fact, in my own personal experience, usually, code either does not require 1 or requires 1 but not 2 - either way, lists are not the appropriate data structure in those cases.

Basically, the short version is that they have very poor cache locality, an (IMHO) narrow use case where other data structures don't have superior performance and they take up more memory per node than a lot of other types of lists.

You linked to a stackoverflow question in another comment saying that they attribute std::list's flaws to linked lists as a whole. The biggest issue they seemed to mention, though, was cache locality - I fail to see how intrusive linked lists solve this. The only solution would be to preallocate nodes in consecutive memory locations, but you still take a hit as the links take up memory (whereas in array-based lists you do not need to store links for each node) and if you need to insert/delete in the middle (why are you using linked lists if this isn't the case?) then you end up jumping around the preallocated nodes anyway and after a while will lose any cache-friendliness you may have had.

Maybe you can elaborate what you meant?

To end with an appeal to authority ;-) I'll quote tptacek[1]:

C programmers are trained to use linked lists. They are the first variable-length containers most programmers come into contact with and so C programmers tend to be imprinted on them like ducklings. Linked lists are a poor general purpose container.

EDIT: I guess I missed an anti-condition: if you don't care about performance, then use whatever models your problem best, which may well be a linked list (though the same is true for std::list).

[1] https://news.ycombinator.com/item?id=4455676


To further prove the point about cache locality:

https://www.youtube.com/watch?v=0iWb_qi2-uI&t=44m50s


This is a strawman. This isn't a useful case for lists.

Since he needs to scan the list to find the position to add/remove to/from.

If he used an intrusive list, the removal from a vector-with-list would be faster in a list already for relatively small N's.

What I learn from this video is that Stroustrup is also misguided about linked lists. He thinks you always need to do a linear search to do useful operations. The whole point of lists is the O(1) operations, not the O(N) operations.

Indeed, std::list is the culprit here: "We shape our tools and then our tools shape us". std::list users become unaware that list operations are usable on elements without a linear search first.


Even O(1) operations on lists can break cache locality. Since you're chaining using pointers, all bets that your cells are nearby are off. With vectors, they always are.


When you need to memmove your whole vector you're going to thrash a lot more of the cache lines than when you touch 2 neighbor cache lines.


Assuming that your list elements are in neighboring cache lines, yes. (Also, how it compares to vectors in real-life performance, as opposed to time complexity, really depends on how often your vector has to grow for your workload; often a reasonable size can be chosen up front to minimize this.) But your point is of course valid.

Though I have seen vector-like structure implementations that combine arrays and linked lists, so that when you run out of space, a new array is allocated and linked to the end of the old one. I've implemented memory pooling systems like this myself. It works well, because as long as you know which bucket/array the element is in, you have random access, you also get good linear scanning performance, and extending the size doesn't require memmoves.

I'm not saying that linked lists have no uses (my original comment about "never" wasn't meant entirely seriously -- "never" is much too strong a word); they certainly do. But I am saying that I feel they have much more limited use cases than most people seem to think.

I agree that intrusive linked lists have many advantages that non-intrusive linked lists do not have. The main advantage of intrusive data structures is, as you said, that membership in multiple structures is in one place allowing constant time deletion from all structures.

Replying to your other comment:

> Deleting from the middle of the list is an extremely common requirement, when you have objects that need to be enumerable in some order[s] and sometimes deleted.

I don't know what work you do in C or C++ (in other languages, I happily use their lists data types without caring if they used linked lists under the hood, because chances are, if I care about absolute performance, I'd have been using C or C++ to begin with), but in my own work, requiring both order and common insertion/deletion in the middle has been rare - or the elements and lists were so small that linear scan and copy was fast. But I guess it depends on your work, your logic here certainly sounds reasonable to me.

> When you need indexing, use a vector. When you need quick add/remove, use lists. When you need both, use both.

Sure, you could use, say, a vector of elements that are themselves nodes in an intrusive linked list. Using a random access data structure to index another data structure is something I've done myself. I suppose I'm mostly arguing against the naive use of linked lists that seems to be rampant, because I think that most uses do not do all the interesting things you mention, in which case, IMHO, my points still stand and linked lists have a narrow use case.

> You don't need to jump around. Say you have a request object you found via a hash table. After handling it you decide to destroy it. You now want to delete it from various lists it is in. You can do it on all lists in O(1) for each list. This is relatively cache-local (only need 2 extra random accesses, potential misses, for each list).

See, this is the kind of thing that I was looking for when I asked you to elaborate. You're not really using the list as a convenient growable array replacement (as you say yourself, you don't think of them as containers), you're using them as a secondary structure to maintain order in cases where middle removal or insertion is important (or maybe the other data structures are the secondary ones.. whatever). I think I better understand what you meant all along now and your reasoning makes sense to me, I can definitely see the use of this and agree that the problems of linked lists basically boil down to the following: they're seen as containers.

I think in that case, everything I said is totally true. I was definitely somewhat wrong about intrusive lists in the cases where some combination of these are true: another structure is used when lookup is needed, middle insertion/deletion is common, copying is expensive, order is important, elements are members of multiple lists where membership is commonly modified together.

I regularly have my data objects within a hash table, a list, and a search tree simultaneously and efficiently, with no dynamic allocation to sustain any of that.

This sounds good - I don't think most people really do this though. The problem isn't just std::list (or std:: containers in general). Your original reply to me was "If you really think that, you've missed CS 101", but perhaps you missed CS 101, because in CS 101 they (in mine, and in any course notes and data structure books I've read) teach the common non-intrusive linked-list-container data structure, for which my points are completely valid. They're not teaching your use cases, or if they are, it's certainly not nearly as common.

EDIT: I wanted to add that all of this really depends on what you are storing to begin with. If you are only storing an integer or two, the overhead of the links will kill you and your cache efficiency. If, on the other hand, you are storing a lot in each node, then a few extra pointers isn't going to do much. Also, if you're only storing a few bytes (or a few dozen), copying isn't all that expensive, and then whether lists or vectors are faster depends on access patterns, like I said in my previous comment. A lot of, e.g., my Python lists are lists of numbers. In high-performance C++ code, a lot of my vectors simply contained ids, numbers, points (2d or 3d) or similarly simple data.


It's nice to be able to have productive discussion on the internet!

We can agree on almost all points, I think.

The CS 101 bit you mention is a good point: intrusiveness matters for asymptotics in CS, and should be taught in academia, not just in (very small portions of) industry.

By the way, when you mention "a vector whose elements are linked list heads", a much more typical scenario is an array/vector whose elements contain one or more list heads.

You were wondering about the kinds of software I was working on that needs this stuff: high-performance storage controllers. We need to maintain a lot of concurrency, handle a lot of kinds of failures, etc. We often require the same objects (e.g: an ongoing I/O request) to be looked up in various ways, associated with failure domains, timed out, etc. So we want it organized by many different data structures, and we need the O(1) of deleting it from the various structures when an I/O request dies.

We also shun dynamic allocations, aside from relatively rare memory pools for large structures that contain the various allocations we need. Intrusive style allows us so many nice things within this style:

* Avoiding the runtime costs of dynamic allocations

* Avoiding the memory costs of extra indirections incurred by dynamic allocations and STL (non-intrusive) style

* Avoiding handling out-of-memory errors in every single code path: there are almost no allocations anywhere, almost all functions become "void" error-free functions!

* Having optimal asymptotics for our operations

* Having reusable generic data structures in C without templates (we dislike C++)

Compared to all this, the STL stuff is just horrible. STL is widely considered to be a superb library, but I find it horrid.


> It's nice to be able to have productive discussion on the internet!

I don't mind being wrong if it means I learn from it :)


I disagree. Will reply later, too long to reply on my phone.


> 1: you care about ordering - often you don't care about ordering and in that case, there is no need for a linked list because you can achieve O(1) insertion & removal then too: insertion can always be at the end, removal can be a "swap with end element, remove end element" operation.

Whether you care about ordering or not, linked lists work great. Adding to an array is only amortized O(1). It is worst-case O(N). Why pay O(N) for the worst case when you can pay a cheap O(1) always?

> 2: inserting and/or removing from the middle of the list is a common operation, if it is not a common operation, then the added cost of doing so with a vector may still be outweighed by a vectors other advantages

Deleting from the middle of the list is an extremely common requirement, when you have objects that need to be enumerable in some order[s] and sometimes deleted.

> 3: you do not require random access - lists do not provide random access and lookup is O(n). At the expense of additional complexity in implementation and more memory overhead, you could reduce lookup to, OTOH, O(log n) by using a skip-list

Here is a false dichotomy dictated by the STL. With intrusive lists, you can have both your vector and lists of various orders. There is no contradiction.

When you need indexing, use a vector. When you need quick add/remove, use lists. When you need both, use both.

> 4: you do not iterate through the list often - if you do, you are likely going to blow the cache and mess up prefetching due to poor cache locality. Iterating through an array-based data structure can be much faster in this case.

Yes, if you want quick (repeated) enumeration, put it in a vector. Again, this doesn't contradict also putting it in lists.

> The biggest issue they seemed to mention, though, was cache locality - I fail to see how intrusive linked lists solve this.

Cache locality is based on your use pattern. When enumeration isn't your common operation, vectors don't have better locality than lists. For example, a sorted list of requests where you may want to time out the oldest one will only refer to the head and tail of the whole list. Cache locality is great.

> then you end up jumping around the preallocated nodes anyway and after a while will lose any cache-friendliness you may have had.

You don't need to jump around. Say you have a request object you found via a hash table. After handling it you decide to destroy it. You now want to delete it from various lists it is in. You can do it on all lists in O(1) for each list. This is relatively cache-local (only need 2 extra random accesses, potential misses, for each list).

> C programmers are trained to use linked lists. They are the first variable-length containers most programmers come into contact with and so C programmers tend to be imprinted on them like ducklings. Linked lists are a poor general purpose container.

I think the false premise here, espoused by the STL, is that a data structure is a "container" at all. Your data can be "contained" by anything (the stack, the heap, a vector, ...). The data structures involved (linked lists, hash tables, search trees) all do not contain the data. They organize the data.

When using vectors as containers STL-style, you cannot really have your data organized by multiple data structures efficiently.

I regularly have my data objects within a hash table, a list, and a search tree simultaneously and efficiently, with no dynamic allocation to sustain any of that.

STL cannot do this because of the data-structure as "container" philosophy.

EDIT: Also read my other comment detailing why std::list is lacking the most fundamental properties expected of a linked list: https://news.ycombinator.com/item?id=6830526


I've only recently started C++ development...why is std::list abysmal?


I found a stack overflow question that seems to go over it. Basically it looks like there's almost always a better choice for whatever you're doing.

http://stackoverflow.com/questions/18449038/is-there-ever-a-...


That answer is attributing the badness to linked lists, whereas it is fully std::list's. Linked lists are very useful, but it's hard to see that when their canonical (and almost only) implementation is std::list, which is indeed almost entirely useless.


I still don't understand why std::list is bad. Could you explain please?


std::list can be used in 2 ways:

* With an std::list::iterator in each of your data nodes that represents its own position in the list (this is called the "intrusive style")

* Without an std::list::iterator in each of your data nodes

If you use the (more common) latter form: whenever you have a reference to your own object, you cannot do any of the linked list operations without an O(N) penalty to go and re-find your element in the list!

i.e: Say you have a list of requests, and a timeout callback pops up with a pointer to your request, you cannot use a request pointer to do an O(1) deletion from the list. This kind of operation is what lists are for, and std::list canonically cannot do it.

Any other operation you might want to do relating to the list, given such a pointer, is likewise impossible (e.g. creating a new request immediately between the old request and its successor).

All this assumes your object is within just one list. If it is within 2 lists, you have to use an extra indirection: std::list<request *>, which makes the problem worse. Even if you do find your request via one of the lists, there is no way to get an iterator for the other list. That means you cannot do any of the list operations on the other list. Again: this is the canonical thing lists were designed for, and std::list cannot do it.

Say that to solve this, you use the intrusive style with std::list. i.e: for every list this object is a member of, you hold an iterator inside your object.

Now the onus is on you to maintain these iterators in addition to performing the usual list operations. i.e.: if you add an element, you need to both insert it into the list and update the stored iterator.

Additionally, instead of paying with 2 pointers for each node, as an ordinary doubly-linked list should cost, you have to pay with an extra pointer or two (depending on how the iterator is implemented)!

If you use multiple lists with std::list<request *>, you pay with yet another pointer!

So if your data structure is within 2 doubly-linked lists, instead of paying the ideal 4 pointers per item, you pay those 4 + 1(indirection) + (2 for two iterators). 75% memory overhead, ruining your cache lines.

The code will be a mess too, due to the duplicate maintenance.

The alternative is much simpler: http://www.cs.fsu.edu/~baker/devices/lxr/http/source/linux/i...

Use the intrusive approach exclusively. So that you don't need to use a list::add in addition to maintaining the two iterators. You only maintain the "iterators" (now called "list heads").

So your request would look like:

  struct request { struct list_head list1, list2; };
To add it to a list:

  list_add(&request->list1, &some_list_head);
Given a request, you can delete it in O(1) from both lists:

  list_del(&request->list1);
  list_del(&request->list2);
Easy to use and optimal memory use (exactly 2 pointers per list head).

Now, given some pointer from list2, to get back a request, you use:

  struct list_head *some_ptr = ...;
  struct request *req =
    containerof(some_ptr, struct request, list2);
This is slightly boilerplatey, so I tend to write a little wrapper:

  struct request *request_of_list2(struct list_head *ptr) {
    return containerof(ptr, struct request, list2);
  }
EDIT: I almost completely forgot that std::list also uses new to dynamically allocate its nodes. This effectively means the cost of adding to a list is many times greater than that of the simple list_add function. Even if you supply your own allocator, this is unnecessarily expensive, and by default it means that insertion, like removal, is not O(1) the way it ought to be.


> whenever you have a reference to your own object, you cannot do any of the linked list operations without an O(N) penalty to go and re-find your element in the list!

Can you show an example of when you actually need to do this? Because when I need to do something like this it usually means that some container other than list is more fitting for the problem.


I gave an example: a request that is in multiple linked lists, e.g. one in chronological order for quickly timing out the oldest requests, and one of active requests waiting on the physical wire.

Now the timeout elapsed, so you have a pointer to a request that needs to be destroyed.

In that case, you typically use the very cheap O(1) list_del on each of the lists it's in. In STL style you pay O(N) for each list it is in. Or you conclude lists are worthless and another structure should be used. But no other structure would give you the incredibly cheap O(1) add and delete you get from lists.


sorry, my bad for not reading all the way through


Indeed, coming from Prolog/Erlang-style "most everything can be represented as a tail-call with a linked-list accumulator" programming, I'm very confused about what operations the GP is talking about. Adding/removing nodes at a position other than the head? Lookup by value? If you need these, you should be using a different data structure.


Removing from any location is O(1), as is adding to any location you have a link to.

With a real linked list at least, not with std::list.


Thanks for your excellent and detailed response! This makes complete sense.


A more complete set can be pulled in with libcontainer: http://agottem.com/libcontainer


Hm, it doesn't claim it's one language. And it has indeed C code and C++ code...


I used to put C/C++ on my CV until I realized that my C++ code was getting a lot less "C-like" each year. Now I list them separately, which also allows for a nice explanation whenever I'm asked about it in a job interview.


I thought it was a reference to this* when I read the title, but after reading them both I'm not sure there is any connection.

*http://i.imgur.com/fQ1ST8w.jpg


please tell me it was some kind of joke


Possibly but I think it's real. I believe it's the same guy from this: http://i.imgur.com/evPQL2S.jpg


Found that guy's facebook. It does seem real, I mean the person seems real, but I still really hope they're doing this as a joke or something.


You are correct. Usually I see this in CVs, where it usually means that the person is proficient in C++ but knows a little C as well. But C/C++ can be a language: code written in C++ that looks more like C than C++. Some people, e.g., use character arrays instead of the string class, avoid the STL as much as possible, don't use the object-oriented aspects of the language, etc. The end result uses some C++ libraries and compiles on a C++ compiler but resembles C more than anything. I think it would be appropriate to call that C/C++.


I mostly agree with your observation regarding CVs and what seeing this in one tells about the person who wrote it. Although I like to think that most proficient C++ programmers respect the differences between the two languages and prefer to call C++ what it is, and mention C separately if it needs to be mentioned.

I don't think the style of your code should change the name of the language you claim to use. If it doesn't pass through a C compiler, it's not C or "C/anything".

Now it is quite possible to use the common subset of C and C++ to write code that conforms to both language specifications and compiles with a compiler for either language. If a project deliberately does this, I don't have an issue with calling it C/C++ to highlight the fact. This really is quite rare from what I've seen, even if I can name some projects that do write such code (the Opus audio codec is one example).

C/C++ still isn't a language though, so I would very much prefer to call it C in a case like this.

A different scenario entirely would be a project with parts clearly written in two (or more) different languages, which could happen to be C and C++...


I was looking at an analysis of id's Doom 3: BFG engine yesterday, and it's best described as this. The author described it as "C, with classes". Quite interesting, so I went and asked some game devs I know; apparently this is pretty standard!


A hobby operating system project - everyone's gotta have one, right?

There are many like it but this one is mine: https://github.com/rikusalminen/danjeros

It's an x86_64 bare bones kernel project, doesn't do much except for boot and handle some interrupts. There's a little multitasking and some related stuff.

Unfortunately there's only a finite amount of time in the world, and not too much of it is available for me to dedicate to this project.


Who remembers Nachos?

https://en.wikipedia.org/wiki/Not_Another_Completely_Heurist...

3rd year OS class we had to use Nachos as our base OS and build your usual file management, pipes etc. functionality on top of it.

Really taught you how an OS works.


Our school used Pintos, a derivative of Nachos. Taking the OS course there was an eye-opener for a student as technically immature and inexperienced as I was at that time, but it was also a tremendous learning experience. I've always wished I could go back and retake that course, the 'right way'...with these resources, though, maybe there's a better way.


FYI, at least at first, this looks like it will be very similar to a few wonderful tutorials and projects I have been going through lately:

* http://www.jamesmolloy.co.uk/tutorial_html/index.html

* http://www.osdever.net/tutorials/view/brans-kernel-developme...

* https://github.com/klange/toaruos

...and just in case anyone is curious, my own project (about half way through James M's tutorial - working on virtual memory management):

https://github.com/slunk/Hobbyos

Oh, and btw, this looks AWESOME. I am looking through the finished chapters right now!


Might also be worth looking at xv6, which is a rewrite of Unix v6 in modern C, and comes with an exegesis modelled on the classic Lions book:

http://pdos.csail.mit.edu/6.828/2012/xv6.html

(Although if you want to run the original Unix v6, written in a now-archaic proto-C --- unsigned integer variables are declared as pointers because "unsigned" itself wasn't directly supported in the language yet --- PDP-11 emulators and system images are readily available.)


I just wish more people dealt with ARM instead of x86. It's a far better architecture for people to learn system design.


Why?

(I'm curious as I know little of these kinds of things, and because citing sources or giving arguments is always a good thing)


For example, look at all of the legacy stuff in an x86 system. You have the strange structure of the interrupt descriptor table/global descriptor table. You have the good old Intel 8042 keyboard controller, Intel 8259 programmable interrupt controller, old-style programmable interval timer, A20 gate and so much more.

There's so much legacy baggage inherent in the x86 architecture, all because of software compatibility. (ROM-BASIC will still run on a modern PC!)

ARM, on the other hand, has less to worry about. Sure, there's some legacy infrastructure, such as the ARM vector table, but modern ARM architectures such as ARMv8-A remove old baggage in favor of a complete renovation. (Look at the old coprocessor interface, for instance: in A64 mode it is completely gone, and you must use 'mrs/msr' instructions instead of 'mrc/mcr'.) There's also far less legacy software to worry about.


I've been looking for good OS dev tutorials for ARM. Does anyone know of any besides Baking Pi?


My research about 12 months ago didn't turn up much, sadly.


For those who are interested in developing a realistically useful small OS in C++ (or even C++11), you may want to check out OSv https://github.com/cloudius-systems/osv/ developed by former KVM developers.


Been reading Massalin's Thesis on Synthesis OS (partially evaluated, dynamically generated kernel code), it's a refreshing and very inspiring read:

http://valerieaurora.org/synthesis/SynthesisOS/


Maybe a little irrationally, the idea of programming an OS in C++ strikes me as very opaque. I think the VPRI[1] approach of layering DSLs, or the Smalltalk idea of a relatively simple VM, seems more understandable than an OS that embeds a C++ runtime...

As a side note, when looking up [1] I also ran across [2].

[1] http://piumarta.com/software/cola/ [2] http://www.acm.uiuc.edu/sigops/roll_your_own/1.helloworld.ht...


BeOS was primarily written in C++, although the kernel was predominantly C. For the parts that require user interaction, C++ and its object model make a whole lot of sense, although there are probably more mature languages out there that would be better candidates now.


I loved BeOS and the C++ application framework... except for the linking model.

You had to provide re-linked executables for every new version of BeOS that came out. It was exceedingly lame to go to an application download page and have to select the version of the application that corresponded to the version of BeOS you installed.

Then when you upgraded your OS, you had to upgrade ALL of your applications. Sure, there were some framework upgrades that didn't necessitate a re-linked application, but as a user you never really knew which those were so had to do a full upgrade of everything to be sure.


Agreed. Although I don't understand why they didn't provide "shim" .so files. I think this is what Linux does, though in practice these days I just use Ubuntu packages... And at the time, to do the same thing on Linux or FreeBSD (which is what I migrated to when I left BeOS), you had to (or were encouraged to) recompile everything from source.


Haiku (BeOS' descendant) is written in C++ as well


How about making a language, compiler, OS and applications and not use C or C++ ?

https://news.ycombinator.com/item?id=6829464


You forgot to add in your list: making a completely new processor ;) Which the linked project actually achieves.


There's that too. Although I'd like to have the time to port Oberon to MIPS, maybe the PIC32 since the more powerful, modern MIPS chips are out of budget. Haven't done the analysis, but I think it would be a good match.



This is great. I am just jumping into open source from an academic background. I had assumed that open source was great for directly practical work like website creation and hosting (back-end, client side, etc.) but weaker as a way to learn about OSes and compilers. This post and its comments (with linked resources) are making me question that assumption. What do you think?


Here [1] you can find another example of writing an OS from scratch. Going through the chapters, progress is made iteratively, with limited changes in each step. This OS is intended to be an (RT)OS for embedded computers, but the example architecture layer is implemented for x86.

[1] https://github.com/l30nard0/Benu


Obviously this is still in its very early stages, but I can see this becoming a very informative and practical course.

I'll definitely be keeping an eye on this. Do you intend on turning this into a book and perhaps publishing physical copies?


Yes, it's very early stages; I'm looking for feedback on how to structure the course.

No, I don't think so; I just want to create an informative resource from what I've learned.


In case you haven't heard of it, you might want to take a look at LeanPub, which specializes in books published while they are being written, has a particular focus on software development books, and accepts manuscripts written in Markdown stored in a Dropbox folder. You can then take the finished LeanPub book to a print-on-demand company like Lulu or CreateSpace to create printed books as well:

  https://leanpub.com/authors
I actually assumed you were making a LeanPub book, and was considering buying the work in progress.


Thank you for building this. As weird as it sounds saying this out loud, you've made the world slightly better.


"It was written several years ago as one of my first projects when I was in High School" High school? Really? Are there any proper highschools like these? I am jealous now.


Very interesting. A friend and I tried creating our own OS back in 2002 using interrupts to write the boot program. I wish I had this tutorial back then.



