Single_file_libs: List of single-file C/C++ libraries (github.com/nothings)
115 points by pmoriarty on Oct 26, 2020 | 109 comments



Am I the only one to see this single-file lib trend as a consequence of the failure of the C++ ecosystem to produce a modern and unified package management platform?

Managing dependencies in C++ is a nightmare, and I'm not even talking about the lack of actual modules inside the language itself. It's so bad that people (incl. me) tend to prefer to copy-paste library source code instead of setting up a dependency.

A Python, Ruby, Node or Java project would never have to head toward such a poor solution instead of using their respective package managers. But in C++ we do because the alternative (e.g. Conan) is such a nightmare to use.


Exactly, and what's especially elegant about single-file-libs is that they solve a real-world problem without requiring any new tooling, and you don't need to convince anybody to use your favourite build system or dependency manager (which is the main problem - you need to have a single standard solution right from the start, any time later is too late).

PS: the original motivation for the STB single-file libs was actually that Windows doesn't have a default place where dependencies are installed:

https://github.com/nothings/stb#why-single-file-headers


> PS: the original motivation for the STB single-file libs was actually that Windows doesn't have a default place where dependencies are installed

I've had so many issues with shared libs on linux in the past, I gave up on it and started to include all the necessary source of any third party library in my projects. Makes building and distributing things much easier.


In other language build environments it's called vendoring, and Ruby, Go, and Rust offer tooling to do this for its various benefits (hermetic builds, dealing with upstream infrastructure being down, not molesting third-party source control, not leaking information about your dependencies to third parties).


It helps with the problem, but doesn't solve it.

Just because you have five header files instead of five packages doesn't mean you can use all five headers at the same time. For example, some might require C++03 and some might need C++14 or newer.

Also, if all dependencies are single files with no non-standard dependencies, that means none of them can share dependencies they arguably should, like threading, telemetry, or logging libraries.


...shared dependencies are more often a problem than a solution, even when a standard package manager exists (see the deep dependency trees in many Javascript projects where nobody really knows what's actually going on and what code ends up running).


Do you have examples of this or is it just a guess?

Anything in its own compilation unit would be isolated. Anything written using C++03 should compile without too much trouble in a more modern compiler.

Also threading is part of the C++ standard library.

I don't know what you mean by sharing dependencies on telemetry and logging. That sounds like adding questionable complexity to libraries that are simple and modular.


> Anything in its own compilation unit would be isolated

In isolation, yes. But once compilation units interact you run into things like gcc changing std::string between versions (from a refcounted implementation to one with small buffer optimisation for C++11 compliance). The introduction of move semantics and rvalue references can also change the interpretation of the same class declaration between compilers in different modes.


I have never heard of people upgrading their compiler and/or standard libraries, but expecting to not have to recompile their compilation units.

You seem to be talking about making major changes to the fundamental setup of a project while not wanting to take the comparably trivial amount of time to recompile the compilation units.

How often are you changing major compiler versions and breaking standard library versions that you would have this expectation?


> I have never heard of people upgrading their compiler and/or standard libraries, but expecting to not have to recompile their compilation units.

You've never linked to a binary? I have no idea what debian's Qt was built with, or what libssl or SDL etc were built with, but we can link to them all the same.

> How often are you changing major compiler versions and breaking standard library versions that you would have this expectation?

You don't configure Jenkins to build against multiple compilers/versions of compiler? Isn't that super normal?


You are conflating the C ABI with the C++ ABI. The C ABI is stable, so linking to SDL works fine. The C++ ABI is actually relatively stable as far as I can tell, but ABI breaking changes are announced when they are released by compiler vendors.


I am intentionally conflating the C ABI with the C++ ABI, because the C++ ABI is only relatively stable. So if you want to write software that doesn't require you to rebuild everything each time you update a compiler, you push your code to use more C ABIs rather than more advanced C++isms. This is exactly what SDL has done: the headers are C compatible, but it has plenty of C++ in the direct3d parts.


This was originally about grouping single file header libraries into compilation units that wouldn't need to be changed often. I'm not sure what your point here is, since most would have C functions for their interface or use templates. Even if you were using a C++ ABI it would be trivial to recompile. Qt is the furthest thing from a single file library.


I find single-file libs superior to typical package managers. What is more ergonomic than: download a file, drop it into your project, include it and start coding? And I can use my favorite build system/way of setting up projects, no need to integrate anything.

Can trivially support multiple incompatible versions of the library, nothing needs to be fetched or resolved (once you've got the single file of course), can send and distribute it easily over any channel you want (web, email, free file host, google drive, your own website, etc).


There are lots of problems with single-file libraries. Here are a few but I'm sure there are others I've forgotten or haven't thought of.

* You have to wait for compilation of the whole library at least once every time you do a build of your program - certainly every time you do a clean build, and potentially even incremental builds if it's header only.

* If the library is header only (many of the linked libraries are) then you potentially have to pay that compilation cost more than once per compilation of your program - once for every one of your source files that includes it.

* Again this is specific to header-only libraries, but to avoid code bloat you'll need to turn on link-time optimisation which is far slower than just allowing the linker to do its job by only compiling definitions into a single object file. (Admittedly LTO is a good idea anyway, but adding a bunch of duplicated symbols is avoidable extra work for it.)

* Some useful libraries are realistically just too big for their authors to write the whole thing in one file (e.g. protobuf, opencv, ... in fact most libraries I use on a regular basis seem to fall into that category). They could "release" the library in single-file format, similar to SQLite's amalgamation, but then if there are any problems (either a bug in their code or something in your code that makes you want to look at their code) you're now not looking at the original source but some mangled version of it.

* If the library is so large that its interface needs to be split over multiple headers (think Boost or OpenCV) then you're now bang out of luck. Hopefully the library has cleanly-enough separated modules you could potentially release these separately (e.g. OpenCV core, imgproc, imgcodecs, highgui, ...) but then you're essentially back to multi-file libraries.

* Adding a library with a lot of its own transitive dependencies takes effort proportional to the number of those dependencies, rather than being handled automatically.

One interesting thing about all of these problems is that they get worse and worse as you need more libraries in your program, or need a larger library for your program. In contrast, using a package manager (I'm thinking particularly vcpkg here) tends to add a one-time cost at the start but allows you to scale your dependency list almost for free.

If you're writing your own library, rather than an application, then there are even more problems with this approach, but I won't open that can of worms here.


The owner of the repo writes many single-file libraries in C. Apparently you are not familiar with his coding style; otherwise you wouldn't have these complaints on linking. For his (and many others') libraries, you "instantiate" the implementation in a .c file and declare functions in other source files. This way you only compile the library implementation once.

In addition, although the name of the repo is "single_file_libs", it links to many double-file libraries consisting of a pair of .c and .h files. These libraries won't have the issues you are talking about. Developers are well aware of the potential linking problems.

That said, you are right that single/double-file libraries tend to be small. It is hard to work with a file with >10,000 LOCs anyway.


> The owner of the repo writes many single-file libraries in C. Apparently you are not familiar with his coding style; otherwise you wouldn't have these complaints on linking.

I never claimed to be writing only about his libraries, but instead about single-file libraries in general. Sorry if I wasn't clear about that.

> For his (and many others') libraries, you "instantiate" the implementation in a .c file and declare functions in other source files. This way you only compile the library implementation once.

I tried to make clear which of the bullet points only applied to header-only libraries (again, apologies if it wasn't clear). For example, my objection to compiling the same library multiple times applied only to header-only libraries. As you say, with two-file (or single .c file, if you like) libraries you only have to compile the library once per build of your program - but as I made clear in a separate point, that is still potentially a pain if you do a clean build (though it could be mitigated by having a separate CMake target, or whatever, that contains all your library files).

> In addition, although the name of the repo is "single_file_libs", it links to many double-file libraries consisting of a pair of .c and .h files. These libraries won't have the issues you are talking about. Developers are well aware of the potential linking problems.

As I just said, I had tried to distinguish those cases. They're less problematic, but still somewhat problematic.

> That said, you are right that single/double-file libraries tend to be small. It is hard to work with a file with >10,000 LOCs anyway.

Yes, and if you only need a small number of small libraries then all my objections are a lot less severe. To spin the conclusion on its head - you can avoid going down the package manager route if you avoid having any large dependencies. For some projects that's acceptable. Personally, I find the pain of working without basic largish libraries like protobuf to be enormous compared to the relatively simple process of using a package manager.


By "instantiate", I mean to insert the actual implementation in one .c/.cpp file you write. This is a typical strategy used by single header libraries. It can be done in two ways in a .c file: a) call a huge macro after #include to insert the actual code and/or b) define a macro like "#define LIBRARY_IMPL" prior to #include to let the header insert the code. Single-header libraries don't necessarily have the problems you are describing if implemented properly.

The owner of this repo is a well known figure in game development. His libraries are somewhat widely used and have probably inspired quite a few single-header libraries which adopt similar strategies. My comments are not specific to his libraries, either.


> By "instantiate", I mean to insert the actual implementation in one .c/.cpp file you write

I know. In my original comment I tried to make it clear that some (but not all) of my points applied to these libraries that have some separate compilation. In my second comment I apologised for not making that clear and tried to stress it even harder. I'm not sure what else I can do to make it clearer.

> The owner of this repo ... His libraries

If you just use this guy's libraries, and that works out for you, then all power to you.

But I'm worried about C++ beginners stumbling across this page on Hacker News and thinking, hmm, using C++ libraries is so hard that I need to restrict myself to libraries like these. That is simply a myth - using vcpkg is really not hard at all, and well worth it for all the other libraries it gives you easy access to. Again, if that doesn't matter to you then that's fine, but I want readers to know that the option is available.

By the way, the vast majority of libraries linked to from this article are not by the author and do not have the separate compilation model you talked about - they're simply header only with inline functions.


Compilation time realistically is not a big problem. Managing compilation units by grouping up what changes infrequently actually makes compilation faster in general and much faster incrementally.

Even visual studio's compiler compiles sqlite's 6MB in under a second.

> Again this is specific to header-only libraries, but to avoid code bloat you'll need to turn on link-time optimisation

This is nonsense. This list seems like you are trying to invent problems that aren't there. The vast majority of the time you decide what compilation unit to put the definitions into and that's it.


> Compilation time realistically is not a big problem. ... sqlite's 6MB in under a second

SQLite is notably pure C, which compiles orders of magnitude faster than C++, or at least certain types of C++.

I've worked on a project - not particularly large - where using precompiled headers reduced compilation time from something like an hour and a half to more like 15 minutes. Admittedly that's a very old crusty laptop, but that's without building the libraries, which are compiled separately! Even on much faster modern machines, building the dependencies is at least an hour, maybe more.

I wonder if our opinions are so strongly at odds because we're simply working in totally different situations to start with. As I hope I've illustrated, compilation time definitely IS an issue for us, but obviously it's not for the projects you're working on - which must surely be C-based or at least C-like C++ code. I would wager more people are in my situation, but to some extent that's irrelevant. The advice to other devs has to be: if you're happy to restrict yourself to small C or C-like libraries then single-file libraries are fine, but if you want to take advantage of the full C++ ecosystem, be aware that there are package managers (again, I'm mainly thinking vcpkg) you can use with only a little initial effort. I think it's dangerous for beginners to see articles like the one linked to here in case it makes them think that using C++ has to be like that.

> > Again this is specific to header-only libraries, but to avoid code bloat you'll need to turn on link-time optimisation

> This is nonsense. ... The vast majority of the time you decide what compilation unit to put the definitions into and that's it.

I said that this particular objection only applies to header-only libraries, and that qualification is even in the bit you quoted. I never claimed that there are no single-file libraries that allow separate compilation.

But I do dispute that the "vast majority" allow or require separate compilation (either as a separately supplied .c/.cpp file, or by #defining something before one of the uses of the header). That is the exact opposite of my experience. As an experiment I looked at all the libraries listed under "argv" in the linked article, and 9 of them were header only with no option for separate compilation (Argh!, Clara, CLI11, cmdline, flags, kgflags, linkom, optionparser, ProgramOptions.hxx) while only 1 (parg) allowed separate compilation.


> compilation time definitely IS an issue for us,

But is it due to 'single file libraries'? What I've seen is that having fewer, fatter compilation units speeds up compilation the same way 'unity builds' do. Instead of having a single monolithic compilation unit, a balance can be used to iterate faster and use multiple cores. Compilation units can be groups of things that are changed more or less frequently.

> But I do dispute that the "vast majority" allow or require separate compilation

It isn't about the library always being made for separate compilation. The reality is that unless you are using something fairly fundamental it will probably only need to be in one compilation unit in the first place.

What does need to be shared are data structures and those ideally have as little logic and dependencies as possible.


> > compilation time definitely IS an issue for us,

> But is it due to 'single file libraries'?

That's a fair point, they're generally not. I just meant that if those libraries were single file, but otherwise unchanged, then the problem would only be worse. I see what you're saying about unity builds, but if you build your dependencies totally separately from the rest of your program then it's not so important exactly how fast they build. I'm imagining a situation where you leave vcpkg to build all your libraries overnight, then you spend the next few weeks iteratively working on your application code (which is how it usually works for us).

> The reality is that unless you are using something fairly fundamental it will probably only need to be in one compilation unit in the first place.

In the projects I work on, which are really quite varied, many (but not all) libraries are used throughout the codebase as vocabulary types. Again, I think this just comes down to very different codebases and uses of libraries for us.


> libraries are used throughout the codebase as vocabulary types. Again, I think this just comes down to very different codebases and uses of libraries for us.

I don't know what 'vocabulary types' are, but I think this is an oversimplification. My guess is that most big programs end up like yours but my point is that it doesn't have to be this way. Single file libraries are not the problem and can actually help, although the real issue is deeper.

I think the problem is actually dependencies, and many times single-file libraries go to great lengths to not depend on anything else. To keep it as simple as possible: I think classes / data structures get overused, and instead of just making an interface to some data, people stuff everything they are doing into some class they have.

Transformations from one type to another put into data structures means that the class now has dependencies on those other types. If those types have transformations then they have dependencies and so on. The extrapolation is that everything depends on everything else and even compilation units that should be tiny end up pulling in large parts of the entire program as source code. Lots of compilation units can then mean compiling huge chunks of the source hundreds of times.


>What is more ergonomic than - download file, drop into your project, include and start coding, and I can use my favorite build system/way of setting up projects, no need to integrate anything.

This approach lacks any of the features a reasonable dependency manager provides, such as updates compliant with semver.


>Am I the only one to see this single-file lib trend as a consequence of the failure of the C++ ecosystem to produce a modern and unified package management platform?

One counterpoint to your proposed cause & effect is that some observers think Javascript NPM's packaging convenience enables the explosion of single file dependencies. Example discussion:

https://news.ycombinator.com/item?id=11348798


What’s happening in the JavaScript ecosystem is dependencies are so easy that you’re getting a lot of small libraries, and they’re one file because they’re small.

What’s happening in C++ is that you’re getting one-file libraries that are not small, because developers will shoehorn their library into a single file rather than deal with a way of managing C++ library dependencies.

You might find in C++, for example, a header file with over 10k lines in it and a bunch of preprocessor conditionals. What you might find in JavaScript is a library that contains one or two functions.


You know what's funny? I used to think this too, but I just started a new project with vcpkg and it works great. I understand that Conan is supposed to be good as well, though slightly less user-friendly. I can just define all of my dependencies in a json manifest, set my cmake toolchain, and run `find_package`, `include_directories`, and `target_link_libraries` as normal. It's great.


I see it more as a trend for the YouTube generation, unwilling to learn how compiled languages work.

Maybe we should make a couple of YouTube videos about linkers and stuff.


This is just a natural consequence of programming becoming more accessible, I think. And let's not pretend that C and C++ package management is "good but misunderstood"—it needs a lot of work.


Yeah, work like reading a book.


You can understand how C/C++ builds and dependencies work by reading a book, yes. But you can still criticize it—there's lots of stuff there to criticize! Modern languages have shown that dependency management and builds can be easier, faster, more secure, and more portable.


Which is why Conan and vcpkg do exist, and, for quite a long time now, OS package managers and installers.


This does not have anything to do with knowledge. C/C++ build systems are archaic if not broken.

I literally wrote a compiler, but I still spend 15-20% of my development time trying to find the right not-user-friendly CMake syntax, understanding why my Conan dependency broke overnight, or writing header files that could be automatically generated by the compiler (à la GHC).


To me that experience sums it up pretty well: apparently you were still learning how to use cmake.

As for the headers, C++20 modules, or just use an IDE like Visual Studio and the respective wizards.


This is a bit of a misunderstanding of the C++/C way of doing things. It makes sense in some ways but clearly there are better ways of doing it.

Which is precisely why C++20 has modules support.

Further, C++ does have a central package manager - whatever the system distributes. C/C++ programs constantly link against the operating system and thus have to get those system headers from somewhere. This means a universal package manager (like Conan) requires buy-in from OS vendors or at least someone who can manage those packages with conviction, else the package manager doesn't make a lot of sense to use.

This is in contrast to Node, Ruby and Python, as they are inherently cross-platform in nature and are not coupled tightly to the OS.

This is why C/C++, historically, have not had a strong centralized package manager.

These days, I see less and less package management being used anyway. Most C/C++ projects I work with either vendor in dependencies or use git submodules (the latter of which I much prefer).

This is because, unlike e.g. the Node community, micro-dependencies are generally not worth the effort to bring in.

The usual exception to this sort of structure are codebases that will be distributed via a package manager - in which case, they generally rely on the package manager supplying correct versions of the dependencies.

These header-only libs usually work OK because they can be included with trivial linkage (something else you normally don't need to care about in scripting environments). Linkage is something you have to care about regardless of whether you're writing C, C++ or machine code, and since these languages give you pretty much free rein over all of these parameters, it's hard to generalize everything into a nice package like scripting languages can.

Usually the functions inside header only libraries will be static inline, and will almost certainly increase compilation time (noticeably so if you use the header in many places).

There is work being done to improve this situation, it's not like we're all just sitting here going "yes, we love the way things are and have no idea how to make things better." That's far from the case. Shit just takes time.


vcpkg is a decent enough package manager to make using it worthwhile, despite the occasional problem.

Certainly, if anyone finds that things have got desperate enough that they're looking at the OP's list of single-file libraries, then they should just take the one-time hit of setting their project up to use vcpkg. Once that's done, using a supported library reduces to running `vcpkg install foo`, regardless of how many source files it has (or how many transitive dependencies, for that matter).


You are not the only one to see this single-file lib trend as a consequence of the failure of the C++ ecosystem to produce a modern and unified package management platform because the single-file lib trend is indeed a consequence of the failure of the C++ ecosystem to produce a modern and unified package management platform.


There is correlation there, yes! But sometimes you just want something simple and easy, and package managers come with so much extra stuff that is very useful in more complex projects. Those things can be a major annoyance when you just want something simple.


> Am I the only one to see this single-file lib trend as a consequence of the failure of the C++ ecosystem to produce a modern and unified package management platform?

No, I think there is no other way to see it.


> Am I the only one to see this single-file lib trend as a consequence of the failure of the C++ ecosystem to produce a modern and unified package management platform?

Single files are a modern and unified package management platform. You copy the file and you use it. It does not get any simpler than that.

You are talking as if there was a kind of compromise between some imaginary disadvantages of single files. There are none, and there is no compromise. Single files are alright.


I honestly don't understand this trend of single-file libs. Why not provide .c and .h files? If I already have a project with several .c files then surely I can add one more to whatever build system I have? I would rather I didn't have to add questionable #ifdefs each time I add a new file.


It's easier for simple command line tools which just consist of a single source file (or generally projects so simple that they don't need a build system), and at the same time it doesn't add any overhead for integration into more complex projects compared to a .h/.c pair. In general, more problems are moved from the build system into code (mainly configuration: you can configure the implementation via defines in one place in the source code before including the implementation, instead of passing them in via command-line arguments (meaning more complexity in the build system) or a separate config.h header).

I tend to think of single-file libs as a "poor man's module system". The entire "module" is in a single file that you drop into your project, and you're ready to go (and ironically it's a lot more straightforward than the C++ module system).

The real-world differences between single-file and header/source pair aren't such a big issue as many people make it out to be though, either way is fine. The actual problem is libraries that consist of dozens or hundreds of source files and a complex build system setup.


It is form over function.

This works for simpler libraries and templates, obviously, but once the function becomes less trivial, it should really be going into a separate source file.

Readability and manageability aspects aside, keeping implementation in a header causes all dependencies of that implementation to be pulled into your code as well. Including the heavy platform-specific stuff. If I need to do some string conversion, I really don't care for the implementation sneaking in <windows.h> into my sources just so that it can call WideCharToMultiByte.


stb actually went to extreme lengths in his SDL/GLFW-like library, where he didn't even depend on windows.h: he put just the structs and function (pointers) he needed into the header (under stbxxx_ prefixes), hardcoded magic numbers instead of windows.h's #defines, and bootstrapped with __declspec(dllimport) GetProcAddress (and even that symbol can be omitted with some trickery; there is a way to get to kernel32.dll functions from any .exe).
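
The general trick looks roughly like this - a sketch of the technique with a made-up prefix, not stb's actual code, written with plain C types instead of the windows.h typedefs:

    #include <stddef.h>  /* for wchar_t */
    /* hardcoded value of windows.h's MB_PRECOMPOSED */
    #define STBXXX_MB_PRECOMPOSED 0x00000001
    /* declare the one Win32 function you need yourself instead of pulling in <windows.h> */
    __declspec(dllimport) int __stdcall MultiByteToWideChar(
        unsigned int code_page, unsigned long flags,
        const char *src, int src_len, wchar_t *dst, int dst_len);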


Sometimes it's not quite so bad, because the single file will contain both implementation and header, with an #ifdef to switch between them, so you're supposed to make your own source file which uses a #define to include just the implementation part.


Single-file-libs done right have the implementation in a separate #ifdef/#endif block. They don't pollute your code any differently than a regular .h/.c setup would (at least outside that one source file which includes the implementation).
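
Roughly like this (made-up names, not any particular library):

    /* mylib.h */
    #ifndef MYLIB_H
    #define MYLIB_H
    int mylib_add(int a, int b);   /* public API declarations only */
    #endif

    #ifdef MYLIB_IMPLEMENTATION
    /* implementation + private helpers, compiled only in the one
       translation unit that defines MYLIB_IMPLEMENTATION */
    int mylib_add(int a, int b) { return a + b; }
    #endif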


> They don't pollute your code any differently than a regular .h/.c setup would.

For that to happen you'd need to have a separate .c to include just that .h with IMPLEMENTATION defined. So in the end there's still an .h and a .c, except the .c is now completely superficial.


Yeah I just added that while you wrote your comment, sorry.

You can still put multiple implementations into the same source file and compile that into a library, for example:

https://github.com/floooh/sokol-samples/blob/master/libs/sok...

That's what I usually do for projects that don't just consist of a single source file (e.g. simple command line tools).

An actual advantage of this approach is that you can add any configuration defines in that same source file, instead of passing them in from the build system via compiler command line args.

Also for bigger projects you can have dozens or hundreds of header files but only a small number of source files (e.g. one source file per "system", or split by change frequency, or by any other criteria, like whether the implementations include Windows.h or other system headers). A small number of compilation units means fast compile times (since it's essentially a unity build), but at the same time you have enough control over the project's file structure to balance compile times vs "namespace pollution" vs what changes trigger a rebuild.
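
In sketch form (made-up library and macro names), such a source file is just a list of implementation defines and includes, plus any configuration defines:

    /* libs.c -- one compilation unit for all single-file dependencies */
    #define FOO_MAX_HANDLES 1024      /* config defines live here too */
    #define FOO_IMPLEMENTATION
    #include "foo.h"
    #define BAR_IMPLEMENTATION
    #include "bar.h"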


You can avoid the .c file by building the .h and passing the compiler a -DIMPLEMENTATION option.
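
For instance, something along these lines should work with gcc/clang (made-up file/macro names; -x c forces the header to be compiled as a C source file rather than a precompiled header):

    gcc -c -x c -DMYLIB_IMPLEMENTATION mylib.h -o mylib.o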


Yet there are very many substantial libraries that go well beyond small tasks that are implemented this way, some of them very popular.


Way simpler to integrate into your project. If there are multiple options, I'll choose the single header lib any day. If there aren't and the library turns out to be cumbersome to integrate, I may just reinvent the wheel or do something different. E.g., I'd never ever integrate boost ever again into any of my projects and just reinvent the wheel instead, or just not do whatever would have required boost.


This was harder to say back in 2003, but these days a lot of boost is already integrated into standard c++ anyway. We already have refcounting smart pointers, a unified threading model, ways of handling datetime, etc.


I agree. The concept of single-file or header-only libraries is often abused. Except when generics are involved, having a .c/.cpp source file is preferred. Note that while the repo is named "single_file_libs", many libraries in it consist of a pair of .c and .h files. These developers also agree with you.


just came here to say that I don't understand it, either. What's so hard about adding a couple of source files to your project?


> questionable #ifdefs

There’s nothing questionable about this practice, it’s pretty widely used throughout the industry.


Whoever maintains this needs to pay a bit closer attention to _how_ single-header libs are implemented.

In particular, stuffing non-inline functions into .h is, ultimately, a malpractice. Yes, you get to have a single .h, but you end up creating an instance of each function for every .c that includes this .h.

Example - https://github.com/tronkko/dirent/blob/master/include/dirent...
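
A minimal illustration of the problem (a made-up header, not the dirent code itself):

    /* util.h -- function *defined* in the header, not just declared */
    static int util_square(int x) { return x * x; }

    /* a.c and b.c both #include "util.h": each translation unit now
       carries its own private copy of util_square in its object file.
       With many functions and many includers that's a lot of duplicated
       code; without 'static' you get multiple-definition linker errors
       instead. */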


Sorry for speaking out of turn here, but the author, Sean Barrett, is a pretty well known figure in the game industry and his stb_X libraries are used _a lot_ in the indie game dev circles.

I would bet they are aware of most of the reasons that you might come up with for why single file libs are not good, and yet they authored a bunch of them.

As another poster remarked in this discussion, he has a good overview for how to create a robust single header library: https://github.com/nothings/stb/blob/master/docs/stb_howto.t... If you can offer some good arguments against those, instead of generic dogma, then we can have a discussion. :)


His own libraries are implemented correctly.

If you can first understand the point of a comment, instead of not understanding it, then we can indeed have a discussion.


Probably you should clarify what you mean in the top post. I still can't see anything but a dig on why single header libraries are wrong. Apologies if that's not the case.


I think OP was quite clear. They didn't say that single header libraries are wrong per se, but that it's bad practice to not declare functions in headers as inline, because it can cause accidental code duplication.

EDIT: I see now that this advice is only valid for C++, where the "inline" keyword creates a weak symbol. I forgot that this doesn't work in C.

Anyway, the #define trick is only necessary for C, in C++ you would simply declare all functions as "inline" (which automatically happens for function templates, btw)


Minor nitpick, but inline and weak are not the same thing. Weak linkage is a property that is resolved by the dynamic linker (at runtime startup); inline linkage is a property that is resolved by the static linker (at compile-time linking.) GCC calls inline "vague linkage" and it only uses weak linkage if the platform does not have COMDAT support:

https://gcc.gnu.org/onlinedocs/gcc/Vague-Linkage.html


Thanks, you're certainly correct! I was a bit sloppy and didn't care to lookup the proper wording :-)


That's why you wrap the function implementations in an #ifdef and instruct the library user to set the corresponding define in exactly one place, like the list maintainer does in his own single header libraries: https://github.com/nothings/stb/blob/b42009b3b9d4ca35bc703f5...


Sure, that's an option.


It’s not just an option, that’s how most header only libraries are explicitly created.


I haven't used a single C++ header-only library which required a #define. But that's because in C++ the "inline" keyword creates a weak symbol, so you can #include a header in several places and there will only ever be a single implementation. For function templates this happens automatically, btw.

IMO, if you need a #define like this, you're are not really writing a header-only library, that's basically just a source file in disguise. You could just as well distribute headers + a single source file and just ask the user to add this one source file to their build system (which should really be a trivial task).

EDITED for more clarity about C vs C++.


Also, while it's easier to just include another header file, it can have a pretty serious impact on build times. If you compare the same program built with header-only libraries to one built with libraries using the normal header declaration/source definition split, the latter builds much faster. I never worried about this too much in C, but C++ takes long enough to build as it is.

<grumble>

As a sidenote, I remember my brain hurting when I learned about C++ just redefining `inline`. It would have made much more sense to just define a new keyword.

</grumble>


Personally, I think people should only do C++ header-only libraries if technically necessary (e.g. heavy reliance on templates). Otherwise it just unnecessarily hurts compile times, as you've correctly noted, because the compiler still has to parse all those "inline" functions.


I don't disagree. My top comment was that if you are maintaining a list of single-header libs, might also pay attention to whether they are implemented properly.


But this is precisely what #ifndef is for.


Look at the linked code.


The readme includes a disclaimer that covers exactly this sort of case: "I have not personally verified that any specific library is as advertised, or is quality software."


These can be quite useful when prototyping indeed but please keep in mind that they are not the most robust code around in terms of security and UB if you are actually shipping a product using these.


I remember fuzzing stb_truetype and it finding hundreds of trivial segfaults instantly. Perhaps they fixed some but I can't advise using these with any untrusted data, it's obvious security is usually not a priority.


There is now a big fat disclaimer at the top of the file saying basically the same thing:

https://github.com/nothings/stb/blob/master/stb_truetype.h#L...


Ah, that's good, despite being six years late. However, I do not think there is any good reason for code written like this to exist in the first place. Bounds checking is easy and fast (even if less so in C) and things you consider "trusted" can become untrusted easily and quickly turn otherwise minor bugs like file type confusion into critical vulnerabilities.


I mostly agree with you. An important qualifier is that these libraries are meant for games, especially games that package their own font files, and especially games that are on locked-down platforms like consoles and mobile phones. On console games it makes sense to exclude bounds-checking your own data because you want to minimize load times. But yes, bounds checking should have been included and have been on by default with an option to disable it.


I think rust has indirectly shown that the performance impact of bounds checking is mostly negligible in practice. Even in very high performance code, I've never seen anyone turn the checks off for performance. This makes a lot of sense considering that on modern CPUs everything except cache misses is basically free, so a single unlikely branch compare usually just does not matter.

The only exception I can think of is accessing lots of indexes in a loop, where getting good performance requires rust programmers to insert awkward asserts or casts to fixed-length arrays to get the checks to optimize out[1]. But afaik that's mostly because the bounds checks impede other loop optimisations like vectorization, not because the checks are slow themselves.

[1] (e.g. when you access indexes 1 to n randomly, assert n < length before)


We have seen some of this; Dropbox has a macro that lets them turn off checks with a feature flag, for example, because it had a noticeable impact.


Ah, that's interesting, I assume this is the link:

https://dropbox.tech/infrastructure/lossless-compression-wit...

Shows a significant improvement in compression code and some examples of checks that can't be elided. I guess this makes sense because it's pretty much the worst case scenario for bounds check impact.

I can still personally say that every time I've blamed rust's bounds checks, I was wrong.

EDIT: neither the unsafe flag nor the macro appear to be present in https://github.com/dropbox/rust-brotli/ anymore today, removed some time in 2017. So it seems they found some other way to deal with it?


Ah interesting! Glad they removed it. I can't quite find out when, and I don't have the time to really dig in right now. Very cool, thanks.


This is unrelated to being single-file libraries though. Any library suffers from the same problem until it's properly fuzzed and fixed.


I didn't get the impression they were saying this is a problem specific to single file libraries but rather it was a problem specifically with the libraries suggested in the linked repository.


I'm generally not a fan, but header-only libraries do make it easier to ensure that specific compiler/linker flags are consistent across your code and the library's code.

Some examples of when this is helpful:

- using gcc's / clang's sanitizer frameworks

- tweaking builds for deep-dive optimization and/or debugging

- letting the compiler target a specific hardware architecture, e.g. `-mavx512f`.
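
For instance (made-up file name), a header-only dependency simply picks up whatever flags the including translation unit is built with:

    gcc -fsanitize=address,undefined -mavx512f -O2 -c app.c -o app.o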

I guess this highlights a limitation in typical Linux systems: a given library can be built with many different configuration options, but there's no (widely known?) way to have all those variants installed at the same time.

I think fixing this would require more than just changing cmake, apt/dpkg, or the C++ standard. A clean solution might also require changes to the ELF standard and/or Linux's compilers, linkers, dynamic loaders, and debuggers.


I remember looking over Niklaus Wirth’s source code for a compiler for his new language Pascal. This would have been around 1975. I was struck by the fact that it was one big single source file.

Later I came across a quote where he said something to the effect of: what’s the point of a linker that is slower than the compiler?

Sometimes I think what’s the point of this tangle of make files (or cmake/conan or whatever). It’s more difficult to get right than one big single source file.


> It’s more difficult to get right than one big single source file.

For all my personal projects, I use a single main.c file which #includes the topologically sorted .c files for each module, one file per module, preceded by a shared #include block for external library headers. If your module dependency graph is acyclic, you don't need any per-module headers; inside a module everything is sorted and you only need forward declarations for mutually recursive functions/types. The only real downside for me is that it breaks 'static' isolation. Even the slower C compilers like gcc, clang and msvc can build 50-100 kloc/sec on my aging laptop with code structured like this.

You get quality of life benefits like fast, simple builds (one-liner scripts for building on each platform, no need for the compiler to chew through the same monstrous system headers for every .c file, fewer symbols for the linker to resolve) and less boilerplate you have to write (no redundant copies of declarations in headers, no redundant #include blocks at the top of every .c file). In case it's something you care about, structuring your code like this also makes it trivial to automatically amalgamate your entire project into an stb-style single-file header library for distribution purposes: it's already 'static' safe, you've already specified the topological order and you've already eliminated all the redundant boilerplate you wouldn't want in a single-file distribution, so it's mostly just a matter of writing a script that inlines the #include "foo.c" directives in the main.c file.
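
A minimal sketch of that layout (file names made up):

    /* main.c */
    #include <stdio.h>      /* shared external includes, once */

    #include "util.c"       /* modules in topological order: */
    #include "parser.c"     /* parser.c may use util.c, not vice versa */
    #include "app.c"

    int main(void) {
        return app_run();   /* assumed entry point defined in app.c */
    }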


Neat technique. I had used it in a project where there was no linker and the compiler produced an absolute addressed binary for a custom chip.

The flip side of this technique is to be careful about blowing up the size of the final executable since everything gets included. You may need a linker pass to remove unused sections from the final executable. One good example to manage this is found in the dinkumware C library where each function is in its own .c (and hence corresponding .o) file. This way only the functions actually used get linked into the final executable.


> The flip side of this technique is to be careful about blowing up the size of the final executable since everything gets included.

Yeah, the 1970s linker model is very silly like this. The good news is that link-time optimization will handle this for you nowadays without any special per-symbol mark-up. As an experiment, put an unused 'deadfunc' function definition (plus an empty main) in a file called ltotest.c and then compare

    gcc -O2 ltotest.c -o ltotest && (objdump -D ltotest | grep deadfunc)
and

    gcc -O2 -flto ltotest.c -o ltotest && (objdump -D ltotest | grep deadfunc)



Neat, I didn't know about -fwhole-program as an alternative to -flto for single-file builds. It should help with compile times a little bit (though I normally only use LTO for release builds, rarely during development, so it's not a big deal either way). With MSVC I think you still have to use LTO (or LTCG as it's called there) to get this effect.


That is really neat. Never thought of just including the c-files in a c-file.

There is an annoying gap between invoking gcc manually and setting up a build system like make or cmake, and this would be a nice bridge.


I do the same thing, it works great!


> Sometimes I think what’s the point of this tangle of make files (or cmake/conan or whatever). It’s more difficult to get right than one big single source file.

There is a point at which multiple files and make systems make sense (no pun intended). Unfortunately, it seems most people overengineer at the start rather than adding complexity as needed.


nothings' guide on how to write single-file libraries is informative if you want to write your own: https://github.com/nothings/stb/blob/master/docs/stb_howto.t...


Using something like #define LIBRARYNAME_IMPLEMENTATION is a hack that is only necessary for C, because in C++ you can just mark all functions "inline" and there will only ever be a single implementation, no matter how often you include it.


Please don't: inline code in headers is the single biggest reason why C++ compiles so slowly. And besides, the inline keyword is also available in C (since C99), but that doesn't mean it's a good idea to use it:

The IMPLEMENTATION define cleanly separates the public API declarations (which are required in each source file where functions from this library are called), from the implementation code and any implementation-private declarations.

If you put all the implementation code into inline code, this code is parsed over and over again in each source file which would only need to include the declarations.

This is the main reason why C++ projects compile so slowly, and what gives "header-only libraries" such a bad reputation in the C++ world.

"STB-style single file libraries" don't suffer from this problem.


> the inline keyword is also available in C (since C99)

but it doesn't work the same way.

> If you put all the implementation code into inline code, this code is parsed over and over again in each source file

That's true. Good header-only C++ libraries provide "forward" headers, which you can use if you don't need the full definition.

But the best thing is to only write C++ header-only libraries if there is a technical reason (e.g. heavy templated code).

> "STB-style single file libraries" don't suffer from this problem.

That's right, but I also don't see the benefit. To me, telling the user to add a #define to some random source file feels incredibly hacky. On a technical level, it's the same as asking the user to #include a ".c" file.

Why not put the implementation in a single ".c" file, which the user can add to their project? I honestly don't understand this...


I think this chart would be more useful if the full name of the project were used instead of the file name (i.e. "nlohmann/json" instead of "json.hpp"), and also including the number of repo stars as a column.

Just looking through the list of single file JSON libraries, "nlohmann/json" is buried in the middle (most popular C++ json library) and surrounded by small projects.


This is missing my favorite header only library, cppitertools. I think it would qualify if it was considered a set of single header file libraries (one per iterator pattern), but since it has a dozen to choose from I guess it doesn't make this list.


All of that to compensate for the fact that the C/C++ community never came up with a decent & standardised build system. I never have to deal with issues like these when writing software in Rust & Go.


I agree that there is no easy way when it comes to build systems on C/C++ (other than known environments for specific platforms, like Visual Studio).

Also, following the reasoning line of your comment you can make a parallel conclusion: the C/C++ community and level of adoption are so big that the people of Rust & Go had to put together a "decent & standardised build system" to gain traction and consideration.


So you’ve never had problems with vendoring in Golang? At least this particular strategy makes it trivial to have different versions of different libraries in different projects on the same machine, each project entirely manages its own dependencies without any reliance on language specific tools. That’s why this kind of library format is so extremely popular.


With go's new module system, managing dependencies is quite pain free.


That depends pretty much on how they follow the versioning rules.


I can only speak for C.

With that being said, having a standardized build system would be nice, but it wouldn't solve the underlying problem here which is that portability is often not regarded as important enough to merit thinking about and designing for at the start of a project.

The consequence of this is that it creates a system in which C programmers have very little trust in one another's code.

I write a lot of C for ARM and I am often hesitant to use libraries that I can't inspect in a reasonable amount of time, because I have been burned by e.g. liberal use of type punning, which is idiomatic and works just fine on x86 but results in bus errors on ARM and is ultimately undefined behavior in most circumstances. (Rant: Just use memcpy, please! Argh!)


what is the advantage to a single header instead of 10 headers and maybe a main header file?

I really don't like looking at 10k+ line files..


this one might fit the description as well:

Rax is a radix tree implementation initially written to be used in a specific place of Redis in order to solve a performance problem, but immediately converted into a stand alone project to make it reusable for Redis itself, outside the initial intended application, and for other projects as well.

https://github.com/antirez/rax


This is not a single file library so it doesn’t quite fall under this category of source code. Most or perhaps all of the nothings files are considered “header only“, something that can be very convenient for adding libraries.



