
> I'm not saying it's impossible to write a bad printf line and never test it, only to have it fail years later in production. It's absolutely possible and it has happened to me. Lessons learned.

A modern compile-time type-checked formatter would have prevented this mistake. You are deliberately choosing to use poor tools and calling this "pragmatism" because it sounds better than admitting you're bad at this and you don't even want to improve.

In fact C++ even shipped a pair of functions here. There's a compile time type checked formatter std::format, which almost everybody should use almost always (and which is what std::println calls), and there's also a runtime type checked formatter std::vformat, for those few cases where you absolutely can't know the format string until the last moment. That is a hell of a thing, if you need one of those I have to admit nobody else has one today with equal ergonomics.




Thanks for the ad hominem, but let's put that into perspective.

My current project is a GUI prototype based on plain Win32/Direct3D/Direct2D/DirectWrite. It currently clocks in at just under 6 KLOC. These are all the format calls in there (used git grep):

        fatal_f("Failed to CreateBuffer(): %lx", err);
        fatal_f("Failed to Map() buffer");
        fatal_f("Failed to compile shader!");
        fatal_f("Failed to CreateBuffer(): %lx", err);
        fatal_f("Failed to create blend state");
        fatal_f("OOM");
        fatal_f("Failed to register Window class");
        fatal_f("Failed to CreateWindow()");
        fatal_f("%s failed: error code %lx", what, hr);
        msg_f("Shader compile error messages: %s", errors->GetBufferPointer());
        msg_f("Failed to compile shader but there are no error messages. "
        msg_f("HELLO: %d times clicked", count);
        msg_f("Click %s", item->name.buf);
        msg_f("Init text controller %p", this);
        msg_f("DELETE");
        msg_f("Refcount is now %d", m_refcount);
        msg_f("Refcount is now %d", m_refcount);
        vfprintf(stderr, fmt, ap);
        fprintf(stderr, "\n");
        fprintf(stderr, "FATAL ERROR: ");
        vfprintf(stderr, fmt, ap);
        fprintf(stderr, "\n");
        snprintf(utext, sizeof utext, "Hello %d", ui->update_count);
        snprintf(filepath, sizeof filepath, "%s%s",
        int r = vsnprintf(m_buffer, sizeof m_buffer, fmt, ap);
        int r = vsnprintf(text_store, sizeof text_store, fmt, ap);
        snprintf(svg_filepath, sizeof svg_filepath, "%s", filepath);
That's theory and practice for you. The real world is a bit more nuanced.

Meanwhile I have 100 other, more significant problems to worry about than printf type safety. For example, how to get rid of the RAII-based refcounting that I introduced, which wasn't exactly an improvement to my architecture.

But thanks for the suggestion to use std::format in that set of cases and std::vformat in these other situations. I'll put those on my stack of C++ features to work through when I have time for things like that. (Let's hope that when I get there, those aren't already superseded by something safer).

(Update: uh, std::format returns std::string. Won't use.)


> std::format returns std::string. Won't use

Just use `std::format_to_n` then and format to whatever your heart desires, without ever allocating once:

    std::array<char, 256> buf{};

    std::format_to_n(data(buf), size(buf), "hello {}", 44);
I've used `fmt` on *embedded* devices and it was never a performance issue, not even once (it's even arguably _faster_ than printf).

(OT: technically speaking, in C++ you shouldn't call `vfprintf` or other C library functions without prefixing them with `std::`, but that's a crusade I'm bound to lose - albeit `import std` will help a lot)


I noticed std::format and std::print aren't even available with my pretty up-to-date compilers (testing Debian bookworm gcc/clang right now). There is only https://github.com/fmtlib/fmt but it doesn't seem prepackaged for me. Have you actually used std::format_to_n? Did you go through the trouble of downloading it or are you using C++ package managers?

I'm often getting the impression that these "you're a fool using these well-known but insecure libraries. Better use this new shiny thing because it's safe" discussions are a bit removed from reality.

But I'm asking in earnest. Please also check out my benchmark in the sibling thread where I compared stringstream with stdio/snprintf build performance. Would in fact love to compare std::format_to_n, but can't be arsed to put in more time to get it running right now.


> my pretty up-to-date compilers

> testing Debian bookworm

Debian and up-to-date compilers - pick one. <format> support comes with GCC 13.x, which was released more than 3 months ago. MSVC has had it for years now, LLVM is still working on it AFAIK (but it works with GCC). `std::print` is a new addition in C++23, which hasn't been released yet.

> Did you go through the trouble of downloading it or are you using C++ package managers?

I don't know of many non-trivial programs in C or C++ that don't rely on third party libraries. The C standard library, in particular, is fairly limited and doesn't come with "batteries included".

In general I've been using {fmt} for the better part of the last 5 years, and it's trivial to embed in a project (it uses CMake, so it's as simple as adding a single line in a CMakeLists.txt). It has been shipped by most distributions for years now; for instance, it was already supported in Debian buster (see https://packages.debian.org/buster/libfmt-dev), so you can just install it using your package manager and that's it.

{fmt} is also mostly implemented in its header, with a very small shared/static library that goes alongside it. It's one repository I always use in my C++ projects, together with Microsoft's GSL (for `not_null` and `finally`, mostly).

> "you're a fool using these well-known but insecure libraries. Better use this new shiny thing because it's safe" discussions are a bit removed from reality.

No, I think that insecure code is insecure, period, no matter how much it is used or well known. Such focus on practicality over correctness was the reason why my C university professor was so set on continuing using old C string functions which were already well known back then to be a major cause of CVEs. That was, in my opinion, completely wrong.

This is especially true in this case, {fmt}/<format> are nicer to use than `sprintf`, are safer, support custom types and are also _faster_ because they are actually dispatched and verified at compile time. Heck the standard itself basically just merged a subset of {fmt}'s functionality, so much so that I've recently sed-replaced 'fmt' with 'std' in some projects and it built the same with GCC's implementation. `std::print`, too, is just `fmt::print`, no more no less (with a few niceties removed, afaik).

> where I compared stringstream with stdio/snprintf build performance

String Streams (and IOStream in general) are a poorly designed concept, which have been the butt of the joke for years for their terrible performance. This is well known, and I'm honestly aghast any time I see anyone using them in place of {fmt}, which has been the de-facto string format library for C++ for the best part of the last decade (at least since 2018) and is better than `std::stringstream` in every conceivable way.

If you look at benchmarks on fmt's Github page, you will see it consistently outperforms most other solutions out there, while being completely type safe and extensible. It's IMHO the best of both worlds.


Ok thanks for answering. In the meantime I had already tested <format> on MSVC as well as godbolt: https://news.ycombinator.com/item?id=36715949

Not practical for me to add a second of compile time for each file that wants to print something.

> Debian and up to date compilers - pick one.

gcc 12.3 was released only a few months ago and is included. gcc 13.1, some 80 days old, doesn't seem to have made it. Not everybody is closely tracking upstream. Immediately jumping on each new train is not my thing (hence why Debian is fine), nor is it how software development is handled in the industry generally.

Even on godbolt / gcc 13.1 which I linked in the other post, <format> isn't available. Only {fmt} is available as an optional library.

> {fmt}/<format> are nicer to use than

I think otherwise, but maybe you enjoy brewing coffee on top of your desktop computer while waiting for the build to finish.

> _faster_ because they are actually dispatched at compile time

I don't actually want this unless I'm bottlenecked by the format string parsing. If I have one or two integer formats in my formatting string, the whole thing will already be bottlenecked by that. So "dispatching at compile time" is typically akin to minimizing the size of a truck, when we should have designed a sports car. The thing about format strings and varargs is they're in fact an efficient encoding of what you want to do. Not worth emitting code for 2-5 function calls if a single one is enough.

If there is a speed problem, you need some wider optimization that the compiler can't help you with.

Apart from that, that compile time dispatching doesn't actually happen with fmtlib in the godbolt, not even at -O2. The format string is copied verbatim into the source. Which I like.

> If you look at benchmarks on fmt's Github page, you will see it consistently outperforms most other solutions out there, while being completely type safe and extensible. It's IMHO the best of both worlds.

Since I've already done a lot of work, show me some realistic and useful benchmark to support your claims?


>> If you look at benchmarks on fmt's Github page, you will see it consistently outperforms most other solutions out there, while being completely type safe and extensible. It's IMHO the best of both worlds.

> Since I've already done a lot of work, show me some realistic and useful benchmark to support your claims?

Duh, I apologize for not even reading your statement completely. So I went on this page and it is exactly how I imagined. libc printf 0.91s, libfmt 0.74s. 20% speedup is not nothing, but won't help when there is an actual bottleneck. (In this case the general approach has to be changed).

Also compiled size is measurably larger even only with a few print statements in the binary. Compile time is f***** 8 times slower!

These are all numbers coming from their own page -- expect to have slightly different numbers for your own use cases.


> (Update: uh, std::format returns std::string. Won't use.)

What type would you prefer?


Generally I use printf/fprintf and snprintf. I would be looking for a replacement for snprintf for the most part, i.e. I want to provide the buffer.

Apparently we can

    template< class... Args >
    void print( std::ostream& os, std::format_string<Args...> fmt, Args&&... args );
(sigh of despair) since C++23, so that could be an option if there is an easy way to make a std::ostream of a memory slice. But I'm not on C++23 yet.

Compare to

    int snprintf ( char * s, size_t n, const char * format, ... );
(sigh of relief) and I think it takes a special kind of person to think that std::print is strictly better.
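
For reference, a small sketch of the `snprintf` return-value contract that any caller providing its own buffer has to handle: the return value is the length the full text would have had, which can exceed the buffer size. The `Written` struct and `hello_into` helper are hypothetical names.

```cpp
#include <cstddef>
#include <cstdio>

struct Written {
    std::size_t len;   // characters stored in the buffer, terminator excluded
    bool truncated;    // true if the full text did not fit
};

// Hypothetical helper: formats "Hello <count>" into a caller-provided buffer.
Written hello_into(char *buf, std::size_t size, int count)
{
    int r = std::snprintf(buf, size, "Hello %d", count);
    if (r < 0) {
        // Encoding error: leave an empty string behind.
        if (size > 0) buf[0] = '\0';
        return {0, false};
    }
    std::size_t wanted = static_cast<std::size_t>(r);
    if (wanted < size)
        return {wanted, false};
    // Output was cut off; snprintf still NUL-terminated it if size > 0.
    return {size > 0 ? size - 1 : 0, true};
}
```

Passing the raw return value to something like `fwrite` without this check would read past the end of the buffer whenever the text was truncated.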


For snprintf(), how do you ensure that your format string and variadic arguments will not cause a crash at runtime? The C++ version is compile-time type safe.

I haven't written C++ recently, but I recall that you can use ostringstream in place of ostream. If you don't like std::string, you can probably write your own ostream that will operate on a fixed size char buffer. (Can any C++ experts comment on that idea?)

About "sigh of despair" and "sigh of relief": Are you expressing concern about the template function signature?


> I haven't written C++ recently, but I recall that you can use ostringstream in place of ostream.

I don't know about those specifically right now, but in general these things have huge compile time costs and are also generally less ergonomic IMO. [EDIT: cobbled together a working version and added it to my test below, see Version 0].

> About "sigh of despair" and "sigh of relief": Are you expressing concern about the template function signature?

Yes. It's a mouthful, and I'm worried not only about static checks but about other things too -- like readability of errors, include & link dependencies, compile performance, amount of compiled code (which is minimal in case of snprintf/varargs)... I would need to check out std::format_to_n() as suggested by the sibling commenter.

And hey -- snprintf has been available for easily 30+ years ... while the <print> and <format> headers that people make such a fuss about don't even seem to be available with gcc or clang on my fully updated Debian bookworm system. The reason is that those implementations aren't complete, even though <format> is C++20. The recommended way to get those headers is to grab https://github.com/fmtlib/fmt as an external library... Talk about the level of hype and lack of pragmatism that's going on around here. People are accusing each other of not using a standard library that isn't even implemented in compilers... And in all likelihood they haven't used the external library themselves, and given that this library is external, it's not heavily tested and probably still contains bugs, maybe CRASHES and SECURITY EXPLOITS.

But let me test C++ features that actually exist:

   #if VERSION == 0

   #include <iostream>
   #include <streambuf>

   struct membuf: std::streambuf
   {
      membuf(char *p, size_t size)
      {
         setp(p, p + size);
      }
      size_t written() { return pptr() - pbase(); }
   };

   int main()
   {
      char buffer[256];
      membuf sbuf(buffer, sizeof buffer);

      std::ostream out(&sbuf);

      out << "Hello " << 42 << "\n";

      fwrite(buffer, 1, sbuf.written(), stdout);

      return 0;
   }


   #elif VERSION == 1
   #include <sstream>
   #include <iostream>

   void test(std::stringstream& os)
   {
       os << "Hello " << 42 << "\n";
   }

   int main()
   {
      std::stringstream os;
      test(os);
      std::cout << os.str();
      return 0;
   }

   #elif VERSION == 2

   #include <stdio.h>

   int test(char *buffer, int size)
   {
      int r = snprintf(buffer, size, "Hello %d\n", 42);
      return r;
   }

   int main()
   {
      char buffer[256];
      int len = test(buffer, sizeof buffer);
      fwrite(buffer, 1, len, stdout);
      return 0;
   }
   #endif

Compile & link:

                   CT      LT      TT      PT      PL

     -DVERSION=0   0.361s  0.095s  0.456s  0.081s  32573
     -DVERSION=1   0.364s  0.089s  0.453s  0.074s  32338 
     -DVERSION=2   0.039s  0.088s  0.127s  0.031s    918
CT=compile time, LT=link time, TT=total time (CT+LT), PT=preproc time (gcc -E), PL=preprocessor output lines

Bench script:

    # put -DVERSION=0, -DVERSION=1 or -DVERSION=2 as cmdline arg
    time clang++ -c "$@" -Wall -o test.o test.cpp
    time clang++ -Wall -o test test.o
    time clang++ "$@" -Wall -E -o test.preprocessed.txt test.cpp
    wc -l test.preprocessed.txt
My clang version here is 14.0.6. I measured with g++ 12.2.0 as well and the results were similar (with only 50% of the link time for the snprintf-only version).

For such a trivial file, the difference is ABYSMAL. If we extrapolate to real programs, we can assume build times to be 5-10x longer for a general change in programming style. Wait 10 seconds or wait 1 minute. For a small gain in safety, how much are you willing to lose? And how much do this lost time and these lost resources actually translate into working less on the robustness of the program, leaving more security problems (as well as other problems) in there?

And talking about lost run time performance, that is real too if you're not very careful.

> For snprintf(), how do you ensure that your format string and variadic arguments will not cause crash at runtime? The C++ version is compile-time type safe.

Honestly I just don't ensure it perfectly -- beyond running them once as described. I write a lot of code that isn't fully proofed out from the beginning. Exploratory code. A few printfs are really not a concern in there, there are much bigger issues to work out.

I also absolutely do have some printfs that were quickly banged out but that are hidden in branches that have never actually run and might never happen -- they were meant for some condition that I'm not even sure is possible (this happens frequently when checking return values from complicated APIs for example).

The real "problem" isn't that there is a possibly wrong printf in that branch, but that the branch was never tested, and is likely to contain other, much worse bugs. But the fact that the branch was never run also means I don't care as much about it, pragmatically speaking. Likely there is an abort() or similar at the end of the branch anyway. It's always important to put things into perspective like that -- which is something that seems often missing from C++ and similar cultures.

The more proofed out some code gets, the more scrutiny it should undergo obviously.

Apart from that, compilers do check printfs, and I usually get a warning/error when I made a mistake. But I might not get one if I write my own formatting wrappers and am too lazy to explicitly enable the checking.
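
As a rough sketch of what explicitly enabling that checking can look like on GCC and Clang: the `format` function attribute tells the compiler to check calls to a wrapper the same way it checks `printf`. The `PRINTF_LIKE` macro and the `format_into` wrapper below are hypothetical names, and on other compilers the macro simply expands to nothing.

```cpp
#include <cstdarg>
#include <cstddef>
#include <cstdio>

#if defined(__GNUC__) || defined(__clang__)
// Arguments: 1-based index of the format string, then of the first vararg.
#  define PRINTF_LIKE(fmt_index, first_arg) \
       __attribute__((format(printf, fmt_index, first_arg)))
#else
#  define PRINTF_LIKE(fmt_index, first_arg)
#endif

// Hypothetical wrapper around vsnprintf with a caller-provided buffer.
// The attribute makes -Wformat check every call site against `fmt`.
int format_into(char *buf, std::size_t size, const char *fmt, ...)
    PRINTF_LIKE(3, 4);

int format_into(char *buf, std::size_t size, const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    int r = std::vsnprintf(buf, size, fmt, ap);
    va_end(ap);
    return r;
}
```

With this in place, a call such as `format_into(buf, sizeof buf, "%s", 42)` should produce a `-Wformat` warning (enabled by `-Wall`), just like a direct `snprintf` call would.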


This is a phenomenal reply. Thank you very much for teaching me so patiently. :)


Again, you're mixing functions up. `std::print` is the equivalent of `std::fprintf`; the one you want for writing into arbitrary buffers is `std::format_to_n`, which IS a strictly better version of `snprintf`.


Maybe if you used a typesafe language, you wouldn't have to track down 100 other more significant problems (:


I'm using C/C++, which do provide a good level of type safety.

And no, types are absolutely not my problem. In fact, rigid type systems are a frequent source of practical problems, and they often shift the game to one where you're solving type puzzles -- instead of working on the actual functionality.


Come on, dude. C provides almost no type safety whatsoever. There’s no point in saying “C/C++” here, because you won’t adopt the C++ solutions for making your code actually typesafe.


Type safety is not a standardized term, and it is not binary. Being black and white about things is almost always bad. One needs to weigh and balance a large number of different concerns.

A lot of "modern C++" is terrible, terrible code precisely because of failing to find a balance.

Many C++ "solutions" are broken by design.



