
As always, people making more fuss around it than necessary. Code calling printf() with a constant format string literal is this class of code that you have to run a single time to know it works. Many C++ programmers have always been using printf() in preference to e.g. std::cout because of ergonomics. And they were right.

It's hard to take people seriously that try to talk it down for being a pragmatic solution that's been around for probably 30-40 years.




I've definitely written bugs where the format specifier is wrong. Maybe you used %lu because a type happened to be unsigned long on your system, then you compile it for another system and it's subtly wrong because the type you're printing is typedef'd to unsigned long long there. Printing a uint32_t? I hope you're using the PRIu32 macro, otherwise that's also a bug.


This. I have corrected uncountable lines of code where people just used "%d" for everything, potentially opening a massive can of worms. `inttypes.h` is what you should be using 99% of the time, but that makes for very ugly format strings, so basically nobody uses that. Otherwise you should cast all of your integer params to (long long int) and use %lld, which sucks.
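A minimal sketch of the two options mentioned above, assuming a hosted C++11 (or later) toolchain. The helper names `format_u32` and `format_u64_cast` are made up for illustration; only the `PRIu32` macro and the `%llu` conversion come from the standard library.

```cpp
#include <cassert>
#include <cinttypes>  // PRIu32
#include <cstdint>
#include <cstdio>
#include <string>

// Option 1: use the <cinttypes> macro, which expands to the correct
// conversion specifier for uint32_t on the current platform.
// Note the space between the literal and the macro (needed since C++11).
std::string format_u32(std::uint32_t v) {
    char buf[32];
    int n = std::snprintf(buf, sizeof buf, "%" PRIu32, v);
    assert(n > 0 && static_cast<std::size_t>(n) < sizeof buf);
    return std::string(buf, static_cast<std::size_t>(n));
}

// Option 2: cast to the widest standard unsigned type and use %llu.
// This works for any unsigned integer typedef, at the cost of the cast.
std::string format_u64_cast(std::uint64_t v) {
    char buf[32];
    int n = std::snprintf(buf, sizeof buf, "%llu",
                          static_cast<unsigned long long>(v));
    assert(n > 0 && static_cast<std::size_t>(n) < sizeof buf);
    return std::string(buf, static_cast<std::size_t>(n));
}
```

Either way the specifier no longer depends on what the typedef happens to resolve to on a given system, which is the bug described above.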


Yes, this is annoying. Integer promotions can be annoying in general.

I'm often working with fixed size types, and still find myself using %d and %ld instead of PRIu32 etc most of the time, because it's simply easier to type. If unsure about portability issues it can help to cast the argument instead. But realistically it isn't much of an issue, printfs seem to be about 0.5% of my code, and >> 90% of them are purely diagnostic. I don't even recall the last time I had an issue there. I remember the compiler warning me a few times about mismatching integer sizes though.


I agree. Layers of legacy typedefs in the Win32 API always catch me off guard. Any large source base with lots of typedefs, it can be tricky to printf() without issue.


> you have to run a single time to know it works

A common use case for cstdio/iostreams/std::format is logging. It's not at all uncommon to have many, many log statements that are rarely executed because they're mainly for debugging and therefore disabled most of the time. There you go, lots of rarely used and even more rarely 'tested' formatting constructs.

I don't want things to start blowing up just because I enabled some logging, so I'm going to do what I can to let the compiler find as many problems as possible at compile time.


> Code calling printf() with a constant format string literal is this class of code that you have to run a single time to know it works

    void msg(char *msg, char *x, int k) {
        printf("%hhn%s%d\n", x, msg, k);
    }
So, how about it? I mean, I have code where that works exactly as expected, so I can "know it works" according to you, but I also have code where that blows up immediately, because it's complete nonsense. Which according to you shouldn't happen, but there it is.


I mean yes, I should have been more restrictive in my statement, but I'm sure you notice how we're veering more into the hypothetical / into programming language geek land. I had to look up %hhn because I've never used it.

(Have used %n years ago but noticed it's a fancy and unergonomic way to code anyway. In the few locations where printed characters have to be counted, just consider the return value of the format call!)
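To illustrate the point about the return value, here is a small sketch. `format_kv` and `fits` are hypothetical helpers, not anything from the thread; the only thing relied on is that `snprintf` returns the number of characters the full output needs, excluding the terminating NUL.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Writes "key=value" into buf and returns what snprintf returns: the
// length the complete output would have, even if it was truncated.
int format_kv(char* buf, std::size_t size, const char* key, int value) {
    assert(buf != nullptr || size == 0);
    return std::snprintf(buf, size, "%s=%d", key, value);
}

// True if the whole output (plus the NUL terminator) fit into the buffer.
bool fits(int written, std::size_t size) {
    return written >= 0 && static_cast<std::size_t>(written) < size;
}
```

The character count comes back directly from the call, so no `%n` argument is needed, and the same value doubles as a truncation check.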

And btw. how is this a problem related to missing type checks with varargs? The only problem I see is that we don't know that those pointers are not null / the char-pointer doesn't point to a zero-terminated string. In other words, just the basic C level of type (un)safety.


Totally agree, given example is total nonsense.

Most issues with printf could be reported by static analysis, and modern compilers report them as warnings, which in my book must be immediately converted to errors. All other weird usages should be either banned, or reviewed carefully, but they are very rare.

Also, std::iostream is ridden with bugs as well. Trying to print out hex / dec in a consistent way is plain "impossible". Every time you print an int, you should in fact systematically specify the padding, alignment and mode (hex/dec), otherwise you can't know for sure what you are outputting.
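A short sketch of what is probably meant by the hex/dec complaint: manipulators such as `std::hex` stay set on the stream after use. The function names below are hypothetical, and saving and restoring the flags is just one possible workaround.

```cpp
#include <cassert>
#include <ios>
#include <sstream>
#include <string>

// std::hex is sticky: the second value is printed in hex as well,
// although only the first one asked for it.
std::string sticky_hex_demo() {
    std::ostringstream os;
    os << std::hex << 255 << ' ' << 255;
    assert(os.good());
    return os.str();
}

// Workaround: save the format flags, print, then restore them so the
// base does not leak into later output on the same stream.
std::string print_hex_then_dec(int a, int b) {
    std::ostringstream os;
    const std::ios_base::fmtflags saved = os.flags();
    os << std::hex << a;
    os.flags(saved);
    os << ' ' << b;
    assert(os.good());
    return os.str();
}
```

Any code that writes to a shared stream therefore depends on whatever state the previous writer left behind, unless it sets or restores the state itself.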


iostream _sucks_, I had to implement an iostream for zlib + DEFLATE in order to play ball with an iostream-based library, and I had to sweat blood and tears in order to make it work right when a simple loop and a bit of templated code would have worked wonders compared to that sheer insanity of `gptr`, `pubsync`, ... The moment you notice that they've methods called "pubXXX" that call a protected "XXX" on the `basic_streambuf` class is the moment your soul leaves your body.

IOStreams is superbad, and thankfully <format> removes half of its usages which were based on spamming `stringstream`s everywhere (stringstream is also very, very bad). They also inspired Java's InputStream nonsense which ruined the existence of an innumerable number of developers in the last 30 years.


This is an interesting post. Can you explain the two different scenarios with examples?


Sometimes the print statement is in untested sanity-checking error-case branches that don't have test coverage ("json parsing failed" or whatever). It's pretty annoying when those things throw, and not too uncommon.

Another case in C++ is if the value is templated. You don't always get test coverage for all the types, and a compile error is nice if the thing can't be handled.

"Type coverage" is pretty useful. Not a huge deal here I agree, but nice nonetheless.


printf and similar stdio string formatting functions have caused a non-trivial amount of security exploits.

If the compiler can statically check that your format string is valid and type checks, you can prevent this entire class of exploits.

https://en.wikipedia.org/wiki/Printf#Vulnerabilities
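As a rough sketch of the class of bug in question: the exploitable pattern is passing externally supplied text as the format string itself. The hypothetical `log_line` helper below keeps the format string a literal and passes the text only as a `%s` argument, so conversion specifiers inside it are never interpreted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// The format string is a compile-time literal; user_text is data only.
// The dangerous variant would be snprintf(buf, sizeof buf, user_text).
std::string log_line(const char* user_text) {
    char buf[128];
    int n = std::snprintf(buf, sizeof buf, "user said: %s", user_text);
    assert(n >= 0);
    // On truncation snprintf reports the untruncated length, so clamp it.
    std::size_t len = static_cast<std::size_t>(n);
    if (len >= sizeof buf) len = sizeof buf - 1;
    return std::string(buf, len);
}
```

With a literal format string the compiler can also check the arguments against it, which it cannot do when the format is only known at run time.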


> is this class of code that you have to run a single time to know it works

Not sure what you're trying to say here.


Take as an example printf("%d\n", foo->x);. Assuming it compiled but assuming no further context, what could break here at run-time? foo could be NULL. And the type of foo->x could be not an integer.

Let's assume you run the code once and observe that it works. What can you conclude? 1) foo was not NULL at least one time. Unfortunately, we don't know about all the other times. 2) foo->x is indeed an integer and the printf() format is always going to be fine -- it matches the arguments correctly. It's a bit like a delayed type check.

A lot of code is like that. Furthermore, a lot of that code -- if the structure is good -- will already have been tested after the program has been started up. Or it can be easily tested during development by running it just once.

I'm not saying it's impossible to write a bad printf line and never test it, only to have it fail years later in production. It's absolutely possible and it has happened to me. Lessons learned.

I'll even go as far as saying that it's easy to have errors slip on refactors if there aren't good tests in place. But people are writing untyped Python or Javascript programs, sometimes significant ones. Writing in those is like every line was a printf()!

But many people will go through great trouble to achieve an abstract goal of type safety, accepting pessimisations on other axes even when it is ultimately a bad tradeoff. People also like to bring up issues like this on HN like it's the end of the world, when it's not nearly as big of an issue most of the time.

Another similar example is void pointers as callback context. It is possible to get it wrong, it absolutely happens. But from a pragmatic and ergonomic standpoint I still prefer them to e.g. abstract classes in a lot of cases due to being a good tradeoff when taking all axes into account.


> I'm not saying it's impossible to write a bad printf line and never test it, only to have it fail years later in production. It's absolutely possible and it has happened to me. Lessons learned.

A modern compile time type checked formatter would have prevented this mistake, you are deliberately choosing to use poor tools and calling this "pragmatism" because it sounds better than admitting you're bad at this and you don't even want to improve.

In fact C++ even shipped a pair of functions here. There's a compile time type checked formatter std::format, which almost everybody should use almost always (and which is what std::println calls), and there's also a runtime type checked formatter std::vformat, for those few cases where you absolutely can't know the format string until the last moment. That is a hell of a thing, if you need one of those I have to admit nobody else has one today with equal ergonomics.


Thanks for the ad hominem, but let's put that into perspective.

My current project is a GUI prototype based on plain Win32/Direct3D/Direct2D/DirectWrite. It currently clocks in at just under 6 KLOC. These are all the format calls in there (used git grep):

        fatal_f("Failed to CreateBuffer(): %lx", err);
        fatal_f("Failed to Map() buffer");
        fatal_f("Failed to compile shader!");
        fatal_f("Failed to CreateBuffer(): %lx", err);
        fatal_f("Failed to create blend state");
        fatal_f("OOM");
        fatal_f("Failed to register Window class");
        fatal_f("Failed to CreateWindow()");
        fatal_f("%s failed: error code %lx", what, hr);
        msg_f("Shader compile error messages: %s", errors->GetBufferPointer());
        msg_f("Failed to compile shader but there are no error messages. "
        msg_f("HELLO: %d times clicked", count);
        msg_f("Click %s", item->name.buf);
        msg_f("Init text controller %p", this);
        msg_f("DELETE");
        msg_f("Refcount is now %d", m_refcount);
        msg_f("Refcount is now %d", m_refcount);
        vfprintf(stderr, fmt, ap);
        fprintf(stderr, "\n");
        fprintf(stderr, "FATAL ERROR: ");
        vfprintf(stderr, fmt, ap);
        fprintf(stderr, "\n");
        snprintf(utext, sizeof utext, "Hello %d", ui->update_count);
        snprintf(filepath, sizeof filepath, "%s%s",
        int r = vsnprintf(m_buffer, sizeof m_buffer, fmt, ap);
        int r = vsnprintf(text_store, sizeof text_store, fmt, ap);
        snprintf(svg_filepath, sizeof svg_filepath, "%s", filepath);
That's theory and practice for you. The real world is a bit more nuanced.

Meanwhile I have 100 other, more significant problems to worry about than printf type safety. For example, how to get rid of the RAII based refcounting that I introduced but that wasn't exactly an improvement to my architecture.

But thanks for the suggestion to use std::format in that set of cases and std::vformat in these other situations. I'll put those on my stack of C++ features to work through when I have time for things like that. (Let's hope that when I get there, those aren't already superseded by something safer).

(Update: uh, std::format returns std::string. Won't use.)


> std::format returns std::string. Won't use

just use `std::format_to_n` then and format to whatever your heart desires, without ever allocating once:

    std::array<char, 256> buf{};

    std::format_to_n(data(buf), size(buf), "hello {}", 44);
I've used `fmt` on *embedded* devices and it was never a performance issue, not even once (it's even arguably _faster_ than printf).

(OT: technically speaking, in C++ you shouldn't call `vfprintf` or other C library functions without prefixing them with `std::`, but that's a crusade I'm bound to lose - albeit `import std` will help a lot)


I noticed std::format and std::print aren't even available with my pretty up-to-date compilers (testing Debian bookworm gcc/clang right now). There is only https://github.com/fmtlib/fmt but it doesn't seem prepackaged for me. Have you actually used std::format_to_n? Did you go through the trouble of downloading it or are you using C++ package managers?

I'm often getting the impression that these "you're a fool using these well-known but insecure libraries. Better use this new shiny thing because it's safe" discussions are a bit removed from reality.

But I'm asking in earnest. Please also check out my benchmark in the sibling thread where I compared stringstream with stdio/snprintf build performance. Would in fact love to compare std::format_to_n, but can't be arsed to put in more time to get it running right now.


> my pretty up-to-date compilers

> testing Debian bookworm

Debian and up to date compilers - pick one. <format> support comes with GCC 13.x, which has been released more than 3 months ago. MSVC has had it for years now, LLVM is still working on it AFAIK (but it works with GCC). `std::print` is a new addition in C++23, which hasn't been released yet.

> Did you go through the trouble of downloading it or are you using C++ package managers?

I don't know of many non-trivial programs in C or C++ that don't rely on third party libraries. The C standard library, in particular, is fairly limited and doesn't come with "batteries included".

In general I've been using {fmt} for the better part of the last 5 years, and it's trivial to embed in a project (it uses CMake, so it's as simple as adding a single line in a CMakeLists.txt). It has been shipped by most distributions for years now (see https://packages.debian.org/buster/libfmt-dev), for instance, it was already supported in Debian buster, so you can just install it using your package manager and that's it.

{fmt} is also mostly implemented in its header, with a very small shared/static library that goes alongside it. It's one repository I always use in my C++ projects, together with Microsoft's GSL (for `not_null` and `finally`, mostly).

> "you're a fool using these well-known but insecure libraries. Better use this new shiny thing because it's safe" discussions are a bit removed from reality.

No, I think that insecure code is insecure, period, no matter how much it is used or well known. Such focus on practicality over correctness was the reason why my C university professor was so set on continuing using old C string functions which were already well known back then to be a major cause of CVEs. That was, in my opinion, completely wrong.

This is especially true in this case, {fmt}/<format> are nicer to use than `sprintf`, are safer, support custom types and are also _faster_ because they are actually dispatched and verified at compile time. Heck the standard itself basically just merged a subset of {fmt}'s functionality, so much so that I've recently sed-replaced 'fmt' with 'std' in some projects and it built the same with GCC's implementation. `std::print`, too, is just `fmt::print`, no more no less (with a few niceties removed, afaik).

> where I compared stringstream with stdio/snprintf build performance

String Streams (and IOStream in general) are a poorly designed concept, which have been the butt of the joke for years for their terrible performance. This is well known, and I'm honestly aghast any time I see anyone using them in place of {fmt}, which has been the de-facto string format library for C++ for the best part of the last decade (at least since 2018) and is better than `std::stringstream` in every conceivable way.

If you look at benchmarks on fmt's Github page, you will see it consistently outperforms most other solutions out there, while being completely type safe and extensible. It's IMHO the best of both worlds.


Ok thanks for answering. In the meantime I had already tested <format> on MSVC as well as godbolt: https://news.ycombinator.com/item?id=36715949

Not practical for me to add a second of compile time for each file that wants to print something.

> Debian and up to date compilers - pick one.

gcc 12.3 was released only a few months ago and is included. gcc 13.1, some 80 days old, doesn't seem to have made it. Not everybody is closely tracking upstream. Immediately jumping on each new train is not my thing (hence why Debian is fine), nor is it how software development is handled in the industry generally.

Even on godbolt / gcc 13.1 which I linked in the other post, <format> isn't available. Only {fmt} is available as an optional library.

> {fmt}/<format> are nicer to use than

I think otherwise, but maybe you enjoy brewing coffee on top of your desktop computer while waiting for the build to finish.

> _faster_ because they are actually dispatched at compile time

I don't actually want this unless I'm bottlenecked by the format string parsing. If I have one or two integer formats in my formatting string, the whole thing will already be bottlenecked by that. So "dispatching at compile time" is typically akin to minimizing the size of a truck, when we should have designed a sports car. The thing about format strings and varargs is they're in fact an efficient encoding of what you want to do. Not worth emitting code for 2-5 function calls if a single one is enough.

If there is a speed problem, you need some wider optimization that the compiler can't help you with.

Apart from that, that compile time dispatching doesn't actually happen with fmtlib in the godbolt, not even at -O2. The format string is copied verbatim into the source. Which I like.

> If you look at benchmarks on fmt's Github page, you will see it consistently outperforms most other solutions out there, while being completely type safe and extensible. It's IMHO the best of both worlds.

Since I've already done a lot of work, show me some realistic and useful benchmark to support your claims?


>> If you look at benchmarks on fmt's Github page, you will see it consistently outperforms most other solutions out there, while being completely type safe and extensible. It's IMHO the best of both worlds.

> Since I've already done a lot of work, show me some realistic and useful benchmark to support your claims?

Duh, I apologize for not even reading your statement completely. So I went on this page and it is exactly how I imagined. libc printf 0.91s, libfmt 0.74s. 20% speedup is not nothing, but won't help when there is an actual bottleneck. (In this case the general approach has to be changed).

Also compiled size is measurably larger even only with a few print statements in the binary. Compile time is f***** 8 times slower!

These are all numbers coming from their own page -- expect to have slightly different numbers for your own use cases.


    (Update: uh, std::format returns std::string. Won't use.)
What type would you prefer?


Generally I use printf/fprintf and snprintf. I would be looking for a replacement for snprintf for the most part, i.e. I want to provide the buffer.

Apparently we can

    template< class... Args >
    void std::print(std::ostream& os, std::format_string<Args...> fmt, Args&&... args );
(sigh of despair) since C++23, so that could be an option if there is an easy way to make a std::ostream of a memory slice. But I'm not on C++23 yet.

Compare to

    int snprintf ( char * s, size_t n, const char * format, ... );
(sigh of relief) and I think it takes a special kind of person to think that std::print is strictly better.


For snprintf(), how do you ensure that your format string and variadic arguments will not cause crash at runtime? The C++ version is compile-time type safe.

I haven't written C++ recently, but I recall that you can use ostringstream in place of ostream. If you don't like std::string, you can probably write your own ostream that will operate on a fixed size char buffer. (Can any C++ experts comment on that idea?)

About "sigh of despair" and "sigh of relief": Are you expressing concern about the template function signature?


> I haven't written C++ recently, but I recall that you can use ostringstream in place of ostream.

I don't know about those specifically right now, but in general these things have huge compile time costs and are also generally less ergonomic IMO. [EDIT: cobbled together a working version and added it to my test below, see Version 0].

> About "sigh of despair" and "sigh of relief": Are you expressing concern about the template function signature?

Yes. It's a mouthful, and I'm worried not only about static checks but about other things too -- like readability of errors, include & link dependencies, compile performance, amount of compiled code (which is minimal in case of snprintf/varargs)... I would need to check out std::format_to_n() as suggested by the sibling commenter.

And hey -- snprintf has been available for easily 30+ years ... while the <print> and <format> headers that people make such a fuss about don't even seem available on gcc or clang on my fully updated Debian bookworm system. The reason is that those implementations aren't complete, even though <format> is C++20. The recommended way to get those headers is to grab https://github.com/fmtlib/fmt as an external library... Talk about the level of hype and lack of pragmatism that's going on around here. People are accusing each other of not using a standard library that isn't even implemented in compilers... And in all likelihood they haven't used the external library themselves, and given that this library is external it's not heavily tested and probably contains bugs still, maybe CRASHES and SECURITY EXPLOITS.

But let me test C++ features that actually exist:

   #if VERSION == 0

   #include <cstdio>   // fwrite, stdout
   #include <iostream>
   #include <streambuf>

   struct membuf: std::streambuf
   {
      membuf(char *p, size_t size)
      {
         setp(p, p + size);
      }
      size_t written() { return pptr() - pbase(); }
   };

   int main()
   {
      char buffer[256];
      membuf sbuf(buffer, sizeof buffer);

      std::ostream out(&sbuf);

      out << "Hello " << 42 << "\n";

      fwrite(buffer, 1, sbuf.written(), stdout);

      return 0;
   }


   #elif VERSION == 1
   #include <sstream>
   #include <iostream>

   void test(std::stringstream& os)
   {
       os << "Hello " << 42 << "\n";
   }

   int main()
   {
      std::stringstream os;
      test(os);
      std::cout << os.str();
      return 0;
   }

   #elif VERSION == 2

   #include <stdio.h>

   int test(char *buffer, int size)
   {
      int r = snprintf(buffer, size, "Hello %d\n", 42);
      return r;
   }

   int main()
   {
      char buffer[256];
      int len = test(buffer, sizeof buffer);
      fwrite(buffer, 1, len, stdout);
      return 0;
   }
   #endif

Compile & link:

                   CT      LT      TT      PT      PL

     -DVERSION=0   0.361s  0.095s  0.456s  0.081s  32573
     -DVERSION=1   0.364s  0.089s  0.453s  0.074s  32338 
     -DVERSION=2   0.039s  0.088s  0.127s  0.031s    918
CT=compile time, LT=link time, TT=total time (CT+LT), PT=preproc time (gcc -E), PL=preprocessor output lines

Bench script:

    # put -DVERSION=0, -DVERSION=1 or -DVERSION=2 as cmdline arg
    time clang++ -c "$@" -Wall -o test.o test.cpp
    time clang++ -Wall -o test test.o
    time clang++ "$@" -Wall -E -o test.preprocessed.txt test.cpp
    wc -l test.preprocessed.txt
My clang version here is 14.0.6. I measured with g++ 12.2.0 as well and the results were similar (with only 50% of the link time for the snprintf-only version).

For such a trivial file, the difference is ABYSMAL. If we extrapolate to real programs we can assume the difference in build times to be 5-10x longer for a general change in programming style. Wait 10 seconds or wait 1 minute. For a small gain in safety, how much are you willing to lose? And how much do this lost time and resources actually translate to working less on the robustness of the program, leaving more security problems (as well as other problems) in there?

And talking about lost run time performance, that is real too if you're not very careful.

> For snprintf(), how do you ensure that your format string and variadic arguments will not cause crash at runtime? The C++ version is compile-time type safe.

Honestly I just don't ensure it perfectly -- beyond running them once as described. I write a lot of code that isn't fully proofed out from the beginning. Exploratory code. A few printfs are really not a concern in there, there are much bigger issues to work out.

I also absolutely do have some printfs that were quickly banged out but that are hidden in branches that have never actually run and might never happen -- they were meant for some condition that I'm not even sure is possible (this happens frequently when checking return values from complicated APIs for example).

The real "problem" isn't that there is a possibly wrong printf in that branch, but that the branch was never tested, and is likely to contain other, much worse bugs. But the fact that the branch was never run also means I don't care as much about it, pragmatically speaking. Likely there is an abort() or similar at the end of the branch anyway. It's always important to put things into perspective like that -- which is something that seems often missing from C++ and similar cultures.

The more proofed out some code gets, the more scrutiny it should undergo obviously.

Apart from that, compilers do check printfs, and I usually get a warning/error when I made a mistake. But I might not get one if I write my own formatting wrappers and am too lazy to explicitly enable the checking.
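For reference, a sketch of how that checking can be enabled on a custom wrapper, assuming GCC or Clang. The `format(printf, ...)` function attribute is a compiler extension, and both `PRINTF_LIKE` and `format_msg` are hypothetical names, not the wrappers from my project.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>

// GCC/Clang extension: argument 1 is the format string, and the variadic
// arguments to check start at argument 2. Expands to nothing elsewhere.
#if defined(__GNUC__) || defined(__clang__)
#define PRINTF_LIKE(fmt_idx, first_arg_idx) \
    __attribute__((format(printf, fmt_idx, first_arg_idx)))
#else
#define PRINTF_LIKE(fmt_idx, first_arg_idx)
#endif

// The attribute goes on the declaration so every caller sees it.
std::string format_msg(const char* fmt, ...) PRINTF_LIKE(1, 2);

std::string format_msg(const char* fmt, ...) {
    char buf[256];
    va_list ap;
    va_start(ap, fmt);
    int n = std::vsnprintf(buf, sizeof buf, fmt, ap);
    va_end(ap);
    assert(n >= 0);
    // Clamp to the buffer in case the output was truncated.
    std::size_t len = static_cast<std::size_t>(n);
    if (len >= sizeof buf) len = sizeof buf - 1;
    return std::string(buf, len);
}

// With -Wall (which includes -Wformat) a call such as
//     format_msg("%d", "oops");
// is diagnosed at compile time, just like a direct printf call.
```

Without the attribute the compiler treats `format_msg` as an ordinary variadic function and checks nothing.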


This is a phenomenal reply. Thank you very much for patiently teaching me. :)


Again, you're mixing functions up. `std::print` is the equivalent to "std::fprintf", the one you want to write on random buffers is `std::format_to_n`, which IS a strictly better version of `snprintf`.


Maybe if you used a typesafe language, you wouldn't have to track down 100 other more significant problems (:


I'm using C/C++, which do provide a good level of type safety.

And no, types are absolutely not my problem. In fact, rigid type systems are a frequent source of practical problems, and they often shift the game to one where you're solving type puzzles -- instead of working on the actual functionality.


Come on, dude. C provides almost no type safety whatsoever. There’s no point in saying “C/C++” here, because you won’t adopt the C++ solutions for making your code actually typesafe.


Type safety is not a standardized term, and it is not binary. Being black and white about things is almost always bad. One needs to weigh and balance a large number of different concerns.

A lot of "modern C++" is terrible, terrible code precisely because of failing to find a balance.

Many C++ "solutions" are broken by design.


Every time I have to sprintf a string_view I die a little bit inside.


There are several CVEs that owe their success to printf's format string.


for basic things, sure. it is much much worse than this when you deal with different encodings for an application that needs to format and print things



