> and at the end of the day LLVM compiles 30min and uses tens of GBs of RAM on average hardware
I mean, that's the initial build.
Here's my compile-edit-run cycle in https://ossia.io, which is nearing 400 kloc, with a bonus example of performance profiling; I haven't found anything like this whenever I had to profile Python. It's not LLVM-sized of course, but it's not a small project either, maybe on the lower end of medium-sized C++ projects: https://streamable.com/o8p22f ; pretty much a couple of seconds at most from keystroke to result, for a complete DAW which links against Qt, FFMPEG, LLVM, Boost and a few others. Notice also how my IDE kindly informs me of memory leaks and other funsies.
Here's some additional tooling I'm developing - build times can be made as low as a few dozen milliseconds when one puts some work into designing the right API and using the tools correctly: https://www.youtube.com/watch?v=fMQvsqTDm3k
"10 compilers, IDEs, debuggers, package managers" what are you talking about? (Virtually) No one uses ten different tools to build one application. I don't even know of any C++-specific package managers, although I do know of language-specific package managers for... oh, right, most scripting languages. And an IDE includes a compiler and a debugger, that's what makes it an IDE instead of a text editor.
"and at the end of the day LLVM compiles 30min and uses tens of GBs of RAM on average hardware" sure, if you're compiling something enormous and bloated... I'm not sure why you think that's an argument against debloating?
>No one uses ten different tools to build one application.
I meant that you have a lot of choices to make.
Instead of having one strong standard that everyone uses, you have X of them, which makes changing projects/companies harder. Is there a solid reason for that? I don't know.
>"and at the end of the day LLVM compiles 30min and uses tens of GBs of RAM on average hardware" sure, if you're compiling something enormous and bloated... I'm not sure why you think that's an argument against debloating?
I know that lines of code in a repo aren't a great way to compare these things, but:
.NET Compiler Infrastructure:
20 587 028 lines of code in 17 440 files
LLVM:
45 673 398 lines of code in 116 784 files
The first one I built (restore + build) in 6 minutes, and it used around 6-7 GB of RAM.
The second one I'm not even trying, because the last time I attempted it on Windows it BSODed after using the _whole_ RAM (16 GB).
Compiling a large number of files on Windows is slow, no matter what language/compiler you use. It seems to be a problem with program invocation, which takes "forever" on Windows. It's still fast for a human, but it's slow for a computer. Quite apt that this comes up here ;-)
Source for the claim: that's a problem we actually faced in the Windows CI at my old job. Our test suite invoked about 100k to 150k programs (our program plus a few 3rd-party verification programs). In the Linux CI the whole thing ran reasonably fast, but the Windows CI took twice as long. I don't recall the exact numbers, but if Windows incurs a 50 ms overhead per program call, you're looking at roughly 1:20 (one hour twenty minutes) of extra runtime at 100k invocations.
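The arithmetic is easy to check, and the per-invocation overhead itself is simple to measure. Here's a minimal sketch of mine (not from the original CI setup) that times repeated spawns of a trivial child process; "true" is assumed to be a no-op program on the PATH, so on Windows you'd substitute something like "cmd /c exit":

    // Rough estimate of per-invocation process-spawn overhead.
    #include <chrono>
    #include <cstdio>
    #include <cstdlib>

    int main() {
        constexpr int runs = 1000;
        const auto start = std::chrono::steady_clock::now();
        for (int i = 0; i < runs; ++i)
            std::system("true");  // spawn one trivial child process per iteration
        const auto stop = std::chrono::steady_clock::now();
        const double ms = std::chrono::duration<double, std::milli>(stop - start).count();
        // At ~50 ms per call: 100,000 * 0.05 s = 5,000 s, i.e. roughly 83 minutes.
        std::printf("%.2f ms per invocation\n", ms / runs);
    }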
Also, I'm pretty sure I've built LLVM on 16 GB of memory. Took less than 10 minutes on an i7-2600. The number of files is a trade-off: you can combine a bunch of small files into a large file to reduce the build time. You can even write a tool that does that automatically on every compile (and keeps sane debug info). But now incremental builds take longer, because even if you change only one small file, the combined file needs to be rebuilt. That's a problem for virtually all compiled languages.
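For illustration, this is what such a combined ("unity"/jumbo) translation unit typically looks like; the file names here are made up:

    // unity_audio.cpp - hypothetical generated file that pulls several
    // translation units into one, trading incremental rebuild speed for
    // fewer compiler invocations and less redundant header parsing.
    #include "mixer.cpp"
    #include "resampler.cpp"
    #include "metering.cpp"
    // Compiling this one file replaces three compiler invocations, but
    // touching any of the three sources now recompiles all of them.

Build systems have started to automate this; CMake, for instance, has had built-in unity-build support (the UNITY_BUILD target property) since 3.16, if I remember correctly.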
I can only guess; I am neither an LLVM nor an MSVC dev.
1. Compile times: If you have one file with 7000 LOC and change one function in that file, the rebuild is slower than if you had 7 files with 1000 LOC each instead.
2. Maintainability: Instead of putting a lot of code into one file, you put the code in multiple files for better maintainability. IIRC LLVM was FOSS from the beginning, so making it easy for lots of people to make many small contributions is important. I guess .NET was conceived as being internal to MS, so fewer people overall, but newcomers were probably assigned to a team for onboarding and then contributed to the project as part of that team. In other words: at MS you can call up the person or team responsible for that 10000 LOC monstrosity; but if all you've got is a bunch of names with e-mail addresses pulled from the commit log, you might be in for a bad time.
3. Generated code: I don't know if either commit generated code into the repository. That can skew these numbers as well.
4. Header files can be a wild card, as it depends on how they're written. Some people/projects just put the signatures in there and not much detail; others put whole essays as docs for each {class, method, function, global}, making them huge.
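As a made-up contrast for point 4, here's the same declaration in a terse header versus a documentation-heavy one:

    // Variant A: signature only - a couple of lines per entity.
    int clamp_gain(int db);

    // Variant B: essay-style docs per entity - easily 10x the line count.
    /// Clamps a gain value to the range supported by the mixer.
    ///
    /// \param db  Gain in decibels, as read from the session file. Out-of-range
    ///            input is not an error; it is simply clamped.
    /// \returns   The gain clamped to [-96, +12] dB.
    /// \note      Thread-safe; performs no allocation.
    int clamp_gain(int db);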
For the record, by your stats .NET has 1180 LOC per file on average and LLVM 391. That doesn't say a lot; the median would probably be better, or even a percentile graph, broken down by type (header/declaration vs. implementation). You might find that the distribution is similar and a few large outliers skew it (especially generated code). Or, when looking at more big projects, you might find that these two are the outliers. I can't say anything definite, and from an engineering perspective I think neither is "suspicious" or even bad.
My gut feeling says 700 would be a number I'd expect for a large project.
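To make the median/percentile idea concrete, here's a rough sketch of my own (file extensions and output format are assumptions) that walks a checkout with C++17 <filesystem> and prints per-file LOC percentiles, split into headers and implementation files:

    #include <algorithm>
    #include <cstdio>
    #include <filesystem>
    #include <fstream>
    #include <initializer_list>
    #include <string>
    #include <vector>

    namespace fs = std::filesystem;

    static bool has_ext(const fs::path& p, std::initializer_list<const char*> exts) {
        for (auto e : exts)
            if (p.extension() == e) return true;
        return false;
    }

    static long count_lines(const fs::path& p) {
        std::ifstream in(p);
        std::string line;
        long n = 0;
        while (std::getline(in, line)) ++n;
        return n;
    }

    int main(int argc, char** argv) {
        const fs::path root = argc > 1 ? argv[1] : ".";
        std::vector<long> headers, sources;
        for (const auto& entry : fs::recursive_directory_iterator(root)) {
            if (!entry.is_regular_file()) continue;
            const auto& p = entry.path();
            if (has_ext(p, {".h", ".hpp"}))               headers.push_back(count_lines(p));
            else if (has_ext(p, {".c", ".cc", ".cpp"}))   sources.push_back(count_lines(p));
        }
        auto report = [](const char* name, std::vector<long> v) {
            if (v.empty()) return;
            std::sort(v.begin(), v.end());
            std::printf("%s: n=%zu p50=%ld p90=%ld max=%ld\n", name, v.size(),
                        v[v.size() / 2], v[v.size() * 9 / 10], v.back());
        };
        report("headers", headers);
        report("sources", sources);
    }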
I assume the parent was talking about the fragmentation in the ecosystem (a fair point, especially regarding the package-management landscape and build tooling), but it's unclear.
> Is performance inversely proportional to dev experience?
No. I feel there is great developer experience in many high performance languages: Java, C#, Rust, Go, etc.
In fact, for my personal tastes, I find these languages more ergonomic than many popular dynamic languages. Though I will admit that one thing that I find ergonomic is a language that lifts the performance headroom above my head so that I'm not constantly bumping my head on the ceiling.
TCC is a fast compiler. So fast that, at one time, one could use it to boot Linux from source code! But there's a downside: the code it produces is slow. There's no optimization done. None. So the trade-off seems to be: compile fast but get a slow program, or compile slowly but get a fast program.
The trade-off is more of a gradient: e.g. PGO allows an instrumented binary to collect runtime statistics and then use those to optimize hot paths for future build cycles.
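As a sketch of that gradient, here's a minimal example of my own using Clang's documented PGO flags (file and workload names are placeholders):

    // hot.cpp - tiny program with one dominant branch; the profile tells the
    // optimizer which path is hot so it can lay out and inline for it.
    //
    // Typical cycle:
    //   clang++ -O2 -fprofile-instr-generate=hot.profraw hot.cpp -o hot   (instrumented build)
    //   ./hot                                                             (run a representative workload)
    //   llvm-profdata merge -o hot.profdata hot.profraw                   (convert the raw profile)
    //   clang++ -O2 -fprofile-instr-use=hot.profdata hot.cpp -o hot       (rebuild using the profile)
    #include <cstdio>

    long work(long n) {
        long acc = 0;
        for (long i = 0; i < n; ++i)
            acc += (i % 97 == 0) ? i * 3 : i;  // the "else" branch dominates at runtime
        return acc;
    }

    int main() { std::printf("%ld\n", work(50000000)); }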
because what you wrote could be said about using C++ in the context of dev experience
10 compilers, IDEs, debuggers, package managers
and at the end of the day LLVM compiles 30min and uses tens of GBs of RAM on average hardware
I don't believe that this is the best we can get.