Hacker News new | past | comments | ask | show | jobs | submit login

Actually, I think that most people overestimate the performance of C++. Using C++ the way its authors intend almost invariably leads to gigantic binaries with large amounts of duplicated and unnecessary autogenerated code. It is not too hard to believe that a reasonable JIT can outperform that.



It is not too difficult to generate small binaries from C++ code; modern C++ toolchains have link-time optimization (http://en.wikipedia.org/wiki/Link-time_optimization) and can optimize for size (-Os). Code size is rarely an issue in the C++ projects I work on (numerical and discrete event simulations).


I don't know of any toolchain that is able to collapse multiple instantiations of the same template with different types but otherwise identical code (e.g., std::vector<Foo> and std::vector<Bar>), and in many codebases I've found that these template instantiations are a large part of the resulting binary size.

While it's probably simple to write programs that result in small binaries, it probably involves not using large parts of the language. In my opinion it is often more reasonable to use straight C (which also makes the performance characteristics of a program more obvious in any case).


You mean identical source code or identical assembly code?

I would expect the compiler to strip away typedef'ing, so unless Foo and Bar are really different, only one copy of the code should exist.

Yes, you can probably defeat this heuristic by doing a couple of pathological things, but I expect it to work unless you're actively trying to defeat the system.


Identical assembly. No C++ compiler that I'm aware of does that.

e.g., std::vector<Foo*> and std::vector<Bar*> generate identical code whatever you do, as they both only deal with the pointer and never dereference it in any way. Also, in code like

    template <class T> T get(T x[], int i) { return x[i]; }
on a 32-bit platform, get<int>() and get<char*>() generate the same machine code, whereas on a 64-bit platform, get<long>() and get<void*>() generate the same machine code.

No C++ compiler that I'm aware of folds these cases into one; C# generics do (not sure about non-generic code with identical final machine code). I would be surprised if idiomatic C++ code could be shortened by more than 50% by a compiler that did. However, if you write your code for column-major access, the binary could shrink by 90% if the compiler did that.



The C++ committee's answer to this is to use "partial template specialization". I'm hoping all the standard libraries I use, such as Boost etc., have already done this to avoid code bloat.

But I agree that it would be nice if the compiler did this automatically.


C++ was designed so that only the features you use have overhead, so it is fairly easy to write the most computationally intensive parts in C style while using higher-level abstractions everywhere else. This is something I missed a lot in Java when trying to write high-performance code.


But most of this design was done 20 years ago, so the assumptions it made about CPU performance are mostly wrong for both modern CPUs and modern operating systems. For example, the overhead of calling a virtual and a non-virtual method across a library boundary on an ELF-based system is essentially the same on a reasonably modern CPU.


This is a very minor detail: an optimization that was made 20 years ago is no longer relevant, but neither is it harmful or especially expensive.


> Using C++ the way authors intend

Like I said, you can write horribly inefficient C++ code if you really want, but that's not a fault of the tool. The fact remains that you can write small and efficient C++ code, but it's devilishly hard to do in Java.

Or C#, BTW ;-)


There are also many optimizations that a JIT can do that cannot be done with a static compiler.


The optimizations will basically claim back the inefficiencies introduced by the compiler/VM stack. I am not sure those run-time optimizations allow you to beat reasonably optimal C++ code. Remember that many JIT optimizations apply only to interpreted code and would not be required with native code.


>Remember that many JIT optimizations apply only to interpreted code and would not be required with native code.

No, I'm talking about things like being able to skip virtual table lookups for method calls; you can't do that in C++, and you can't short-circuit them either. You can in a JIT. You also get superior garbage-collection performance with a JIT compared to static compilation.


Many of them are possible in theory. Few are implemented in practice, and even fewer apply in code found in the wild.

e.g., promoting an Integer[] array to an int[] array is something that Java will not do, although in some cases it is theoretically possible (and LuaJIT2 comes close to doing something equivalent).



