Looking at the "decompiled" version, it's clearly cubic, not exponential. There are 6 type arguments to the outer instantiation of 'Class' in the field declaration; each of those arguments is itself an instantiation of 'Class' with 6 type arguments, each of which is a third level of instantiation of 'Class' with 6 type variable references, for a total of 216 (= 6^3) type variable references.
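To make that count concrete, here's a quick back-of-the-envelope sketch (made-up Java with hypothetical names; it only mimics the shape described, not the article's actual code) that spells out such a nested instantiation and counts the type variable references. With arity 6 and depth 3 it reports 216, and varying the depth instead shows the spelled-out length growing roughly as 6^depth:

    // Hypothetical reconstruction of the nesting pattern described above.
    public class TypeRefCount {
        // Spell out a 'Class' instantiation with the given arity, nested to the given depth.
        static String instantiation(int arity, int depth) {
            if (depth == 0) return "T";              // a bare type variable reference
            StringBuilder sb = new StringBuilder("Class<");
            for (int i = 0; i < arity; i++) {
                if (i > 0) sb.append(", ");
                sb.append(instantiation(arity, depth - 1));
            }
            return sb.append('>').toString();
        }

        public static void main(String[] args) {
            String s = instantiation(6, 3);
            long typeVars = s.chars().filter(c -> c == 'T').count();
            System.out.println("type variable references at arity 6, depth 3: " + typeVars); // 216 = 6^3
            for (int depth = 1; depth <= 6; depth++)  // vary the depth instead: length ~ 6^depth
                System.out.println("depth " + depth + ": " + instantiation(6, depth).length() + " chars");
        }
    }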
There's nothing exponential going on here. Note that in the expression 6^3, the number that's varying is the 6, not the 3. If it were the other way around, that would be exponential.
I realize "exponential" has come into common use meaning just "growing very fast", and it's sometimes pedantic to insist on a distinction between polynomial growth and exponential growth, but in a discussion of algorithmic time complexity, we should get it right!
Ah, I see — it's cubic in the number of type parameters, but it's exponential in the depth of the type reference in the field declaration. My bad; I didn't read carefully enough.
Well, if you vary the generic arity N, you are going to get N^3, which is exponential.
If you vary the nesting level M, you are indeed going to get 6^N, which is 'just' polynomial.
However, if you just look at the code sample, in most cases it is assumed that both N and M vary, and if both of them grow with the same limiting behavior, then you get O(N^N) complexity. That is even worse than factorial!
Oh crap... yeah. I meant O(N^3) and O(6^M). If both are varied, it's O(N^M); if both increase with the same order of magnitude, the complexity can be written as O(N^N), which is O(N! * e^N / sqrt(N)).
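For anyone wondering where that last identity comes from, it is just Stirling's approximation rearranged:

    N! \sim \sqrt{2\pi N}\,\left(\frac{N}{e}\right)^{N}
    \quad\Longrightarrow\quad
    N^{N} \sim \frac{N!\,e^{N}}{\sqrt{2\pi N}} = O\!\left(\frac{N!\,e^{N}}{\sqrt{N}}\right)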
> Determining whether a given predicate has a set of valuations for its variables which make it true is an NP-complete problem; actually finding the set of valuations is an NP-hard problem. Therefore, overload resolution in C# 3.0 is at the very least NP-hard. There is no known polynomial-time algorithm for any NP-complete or NP-hard problem and it is widely believed that there is none to be found, though that conjecture has yet to be proven. So our dream of fast lambda type analysis in C# 3.0 is almost certainly doomed, at least as long as we have the current rules for overload resolution.
See Lambda Expressions vs. Anonymous Methods, Part Five[0] by Eric Lippert.
If the concern is regarding real-world performance and not DoS attacks, then why can't they just adopt the conflict-driven clause learning techniques of practical SAT solvers? The corresponding SAT instances are quite small, so the problems should be handled quickly by even a relatively naive CDCL implementation.
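For a sense of scale, here is a minimal sketch of plain DPLL backtracking with unit propagation, the core that CDCL extends with clause learning and non-chronological backtracking. The integer clause encoding and the example formula are made up for illustration, not the compiler's actual overload-resolution constraints, but instances of this size are dispatched essentially instantly even without learning:

    import java.util.*;

    // Minimal DPLL-style SAT sketch (not full CDCL: no learned clauses, no smart heuristics).
    // A clause is an array of non-zero ints: literal v means "variable v is true", -v means false.
    public class TinySat {
        // Truth value of a literal under a partial assignment, or null if its variable is unassigned.
        static Boolean value(Map<Integer, Boolean> assignment, int literal) {
            Boolean v = assignment.get(Math.abs(literal));
            if (v == null) return null;
            return literal > 0 ? v : !v;
        }

        static boolean solve(List<int[]> clauses, Map<Integer, Boolean> assignment) {
            // Unit propagation: keep assigning literals that are forced by unit clauses.
            boolean changed = true;
            while (changed) {
                changed = false;
                for (int[] clause : clauses) {
                    int unassigned = 0, lastFree = 0;
                    boolean satisfied = false;
                    for (int lit : clause) {
                        Boolean v = value(assignment, lit);
                        if (v == null) { unassigned++; lastFree = lit; }
                        else if (v) { satisfied = true; break; }
                    }
                    if (satisfied) continue;
                    if (unassigned == 0) return false;                 // conflict: clause is all-false
                    if (unassigned == 1) {                             // unit clause: the literal is forced
                        assignment.put(Math.abs(lastFree), lastFree > 0);
                        changed = true;
                    }
                }
            }
            // Branch on some unassigned variable and recurse on both truth values.
            for (int[] clause : clauses)
                for (int lit : clause)
                    if (value(assignment, lit) == null) {
                        int branchVar = Math.abs(lit);
                        for (boolean guess : new boolean[] { true, false }) {
                            Map<Integer, Boolean> copy = new HashMap<>(assignment);
                            copy.put(branchVar, guess);
                            if (solve(clauses, copy)) { assignment.putAll(copy); return true; }
                        }
                        return false;
                    }
            return true;                                               // everything assigned, no conflict
        }

        public static void main(String[] args) {
            // (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
            List<int[]> clauses = List.of(new int[] { 1, 2 }, new int[] { -1, 3 }, new int[] { -2, -3 });
            Map<Integer, Boolean> assignment = new HashMap<>();
            System.out.println(solve(clauses, assignment) ? "SAT " + assignment : "UNSAT");
        }
    }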
> So if it takes 3 mins to compile your code, allocates 2GB of memory and then crashes, take that as a warning!!
As someone who used to spend a lot of time in C++ and LTCG, I got a good laugh out of this (aside from the crashing part; ICEs are always a bad thing). Our release link was ~40 minutes on each change.
Yeah, I guess I should've added 'So if it takes 3 mins to compile your code .. in the C# compiler'
But either way, I'm glad you enjoyed it.
> As someone who used to spend a lot of time in C++ and LTCG, I got a good laugh out of this (aside from the crashing part; ICEs are always a bad thing). Our release link was ~40 minutes on each change.
A long time ago I did some C++, so I know it can take longer, although I don't think I ever worked on something with 40 min build times!
The 40 minute link time comes from Link Time Code Generation, which can reduce your binary size pretty significantly (along with some code speed-ups)[1]; without LTCG, linking the project took about 1 minute.
Usually LTCG is a tool of last resort since the build times are so painful (and at the time it was the only step that couldn't be parallelized). We were shipping on X360 with a hard memory limit and couldn't have a binary larger than 20MB if we wanted to hit it. If I recall correctly, LTCG gave us back about 5-10MB, which was a godsend in those days.
gcc, clang, and Intel all support this feature, calling it "Link-Time Optimization".[1,2,3] This allows them to inline virtual functions they can resolve and non-inline functions that are not visible to external units, as well as perform optimizations across compilation units by being able to "see" more of the code together at once.
In addition to binary size (which I've never worried about too much, as I mostly write code for scientific computing), it has often improved runtime significantly. The only other specific optimization besides -flto that I have had very real speedups from (not counting -O[23] or -Ofast) has been -funroll-loops. I use it on nearly every project. (And, to be honest, I haven't seen a big slowdown in compile time over just -O3, though I think a lot of my costs there are from templates.)
The Don Syme article (first link) mentions "the critical nature of the interaction that app domains pose to generics". What interaction is that, and what is critical about its nature?
I did a quick search in AppDomain.cpp [0] for 'generic' and there are quite a few comments/variables that mention it. Maybe this [1] is one of the tricky interactions?
This is CoreCLR, a much younger version of the CLR that does not support multiple AppDomains.
In a different open-sourced CLR implementation, SSCLI20 (aka Rotor), there are similar interactions for a type's domain discovery and the DynamicClassTable, but there are also some interactions around GCStatics.
I'm not sure if the full-blown .NET CLR implementation has been released yet.
I kinda guess this has something to do with the System.__Canon type, which is a substitute type for an open generic parameter type; there is no benefit in having different canon types per AppDomain.
When you want to show that something grows exponentially, plot it with a log scale on the y axis. That way, an exponential function becomes a line, and humans are much better at judging whether something is a line than judging whether something is an exponential function.
In fact, whenever you want to honestly show that something follows a certain law, scale your axes so that you get a line in the best case.
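Concretely: an exponential becomes a straight line on a semi-log plot, while a power law only straightens out on a log-log plot, which is what lets you tell the two apart by eye:

    y = c\,b^{x} \;\Rightarrow\; \log y = \log c + x \log b \quad \text{(linear in } x\text{)}
    y = c\,x^{k} \;\Rightarrow\; \log y = \log c + k \log x \quad \text{(linear only against } \log x\text{)}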
Indeed, especially considering how bad the exponential fit is for working set, it is pretty evident that it is not just growing exponentially, but faster than that.
Of course the article just had to include this bit:
> If we look at these results in graphical form, it’s very obvious what’s going on
When it really is not obvious at all what is going on.
> Indeed, especially considering how bad the exponential fit is for working set, it is pretty evident that it is not just growing exponentially, but faster than that.
I suck at maths too (we should start a club!), but playing around with Calc the best fit I could get (for working set) was a quadratic exponential, i.e. of the form e^(n^2+n). It would be interesting if someone who actually knows anything about this could chime in and say conclusively what the growth rate really is here.
I think in this case the linear graph does a better job of showing the 'holy crap this exploded' effect than a logarithmic graph would. That it is exponential is more of a neat aside.
It really depends on what you're after: dramatic effect, or convincing somebody of the actual relation (here that it's exponential, and not, say, x^5 or so)
The type generates a string that requires more than 65535 bytes to encode in UTF-8 format in the constant pool.
This is using Eclipse. So it is the Eclipse compiler which is integrated with the editor. I haven't tried Oracle's Java compiler.
Because the problem has to do with the size of a class's constant pool, the limit of four Inners should be universal across Java compilers, IDEs, and operating systems. (Assuming the class name mangling algorithm is part of the class file specification.)
Hmm, I would've thought that this would be easier for the Java compiler to handle, because it uses type erasure for generics. So in metadata, everything is just 'Object' rather than exact types, but I'm no Java expert, so I could be wrong?
Most parts are erased (the method descriptors, the bytecode), but the generic signature is kept to enable separate compilation (you can also retrieve it by reflection).
Here the generic signature, which is a String, is just too big: the dictionary part of the classfile, the constant pool, stores a String as a UTF-8 byte array prefixed by its size on 16 bits (hence the 65535 limit).
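As a small illustration of what survives erasure (a hypothetical field, nothing to do with the article's type): the erased descriptor of the field below is just Ljava/util/Map;, but the Signature attribute kept in the constant pool still records the full parameterization, and reflection can read it back:

    import java.lang.reflect.Field;
    import java.util.List;
    import java.util.Map;

    public class SignatureDemo {
        // Erased to Map in the descriptor; the full generic signature is kept as a constant-pool string.
        Map<String, List<Integer>> field;

        public static void main(String[] args) throws Exception {
            Field f = SignatureDemo.class.getDeclaredField("field");
            System.out.println(f.getType());         // interface java.util.Map  (the erased view)
            System.out.println(f.getGenericType());  // java.util.Map<java.lang.String, java.util.List<java.lang.Integer>>
        }
    }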
Thanks for the info on Java generics; I always assumed that it 'erased' everything. I didn't appreciate that it would actually need to keep some info around to make things work.
The limit is the class (or method) name being bigger than 65535 bytes in this case; it has nothing to do specifically with generics, because of type erasure.
When I think about the generics discussion in the Go community (and even though I am personally for them), this is the kind of exponential complexity that I wouldn't want to see in any implementation thereof.