The compiler cannot ignore an unused dependency. The compiler must first process...

zzzcpan · on June 17, 2013

I don't know how the Go compiler works, but I don't see a problem detecting unused dependencies very early just by walking the AST of the current package, since in Go you have to specify the package name explicitly in order to use something from it (well, not all the time, but almost all the time).

And even this is probably unnecessary: you only have to deal with the unused dependencies of your current package, since other packages have already dealt with theirs.

Anyway, it's just a technical problem with a few simple solutions.

scott_s · on June 17, 2013

Packages are processed before the current package is compiled. You can only establish that packages are unused after compiling the current package. So, you will still process unused packages. That is exactly the problem they want to avoid.

Your solution is, I think, to scan the current package twice: once to establish which packages are used, then process the packages, then actually compile the current package. But now you're scanning through each package twice, which will significantly add to compilation times, and make the compilation process more convoluted. Their solution is, I think, far simpler.

You're dismissing the problem, but I'm not sure you understand what the problem is.

zzzcpan · on June 17, 2013

I just checked: they actually do it very early, similar to what I suggested, but instead of ignoring the import and producing a warning they simply throw yyerror, and they do that during parsing, without even walking a tree. So I'm still correct. You can verify it by executing "go run .." multiple times with and without the import error; you'll notice how fast it generates the "imported and not used" error compared to actual compilation.

And the whole point of their solution has absolutely nothing to do with unused imports and variables. They are just reusing object files without recompiling and they can't have circular dependencies to do that. There is nothing more to it.

scott_s · on June 17, 2013

Based on your statements, I am not confident that you understand the compilation process and the terminology we use to describe it. In particular, I don't see how you can say they don't "walk a tree," but do it "during parsing." Nor is timing the compiler sufficient to determine this kind of behavior.

In order to establish that a particular package was unused, they must: at least parse the header for that package, and parse the entirety of this package. Parsing the unnecessary package is the problem they want to avoid. Because that package could also have unnecessary packages, in large projects you could end up parsing more unnecessary packages than necessary packages.

This isn't even considering the building process, which is what actually produces the headers that get parsed for packages. So, if you import a package and it has not been built yet, it will build it. If that was an unnecessary package, then there's even more wasted time.

zzzcpan · on June 17, 2013

What can possibly be so confusing about parsing and walking a tree? The scanner (lexer) and parser are the very first stages of compilation. The parser produces a syntax tree and some other stuff, like a symbol table; there is no alternative terminology. Walking a tree cannot have multiple meanings either. Most of the work is done after parsing. And timing a compiler is sufficient to understand exactly that, since you claimed that detecting unused dependencies requires the package to be compiled, which is of course total bullshit. For most unused packages even parsing is not required: you can look at symbols during parsing, set a "used" flag for every package that was used, and only parse a package if it has this flag set.

And parsing is not what they want to avoid. Parsing is cheap. They want to avoid full-blown compilation of unused packages.

scott_s · on June 18, 2013

I know what parsing and walking a tree is. And parsing is walking a tree. It requires going over the entire source, which is my point: they're trying to avoid processing unnecessary code.

You do not need to do code generation to establish what the unused packages are. But you must parse and perform semantic analysis of the entire package to determine what the unused packages are. There is more to "compiling" than code generation and optimization.

Parsing is what they want to avoid, not just code generation. Parsing is cheap for small projects. But nothing is cheap at scale. From the talk I have already linked to:

The construction of a single C++ binary at Google can open and read hundreds of individual header files tens of thousands of times. In 2007, build engineers at Google instrumented the compilation of a major Google binary. The file contained about two thousand files that, if simply concatenated together, totaled 4.2 megabytes. By the time the #includes had been expanded, over 8 gigabytes were being delivered to the input of the compiler, a blow-up of 2000 bytes for every C++ source byte.

As another data point, in 2003 Google's build system was moved from a single Makefile to a per-directory design with better-managed, more explicit dependencies. A typical binary shrank about 40% in file size, just from having more accurate dependencies recorded. Even so, the properties of C++ (or C for that matter) make it impractical to verify those dependencies automatically, and today we still do not have an accurate understanding of the dependency requirements of large Google C++ binaries.

If Go allowed including unused packages, you would end up with the exact same problem.

zzzcpan · on June 18, 2013

I don't think you get it. You cannot end up with the same problem if you ignore unused dependencies and prevent circular ones, it's just not possible.

scott_s · on June 18, 2013

That's my entire point: you can't ignore unused dependencies. I've explained at length why this is the case - the Go designers have also explained this. Simply, you must look at imported packages before you look at this package. You can only establish that a package is unused in this package after looking at all code in this package. Hence, you can only recognize that a package is unused after you have looked at it.

Yes, you can avoid generating code for that package. But the designers of Go want to avoid looking at the package at all. They have collected data from inside Google which clearly establishes that this is a problem.

zzzcpan · on June 18, 2013

You and the Go designers are misleading people by making false statements.

You don't have to look into the package to ignore it. I even explained to you how exactly to do it, but you keep repeating this bullshit. You have to look into the package, and therefore parse it, in one case only: when the package is imported into the current namespace, which is almost never the case in Go. You can detect and ignore an unused package by looking no farther than the current package almost all the time, because each imported package uses a unique namespace. If no symbols with that namespace are present in the current package, you don't have to look into the package with that namespace; it's that simple. And gc (Go's compiler) already detects unused packages in a similar way.

scott_s · on June 20, 2013

And as I explained already, that complicates, and probably lengthens, the parsing and semantic analysis. (Instead of establishing the contents of a package before processing the current package, you have to track what symbols this package has used from other packages, and then verify they are used correctly, or exist at all.) Which goes against their design goal.