Just want to highlight a couple points of caution/clarification by groob from the reddit discussion[1].
In response to "What conclusion should I draw from this?":
> I don't think the lesson here is to judge these frameworks. The graph alone is not an indication of quality, there are other things to consider. I think the lesson is that goviz is awesome and you should run it against your own projects, where you know the code well. Having tightly coupled code should be avoided. A dependency graph can pinpoint areas of your application that are in need of refactoring, as well as code that is either easy or hard to remove.
In response to "what's the deal with net/http?"
> Please don't draw the wrong conclusion from these graphs. Vault is a complex, production-ready application which does much more than your typical CRUD app. I find the vault http code to be of good quality, and written in an idiomatic style. If I was learning how to write APIs in Go today, I'd look at the vault http package to learn from.
It would be interesting to see a graph broken down by author (or by people with publishing and commit rights), to see what sort of attack surface we are looking at here.
E.g. Beego seems to rely only on astaxie's own packages (and the Go standard library).
That was my primary takeaway from the NPM debacle: injecting malicious code into one repository among hundreds of authors of small packages is a real risk, whereas it is much smaller when there are only a few dependency authors.
It would be interesting to see this kind of graph created for other languages like Python, Node.js, etc. to see how they all compare. It would be neat to visualise which language/framework has the worst case of dependency hell.
Rust is somewhere between Ruby and Node here: leaning towards small modules (one of my crates is in this graph, and it exports four functions), but not to the same level.
> It would be neat to visualise which language/framework has
> the worst case of dependency hell
So, one thing I've come to realize is that different people have different opinions on what "dependency hell" even means. If you have a lot of dependencies, but your tooling reliably makes it easy to get them, build them, and upgrade them, is that hell?
There are a number of objective measures of cyclomatic complexity in software. These metrics show that the higher the complexity, the lower the cohesion of the code[1].
Code that is complex, and has low cohesion, is harder to understand, and therefore harder to change. It's the elephant you have to push on every time you want your program to do something new[2].
"Dependency hell" might be subjective, but tools that reduce the upfront cost of increased dependencies don't remove the other burdens from you, the developer. In fact they often allow you to produce an impenetrable, unrecoverable tangle more quickly than doing without them.
Edit: Just realized who I responded to... "But, you knew all that."
So, what's interesting is, I would often consider many small bits to have a _lower_ cyclomatic complexity number. That is, whenever I've used tools that measure this kind of thing, the solution is always to take the big things and break them up into many, smaller bits. It's possible that this is bias in the tooling, though.
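The "break the big thing into smaller bits" advice those tools give can be sketched concretely. The shipping-cost example below is hypothetical: complexity metrics count branches per function, so the monolithic version scores all four branches against one function, while the split version gives each small function a low per-function score, even though the total logic is unchanged.

```go
package main

import "fmt"

// shippingMonolithic packs every pricing rule into one function; a
// cyclomatic complexity tool counts all of its branches against it.
func shippingMonolithic(weightKg float64, express bool) float64 {
	base := 0.0
	if weightKg <= 1 {
		base = 5
	} else if weightKg <= 10 {
		base = 10
	} else {
		base = 20
	}
	if express {
		base *= 2
	}
	return base
}

// The same logic split into small pieces: each function now has a low
// per-function complexity score, which is what most tools report on.
func baseRate(weightKg float64) float64 {
	switch {
	case weightKg <= 1:
		return 5
	case weightKg <= 10:
		return 10
	default:
		return 20
	}
}

func applyExpress(rate float64, express bool) float64 {
	if express {
		return rate * 2
	}
	return rate
}

func shippingModular(weightKg float64, express bool) float64 {
	return applyExpress(baseRate(weightKg), express)
}

func main() {
	fmt.Println(shippingMonolithic(5, true), shippingModular(5, true))
}
```

Whether the second version is genuinely simpler or merely scores better is exactly the preference split discussed below: the branches have not disappeared, they have been distributed.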
I think some of this comes down to individual preference as well. It's like that joke: would you rather fight one horse-sized duck or 100 duck-sized horses? In this admittedly very stretched metaphor, the former are relatively monolithic codebases, and the latter are relatively modular ones. I know that I used to prefer a single hundred-line class to ten ten-line classes, but now I much prefer the latter. My experience talking to people about this is that people fall somewhere on this line, often in different places, and that makes it harder to understand each other. The action that I'm taking to reduce complexity can often be perceived as increasing complexity, depending on where the other person falls on that line.
Lots-of-little-ones (LOLO? LOLO) is the right answer in a number of cases. One big god service? No; microservices (lots-of-little-ones). Large many-lined functions with copious branching and conditionals? No, LOLO. Big teams with multi-hour status meetings? No... LOLO. One big integration at the end of the project? No...
That dependency diagram for Servo actually looked relatively clean. You can clearly see which elements are library or utility code. The visualization would probably be better as a three-dimensional model with weighting.
For OO practice, state of the art is SOLID (with a sprinkling of RAII if you happen to be using C++). SOLID leads you straight down the path of LOLO. Small increments FTW.
Dependency hell for me is when I have one target depending on two or more libraries that result in conflicts that cannot be resolved without resorting to butchering one or more of the libraries.
One nice example that springs to mind: there is a package in some Linux C library that defines a 'list', which in turn conflicts with lots of other libraries and applications that also wish to define 'list', but in their own scope.
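The 'list' collision above is a symptom of C's single flat namespace. For contrast, a sketch of how Go sidesteps this class of conflict: two stdlib packages are both literally named `rand`, and import aliases let them coexist in one file. (The helper function names here are invented for illustration.)

```go
package main

import (
	crand "crypto/rand" // package name is "rand"
	"fmt"
	mrand "math/rand" // also named "rand"; the alias avoids the clash
)

// randomBytes uses the crypto "rand"; without the aliases above,
// these two same-named packages could not be imported together.
func randomBytes(n int) []byte {
	buf := make([]byte, n)
	if _, err := crand.Read(buf); err != nil {
		panic(err)
	}
	return buf
}

// seededInt uses the math "rand": a deterministic PRNG, suitable
// for simulations but not for secrets.
func seededInt(seed int64, n int) int {
	return mrand.New(mrand.NewSource(seed)).Intn(n)
}

func main() {
	fmt.Println(len(randomBytes(8)), seededInt(42, 100))
}
```

In C, the equivalent collision has to be resolved by renaming symbols or wrapping headers, which is the "butchering" the comment describes.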
> but your tooling reliably makes it easy to get them, build them, and upgrade them, is that hell?
Dependencies are never free. If you have a bunch of dependencies and don't have many resources, it's still hell.
Managing dependencies is not hard because of the tooling. It's hard because a dependency upgrade could cause pain, or the current version is buggy but the fix inside the upgrade works, but there will be another bug and so on.
Right, this is what I mean: different people mean different things by "dependency hell". Difficulty of upgrading is certainly a possible meaning, but it's not a unified term.
Actually, people just trust other people too much. But since you don't know anything about those other people, you should not trust them, and especially not their ability to write good code.
> It would be neat to visualise which language/framework has the worst case of dependency hell
I've worked on very large projects in Ruby, Node.js, and Java. In my experience, Java programs tend to have the most dependencies, especially if you use Spring, but also the best tools for managing dependencies.
People like to rag on Maven, and there is a lot of truth to the criticisms, but I've had a much easier time resolving dependency problems with Maven than with other systems, and I find that it helps me avoid a lot of problems proactively. For example, you can exclude transitive dependencies from imports, if they will conflict with other versions you already have in your project. The result of this is that some of the Java projects I work on have 10x the number of dependencies as some of the Ruby projects, but far fewer dependency problems.
I see nothing wrong with having lots of dependencies, even on small projects, as long as you have the right tooling to manage them and can guarantee reproducible builds.
I think these graphs should have multiple colours: one for language built-ins and one for each third-party origin. A deep graph isn't such a big deal, but I'd assume that a very colourful graph can be.
I check dependencies like this from time to time on the projects I work with in Java.
I use it to see if there is anything I can shave away. (After embracing the fact that we are deploying to a Java EE server, builds have been steadily shrinking; I now think the main war has shrunk by 60%, and I have a POC that should shave off another 10%.)
(For maven/Netbeans users it is trivially simple, just open the pom file and find the dependencies view in the "pr file/ toolbar".)
That's really great! I went to 'ThoughtWorks On The Beach' in Cologne, Germany, and there was this really nice idea to just print a dependency graph of your project to see if you can still grasp it or if it is getting out of hand.
True, but in general, the more code, the more bugs. Now there are a number of qualifications to that statement:
- More dependencies doesn't necessarily mean more code. Each dependency could be small (and well tested).
- "More code" doesn't necessarily imply a higher percentage of bugs per amount of code.
- Having more dependencies doesn't necessarily mean I'm calling more functions or hitting more code paths.
However, a larger codebase will tend to have more "pieces" of code that can interact, which will tend toward combinatorial growth in the number of ways to (mis)use those bits of code. If I have more dependencies, there are more pieces of code interacting and exponentially higher potential for bugs. So, while I listed out those qualifications, I don't believe they are the norm.
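A rough way to quantify the combinatorial-growth claim: even counting only distinct pairs of components (ignoring three-way and deeper interactions), the number of possible pairings grows quadratically, as n(n-1)/2. A quick sketch:

```go
package main

import "fmt"

// pairCount returns the number of distinct pairs among n components:
// n*(n-1)/2. This is only a lower bound on possible interactions,
// since it ignores combinations of three or more components.
func pairCount(n int) int {
	return n * (n - 1) / 2
}

func main() {
	for _, n := range []int{5, 50, 500} {
		fmt.Printf("%d pieces -> %d possible pairings\n", n, pairCount(n))
	}
	// 5 pieces -> 10, 50 pieces -> 1225, 500 pieces -> 124750
}
```

Going from 50 to 500 pieces multiplies the pairings by roughly 100, not 10, which is the intuition behind "more pieces, disproportionately more ways to misuse them."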
On the other hand, if I need some functionality, then either I use a dependency that provides the functionality, or I implement whatever it is myself. In the latter case, I'm betting that my code will be better tested and more correct than the library author's code. In most cases, I think I would lose that bet.
Agreed. As long as your dependencies are good enough that you don't need to open the hood, don't worry about it.
I just accept it: I'm not an artist or an F1 driver, elegance and unparalleled performance in the code shouldn't be my concern as long as it is solid and works. I'm in the business of solving business problems, not at scale, not in realtime but on a shoestring budget.
Could someone tell me what sort of useful information I can distill out of those graphs?
I'm really not sure at the package level what conclusions we can draw outside of "many circular dependencies - bad".
I don't think the graphs would be very different in Node, for example. But unless you're doing something like chunked code bundling, I don't understand why this information is meaningful.
Go and Node share a problem: dependency hell. Caddy, a simple HTTP/2 web server, pulled N packages to build. There are Node packages depending on trivial things (remember left-pad?). Too much modularity and relying on others' work sucks. Go's problem is solved by static linking, and Node's by installing all the required libraries in the node_modules directory.
Just these two languages? No other languages or systems have problems with dependencies?
> Caddy, a simple HTTP 2 webserver
Caddy is simple to use, but is not a "simple" server. This is a simple server:
package main

import "net/http"

func main() {
	panic(http.ListenAndServe(":8080", http.FileServer(http.Dir("./"))))
}
> Too much modularity and relying on others' work sucks
One of the things caddy implements is automatic ssl support by depending on a library that implements the acme protocol. This type of thing is not exactly simple or something you really want to be writing from scratch if you don't have to.
The problem with node is that there is no such thing as a javascript standard library and javascript tooling is not great at dead code elimination. This results in many trivial libraries that each contain a single function. Once the tooling gets better at dead code elimination there should be fewer larger libraries.
Speaking of fsnotify: be aware that the underlying OS facility is configured to handle only a limited number of watches at a time. I ran into an issue where the number of monitored files exceeded that limit, and new files that were being created were silently ignored. I learned the hard way: check your OS for how to configure it and make sure you operate within its bounds. Or, put differently, know your tools before you rely on them.
[1]: https://www.reddit.com/r/golang/comments/50uni1/dependency_g...