Can't believe no ones mentioned webcola yet, which implements some very cool layout algorithms and is the result of cutting edge academic research. https://ialab.it.monash.edu/webcola/
I really wish the computation would be offloaded to web workers though. Right now it completely freezes the browser when updating very large graphs (above around 1.5k edges and .5k nodes in my experience).
Another great option for the same family of problems is Viz.js[1], which is essentially Graphviz in your browser. Offers a number of different layout models and a ton of configuration options.
I love graphviz (ever since Sanjay Ghemawat of mapreduce fame showed me how to use it in a compiler to debug CFGs). However, it’s the main limitation, compared to these other systems, is that it’s algorithms aren’t incremental/interactive. They are a bit dated in that way.
I recently wanted to render some graphs and found there are a confusing variety of options. In particular https://github.com/dagrejs/dagre/wiki seems to be layout focused and then plugs into a bunch of other systems including d3. It would be nice if one of these tools was able to describe itself in contrast to the others.
Many tools that produce similarly-looking layouts use pretty much the same algorithms, just with different tweaks. I.e. hierarchical layout is usually based on Sugiyama's work; force-directed is also a staple that's very easy to implement and thus is widely implemented. There often is no simple list why one implementation is better/different than another. There are common improvements you can make or features you can add, though:
- In a hierarchical layout, supporting custom layering and sequencing to ensure that certain nodes will appear in a given layer, or that within a layer the order of nodes is fixed. This is fairly easy to add, since layering and sequencing have to be done anyway.
- Support for grouping, where nodes can be enclosed in a box, which requires the layout to keep them close to each other and also ensure that no other nodes protrude in that box (and preferably prevent edges from passing through as well).
- Customizability of edge routes, where you could, for example, say that edges have to leave a node on the bottom and enter from the top, instead of having that property implicit due to the layout direction and choosing to pur ports in the center of nodes.
- Some applications don't concern themselves with interactivity and thus don't care about algorithms that, for example, would ensure that after incrementally changing the graph, the resulting image would still look similar to the old one (when adding a node to a tree you don't want the order of subtrees higher up to change completely just because of that change).
- Any features you add may interact with each other and further complicate the implementation to ensure that each one still works (and results in acceptable runtime).
Generally, due to the breadth of features that can be supported/implemented in a given implementation, you sometimes have to study the documentation closely. If all you care about is getting a nice result with default settings, then that's often easy to test, though. But the nature of graph drawing is sadly that the most visible effect of an algorithm is what you see.
The numbers in the circles are not neatly centered. They're a little off. Fixing that (center them perfectly) would make them visually much more appealing.
That will involve some optical alignment considering the actual shapes of the glyphs? I imagine one would get this feeling of not perfectly aligned even when the center of the bounding box of the numbers coincide with the center of the circle.
It hadn't even occurred to me that there were serious academic attempts at visualizations of this sort. In addition to OP, this whole discussion is full of great alternatives to d3-force (my usual go-to). Kudos to all!
This library doesn't seem like it gets you much further than the original d3-hierarchy. If you are looking to render a graph that contains cycles, I have found that using the force-graph from d3 is a pretty simplistic approach; I recently used it to render a hierarchy that that contains cycles in a "tree-like" format, d3 is a very powerful and we'll constructed library if you take the time to understand what it is capable of
Of all the tools mentioned in this thread: how do they scale?
We have DAGs that cause graphviz to choke (tens of minutes running time, if it ever finishes). So far I'm not aware of anything that scales better than graphviz with similar results to the dot layout engine.
Even if you can calculate the layout for a very large DAG in a reasonable time, the zooming and scrolling required as a user to make sense of it and follow interesting paths makes it unwieldy in my experience. One approach to use with very large DAGs is to interactively expand / collapse nodes - for a simple tree example imagine something like a treeview control for browsing a file system, expanding folders of interest to drill down rather than viewing all expanded. For non-tree DAGs you need the layout algorithm and you might need to automatically collapse unrelated sections as you expand nodes along a path to keep the time reasonable.
This approach can be combined with lazy loading data as you expand a node as another performance enhancement.
With some effort you can add this interactivity to vis.js or d3. I have previously used a commercial product called KeyLines [1] which has this feature, uses WebGL to run the layout in gpu.
Can anyone recommend a JS library which calculates "optimal" node positions for an acyclic graph under the following constraints:
- Nodes are rectangles that have width and height (they'll later be rendered as HTML divs) and should not overlap
- If possible, edges should not overlap with nodes / cross any node's boundaries
?
I really just need the positions as I would like to have precise control over how and when things are rendered. So far, I've been using WebCola for this and while it works ok, it tends to become quite slow in some situations.
Yes, there are cycles, but you want to draw them as DAGs because the "backlink" (the thing that loops back up to the entry point of a loop) isn't on the same footing as the rest of edges.
The layout still needs to make room for the back links, doesn’t it? Thinking of that, with CFGs, you know which edges are backlinks, so you can put the backlinks in in reverse direction, do the layout, and then draw them with the arrow heads at the “wrong” side.
It's sometimes helpful, during program analysis, to collapse each strongly-connected component (SCC) of a directed graph into a single placeholder node representing that SCC.
When you do this, you're guaranteed that the resulting graph is acyclic.
All these graph layout algorithms should ideally be run on the backend. Most of these js libraries are missing the point in trying to run layout in Javascript on the frontend. If you have a huge graph, you can't view all of it at once anyways and you're viewing a subset of it any 1 moment.
So the ideal library should be running the layout on the backend and the frontend should be plain dummy, simply showing each graph node at the position co-ordinate sent from the back-end.
We find that is ok if you build it right for the bulk of business settings. The intuition is that if you can do youtube, streaming layout updates should not be a problem.
I would agree that building a distributed architecture that achieves this is a distraction for most solutions- oriented teams. Before you get to the benefits, most/all teams would drown from the complexity.
Most humans can't make much sense of anything more than 50-100 node graph at a time. At a graph size of containing 50-100 nodes or lesser which is a rather small graph, huge variety of graph algorithms are not very helpful.
These libraries are good to experiment with each layout algorithm but ultimately a graph analysis most of the time involves exploration and querying, like a repl and making sense of that. Most frontend js libraries will not be helpful in doing that.
It's a surprise that there is no robust backend library that does graph layout.
You can run d3 on the backend with node. It shouldn't be too difficult using it with a layout to constrain the potential visualization to a more comprehensible data set.
There's even a d3-node package available to generate a static SVG or png.
[edit]
Additionally, grouping or binning the data may be something to consider when dealing with a visualization with 100+ nodes. Never tried that with DAG datasets, not sure how that would work, but it's interesting to think about.
Graphviz can easily be used as a backend layout system to output SVG or PNG. It's just not very interactive; you can do mouseover annotations but you can't drag nodes around.
Yep, that's how we architected Graphistry, and then some. GPUs frontend for rendering, GPUs backend for layout, analytics, etc.
Also agreed on subsets. About 1-10M nodes can still work in the frontend (and we're pushing upper end of that). But around 100K-1M nodes, more about seamless & interactive abstractions... and reinforces why integration w/ smart backend matters.