> I recently wanted to take a hierarchy of items and draw them in a nice tree structure. For example, a Family Tree.
In general, a family tree is a directed acyclic graph. Branches of the tree will always join back together if you follow the history far enough.
The number of ancestors in a generation is 2^N, with N being the number of generations back you're looking. There were only 100 billion people that ever lived, so by the time you get about a thousand years back, it becomes literally impossible for all your ancestors to appear only once.
Of course, that's just an easy upper bound to make the point that becoming a graph is unavoidable. In practice, branches will come together a lot sooner than that.
A family tree is an overloaded term. The tree of ancestry, which describes who beget whom, sure cannot be cyclical.
The graph of civil relationships can certainly contain cycles. If the relationship in question is "is a legal parent of", for example, then adopt your parent.
The opportunities for cycles multiply if both blood and legal relationships are taken into account.
This works only until the concepts are the reason the problem is being tackled. A family tree based only on genetic relationships is interesting to few people outside geneticists and doctors. Most people are interested in familial ties as they are culturally defined.
Yea, it’s also not the best example because Mia Farrow was not legally wed though Allen was in a long term relation and they had a child. So, I am not sure if it qualified as a common law marriage before their separation.
In an August 1992 interview with Time Magazine Allen said, “I am not Soon-Yi's father or stepfather”, adding, “I’ve never even lived with Mia. I've never in my entire life slept at Mia's apartment, and I never even used to go over there until my children came along seven years ago. I never had any family dinners over there. I was not a father to her adopted kids in any sense of the word.” Adding that Soon-Yi never treated him as a father figure and that he rarely spoke to her before the affair, Allen seemed to see few issues with their relationship.[196]. https://en.wikipedia.org/wiki/Woody_Allen
However, if your mapping out relationships in software without considering the timeline it looks like a circular relationship. Further, I am sure you can find cases that are actually incestuous.
It's a source of ongoing perplexity to me that so many Linux file managers don't even offer a tree view, never mind some of the more advanced possibilities. Consider Robertson's work on Cone Trees from Xerox Parc in....1991.
Oddly, there has been some work done on source code trees such as Gource, but for some reason nobody has yet thought of applying this to file systems or browser navigation paths.
That's for console use. Trees are useful for looking at the whole thing at once, much as it's easier to appreciate a real tree that way than by maintaining a list of the leaves and branching distances.
I'm a a researcher and my source material is ~50gb spread across ~50k hand-picked files with pretty arbitrary structure, plus another 4-5 times that in mirrors of other collections. Trees let me see things that I would overlook by reading lists all the time.
Right now I half-ass use some bioinformatics tools originally intended for cladistics and proteome research to visualize the collection. Some days I go looking for specific information or catalog what I have, other days I stumble across a cache of valuable data and might select hundreds of individual files.
I use a variety of other graph-based tools too, such as Maltego, Gephi, and Cytoscape, but to be honest I do so in a very ad-hoc way for specific tranches of data ( afile collection, or a relevant social media stream) and keep most of it my head. Otherwise I will get hung up on tooling rather than the underlying material.
Not OP, but I'm guessing it's something similar to how I watched fast moving logs to look for anomalies:
I reduced text size enough, to a point it wasn't really text I was looking, but patterns of line content and length. This was enough to see that things were running proper.
Yeah I think it's obvious that graphical trees are prettier than they are effective user interfaces.
Broot is a good example of just how much effort goes into making nested structures easily navigable. Trees may be good for presentation but ultimately lists with indentation for nesting are for more navigable.
This translates to GUI concepts as well. Both Windows and MacOS show a few flat lists relative to the current directory rather than showing all nestings at once like a tree does.
With file system the resulting graph would be too big and too complicated to be useful. You would need heavy filtering to make any sense of it. In best case it would be a decent for determining where most of your files are, but those are probably hidden system directories and .git metadata.
Most tree visualization research is focused on static rendering of whole tree. For file browsing and many other applications it's better to have just section of tree with interactive navigation around trees. This makes visualization an easier with more predictable layout, and the user is not overwhelmed with data. This approach is basically replicating model of our attention/focus and how we make sense of complex structures.
The design of file browsers indeed seems to be stuck in 80s. Online storage services are copying this design as well. Is anyone aware of better design, in form of paper or prototype?
So the author mentions hearing about the Reingold-Tilford Algorithm which wouldn't work for the example given since R-T is meant for binary trees and a family tree might have more than 2 nodes.
Which led me to wonder why is there even an "algorithm" per se for binary trees: wouldn't it be acceptable to just trace out a complete binary tree with spaces reserved for each potential node, and only fill in a space with a node if it is non-null? Or does this not meet some criterion of "niceness" for drawings?
If you do that, you quickly run into the feasible depth limit for what you can show in a fixed space. Especially for sparse trees. Consider something like family trees, where you generally don't know anything about certain lines, but may have 20 generations or more on selected lines. No way could you show 2^20 nodes in a fixed format rendering, but you can make something kinda pretty by eliminating the empty space and organizing things to fill it intelligently.
I was just writing a radial RT-like visualization for displaying the branching structure of a binary tree and ran into the issue of information density you describe. Still pretty happy with the results, tho, in how it shows the structure and clustering.
That only works for complete (or "nearly" complete) binary trees, and you can't draw many of these since the number of children in a generation grows exponentially. For most trees you have to try to keep the width minimal while avoiding collisions between deeply-nested left and right children. One of the earlier papers to give some precise "niceness" criteria is "Tidier Drawings of Trees" (https://reingold.co/tidier-drawings.pdf) by Reingold and Tilford, with a companion Pascal program for computing the coordinates.
I'm surprised they chose that algorithm, it's fairly limited and not that efficient.
If your nodes have width, you should read "Improving Walker's Algorithm to Run in Linear Time" (I have an implementation here: https://github.com/Drup/tree_layout)
If your nodes have width and height, you need to read "Drawing Non-layered Tidy Trees in Linear Time"
I implemented something similar at https://automations.io converting a workflow stored as a flat linked list to a tree structure and then traversing and rendering it out in HTML and sprinkling CSS on top
Some trivial mathematical facts. No matter what algorithm you use, you want to make a quasiisometrical embedding of a tree into a finite dimensional space. Or, for a tree the no of nodes at distance R from the root is like exp R, but in any space of dimension N you can cram about R^N points which are at distance R from the origin. Hence the problem.
But you can embedd any tree into the Hilbert space :)
There are the various graph layout algorithms I've heard of, Sugiyama, KIELER, ELK - these can be used for trees of course, although I'm not sure they have highly constrained vertical levels like the article depicts.
Sugiyama has layers and can be used for trees just as well (and may give nicer results for almost-trees than removing non-tree edges and routing them individually). KIELER and ELK seem to be collections of various layout algorithms, including Sugiyama and specialized tree layout algorithms (same with our own library yFiles).
Nice write-up. I enjoyed how the material was presented - the introduction to problem, intuitive summary method and then detailed steps with visualizations of partial results.
In general, a family tree is a directed acyclic graph. Branches of the tree will always join back together if you follow the history far enough.
The number of ancestors in a generation is 2^N, with N being the number of generations back you're looking. There were only 100 billion people that ever lived, so by the time you get about a thousand years back, it becomes literally impossible for all your ancestors to appear only once.
Of course, that's just an easy upper bound to make the point that becoming a graph is unavoidable. In practice, branches will come together a lot sooner than that.