Union find takes as input some undirected graph <*N*, *E*> and internally constr...

jtolmar · on July 22, 2022

I don't think there's a way to extend union-find to do what you want in the general case. You might have more luck starting with a minimum cut algorithm.

For the specific case where you can identify some of your edges are more or less likely to be deleted, though, you can run union-find on only the stable edges, cache that result, and then do the unstable edges. Whenever an unstable edge is deleted, reload the cache and redo all the unstable edges. This works for something like a maze with open/closable doors.

kragen · on July 22, 2022

Those are some interesting ideas, thanks! I hadn't thought about trying to apply minimum-cut to the problem!

thaumasiotes · on July 22, 2022

> Union find takes as input some undirected graph <N, E> and internally constructs (and progressively mutates) a directed graph <N, E'> which it uses to efficiently answer queries about whether two nodes n₁, n₂ ∈ N are in the same connected component of <N, E>.

I don't think this is a valuable way to think about the structure. That's not what it's for.

A union-find is a direct representation of the mathematical concept of an equivalence relation. The input is a bunch of statements that two things are equal. It will then let you query whether any two things are or aren't equal to each other.

This is captured well by the more formalized name "disjoint-set structure". You have a set of objects. You can easily remove an element from the set. But it makes no conceptual sense to try to "separate two of the elements in a set". They're not connected, other than conceptually through the fact that they are both members of the same set.

A union-find is a good tool for answering whether two vertices are in the same connected component of a graph because being in the same connected component of a graph is an equivalence relation, not because the union-find is a graph-related concept. It isn't, and there is no coherent way to apply graph-theoretical concepts to it.

Putting things another way:

You describe a union-find as something used to "efficiently answer queries about whether two nodes n₁, n₂ ∈ N are in the same connected component of [a graph]".

But you focused on the wrong part of that statement. What makes a union-find appropriate for that problem is the words "the same", not the words "connected component".

kragen · on July 22, 2022

You are welcome to think about the union-find structure however you prefer to think about it, but I was describing the problem I was trying to solve, for which the correct description of union find I gave above is optimal. Contrary to your mistaken assertion, no contradictions arise from applying graph-theoretical concepts in this way; there is no problem of "coherency". It's just a form of description you aren't accustomed to.

If your way of thinking about union find makes it hard for you to understand the problem I was trying to solve, maybe it isn't the best way to think about it for the purpose of this conversation, even if it is the best way to think about it in some other context.

I'm not claiming my description is the only correct description of union find, just that it's the one that most closely maps to the problem I was trying to solve.

The description can equally well be formulated in the relational terms you prefer: given a relation R, union find can efficiently tell you whether, for any given x and y, (x, y) ∈ R', where R' is the symmetric transitive reflexive closure of R (and is thus the smallest equivalence relation containing R). It efficiently supports incrementally extending R by adding new pairs (a, b) to it.

This is equivalent to my graph-theoretic explanation above except that it omits any mention of E'. However, it is longer, and the relationship to the maze-construction application is less clear. Perhaps you nevertheless find the description clearer.

What I was looking for, in these terms, is a variant of union find that also efficiently supports incrementally removing pairs from R.

Does that help you understand the problem better?

— ⁂ —

In general there is a very close correspondence between binary relations and digraphs, so it's usually easy to reformulate a statement about relations as an equivalent statement about digraphs, and vice versa. But one formulation or the other may be more perspicacious.

More generally still, as you move beyond the most elementary mathematics, you will learn that it's usually a mistake to treat one axiom system or vocabulary as more fundamental than another equivalent axiom system, since you can derive either of them from the other one. Instead of arguing about which one is the right axiom system or vocabulary, it's more productive to learn to see things from both viewpoints, flexibly, because things that are obvious from one viewpoint are sometimes subtle from the other one.

thaumasiotes · on July 23, 2022

> You are welcome to think about the union-find structure however you prefer to think about it, but I was describing the problem I was trying to solve, for which the correct description of union find I gave above is optimal.

> If your way of thinking about union find makes it hard for you to understand the problem I was trying to solve, maybe it isn't the best way to think about it for the purpose of this conversation, even if it is the best way to think about it in some other context.

I'm not having any problems understanding the problem you were trying to solve. My problems are in the area of understanding why you thought a union-find might be an effective way to address that problem.

I'm telling you that, if you adjust your view of what a union-find is, you will have a better conception of where it can and can't be usefully applied.

> In general there is a very close correspondence between binary relations and digraphs, so it's usually easy to reformulate a statement about relations as an equivalent statement about digraphs, and vice versa. But one formulation or the other may be more perspicacious.

In this case, the relation is symmetric by definition, so you just have graphs. Yes, it's obvious how you can use a graph to describe a relation.

But the entire point of the union-find structure is to discard any information contained in an equivalent graph that isn't contained in the relation. (And there are many equivalent graphs, but only one relation!) The absence of that information is what makes the structure valuable. It shouldn't be a surprise that, if you want to preserve information from a particular graph, you are better served with graph-based algorithms.

kragen · on July 23, 2022

Well, that's an interesting way to look at it; I see better where you're coming from now.

But I wasn't trying to get the unmodified data structure to handle deletion efficiently; I was trying to use it as a basis to design a structure that could. I thought I saw a minor modification that achieved this, but then I found out why it doesn't work. That's what my linked note above is about.

More generally, if you only try things that will probably work, you'll never achieve anything interesting.