Nobody's asked this, so: I've been fascinated with operational transformation for some time, but I haven't found anything that breaks it down into understandable steps.
I vaguely recall that last time I poked around I found an extremely high-level algebraic breakdown that made no sense, and only after a _lot_ of digging was I able to turn up an actual example implementation that wasn't 100% useless (because contextless) theory.
I fear that understanding OT is a case of for example spending 6 months reading Etherpad.
6 months is a very optimistic scenario ;) A guy responsible for OT in Google Wave wrote once that he spent 5 years on it and it still does not work well in all cases. After 3 years of struggling to get it right, we know what he meant.
The original OT implementation has been designed for linear documents – texts. It does not fit very well with tree structure – you don’t want elements (tags) to intersect after resolving collision. It also does not fit undo where the order matters.
For sure, operational transformation is not a closed subject, which you can just learn. It is an open problem with many cases still unsolved. If you are looking for a good topic for a thesis, I would recommend it :)
> A guy responsible for OT in Google Wave wrote once that he spent 5 years on it and it still does not work well in all cases. After 3 years of struggling to get it right, we know what he meant.
...Wow, I see.
(As an aside, I think it's really sad Wave's "cool" UI was never properly released.)
I actually think this is one of the articles I found a while back. I agree that it's very thorough and in-depth, and, uh, I've managed to get slightly further into it before I started hitting PgDn repeatedly and my eyes glazed over than when I read this a few months ago! :)
> The original OT implementation has been designed for linear documents – texts. It does not fit very well with tree structure – you don’t want elements (tags) to intersect after resolving collision. It also does not fit undo where the order matters.
Ooooooooh yowch.
> For sure, operational transformation is not a closed subject, which you can just learn. It is an open problem with many cases still unsolved. If you are looking for a good topic for a thesis, I would recommend it :)
Hah!
When I was exploring OT a few months ago I also explored alternatives such as realtime diff/match/patch into a model held in server memory. The simplest way to do this ultimately stores approximately documentSize*clientCount for every single document. That lets me keep an exact copy of every client state on the server, letting me tag updates with the hash of what the document should now look like. For small-scale projects the massive-but-not-concretely-exponential memory requirements of such an approach is somewhat out of control but not completely unmanageable: 1000 documents of 1MB each being edited by 20 clients each requires 20GB of RAM. (But my first guesstimate of sane limits - 10000 documents and 500 clients - would require 5.2TB though, haha.)
Hmm.
I just realized... some of the buffer architectures commonly used in text editors might be usable to implement server-side deduplication. Or... you could even probably just implement a copy-on-write scheme. Either would probably be an effective design win because (in the contemporary/common case) all clients editing a given document are generally seeing a 99.99%-accurate representation of that document relative to other users; if the network is good, there's only really a few characters' difference between users.
OT feels like architectural overkill at the fundamental level, but I know that's just my overwhelmedness with the mental models required and the sense of "does it really have to be this hard?". I recognize that it really is that tricky, and kudos for keeping at it and then turning around and making the result open source! :D
The fact that that article needs 10 plus pages to describe one of the simplest use cases belies the unfortunate underlying truth: OT (and realtime co-editing) is intractably difficult for all but the most constrained data models. Which is a pity because it is undeniably cool, and why we built Convergence (https://convergencelabs.com) -- general-purpose realtime collaboration.
I vaguely recall that last time I poked around I found an extremely high-level algebraic breakdown that made no sense, and only after a _lot_ of digging was I able to turn up an actual example implementation that wasn't 100% useless (because contextless) theory.
I fear that understanding OT is a case of for example spending 6 months reading Etherpad.