Nobody's asked this, so: I've been fascinated with operational transformation fo...

pjasiun · on Oct 19, 2017

6 months is a very optimistic scenario ;) A guy responsible for OT in Google Wave wrote once that he spent 5 years on it and it still does not work well in all cases. After 3 years of struggling to get it right, we know what he meant.

Anyway, I believe that you can find a good entry point to OT here: http://www.codecommit.com/blog/java/understanding-and-applyi.... It gives a nice overview of the problem – in a human-readable language – but it’s only the beginning of the journey.

The original OT implementation has been designed for linear documents – texts. It does not fit very well with tree structure – you don’t want elements (tags) to intersect after resolving collision. It also does not fit undo where the order matters.

For sure, operational transformation is not a closed subject, which you can just learn. It is an open problem with many cases still unsolved. If you are looking for a good topic for a thesis, I would recommend it :)

exikyut · on Oct 19, 2017

> 6 months is a very optimistic scenario ;)

Oh.

> A guy responsible for OT in Google Wave wrote once that he spent 5 years on it and it still does not work well in all cases. After 3 years of struggling to get it right, we know what he meant.

...Wow, I see.

(As an aside, I think it's really sad Wave's "cool" UI was never properly released.)

> Anyway, I believe that you can find a good entry point to OT here: http://www.codecommit.com/blog/java/understanding-and-applyi.... It gives a nice overview of the problem – in a human-readable language – but it’s only the beginning of the journey.

I actually think this is one of the articles I found a while back. I agree that it's very thorough and in-depth, and, uh, I've managed to get slightly further into it before I started hitting PgDn repeatedly and my eyes glazed over than when I read this a few months ago! :)

> The original OT implementation has been designed for linear documents – texts. It does not fit very well with tree structure – you don’t want elements (tags) to intersect after resolving collision. It also does not fit undo where the order matters.

Ooooooooh yowch.

> For sure, operational transformation is not a closed subject, which you can just learn. It is an open problem with many cases still unsolved. If you are looking for a good topic for a thesis, I would recommend it :)

Hah!

When I was exploring OT a few months ago I also explored alternatives such as realtime diff/match/patch into a model held in server memory. The simplest way to do this ultimately stores approximately documentSize*clientCount for every single document. That lets me keep an exact copy of every client state on the server, letting me tag updates with the hash of what the document should now look like. For small-scale projects the massive-but-not-concretely-exponential memory requirements of such an approach is somewhat out of control but not completely unmanageable: 1000 documents of 1MB each being edited by 20 clients each requires 20GB of RAM. (But my first guesstimate of sane limits - 10000 documents and 500 clients - would require 5.2TB though, haha.)

Hmm.

I just realized... some of the buffer architectures commonly used in text editors might be usable to implement server-side deduplication. Or... you could even probably just implement a copy-on-write scheme. Either would probably be an effective design win because (in the contemporary/common case) all clients editing a given document are generally seeing a 99.99%-accurate representation of that document relative to other users; if the network is good, there's only really a few characters' difference between users.

OT feels like architectural overkill at the fundamental level, but I know that's just my overwhelmedness with the mental models required and the sense of "does it really have to be this hard?". I recognize that it really is that tricky, and kudos for keeping at it and then turning around and making the result open source! :D

alalonde · on Oct 19, 2017

There are a few codebases you can poke around in, notably ShareDB (https://github.com/share/sharedb/tree/master/examples).

This article had some visibility on HN a couple weeks ago, perhaps it is what you are looking for: http://digitalfreepen.com/2017/10/06/simple-real-time-collab...

The fact that that article needs 10 plus pages to describe one of the simplest use cases belies the unfortunate underlying truth: OT (and realtime co-editing) is intractably difficult for all but the most constrained data models. Which is a pity because it is undeniably cool, and why we built Convergence (https://convergencelabs.com) -- general-purpose realtime collaboration.

manigandham · on Oct 19, 2017

Try this video: https://www.youtube.com/watch?v=pMMDVphop40

One of the best overviews covering CRDTs and OTs, especially in the context of collaborative editing.

z3t4 · on Oct 19, 2017

I like this talk: https://youtu.be/yCcWpzY8dIA?list=LL9RQYUX_WIjk8SAcyuarJEw