Hacker News new | past | comments | ask | show | jobs | submit login
Text Editor Data Structures: Rethinking Undo (cdacamar.github.io)
216 points by starfreakclone 10 months ago | hide | past | favorite | 81 comments



Thinking about undo/redo is a great place to start thinking about your text editor's underlying data structure.

I went down this rabbit hole a while back. I had to really dig[1]: it's been 5 years...

---

Data structures aren't the only interesting rabbit-hole, though.

UI/UX doesn't get nearly as much attention as it deserves. There are really only two that I am aware of: Notepad and Vim. Vim's modal editing results in the user explicitly defining undo/redo points. You can even use the "." key to re-redo! Everything else essentially boils down to a greater or lesser version of Notepad: The user can't predict what state undo will take them to.

Something I have wanted to create for a long time (definitely more than 5 years) is a new modal editor. I don't want yet another vi clone: I want something that is defined from the ground up by user configuration. Maybe one of these days I will get far enough past the ADHD wall to make it happen...

[1] https://news.ycombinator.com/item?id=15381886


I am currently working on an editor. Definitely interesting to try different ideas just to find out why certain things in Vim are that way. Also the "." key is more complicated than I expected, the exact start of an edit operation depends on the behaviour of certain default keybindings. It just works so smoothly I had never thought much about it before.

> Maybe one of these days I will get far enough past the ADHD wall to make it happen...

That's definitely the most difficult part, for me it got easier at some point once it actually seemed realistic that I'd manage to finish it one day.


The coolest feature of Vim is its macros. They bring the entire UX into one cohesive feature. The keymap has meaning, because every key symbol is a character, and a macro is just a string. You can paste them, edit them, copy them, then play them back; all without ever leaving Vim's primary UX. Macros are stories written, not in vimscript, but in the language of Vim itself.

Macros are also the reason I don't use Vim anymore. The keymap has meaning, and it shouldn't. Want to change your keymap? Good luck.

I spent years in Vim. I conceptualized all of it. I built my muscle memory around it. Then I learned a new keyboard layout.

I tried remapping the keys. Not only was that impossible (circular dependencies), it broke the conceptual map, too. I have offended the beast, and am welcome no more.

What I really want is to start fresh. No defaults: just user config. Give me the pieces, and the glue to hold them together: I'll do the rest.


>Macros are stories written, not in vimscript, but in the language of Vim itself.

Last year I hit upon the idea that vim's commands are like a bytecode that my brain outputs and vim interprets. I guess that's true of any editor UI, but it just feels more appropriate in the vi case.


I had different reasons but I only hardcoded a few basic keybindings like Escape. Everything else is customizable, but still limited to the three modes (it's just a good combination). Defining explicit keybindings for every symbol on the keyboard turned out to be the cleanest way to fix bugs in the input handler, special behaviour just for insert mode isn't worth it.

> I tried remapping the keys. Not only was that impossible (circular dependencies)

I wonder if that could be remapped in the code itself? From a quick look they seem to be defined in ops.c and vim.h, but I'm not very familiar with the codebase.


> I tried remapping the keys. Not only was that impossible (circular dependencies)

I don't get this. To swap x and y is easy:

  noremap x y
  noremap y x
Of course you can't do it with map, that's why noremap exists.


Believe me, I have tried. It's been a while, or I would tell you precisely what went wrong. Suffice it to say, I did not give up easily.

Even if it did work, I wouldn't have the conceptual map that Vim's UX is made of. Without that, it's just not going to be a positive experience.


As what point does a fully featured undo/redo system start to look like git (/VCS of choice)?

Once you add deliberate checkpoints and the tree structure from the article, it feels like you’re more than halfway there.


As a music producer who has been working with various DAWs, I've always been super jealous that programmers have git. Not just for version control but also for collaborations.

I always thought, it must be possible because sound is just a list of samples (numbers). Kind of like how a binary executable or assembly code is just a list of values or instructions.

Unfortunately most music software and plugins are proprietary as this industry did not have the same political movements (GNU, F(L)OSS, OSS, etc) that the computer industry had and most professional software has been created for professionals by companies.

It's interesting though because software like Ableton Live has an unlimited undo history but since it is proprietary one can't really look at it.


This resonates with me. I use undo/redo up to a point, but eventually I just reload the file (which I'm saving frequently) or just check it out (from git, mercurial) again.


Right - using undo/redo for quick or ephemeral changes, before solidifying into something (more) permanent feels very similar to me to making a bunch of small commits while working on/exploring something before cleaning them up into semantic chunks with a rebase.

Maybe there’s some implicit idea here like treating a commit as an aggregate of a bunch of atomic actions (i.e what undo/redo act on)


> Vim's modal editing results in the user explicitly defining undo/redo points.

? How? Does switching to normal mode create an undo/redo point? Someone asked me if the granularity of Vim's undo/redo could be increased (i.e., more frequent undo/redo points). In what they demonstrated, Vim's granularity seemed less than other applications (i.e., undo/redo acted on relatively large chunks of input); in some brief research, I didn't find a solution.


Well, in Vim, "switching to normal mode" can be thought of as "ending an insert command" instead of "switching modes", which makes sense as soon as you realize that normal mode is, well, normal.

So when you press "o", I don't think "switch to insert mode". I think "insert a line with the following text:", which can then be undone with u, repeated with ., or whatever else you want.

At least that's my mental model of Vim, I don't know which is closer to reality.


The "solution" is to simply leave insert mode more often. That works great if you are using Vim for editing. If you are using Vim for writing... well, not so much.

Then again, if I am writing in Vim, I generally prefer the flow of actively editing what I wrote rather than undoing it.

This really applies to Vim as a whole: if you enjoy its UX, it's damn near perfection. If you hate it, you can leave. If you love it, but want it to behave just a little differently...then you can probably accomplish that with a plugin/configuration.

...but if you want to change Vim's UX more than a little, you're fucked. It feels totally possible until you realize it just isn't.


There’s a trick to remap space to automatically insert an undo point in vim: https://stackoverflow.com/a/4360415/13099

But, after getting used to vim’s commands, I find I go back to normal mode relatively quickly and so my undo steps are generally logical.


Simply stated, 'u' in vim undoes every change since the last insert.

I suppose for some, "more granular" undo points could be useful, but I'm in such a habit of doing something discrete and then escaping insert mode, that I've never felt pinched by the potential for vim's large undos.


I still use a lot of nvi on openbsd. and it has a strange quirk to it's undo.

so "u" undoes your last action. but if you hit it again it undoes the undo, a redo if you will. I did not really think about it much, just accepted that was how nvi worked, a single level undo. But that is not true at all. the trick is you have to "u" undo then "." repeat previous action and you get the full undo stack. and, like the parent post mentioned, you also get a full redo stack the same way. Get it to redo then repeat previous action.

I am not exactly sure what vim does, But I suspect this is one of it's improvements.


I can't believe that, in 16 minutes, nobody else on HN had the answer: nvi is following original, authentic vi behavior, not this newfangled stuff. The Vim manual explains it:

  How undo and redo commands work depends on the 'u' flag in 'cpoptions'.
  There is the Vim way ('u' excluded) and the Vi-compatible way ('u' included).
  In the Vim way, "uu" undoes two changes.  In the Vi-compatible way, "uu" does
  nothing (undoes an undo).

  'u' excluded, the Vim way:
  You can go back in time with the undo command.  You can then go forward again
  with the redo command.  If you make a new change after the undo command,
  the redo will not be possible anymore.

  'u' included, the Vi-compatible way:
  The undo command undoes the previous change, and also the previous undo
  command.  The redo command repeats the previous undo command.  It does NOT
  repeat a change command, use "." for that.
https://vimhelp.org/undo.txt.html

(Actually, I have so rarely used vi that I'm relying on the implications of the Vim manual that it's original vi behavior.)


With vim, "u" always moves back in time through the undo history, CTRL-r forward. "." repeats the last action which changed some text but not stuff like moves or going forward/backward through the undo history.

nvi sounds very weird to me.


Yes, in Vim, repeated u's just continues undoing, while ctrl-r is for redos.


> Something I have wanted to create for a long time (definitely more than 5 years) is a new modal editor. I don't want yet another vi clone: I want something that is defined from the ground up by user configuration.

This is more or less the philosophy of the editor I've been working on (and using ~exclusively) since 2014: https://github.com/alefore/edge/tree/master

Some parts of the UI are still defined in the compiled language (so don't fully fit your philosophy yet), but a big part of the UI comes from the configuration loaded at runtime: every time the editor starts, it interpretes and runs this configuration defining the UI: https://github.com/alefore/edge/blob/master/rc/hooks/start.c...

(This file is not compiled into the editor, but loaded and executed directly at runtime; the format of that file is my editor's extension language/configuration (the equivalent to emacs lisp or vimscript), which just happens to be a garbage collected C-like language.)


I'm working on a new modal editor. There's a demo on steam. It's definitely not a vi clone. https://store.steampowered.com/app/1537490/Tentacle_Typer/


Consider my interest piqued. We’ve seen a couple of others in the space, notably Kakoune and Helix. What do you have in mind?


TL;DR: Most (if not all) of the same features, just not as tightly integrated. Every feature is a piece of the puzzle that is your user config.

---

Picture Emacs without a default keymap. That's a start. The user builds their own UX from scratch; bringing each feature into their config explicitly. Alternatively, the user just grabs a curated config like Doom Emacs. The difference here is that they can read it: all of it.

Instead of writing a bunch of imperative elisp (that becomes its own web of circular dependencies), let's structure the config with some kind of purely functional additive data structure. That Piece Table I wrote is a good start. Instead of just piecing together letters, let's piece together UI and functions.

This idea can apply to everything. I envision it as the next generation of shells, of web browsers, of operating systems: everything. Imagine Blender, but you the user add one fruit at a time. Where's the button to extrude mesh? Exactly where you put it; or nowhere at all!

Software is made of incompatibility, and I'm sick of it. Where we should have borders, we have walls. Just imagine what it could be like if we tore them all down.


Compatibility arises from standardization, which only comes after maturity. I don’t expect software industry to mature until another hundred years or so.


> Maybe one of these days I will get far enough past the ADHD wall to make it happen...

I have that hope for literally hundreds of things... but I know I will never actually get to do any of those things. I can think of any one of them right now and be unable to even try. I hate this existence.


In my text editor (https://github.com/alefore/edge) I keep the undo/redo history always linear, which I find works pretty well.

So suppose the user does the following operations:

• Insert "Hello"

• Insert " world!"

• Undo once.

• Insert ", Cameron!"

At this point, undo operations would gradually transform the buffer thus: "Hello, Cameron!" => "Hello" => "Hello world!" => "Hello" => "". Obviously, one can redo at any point. The undo/redo days structure don't preserve the state, just the transformations (which are reversible).

The gist of it is that when you apply a modification to the buffer and the redo stack isn't empty, the redo stack gets flipped and inserted into the undo stack. If you undo a long list of transformations and then make a change, that long list is duplicated, but this hasn't been a problem in practice.

In case someone finds this interesting, this is mostly implemented here: https://github.com/alefore/edge/blob/28c031230a8babe888ffe1a...


This is how traditional Emacs undo is implemented as well. I like it but it seems to be not the norm; it does take a little bit to get used to. (There's plenty of packages to transform Emacs undo into something more like a tree structure, though I don't personally use them because I'm used to the above behavior.)


Interesting, none of the editors I just tested (notepad / visual studio / sublime / github) works like this.

I think of undo/redo as "time traveling", going back to where I was one / two / several steps bevor. The mental model of your implementation is more akin of actually "doing" the undo. Like if you undo insert " world!", you create an action that deletes " world!". The timeline still goes forward, but now has a delete action.


Emacs undo-tree does everything I need. Emacs also supports undo in region which most editors don't seem to support and wasn't covered by the article.

I actually used regular Emacs undo for years which lets you get everywhere in the tree with a kind of tree traversal but you won't know where you are. I resisted undo-tree for ages but it's definitely worth it as it stays out of the way until the occasion you might need to use it.


Emacs's undo is great in that invoking an undo command is itself undoable. And that is different from just your standard redo. It definitely needs some getting used to but it is very powerful.

But I think your second points deserves an even bigger mention: Emacs has the ability to apply undo only to a certain "region" - which in Emacs parlance is basically just a selection of text. For those of you who have never seen it: imagine two parts of a file you're editing, say one part at the top, one at the bottom. You start by editing the first part (top) and then then move your cursor somewhere down to the second part of the file (bottom) to do some more editing there. But then you realize that your edits in the first part of the file were baloney. In most other editors, if you wanted to undo them, you'd be forced to also undo the edits in the second part of the file. In Emacs, however, you can simply select the first part at the top of your file - if you hit undo then, it will only undo the last edits done inside that selection and leave all edits outside of that region untouched.


What happens when, for example, you atomically replace “mouse” by “elephant”, then select “epha”, and then region-undo?

The reason most editors don’t implement this is probably that it’s hard to conceive of how it should behave in the general case. (That’s not to say that the way Emacs is implementing it isn’t good and useful.)


Great question. I tried it and I got `elephantmouse`. I was surprised by this, but FWIW I've never encountered this edge case in actual usage.


Wow, that's genius! Thank you for explaining, TIL.


Wow, been using Emacs for decades and didn’t know about „undo in region“. Thanks for sharing!


It's one of those things like rectangle commands; once you know they exist it's hard not to miss them in editors that don't support.


Emscs undo both both of those reasons is just on another planet than any other editor. Being able to undo your undo (as infinitum) is killer because it just means you can never lose state.

Refional undoing is even more amazing.


vundo is a simpler implementation: it reuses Emacs's built-in undo/redo and just implements the tree visualisation part.

undo-tree is a reimplementation of Emacs's undo/redo, that supports a tree visualization.

* undo-tree LOC: 4700. https://gitlab.com/tsc25/undo-tree/-/blob/master/undo-tree.e...

* vundo LOC: 1350. https://github.com/casouri/vundo/blob/master/vundo.el


Nice, I will try vundo!


Indeed, undo in regions is the second best feature of emacs undo, wish it were more popular


I just skimmed it, but it looks like vim really has undo/redo "solved":

    - Modal editing makes for nice "undo" points, clarifying whether undo should undo "World!" or "!".
    - "g-" and "g+" eliminate the "orphaned redo".  They walk the entire undo/redo tree rather than just the linear undo/redo.
    - Time travel undo/redo is really handy when you want to go to where you were on the wall-clock.  ":earlier 15 minutes" takes you to the code as it was 15 minutes ago.


> but it looks like vim really has undo/redo "solved"

Does it support restricting undo/redo to a selection? In Emacs you can select any region of text and just apply the undo history of the selection. That is extremely useful. Imagine working on to functions in the same file. You have are working on g() and realize you want to undo some change on f(), that you did before. Most editors don't support that. In Emacs you can just select the code of f() and press undo (C-_).

Several popular undo extension libraries break this feature.


Just to emphasize part of your comment as I think a lot a people aren't fully aware - Vim has an undo tree https://vimhelp.org/usr_32.txt.html#usr_32.txt


> ":earlier 15 minutes" takes you to the code as it was 15 minutes ago.

TIL. Very cool. tnx


Been using vim for few years, never hears of that feature. Still a student.


Don't forget about the Gundo plugin: it lets you navigate your tree of undo history.


Gundo is great. There is also mundo, which is a fork of gundo. And, undotree, which doesn't require python support.


Far from solved

Modal editing isn't enough if you type whole sentences/paragraphs of text within a single insert session

Also undo in selection is required to "solve" it, as well as semantic points besides "last 15 minutes" (like "last saved session" or "last time you quit the editor")

Also UI with a diff is sometimes better than having to undo/redo to see changes


>Modal editing isn't enough if you type whole sentences/paragraphs of text within a single insert session

Why not? Honest question.


Because it's too coarse: undoing the whole paragraph instead of just the last word with a typo is too much, so you'd have to be always aware of this limitation and break flow to switch modes for no other reason than to insert "undo points", and these are unnecessary mental bookkeeping chores


I don't have this problem in practise. It's not "break flow to switch modes" for me. I don't type this way. There's always as much movement as there's typing, especially while programming. Even when writing prose I exit the insert mode each time I think of what to write next making this always a good "undo point".

If I make a typo while in the insert mode I just remove last character or last word. I guess you could argue that this is something that undo should also cover but I don't see much need for that.

I'm not in the insert mode, exiting to normal mode for movement and undo. I'm in the normal mode entering insert mode to write.


While the inferiority of your workflow doesn't matter to you, it's still a point against undo being "solved" by vim


What problem does a magical "does what I mean" undo solve that pressing ^W or ^H doesn't? At a minimum, your "superior workflow" is "<Esc>ui" or "^Ou", both of which are more keystrokes and require your editor to read your mind.


Or my superior workflow is Alt-u or hold U, which is a single shortcut/keystroke, so you lose there as well

The other issue with your attempt at solving undo without undo is that the deletion doesn't become part of redo, also a single word with a typo could be in the middle of a paragraph where you navigated with a trackpad, while ^W would delete near the current cursor position

You don't need magic to fix the ubersimplicity of current defaults, just a bit of openness


^W is ctrl-W.


So what? How does it address the issue I've described?


It's the same number of keypresses as alt-U.


So now it's not "more keystrokes" as you originally claimed (and it's a higher number of keypressses vs holding U), and you still haven't addressed the issue with ^W I've described


There are alternatives to "undo", which are probably better suited to "undo last word" without making "what will undo undo" more mentally complicated. Having undo reverse the last edit, whether that's a word or a few words or adding an argument definition to a function, or adding an "if" block -- makes it very easy to keep in your head. "I want to undo that last thing I did".

The alternatives are:

    - Control-W (delete the last word, stay in insert mode)
    - "W" to move back a word (or "B" to move back a big word).  So: <Esc>BC
There's apparently also a recipe for setting up vim to insert undo points whenever you hit space, so you could also enable that for the files you want to do that on.


understanding of what that "last thing I did" isn't session specific, like "adding an if block and a long comment describing what that block does" is semantically two last things I did, the fact that they're happening within a single insert session doesn't help. And I don't really need too keep it in my head, a solved undo would make it visually obvious to free my head

The alternatives you suggest are bad: 1. This is no undo, so then it's not part of redo

2. Why would I ever want 3 undo points if I indent my comment a bit by 3 spaces?


I love vim but I actually find undo/redo to be one of the few annoyances of the program, because of the whole linear undo/redo thing. So I guess I need to look into this g+ option…


As I mentioned elsewhere, gundo is really helpful for navigating the undo tree.

My favorite thing about vim's flow is that it isn't just limited to undo/redo: you can re-perform your most recent action anywhere using the . key.


The idea of an undo “graph” in memory is something I’ve implemented, and in C++ even, so it was fun to read the article. Couldn’t agree more — once you’ve had it, going back to the linear, often truncated thing most application implement is a little sad.

In my case when state A transitioned to B, was undone and further changes made to get state C I would retain the XOR of the changed memory of the A->B transition as run length encoded bytes B’ and similarly C’ so it was lightweight and extremely fast to undo and redo (the latter being exactly the same as the former but applied to A rather than B, since that’s how XOR works). Cool stuff.


Maya's C++ API was what taught me the most about the importance of proper do/undo functionality. As soon as you start to modify the DAG, you really must ensure that the undo operation leaves everything (be it meshes or shaders or curves, whatever) as it was before your custom object got inserted into the DAG. Else your plugin is worthless.

See the doIt/undoIt methods in https://help.autodesk.com/view/MAYAUL/2022/ENU/?guid=Maya_SD...


I've been working on an editor (not text) in C++ and pretty early got into undo/redo. I went down the route of doIt/undoIt for commands but that quickly got old. There was both the extra work needed to implement undo separately for every operation, but also the nagging feeling that the undo operation for some operation wasn't implemented correctly.

In the end, I switched to representing the entire document state using persistent data structures (using the immer library). This vastly simplified things and implementing undo/redo becomes absolutely trivial when using persistent data structures. It's probably not something that is suitable for all domains, but worth checking out.

https://github.com/arximboldi/immer


Man, fonts sometimes really matter.

I've been pondering about calling your developers nasty names and what a dolt method is actually doing ...

Until I realized that they were DO-IT, UNDO-IT. <facepalm>


Bit of trivia:

The first Macs (or maybe it was the Lisa) had DO IT instead of OK. The people in the focus groups testing it got annoyed that they were being called dolts.


Not the first Macs, the work-in-progress Lisas they used for user research https://www.folklore.org/StoryView.py?story=Do_It.txt:

“Starting in the summer of 1981, Larry organized a series of user tests of the nascent Lisa software, recruiting friends and family to try out the software for the first time, while being observed by the Apple designers who recorded their reaction.

[…]

Finally, the team noticed one user that was particularly flummoxed by the dialog box, who even seemed to be getting a bit angry. The moderator interrupted the test and asked him what the problem was. He replied, "I'm not a dolt, why is the software calling me a dolt?"

It turns out he wasn't noticing the space between the 'o' and the 'I' in 'Do It'; in the sans-serif system font we were using, a capital 'I' looked very much like a lower case 'l', so he was reading 'Do It' as 'Dolt' and was therefore kind of offended.

After a bit of consideration, we switched the positive confirmation button label to 'OK' (which was initially avoided, because we thought it was too colloquial), and from that point on people seemed to have fewer problems.”


Reading the comments on how vim and Emacs have implemented Undo/Redo with various options was very helpful. I personally don't use these editors, but I'm working on a graph drawing tool where the user usually modifies multiple parts of the graph. Right now we only have linear redo/undo implemented. Reading about all this other options is really helpful. Especially the Time travel and the regional undo are very smart features that are also very helpful for all graphic based editors.


> Time travel and the regional undo

Jetbrains’ IDEs have both of them and more. The feature is called “Local History”. [1] You can see the history of an file, a selected region of text, or even your entire project. It can undo file deletions/moves/renames. It feels like a personal “automated git without the git hassles”. It has got me out of some really bad situations.

[1] https://www.jetbrains.com/help/idea/local-history.html


Both Ardour and Cubase (both DAWs) had branching undo/redo systems in the mid-2000s. Ardour and I think also Cubase abandoned it before 2010 because almost all users could not deal with the complexity.

Maybe for programmer-oriented text editors the user reaction/experience might be different.


I'm surprised there isn't much discussion of data structures here, as that would inform an undo algorithm.

I infer from 'PieceTree'

That this is a binary tree based piece chain.

That makes undo really easy, especially character by character.

Nice explantation of piece chains

https://www.catch22.net/tuts/neatpad/piece-chains/


Some points that often get left out of 'undo' discussions:

* The usual linked-list (or tree) implementation is very cache-unfriendly if you're undoing several steps at a time. Using "linked list inside a buffer" is better; this does not preclude trees, the back "pointer" just has to specify the offset as well as the previous buffer (when you reach the fixed-size allocation you will also have to do this). If you undo across a buffer change you'll also need to update the "most recent redo" backlink from the new buffer.

* SSO strings will usually beat external strings for small edits (this can be variable-size in the linked-list-in-a-buffer case); for large edits see if you can just incref part of the main editing buffer rope.

* It is highly useful to expose a few "shortcut" undo commands: undo to previous save, undo to previous build, etc. Manual tags is probably not very practical most of the time, and "save" is essentially one anyway.


I'm not sure if it was recorded, but I think Don Knuth's Christmas lecture last week was on a related topic. It's a follow-up to his 2018 lecture:

https://online.stanford.edu/donald-e-knuth-lectures


If you are looking for counter examples, take a look at Excel on windows. It undoes in multiple windows.

Say you have two documents open. You make a change in the first then change the second document then go back to the first and make a change. One document has two changes and the other has one.

First undo impacts document one. Second alters document two. Infuriating


Yes. This is arguably the most infuriating implementation of undo/redo that I've used in any application.

It's worse than an undo that just wipes-out the document. In that scenario I'd just not use the feature.

The behavior in Excel tricks you into using the feature by working as you'd expect in a single document scenario. Then you open a second document and end up trashing one or the other when you undo the wrong thing.

I don't know how anybody ever thought this implementation was the right answer. Ever.


more interesting part is, when you find these are shared in powerpoint and word, too.

there is not warning about I modified another file, if I dare to work on multiple task, everything probably wreck.


Devil's advocate: For Excel, it makes a bit sense to have app global undo because sometimes Excel books refer other opened books' data. For Word and PowerPoint, it's hard to advocate because I've never seen referring other docs.



I used an editor with a redo tree, DeScribe (I believe) word processor on OS/2 and Windows. The redo tree was pretty cool, but I'm so used to losing my stuff on change after undo that I don't miss it. If I really cared to save a version, I would've committed it to my local git.


Sorry, but the page you were trying to view does not exist <-- ?




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: