
My big fear about implementing a text editor is writing a rope data structure to be able to edit large files. I don't really know if it's mandatory for all text editors though.

I was a little frustrated with Sublime Text folding code using indentation instead of syntax (there's an issue but they don't want to fix it). I have large C++ files, and it seems Visual Studio does a better job at folding.




It's not mandatory. There should be good rope implementations already, though. If you're in Rust there's ropey and at least one other pretty good one.

On modern computers, you can get away with a flat buffer of text (not even a gap buffer) for documents up to a few megabytes. I don't really recommend a gap, because it adds complexity and doesn't help with the worst case, though of course it cuts your average case down.

If you're going for simpler than a rope, my recommendation is an array of lines. You have to do logic to split and fuse lines (for example, when backspacing over a newline), but it's not too bad. The only thing it doesn't do really well is single long lines.
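To make the split and fuse logic concrete, here is a minimal sketch of an array-of-lines buffer. The class and method names (`LineBuffer`, `insert`, `backspace`) are made up for illustration, and it assumes lines are stored without their trailing newline.

```python
class LineBuffer:
    """Text stored as a list of lines, without trailing newline characters.

    Hypothetical sketch: the names here are illustrative, not from any editor.
    """

    def __init__(self, text=""):
        self.lines = text.split("\n")

    def insert(self, row, col, s):
        """Insert s at (row, col); newlines in s split the line."""
        line = self.lines[row]
        head, tail = line[:col], line[col:]
        parts = s.split("\n")
        parts[0] = head + parts[0]      # text before the cursor stays on the first piece
        parts[-1] = parts[-1] + tail    # text after the cursor moves to the last piece
        self.lines[row:row + 1] = parts

    def backspace(self, row, col):
        """Delete the character before (row, col) and return the new cursor."""
        if col > 0:
            line = self.lines[row]
            self.lines[row] = line[:col - 1] + line[col:]
            return row, col - 1
        if row == 0:
            return row, col             # nothing to delete at the very start
        # Backspacing over a newline: fuse this line into the previous one.
        prev_len = len(self.lines[row - 1])
        self.lines[row - 1] += self.lines.pop(row)
        return row - 1, prev_len

    def text(self):
        return "\n".join(self.lines)
```

Every edit copies the whole affected line, which is why a single very long line is the weak spot.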

I don't recommend piece tables. They have superb performance on first load, but then fragment. The reason I'm such a huge fan of ropes is that they perform excellently in the worst case - long edit sessions, long lines.

Best of luck!


In my text editor, I just use a doubly linked list of lines, instead of a rope. It doesn't limit the performance in any situation I've found. A performance problem I do have is large column mode edits, e.g. if I want to delete the first columns of the entire file (typically some log file with timestamps at the start). I suspect the rope data structure would make that worse.

I like the simplicity of the doubly linked list of lines. I'm not sure what advantage the rope data structure would bring.


A doubly linked list of lines tends to be slow when, for example, editing documents with very long "lines", e.g. doing a find/replace on a 10 MB JSON file that's not pretty-printed and is a single line.


I think that using ring buffers as lines could give a small speedup in that specific case, and for shift/unshift ops.


If you use C++ you can start with a std::vector. Computers are really fast. Don’t optimize prematurely. And hey, maybe you don’t need a rope. Just put one std::vector per row of text into a std::map or std::list?

Lots of problems that have an optimal complex solution can be solved with way simpler constructs in the prototyping phase.


> in the prototyping phase

I'd go further than that. I'd say in production too. If the simple thing is fast enough even when you test on a slow machine, then implementing the faster but more complicated data structure is a form of premature optimization.


Use a gap buffer: https://en.wikipedia.org/wiki/Gap_buffer

For scaling large docs, you can do a linked list of gap buffers and avoid re-allocs of large buffers.

Super simple, efficient enough for most “I wrote my own text editor” projects, and you can always plug in a more complex structure later.
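For anyone who hasn't seen one, a rough sketch of a single gap buffer follows (not the linked-list-of-gap-buffers variant). The `GapBuffer` name, its methods, and the growth policy are assumptions for illustration. A Python list stands in for the fixed-size character array you would use in C or C++.

```python
class GapBuffer:
    """Characters live in one array with a movable gap at the cursor.

    Hypothetical sketch: names and growth policy are illustrative choices.
    """

    def __init__(self, text="", gap=16):
        self.buf = list(text) + [None] * gap
        self.gap_start = len(text)       # first free slot
        self.gap_end = len(self.buf)     # one past the last free slot

    def __len__(self):
        return len(self.buf) - (self.gap_end - self.gap_start)

    def _move_gap(self, pos):
        # Shift characters across the gap until the gap starts at pos.
        while self.gap_start > pos:
            self.gap_start -= 1
            self.gap_end -= 1
            self.buf[self.gap_end] = self.buf[self.gap_start]
        while self.gap_start < pos:
            self.buf[self.gap_start] = self.buf[self.gap_end]
            self.gap_start += 1
            self.gap_end += 1

    def _grow(self, need):
        # Widen the gap in place; a real implementation would reallocate.
        extra = max(need, len(self.buf))
        self.buf[self.gap_end:self.gap_end] = [None] * extra
        self.gap_end += extra

    def insert(self, pos, s):
        if not 0 <= pos <= len(self):
            raise IndexError("insert position out of range")
        self._move_gap(pos)
        if self.gap_end - self.gap_start < len(s):
            self._grow(len(s))
        for ch in s:                     # typing fills the gap from the left
            self.buf[self.gap_start] = ch
            self.gap_start += 1

    def delete(self, pos, count=1):
        """Delete count characters starting at pos by widening the gap."""
        self._move_gap(pos)
        self.gap_end = min(self.gap_end + count, len(self.buf))

    def text(self):
        return "".join(self.buf[:self.gap_start] + self.buf[self.gap_end:])
```

Repeated edits at the same position are cheap because the gap is already there. The cost shows up when the cursor jumps far away, since `_move_gap` copies every character in between.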

The author’s implementation will slow down if you have very long lines (e.g., transpiled JS).


As another commenter points out, using RRB trees is a great option! The immer version is both fast and easy to work with and you get undo for free (and even undo trees).
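The "undo for free" part comes from immutability: every edit produces a new version and the old one stays valid. Below is a toy sketch of that idea using plain Python tuples of lines. To be clear, this is not an RRB tree and not immer's API. The `History` class is hypothetical, copying a tuple is linear in the number of lines rather than logarithmic, and it keeps a linear history instead of an undo tree.

```python
class History:
    """Undo/redo over immutable snapshots (tuples of line strings).

    Hypothetical sketch of the persistence idea only, not an RRB tree.
    """

    def __init__(self, lines):
        self.versions = [tuple(lines)]
        self.index = 0

    @property
    def current(self):
        return self.versions[self.index]

    def replace_line(self, row, new_line):
        doc = self.current
        # Build a new tuple; unchanged line strings are shared by reference.
        new_doc = doc[:row] + (new_line,) + doc[row + 1:]
        # Drop any redo branch (an undo tree would keep it instead).
        del self.versions[self.index + 1:]
        self.versions.append(new_doc)
        self.index += 1

    def undo(self):
        if self.index > 0:
            self.index -= 1
        return self.current

    def redo(self):
        if self.index < len(self.versions) - 1:
            self.index += 1
        return self.current
```

With a real persistent vector, making the new version is cheap because most of the tree is shared, so keeping every version around costs little memory.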



