Build Your Own Text Editor (viewsourcecode.org)
711 points by tosh on Aug 4, 2019 | 103 comments



Quote (from memory) from Rob Pike, a few years before he wrote the editor "sam": "Everybody writes a screen editor. It's easy to do and makes them feel important. Tell them to work on something useful".

This was not long after he had left Caltech, where he had been a system admin/system programmer for CITHEP (Caltech High Energy Physics), for Bell Labs. CITHEP hired a few undergraduates as the new admins and system programmers. I was one of those undergraduates.

Most of us used the line-oriented editor QED but, as programmers were wont to do in the early '80s, we said that someday we were going to write ourselves awesome screen editors.

One evening, Karl Heuer and I were both boasting about how great our screen editors would be, and it somehow turned into an editor-writing throwdown. We took terminals at opposite sides of the room, and both started hacking away, designing as we coded.

This continued all night, mostly in silence, with the occasional boast ("I've got text search working!") and counter-boast ("I had that an hour ago--I'm doing regular expressions now!"), and frequent trips downstairs to the vending machines for soda and snacks.

Another of the undergraduate system admin/programmers, Norman Wilson, then arrived, and saw what Karl and I had been doing all night. He sent an email to Pike about it, and the quote at the start was Pike's reply.


Thanks for sharing.

Just wondering how one writes an editor in one night.


In case anyone hasn't seen it, this CppCon talk https://youtu.be/sPhpelUfu8Q covers RRB-trees and how they're a great data structure for a text editor: it can handle arbitrarily large documents without locking up, allows editing while saving and opening, and makes an undo command easy to support.

Most text editors today do not support such functionality. With lockless data structures like an RRB-tree it becomes easy. It really is an ideal way to write a text editor.
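
To make the undo and edit-while-saving point concrete, here is a minimal sketch, not the code from the talk. It assumes a plain tuple of lines as a stand-in for an RRB-tree, so each edit copies the tuple in O(n), whereas an RRB-tree would share most of its nodes between versions. The names `insert_line`, `delete_line` and `History` are hypothetical.

```python
# Hypothetical sketch: immutable document versions with undo/redo.
# A tuple of lines stands in for an RRB-tree; every edit builds a new
# tuple (O(n) here, whereas an RRB-tree would share most nodes).
from typing import List, Tuple

Doc = Tuple[str, ...]


def insert_line(doc: Doc, index: int, text: str) -> Doc:
    """Return a new document with `text` inserted at `index`."""
    return doc[:index] + (text,) + doc[index:]


def delete_line(doc: Doc, index: int) -> Doc:
    """Return a new document without the line at `index`."""
    return doc[:index] + doc[index + 1:]


class History:
    """Undo/redo as two stacks of immutable versions."""

    def __init__(self, initial: Doc = ()) -> None:
        self.current: Doc = initial
        self._undo: List[Doc] = []
        self._redo: List[Doc] = []

    def commit(self, new_doc: Doc) -> None:
        # Remember the old version; a new edit invalidates the redo stack.
        self._undo.append(self.current)
        self._redo.clear()
        self.current = new_doc

    def undo(self) -> None:
        if self._undo:
            self._redo.append(self.current)
            self.current = self._undo.pop()

    def redo(self) -> None:
        if self._redo:
            self._undo.append(self.current)
            self.current = self._redo.pop()


history = History(("hello",))
history.commit(insert_line(history.current, 1, "world"))
snapshot = history.current  # could be handed to a background save
history.commit(delete_line(history.current, 0))
history.undo()
```

Because no version is ever mutated, `snapshot` stays valid while later edits are committed, which is the property that lets saving and editing overlap.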


For those like me who are unfamiliar with Relaxed Radix Balanced trees, here's the paper presenting the algorithm:

RRB-Trees: Efficient Immutable Vectors (2012)

https://infoscience.epfl.ch/record/169879/files/RMTrees.pdf

EDIT: For further reading:

Improving RRB-Tree Performance through Transience (2014)

https://hypirion.com/thesis.pdf

Article: https://hypirion.com/musings/thesis

Code: https://github.com/hyPiRion/c-rrb


Are there any good ones that exist? I'm a security analyst and I frequently work with large text files, 200MB-1GB and the text editors I've tried are incredibly slow.

A lot of the time I can use grep and other command-line tools to find what I need, but given the nature of the job I don't always know what I'm looking for so having the raw file open in a text editor is sometimes the only way.


I’d recommend xi-editor, vi/vim/neovim or emacs for that


Try JOE, it can handle files this large no problem.


The person from the talk may have a github repo with the text editor he wrote shown in the talk.


As an ultimate noob of this area of expertise, can I ask if Sublime Text works? If not, why?


I'm the noob, not the parent. No sarcasm intended.


Gvim can open files of 2GB or more and search through them without any problem.


What about piece tables?


To contrast with most of the "write your own small text editor" articles out there, I feel compelled to post this:

http://kparc.com/$/edit.k

This is a text editor written in K, an APL-family language. Determining how it works is left as an exercise to the reader. I wonder whether an explanation would be longer or shorter than the tutorial in the article...


Unfortunately I believe this is written in k5, for which there is no public implementation, hence it's not possible to run this editor. (I'd be delighted to be wrong here, would be very interested to see this running)


Wow! a text editor in 3 lines of code?

It would be nice if a short write-up accompanied this.


Probably most of the work is hidden in the underlying "framework" or "language". Here's a text editor in 1 line of html: <textarea></textarea>


APL is an extraordinarily concise programming language, even without taking its environment into account. This is a well known implementation of Conway's Game of Life in APL:

↑1 ⍵∨.∧3 4=+/,¯1 0 1∘.⊖¯1 0 1∘.⌽⊂⍵


Maybe this will help? I wrote the following up a few years ago now:

https://news.ycombinator.com/item?id=8476633


Thanks for that write-up. Is there a decent reference or tutorial for ‘k’ online that you would recommend? I would like to learn more about its semantics: for example, the views concept, and whether old versions of a view/var are available.


Not to my knowledge. There are a fair number of documents[1] describing the language to be sure, but it is difficult to divorce K from its implementation(s), so k2, k3, k4, k5 (etc) are all different languages. Your best bet is to download some implementations and try them out. k4 is built into kx's q, which has a free download[3]; just enter a line with a backslash on it to get into k.

Views are just a convenient syntax for memoization. Old versions of the value aren't available.

[1]: http://web.archive.org/web/20041022042401/http://www.kx.com/...

[2]: https://web.archive.org/web/20130801233812/http://www.kuro5h...

[3]: https://kx.com/


There's a decent tutorial now at shakti.com:

https://shakti.com/tutorial/


oK is an implementation of K6 in JavaScript. No views, but you can play with it in your browser and the documentation is pretty good.

http://johnearnest.github.io/ok/index.html

https://github.com/JohnEarnest/ok/blob/gh-pages/docs/Manual....

https://github.com/JohnEarnest/ok/blob/gh-pages/docs/Program...


The first line loads http://kparc.com/$/view.k, where the first lines are multiple variable definitions per line.

A non-obfuscated version is at http://kparc.com/edit.k, where some things are explained.


Obligatory clip: Game of Life in APL

https://youtu.be/a9xAKttWgP4


These sorts of languages make me wonder, can this be reversed? As in:

sdj8340u2u$@(:@#P+_A:@L@#!@)_ZL?> M OW {E)@(

Can a language be invented where this random string I hammered out would be a valid program?


Yes of course. If you balanced the parentheses it would be syntactically valid k. It assumes a bunch of names, and would be unlikely to do anything useful, but it would be legal.

The k7/shakti tutorial playfully notes that running one's finger along the shifted top row of the keyboard is syntactically valid.


>Can a language be invented where this random string I hammered out would be a valid program?

perl :-)


pretty sure that is a valid vi command


Shameless plug. I'm working on a somewhat novel modal text editor: https://github.com/dpc/breeze

It's written in Rust, and it tries to improve on Kakoune, which tries to improve on Vi/Vim.

If you feel like hacking on a text editor, and would be interested - let me know.


Interesting. I had this experience with Kakoune: https://lobste.rs/s/v17gol/switching_sublime_text_from_vim_2.... I'm curious if you expect to improve on it along the dimensions I mentioned.


Ha! I have a very similar experience and I agree with all your points. I really enjoy the basic experience and `kak` is my main text editor ATM, but some details and parts of the philosophy are infuriating me.

I started `breeze` as a prototype way to show kakoune author how kak could potentially work https://github.com/mawww/kakoune/issues/2590 , and later decided that actually I could maybe push it further into a full new text editor.

I'd like to make it more natural to use, and get rid of the (IMO) pointless idiosyncrasies like the plugin API being keystroke based. I'd like to also make it easy to embed into other software, so that there's some hope that it could be easily integrated into existing software, so I'm not doomed to forever switch between kakoune and classic-vi editing.


The lack of built-in window management is a major feature of kakoune for me. I try to delegate all window management to my window manager, including such things as browser tabs, so I like that kakoune is designed to be used like that. The biggest downside of kakoune's approach for me is that it makes vimdiff-functionality pretty hard to achieve.


Cool! I wanted to do similar things, neat idea. These days I want to try something similar but with a nice UI - I'm so tired of the terminal, but it's hard to get away because editors like Kakoune and Vim are just so powerful.

My wish is that you or I (if I ever have the time) will implement something like this on top of xi-editor. I.e., implement this as a feature of a lower-level editor backend, so that you get the frontends "for free".

It drives me nuts that there's so much work done on the frontend side of things, but the backends of editors are being reinvented repeatedly for little gain.

I hope Xi can make a performant, hackable backend to plug into solid frontends.


I doubt this will ever happen. It's all about complexity, and in the case of text editors there are layers and layers of abstractions that often (for UX reasons) have to be integrated tightly. Trying to unpack all the pieces into a fully generic bring-your-own implementation is going to be a lot of work, both upfront and in maintenance, and will lead to a complexity explosion.

I really like some technical aspects of Xi, but I view it as a typical wishlist project/technical showcase. In the meantime projects like kakoune, with much smaller scale but better focus, in shorter time, build ecosystems that can be a practical Vim alternative.


One of NeoVim’s goals was to also be a backend to multiple front ends. The eternal dream of having Vim on a <textarea> or when writing email.

I don’t know how far they’ve come as I switched back to Vim when version 8 came out.


There is a ticket about this that might interest you. It seems like there is work to do, but that it might get there

https://github.com/xi-editor/xi-editor/issues/1187


Ok(Event::Mouse(_)) => { // no animal support yet } ;)


Thanks, I had not known about Kakoune (or breeze). It's interesting looking at Kakoune's issue list. The idea for the editor is good, but you can see the huge amount of work needed to make a feature-complete, bug-free editor.

Also it's interesting to see the code when you write with the latest iteration of C++ instead of ancient versions of C.


I didn't notice actual design ideas (apart from the philosophy which sounded good to me). Do you have things on your mind on what kind of actions user can do?

The list of what already works seems just the same as in Vim.

I'm just curious what kind of modal system it actually is.


What I instantly liked is, how literally it starts from scratch and builds a couple of lines at a time.

Can anybody refer to a similar step by step guide to building a compiler?


Niklaus Wirth's "Compiler Construction", which teaches how to build an Oberon subset compiler.

https://inf.ethz.ch/personal/wirth/

Afterwards you can follow up with building a workstation OS in a systems programming language with GC, all the way from the boot sector loading code to the graphical UI, by reading and implementing "Project Oberon".

The 2013 edition uses an FPGA instead of the original Ceres hardware.

A̶n̶d̶ ̶s̶i̶n̶c̶e̶ ̶W̶i̶r̶t̶h̶ ̶d̶o̶e̶s̶ ̶n̶o̶t̶ ̶f̶e̶e̶l̶ ̶i̶t̶ ̶i̶s̶ ̶t̶i̶m̶e̶ ̶t̶o̶ ̶a̶c̶t̶u̶a̶l̶l̶y̶ ̶r̶e̶t̶i̶r̶e̶,̶ ̶h̶e̶ ̶h̶a̶s̶ ̶u̶p̶d̶a̶t̶e̶d̶ ̶t̶h̶e̶ ̶b̶a̶c̶k̶e̶n̶d̶ ̶t̶o̶ ̶t̶a̶r̶g̶e̶t̶ ̶R̶I̶S̶C̶ ̶V̶ ̶a̶s̶ ̶w̶e̶l̶l̶.̶


> And since Wirth does not feel it is time to actually retire, he has updated the backend to target RISC V as well.

Are you confusing RISC-V with RISC5 [1]?

[1] https://en.wikipedia.org/wiki/RISC5


Apparently yes, thanks for the correction.


http://www.penguin.cz/~radek/book/lets_build_a_compiler.pdf PDF form of "let's build a compiler" as mentioned by others (Haven't read it fully myself but I hear it's a very good introduction)

I haven't got a copy, but as far as I know Andrew Appel's book is structured in that manner, i.e. basic steps first then more advanced topics like garbage collection later.

"Engineering a compiler" is not step by step but very readable and contained if you want to skip the discussion of (say) various scanner implementations.

There is a decent sized gap between any real compiler and a related toy or book implementation (In general), so take a look at a compiler for a language you know (that isn't C++).

http://www.drdobbs.com/architecture-and-design/so-you-want-t... Walter Bright (of Digital Mars C++ and D fame) wrote an article about language implementation e.g. how to write a good compiler rather than a toy one.


It's not finished yet, but you probably want https://craftinginterpreters.com/

I loved his "Game Programming Patterns" book, and this new book is looking totes spiffy


>> step by step guide to building a compiler?

I haven't found any better resource than nand2tetris; see Projects 6 to 12, where you start with an assembler and end up with a compiler [1]. There are an accompanying book [2] and Coursera courses [3]. Hope this helps, best!

[1] https://www.nand2tetris.org/course

[2] https://www.nand2tetris.org/book

[3] https://www.coursera.org/courses?query=from%20nand%20to%20te...

[4]


There are several super long (8+ hr) live coding videos of building an assembler and compiler for CP/M here: http://cowlark.com/2019-07-15-cowgol-prototype/index.html

I haven't had the stamina to go through one but have watched some sizable chunks and it's pretty cool, even with me not really knowing much about low-level stuff like that.

There's also a session where he live codes a vi-like text editor here: http://cowlark.com/2019-06-28-cpm-vi/index.html


Thanks a lot for sharing this video! This is absolutely the kind of thing that I hope to find on HN. Plan to spend all of tomorrow on this.


Another notable video series, Handmade Hero: https://www.youtube.com/user/handmadeheroarchive


I enjoyed Jamie Kyle's "The Super Tiny Compiler."

https://github.com/jamiebuilds/the-super-tiny-compiler


Shameless self-plug, but I specifically wrote interpreterbook.com and compilerbook.com because I’m also a huge fan of start-from-scratch, build-it-line-by-line tutorials and couldn’t find such a thing for interpreters/compilers that wasn’t about a toy language. You might like it too.



I've been considering making a video tutorial on how to make a compiler from scratch. I'll try to keep it very concise yet include the very basics for anyone who isn't a computer scientist. (I am one by education).

Would you be willing to pay for it (how much)? And what would be a good platform for this?

I'm doing this to pay bills while I work on something of my own. I will use C myself (might include a couple of 'parallel' videos for Rust too wherever applicable) but the way I explain, one would be able to use any programming language.

The contents would roughly go like this:

1) Explaining the 'theory of computation'. This is theoretical and not really necessary for learning how to build compilers, but I love the topic. It gives you formal insight into the 'power' of computers and, if you're going to build non-toy compilers, may help you write more efficient algorithms.

2) Tokenization and parsing.

3) Type system

4) Symbol table

5) Assembly

6) Brief introduction to some basic optimization techniques.

I want to hear if people here have ideas on how to go about doing this. Is there a good platform for an interactive online coding+slides class?


"written in C" is not compatible with "basics for anyone who isn't a computer scientist", IMO. It just gets you bogged down in low-level details that are not necessary for an introductory course.

I still haven't found a better introductory text than Appel's "Modern Compiler Implementation in ML". I also like pairing this with "Language Implementation Patterns", which uses ANTLR.


Video makes it easy to ramble about design-choices, etc. But I'd rather read content. Seeing snippets of text on-screen, at my own pace, is much better than seeing shots of a monitor or slides. Much like this story.

That said, there are a couple of great books out there already: interpreterbook.com, compilerbook.com, and craftinginterpreters.com - all recommended. If you're doing this for money you'll struggle, unless you have a lot of clarity / something new to offer.


Game scripting mastery, pretty much line by line, and you have the source code for each chapter.


Eugene Wallingford at the University of Northern Iowa recommends "My First Fifteen Compilers"

http://composition.al/blog/2017/07/31/my-first-fifteen-compi...

I haven't tried it yet, but it's on my list of "someday" projects


There is an old one on compilers called "Let's Build a Compiler" by Jack Crenshaw. It's in Pascal, but I think there are modern rewrites of it (like C on an x86).


Here it is, https://compilers.iecc.com/crenshaw/

As for modern, one can follow along using Free Pascal.


Compilers have a lot of relatively independent pieces, unlike a text editor. You'll have better luck looking for guides to those pieces instead (lexing, parsing, type checking, etc...)


For your information, even a simple text editor (like much user-facing software) is composed of several logically independent pieces, in the spirit of the model-view-controller paradigm.


It depends on how you implement it. I tend to implement mine as part of a compiler/run time, so it’s just another stage for me.

But my point was that none of those editor parts is really identifiable as its own thing outside of editing.


Same request but for an ACID-compliant database engine :)


A long time ago I had to write a Windows custom control for text rendering. This was really hard to do in a performant way once you dealt with different fonts, page sizes, scrolling and all the other stuff we usually take for granted. It was one of the most painful months of my life, but I learned a lot, since nobody told me how to do this and I had to make it all up.


I’d love if you go into more detail.. dealing with similar ish right now and would love to hear other war stories


Shameless plug - Text editor in 512 bytes (1kb if you include the kernel): https://bitbucket.org/danielbarry/saxoperatingsystem/src/mas...


In the same vein there's kilo by antirez:

https://github.com/antirez/kilo

Edit: oops, yours is in assembly, so definitely not in the same vein :-p (except for both trying to be small!)

Edit2: ... and, the article is about kilo. D'oh. This is what happens when you go first through HN comments, only to later open the article (which I do often!).


My big fear about implementing a text editor is writing a rope data structure to be able to edit large files. I don't really know if it's mandatory for all text editors though.

I was a little frustrated with Sublime Text folding code using indentation instead of syntax (there's an issue but they don't want to fix it). I have large C++ files, and it seems Visual Studio does a better job at folding.


It's not mandatory. There should be good rope implementations already, though. If you're in Rust there's ropey and at least one other pretty good one.

On modern computers, you can get away with a flat buffer of text (not even a gap buffer) for documents up to a few megabytes. I don't really recommend a gap, because it adds complexity and doesn't help with the worst case, though of course it cuts your average case down.

If you're going for simpler than a rope, my recommendation is array of lines. You have to do logic to split and fuse lines (for example, when backspacing over a newline), but it's not too bad. The only thing they don't do really well is single long lines.
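
A minimal sketch of the array-of-lines idea, with the split and fuse logic mentioned above. The class and method names are hypothetical, and positions are plain `(row, col)` indices with no bounds checking beyond what the list operations do.

```python
# Hypothetical array-of-lines buffer: one Python string per line.
from typing import List, Tuple


class LineBuffer:
    def __init__(self, text: str = "") -> None:
        # "".split("\n") yields [""], so an empty buffer has one empty line.
        self.lines: List[str] = text.split("\n")

    def insert_char(self, row: int, col: int, ch: str) -> None:
        line = self.lines[row]
        self.lines[row] = line[:col] + ch + line[col:]

    def split_line(self, row: int, col: int) -> None:
        """Pressing Enter: split one line into two at `col`."""
        line = self.lines[row]
        self.lines[row:row + 1] = [line[:col], line[col:]]

    def backspace(self, row: int, col: int) -> Tuple[int, int]:
        """Delete the character before the cursor; return the new cursor."""
        if col > 0:
            line = self.lines[row]
            self.lines[row] = line[:col - 1] + line[col:]
            return row, col - 1
        if row > 0:
            # Backspacing over a newline: fuse this line into the previous one.
            prev_len = len(self.lines[row - 1])
            self.lines[row - 1] += self.lines.pop(row)
            return row - 1, prev_len
        return row, col  # start of buffer: nothing to delete

    def text(self) -> str:
        return "\n".join(self.lines)


buf = LineBuffer("hello world")
buf.split_line(0, 5)            # lines are now "hello" and " world"
cursor = buf.backspace(1, 0)    # fuses them back together
```

Every edit rebuilds one line's string, which is why a single very long line is the weak spot of this layout.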

I don't recommend piece tables. They have superb performance on first load, but then fragment. The reason I'm such a huge fan of ropes is that they perform excellently in the worst case - long edit sessions, long lines.
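
For readers who haven't met piece tables, here is a minimal insert-only sketch under my own assumptions (hypothetical names, Python strings as the two buffers, no deletion). Loading is cheap because the file becomes a single piece, but every insert in the middle of a piece splits it, which is the fragmentation described above.

```python
# Hypothetical insert-only piece table.
from typing import Dict, List, NamedTuple


class Piece(NamedTuple):
    source: str   # "orig" (read-only file contents) or "add" (append-only)
    start: int    # offset into that buffer
    length: int


class PieceTable:
    def __init__(self, original: str) -> None:
        self.buffers: Dict[str, str] = {"orig": original, "add": ""}
        # Loading is cheap: the whole file is one piece.
        self.pieces: List[Piece] = (
            [Piece("orig", 0, len(original))] if original else []
        )

    def __len__(self) -> int:
        return sum(p.length for p in self.pieces)

    def insert(self, pos: int, text: str) -> None:
        if not 0 <= pos <= len(self):
            raise IndexError("insert position out of range")
        if not text:
            return
        new_piece = Piece("add", len(self.buffers["add"]), len(text))
        self.buffers["add"] += text
        offset = 0
        for i, piece in enumerate(self.pieces):
            if offset <= pos <= offset + piece.length:
                # Split the piece around the insertion point.
                split = pos - offset
                left = Piece(piece.source, piece.start, split)
                right = Piece(piece.source, piece.start + split,
                              piece.length - split)
                self.pieces[i:i + 1] = [
                    p for p in (left, new_piece, right) if p.length > 0
                ]
                return
            offset += piece.length
        self.pieces.append(new_piece)  # only reached for an empty table

    def text(self) -> str:
        return "".join(
            self.buffers[p.source][p.start:p.start + p.length]
            for p in self.pieces
        )


table = PieceTable("hello world")
table.insert(5, ",")    # splits the original piece in two
table.insert(12, "!")   # appends after the last piece
```

After these two edits the table already holds four pieces; a long session of scattered edits keeps growing that list unless the implementation merges or rebalances pieces.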

Best of luck!


In my text editor, I just use a doubly linked list of lines, instead of a rope. It doesn't limit the performance in any situation I've found. A performance problem I do have is large column mode edits. eg if I want to delete the first columns of the entire file (typically some log file with timestamps at the start). I suspect the rope data structure would make that worse.

I like the simplicity of the doubly linked list of lines. I'm not sure what advantage the rope data structure would bring.


A doubly linked list of lines tends to be slow when, for example, editing documents with very long "lines", e.g. doing a find/replace on a 10 MB JSON file that's not pretty-printed and is a single line.


I think that ring buffers as lines could do a little speedup in that specific case and shift/unshift ops.


If you use C++ you can start with a std::vector. Computers are really fast. Don’t optimize prematurely. And hey, maybe you don’t need a rope. Just keep one std::vector per row of text in a std::map or std::list?

Lots of problems that have an optimal complex solution can be solved with way simpler constructs in the prototyping phase.


> in the prototyping phase

I'd go further than that. I'd say in production too. If the simple thing is fast enough even when you test on a slow machine, then implementing the faster but more complicated data structure is a form of premature optimization.


use a gap buffer: https://en.wikipedia.org/wiki/Gap_buffer.

For scaling large docs, you can do a linked list of gap buffers and avoid re-allocs of large buffers.

Super simple, efficient enough for most “I wrote my own text editor” projects, and you can always plug in a more complex structure later.

Author’s implementation will drag down if you have very long lines (e.g., transpiled JS).
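
As a rough illustration of the single gap buffer suggested above (not the linked-list-of-gap-buffers variant), here is a small sketch. The class name, the default gap size and the growth policy are my own assumptions.

```python
# Hypothetical gap buffer: text lives in one list with a movable "gap"
# of free slots at the cursor, so typing at the cursor is cheap.
from typing import List


class GapBuffer:
    def __init__(self, text: str = "", gap_size: int = 16) -> None:
        self._buf: List[str] = list(text) + [""] * gap_size
        self._gap_start = len(text)     # first free slot
        self._gap_end = len(self._buf)  # one past the last free slot

    def __len__(self) -> int:
        return len(self._buf) - (self._gap_end - self._gap_start)

    def _move_gap(self, pos: int) -> None:
        # Shift characters across the gap until the gap starts at `pos`.
        while self._gap_start > pos:
            self._gap_start -= 1
            self._gap_end -= 1
            self._buf[self._gap_end] = self._buf[self._gap_start]
        while self._gap_start < pos:
            self._buf[self._gap_start] = self._buf[self._gap_end]
            self._gap_start += 1
            self._gap_end += 1

    def _grow(self, extra: int) -> None:
        # Widen the gap in place by inserting free slots at its end.
        self._buf[self._gap_end:self._gap_end] = [""] * extra
        self._gap_end += extra

    def insert(self, pos: int, text: str) -> None:
        if not 0 <= pos <= len(self):
            raise IndexError("insert position out of range")
        self._move_gap(pos)
        if len(text) > self._gap_end - self._gap_start:
            self._grow(max(len(text), len(self._buf)))
        for ch in text:
            self._buf[self._gap_start] = ch
            self._gap_start += 1

    def delete(self, pos: int, count: int = 1) -> None:
        """Delete `count` characters starting at `pos`."""
        if pos < 0 or count < 0 or pos + count > len(self):
            raise IndexError("delete range out of bounds")
        self._move_gap(pos)
        self._gap_end += count  # the gap simply swallows the characters

    def text(self) -> str:
        return "".join(self._buf[:self._gap_start] + self._buf[self._gap_end:])


gap = GapBuffer("hello world")
gap.insert(5, ",")
gap.delete(0, 1)
gap.insert(0, "H")
```

Edits near the previous edit cost almost nothing; the expensive case is jumping to a distant position, because `_move_gap` has to shift everything in between.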


As another commenter points out, using RRB trees is a great option! The immer version is both fast and easy to work with and you get undo for free (and even undo trees).


I very much enjoyed this article, as a sort of kata.

I started a clean repo, loaded stock vim, and typed every line in order.

Learned a lot! Kudos to the author and antirez, and may I suggest that it's worth working through slowly, as presented.


As the author/maintainer of a once popular text editor I feel it is easy to get started then very hard to finish. Modern text editors require a lot of details and even more demands from those who use it. It’s a fun programming challenge. I still find myself thinking about awesome features to add.



This looks like fun. Does anybody know of other guides for different kinds of programs (other than compilers, which I've seen tons of here on HN)?


How about ray tracing? Peter Shirley (whom I learned a ton from in grad school) wrote a trilogy of mini-books beginning with "Ray Tracing in One Weekend."

https://github.com/RayTracing/raytracinginoneweekend


I don't see any references to things like HarfBuzz or DWRITE, which are NECESSARY to properly display any script that's not European. All properly implemented text editors are rich text, because you will need to support font fallback, BiDi, and all the Unicode complexity.


You absolutely don't need any of these for a tutorial text editor... Those are all things you can hook-in later, after you've understood how to make text editors...


Did you see that this is a command-line app (like vi or Emacs)? Those libraries look like they're used for GUIs. Do you know if they can be used for the terminal?


I prefer The Craft of Text Editing by Craig A. Finseth:

https://www.finseth.com/craft/


Shameless plug: I'm working on writing this in Go (Golang). https://github.com/ankur-anand/goditor


Even more shameless: I used it as a base for my own ersatz-emacs in golang! https://github.com/japanoise/gomacs

This is actually my go-to editor for small files/quick edits now


Good stuff for reminding me that `x/sys/unix` exists and the terminal-raw-mode it allows. Should inspire someone out there eventually to plumb everything in pure Go that people would otherwise seek from ncurses, GNU readline lib etc.


You should attach a license to your code.


If you use Windows (or Linux with Wine) you can write self-contained apps using AutoHotkey. It's free from autohotkey.com. It's usually advertised as a macro recorder, which it also is.


This is a pretty awesome resource! I wish this existed for Rust.


I followed this guide but implemented it in Rust: https://github.com/khadiwala/kilo-rs, if you’d like to take a look.


I had the same plan. Thanks for the link, nice to have a reference if I get stuck (as a Rust noob).


Disclaimer: I haven't read the article just yet. Saving it for later.

An alternative approach to this is to open vim with a blank .vimrc and add config as you need it. Nothing except what you REALLY need NOW to get things done. That's how I started my adventure with vim a few years ago and it's still paying off.


This is about programming your own text editor, not configuring one to your requirements.


Which is what writing vim script essentially is. Or if vimscript is not enough, you can use literally any other language and integrate it via vimscript. I'm not an expert though but I have spent quite a lot of time on doing that.


I get that Vim is really powerful, and so are Emacs and the like. However, I'd guess a large part of going to the trouble of building your own text editor would be to learn about the algorithms and data structures used, something you can't do by writing Vim configuration files.


Good point - if someone is after that, then what I wrote isn't the best way to achieve this


Well - not to be needlessly contrarian, but I've programmed that way (vim with nothing except what's included) for nearly 25 years now... it already has everything you need.


No it isn't.


I'd be more willing to change my mind if you provided something more than that.


> Nothing except what you REALLY need NOW to get things done.

Do you really need `g?` (rot13) to get things done?


I used it once or twice in 5 years



