Aminal: Golang terminal emulator from scratch (github.com/liamg)
182 points by guessmyname on Nov 28, 2018 | 80 comments



I hadn't heard of SIXEL (https://github.com/saitoha/libsixel/blob/master/README.md) before. It's cool to see a standard for bitmap graphics in the terminal catching on.


Sixel is an old graphics format:

• [1988] http://www.vt100.net/docs/vt3xx-gp/chapter14.html

• [1990] http://www.digiater.nl/openvms/decus/vax90b1/krypton-nasa/al...

It is one of the reasons why iTerm2 got so popular: https://iterm2.com/documentation-images.html


As that page says, iTerm2 uses its own proprietary format. There's an open bug about adding SIXEL support: https://gitlab.com/gnachman/iterm2/issues/3240


I think one of the most popular Linux terminal emulators currently is kitty, which also supports this. So it's definitely catching on.


Kitty does not use SIXEL, but defines its own format: https://sw.kovidgoyal.net/kitty/graphics-protocol.html


KiTTY is useful for Windows users, but is any significant number of people using it on Linux? I would imagine that, aside from people using the default terminal emulators that come with their distro or desktop environment like xterm, gnome-terminal and konsole, the most widely used terminal emulator on Linux would be urxvt.

Personally I use terminology which I am satisfied with. https://www.enlightenment.org/about-terminology.md


OP is talking about this Kitty (https://sw.kovidgoyal.net/kitty/), not this one (http://www.9bis.net/kitty/).


Ah, I see. Thanks.


Yeah, I've used it on OS X. It renders fast and is the most responsive terminal I've tried (versus alacritty, iTerm2, and Terminal.app). Its font rendering isn't super optimal, and it's a little rough (no hotkeys for zooming, etc.), but it's really your only option if lag bothers you.

Nowadays I just use terminal in gvim. It solved like all of my problems. Should've known Vim would save me again haha.


Another of the most popular (and enduring) terminal emulators, xterm, can be built to support sixel.


While a GPU accelerated terminal emulator sounds great on paper, do keep in mind that GPU acceleration doesn't really give you that much in terms of raw performance. You still have to render glyphs individually, almost always on CPU. So, the reality is probably more that GPU acceleration gives you more flexibility, rather than giving you performance. Anecdotally, I just ran this and it takes roughly twice as long to slog through my dmesg as gnome-terminal.

That said... pretty cool. I don't know if there already existed VT100 emulation written in Go, but it doesn't hurt to have that. Plenty of applications might want to embed a terminal or otherwise have terminal emulation.


There's probably lots of room for optimization, for example currently it sets and resets all sorts of GPU state on each print call, for each character, which is rather expensive:

https://github.com/liamg/aminal/blob/master/glfont/font.go#L...

Speaking of the main loop, for every column and every row:

    cx := uint(gui.terminal.GetLogicalCursorX())
which in turn calls another function -- move that outside of the loop and do it once. In general, buffer values that don't change but get used in loops, instead of fetching them by calling functions on each iteration.

Also, while it may not matter much in this case unless there are other sources of GC churn, just because I noticed it: don't create an array for the vertices for every character to then throw it away and recreate it, have one and keep reusing that.
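
To make both suggestions concrete, here's a self-contained sketch with the cursor lookup hoisted out of the loop and one vertex slice reused per glyph. Only GetLogicalCursorX() is confirmed from the aminal source above; every other name here is illustrative:

    // Hoist the cursor lookups and reuse one vertex slice.
    // terminal and render() are stand-ins, not aminal code.
    package main

    type terminal struct{ cursorX, cursorY uint }

    func (t *terminal) GetLogicalCursorX() uint { return t.cursorX }
    func (t *terminal) GetLogicalCursorY() uint { return t.cursorY }

    func render(t *terminal, cols, rows uint) {
        cx := t.GetLogicalCursorX() // fetched once, outside the loops
        cy := t.GetLogicalCursorY()
        vertices := make([]float32, 0, 24) // one slice, reused for every glyph
        for row := uint(0); row < rows; row++ {
            for col := uint(0); col < cols; col++ {
                vertices = vertices[:0] // reset length, keep backing array
                isCursor := col == cx && row == cy
                _ = isCursor // ...append this cell's quad to vertices and draw...
            }
        }
    }

    func main() { render(&terminal{}, 80, 24) }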


If you have good generational GC (with relocation), then creating and destroying short-lived objects should be almost free?

In practice, you will be re-using the same areas of memory.


Go's GC is deliberately non-relocating, which precludes being generational. https://blog.golang.org/ismmkeynote The repeated state in question might be benefiting from escape analysis, but I can't tell from a cursory examination.
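
If you're curious whether a given allocation escapes, the compiler will tell you. A toy illustration (not aminal code):

    // Build with `go build -gcflags=-m` to print escape analysis decisions.
    package main

    func stackLocal() float32 {
        buf := [4]float32{1, 2, 3, 4} // never leaves the function: stack
        return buf[0]
    }

    func escapes() []float32 {
        return make([]float32, 4) // returned to caller: heap
    }

    func main() {
        _ = stackLocal()
        _ = escapes()
    }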


Sure, but "almost free" GC is free in the same way that throwing away 3 cents is almost free: once it adds up enough, you start losing frames. Just like a straw practically weighs nothing, except when it's the straw that causes the camel to miss a frame.

In this case, I also doubt it would make a difference by itself, but I haven't looked at all of the source, so I just mentioned it as something to maybe watch out for. Also, every bit of GC you can super easily avoid "buys" you room for GC that would make the code more complicated to avoid. Waste not, want not :)


Ok, drop the "almost". GC can be as "free" as, e.g., stack allocation (which nobody seems to mind too much).

See the comment by kibwen (https://news.ycombinator.com/item?id=18551218) for some good pointers to more concepts.


I've been using alacritty, another new-ish GPU accelerated terminal emulator, and I can confirm that its performance is better than gnome-terminal or similar. So at least there is something to this idea.


Yes, its README explicitly says:

> Alacritty is the fastest terminal emulator in existence. Using the GPU for rendering enables optimizations that simply aren't possible in other emulators.

I wonder, though, why people need fast terminal emulators? I'm using xterm and I haven't found any issues with speed.


I've been using only alacritty for the past year, and at least on macOS the rendering latency is on another level. The response is super fast while scrolling through text in less or vim, changing windows in tmux seems to happen instantly, and even typing seems to happen without a delay compared to iTerm or Terminal.

One thing that I thought might explain this is that it doesn't support scrollback by itself; to get it you're required to rely on tmux or screen.


Alacritty supports scrolling as of version 0.2.0 https://github.com/jwilm/alacritty/releases/tag/v0.2.0


Coincidentally, xterm is the fastest I know of, in particular if you use bitmap fonts. Gnome-Terminal, for instance, feels rather sluggish.


A while back LWN did a great comparison of terminal emulators, performance was one aspect they looked at: https://lwn.net/Articles/751763/


Always nice to see one's own subjective experiences backed up by hard measurements :-)


In my experience, urxvt is both faster and uses less memory than xterm by a good margin, although there's not such an appreciable difference on a modern system. Gnome Terminal, on the other hand, is beyond slow, far worse than the still slow but acceptable Konsole.


You can read text as it flies past while scrolling.

Without acceleration it's a blur.


I sometimes dump megabytes of text to stdout/stderr as a way of monitoring or debugging long-running computations. Terminal speed matters then.


In those cases, I usually redirect the output to a file, so I can search through it.

Another approach is to run "screen", which has several advantages: (1) not all text needs to be written to the terminal, only the text when you actually look, (2) you can open the computation on a different computer later (e.g. perhaps at home to check if everything is ok), and (3) if you accidentally close the terminal the computation keeps running.

In both cases, my terminal emulator does not need to be fast, really.

My biggest issue with speed in the terminal comes from network latency (which is difficult to fix).


tmux is a modern alternative to screen.

Oh, and while you are at it, have a look at mosh as well. Mosh bills itself as the ssh alternative for mobile, intermittent connections but it does take the idea of 'not all text needs to be written to the terminal, only the text when you actually look' even further.

Mosh also has lots of network latency hiding tricks up its sleeve.


I bet you could implement a fast-enough terminal over QUIC, and keep the idea of a non-permanent connection.


Rendering speed shouldn't matter in this case, because the terminal shouldn't be trying to render every single line which is sent to it. The terminal should just process the stream of commands and only render real updates to the screen.


If you pre-render the glyphs to a texture, you can draw them to the display as fast as your GPU can go. Changes in font size or face mean you'd have to pre-render them all over again, but that's still not terrible.


With a fixed-width font in a single size, surely caching glyphs goes a long way? Most output probably comes from a small set of characters, and obviously a cache can support full Unicode. I guess you couldn't do subpixel hinting, but an alpha-blended character can be tinted any colour for FG/BG.


You can do caching with any renderer, GPU or not. Almost any text renderer caches glyphs to some degree. Of course, even fixed width fonts can have things like ligatures and composite characters which complicate matters. Rather than caching individual code points, you'd really need to cache something more like individual grapheme clusters.
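
As a sketch of what keying on grapheme clusters could look like (all names illustrative, none from aminal):

    // Cache glyphs by rendered cluster, not by rune, so that
    // "e" + U+0301 and ligated pairs each get their own entry.
    package main

    type atlasRegion struct{ x, y, w, h int } // location in a texture atlas

    type glyphKey struct {
        cluster string  // grapheme cluster, e.g. "e\u0301" or "ffi"
        size    float64 // point size; a face change flushes the cache
    }

    var glyphCache = map[glyphKey]atlasRegion{}

    func main() {
        glyphCache[glyphKey{cluster: "e\u0301", size: 12}] = atlasRegion{0, 0, 7, 14}
    }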

Subpixel rendering is tricky, but should be possible to do with shaders. You could render to a normal single-channel texture at 3x the horizontal spatial resolution (basically, stretched 3x horizontally); then, when rendering, step 1 pixel across the glyph texture for each subpixel, alpha blending each channel individually.


Ah true. I guess I personally do not use a font with ligatures so it didn't really come to mind.

Subpixel rendering with shaders is a neat idea; is it something that has been done before?


I don't know; I assume yes because it seems possible and I doubt I'm the very first person to think about it.


Don't forget, some fonts (particularly emoji) have multicolor glyphs!


> You still have to render glyphs individually, almost always on CPU.

How do you figure? As I imagine it you would stream the buffer to the GPU and render it with a pixel shader. Even if layout or glyph calculation is done on the CPU it should be highly cacheable.


GPU acceleration doesn't change the dynamics of caching glyphs. Existing text renderers already cache aggressively. Also, worth noting that caching the rendering of individual code points won't work effectively even in all cases for a terminal, because of diacritics, ligatures, etc.

Of course it's all pixels. You can do it all in pixel shaders. But of course, it's a lot more complicated than it seems. Supporting RTL requires some pretty advanced layout logic. Supporting OpenType ligatures also requires some pretty complicated, stateful logic. And you probably want to support "wide" glyphs even for a fixed width font, which are present in terminals where you are dealing with, for example, kanji.

If you want subpixel AA, that's another complicated issue. If you want to be able to do subpixel AA where glyphs are not locked to the pixel grid, you will need to do more work.

If you want to be able to render glyphs on the GPU purely, you'll need to upload all of the geometries in some usable form. Most GPUs don't render curves, so you will probably turn them into triangles or quads. That's a lot of work to do and memory to utilize for an entire font of glyphs.

You also might think you could utilize GPUs to perform anti-aliasing, but the quality will be pretty bad if done naively, as GPUs don't tend to take very many samples when downsampling.

Since a lot of the work is stateful and difficult to parallelize, doing it on CPU will probably be faster, that way you only pay the latency to jump to the GPU once.


> Since a lot of the work is stateful and difficult to parallelize, doing it on CPU will probably be faster, that way you only pay the latency to jump to the GPU once.

You can still easily cache the post-processed glyphs, especially if you don't use subpixel AA. There isn't that much state to a scrollback buffer after glyph processing.

I don’t get the resistance to this type of rendering when at this point there are at least three major monospace glyph rendering libraries implemented for the GPU, and I bet there are dozens I don’t know about.


No such resistance here; I've written text renderers myself. I'm just pointing out that it's not simple and there aren't trivial performance gains. Like I said, you can't really just cache codepoints. The way this particular terminal emulator does it, it's keeping a cache of individual codepoints. Even forgetting OpenType ligatures, this also won't work for things like diacritics.


It's great to see more terminal emulators, especially ones that have good Unicode support and are trying to be faster than the average garbage.

There's at least one other terminal emulator that does things akin to the image-catting and viewing colors when you click them, namely terminology [0].

There's also at least one other that uses OpenGL for faster rendering (alacritty [1]).

I don't know of any other that does both of these things: uses OpenGL for speed and also innovates in interactivity.

[0]: https://www.enlightenment.org/about-terminology.md

[1]: https://github.com/jwilm/alacritty


Also: iTerm2 can use macOS’ Metal graphics API to great effect: https://gitlab.com/gnachman/iterm2/wikis/Metal-Renderer


Sixel is great. But you could also interpret the headers of common image file formats as terminal escape sequences, so that you can directly "cat lena.png" and see the image. Now, that would be a real hack!


The terminal couldn't do that, as it wouldn't know the difference between a user typing `cat lena.png` into Bash and typing the same sequence of characters into vim (for example).

This is why there are separate commands like `icat` which output ANSI escape sequences for rendering a PNG. Unfortunately, a great many of these are terminal specific (iTerm, Terminology, Kitty and VT200/VT300 (sixel) all have their own unique escape sequences; the closest thing to a standard we have is sixel and, frankly, I don't think it's that good on modern systems[1]). So you end up with a different command for each "standard" you want rendered.

The approach I've taken in my own $SHELL is to have a builtin command like `cat` (I call it `open`) that auto-detects which terminal emulator you're running and picks the right encoding to match what the terminal supports (if no standard can be auto-detected, it falls back to ANSI art using coloured blocks). However, the issue with that is that people now need to run a new custom shell instead of Bash / Zsh / whatever. But the logic behind the auto-detection is really quite simple, so it could be written as its own standalone tool.

[1] sixel re-encodes images in character blocks of 6 pixels, whereas iTerm, Kitty and Terminology send the image as-is over the terminal as a base64 encoded string. All of those solutions are inlined via ANSI escape sequences though.
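
For illustration, the auto-detection can be keyed on environment variables these emulators are known to set. A hedged sketch -- the coverage, heuristics, and returned protocol names are all illustrative:

    // Guess which inline-image protocol the host terminal understands.
    package main

    import (
        "fmt"
        "os"
    )

    func detectImageProtocol() string {
        switch {
        case os.Getenv("TERM_PROGRAM") == "iTerm.app":
            return "iterm" // iTerm2's proprietary inline-image escape
        case os.Getenv("TERM") == "xterm-kitty":
            return "kitty" // kitty graphics protocol
        case os.Getenv("TERMINOLOGY") == "1":
            return "terminology"
        default:
            return "blocks" // fall back to ANSI art with coloured blocks
        }
    }

    func main() { fmt.Println(detectImageProtocol()) }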


> The terminal couldn't do that, as it wouldn't know the difference between a user typing `cat lena.png` into Bash and typing the same sequence of characters into vim (for example).

This is not what I am talking about. I mean the binary headers inside png, jpg, tiff files, etc. These binary sequences can easily be interpreted as terminal control sequences, and they are very unlikely to appear otherwise. For pnm files I agree that the situation is a bit more delicate, as those sequences are likely to appear in the wild (e.g., in the pnm man page).
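
The PNG signature, for instance, was designed to be distinctive, so matching on it is cheap. A sketch (not aminal code):

    // The 8-byte PNG signature is very unlikely to occur in ordinary
    // terminal output, so a terminal could watch its stream for it.
    package main

    import (
        "bytes"
        "fmt"
    )

    var pngMagic = []byte{0x89, 'P', 'N', 'G', '\r', '\n', 0x1a, '\n'}

    func looksLikePNG(stream []byte) bool {
        return bytes.HasPrefix(stream, pngMagic)
    }

    func main() {
        fmt.Println(looksLikePNG([]byte("\x89PNG\r\n\x1a\n...")))
    }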


Ahhh. That's an interesting idea. I like it.


This is an inevitable feature of the perfect terminal. The alternative is showing the binary contents of the image file in ASCII, which is always undesired.


Not just undesired; it can sometimes play havoc with your terminal session. I've had tmux status bars break, bash prompts go "weird", and all sorts of other glitches when I've accidentally piped binary data to STDOUT.

I'm very tempted to fork Aminal and have a play with your idea.


That would be lovely! I do a lot of image processing using unix pipes, and when I forget the "|display" at the end it breaks everything.

Notice that you do not even need to use sixels in your case. Since you control the rendering, you can simply put the pixels of the image on a rectangle (one that allows panning and zooming, for example), and move the cursor below it.


It wouldn't be until I'm off for Christmas holidays (at the earliest) though. I don't have a whole lot of time left between work and family; and have a few projects on the go as it is already.


This is neat, and it's important to iterate on things we take for granted like what terminal emulator we use.

But a larger question: at what point do we focus on the novelty of the project, not that it was written in a specific language?


> the novelty of the project, not that it was written in a specific language?

That is the novelty :) Don't worry; we'll tire of Go in 1-50 years ;)


It's so that we, hackers, can see if it's worth looking at the source. Personally I look at the source of Go or Rust projects because I'm not yet proficient in those languages, and it's a way I can learn idioms.

In fact, since this is for a developer audience, I’d wonder why you wouldn’t say what language something is in.


I don't think your question makes any sense in a forum with the word "Hacker" in the title.


Once we've rewritten everything in Go. More realistically, once the language is stable and known enough that these things become uninteresting.


Could you elaborate? I don't understand what you are implying.


When "you cannot do X in language Y" stops being an issue.


It's great that there's a hotkey to web search for the current selection, but I'd not be surprised to see people requesting that it not be Google. Know your audience and all that.


Seems like a config option would be easy enough.


Looks very cool. What’s the timeline for it being released for Windows?


It would be great to see this working with the new ConPty API.

Note: I work for Microsoft but I have never touched anything to do with ConPty.


> Retina display support

I want to know what is difficult about supporting this.


This may be referring to running under Linux with a high-DPI monitor: a lot of programs that are naïve about this don't scale properly and are teeny-tiny.


Isn't OpenGL deprecated now? iTerm recently switched to Metal.


Since it's Linux compatible, it's much simpler to support a cross-platform standard like OpenGL and mostly work everywhere, rather than use whatever Apple is pushing now and also have to support OpenGL anyway for Linux.

I assume they're open to contributions if someone wants to add Metal support for macOS.


ETA on tmux -CC support?


I knew this would be created eventually but I was thinking it would just be a shell. This looks impressive at first glance, implementing everything with OpenGL.


I would have loved to see someone with a plan9-golang background creating a modern 9term-like "dumb" terminal. One that is mostly an editable text buffer with mouse plumbing capabilities.

Nowadays, terminals within vim/emacs are what's closest, feature-wise. Not quite dumb, though.


Does it not use VTE? If so, that would be great.

Also, to the other commentator: be glad it's not written in "modern web technology". I keep 20 terminals open usually.


Edit: it's GTK that provides multithreading functionality, not VTE itself.

VTE-based terminals are actually multithreaded globally, so you only ever have one process regardless of the number of tabs/windows open.

But I agree, VTE is something I avoid ever since the scrollback fiasco. (For the uninitiated: the library wrote all scrollback to disk, which is not something end-users might expect, security-wise. Scrollback is encrypted now, but it appears you still cannot disable writing to disk: e.g. the output of dmesg, 62210 bytes on my machine, causes a 5-digit number of bytes to be written each time on any VTE-based terminal I've tried, even with "infinite" scrollback disabled. And none appear to honor TMPDIR.)


I wish I understood it enough to throw it into GDB - it looks like it uses O_TMPFILE, but I don't think that quite gives you what you want.

https://github.com/GNOME/vte/blob/65d67f6f814df4f4ab800898bb...


The O_TMPFILE flag just gives you an anonymous file, as far as I understand. From that source file, VTE uses glib, particularly g_get_tmp_dir() and g_file_open_tmp() (the latter uses the former), which claims to honor TMPDIR according to the docs: https://developer.gnome.org/glib/stable/glib-Miscellaneous-U...

Perhaps a terminal emulator is unable to read variables exported in .bashrc, as the terminal is started prior to executing the shell.


> I keep 20 terminals open usually.

I prefer tmux windows; I have 1-3 terminal emulators and 10-30ish windows/tabs (not counting splits/panes), all on 1 tmux session via `tmux new-session -t $SESSION`. Acceptably navigable via prefix-[1-9] and prefix-w (which gives a list to jump to).


I find tmux makes less sense within a tiling window manager because then you have two contexts effectively providing the same functions (or three, if you use vim :split (or four, if you use tilix)).


I also use a tiling wm, and use tmux. The main reason is that it means I can easily transition to using the same tool when working with remote machines using mosh. Personally, I've always felt odd when trying to use i3 tiling for terminals (maybe that's because I used tmux before I used i3).


Same here. I don't even use split vim unless I have to do heavy copy-paste and want to avoid extra keystrokes to invoke the system clipboard.


Cannot edit my previous comment, so adding a new one:

Can you please add some details on Unicode support? Most terminals claim to have it but fail miserably on Asian (Asia continent) fonts.

1) RTL for west Asia

2) Indic fonts and dependent vowels

3) Double-width CJK.

I genuinely believe that without all 3, no terminal emulator deserves to claim Unicode support.


And I am not sure how many of them support Arabic right-to-left either.


Arabia is in west Asia :)


Or rather, the Middle East.



