The surprising thing here is that this is the Ubuntu terminal. Indeed, each keystroke goes through the input driver, to USER32, through the console code, to the terminal process in question. But then it goes through a virtual input device, across the VM boundary (which, on Windows, is a VM exit plus an entry), through the Linux input driver, into dash (or whatever Ubuntu uses these days), and then all the way back out again, and then it gets rendered.
The fact that this whole dance is faster than notepad.exe suggests that Linux is doing pretty well, that MS’s hypervisor stack is performing well, and that the fancier MS frameworks are really quite slow.
(To be fair, the GNOME Wayland stack has recently been upgraded from awful to sort-of-decent latency-wise. It’s all too easy to design a system where latency wasn’t a primary consideration from day 1 and to have a very hard time improving latency after the fact.)
I think most of the latency between key press and rendering to screen is in the layout / drawing phase.
Forwarding a few function calls is usually pretty fast; the text layout and rendering part is what is slow. Especially since many UI frameworks don't render individual characters; you may end up redrawing a full line or more for every keystroke. (I only have experience with macOS / Cocoa rendering, and I was surprised how slow it is compared to all the other things my code does.)
All of the actual work can easily happen in under a millisecond. What makes input to screen output slow is anything asynchronous - polling (on the input side, e.g. in the USB protocol itself - USB to PCI uses interrupts!) and waiting for the next frame on the output side. Let's say the console waits for vsync to render its next frame, then the compositor grabs it and renders it in the next frame. That is one frame of gratuitous latency.
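To put rough numbers on that, here is a back-of-the-envelope sketch. The 8 ms USB polling interval, the 60 Hz refresh rate and the 1 ms of actual work are my assumptions for illustration, not measurements. An event that arrives at a random moment waits half a period on average, and a full period in the worst case.

```python
# Back-of-the-envelope latency budget. All constants are assumed values
# chosen for illustration; plug in your own hardware's numbers.

USB_POLL_MS = 8.0          # assumption: keyboard polled at 125 Hz
FRAME_MS = 1000.0 / 60.0   # assumption: 60 Hz display
WORK_MS = 1.0              # assumption: actual processing takes about 1 ms


def average_wait_ms(period_ms: float) -> float:
    """Mean wait for the next tick when the event arrives at a uniformly random time."""
    return period_ms / 2.0


def worst_wait_ms(period_ms: float) -> float:
    """Worst case: the event arrives just after a tick and waits a full period."""
    return period_ms


# Input polling wait + real work + wait for the next frame.
average_total = average_wait_ms(USB_POLL_MS) + WORK_MS + average_wait_ms(FRAME_MS)
worst_total = worst_wait_ms(USB_POLL_MS) + WORK_MS + worst_wait_ms(FRAME_MS)

if __name__ == "__main__":
    print(f"average: {average_total:.1f} ms, worst case: {worst_total:.1f} ms")
```

Under those assumptions the waiting dwarfs the work, which is the point being made above.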
Then there's the input lag of your monitor too, which is really important.
A lot of monitors have really bad input lag, in the 50-60ms range, and it's highly variable. This spec is usually not listed by the manufacturer, and it's not the same thing as response time, which is typically 1-10ms in most modern LCDs.
Your monitor's input lag plays a very big role in how fast key presses are perceived, because what makes something feel fast and snappy is ultimately the end-to-end time between pressing a key and your eyeballs registering the result.
The monitor I picked has about 10-14ms of input lag, which is very good compared to the average. That's running at 2560x1440 (60 Hz) at 1:1 scaling too.
I still use the same monitor today and I would buy it again today if I were thinking about upgrading. Although I kind of regret writing that blog post now because the monitor is almost twice as expensive today as it was 3 years ago.
> Physical size doesn’t constitute how much you can fit on a monitor. For example my mom thinks that a 25” 1080p monitor is going to let her fit more things on her screen than a 22” 1080p monitor. Don’t be my mom!
> The only thing that matters for “fitting more stuff on the screen” is the resolution of the monitor.
This is only true under the assumption that your eyes have infinite resolution. In the more likely case that they don't, the larger size of the pixels at a higher physical screen size means you need fewer pixels, with the result that you can indeed fit more stuff on the screen at the same resolution.
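A quick sketch of the geometry behind this, in case it helps. The 24-inch viewing distance is my assumption, and the function names are made up for the example. It computes how many pixels fall inside one degree of your field of view; a lower number means each pixel looks bigger, so text can be set in fewer pixels and stay readable.

```python
import math


def pixel_pitch_inches(diagonal_in: float, width_px: int, height_px: int) -> float:
    """Physical size of one (square) pixel, from the diagonal and the resolution."""
    return diagonal_in / math.hypot(width_px, height_px)


def pixels_per_degree(diagonal_in: float, width_px: int, height_px: int,
                      viewing_distance_in: float) -> float:
    """How many pixels span one degree of visual angle at the given distance."""
    pitch = pixel_pitch_inches(diagonal_in, width_px, height_px)
    degrees_per_pixel = math.degrees(2.0 * math.atan(pitch / (2.0 * viewing_distance_in)))
    return 1.0 / degrees_per_pixel


if __name__ == "__main__":
    VIEWING_DISTANCE_IN = 24.0  # assumption: typical desk distance
    for diagonal in (22.0, 25.0):
        ppd = pixels_per_degree(diagonal, 1920, 1080, VIEWING_DISTANCE_IN)
        print(f'{diagonal:.0f}" 1080p: {ppd:.1f} pixels per degree')
```

At the same resolution and distance, the 25" panel comes out with fewer pixels per degree than the 22" one, which is the effect described above.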
It's not just eye resolution, it's how the application is designed. An older WinForms application will have a very compact UI and mad information density, but some modern web apps will dedicate an entire 1080p window to displaying 1 icon and two buttons.
I would say that combined with DPI scaling, his post provides a reasonable rule of thumb.
One frame of latency is not gratuitous; it's perfectly reasonable to not draw directly "ahead of the beam" and instead double-buffer. Otherwise you get a lot of tearing.
I very much doubt the console is watching vsync either.
The extra frame that I mean is: console renders, vsync, compositor grabs it and renders, vsync. That is what happens on X11 with compositing AFAIK. Only one frame time / one vsync is really required before you consider specifics of the software involved. I've read somewhere that Windows also has an extra frame of latency for similar reasons as X11.
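A toy model of that pipeline, as I understand it. This is a simplification and not how any particular compositor is implemented: every stage that waits for vsync hands its output over at the next vsync boundary, and rendering itself is treated as instantaneous. The function names are invented for the sketch.

```python
import math


def next_vsync(t_ms: float, frame_ms: float) -> float:
    """Time of the first vsync strictly after t_ms."""
    return (math.floor(t_ms / frame_ms) + 1) * frame_ms


def time_on_screen(t_input_ms: float, synced_stages: int,
                   frame_ms: float = 1000.0 / 60.0) -> float:
    """When the result becomes visible if each stage waits for the next vsync."""
    t = t_input_ms
    for _ in range(synced_stages):
        t = next_vsync(t, frame_ms)
    return t


if __name__ == "__main__":
    key_press_ms = 5.0  # arbitrary example timestamp
    one_stage = time_on_screen(key_press_ms, synced_stages=1)
    two_stages = time_on_screen(key_press_ms, synced_stages=2)
    print(f"app only: {one_stage:.1f} ms, app + compositor: {two_stages:.1f} ms")
```

In this model the second synced stage always costs exactly one extra frame time, regardless of when the key press arrives.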
In terms of key press latency it makes no difference. Both the old and new MS terminal have about the same input latency.
The problem is, the MS terminal doesn't work well with tmux which is a 100% deal breaker.
WSL2's I/O is better, but that's out of scope for this discussion. It's also a pretty sketchy thing to run as of today, since it requires opting into the Insider release of Windows, which has very questionable telemetry requirements.
Depends on the operation in question. Direct syscalls should be faster, but there are some big differences between the NT and Linux kernels. You end up getting all the poor performance areas of BOTH. That, and Hyper-V has an extremely fast lightweight VM mode for particular hardware. For example, running containers on Hyper-V actually runs them with VM-level isolation, a la Kata containers... and at competitive speeds.
An example of the worst-of-both-worlds result is the filesystem. In the NT kernel, every filesystem call goes through a series of Filters: any program can provide filters for given file types. Filters can do anything, from antivirus programs scanning the file on access, to Notepad adding itself to the Context Menu as an "open with Notepad" item. This makes filesystem access, even metadata checks, very expensive in time on Windows. On Linux, we keep file metadata in memory, which is memory expensive, but it makes certain filesystem operations very fast. The result of mapping one onto the other is a combined filesystem operation which is memory expensive AND slow.
So for example, on WSL 1 (mapped syscalls), using a git repo of any size is impossibly slow... on the order of 10 seconds to get a git status for a work project. But on WSL2, the same syscalls live entirely inside the VM and are almost native fast.
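If you want to see the metadata cost on your own setup, a rough microbenchmark like the one below can be run natively, under WSL 1 and under WSL2 and compared. It is only a sketch: the file count is arbitrary, the function name is made up, and a temp directory of tiny files is not the same workload as `git status` on a real repo.

```python
import os
import tempfile
import time


def mean_stat_microseconds(n_files: int = 1000) -> float:
    """Create n_files small files in a temp dir and time one os.stat() per file."""
    with tempfile.TemporaryDirectory() as root:
        paths = []
        for i in range(n_files):
            path = os.path.join(root, f"file_{i}.txt")
            with open(path, "w") as handle:
                handle.write("x")
            paths.append(path)

        # Time only the metadata lookups, not the file creation.
        start = time.perf_counter_ns()
        for path in paths:
            os.stat(path)
        elapsed_ns = time.perf_counter_ns() - start

    return elapsed_ns / n_files / 1000.0


if __name__ == "__main__":
    print(f"mean os.stat(): {mean_stat_microseconds():.2f} microseconds per file")
```

The files were just written, so this mostly measures the warm-cache path; the absolute number matters less than how it differs between environments.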