Reading the readme, I find myself wondering what problems this solves. The example of wrapping nano strikes me as particularly odd, since editing files is already fairly easy to do programmatically, either via direct file operations or with tools like sed. Aside from editors, most tools that offer a TUI also expose the same functionality for programmatic access (typically via some additional flags to the same binary). So it strikes me as the wrong abstraction for interacting with most things.
The other potential goal I could imagine is automation. If that is the case, I would recommend that the author make it clearer in the examples and also describe it in relation to `expect`, which would probably be my go-to tool for such use cases.
Whatever the case, it does look like a fun project, even if I don't see any cases where it would be the right tool for the job.
Hey, project lead here. I had a very specific use case in mind: I'm playing with using LLM agent frameworks for software engineering - like MemGPT, swe-agent, Langchain and my own hobby project called headlong (https://github.com/andyk/headlong). Headlong is focused on making it easy for a human to edit the thought history of an agent via a webapp. The longer term goal of headlong is collecting large-ish human-curated datasets that intermix actions/observations/inner-thoughts and then using those data to fine-tune models to see if we can improve their reasoning.
While working on headlong I tried out and implemented a variety of ‘tools’ (i.e., functions) like editFile(), findFile(), sendText(), checkTime(), searchWeb(), etc., which the agents call using LLM function calling.
A bunch of these ended up being functions that interacted with an underlying terminal. This is similar to how swe-agent works actually.
But I figured instead of writing a bunch of functions that sit between the LLM and the terminal, maybe let the LLM use a terminal more like a human does, i.e., by “typing” input into it and looking at snapshots of the current state of it. Needed a way to get those stateful text snapshots though.
I first tried using tmux and also looked to see if any existing libs provide the same functionality. Couldn’t find anything so teamed up with Marcin to design and make ht.
Playing with the agent using the terminal directly has evolved into a hypothesis that I've been exploring: the terminal may be the "one tool to rule them all" - i.e., if an agent learns to use a terminal well, it can do most of what humans do on our computers. Or maybe terminal + browser are the "two tools to rule them all"?
Not sure how useful ht will be for other use cases, but maybe!
This makes a lot of sense. I would call that out, because it's really surprising out of context. Hopefully you can see how unusual it would be to drive human interfaces from code when, in the majority of cases, programmatic interfaces for each task already exist and would be much less bug-prone/finicky. I guess the analogy would be choosing to use Webdriver to interact with a service for which there is already an API.
I had this issue of needing to control Docker containers in a VPS, without sharing access to the server itself. It seems like it will be easy to create a simple web service that can communicate with the ht API, list my containers, show me the stats, and restart containers if I want to. I can manage all security in the web service.
You could even do something like a reverse proxy to very limited paths, although I tend to think that would ultimately be a bad idea and making your own HTTP calls is probably better.
You probably don’t want to expose that service to the internet…
I see this as something like the console management for a VPS. Back in the day, I remember reading about how prgmr.com had set up a console that you'd directly SSH into. That's now this interface [1] (and a company name change), but I could see how programmatically working with this would be helpful.
The comment you're replying to mounted the Docker daemon as a local socket, accessible only on the machine. (It exposes an HTTP server still.)
I don't see why one would be any more comfortable exposing a shell to the internet than the Docker daemon. It grants _more_ capabilities. Either should likely be protected by authentication.
My understanding was that there was a server running Docker containers where the admin wanted to allow others to control/start containers without giving them access to the machine through a local login. The idea proposed was to make the docker port accessible to the outside world (authenticated, somehow).
I'm not sure I'd want to expose the Docker port to the outside world (or outside of a strictly firewalled subnet). Even if it is wrapped, this seems too dangerous to me.
The service I talked about is not a shell. It's a command line program that operates as the shell when you login via SSH. Instead of bash/zsh/etc, this program runs instead. The purpose is to give the VM admin access for out-of-band management (serial console, reinstalling the OS, etc). I'm a big fan of this approach, where you don't necessarily get full access to the host, but you do have enough access to do the work that's needed (and still SSH encrypted). No more, no less. To me, this seems like a great approach for something like restricted VPS or container admin.
I ended up doing something similar for an SSH jump box a few jobs back, where you could set up some basic admin things (like uploading SSH keys) using a CLI program that was used as an SSH shell.
To bring it back to the original post -- like others, I had a hard time seeing what the OP could be used for. Until I thought about this OOB CLI. It would be great for scripting access to something like this.
I think I was very unclear. I didn't mount anything, the docker daemon by default is accessible over http (over a unix domain socket rather than the typical TCP).
I was proposing that this person's web app not do any sort of subprocess automation via something like ht, and instead take in requests and talk to the docker daemon on clients' behalf, since that allows any sort of authentication or filtering that needs to happen.
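To illustrate, the "talk to the daemon directly" part is just a couple of HTTP calls over the local socket (these are Docker Engine API endpoints; the container ID is a placeholder):

    # list containers via the local Docker daemon socket
    curl --unix-socket /var/run/docker.sock http://localhost/containers/json
    # restart a specific container (placeholder ID)
    curl --unix-socket /var/run/docker.sock -X POST http://localhost/containers/<id>/restart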
I wasn't really seriously proposing the straight reverse proxy setup. That's one of those layer violations, like PostgREST, that is either genius or lunacy. I haven't figured out which one.
to me this seems essentially like 'screen/tmux without the multiplexing features' which is useful because most of us do 'terminal multiplexing' via our window manager and we're really just using screen because we want to detach the process from the terminal session (e.g. as a glorified nohup wrapper). another similar tool is `nq`.
I wrote an integration testing framework which I wanted to integrate with a tool exactly like this, so it could be used to, e.g., test a command line app like vim.
Expect is what I tried to integrate with first. It falls over quite quickly with any kind of app that does anything mildly complicated with the terminal.
Interesting. When we decided to build ht we didn't compare it to expect (which I hadn't heard of or used) but I'm comparing the two now as they seem related.
How exactly did `expect` fall over?
From what I can tell, expect does not provide the functionality of a stateful terminal server/client under the hood for you so it isn't as easy to grab "text" screenshots of a Terminal User Interface, which is one of the main motivations behind ht (will update the readme to make this main use-case more clear)
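Very roughly, the shape of the interaction we had in mind is something like the sketch below. The "getView" name is real; the other JSON field names here are just illustrative, so check the readme for the actual message format:

    # rough sketch only: drive a program through ht and ask for a text screenshot
    # (JSON field names are illustrative; see the readme for the real ones)
    {
      printf '%s\n' '{"type": "input", "payload": "echo hello\r"}'
      sleep 1
      printf '%s\n' '{"type": "getView"}'
    } | ht bash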
Screenshots were one thing I needed that didn't work, but I think lots of control characters used by command line apps also messed up pexpect.
I built my own probably not very good equivalent of your thing called icommandlib. I'm going to investigate ripping it out and replacing it with your tool.
none of the Linux greeters meet all my needs, so i fall back to `login`. but i still need a graphical program for actually entering in my password -- particularly because some of my devices don't have a physical keyboard (i.e. my phone). so i take the output of a framebuffer-capable on-screen-keyboard [1] and pipe that into `login`. but try actually doing that. try `cat mypassword.txt | login MobiusHorizons`. it doesn't work: `login` does some things on its stdin which only work on vtty. so instead i run login on /dev/tty1, and pipe the password into /dev/tty1 for the auth.
yes, this solution is terrible. a lot of things would make it less terrible. i could fix one of the greeters to work the way i need it (tried that). i could patch `login` (where it probably won't ever be upstreamed). i could integrate the OSK into the same input system the ttys use... or i could reach for `ht`. everything except the last one is a day or more of work.
As others have kinda alluded to, it could be useful for testing TUI applications. I develop a logfile viewer for the terminal (https://lnav.org) and have a similar application[1] for testing, but it's a bit flaky. It produces/checks snapshots like [2]. I think the problems I run into are more around different versions of ncurses producing slightly different outputs.
The immediate thought I had upon reading the description was "this would be great for Minecraft servers".
Most of us running Minecraft servers on Linux have it wrapped in screen or tmux because the CLI is the only way to issue certain commands including stopping it properly.
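For reference, the usual screen workaround looks roughly like this (session name, memory flag, and jar path are just examples):

    # start the server detached in a named screen session
    screen -dmS mc java -Xmx4G -jar server.jar nogui
    # inject a console command into the running session
    screen -S mc -X stuff $'save-all\n'
    # attach when you need the real console
    screen -r mc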
What I typically do is create a systemd service for game servers and attach a TTY. That way it starts with the rest of my web services, and Linux already handles "I/O to processes" via files that other processes can access (e.g. /run/minecraft/{stdin,stdout,stderr}).
Would you mind expanding on that, or can you point me at some relevant documentation?
I've never seen a systemd service example for Minecraft which allowed for sending commands to the server CLI and seeing the result without involving screen/tmux/etc. The top result on Google just doesn't allow command input at all, running the service "headless", the one on the official MC wiki uses screen, and the only other options I've seen use RCON which is neither secure nor does it show the responses you'd get on the MC console.
If there's a way to run just the straight Minecraft JAR as a background service and still be able to interact with it in the occasional cases where I need to I'm very interested.
Oh yeah, one more comment: this stdin redirection isn't really necessary in Minecraft from the last decade.
The Minecraft server has a built-in RCON server running on a separate port that can be enabled (https://wiki.vg/RCON), and once enabled it can be interacted with using an RCON client (like https://github.com/Tiiffi/mcrcon).
So instead of redirecting stdin to a systemd process, you can also just leave stdin disconnected and use the built-in RCON server to run commands every so often.
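For illustration, after enabling it in server.properties (enable-rcon=true, rcon.port, rcon.password), issuing commands is a one-liner with mcrcon; the host, port, and password below are just example values:

    mcrcon -H 127.0.0.1 -P 25575 -p hunter2 "say hello" "save-all"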
Basically, you set up the standard minecraft service and then create a "socket" in systemd to use as stdin for the process (relevant documentation in systemd.socket and systemd.exec).
What this will do is add a systemd dependency on minecraft.service to start minecraft.socket first (which creates the FIFO `/run/minecraft.stdin`), then set up minecraft.service to listen to this socket for its StandardInput (while leaving stdout and stderr pointing towards the journal).
The service can then be started and set to start automatically on boot (`systemctl daemon-reload && systemctl enable --now minecraft`). While running, data can be written to the socket file with `echo` and redirection (e.g. `echo "help" > /run/minecraft.stdin`), and the output will be visible in the journal (`journalctl -xef --unit minecraft.service`).
If you set stderr/stdout to go over the socket as well, then you can attach something like `screen` to it and use it like a typical TTY (or `telnet`).
This uses the file `/run/minecraft.stdin` as the socket, but the documentation for systemd.socket shows that this can also be a TCP port to listen for connections (and systemd.service shows using regular files, but then you have to set them up manually).
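Roughly, the two units described above could look like the sketch below (paths, memory flags, and names are illustrative, not a drop-in config):

    # minimal sketch of the socket + service pair described above
    cat > /etc/systemd/system/minecraft.socket <<'EOF'
    [Unit]
    Description=stdin FIFO for the Minecraft server

    [Socket]
    ListenFIFO=/run/minecraft.stdin
    SocketMode=0660
    EOF

    cat > /etc/systemd/system/minecraft.service <<'EOF'
    [Unit]
    Description=Minecraft server
    Requires=minecraft.socket
    After=minecraft.socket

    [Service]
    WorkingDirectory=/srv/minecraft
    ExecStart=/usr/bin/java -Xmx4G -jar server.jar nogui
    Sockets=minecraft.socket
    StandardInput=socket
    StandardOutput=journal
    StandardError=journal

    [Install]
    WantedBy=multi-user.target
    EOF

    systemctl daemon-reload && systemctl enable --now minecraft
    echo "help" > /run/minecraft.stdin           # send a console command
    journalctl -xef --unit minecraft.service     # watch the output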
andyk here. It's clear our readme is lacking use cases! Adding some now. When we introduced ht on twitter I gave a little more context -- https://x.com/andykonwinski/status/1796589953205584234 -- but that should have been in the project readme.
Also a few people comparing to `expect`. I haven't used `expect` before, but it looks very cool. Their docs/readme seem only slightly more fleshed out than ours :-D
Looks like the main way to use expect is via:
    spawn ...
    expect ...
    send ...
    expect ...

etc.
So the expect syntax seems targeted more towards testing, where you simultaneously get the output from the underlying binary and then check if it's what you expect (thus the name, I guess). I can't tell if there is a way to just get the current terminal "view" (aka text screenshot) via an expect command?
ht is more geared towards scripting (or otherwise programmatically accessing) the terminal as a UI (aka Terminal UI). So ht always runs a terminal for you and gives you access to the current terminal state. Need to try out expect myself, but from what I can tell, it doesn't seem to always transparently run a Terminal for you.
There might already be some other existing tool that overlaps with the ht functionality, but we couldn't find it when we looked around a bunch before building ht.
Sorry, my wording wasn't very clear. I wasn't trying to imply that ht is more geared towards scripting than `expect` (in fact I'd say `expect` is more scripting-oriented being an extension of a scripting language) but rather that ht is more geared towards scripting the terminal as a UI than `expect`.
Am I wrong about that? (I may very well be since I haven't used `expect` before)
Based on my understanding/recollection of `expect`, the concept is that you're scripting a command/process (or command sequence) via a "terminal connection" (or basic stdin/stdout), based on the (either complete or partial) "expected" dialogue response.
I guess it might in theory be possible to script a TUI with, but I suspect it'd get pretty convoluted over an extended period of time.
(BTW I mentioned this in a comment elsewhere in this thread but check out https://crates.io/crates/termwiz to avoid re-inventing the wheel for a bunch of terminal-related functionality.)
I see what you're saying. When I was writing scripts in `expect`, I didn't really ever try to automate TUI programs. So, this could absolutely be a better way to script the terminal as a UI, as you said.
I think it is a really good idea to separate the vt100 emulation "backend" from the UI. Then all terminal emulators could use a common implementation and just focus on displaying the text, instead of emulating quirks of 50-year-old devices.
Using JSON as RPC also seems like a good idea, especially when SSH-ing in, as the language server protocol has shown.
That being said, I don't see this project doing anything (yet?) about escape sequences (for colors, clickable links, mouse and clipboard integrations, setting title and cwd, etc) and handling shortcuts (e.g. translating ctrl-c to sigint). There are very few terminal emulators that get all of that right (I think Kitty, iTerm, WezTerm), so it would be great if this project could lead towards more of them being written.
Most sequences used by CLI apps that are supported by popular (widely used) terminal emulators are supported here, i.e. there's good compatibility with VT100/VT220/VT520/etc.
At the moment there's no support for mouse and clipboard - ht uses asciinema's avt for terminal emulation, where this was never needed (although it may be added to avt if ht needs it).
Regarding the colors: internally there's full support for standard indexed colors (palette) and RGB. The "getView" call currently returns a plain text version of the screen buffer, stripped of all color information (plain text is what andyk's headlong project, which uses ht, needs), but we can easily make it return color attributes for each cell or segment of text (either by modifying "getView", or adding a complementary "getRichView" or something like that).
Is there an actually convincing simple protocol for a 2D character buffer? (As opposed to a jumped-up typescript like the DEC terminals.) I’ve tried looking at the IBM tradition, and while it’s certainly different, I wouldn’t say it’s better on this particular point.
(There is also the part where a VT100-style display is a bit more than just a character buffer - there's rewrapping on resize, for example, which IIUC can be on for some lines and off for others - but let's assume we're willing to postpone updating window contents until the resize is concluded, so we can afford to do that on the application/VT100-processor side. Even though that feels like 1995.)
(Projects that build on this crate include "ratatui" among others...)
In particular, I note at least these specific feature areas:
* "support functions for applications interested in either displaying data to a terminal or in building a terminal emulator": https://lib.rs/crates/termwiz
ht uses asciinema's avt as the underlying terminal emulator, so there wasn't much re-invented here really. It was mostly gluing avt, PTY and JSON RPC together.
Combined with nohup, this could probably be useful for detaching/reattaching to long running processes across user sessions?
I'm back to using tmux again, but for a while I was using another program, dtach, to start vim sessions that I could disconnect from and reattach to. Inside neovim I'd have a bunch of terminals & buffers & what not, so it felt redundant having tmux also hosting an even higher level environment.
Dtach is super super lightweight. Tmux is keeping copies of each screen in memory and doing some reprocessing. Dtach is basically a pipe wrapping a program's input and output, data in data out.
I even wrote a little shell script to let me very quickly 'dta my-proje', which will autocomplete the project name and either open the existing dtach session in that project or create a dtach session with vim in it. https://github.com/jauntywunderkind/dtachment/blob/master/dt...
It would be interesting to see something like dtach used for automation or scripting, as it seems targeted for. The idea of being able to relay around the input and output feels like it should have some neat uses. There isn't really a protocol, afaik, and there's definitely no retained state for a getView like ht. But it should in many ways function similarly?
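To sketch what that might look like (the -p "push stdin into the session" flag is from dtach's man page; double-check it exists on your version, and the socket path is arbitrary):

    dtach -n /tmp/vim.sock vim notes.txt            # start vim detached, no attach
    printf 'ihello\033' | dtach -p /tmp/vim.sock    # push keystrokes into the session
    dtach -a /tmp/vim.sock                          # attach later to inspect the result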
It's pretty interesting. I actually stumbled over this yesterday, because I needed to figure out a way to automate a very specialised and ancient TUI application (think car rental software that is still used worldwide). Meanwhile I've written a crude parser for ANSI escape sequences and managed a virtualised screen representation, but I'm also having a fair bit of trouble sending commands.
ht seems to be almost "there" for me, and would allow me to easily build a sequence of actions. However, it's kinda missing the color representation, which is also a problem for me: I need to read the color of a specific row/column to know that a specific menu item is selected correctly, and then proceed from this.
I tried to contrast to `expect` in a couple of my other responses, but yeah this is my sense too after looking briefly at `expect` - that ht always transparently sets up a terminal for you under the hood and you interact with that so you can always grab a screenshot of any terminal UI.
I don’t think `expect` is targeted at this use case (though I am only learning about `expect` right now so could be wrong)
Indeed, maybe musl will finally be the long-term compatible Linux native binary format to one day replace win64+wine.
glibc compatibility breakage is the bane of my existence.
The situation is made worse by GitHub pushing people to build against the default extremely recent glibc in "ubuntu-latest" for binary artifact builds rather than what should be done: building against the oldest glibc possible (you can still do that within a "ubuntu-latest" container).
glibc breaks "version compatibility" at times for (IMO) the most ridiculous reasons, but it also has support for running binaries linked against older glibc versions on newer glibc, which just gets completely ignored.
It depends on one's definition of "feasible", I guess.
Building against the "most recent glibc pushed by GitHub" doesn't make it feasible but building against an older glibc (which you can still do on a container which is otherwise using a recent glibc) is at least somewhat feasible.
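If you want to see what a given build actually requires, something like this works (binary name is just an example):

    # highest glibc symbol version the binary depends on
    objdump -T ./myapp | grep -o 'GLIBC_[0-9.]*' | sort -Vu | tail -1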
The primary benefit from building against musl is that it "coincidentally" means that even the people who don't care at all about supporting older systems end up doing so "accidentally" due to their desire to use musl (for, presumably, some other reason).
You're welcome :) The asset names got slightly weird in this release (and GH doesn't let me change them, because there's a dot in the name and it prevents removal of the file extension), but with a future release the GH action should name them better. But you would probably rename the downloaded file anyway.
Oh this is cool. I looked at using tmux before we built ht because I’ve used screen and tmux forever. I didn’t find libtmux though. Will def check it out.
I’m guessing it’s mostly useful for writing a terminal emulator in the browser. In theory I think all input and output on the terminal is done via text, which may contain control codes, so you’d still have to write code to render the text, support mouse input and selection, etc.
My first thought was using it as a compatibility layer for VT100-style CLI programs. Hypothetically, if we wanted to finally replace VT100 emulation and move to some new legacy-free protocol for terminals, we would need some sort of shim for running legacy VT100 programs and displaying them in our shiny new VT-next terminal. Similar to the `vt` program from plan 9.
It looks like the use case is if you want to script the usage of a tui program. In most cases it would be best to script the operation yourself with sed rather than ht/nano. I could see this being useful for scripting internal tui tools without access to the source code.
I shared the motivating use case for why Marcin and I built this (LLM agents using terminals) in a different comment, but I'll also expand the readme to give examples of use cases.
This is an awesome project. It aligns with what I was hoping to get out of expect for certain interactive commands but had a hard time doing. Like selenium or cypress for terminals. I could see this being valuable for end to end testing developer tooling workflows. Maybe it was me not knowing enough about expect. Looking forward to seeing what comes next!
To expand on what andyk wrote in the sibling comment:
Programs running in a terminal don't get individual components of composite key presses such as ctrl+a or shift+b, so they don't see "a with ctrl modifier" or "b with shift modifier". The modifier keys are handled by the terminal emulator before sending the key's ascii value to the program, modifying the regular ascii letters appropriately. So when "a" (ascii value 0x61) is pressed while holding shift, its ascii value is ... shifted (down) by a constant 0x20, making it ascii 0x41, which represents "A". Similar with the ctrl key, which shifts down the ascii value by 0x60, turning "a" into 0x01. So to send "ctrl+d" you send input with a single byte of value 0x04 ("d" ascii 0x64 minus 0x60). ht uses a PTY under the hood, and this is how you send keyboard input into a program via a PTY. This is kinda low level though, and there's definitely a possibility of implementing a high level input method in ht, which would parse a string such as "<ctrl+d>" and automatically turn it into 0x04 before sending it to the process.
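To make the arithmetic concrete, a throwaway shell helper (purely illustrative, not part of ht) could look like:

    # compute the control byte for a letter, per the 0x60 offset described above
    ctrl_byte() {
      printf "\\$(printf '%03o' $(( $(printf '%d' "'$1") - 0x60 )))"
    }
    ctrl_byte d | xxd    # 00000000: 04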
In other words, the way input in ht works right now was the easiest, simplest way of implementing this to get it out the door.