It's doing an extra ECC operation on every connection, but that shouldn't take 500ms on a modern CPU.
The people who have reverse engineered the code say it also requires the command message to be bound to the ssh host key, so if the host key is an RSA key then it might be doing an extra RSA decryption operation on every connection as well?
Another interesting thing regarding the ECC: they use Ed448 rather than something conventional like ECDSA with P-256 or Ed25519, and Ed448 is way slower (verification is roughly 30x slower):
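Something along these lines can be reproduced with the Python `cryptography` package (which calls into OpenSSL). This is just a sketch of such a benchmark, not the original numbers; absolute figures vary a lot with CPU and OpenSSL build:

```python
# Rough Ed25519 vs Ed448 verification microbenchmark (sketch only).
# Requires: pip install cryptography
import time
from cryptography.hazmat.primitives.asymmetric import ed448, ed25519

def verify_ops_per_sec(priv, msg=b"x" * 64, seconds=2.0):
    """Count how many signature verifications complete in `seconds`."""
    pub, sig = priv.public_key(), priv.sign(msg)
    count, start = 0, time.perf_counter()
    while time.perf_counter() - start < seconds:
        pub.verify(sig, msg)   # raises InvalidSignature on failure
        count += 1
    return count / (time.perf_counter() - start)

print("Ed25519 verify/s:", round(verify_ops_per_sec(ed25519.Ed25519PrivateKey.generate())))
print("Ed448   verify/s:", round(verify_ops_per_sec(ed448.Ed448PrivateKey.generate())))
```

Even at Ed448 speeds, a single verification should be on the order of a few milliseconds, nowhere near 500ms.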
At >400 operations per second (a couple of milliseconds each), that doesn't explain the roughly 2 per second (500ms each) the discoverer apparently observed. Were they running on an old Pi and you on a current high-end CPU, or something like that? Basically, what hardware is this benchmark on?
I mean, it highly depends on the CPU, so I only posted it to show the relative slowdown compared to ECDSA. I ran this on my free-tier Google Cloud server, so it is not some super CPU.
However, yes, even on this not-so-powerful CPU it doesn't take 500ms, so I don't think it explains it.
Thanks! That indeed sounds like it rules this out as the reason why it was found.
Curious that asymmetric crypto isn't even the slow part here. Feels like they just messed up somewhere (but I don't have the low-level asm/C skills to check that without considerable time investment).
I believe it's all the ELF parsing and disassembly in memory that happens on startup.
They really went crazy with the home-made x86 disassembler and everything they do with it. Must have been super fun to write though! Someone clearly had a great time coming up with clever ideas for a backdoor.
IIRC the original detection was because sshd was using a lot more CPU than it usually would while coping with the torrent of spam connections any sshd on the internet gets.
sshd forks a clean new process for each connection, so this whole machinery happens each time you connect: it's a full fork/exec, and the backdoor has to set itself up from scratch again, including all the parsing and hooking.
I have looked in the sshd code at https://github.com/openssh/openssh-portable and I cannot find it forking and re-execing _itself_. It forks and execs other commands, of course, and it forks to handle new connections but does not re-exec in those paths that I can see.
If some inetd-like program were listening on port 22 and fork+exec'ing sshd to handle each incoming connection, that would explain it. But on my systemd-based Linux system I see a long-running sshd that appears to be taking care of port 22.
I do agree that it seems like the best explanation of the delay is that somehow sshd was being exec'ed per connection, but I haven't seen all the dots connected yet.
You should also be able to see the error message "sshd re-exec requires execution with an absolute path" in the source. If you follow the `rexec_flag` that is tested just above that message, you can see where it calls execv later in the code.
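For illustration, here's a toy sketch (in Python, deliberately not the actual sshd.c code) of that listen/fork/exec-a-fresh-process pattern; the point is just that everything the exec'ed process does at startup is paid for once per incoming connection. The `./connection-handler` path is a placeholder, not sshd's real re-exec arguments:

```python
# Toy inetd-style server: fork and exec a fresh handler process per
# connection, so any expensive startup work in the handler repeats
# for every single connection. Conceptual sketch only, NOT sshd.c.
import os
import signal
import socket

HANDLER = "./connection-handler"   # hypothetical handler binary

signal.signal(signal.SIGCHLD, signal.SIG_IGN)  # auto-reap children

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listener.bind(("0.0.0.0", 2222))
listener.listen(16)

while True:
    conn, _addr = listener.accept()
    if os.fork() == 0:                 # child: becomes the handler
        os.dup2(conn.fileno(), 0)      # connection on stdin
        os.dup2(conn.fileno(), 1)      # ...and stdout
        os.execv(HANDLER, [HANDLER])   # fresh process image; all of its
                                       # startup cost repeats per connection
    conn.close()                       # parent: keep accepting
```

If sshd really is re-exec'ed per connection along these lines, half a second of per-process startup overhead in a hooked library would show up as exactly the kind of per-connection delay discussed above.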
> The people who have reverse engineered the code say it also requires the command message to be bound to the ssh host key, so if the host key is an RSA key then it might be doing an extra RSA decryption operation on every connection as well?
That would probably do it.