It's doing an extra ECC operation on every connection, but that shouldn't take 500ms on a modern CPU.
The people who have reverse engineered the code say it also requires the command message to be bound to the ssh host key, so if the host key is an RSA key then it might be doing an extra RSA decryption operation on every connection as well?
Another interesting thing regarding the ECC: they use Ed448 rather than something conventional like ECDSA with P-256 or Ed25519, and Ed448 is way slower (verification is roughly 30x slower):
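Something along these lines can be reproduced with the Python `cryptography` package (which calls into OpenSSL). This is just a sketch of such a benchmark, not the original numbers; absolute figures vary a lot with CPU and OpenSSL build:

```python
# Rough Ed25519 vs Ed448 verification microbenchmark (sketch only).
# Requires: pip install cryptography
import time
from cryptography.hazmat.primitives.asymmetric import ed448, ed25519

def verify_ops_per_sec(priv, msg=b"x" * 64, seconds=2.0):
    """Count how many signature verifications complete in `seconds`."""
    pub, sig = priv.public_key(), priv.sign(msg)
    count, start = 0, time.perf_counter()
    while time.perf_counter() - start < seconds:
        pub.verify(sig, msg)   # raises InvalidSignature on failure
        count += 1
    return count / (time.perf_counter() - start)

print("Ed25519 verify/s:", round(verify_ops_per_sec(ed25519.Ed25519PrivateKey.generate())))
print("Ed448   verify/s:", round(verify_ops_per_sec(ed448.Ed448PrivateKey.generate())))
```

Even at Ed448 speeds, a single verification should be on the order of a few milliseconds, nowhere near 500ms.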
At >400 operations per second (a couple of milliseconds each), that doesn't explain the roughly 2 per second (500ms each) the discoverer apparently observed. Were they running on an old Pi and you on a current high-end CPU, or something like that? Basically, what hardware is this benchmark on?
I mean, it highly depends on the CPU, so I only posted it to show the relative slowdown compared to ECDSA. I ran this on my free-tier Google Cloud server, so it is not some super CPU.
However, yes, even on this not-so-powerful CPU it doesn't take 500ms, so I don't think it explains it.
Thanks! That indeed sounds like it rules this out as the reason why it was found.
Curious that asymmetric crypto isn't even the slow part here. Feels like they just messed up somewhere (but I don't have the low-level asm/C skills to check that without considerable time investment).
I believe it's all the ELF parsing and disassembly in memory that happens on startup.
They really went crazy with the home-made x86 disassembler and everything they do with it. Must have been super fun to write though! Someone clearly had a great time coming up with clever ideas for a backdoor.
IIRC the original detection was because sshd was using a lot more CPU than it usually would while coping with the torrent of spam connections any sshd on the internet gets.
sshd forks a clean new process for each connection, so this whole machinery happens each time you connect: it's a full fork/exec, and the backdoor has to set itself up from scratch again, including all the parsing and hooking.
I have looked in the sshd code at https://github.com/openssh/openssh-portable and I cannot find it forking and re-execing _itself_. It forks and execs other commands, of course, and it forks to handle new connections but does not re-exec in those paths that I can see.
If some inetd-like program were listening on port 22 and fork+exec'ing sshd to handle each incoming connection, that would explain it. But on my systemd-based Linux system I see a long-running sshd that appears to be taking care of port 22.
I do agree that it seems like the best explanation of the delay is that somehow sshd was being exec'ed per connection, but I haven't seen all the dots connected yet.
You should also be able to see the error message "sshd re-exec requires execution with an absolute path" in the source. If you follow the `rexec_flag` that is tested just above that message, you can see where it calls execv later in the code.
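For illustration, here's a toy sketch (in Python, deliberately not the actual sshd.c code) of that listen/fork/exec-a-fresh-process pattern; the point is just that everything the exec'ed process does at startup is paid for once per incoming connection. The `./connection-handler` path is a placeholder, not sshd's real re-exec arguments:

```python
# Toy inetd-style server: fork and exec a fresh handler process per
# connection, so any expensive startup work in the handler repeats
# for every single connection. Conceptual sketch only, NOT sshd.c.
import os
import signal
import socket

HANDLER = "./connection-handler"   # hypothetical handler binary

signal.signal(signal.SIGCHLD, signal.SIG_IGN)  # auto-reap children

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listener.bind(("0.0.0.0", 2222))
listener.listen(16)

while True:
    conn, _addr = listener.accept()
    if os.fork() == 0:                 # child: becomes the handler
        os.dup2(conn.fileno(), 0)      # connection on stdin
        os.dup2(conn.fileno(), 1)      # ...and stdout
        os.execv(HANDLER, [HANDLER])   # fresh process image; all of its
                                       # startup cost repeats per connection
    conn.close()                       # parent: keep accepting
```

If sshd really is re-exec'ed per connection along these lines, half a second of per-process startup overhead in a hooked library would show up as exactly the kind of per-connection delay discussed above.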
> The people who have reverse engineered the code say it also requires the command message to be bound to the ssh host key, so if the host key is an RSA key then it might be doing an extra RSA decryption operation on every connection as well?
That would probably do it.