OP may also benefit from the use of an SSH honeypot. I use kippo (https://code.google.com/p/kippo/) with great success. It tracks all commands run, as well as keeps copies of all downloaded files.
In addition, it limits the available commands to a predefined subset, allowing the host to prevent the damage (e.g. the DoS attack in this case) that a compromised system would otherwise cause.
I ran kippo for a while and it seemed that all attackers were trying to upload files over SCP, which kippo does not support. A few attackers resorted to logging in and downloading with wget. However, the vast majority of attacks ended with a failed SCP session.
If you like this kind of stuff, http://www.honeyd.org/ is pretty full featured as well and provides a lot more emulated services (http, ftp, network file shares, smtp). It is also built so it is (relatively) easy to add your own emulated services.
Not to say that kippo isn't good; I didn't look too closely, but it seems to be focused mainly on ssh and terminal capture.
Would it also have recorded statistics like the connection activity?
Seeing all the UDP traffic, and being able to trace its origin to the "@udp1 39.115.244.150 800 300" command, received not via shell but via a TCP connection, was pretty cool.
The original article is a bit unclear on this, but the command was not directly received from a TCP connection initiated by the botnet owner.
Rather, his server connected to some IRC server and joined a channel with all of his other bot friends. The owner then sent the command to the IRC channel, and it was broadcast to all the bots by whatever IRC server he was using.
(This is why lots of IaaS providers will forbid you from hosting an IRC server, and sometimes block all IRC traffic (by port anyway) on their networks)
I don't think it measures the inbound/outbound bandwidth - especially over UDP. It's more of an SSH emulator (so to speak) with everything being logged (commands, files, etc.).
In addition, there are quite a few good visualization tools to show the logs made by kippo. You can save them to a db, plot them nicely, etc.
I anonymized the IP addresses in a consistent way before publishing the blog post, so in the very worst case the DoS attack will go towards a completely new host :)
In my case it was 5 hours (http://www.fduran.com/blog/honeypots/), so although that's just another anecdote, if you put up a server with an obvious dictionary ssh password, expect it to be compromised within hours.
At least in the early days of EC2, there were more than a few higher profile AMIs with password logins enabled, e.g. some of Oracle's AMIs used a trivial default password.
As far as I know, you can still create/publish AMIs where password auth is enabled, but all of Amazon's stock images only allow ssh-key auth.
Genius idea. Love it. Shared it with my favourite web host. I hope more security companies think like you do and do this type of reverse-phishing on the bad guys ;)
Most security companies do this; the term for a monitored, weakly-secured server like this is a "honeypot". It's a great way to find out about exploits in the wild, which is really valuable knowledge for every hat.
Having said that, it's amazing what OP could do with just one monitoring tool. Very impressive.
There is a company called Smart Honeypot (http://smarthoneypot.com) offering this service. I believe they are using a combination of these techniques to track attackers.
The point is to deploy so many of these honeypots that it becomes economically difficult for attackers to freely run these scripted attacks. Imagine if, out of 100 SSH attempts, 90% hit a honeypot. That would massively waste the attackers' time and effort.
Cool article! A friend and I once did this but then recorded the commands attackers ran and replayed them on a big tv in our office. We called it hacker fishtank.
I had a similar experience to the OP's years ago. Luckily, the attacker forgot to erase the bash history, so I could recover almost all of the command lines. It seems that most of these operations were pretty standardized: they first downloaded a bunch of exploits from another cracked site, then in my case tried a local privilege-escalation exploit against the kernel (it was a Linux 2.4 box whose privilege escalation bug had been fixed just weeks earlier). I had a patch applied to the kernel, so it didn't succeed. Then they started an IRC bot under a disguised name (like /usr/X11/X or something) and left. I felt embarrassed, but in hindsight it was a pretty good lesson.
Wouldn't it have been better if the attacker had removed only the last few lines recording his commands from the log files instead of the entire files? Wouldn't the lack of continuity in the log files be very noticeable?
Also, is this a script running this sequence of commands or an actual person?
And, is there a log somewhere on the system of 'make' activity?
1) Yes, it would have been better, but I honestly think this attack was completely botnet-driven and the attacker didn't really mean to cover his footprints very carefully: in the span of 10 minutes, he sent over 800 MB of UDP traffic. That would have been caught pretty quickly by even the most oblivious sysadmin, so these guys are just playing a numbers game, trying to break into as many hosts as they can, knowing that the lifespan of each hacked host will be very short, and maximizing the short-term profit.
2) The attacker ran these commands directly in the login shell (no script was copied over scp or anything else), so no script was executed on the host itself; but the whole thing lasted roughly 2 minutes and a lot of commands were "typed", so I am almost sure this was an automated script run from another, probably compromised, host.
3) I didn't check if the build left logs, but by showing every executed process with "evt.type=execve" (which goes deeper than the spy_users chisel) you can see all the processes executed by the build: 99% are just uninteresting sed/gcc/autoconf.
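For anyone who wants to run the same kind of query, a rough sketch (the capture file name here is just a placeholder):

    # Replay a capture and show every exec'd process, not just the interactive ones
    sysdig -r trace.scap evt.type=execve

    # For comparison, the spy_users chisel only shows commands run by logged-in users
    sysdig -r trace.scap -c spy_users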
I'm one of the sysdig creators.
Sysdig is a pretty young project (we released it around a month ago), so I can't promise it will be flawless in a production environment. However, we've had many installations across several different environments, and in the month since the release we've had very few crash reports, which we've worked to fix right away.
(Since HN isn't letting me post a reply under chippy1337's comment, I'll post it here.)
I found chippy's comment interesting and helpful, and I don't know why it was downvoted to hell while other "+1"-style comments (that didn't add any value) were left as-is: https://twitter.com/taoeffect/status/464090445677481985
Yes, I didn't put it in the article because it was getting too long otherwise, but the attacker immediately tried brute-forcing the root account, and after a handful of common passwords ("qwerty", "qwerty123", and "pizza" among them) he found "password".
I was able to find all the attempts by looking at the I/O activity of the sshd process, and the syslog activity also recorded every attempt.
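If you want to poke at the same data, something along these lines should get you close (echo_fds is a stock sysdig chisel; the capture file name is a placeholder, and the exact filter may need tweaking):

    # Dump the data read/written by sshd, which should include the attempted passwords
    sysdig -r trace.scap -c echo_fds proc.name=sshd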
Doesn't your system refuse root login over ssh by default? If I remember correctly, on Ubuntu Server sshd is configured by default to not allow root login from remote addresses.
On some providers, yes; in fact, I explicitly enabled root SSH login for those.
Other providers (such as Digital Ocean) use the root account by default even for Ubuntu, although the password is set to a really secure and random one.
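For reference, enabling it is just a matter of flipping the relevant directive in sshd_config and reloading sshd (the default value varies by distro and OpenSSH version, e.g. "no", "without-password" or "prohibit-password"):

    # /etc/ssh/sshd_config
    PermitRootLogin yes          # allow root to log in over ssh
    PasswordAuthentication yes   # also needed if root should log in with a password

    # then reload the daemon, e.g. on Ubuntu:
    sudo service ssh restart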