Linux Threat Hunting: ‘Syslogk’ a kernel rootkit found in the wild (avast.io)
140 points by rmdoss on June 20, 2022 | 54 comments



Seems to only relate to RHEL 6, or derivatives thereof, such as CentOS 6. Yes: 6. Which is as EOL as enterprise software gets: https://access.redhat.com/support/policy/updates/errata#Life...


Comparatively, RHEL 6 is still kind of fine; at least it is still officially supported as a virtualized OS in oVirt...

We run a lot of CentOS 5 virtual machines (and some physical ones! And some RHEL 4! And a few Fedora Core 8 and 4!!!), with no end in sight... :(

It is a huge concern for the Infra team and a source of many headaches, and we need to jump through hoops to keep them running, but:

- Clients don't want to move from OLDPRODUCT, which requires an extremely old PHP.

- The Dev team is not interested in migrating OLDPRODUCT to a modern platform, or even trying to put it in a container. Their eyes are turned to the shiny NEWPRODUCT that is seemingly never fully coming to production (only one client has signed for it).

- New clients are still regularly signed on OLDPRODUCT.

- No one in the org wants to pay for a migration anyway.

- Since some clients have complained about apparently poor security, whatever was visible was just hidden behind a newer HAProxy.


> The Dev team is not interested in migrating OLDPRODUCT to a modern platform, or even trying to put it in a container.

Surely this wouldn't take more than 2 weeks: just figure out the install instructions for the old piece of software, rewrite them as part of a Dockerfile (or a similar set of instructions to build an OCI image; there are other options out there too), set up some basic CI/CD to run the Docker/Podman/... build command, and update any relevant documentation.
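
For illustration, such a Dockerfile can stay quite short. A minimal sketch, assuming a hypothetical legacy PHP app served by Apache on CentOS 6 (the image tag, paths, and packages here are placeholders, not anyone's actual setup):

    FROM centos:6

    # CentOS 6 mirrors are gone, so point yum at the archived "vault" repos
    # first (the exact sed/paths tend to need fiddling on releases this old)
    RUN sed -i -e 's/^mirrorlist/#mirrorlist/' \
            -e 's|^#baseurl=http://mirror.centos.org|baseurl=https://vault.centos.org|' \
            /etc/yum.repos.d/CentOS-Base.repo \
        && yum -y install httpd php \
        && yum clean all

    # OLDPRODUCT stands in for the legacy app in question
    COPY oldproduct/ /var/www/html/
    EXPOSE 80
    CMD ["httpd", "-DFOREGROUND"]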

I actually did that recently with some legacy software; apart from additional optimizations like clearing the yum/dnf cache after installs, it was pretty quick and easy to do! If you are in a situation where you need to use EOL software, I don't think there are many other decent options out there, short of trying to reproduce the old runtime on a new OS (as others suggested).

Running the old EOL OS will simply leave you with ever more vulnerabilities and no fixes in the long term, and is generally a really bad idea. How did your security team greenlight that? In some orgs out there, the devs would be made to stop whatever they're doing (outside of critical prod fixes) and forced to migrate to something newer before proceeding.


> Surely this wouldn't take more than 2 weeks

Which, based on what the previous commenter said, is just about 2 more weeks than anyone wants to spend on it.

If no one wants to do it, it doesn't matter if it takes 2 days or 2 months, it won't get done.


And if you take it upon yourself to try, suddenly you become the sole point of contact for anything that ever goes wrong with it from then on, even if it would have failed before the change.


OT, but: why don't you just compile an old PHP version from source on a new OS? It's a bit of a hassle the first time you do it, sure, but less than the hassle of running multiple legacy OSes?
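
For the curious, the rough shape of that - a sketch assuming PHP 5.3, with an illustrative mirror URL and install prefix; old releases also typically need patching to build against current OpenSSL/libxml, which is most of the hassle:

    wget https://museum.php.net/php5/php-5.3.29.tar.gz
    tar xzf php-5.3.29.tar.gz && cd php-5.3.29
    ./configure --prefix=/opt/php53 --enable-fpm   # keep it out of the system paths
    make -j"$(nproc)" && sudo make install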


The legacy OS is even less hassle, because it never gets any updates and just sits there.

If you compile old software on a new OS, every single update to that OS has a chance to blow up your compiled old version, so it takes more hand-holding.

Can be worth it at times, but other times it's just easier to firewall and hope for the best.


This sounds so familiar: running an old OS that gets no updates or patches because of an app that relies on an old PHP version, plus the fear of migration costs and an open-ended project that never finishes. This is so widespread it has become a meme.


Are you sure you want to disclose your employer's security risks and tech debts on a public forum?


I think that as long as the person remains mostly anonymous (personally I've no interest in digging further, for this exact reason), things like this need to be talked about.

Sure, lying about some of the details while getting the gist across might be a good idea (e.g. naming a different OS when you actually have Ubuntu), but there definitely needs to be discourse about the circumstances that people are dealing with in reality, instead of everyone pretending that they are on the forefront of the industry with their security and other practices.

The more you think about it, the murkier everything gets - talking about how things were 5 years ago at a company that's now defunct might be the best possible circumstance, but at a currently active org it might also be a way of getting yourself sacked, depending on how juicy of a target it is and how much attention it attracts.

That said, the company has made the choice to use EOL software, their clients have made the choice to use EOL software, and it's bad practices all around. Honestly, should any of the software be exposed publicly, I'm pretty sure that this person's comments won't be the first to call attention to the setup, since nation X's hackers/crackers might have already run automated attacks against it anyway.

In summary, bad practices probably deserve to get called out, just so we know what the situation in the industry is like in reality, but hopefully not at too great of a personal expense.


https://lwn.net/Articles/863008/

About 1 million CentOS 6 boxes of some description were hitting the update servers as of July 2021, compared with roughly 2 million on v7 and half a million on v8.

Would be interesting to know how those percentages have changed since then.


This is one of the reasons why, if we ever touch a box (say, to update whatever it is doing), we also bring it to the latest version of the OS we can find; luckily for us, CentOS exploded just before the last one, so we moved to Ubuntu instead of CentOS 8, which was our original plan.


They could well be talking about any company that has multiple legacy products from two decades ago still running, which is quite a few of them...


Yeah, let's not underestimate how common this is. Narrowing it down would be nearly impossible. And that's even assuming the parent was fully truthful, which he might not have been, to preserve anonymity.


I think they know exactly what they're doing. Nothing bumps an upgrade up the backlog like an attack on an old cash cow.


They didn't say who they were working for, so why not?


>NEWPRODUCT that is seemingly never fully coming to production

>- New clients are still regularly signed on OLDPRODUCT.

I mean, what's the WHY behind that? Why don't even new customers sign on to the new product? Why is the new product not in production? Is it for the same reason?


NEWPRODUCT is almost always designed by Sales and OLDPRODUCT is old enough to have enough actual users that it works and does what is needed.

The end result is usually to force everyone to move to NEWPRODUCT and deprecate OLDPRODUCT violently (Salesforce Lightning vs Classic, etc, etc). Hopefully enough fixes for NEWPRODUCT get done before all the customers leave.


I am all too familiar with this dilemma. A work-around that comes with some caveats is to disable loading of kernel modules. [1] There are more caveats than the article mentions, including that rebuilding the initrd/initramfs can break unless you reboot first. Do not set the settings they mention in a persistent file like sysctl.conf (see the sketch below).

[1] - https://linux-audit.com/increase-kernel-integrity-with-disab...
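
For illustration, the runtime-only variant is a one-liner (a sketch; the point being to avoid persisting it in sysctl.conf):

    # disables module loading until the next reboot; it cannot be re-enabled
    # at runtime, which is the whole point
    echo 1 > /proc/sys/kernel/modules_disabled
    # equivalent, and equally non-persistent as long as it stays out of
    # /etc/sysctl.conf:
    sysctl -w kernel.modules_disabled=1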


> - Clients don't want to move from OLDPRODUCT, which requires an extremely old PHP

I know that problem from a thankfully long-gone internal Java application, and well... I went with running the old stuff in `debian/eol` Docker containers [1]. Turns out you actually can use Docker as a sort of extremely lightweight VM service.

[1] https://hub.docker.com/r/debian/eol/
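
For reference, running one of those images is a one-liner; a sketch (the `lenny` tag is just an example - the repo hosts images for most EOL releases):

    # drops you into a Debian 5 "lenny" userland; apt inside is typically
    # pointed at archive.debian.org
    docker run -it --rm debian/eol:lenny bash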


> at least it is still officially supported as a virtualized OS in oVirt...

If they're not providing security fixes for RHEL 6's packages, then why does this matter?


RHEL 6 is in Extended Lifecycle Support until June 2024 (that is: customers with suitable subscriptions still get critical patches). It’s a zombie, but it’s not quite dead yet. I’d bet that there are still enough (+) people out there running it.

(+) or rather too many.


There are paying customers. It might not be shiny/fun, but there is a reason Red Hat became the first one-billion dollar open-source company in 2012.


Wouldn't IBM be the title holders there?


"open-source company" is kind of an subjective term at this point I guess, so hard to say.

But what can be said is that even though IBM contributes a lot to open source, not many would claim IBM is an "open-source company", I think, at least compared to Red Hat.


Yeah, I get that distinction, and the vagueness of the phrase 'open source' doesn't help with this kind of definition.

If IBM put US$1B into 'Linux' 22 years ago - but this was a small part of their operating budget at the time - do we look at absolute or comparative value?

If IBM bought Red Hat three years ago for US$34B, does that mean IBM is now the biggest 'open source' company? (If so, the next question is obvious.)


For CentOS 6 there is also CloudLinux, which provides paid extended support until 2024 for those who don't have a RHEL subscription.


Despite that, RHEL 6 appears to be entrenched. Rust has been proposing to update its baseline Linux and glibc versions to circa-2012 vintage, which would exclude RHEL 6, and has been receiving pushback from people whose customers still use RHEL 6.


OpenBSD removed loadable kernel modules back in 2014; macOS is aggressively moving in the same direction. Meanwhile - is running a Linux system without module support even viable these days?

    $ du -sh /lib/modules/$(uname -r)
    294M    /lib/modules/5.10.0-15-amd64


It's not that hard to run Linux without modules; I've been doing it on my laptop for a decade.

Just build the kernel and set the right options, this is for a Dell XPS13: https://github.com/jcalvinowens/misc/blob/main/kbuild/config...

It takes a few hours to whittle it down for a particular piece of hardware, but I've never broken anything on Debian by running kernels built with CONFIG_MODULES=n.

* Edited for clarity


What is that and how is it used?


Sorry, that was really unclear - I edited. It's the kernel build configuration for my laptop, with module loading disabled. All the drivers are statically linked.
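
The relevant parts of such a config look roughly like this (an illustrative fragment, not the linked file; the driver selections are hypothetical examples that depend entirely on the hardware):

    # CONFIG_MODULES is not set
    CONFIG_EXT4_FS=y         # root filesystem built in, so no initramfs is needed
    CONFIG_BLK_DEV_NVME=y    # example: the laptop's storage driver
    CONFIG_IWLWIFI=y         # example: wifi built in (firmware is a separate problem)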


You can disable module loading at any time by writing to a /proc file:

    echo 1 > /proc/sys/kernel/modules_disabled

(you must reboot to re-enable module loading)

This is useful on servers, where specifying all modules to load is practical (netfilter modules are usually the only new ones unless the hardware changes). But on a workstation, doing so will be very frustrating unless you never plug in any new USB devices etc.


> But on a workstation, doing so will be very frustrating unless you never plug in any new USB devices etc.

If you know which devices you're likely to plug in, you could just modprobe them all before disabling it, as in the sketch below.
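
Something like this (the module names are hypothetical examples):

    # preload everything you expect to ever need, then pull up the drawbridge
    modprobe -a uvcvideo usb_storage cdc_acm
    echo 1 > /proc/sys/kernel/modules_disabled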


My impression is that Darwin did it by moving more drivers into user space. But yes, you can absolutely run Linux with everything statically compiled into the kernel, as long as you're not using one of the handful of things that resist it (a comment below mentions NVIDIA and ZFS). You can even run without an initial ramdisk if you're not doing RAID or ZFS or encrypted disks or something like that.

Edit: I should mention that this will either result in a massive kernel that consumes a lot of memory, or in very little driver support, meaning your machine won't tend to just work when you plug new devices in. Linux has a lot of drivers; there's a reason it uses modules.


I wonder if you can force ZFS to be compiled in, since the license problem is one of distribution, not of use/runtime.

Ubuntu might not be able to distribute said "no module" kernel, but it might run.


I believe ZFS at least used to have an option to insert itself directly into a Linux source tree, in which case it would look just like a normal driver. I don't know if that still exists and I never tried it, but it was a thing. Note that you probably still need an initial ramdisk to get the userspace tools to actually bring a pool online if you're using it for root.

Edit: I'm having trouble finding it in the official documentation, but here's a page that describes how to do it on an old version: https://slackwiki.com/ZFS_root_(builtin) and here's what looks like a script to do that on the current tip of master: https://github.com/openzfs/zfs/blob/master/copy-builtin
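
If it still works the way the linked pages describe, the flow is roughly as follows (unverified, reconstructed from those docs; the kernel source path is an example):

    # run in the ZFS source tree, pointing at an unpacked kernel source tree
    ./configure --enable-linux-builtin --with-linux=/usr/src/linux
    ./copy-builtin /usr/src/linux
    # then set CONFIG_ZFS=y in the kernel config and rebuild as usual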


How is the vector of persistence of any significance here? At some point the attackers got root access on your system, game over.


I believe the concern is that the attackers gain root access on system A but hide their presence/activity - even in the presence of logs shipped to a remote, more trusted server B.

https://github.com/c-blake/kslog has maybe a little more color on this topic, though I'm sure there are whole volumes written about it elsewhere. :)

EDIT: But maybe your "game over" point is just that it is kind of a pipe dream to hope to block all concealment tactics? That may be fair, but I think a lot of security folks cling to that dream. :)


> I believe the concern is that the attackers gain root access on system A but hide their presence/activity - even in the presence of logs shipped to a remote, more trusted server B.

That's generally called pivoting, and it has nothing to do with the method of persistence of the malicious code.

OP makes the point that certain systems have moved or are moving away from giving root the ability to extend/modify kernel code at runtime via kernel modules; my argument is that none of that matters since the root user can still extend/modify kernel code at runtime via binary patching.


> my argument is that none of that matters since the root user can still extend/modify kernel code at runtime via binary patching.

OpenBSD restricts that ability as well[1]. Neither /dev/mem nor /dev/kmem can be opened (read or write) during normal multi-user operation; you have to enter single-user mode (which requires serial console or physical access to achieve anything useful). Raw disk devices of mounted partitions can't be altered, immutable/append-only files can't be altered, etc.

You can also choose to completely prohibit access to raw disk devices, although that gets annoying when you e.g. need to format an external drive. There is of course still a lot of potential to do harm as root, but it's not as easy to create a persistent threat or resist in-system analysis by an administrator.

[1]: https://man.openbsd.org/securelevel
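
For the curious, the level itself is a sysctl that root can raise at runtime but never lower; e.g.:

    $ sysctl kern.securelevel
    kern.securelevel=1
    $ doas sysctl kern.securelevel=2
    kern.securelevel: 1 -> 2
    # lowering it back is refused outside of init / single-user mode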


From your link:

> securelevel may no longer be lowered except by init

> The list of securelevel's effects may not be comprehensive.

So yes, it's a nice sandbox that can help prevent accidents, but doesn't sound like something you should rely on for actual defense.


You sound like you're dismissing it, but even if it weren't all that useful on its own, it's one layer of a carefully thought-out defense-in-depth strategy. Pledge/unveil is another; so are privsep+imsg, W^X, (K)ASLR, syscall origin verification, boot-time libc/kernel relinking, and a couple dozen other features I can't even recall right now.

Most importantly, all of these features and mitigations are enabled by default, are pretty much invisible to the end user or administrator, and are actually easy for a developer to use. Contrast this with e.g. seccomp or SELinux - Google's autocomplete even offers "selinux permissive" and "selinux disable" among its top 3 suggestions...


Ah. I misunderstood your "persistence" to mean "persistence of logs", not "of code/illicit powers". Sorry - I read too quickly.

I do think the defense mentality, as evidenced by many comments in this thread, remains a bit too much about "how challenging can we make things" rather than "what is possible in theory". Besides binary patching a static kernel, as you say: for example, you could keep remote hashes of all relevant files a la tripwire, plus remote access and programs to check said hashes. If the attacker can detect and adapt to the hash-checking pattern, then they can "provide the old file" for those checks to hide their presence - but to do so they also have to write the code to detect and conditionalize. The rationale of this defense mentality seems to hope for a "distribution of attacker laziness" that may at least "help", but sure - it is just a higher, still finite bar, and once the work has been done... game over. But I do not mean to belabor the obvious. Anyway, thanks for clarifying your argument.
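
(A toy version of the tripwire-style check, for concreteness; the hostname, baseline path, and file list are hypothetical:)

    # from the trusted host B: hash security-critical files on A and diff
    # against a baseline recorded while A was known-good
    ssh root@hostA 'sha256sum /boot/vmlinuz-* /sbin/init /usr/sbin/sshd' \
        | diff /var/lib/baselines/hostA.sha256 -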


Aye, I think what you're describing is "security by obscurity" - i.e. the capability is still there, and I'm just counting on the attacker not knowing that, because I've hidden it so well. It can work really well in combination with actual security practices, but it absolutely shouldn't be considered one by itself.


Certainly viable, but your non-modular system might not support all the features you want. The Linux system I am using to post this comment has been running without CONFIG_MODULES for years, but I am not using ZFS nor anything from NVIDIA.


     ~ # zgrep -F CONFIG_MODULES /proc/config.gz 
    CONFIG_MODULES_USE_ELF_RELA=y
    # CONFIG_MODULES is not set
    CONFIG_MODULES_TREE_LOOKUP=y


I'm not sure I understand - how does OpenBSD load drivers, then?


IIRC the *BSDs are much more likely to recompile the kernel and reboot if needed for new hardware, whereas most Linux distributions have gone the "build every possible driver in the world as a module, load as needed" route.


They're all statically linked into the kernel.

    $ uname -sr
    OpenBSD 7.1
    $ du -sh /bsd
    22.0M   /bsd


> To load the rootkit into kernel space, it is necessary to approximately match the kernel version used for compiling; it does not have to be strictly the same.

>> vermagic=2.6.32-696.23.1.el6.x86_64 SMP mod_unload modversions

Do you know why they say "approximately match"? I thought it had to match exactly for the kernel to accept loading the module.


A kernel module doesn't have to match the kernel version; it has to be able to resolve all the symbols (function calls, variables, etc.) it uses into valid symbols supplied by the kernel you are loading it on.

The greater the difference between the kernel version you compiled for and the kernel version you are trying to load on, the greater the chance that something you rely on has changed, so the module loader can't resolve all the symbols and the load fails.

So saying a kmod has to match the kernel version is good practice, but the reality is not quite as strict.

Red Hat maintains a list of "whitelisted" symbols that they try to keep stable across a major RHEL version, so if your kmod relies only on those and nothing else, it should load on any kernel version within that release. But that's a Red Hat thing, not a Linux kernel thing.
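
Both mechanisms are visible with modinfo; e.g. (the module path and version strings here are illustrative):

    $ modinfo -F vermagic ./some_module.ko
    2.6.32-696.23.1.el6.x86_64 SMP mod_unload modversions
    $ uname -r
    2.6.32-754.35.1.el6.x86_64
    # a later el6 kernel can still load the module, provided the modversions
    # CRCs of every symbol it imports still match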


Perhaps also worth noting that rootkits don't have to follow the usual rules; you don't have to rely on the kernel linker if you don't want to.

(The tradeoff of runtime DIY symbol resolution / code grovelling being that it's more work, and more likely to be crashy.)

As a rootkit author you have considerably more flexibility than most module authors who are constrained by "sanity", maintainability, accepted practice and licensing terms.


I don't know the exact rules, but note that this is targeting RHEL 6, and Red Hat makes a deliberate effort to preserve kernel ABI compatibility, so it is probably a lot easier there than on most Linux distributions.





