Linux Kernel Runtime Guard

jws · on Feb 12, 2020

In a nutshell: This watches critical kernel data and reports or acts on atypical changes to normally static data.

Linux Kernel Runtime Guard (LKRG) is a loadable kernel module that performs runtime integrity checking of the Linux kernel and detection of security vulnerability exploits against the kernel. As controversial as this concept is, LKRG attempts to post-detect and hopefully promptly respond to unauthorized modifications to the running Linux kernel (integrity checking) or to credentials (such as user IDs) of the running processes (exploit detection). For process credentials, LKRG attempts to detect the exploit and take action before the kernel would grant the process access (such as open a file) based on the unauthorized credentials.
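
As a rough illustration of the "watch normally static data and report changes" idea, here is a minimal userspace sketch. It is only an analogy: LKRG does its checks on kernel memory from inside the kernel, and the function names `snapshot` and `verify` are made up for this example.

```shell
#!/bin/sh
# Userspace analogy only; this is NOT how LKRG is implemented.
# snapshot FILE       -> prints a SHA-256 digest of FILE (the "known good" baseline)
# verify FILE DIGEST  -> returns 0 if FILE still matches DIGEST, 1 (with a report) otherwise

snapshot() {
    sha256sum "$1" | cut -d' ' -f1
}

verify() {
    current=$(snapshot "$1")
    if [ "$current" = "$2" ]; then
        return 0
    fi
    # A real guard could log, kill the offending process, or panic here.
    echo "integrity violation: $1 changed" >&2
    return 1
}
```

Taking a baseline with `base=$(snapshot somefile)` and later calling `verify somefile "$base"` reports any change made in between. Like LKRG's integrity checking, this is post-detection: it notices a modification after the fact rather than preventing it.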

brendangregg · on Feb 12, 2020

Neat, but for everyone writing kernel modules: can this be a BPF program?

Looking at what LKRG does, it sounds like it can.

At my employer, asking us to add a kernel module to our BaseAMI (since this needs to run on every instance) is a very hard sell. Asking us to add a BPF program, which comes with security by design, is much much easier. Or put it this way: we add zero kernel modules to our BaseAMI, but last I counted we were at 15 BPF programs (and Facebook has over 40.)

pmccarren · on Feb 12, 2020

Wasn't familiar with BPF programs until I saw your comment, and subsequently went on a search trail. Really like what I've read. BPF and XDP provide so much utility!

I thought Jessie had a good overview[0] of BPF

[0] https://blog.jessfraz.com/post/linux-observability-with-bpf/

pstuart · on Feb 12, 2020

Brendan Gregg literally wrote the book on BPF: http://www.brendangregg.com/bpf-performance-tools-book.html

He is an engineering god.

pjmlp · on Feb 13, 2020

Android also uses BPF for a couple of features.

https://source.android.com/devices/architecture/kernel/bpf

wyldfire · on Feb 12, 2020

The threat model [1] page describes more details about what it's intended to do. I think if there's no false positives for the checks (or false positives are limited to other kernel features which can be disabled), it would be great if LKRG just generated an NMI.

I'm curious about how this is received, though. Has it been discussed on LKML? Are there any distros interested?

[1] https://openwall.info/wiki/p_lkrg/Threat_model

heavenlyblue · on Feb 12, 2020

Why not set up a page write access exception?

Genuine question, what are the limitations?

kevingadd · on Feb 12, 2020

If a user mode process already managed to touch kernel data, that means it bypassed page access restrictions somehow. Runtime integrity checks are a way to detect that.

wyldfire · on Feb 12, 2020

But couldn't they just patch out the calls to LKRG code too? That way it would never run.

zahllos · on Feb 12, 2020

Of course. This is simply a cat and mouse game at this stage. PatchGuard on Windows has the same weakness, and its defense is simply by being implemented through very obscure means. Pretty much every version of Windows has changed the techniques being used, so disabling PatchGuard is very much hitting a moving target.

So yes you could, but you'd have to know it was running in the first place.

The suggestion elsewhere to create an eBPF variant might be interesting to explore.

zxcmx · on Feb 13, 2020

I'm not sure how this plays out as open source.

In closed source you have a bit of leverage (defender advantage) - changes which might be relatively easy for you to implement could take a long time to completely reverse-engineer to the point they can be beaten.

In the open source world, even the design discussions are going to be out in the open.

And as a user, you want this to be obscure enough that people don't routinely publish bypasses, but well-used enough that it's properly maintained and reviewed.

Seems like it might be hard to thread that needle.

zahllos · on Feb 13, 2020

I'm not sure either, if I'm honest.

PatchGuard isn't 100% about malware. In the XP-- days, antivirus/firewall vendors did all kinds of DKOM to install their hooks. This resulted in Windows being unstable in some cases, so with Vista Microsoft provided well-defined hook points or highlighted existing ones like PsSetLoadImageNotifyRoutine (and deprecated the awful TDI stack for network inspection). The message at the time was "Dear AV people: use these defined hooks please" and PatchGuard was basically "and we really mean it - DKOM is dead, stop doing it". It basically means you have a choice when distributing stable software: try to hack around with the kernel and risk bluescreening all your customers, either because PatchGuard changed under you or because you failed to correctly disable it, or comply and use the blessed APIs. Needless to say, one is far less risky.

It also provides a bit of a speed bump for malware. To what extent this will do so for Linux is hard to say. There's plenty of public information on reverse engineering PatchGuard (https://github.com/tandasat/PgResarch, https://github.com/hfiref0x/UPGDSED for starters), and as you say this will likely come with public documentation of its inner workings.

I think this is interesting, but I think efforts like syzkaller and other "kernel hardening" efforts (to find correctness issues and fix bugs as fast as possible) are more valuable.

wyldfire · on Feb 13, 2020

I think it would be interesting to have a processor that allowed you to specify a page mask of immutable pages once you cross a one-way privilege/ring threshold.

Does such a MMU/proc feature exist already? Seems like a feature like LKRG would be pretty effective in a case like that.

And if so the big remaining risk would be to the boot device chain security (which LKRG considers out of scope and for which several processors/SoCs already have covering security features).

swatkat · on Feb 12, 2020

Looks similar to PatchGuard[0] in Windows. PatchGuard simply ended the whole rootkit mess, and rootkit vs anti-rootkit wars on Windows.

[0] https://en.wikipedia.org/wiki/Kernel_Patch_Protection

TrueDuality · on Feb 12, 2020

The problem I have with this as a solution is that the environments where custom kernel modules or kernel modifications in general are being used as a layer of security are already largely customizing these threats out.

Disabling kernel module loading, or restricting it to signed modules, shuts down many of the vectors without using out-of-tree code. There are many security switches that are generally left off in widely distributed kernels that provide deep protection when you don't need to support everyone's project and app.

For specific distributions like those listed this is fine, but those also generally aren't used in higher assurance environments either.

cbsks · on Feb 12, 2020

> There are many security switches that are generally left off in widely distributed kernels that provide deep protection when you don't need to support everyone's project and app.

Is there a place where these options are listed? Preferably with the pros/cons of enabling each option.

inetknght · on Feb 12, 2020

> Is there a place where these options are listed? Preferably with the pros/cons of enabling each option.

Good place to start: building your own kernel from source. I tried that once and was quite overwhelmed with the sheer number of knobs and features that are available.

While I do still want a centralized list of things to do/check for hardening a kernel, I don't think it will _ever_ be exhaustive. And some pros/cons would involve very _deeply_ complex behavior which would be difficult to determine whether or not the tradeoff is even relevant.
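
Not a list, but a minimal sketch of how to check a few such switches against the config a kernel was built with. The helper name `report_options` is made up for this example, and the config path in the usage note is an assumption: many distros ship the build config in /boot, but not all do.

```shell
#!/bin/sh
# report_options CONFIG_FILE OPTION...
# Prints, for each OPTION, whether it is built in (=y) in CONFIG_FILE.
# Options set to =m or to a value are reported as "not enabled" by this sketch.
report_options() {
    config=$1
    shift
    for opt in "$@"; do
        if grep -q "^${opt}=y\$" "$config"; then
            echo "$opt enabled"
        else
            echo "$opt not enabled"
        fi
    done
}
```

For example: `report_options "/boot/config-$(uname -r)" CONFIG_MODULE_SIG_FORCE CONFIG_STRICT_KERNEL_RWX CONFIG_RANDOMIZE_BASE`. Which options matter, and what each one costs, still depends on the workload.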

Hello71 · on Feb 12, 2020

that doesn't sound like an accurate portrayal of the article at all. "Kernel Read-Only Primitive" and "Kernel Write-Only Primitive" are vague, but as far as I can tell, "SWAPGS", "BadIRET", "SysRet", and "Pop SS" are in core kernel code which cannot be configured out with Kconfig. CVE-2017-5123 is in waitid, which I believe cannot be configured out, and CVE-2017-1000112 is in TCP, which can be disabled, but is virtually never done. I don't think anything in this article talks about exploits from kernel modules. In fact, according to the article, LKRG can be trivially disabled from root by simply running "rmmod p_lkrg".
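
For what it's worth, a minimal sketch of checking whether the module is still loaded, by looking for its name in the first field of /proc/modules. The helper name `module_loaded` and the optional file argument are made up for this example. Anyone with root can defeat such a check just as easily as they can run rmmod.

```shell
#!/bin/sh
# module_loaded NAME [FILE]
# Succeeds if NAME is the first field of some line in FILE (default: /proc/modules).
module_loaded() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}
```

Usage: `module_loaded p_lkrg || echo "p_lkrg is not loaded"`.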

LinuxBender · on Feb 12, 2020

One alternate method is to have a startup script that loads your custom kernel module, then uses

    sysctl -w "kernel.modules_disabled=1"

which cannot be unset without a reboot.
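
A minimal sketch of such a startup script, assuming the modules you need can be loaded with modprobe. The function names and the DRY_RUN switch are made up for this example; DRY_RUN=1 prints the commands instead of running them, so the sequence can be checked without root.

```shell
#!/bin/sh
# run CMD...: execute CMD, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# load_then_lock MODULE...
# Loads each module, then disables all further module loading until reboot.
# Stops without locking if any module fails to load.
load_then_lock() {
    for mod in "$@"; do
        run modprobe "$mod" || return 1
    done
    run sysctl -w kernel.modules_disabled=1
}
```

Calling `load_then_lock p_lkrg` as root from a boot script would load the module and then flip the switch. Order matters: anything not loaded before the sysctl is set cannot be loaded afterwards, and with the switch set modules can no longer be unloaded either.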

rvz · on Feb 12, 2020

With some very business-critical limitations [0]. If one were to implement security features like LKRG and they disrupted other components in the system, then however promising the security research looks, I'm afraid we'll have to wait for it to improve before we can use it.

[0] https://www.openwall.com/lists/lkrg-users/2019/11/18/1

NewJazz · on Feb 12, 2020

I don't mean to detract from your point, but if virtualbox is business critical, I would suggest changing your business. Apart from KVM being a far superior hypervisor, oracle licensing provisions are a noticeable liability.

https://www.reddit.com/r/sysadmin/comments/d1ttzp/oracle_is_...

Hello71 · on Feb 12, 2020

also, the vbox drivers are considered to be "crap": https://www.phoronix.com/scan.php?page=news_item&px=OTk5Mw

mwcremer · on Feb 12, 2020

in 2011. (They may still be crap, but nine years is a long time.)

Thaxll · on Feb 12, 2020

The dev at Oracle must feel great to have their work labeled as "crap".

NewJazz · on Feb 12, 2020

As if they didn't already know. Remember -- we are not defined by the shitty hacks we write to pay our rent.

lwh · on Feb 12, 2020

When tickets last that long unfixed, the dev working on it now isn't hurt ;)

asveikau · on Feb 13, 2020

It hasn't been an Oracle product forever.

I remember it as Sun. But looking it up, Sun acquired it from something called Innotek GmbH.

I don't get the impression that Oracle is putting a lot of serious engineering effort on the old Sun products, aside from maintenance. Anyone keeping the lights on is probably unlikely to get offended by the thing from 2 layers of acquisition being called "crap".

throwaway2048 · on Feb 12, 2020

Maybe if nobody called their code crap, it would be better code?

paulmd · on Feb 12, 2020

This link reflects the extension pack, not Virtualbox itself, which is CDDL licensed.

Virtualbox represents exactly as much threat to your business as an OpenIndiana or FreeBSD server. Are you sure that a sysadmin is not using (CDDL licensed) ZFS?

p_l · on Feb 13, 2020

Oracle can't go after CDDL licensed work - the VirtualBox extension pack wasn't CDDL or even an open source project.

paulmd · on Feb 13, 2020

That's the point, yeah. Virtualbox is CDDL. Extension pack is not. Using virtualbox doesn't hurt, they got nailed because of the extension pack.

Watch your licenses. Oracle Java JRE/JDK switched to a proprietary commercial license (it's time to switch to openjre/openjdk, including on the desktop). Otherwise you're either out of date on patches or running Oracle non-CDDL code (prepare to bend over).

openjdk jre link: https://adoptopenjdk.net/installation.html?variant=openjdk11...

Affero GPL/AGPL is commercially toxic as well. Java iTextPdf is probably the most common thing.

reanimus · on Feb 13, 2020

Further in the thread [0] they explain why this is happening (hint: it's due to the virtualbox driver allocating an RWX page and executing it, which violates integrity rules and has been exploited in the past). Either way, the authors have provided build instructions for VirtualBox users.

[0] https://www.openwall.com/lists/lkrg-users/2019/12/02/1

loeg · on Feb 12, 2020

There's the fully secure extreme (turning it off and unplugging it), but the usability tradeoff isn't great. TFA lies somewhere along that spectrum, but it isn't clear to me it's a sane default.

(It is not especially clear to me what exact mitigations TFA describes; it seems to be glossy ad copy rather than technical documentation.)

I'm not sure I'd want "!!!" log lines going to the primary business log when only the suspicion of an attack exists.

Stierlitz · on Feb 12, 2020

LKRG sounds brilliant, why not integrate such features directly into the kernel?

perlgeek · on Feb 12, 2020

It's currently experimental. When it matures, and if there's enough developer power behind it, it might find its way into the mainline kernel.