After reading through the original Qualys discussion[1] of this new vuln I was definitely left with a feeling that this class of bugs has not yet exhausted its treasure trove of vulns. This article only confirms it.
Very interesting discussion that will likely continue as more people wrap their heads around this tough problem.
So how does it detect stack overflows past that limit?
Edit: Looks like a fix being proposed for LLVM (https://reviews.llvm.org/D9653) is just to probe every page that a stack allocation covers, rather than just the final one. Doesn't sound tough at all.
My first read, as someone who hadn't read the article, was "someone found an ancient kernel hole and declared it closed, but in truth it's actually not," which actually seems to be what was intended... it's a fairly common use of "(not)."
It seems to me that this is a clash of personalities. Linus has been as acerbic before. See https://lkml.org/lkml/2012/12/23/75, for example.
Open Source Security (the company behind grsecurity) still exists, so it's got enough customers to support the grsecurity developers (both the ones we know, as well as (presumably) PaX Team).
It seems to me that grsecurity isn't interested in upstreaming their patches. That's one reason (among others) why their patches are distributed as one big file instead of, for example, a patch set or a Git branch.
Why are they not interested? Well, it seems to me that the Linux devs may not have placed as high a value on security as on other metrics (such as performance). I understand what it's like to advocate for a position, not find acceptance after a long try, and finally say "screw it" and go your own (commercial) way.
So any 'debate' related to grsecurity is going to include people (including LWN commenters) who feel very strongly against grsecurity's position, and who think they should be more open, more free, and more willing to compromise (if that's the right word) to get the grsecurity/PaX Team patches upstreamed.
And then you have very strong personalities in the kernel dev community. That makes compromise very difficult.
In my opinion, it's the same reason why having a conversation about politics is nigh impossible: your strong views affect what you say, how you say it, what you hear, and how you perceive it.
Honestly, grsec has rightly earned a reputation for being hostile and even irresponsible in their conduct, regardless of how good their patches are (of course, the quality is known to be high, but there have been some blunders; don't run `script /dev/null </dev/zero` on an older grsec kernel ;)
Isn't it funny how the 'for example' of Torvalds' language is always exactly the same email? Over 25 years of continual interaction with developers and the greater public, and it's always this email.
And if you need a screencap, get the one of him flipping the bird to camera, but don't give it any context (I believe that's the rule for that particular image).
Performance is one thing, but I think a bigger one is that a number of the changes pushed by grsec would likely break existing kernel behavior in relation to userspace. And that is one big red line you do not cross with Torvalds.
Can someone give a short explanation of this exploit please? Something a bit deeper than "an attacker can exploit this confusion to overwrite the stack" from the OP, but not the super detailed PDF he references. I'm hoping to get a feel for the root cause of this exploit.
The classic vulnerability is that you manage to grow the heap or the stack so that they collide (they start at opposite ends and grow towards the middle); then all kinds of exploits become possible, with the stack overwriting the heap or the heap overwriting the stack.
While you can prevent the heap from being allocated over the stack, the stack may grow "autonomously" during normal processor instructions where the OS can't intervene.
The classic fix was to put in a "guard page", i.e. a segment of memory marked as forbidden, so that if the stack grows into it, a page fault is triggered and the process can be stopped.
However, while the stack usually grows in small increments, you may allocate some very large data on the stack and not write or read anything in that buffer. So if you do it correctly, the end of the stack will "jump over" that guard page, leaving it in the middle of the stack but not touching it in any way; and the end of the stack will overlap with the heap, enabling all the fun exploits again.
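A minimal sketch of that pattern (hypothetical code, not from any real exploit; the function names are made up, and a real compiler might optimize the unused buffer away, so treat it as illustration only):

void helper(void) { /* any further call creates a new stack frame */ }

void jump_over_guard(void) {
    char buf[128 * 1024];   /* large local: the stack pointer moves far below
                               the guard page in a single step */
    /* buf is never read or written, so the guard page itself is never touched
       and no page fault is raised */
    helper();               /* the call writes a return address and a new frame
                               below buf, i.e. past the guard page, possibly on
                               top of heap or mmap'd memory */
    (void)buf;
}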
> you may allocate some very large data on the stack and not write or read anything in that buffer
We had to deal with something very similar to this when I was working on REALbasic. When targeting Windows, if the compiler happened to generate a function prolog which would allocate more than one page of stack space for local variables, we had to generate extra instructions which would touch each intermediate page, in order, or the kernel would trap the first access beyond the guard page as out-of-bounds. It was an interesting bit of semi-security.
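Roughly, the extra prolog code amounts to something like this hand-written sketch (my own illustration of the idea, assuming 4 KiB pages; a real compiler emits this inline and handles page alignment properly):

#include <stddef.h>

void big_frame(void) {
    char buf[32 * 1024];    /* locals spanning several pages */
    /* touch one byte in every 4 KiB page, starting from the end nearest the
       existing stack and working downward, so each access hits the guard page
       in turn instead of jumping straight past it */
    for (size_t off = sizeof buf; off > 0; off -= 4096)
        buf[off - 1] = 0;
    /* ... normal use of buf follows ... */
}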
OK, if I understand correctly, there is no privilege escalation. E.g., running this exploit in user code doesn't get you root. When they described it as a "kernel hole" I assumed it was much worse. So the word "kernel" is there because the design flaw is in the kernel, not because it allows a complete system takeover (unless it's used against a privileged program).
No, this does mean privilege escalation (pretty much all recent proofs of concept for this problem are local user-to-root escalations), because it allows you to engineer a stack/heap overwrite in many non-buggy binaries that run as root.
The other article says that this is a privilege escalation:
> The exploits and proofs of concept that we developed in the course of our research are all Local Privilege Escalations: an attacker who has any kind of access to an affected system can exploit the Stack Clash vulnerability and obtain full root privileges.
Yes, it's a kernel deficiency which only allows taking over userspace processes and only if you are able to cause them to allocate the right memory and write the right data to it.
> So the word "kernel" is there because the design flaw is in the kernel
The vulnerability is in the userland process. However, the kernel can apply a stop-gap measure that makes exploitation difficult in most scenarios (but that measure is somewhat arbitrary).
Not really. This time it's the CPU and kernel failing to provide expected behavior in some corner cases, which makes otherwise correct userland programs vulnerable to attacks through no fault of their own.
It is impossible for the CPU (without architectural changes, e.g. stack segment) and kernel to provide expected behavior in the general case. They do not know whether the userland process is trying to use the stack or just access some other mapped region. Indeed, as long as these accesses do not land upon unmapped space (or space mapped with permissions that conflict with the access), the kernel doesn't have any idea what's going on.
There exists a fix for the vulnerability at the source, namely a compiler feature that adds extra code to "touch" every 4k page of the stack when a large amount is allocated.
However, for that fix to be useful, every single distribution needs to recompile every single userland package and distribute it to every single user - i.e., instead of downloading a security patch, you need to download a new version of every executable on your hdd.
Can't the changes be gradually introduced? We change distros every year or reinstall the OS every time a new version of the distro is released. That's a newly compiled set of executables that are being installed every year.
If the distro guys just did this for the next version (at least the big ones like Debian, Fedora, etc., from whom most other distros are derived), wouldn't most systems be protected?
It would take a while, but it wouldn't take 10 years at least. Mass adoption is impossible, but incremental changes are.
If critical executables are patched, then we have at least a partially protected system, right? (I am not an expert, but this sounds a bit more plausible than breaking the entire userland.)
This applies generally to any program running as root (or a user with other privileges than yourself) if you can find a way to allocate a large enough segment of memory.
To butcher an LWN quote, "there are potentially millions of buggy programs, but only one kernel". It's a measure to prevent a class of exploit, rather than depending on people writing safe C code (which I believe decades of software engineering have proven we are collectively incapable of doing).
I mean, that's probably a good practice. In my case I sort of had this idea that largish objects should be placed on the heap, but I couldn't really figure out a good reason why, or what the limit should be, so I guess this vuln provides both a reason and a hint that the page size is a good threshold.
I too had to spend a few minutes to understand it.
The problem is with the stack growing into memory previously allocated with mmap, including but not limited to mmap being used to back large malloc allocations. If this happens, writing to objects in the mmap-ed area can be used to overwrite the stack frame of the current function, including its return address, and hence control code execution after returning.
In 2005 it was discovered that when a process fills up all its address space with allocations, Linux will allow an mmap call to allocate memory directly near the top of the stack so that any stack expansion will run into this allocation. This was not fixed.
In 2010, Rafal Wojtczuk developed a working exploit against Xorg using this principle. The problem was finally fixed by enforcing a one-page gap at the top of the stack where mmap allocations were forbidden, and by preventing the stack from growing too close to the nearest mmap allocation. TFA's author couldn't resist bitching about some technical issues with this patch, which were promptly fixed.
Unfortunately, a one-page gap is not enough because it can be "jumped over": a sufficiently vulnerable application can be tricked into mmapping memory directly adjacent to the gap and then creating a large stack allocation (alloca, a variable-length array, or just a fixed-length array larger than 4kB) which covers the whole gap. As long as this allocation isn't used, the kernel has no way of knowing that it exists. If the application is then tricked into calling another function before using this allocation, a new stack frame is created beyond the big allocation and the safety gap, in the mmap area.
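A rough outline of that sequence (illustrative only, not a working proof of concept; the function name, mapping placement, and sizes are made up, and in a real attack the victim process has to be coaxed into doing all of this itself):

#include <alloca.h>
#include <string.h>
#include <sys/mman.h>

void clash_outline(void) {
    /* 1. get an anonymous mapping that, in the attack scenario, ends up
          directly below the stack's guard gap (mmap will not normally place
          it there; the attacker relies on address-space exhaustion or the
          victim's own allocation pattern) */
    char *region = mmap(NULL, 1 << 20, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED)
        return;

    /* 2. make a stack allocation larger than the guard gap without touching
          it; in that scenario the stack pointer now lies inside the mapping */
    char *big = alloca(64 * 1024);
    (void)big;

    /* 3. any further call now writes its frame (return address, saved
          registers, locals) into the mapping, and conversely, writes through
          'region' can overwrite the live stack frame */
    memset(region, 'A', 128);   /* stand-in for the overlapping writes */
}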
Isn't this a fundamental problem with all small address space architectures? The mitigation of having user programs touch every stack page to ensure the guard page is trapped works, but at some performance penalty.
Interesting. I independently dreamt up this class of vulnerability in 2007 or so. I didn't know there were any instances of it in the wild. When I realized it ought to be possible, I just wrote a PoC vulnerable app and an exploit for it, and then sort of forgot about it. Cool to see it's a real thing now.
Unfortunately I don't think I have the code anymore. But the gist of it is this: say you have two threads in a process, and one of them contains a function like this:
void recurse(int m, int n) {
    char arr[n];        /* variable-length array: can be made larger than a guard page */
    if (m == 0) return; /* base case so the recursion depth is controlled by m */
    recurse(m - 1, n);
}
If 'm' and 'n' are user-controlled, 'recurse' is executing in thread A, and thread A's stack is above thread B's stack, then you can cause 'recurse' to recurse down to the point that it makes its way into thread B's stack region. Normally, this is prevented by 'guard pages' in between the two threads' stacks. However, if you have a variable-length array, as in the example, it's possible to allocate enough memory to just skip right over the guard page. Once you've done this, you're now writing directly into the other thread's stack, with all the fun stuff that entails.
[1] https://blog.qualys.com/securitylabs/2017/06/19/the-stack-cl...