Defending against Rowhammer in the Linux kernel (lwn.net)
118 points by ldayley on Oct 28, 2016 | 45 comments



A nice relevant discussion from a post Alan Cox started on G+ recently: https://plus.google.com/u/0/+AlanCoxLinux/posts/AFqqpTPpKZ5

Linus covers the TL;DR in this quote: "there is nothing remotely sane you can do in software to actually fix this." and comes down to ECC memory being the golden solution.


ECC isn't really the golden solution. It will likely just turn rowhammer into a DoS instead of something which is exploitable (and even with ECC it can still be exploitable, just harder). While this is an improvement, I wouldn't call it a golden solution.


A DoS from a program already running on the same hardware is nowhere near the same level of vulnerability as a privilege escalation.

Security is all about increasing those difficulty thresholds, not absolutes.


I agree; the only thing I was really taking issue with was calling it a "golden solution", when in fact it is just a (useful) mitigation.

TRR might better fit the description of a "golden solution", although it is still technically a mitigation. I would very much like to see more ECC in consumer/desktop systems.


I agree; I also find the lack of ECC at consumer-level pricing and in consumer hardware a lamentable state of affairs.

Now that these issues are becoming more public, one can only hope that some consumer product moves to ECC and ends up driving ECC memory prices down, in much the same way that mobile phones have driven down the price of many sensors and associated hardware through volume.


I mean, with ECC memory you need to cause 3 bit flips in the same ECC word for it to turn into an exploit. 1 flipped bit per word is recoverable, 2 will cause a detectable fault, and from what I've seen of rowhammer so far, getting past that is going to be pretty difficult.


You're right, my bad; it is more a case of it being the best solution to date.


If you can change the hardware then the best solution is TRR, which should completely eliminate bit flips due to rowhammer. This doesn't do anything for the millions(?) of vulnerable systems in the wild.


What is TRR?


Target Row Refresh - a DRAM command which the memory controller can issue to refresh rows adjacent to heavily used rows and prevent corruption.



If you know how memory addresses map to row/column, it should be possible to force refresh the entire array with a few tens of thousands of reads, rather than delaying for 64ms. (At least on conventional DRAM, reading a row implicitly refreshes it; I presume it's the same on DDR SDRAM.)


A very similar approach, which attempts to refresh rows adjacent to the ones that are being repeatedly accessed, is described here (disclosure: I am one of the authors of the paper):

Paper: https://iss.oy.ne.ro/ANVIL.pdf

Kernel Module Code: https://github.com/zaweke/rowhammer/tree/master/anvil

It is tested on an Intel Sandy Bridge CPU. For a full deployment, we would need to know which bits of the physical address are used to select the DRAM banks and rows for each specific CPU. There has been some effort by various people to reverse engineer these mappings. Two excellent sources regarding these mappings: http://lackingrhoticity.blogspot.com/2015/05/how-physical-ad... https://www.usenix.org/system/files/conference/usenixsecurit...
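
For a sense of what such a mapping looks like once it is known, here is a minimal C sketch. The bit positions below are placeholders rather than a real mapping for any particular CPU; the general shape (bank bits derived by XORing pairs of physical-address bits, row taken from higher bits) is what the reverse-engineering work linked above recovers per platform.

    #include <stdint.h>

    struct dram_addr {
        unsigned bank;
        unsigned row;
    };

    /* Placeholder bit positions -- NOT a real mapping for any CPU; the actual
     * bits must be taken from the reverse-engineered mappings linked above. */
    static struct dram_addr phys_to_dram(uint64_t paddr)
    {
        struct dram_addr d;
        unsigned b0 = ((paddr >> 14) ^ (paddr >> 18)) & 1;
        unsigned b1 = ((paddr >> 15) ^ (paddr >> 19)) & 1;
        unsigned b2 = ((paddr >> 16) ^ (paddr >> 20)) & 1;

        d.bank = b0 | (b1 << 1) | (b2 << 2);
        d.row  = (unsigned)(paddr >> 18);
        return d;
    }

With a function like this, a defence (or an attack) can tell which physical pages share a bank and which rows are adjacent, which is exactly the information the kernel-side mitigations discussed in this thread need.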


This, this is the real golden solution. Thank you for your work and links, so full of information, golden, golden!


Yes, I just posted this in somewhat greater detail. https://news.ycombinator.com/item?id=12821946

There are a lot of details to consider. E.g. reading from DRAM (rather than from the on-chip cache) consumes a lot more power, which could be quite harmful on a laptop.


Wouldn't an effective countermeasure to this also be knowing what types of neighboring cell reads are vulnerable and forcing reads on those instead? Bonus points if the reads pull into the cache data that is actually useful for sequential access.


With modern processors, that becomes a much more difficult proposition. For various reasons, processors now scramble addresses before writing to DRAM. This makes it more difficult to figure out which cells are neighboring, and much more difficult to implement that type of countermeasure.


Are you talking about userspace memory mapping (via page tables) to the physical address space, getting confused about what Address Space Layout Randomization does, or thinking of the very latest (announced) CPUs which offer to encrypt the contents of memory pages (Secure Memory Encryption and Secure Encrypted Virtualization)?

In any event, the kernel can know what's behind the curtain and that is the context in which the suggestion and news item exist.


This is called Data Scrambling on Intel processors. There's not a lot of info I can find from the horse's mouth, but the datasheet [1] has a blurb:

    The memory controller incorporates a DDR3 Data Scrambling 
    feature to minimize the impact of excessive di/dt on 
    the platform DDR3 VRs due to successive 1s and 0s on 
    the data bus. Past experience has demonstrated that
    traffic on the data bus is not random and can have 
    energy concentrated at specific spectral harmonics 
    creating high di/dt that is generally limited by data 
    patterns that excite resonance between the package 
    inductance and on-die capacitances. As a result, the 
    memory controller uses a data scrambling feature to 
    create pseudo-random patterns on the DDR3 data bus to 
    reduce the impact of any excessive di/dt.
So basically, all the kernel knows is that the data is in there "somewhere", and can be accessed traditionally.

[1] http://www.intel.com/content/dam/www/public/us/en/documents/...
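
To make that concrete, here is a toy C sketch of the general idea (not Intel's actual scrambler, whose details aren't public): the controller XORs the data with a pseudo-random stream, typically seeded from the address, on the way in and applies the same XOR on the way out, so the cell contents are decorrelated from what software wrote while reads still return the original values.

    #include <stdint.h>

    /* Placeholder mixing function -- purely illustrative, not Intel's algorithm. */
    static uint32_t stream_from_addr(uint64_t addr)
    {
        uint32_t x = (uint32_t)(addr ^ (addr >> 17)) | 1;
        x ^= x << 13;   /* xorshift-style mixing */
        x ^= x >> 17;
        x ^= x << 5;
        return x;
    }

    /* XORed on write; the identical XOR on read restores the original data. */
    static uint32_t scramble(uint64_t addr, uint32_t data)
    {
        return data ^ stream_from_addr(addr);
    }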


That seems to be exclusively about the data bus, not the address bus. So it affects what bit patterns get stored, but has no effect on where they get stored.


No, the physical addresses used by the kernel and other devices are scrambled before they're used to access DRAM on modern hardware. This is not visible to the kernel.


What is the specific feature called?

It //STILL// sounds like something that happens in the MMU, at the time that the virtual address is mapped back to a physical address via the page tables, which means that the kernel still knows the real backing addresses.


I think what they are saying is that physical addresses don't necessarily map to bytes in DRAM in the sequential order that you might expect.

Here are more details of how it actually happens: http://lackingrhoticity.blogspot.ca/2015/05/how-physical-add...


I see where you're going with this.

dmidecode seems to indicate that the description from the actual RAM is lacking transparency about its internal geometry.

I can get the ranking for slot-level interleaving from dmidecode on my systems (which means a kernel could or already has it as well).

Thinking about the inside-chip geometry issue as well as the current on-list proposal in the news item, I've reached a different potential solution.

If page faults are tracked /by process/, the top N faulting processes could be put into a mode where the following happens on a page fault:

* Some semi-random process (theoretically not discoverable or predictable by user processes) picks a number N between 1 and some limit (say 16).

* The faulting page, and the next N pages, are read in (which should pull them into the cache).

This would help by making it harder to mount a /successive/ attack on different memory locations. Processes chewing through large arrays of objects in memory legitimately shouldn't be impacted that much by such a solution; surely less so than a hard pause on the process.
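
A rough sketch of what that second step might look like in kernel-style C. touch_neighbor_pages() and the place it would be called from are assumptions, not existing kernel interfaces, and get_random_int() stands in for the "semi-random process" above; note also that these reads may be satisfied from the cache unless the lines are flushed first, so a real version would have to deal with that.

    #include <linux/mm.h>
    #include <linux/highmem.h>
    #include <linux/random.h>

    #define NEIGHBOR_LIMIT 16

    /* Hypothetical helper: called from the fault path for processes that have
     * been flagged as heavy faulters. Reads the faulting page plus a
     * semi-random number of following pages so their DRAM rows get activated. */
    static void touch_neighbor_pages(struct page *page)
    {
        unsigned long pfn = page_to_pfn(page);
        unsigned int i, n = 1 + (get_random_int() % NEIGHBOR_LIMIT);

        for (i = 0; i < n && pfn_valid(pfn + i); i++) {
            void *va = kmap_atomic(pfn_to_page(pfn + i));

            READ_ONCE(*(u8 *)va);   /* touch the page so its row is opened */
            kunmap_atomic(va);
        }
    }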


Are you potentially confusing a page fault, which results in the page being pulled into main memory, with a cache miss, which pulls a line from memory into on-CPU memory? My understanding is that the cache miss is handled by the hardware without much kernel involvement, and that we're relying on non-task-aware PMU statistics local to a single CPU for the proposed mitigation.

Am I missing some aspect of page mapping/cache management?


If you want a bit more info about the scrambling process, see:

https://www.usenix.org/conference/usenixsecurity16/technical...


Calling that "scrambling" is pretty unhelpful, though. It's interleaving accesses across the whole collection of memory devices in a fairly regular pattern, but the mapping isn't trivial because the different levels of parallelism have different latency overheads.


If the exploit code can do it, why can't the kernel? Rowhammer is only possible because non-privileged code can predict which rows get affected, so surely the kernel can too?


I have an alternative suggestion for how to mitigate this.

I designed a number of DRAM memory boards "back in the day", but haven't kept up with recent developments. But this idea could be used as a starting point by someone more in tune with current hardware to write a kernel module to help mitigate Rowhammer.

One key thing to know is that, internally, a DRAM chip isn't accessed a single word (of, let's say, 32 bits) at a time. What happens is that a read causes a large number of bits (literally thousands) to be accessed and refreshed at once. Then the selected 32 bits are returned to the CPU. But, as a side effect, all the bits in that row (1024, or perhaps a lot more in current DRAM chips) are refreshed.

So what's needed is a background process that does the following for all of physical memory:

   perform an uncacheable read of 32 bits direct from DRAM
   increment read address by perhaps 32 words
   pause for some small amount of time (perhaps 1 usec)
   repeat forever
This task of repeatedly sweeping through physical memory will, as a side effect, cause all memory cells to be refreshed.

Obviously there is some magic needed, which can only be done in the kernel. First, all of physical memory must be accessible to this process. Second, some tuning must be done to keep the task from consuming too much memory bandwidth. Third, it might make more sense to do something like quickly reading a burst from 4 different physical memory locations and then pausing for 4x as long.
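
For concreteness, here is a very rough sketch (not tested code) of what the core loop of such a kernel thread could look like on x86. The name rowsweep_thread() and the timing constants are just illustrations of the idea above; clflush_cache_range() evicts the line so the following read has to come from DRAM, which refreshes the whole row containing it.

    #include <linux/kthread.h>
    #include <linux/delay.h>
    #include <linux/highmem.h>
    #include <linux/bootmem.h>
    #include <asm/cacheflush.h>

    static int rowsweep_thread(void *unused)
    {
        unsigned long pfn = 0;

        while (!kthread_should_stop()) {
            if (pfn_valid(pfn)) {
                void *va = kmap_atomic(pfn_to_page(pfn));
                unsigned int off;

                /* 32-word (128-byte) stride, as suggested above */
                for (off = 0; off < PAGE_SIZE; off += 128) {
                    /* evict the line, then read it back from DRAM */
                    clflush_cache_range(va + off, sizeof(u32));
                    READ_ONCE(*(u32 *)(va + off));
                }
                kunmap_atomic(va);
            }
            if (++pfn >= max_pfn)    /* wrap around and sweep again */
                pfn = 0;
            usleep_range(1, 10);     /* ~1 usec pause; tune to bound bandwidth and power */
        }
        return 0;
    }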

Unfortunately, running this type of program would be devastating in terms of power consumption. DRAM chips consume much more power while being accessed than while they are in standby. So it would probably have a large deleterious effect on a laptop. But it would probably be OK on a desktop or server.

That's just the basic idea. There is a lot of tuning that could be done. For example, instead of reading through all of physical memory, perhaps read only the kernel memory. That's a lot less memory, and a lot lower power consumption. The idea is that corrupted kernel memory is potentially a lot more harmful than corrupted memory used by some random user process.


DRAM refresh occurs regularly and automatically; think of it as signal regeneration. But this does not correct bit flips, even on ECC memory.

For ECC memory systems your idea is called ECC scrub. The idea is to trigger single-bit error (SBE) correction before more bit flips occur and turn it into an unrecoverable or undetectable error. It usually appears in the BIOS as Patrol Scrub (continuously and slowly walking through all memory, slowly enough not to contend with active programs) and Demand Scrub. See also https://github.com/andikleen/mcelog

I've never seen a reason for Demand Scrub until now... If the kernel talked to the memory controller to map which pages live in which banks, it could trigger a scrub of specific high-risk pages (e.g. the kernel, as you say) or after X uncached accesses... Or it might be possible to segregate less-trusted code into specific bank groups.

https://lackingrhoticity.blogspot.com/2015/05/how-physical-a...

Another approach might be to use memory integrity. Intel SGX has memory encryption and uses a tree of hashes to validate. It has holes against an on-machine attacker, but would defend against bit flips. https://eprint.iacr.org/2016/204.pdf


From the Wikipedia article on memory refresh [0], describing distributed refresh techniques for DRAM:

"For example, the current generation of chips (DDR SDRAM) has a refresh time of 64 ms and 8,192 rows, so the refresh cycle interval is 7.8 μs."

Sounds like your idea is already a thing! That's cool! Unfortunately it seems like the tradeoff between performance and refresh period is difficult, and rowhammer is taking advantage of that.

Protecting subsets of memory is an interesting idea, but which bits do you choose? What if I could rowhammer the memory frame behind a COW page belonging to a setuid binary? I feel like it might be an all-or-nothing kind of thing in this situation. Who knows, though; maybe there's merit in specifically protecting kernel-owned frames. It's difficult to say for sure.

0 - https://en.m.wikipedia.org/wiki/Memory_refresh


I realise the article already touched on this; however, issuing non-temporal stores (the MOVNT family) will bypass the caches, circumventing this particular proposed protection.
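
For reference, this is roughly what such an access looks like with the SSE2 intrinsics; a minimal userspace illustration of the instruction being referred to, not an attack:

    #include <emmintrin.h>   /* SSE2: _mm_stream_si32 (MOVNTI) */

    /* A non-temporal store goes through write-combining buffers rather than
     * the cache hierarchy, which is the property being pointed out above. */
    static void stream_store_loop(int *target, long iterations)
    {
        long i;

        for (i = 0; i < iterations; i++) {
            _mm_stream_si32(target, 0);   /* MOVNTI: store that bypasses the caches */
            _mm_sfence();                 /* drain the write-combining buffer to memory */
        }
    }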


I don't see how it will help. First, if the vendors can ship a new Linux kernel, they can just as well ship updated firmware to increase the memory refresh frequency. Secondly, on mobile devices the CPU and GPU share the same memory, and it could be only a matter of time before we see a GPU-powered rowhammer attack.


Why do you think the firmware would be capable of programming the DRAM refresh rate to a heretofore unheard-of frequency? This capability does not exist in the memory controller.

It might be hard to get the GPU to hammer memory hard enough because of the caches (http://wccftech.com/intel-skylake-gen9-graphics-architecture...). Maybe glBufferData in DRAW mode would force uncached access; it depends on whether the graphics core's cache snoops the address bus (coherent) or not, in which case it could be tricked. However, the AMD Fusion docs certainly seem to indicate it is possible, with significantly higher read bandwidth than from the CPU (http://developer.amd.com/wordpress/media/2013/06/1004_final....). Indeed, it makes me wonder (given the repetitive read pattern required for updating VBOs) whether the system instability people have seen before is not due to the GPU doing unintentional rowhammer.


I thought the refresh rate was programmable. But if not, the firmware could at least lower the DRAM frequency.

Intel GPUs have write-back caches because they share the L3 cache with the CPU cores. AFAIK, other GPUs typically have write-through caches, which don't help against rowhammer.


A kernel module implementation along the same line: https://news.ycombinator.com/item?id=12822490


Just curious: is it possible to rowhammer the cache?


Cache is usually SRAM, while rowhammer is a problem specific to DRAM.


Flagged for linking subscriber only material.


https://lwn.net/op/FAQ.lwn:

Where is it appropriate to post a subscriber link?

Almost anywhere. Private mail, messages to project mailing lists, and blog entries are all appropriate. As long as people do not use subscriber links as a way to defeat our attempts to gain subscribers, we are happy to see them shared.


Ok then, unflagged. Thanks for the clarification.


LWN's editor has repeatedly said he doesn't mind the occasional subscriber link being posted here.


FTA:

"The following subscription-only content has been made available to you by an LWN subscriber."

Was that not there 15 minutes ago?


Saying "subscriber only content shared by a subscriber" made me think it could be a subscriber-to-subscriber thing. But, well, someone else cited how this assumption was wrong.

They should phrase it in a less ambiguous way.


The actual content was in fact made available by a subscriber.



