The big thing here is that the GPU has historically been a pain point for Android, because it has extensive access to the AP (application processor) in ways that basically sidestep any mitigation you put in its way. Any bugs in the driver's mapping code (and there have been many) end up giving very powerful primitives, and this fact has repeatedly been used in in-the-wild exploits. Unfortunately, I don't think much is going to change here until this gets rearchitected.
> A number of GPUs use a standard Arm SMMU instead of an IOMMU already.
Yes, I'm talking about using IP cores like an Arm SMMU (which is an IOMMU). Perhaps some GPUs do, but many (most?) don't, including the Mali-G710 discussed in this article, which is currently shipping in the Pixel 8.
> The problem with those GPUs in general is driver issues, the hardware is fine.
Exactly. I want them to stop writing bespoke kernel code manually fiddling with some custom page table format that gives physical memory read/write primitives when they get it wrong.
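To make the failure mode above concrete, here's a toy sketch (not real driver code, and not the Mali bug) of the kind of bespoke page-table bookkeeping being criticized: a hypothetical driver computes a mapping's page count with floor division instead of rounding up, so the trailing partial page of a buffer is mishandled, exactly the class of off-by-one that turns into a physical memory read/write primitive.

```python
# Toy model of bespoke mapping-size bookkeeping in a hypothetical GPU
# driver. All names are illustrative; the bug pattern is the point.

PAGE_SIZE = 4096

def pages_needed_buggy(nbytes):
    # Bug: floor division silently drops the trailing partial page,
    # so map/unmap paths covering this buffer disagree on its extent.
    return nbytes // PAGE_SIZE

def pages_needed_fixed(nbytes):
    # Correct: round up so every byte of the buffer is covered.
    return (nbytes + PAGE_SIZE - 1) // PAGE_SIZE

buf = 3 * PAGE_SIZE + 100  # buffer with a trailing partial page
print(pages_needed_buggy(buf), pages_needed_fixed(buf))  # 3 vs 4
```

With a real IOMMU/SMMU enforcing translations in hardware, a mistake like this faults; with hand-rolled page-table code, the stale or missing entry is just silently wrong.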
> What is interesting about this vulnerability is that it is a logic bug in the memory management unit of the Arm Mali GPU and it is capable of bypassing Memory Tagging Extension (MTE)
The rest of the article appears to describe that the bug is actually caused by a race condition, and the use-after-free is simply a consequence of it.
One of the main goals of GrapheneOS is to release security updates as soon as possible, so if it's patched upstream GrapheneOS almost surely includes the patch.
Sometimes they even adopt pre-release AOSP security patch levels or backport security fixes from unreleased AOSP or kernel sources.
Given that this is related to a hardware-ish problem (maybe firmware inside it?) in the GPU, I'd bet it even affects it after the March update, which was related to the Bluetooth stack.
EDIT: Ignore me, I was confusing that with the recent blog post they had about finding an issue with MTE applying to all system apps too. Looks like GrapheneOS should have this as of their 2024030600 release because it brings in the "full 2024-03-05 security patch level"
The right kind of mitigation targets the first-order primitive: the root cause of the bug.
Hardware solutions: CHERI (Morello, CheriIoT), MTE
Software mitigations: kalloc_type+dataPAC, AUTOSLAB, Firebloom, GuardedMemcpy, CastGuard, attack surface reduction
Safe programming languages: Rust, Swift
MTE/CHERI play pretty nicely - they help ensure that whatever bugs we have in these areas are killed at their root cause… MSR, MSRC and Azure Silicon pushed for… scaling CHERI down to RISC-V32E, the smallest core RISC-V specification.
CHERI-based microcontroller that aims to… get very strong security guarantees if we are willing to co-design the instruction set architecture (ISA), the application binary interface (ABI), isolation model, and the core parts of the software stack… our microcontroller achieves the following security properties:
Deterministic mitigation for spatial safety (using CHERI-ISA capabilities).
Deterministic mitigation for heap and cross-compartment stack temporal safety (using a load barrier, zeroing, revocation, and a one-bit information flow control scheme).
Fine-grained compartmentalization (using additional CHERI-ISA features and a tiny monitor).
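The first property above, deterministic spatial safety via capabilities, can be sketched in a toy model (purely illustrative, not the CHERI ISA): a "capability" carries bounds alongside the address, and any access outside those bounds faults deterministically rather than corrupting memory.

```python
# Toy model of a CHERI-style capability: address + base/length bounds,
# with the bounds check standing in for what real hardware enforces.

class Capability:
    def __init__(self, base, length):
        self.base, self.length = base, length

    def load(self, offset):
        # Deterministic bounds check on every access.
        if not (0 <= offset < self.length):
            raise MemoryError("capability bounds fault")
        return self.base + offset  # stand-in for the actual load

cap = Capability(base=0x4000, length=64)
print(cap.load(0))    # in bounds: allowed
try:
    cap.load(64)      # one past the end: deterministic fault
    faulted = False
except MemoryError:
    faulted = True
print(faulted)
```

Unlike probabilistic mitigations, there is no lottery here: every out-of-bounds access traps, which is what "deterministic mitigation" means in the quoted list.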
> There are around 13 billion lines of open source C and C++, which end up in various TCBs. This number gets even bigger when you include proprietary code… if we all stopped writing C/C++ code now and every software engineer focused on rewriting legacy code in safe languages (and on the assumption that everything can be written in safe languages) then it would take 5-10 years to replace everything and we’d likely see a lot of logic bugs because we’d be replacing old well-tested code with new code that would need different algorithms and data structures to fit with allowable idioms in safe languages.
> If we didn’t do the rewriting thing and just stopped writing code in C/C++, then at normal code replacement rates, our TCBs would be entirely safe in around 50 years. If we don’t all agree to stop writing C/C++, it’s at least 100 years.
> In contrast, if the major CPU vendors shipped CHERI CPUs in five years, most machines (and all high-value ones) would have memory safety within 15 years of today, without needing programmers to change their behaviour.
For anyone interested in CHERI for embedded/IoT and other similar use cases, lowRISC (whom I work for) is building a couple of FPGA-based evaluation platforms for CHERIoT (the Microsoft-created CHERI variant referred to above): https://www.sunburst-project.org/
The first is the Sonata system: https://github.com/lowRISC/sonata-system. This comprises a dedicated PCB with an FPGA along with various peripherals and headers. The PCB design is done and the board will be available through Mouser (plus it's open source, including the board layout, so you can assemble your own if you like). We're currently working on the RTL for the FPGA. When complete, you'll have a complete CHERIoT-based microcontroller-like system with documentation and tooling.
CHERI is great, but until it becomes a widespread product rather than an Arm Morello test board or the current RISC-V prototype, anything else in production is better than nothing.
Yes and no. CHERI provides bounds safety but not lifetime safety. If you use capability-enhanced garbage collection you can have both, but obviously bolting garbage collection on top of everything you're already doing with manual management (reference counting, etc.) in your existing C/C++ codebase is going to be the worst of both worlds.
Lifetime safety is a much harder problem to solve. Despite CHERI providing "more robust" bounds safety, the fact that you get decent lifetime safety essentially for free from MTE is a huge plus. The two technologies aren't incompatible, so in theory you could bolt the two together to get MTE lifetime safety and CHERI bounds safety, but that would likely waste a ton of memory.
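MTE's lifetime-safety property can be sketched in a toy model (illustrative only, not the Arm implementation): memory granules and pointers both carry a small tag, freeing retags the memory, and a load through a stale pointer faults because the tags no longer match. Real MTE uses 4-bit tags, so a reused allocation can collide with roughly 1-in-16 probability; this sketch retags deterministically to keep the demonstration exact.

```python
# Toy model of MTE-style tag checking: "pointers" are (address, tag)
# pairs, and every load compares the pointer tag to the memory tag.

import random

class TaggedHeap:
    def __init__(self):
        self.mem_tags = {}           # address -> current memory tag

    def alloc(self, addr):
        tag = random.randrange(16)   # assign a 4-bit tag on allocation
        self.mem_tags[addr] = tag
        return (addr, tag)           # the tag travels in the pointer

    def free(self, addr):
        # Retag on free; +1 mod 16 guarantees a mismatch in this toy
        # (real MTE retags randomly, so detection is probabilistic).
        self.mem_tags[addr] = (self.mem_tags[addr] + 1) % 16

    def load(self, ptr):
        addr, tag = ptr
        if self.mem_tags[addr] != tag:
            raise MemoryError("tag check fault (MTE analog)")
        return "ok"

heap = TaggedHeap()
p = heap.alloc(0x2000)
assert heap.load(p) == "ok"   # live pointer: tags match
heap.free(0x2000)             # free retags the granule
try:
    heap.load(p)              # stale pointer: fault instead of silent UAF
    caught = False
except MemoryError:
    caught = True
print(caught)  # True
```

This is also why the MTE bypass in the article is notable: the protection only holds if nothing lets an attacker sidestep the tag check itself.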
GPU hardware is crawling with bugs. Hardware is only re-spun for things that cannot be worked around in the driver at an acceptable cost. That approach is possible because GPUs do not allow relatively direct hardware access like CPUs do.
This is great research and a great write-up, but I'm a little (pleasantly) surprised to see it on GitHub's blog.
Does anyone know what their "business reason" for doing research like this is? (not that a business reason should be needed, but like I said, I'm a bit surprised to see it here)
They got bought by Microsoft and so have the resources to sponsor research, including of this kind. There’s a GitHub app, and the security of that app is not outside their purview. If an attacker manages to install a lurking app on your phone, they could do stuff as you. If you're someone with GitHub clout, that could be really damaging, so it's in their interest to find such vulnerabilities.
They have hosted Actions runners for Arm too, so they may have an interest in checking and verifying the security capabilities of Arm hardware with MTE for sandboxing.
> Does anyone know what their "business reason" for doing research like this is? (not that a business reason should be needed, but like I said, I'm a bit surprised to see it here)
I think it's basically basic research [0]. To a first-order approximation, GitHub as a product doesn't really need Android security experts, but employing them has some potential long-term benefits.
That said, it wouldn’t be abnormal for a security team to have free time and dedicate it to researching an emerging technology, whether or not it directly contributes to the business goals. Of course, I’m not talking about a security team that is reading log files from their SIEM while sitting in a SOC.
I understand people disliking tone indicators, especially when they can ruin a joke, but they are really wonderful things that can prevent misunderstandings like this online.
Wow, that's just absolutely incorrect. Ignoring that tons of security teams are actually stupidly busy, this person's specific role at GitHub is security research. GitHub has security products for code security, which his work ties into.
My colleagues at the GH Security Lab saw this and made this thread/response [1]
I’ll paste:
Why does GitHub Security Lab do research like @mmolgtm’s recent work on bypassing MTE on the Pixel 8? This question was asked on Hacker News and we think it’s worth a short thread.
news.ycombinator.com/item?id=397522…
First an important point: we only research open source code, which means that many parts of your phone (for example most of your apps) are out-of-scope for us. That said, all open source code is in-scope, including projects that aren’t hosted on GitHub. (Quote tweet reply to this tweet [2])
Open source software is the foundation of much of the world’s software. So when open source wins, we win. And that’s why @GitHub takes its responsibility seriously, to help make open source software more secure.
GitHub Security Lab sits within @GitHubSecurity, and we focus exclusively on open source security with four main priorities:
First, we run the GitHub Advisory Database, which is a comprehensive database of open source vulnerabilities. https://t.co/U4HlXO2l1G
Second, we share information around secure coding practices, through blogs and video content. https://t.co/EdO5SZtR0B
Third, we use GitHub’s CodeQL to scan thousands of open source repositories for common security mistakes, like SQL injections or path traversals. https://t.co/m72rt2a5RL
And fourth, we do deep research on critical open source projects. @mmolgtm’s recent work on Arm Mali is an example of this. https://t.co/jxVYeoJjtO
Similarly, our work with CodeQL provides feedback to the code scanning team to help improve and further develop the feature so that more vulnerabilities are caught quickly and automatically. https://docs.github.com/en/code-security/code-scanning/intro...
And these activities also benefit open source, because GitHub security products, including Dependabot and CodeQL, are free for open source projects!
Our deep research work is primarily intended to inspire the community, so that we can improve open source security together. That’s why we publish detailed blog posts and proof-of-concept exploits.
We’re big believers in Linus's law: “given enough eyeballs, all bugs are shallow”. Together, we’re making open source software secure. https://en.wikipedia.org/wiki/Linus%27s_law
I am surprised no one has yet introduced a CPU and phone with little if any GPU and called it a business phone. The obvious advantages would include security, cost, and power consumption.
Swipe up from the bottom of your iPhone. Oops, you're suddenly doing 3D transformations.
There are dozens of UI effects which rely on the GPU, and there's just no such thing as a 2D GPU these days; it makes no sense unless you're building a retro console or something.
Things like inertial scrolling are not 'fancy UI animations'; they're core components of a touch UI. Take out the touch UI and you're back to something like a nicer Treo.
Anyone who has an eInk device (where such animations are impossible due to the refresh rate of the screen) can tell you that it's still fully usable and has nothing to do with getting back to BlackBerry or Treo.
It looks less nice and is limited in some ways, but for business needs it does the job perfectly.