FORCEDENTRY: Sandbox Escape (googleprojectzero.blogspot.com)
260 points by ivank on March 31, 2022 | 48 comments



It's striking how often these PZ iOS exploits at some point come around to too-polymorphic use of object unarchiving. Other examples:

https://googleprojectzero.blogspot.com/2020/01/remote-iphone...

https://googleprojectzero.blogspot.com/2019/08/the-many-poss...


It would seem the solution is to require every 'unarchive' operation to provide a 'schema' which specifies exactly which classes will be used in the unarchive operation. That schema would be read-only and specific to the unarchive call site.


You're sort of describing NSSecureCoding. Unfortunately, it's not used everywhere, and (because Cocoa relies heavily on internal class clusters) it still allows subclasses of the specified class.
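
For illustration, this is roughly what the allow-list flavour of that looks like with the current API (the class set and the data here are just placeholders):

    import Foundation

    // Only the listed classes (and, as noted above, their subclasses /
    // class-cluster members) may be instantiated while unarchiving.
    let allowed: [AnyClass] = [NSDictionary.self, NSString.self, NSNumber.self]
    let data = Data()   // stand-in for untrusted archive bytes

    do {
        let obj = try NSKeyedUnarchiver.unarchivedObject(ofClasses: allowed, from: data)
        print(obj ?? "nil")
    } catch {
        // Archives referencing any class outside the list are rejected here.
        print("rejected: \(error)")
    }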


I also recall one exploit that happened despite NSSecureCoding being used, because a special internal class was a subclass of a secure-coding class.


> starting in OS X 10.5 (which would also be around the launch of iOS in 2007) NSFunctionExpressions were extended to allow arbitrary method invocations with the FUNCTION keyword: "FUNCTION('abc', 'stringByAppendingString', 'def')" => @"abcdef"

This is always a bad idea. If you want a callback, register it manually so only registered functions are available.

Java serialization and Log4J had the same style of security bug.


It's easy to just look at features and call them "obviously" bad, but this feature provides an extensible interface that isn't much different from any other API in Objective-C that takes a selector, e.g. sortedArrayUsingSelector:. This is just how the language is. It is your job to restrict what can go into that function; in this case, where the attacker controls the address space, it doesn't matter anyway because an attacker will just create a calling primitive elsewhere. So this is really just a kneejerk reaction that doesn't meaningfully help security here, and removes a fairly useful API.

Note that if you actually wanted to make the attack harder here I'd go after the NSPredicate evaluation instead, which you can see in the blog post is done without checking the sender, rather than going after what the predicate itself can do.


I honestly cannot understand the motivation of allowing arbitrary function execution in an NSPredicate at all. That seems like an unbounded complexity risk, especially when serialized. They've just defined a fun little scripting language in this random corner of Foundation. Testing it appropriately seems nightmarish, and the wrong inputs would just crash the app.

A system that only allows a small set of permitted functions, or implements hard-coded functions like the first version (min, max, avg) just seems far more ... reasonable.
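
For contrast, the predefined aggregates are bounded and easy to reason about; it's the selector-based form behind the FUNCTION syntax quoted above that turns NSExpression into a general dispatcher. A rough illustration (values made up):

    import Foundation

    // Predefined aggregate: evaluates to 5, and there is nothing more it can do.
    let maxExpr = NSExpression(forFunction: "max:",
                               arguments: [NSExpression(forConstantValue: [1, 5, 3])])
    print(maxExpr.expressionValue(with: nil, context: nil) ?? "nil")

    // Selector-based form: an arbitrary method name dispatched at runtime
    // on an arbitrary receiver, i.e. a tiny interpreter.
    let fnExpr = NSExpression(forFunction: NSExpression(forConstantValue: "abc"),
                              selectorName: "stringByAppendingString:",
                              arguments: [NSExpression(forConstantValue: "def")])
    print(fnExpr.expressionValue(with: nil, context: nil) ?? "nil")   // "abcdef"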

It's the job of an API framework provider to be defensive against these kinds of vectors especially when they may not deprecate this for like, 10+ years. How this made it through Apple's API review is beyond me.

If I saw someone try and pull something like this at the office I would immediately make sure the design was rejected.


NSPredicate dates from before Objective-C had closures. Objective-C didn't have closures because it was a thin layer on top of C, and C uses callbacks, not closures.

NSPredicate is serialisable because you might want to include it in e.g. a schema. These schemas would be loaded from the app bundle so there’s no security concerns unless your threat model includes modifying the app bundle (which is more of an OS level issue).

The issue is that the serialization needs to distinguish between data that is serialized by the developer of the app at build-time, and data that is serialized at runtime.


I remember, I was a macOS engineer when it came out.

I think NSPredicate, even a serializable one, makes sense. What doesn't make any sense is arbitrary selector invocation. Not just because it's a security risk, but also because it's a stringly-typed method name that isn't checked by the compiler. The name of the selector changing would just cause it to crash - without a compile-time check. A plist with a serialized selector that no longer exists would cause the app to crash. A plist with a malicious selector would cause the app to crash. And crashing seems like a best case outcome.
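
In Swift terms the difference is roughly this (a trivial, made-up example):

    import Foundation

    class Greeter: NSObject {
        @objc func greet() { print("hi") }
    }

    let g = Greeter()

    // Compiler-checked: renaming greet() breaks the build, not the app.
    _ = g.perform(#selector(Greeter.greet))

    // Stringly-typed: a typo or a renamed method is only discovered at runtime
    // as an unrecognized-selector crash, which is what a stale or malicious
    // serialized selector gets you.
    let sel = NSSelectorFromString("greet")
    if g.responds(to: sel) {
        _ = g.perform(sel)
    }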


Racket allows for macros and eval, but it strictly controls the namespace for both of these things: the former mainly via hygienic macros, and the latter by requiring eval to be given a namespace to execute in.

I think allowing things like NSFunctionExpressions is ok, but you need to have an explicit namespace, and the default needs to be an empty namespace, not the current/full namespace. Having a default is a good idea because if you don't add one as an API designer, other people creating wrappers around your library will, so at least try to make the default safe. It sounds like that is somewhat similar to your idea of registering a function.
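
A rough sketch of that registered, empty-by-default namespace idea (everything here is hypothetical, not a real API):

    // Hypothetical evaluator that can only call functions explicitly
    // registered into its namespace; the default namespace is empty.
    struct Namespace {
        private var functions: [String: ([String]) -> String] = [:]

        mutating func register(_ name: String, _ fn: @escaping ([String]) -> String) {
            functions[name] = fn
        }

        func call(_ name: String, _ args: [String]) -> String? {
            return functions[name]?(args)   // unknown names simply return nil
        }
    }

    var ns = Namespace()                     // empty by default
    ns.register("concat") { $0.joined() }    // opt in, one function at a time

    print(ns.call("concat", ["abc", "def"]) ?? "nil")            // "abcdef"
    print(ns.call("stringByAppendingString:", ["x"]) ?? "nil")   // nil, not registered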


NSPredicate has from the very beginning been the moral equivalent of `eval()`, with all of the security implications that has. Sending arbitrary messages to arbitrary objects is just fundamentally the thing it does. The addition of FUNCTION() certainly made it easier to turn a security vulnerability into a useful attack, but if an attacker can control the format string passed to NSPredicate (or more generally the construction of a NSPredicate) you always had a security vulnerability.
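
Which is why, much like SQL, how the predicate gets constructed matters as much as what it can do. A rough sketch (userInput stands in for attacker-influenced data):

    import Foundation

    let userInput = "Smith') OR FUNCTION(..."   // attacker-influenced string

    // Dangerous: the input becomes part of the predicate language itself.
    // let p = NSPredicate(format: "name == '\(userInput)'")

    // Better: the input stays data, bound via %@ substitution.
    let p = NSPredicate(format: "name == %@", userInput)
    print(p.predicateFormat)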


The fact that the extension of NSFunctionExpression in Leopard and the launch of iOS happened around the same time is interesting to me. Clearly one side of Apple was continuing to do everything they could to make their platform more attractive to high-level application developers, continuing the focus on this area that went back to the early years of NeXT [1], unaware (or unconcerned) that another part of Apple would be using this same framework in a consumer device where any way of abusing all of this power could be weaponized.

[1]: See in particular the part of the original NeXT presentation right after intermission, with demos of gluing together high-level components using Interface Builder. https://www.youtube.com/watch?v=92NNyd3m79I


The ostensibly domain-specific interpreter tucked away in NSFunctionExpression is interesting, and I think it would be reasonable for Apple to try to rid their platforms of all such interpreters (with the inevitable exception of JavaScriptCore exclusively for the web). But I think the most important part is this:

> The expressive power of NSXPC just seems fundamentally ill-suited for use across sandbox boundaries, even though it was designed with exactly that in mind.

It reminds me of the YAML deserialization vulnerabilities that plagued Rails. It's clearly necessary to ensure that any data received from an untrusted source is merely data, with no generic way of instantiating arbitrary classes.


Agreed. I worked in Clojure for a hot minute and learned of a pretty nice solution they had to this called EDN (pronounced "eden"): https://github.com/edn-format/edn

I suspect it was inspired by the whole "data is code" philosophy of lisp languages, but it seemed like a well thought out pattern for encoding and decoding data in relatively safe ways. It had a way of tagging fields to indicate that they required processing to derive the decoded value, e.g.

    #inst "1985-04-12T23:20:50.52Z"
Would be interpreted as a Java DateTime object, but one could just as easily read the raw data without respecting those tags if one didn't trust the safety of the data being read.

In effect the format split the work of parsing the data from decoding the data, which is a distinction I haven't seen in many other data encoding mechanisms.
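
Not Clojure, but a Swift-flavoured sketch of that parse/decode split might look like this (all names here are hypothetical):

    import Foundation

    // Hypothetical parsed representation: tags are kept as plain data,
    // nothing gets instantiated just because a tag is present.
    enum ParsedValue {
        case string(String)
        case tagged(tag: String, raw: String)
    }

    // Decoding is a separate, opt-in step with an explicit table of readers.
    func decode(_ value: ParsedValue, readers: [String: (String) -> Any?]) -> Any? {
        switch value {
        case .string(let s):
            return s
        case .tagged(let tag, let raw):
            return readers[tag].flatMap { $0(raw) }   // unknown tags decode to nothing
        }
    }

    let parsed = ParsedValue.tagged(tag: "inst", raw: "1985-04-12T23:20:50.52Z")
    let readers: [String: (String) -> Any?] = ["inst": { s in
        let f = ISO8601DateFormatter()
        f.formatOptions = [.withInternetDateTime, .withFractionalSeconds]
        return f.date(from: s)
    }]
    print(decode(parsed, readers: readers) ?? "tag left undecoded")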


It’s funny that no amount of (Rust-like) compile time or (Java-like) runtime checks would have prevented this — every line of code is working as intended. The parallels to the recent Log4J vulnerability are evident.

If you happen to think, as I do, that we need safe(r) languages to have any hope of creating secure systems, then problems like these are a striking reminder that memory safety alone isn’t sufficient to achieve security.


The problem is historical serialization APIs that date from before people really thought about deserialising as an attack vector. All the big enterprise serialization APIs of the era made the same mistake (and later added layers to allow the developer to limit the set of classes that could be instantiated).

Modern serialization frameworks all seem to have moved to a no-polymorphic instantiations model. Eg when deserialising a field of type X, they will only deserialise into an X.
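
e.g. in Codable terms, the decoder can only ever produce the concrete type named at the call site; there is no class name in the payload for an attacker to pick. A trivial sketch (types made up):

    import Foundation

    struct Message: Codable {
        let sender: String
        let body: String
    }

    let json = #"{"sender":"alice","body":"hi"}"#.data(using: .utf8)!

    do {
        // Whatever the bytes say, the only thing that can come out is a Message.
        let msg = try JSONDecoder().decode(Message.self, from: json)
        print(msg.sender, msg.body)
    } catch {
        print("rejected: \(error)")
    }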


Very cool that this is just from logic bugs! I wonder if we should as a rule assume that sandboxing that is not formally verified or battle tested over a really long time is unlikely to be free of bugs.

What's the long-term solution for these kinds of problems? How can we get out of this tar pit? Of course in the short run we can be diligent about updates and bug bounties etc., but how can we actually eliminate these kinds of errors in a 'complete' way?


Not just any logic bug. I think the most succinct identification so far of the specific type of logic bug is in this comment (not mine): https://news.ycombinator.com/item?id=30871034


Scarily elegant.


This. The "weird machine" they built in the original exploit is the single most impressive exploit I've ever seen.


I heard that Apple had a very hard time even understanding the exploit when they rushed a bandaid-patch for it, leaving most of the holes unpatched for the time being.


This isn't the first weird machine exploit used in the wild, and the people who respond to these kinds of things are quite familiar with them.


Even the knowledge of the full scope of the attack surface is incredible, beyond any other exploit that has ever been made public.


What if we had multiple sandboxes instead? Then we could make the rogue malware 'think' it has escaped by using a decoy environment, and then it has to break another sandbox. Having just one sandbox is a SPOF and once that's breached, it's fair game. (Good) security has always been about /layers/ of security.


Is layering always more secure? Layering may create more complexity in the system overall (usually bad for security) and spreads efforts thinner if you are maintaining both sandboxes. What's the difference between good and bad layering?


> Layering may create more complexity in the system overall

Have you ever heard of the `swiss cheese method of security`, where each slice of cheese has a hole in it? The idea being: each slice may have a hole, but when the slices are sandwiched together, the route of entry is blocked. I've heard the adage 'complexity enlarges the attack surface' before, but it doesn't hold in every case. Sometimes security 'maxims' get in the way of security.


This wouldn't help, because malware developers have access to iOS and can see the implementation of multiple sandboxes.


Also it's fencepost security. Doing the exact same thing because "more is better" without a concrete reason why the additional copies would change the outcome. If the outer sandbox is the same as the inner one, why wouldn't the malware break through immediately? If it's different in some way, why not apply those additional mitigations to the inner one?


That's where I think Rust could be a game changer: if all those deserialization libraries used in phones were written in safe Rust, it wouldn't be an issue.


From the introduction: "In this post we'll take a look at that sandbox escape. It's notable for using only logic bugs." My understanding was that Rust only covers memory corruption types of bugs.


Notably, no language will ever be able to catch pure logic errors by itself. The computer can't and shouldn't try to divine what you meant. It'll only do what it's told.

Now, there are certainly advantages different languages have in ease of writing tests and things like that. There's also formal verification. But unlike memory errors, it's impossible to know if you told the computer to do something you didn't intend.


> Notably, no language will ever be able to catch pure logic errors by itself.

It's true you'll never be able to catch all logic errors automatically, partly because you need full correctness specifications for that.

But languages like Rust with powerful static type systems can catch lots more important logic errors than C, or C++ --- e.g. using typestate (catches bugs with operations on values in incorrect states), affine types (catches bugs with operations on "dead" values), newtypes (e.g. lets you distinguish sanitized vs unsanitized strings), sum types (avoids bugs where "magic values" (e.g. errors) are treated as "normal values").
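
The newtype bullet, sketched in Swift rather than Rust since that's the platform under discussion (names are made up):

    import Foundation

    // Raw attacker-controlled strings and sanitized ones get distinct types,
    // so passing the wrong one is a compile error rather than a latent bug.
    struct UnsanitizedString { let value: String }
    struct SanitizedString { let value: String }

    func sanitize(_ s: UnsanitizedString) -> SanitizedString {
        // stand-in for real escaping/validation
        SanitizedString(value: s.value.replacingOccurrences(of: "'", with: "\\'"))
    }

    func runQuery(_ s: SanitizedString) {
        print("querying with:", s.value)
    }

    let input = UnsanitizedString(value: "O'Brien")
    runQuery(sanitize(input))
    // runQuery(input)   // does not compile: expected SanitizedString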

Also modern languages like Swift, and Rust to a lesser extent, are treating integer overflow as an error instead of just allowing it (or worse, treating it as UB).
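
A small illustration of the Swift side (values made up):

    let x: Int32 = .max

    // Plain arithmetic traps at runtime on overflow instead of silently wrapping:
    // let boom = x + 1

    // Wrapping has to be asked for explicitly, and overflow can be inspected:
    let wrapped = x &+ 1                                  // -2147483648, by request
    let (value, overflowed) = x.addingReportingOverflow(1)
    print(wrapped, value, overflowed)                     // -2147483648 -2147483648 true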


typestates are great, but in rust i wonder if it would be more ergonomic and easier to get right if it was officially a feature instead of a pattern you have to implement...?


Amazingly, typestates used to be a headline feature of Rust, in the very, very, very old days. Like, "Wow, Rust is the language that's going to make typestates mainstream!" was a thing people thought when seeing it.

Eventually it was removed. https://pcwalton.github.io/2012/12/26/typestate-is-dead.html


  > The reason was that “in practice, it found little use”
that's a real shame...


Indeed this is the Halting Problem.


The halting problem doesn't prevent correctness proofs, it only means that you get a three-valued answer: proof, counterexample, or too complex to determine. "Too complex to determine" usually means that the code needs to be rewritten to have simpler structure.

And of course the proof is only for those properties that you write down, and you could also have a bug in the spec for those properties.


I guess that's the true Turing Test: can an artificial general intelligence determine the complexity level and relative risk level of infinite loops for an algorithm written for a Turing machine?


This exploit chain illustrates how once you have obtained a memory corruption primitive it is very hard to prevent full compromise. This makes it more important than ever to limit the ability of attackers to acquire memory corruption primitives.

The best hope for legacy C and C++ code right now is a combination of extensive fuzzing and dynamic analysis to detect as many memory corruption bugs as you can, plus mitigations in CPUs and compilers to make it harder to turn memory corruption bugs into malicious execution, plus sandboxing to limit the damage of malicious execution. This exploit chain demonstrates techniques to bypass that entire mitigation stage (and also shows that Apple severely bungled their sandbox design).

This doesn't bode well for the mitigation approach. Mitigations add significant complexity to the silicon and software stack, which has a cost in performance, but ultimately also in security --- at some point we will see these mitigations themselves contributing to attack vectors. In return they make exploitation a lot more difficult, but mitigation bypass techniques can often be packaged and reused. For example, stack overflow techniques have been effectively mitigated, but that doesn't matter to attackers these days, who are now very good at using heap overflows and UAF. Meanwhile those stack overflow mitigations (stack cookies etc.) still have to remain in place, making our systems more complex and slower.


For this specific work, any mitigation is much worse than just solving the problem correctly.

The WUFFS code to do this sort of stuff (parse file data, turn it into an array of RGB pixel values) is not only inherently safe, it's also typically faster than what you'd write in C or C++, because the safety gives programmers the fearlessness Rust talks about for concurrency. The language is going to catch you every single time you fall, so you get completely comfortable doing ludicrous acrobatics, knowing the worst case is a compiler diagnostic or a failing unit test, and you try again. When you have a hidden array overflow in C++ it's Undefined Behaviour; when you have a hidden array overflow in (safe) Rust it's a runtime panic; when you have a hidden array overflow in WUFFS, that's not a valid WUFFS program, it will not compile, and now it's not so hidden any more.

So you're right, this doesn't bode well for mitigation - the answer isn't "more complex and slower" but "use the correct tools".


Rather than Rust, the correct choice here was WUFFS, because Apple's explicit purpose here was to Wrangle Untrusted File Formats Safely, which is literally why WUFFS is called that.

Unlike Rust, WUFFS isn't a general purpose language it can only express a subset of possible programs. For example, WUFFS can't express the program NSO wanted here ("Provide a way to escape from this sandbox") whereas of course in a general purpose language that would feel annoyingly restrictive. What if you want to escape from the sandbox?

WUFFS is exactly the right tool for, say, parsing a file you received to turn it into an image for display on the iPhone's screen.

Except, of course, if your focus is on features at all costs. If you don't care about security then it sucks that WUFFS can't take a file and maybe overwrite the operating system or whatever.


As explained in the introduction, this escape used only logic bugs, so Rust would still be affected.


From the first paragraph:

> By sending a .gif iMessage attachment (which was really a PDF) NSO were able to remotely trigger a heap buffer overflow in the ImageIO JBIG2 decoder.


Yes, this attack had multiple stages. The first stage is to get something running, and to do that they do use a memory bug that Rust would prevent. The second stage is the sandbox escape that the post describes, which is done with only logic bugs, so that would still be vulnerable if written in Rust.

While rewriting the world is not possible, all these attacks show that new systems should probably not be written in memory-unsafe languages.


They're referring to a previously mentioned exploit:

> Late last year we published a writeup of the initial remote code execution stage of FORCEDENTRY, the zero-click iMessage exploit attributed by Citizen Lab to NSO.

The sandbox escape only uses logic bugs:

> In this post we'll take a look at that sandbox escape. It's notable for using only logic bugs.


You didn't even read the blog post, did you?


I did not indeed.



