> particularly if they're as complex as "an unauthorised user cannot obtain some information". Even a complete and total elimination of all memory safety bugs and all concurrency bugs won't get you anywhere near that goal
Actually that's not really true. If you take that as a base and remove ambient authority, then you have capability security which does get you as close to fulfilling that property as is possible to achieve. I'd say that's pretty near that goal.
1. You need to verify that all the sensitive data is indeed guarded with the appropriate capabilities.
2. You need to verify that the capability objects are created only after proper authorisation.
3. You need to verify that the capability objects cannot escape their relevant context.
4. You need to verify that the data is not accessible by "extra-program" means, e.g. the file containing the data may be read via a simple directory traversal attack.
Some simple sound techniques in the programming language can help with 3 and part of 2 (a sketch follows below), but not with the rest. Indeed, the current #1 vulnerability [1] is misconfigured or incorrect access control.
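To make the list concrete, here's a minimal TypeScript sketch of what 1-3 amount to in code. All names (SecretReport, ReportCap, isAuthorised) are invented for illustration, and note that nothing in the language can address 4:

    class SecretReport {
      constructor(public readonly contents: string) {}
    }

    function isAuthorised(user: string): boolean {
      return user === "alice"; // stand-in for a real authorisation check (2)
    }

    class ReportCap {
      // Private constructor: outside code cannot forge a ReportCap, which is
      // part of keeping the capability from escaping its context (3).
      private constructor(private readonly report: SecretReport) {}

      // The capability is minted only after the authorisation check passes (2).
      static grant(user: string, report: SecretReport): ReportCap | null {
        return isAuthorised(user) ? new ReportCap(report) : null;
      }

      // The data is reachable only through the capability (1), assuming no
      // other reference to the SecretReport leaks, which is exactly the part
      // that still needs verifying.
      read(): string {
        return this.report.contents;
      }
    }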
1. "An unauthorised user cannot obtain some information" is not a property that can be achieved by any access control system, even in theory, eg. an authorized user can just provide a copy to an unauthorized user. I will outline the only achievable property below.
2. All resources in a capability system are designated by capabilities, just like all objects in a memory safe language are designated by references.
3. You can obtain more capabilities by accessing capabilities contained within resources designated by other capabilities, just like you can obtain more references by accessing fields of objects that contain references.
4. Capabilities intrinsically combine designation with authorization. If a user has a capability to some resource, they are by definition authorized to access it. There is no such thing as a "context" in which a resource may or may not be accessed; you either have a capability to it or you don't.
5. Capabilities are not ambient in the environment, so the only path to obtaining capabilities is through the capability set you're given. An example of ambient authority is the ability to obtain a handle to any file in the system from an arbitrary path string you can simply construct. Another example is being able to convert an arbitrary number into a reference, i.e. violating memory safety.
Hopefully it's now clear that there's a formal correspondence between capabilities and memory safe languages that have had ambient authority removed. This does not in any meaningful way restrict our ability to construct programs; at worst we must pass around a few more parameters for things we used to reach through ambient authority (like File.Open). Access to resources can now only flow along the reference graph, and security analysis becomes tractable, basically another kind of dataflow analysis.
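A rough sketch of what that looks like in practice, in the spirit of the File.Open example. The Dir and FileRead interfaces below are invented for illustration, not any real API:

    // A directory capability that can only reach its own children.
    interface FileRead {
      read(): string;
    }

    interface Dir {
      openRead(name: string): FileRead; // relative names only, no absolute paths
      subdir(name: string): Dir;
    }

    // Ambient style: any code anywhere can open any file from a string it
    // constructs. Capability style: this function can only touch what its
    // caller handed it, so access flows along the reference graph.
    function loadConfig(configDir: Dir): string {
      return configDir.openRead("app.conf").read();
    }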
So the security property that you want to verify is that there is no way for user X to access resource Y if given capability set C. This means that the transitive closure of C doesn't include Y.
There are numerous ways to achieve this, but the most trivial is to use a typed language: create a type specifically to hold Y, and two capability-set types, one for authorized and one for unauthorized users. You can then ensure by simple inspection that a Y never appears in any reference placed in an unauthorized C (or, more precisely, that a Y never appears in any return/escape position, thus allowing X to operate on Ys indirectly via some API). The type system is already a dataflow analysis, so this just piggybacks on it to achieve the property you're after. Note that these are comparatively simple steps next to achieving memory safety in the first place.
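A toy version of that construction, with all names invented:

    // Y gets its own nominal type, and the two user classes get distinct
    // capability-set types.
    class Y {
      constructor(private readonly secret: string) {}
      summarize(): number { return this.secret.length; } // indirect use of Y
    }

    interface AuthorizedCaps {
      report: Y;
    }

    interface UnauthorizedCaps {
      publicDocs: string[]; // by construction, no field can reach a Y
    }

    // Verifying "X with caps C cannot reach Y" reduces to inspecting that no
    // type reachable from UnauthorizedCaps mentions Y, and that no function
    // taking UnauthorizedCaps returns (or otherwise escapes) a Y.
    function serveUnauthorized(caps: UnauthorizedCaps): string[] {
      return caps.publicDocs;
    }

Once the types are declared, every later change to the program is re-checked against the policy by the compiler for free.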
That said, I agree that "extra-program" means of undermining the foundation of one's secure system obviously undermine its security properties. This is no more a compelling objection to me than pointing out that a proof of an algorithm's correctness must assume your computer adds two numbers correctly.
You're thinking like a language designer rather than a black-hat hacker :)
> All resources in a capability system are designated by capabilities, just like all objects in a memory safe language are designated by references.
The problem is that data doesn't begin life as a typed object but as a string of bits, i.e. it crosses some "IO barrier" between the outside world and the world controlled by the language. At some point, some code must turn those bits into an object, and this code could be vulnerable.
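A toy illustration of where that boundary sits (names invented):

    // Somewhere, raw bytes become a typed object, and a bug in that parsing
    // code sits outside the capability discipline.
    function parseGrant(raw: Uint8Array): { user: string; resource: string } {
      const text = new TextDecoder().decode(raw);
      const parts = text.split(":");
      // A bug here (missing validation, swapped fields, trusting lengths)
      // mints a typed object from attacker-controlled bits. The type system
      // governs everything after this function, not the function itself.
      if (parts.length !== 2) {
        throw new Error("malformed grant");
      }
      return { user: parts[0], resource: parts[1] };
    }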
> If a user has a capability to some resource, they are by definition authorized to access it.
This is defining whatever the program does as being correct, but that is not what we usually mean by correctness [1]. It's like saying that anyone who has obtained root privileges is, by definition, authorised to have them.
I fully understand the point about what is attainable and what isn't, but this is another example of my point about how some who think about correctness focus on what can be done without properly considering how much what can be done is worth, which is determined entirely by what needs to be done (or what's desired). The thing that I, the application's owner, want is not to have my data protected by a programming language's type system or, say, by means of a fuzzer, but to have my data not leak to people I don't want to give it to. In the case of security, the attacker will always go after the weakest point, whatever it is, and the question is by how much you've reduced the chances of a successful attack and at what cost.
Saying "I can't give you what you want, but here's what I can give you" is perfectly fine, but how much what you can give me is worth depends entirely on how much it covers what I want. The value of reducing the chance of a successful attack by 70% is the same regardless of the means by which you achieve it, and the choice then becomes purely a matter of cost.
I fully understand the difficulty of judging probabilities, but you need to understand that that is the sole source of value in correctness. What I want is to lower the probability of some failure as much as possible for my $100 of investment; I don't care how it's done. That means that debates over assurance methods should be grounded in that question of cost and value. What makes an assurance method better or worse than another isn't how sound it is but how much it costs per unit of value, where value is fault reduction (obviously the cost isn't linear). Soundness is a technique for controlling value and costs, one that may be more or less effective depending on circumstances. The point is that it's a means, not an end.
I don't take issue with anything you say here, just your earlier comments about how expensive and non-compositional soundness necessarily is. I just described a programming language with built-in access control that only needs to be verified once for memory safety and absence of ambient authority, and then you can build any system with it and declare security policies, compose them, and have them enforced for no more cost than any other language. So why wouldn't you?
There is no cost-benefit here, it's just benefit. This is why I keep pushing back on how you characterise soundness, it just doesn't necessarily entail the kinds of costs and non-compositionality that you describe.
> how expensive and non-compositional soundness necessarily is
It is "deeper" properties (these are defined more robustly as those requiring a greater number of alternating universal and existential quantifiers) that are usually non-compositional and expensive to prove (and soundness means proof). Simple properties are easy to prove and can be made compositional, but they're also less valuable.
> So why wouldn't you?
No reason, except for the fact that you've defended cheaply against the things that are easy to defend against and haven't defended against the hard things. In other words, of course you should prove as many properties as you can for a low cost, but don't exaggerate what it is that you've gained by doing so. That is precisely the delta between "can" and "need" that I was referring to. It's cool that I can get some extra security cheaply, but if that extra security amounts to 5%, then its value is not that high.
> There is no cost-benefit here, it's just benefit.
Of course there is. When the benefit is low then it's particularly easy to offset by other concerns, e.g. the speed at which I can churn out code or the cost of writing a compiler for an exotic architecture, a relatively common concern in the embedded space.
In any event, I always prefer typed languages when building big/important software (I've rarely used untyped languages), but remember that the original post is not about the simple properties that we get cheaply in typed languages, but about how non-aliasing can make proving deeper properties easier. And what I was saying is that these proofs are nowhere near cost effective in most cases, even after they've been made a little easier.
> and have them enforced for no more cost than any other language
By which I mean that you get sound policy enforcement for free, without any additional programming effort, whereas in typical programming languages this isn't the case.