Even if that were the case, you'd be working very hard, wasting your assurance budget, to prove something underwhelming, like memory safety or other similarly low-level properties. After all that hard work you'd only end up at the same point where most of the software world -- which is already using memory-safe languages -- is today. On the other hand, if you were to spend your budget the way they do (or rather, should) with modern unsound methods, you would likely end up not noticeably worse off than if you did both.
Remember, the goal isn't to prove some technical property, but to find the greatest number of bugs that matter to you -- those you believe are most dangerous for your particular program or library -- within budget. Researchers are often interested in discovering what properties they can verify, but practitioners care about the functional properties they need to verify.
Deductive methods (including type systems) are particularly myopic in that regard precisely because they're so limited in their effectiveness to begin with. People interested in those particular techniques are rightfully very excited when they discover they can do a little bit more than they previously could, even if what they can do is still so far behind what practitioners need (and are getting elsewhere with more practical methods).
In general, the gap in this field between research and what it actually means for industry work is large and greatly misunderstood by practitioners.
As a practitioner but also a small-time CS researcher, I don't expect verifying that small Rust libraries use 'unsafe' correctly will "waste my assurance budget".
On the cost side: we don't know yet how hard it will be to formally verify small Rust libraries that wrap unsafe code in safe abstractions. That's partly because we don't even know exactly what the specification is that unsafe Rust code has to satisfy. But once we've settled on that and built good proof assistant tools, I think the work will usually be pretty easy, because most of these libraries are fundamentally quite simple and small, even including all their dependencies.
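To give a sense of what such a library looks like, here is a simplified version modeled on the standard library's `split_at_mut` (the function name here is made up); the entire soundness obligation reduces to one disjointness invariant:

```rust
// Illustrative sketch: a safe abstraction over unsafe code, of the kind a
// proof tool for unsafe Rust would need to verify. The name
// `split_at_mut_example` is invented; the technique mirrors std's split_at_mut.
fn split_at_mut_example(slice: &mut [i32], mid: usize) -> (&mut [i32], &mut [i32]) {
    let len = slice.len();
    assert!(mid <= len); // runtime check the safety argument relies on
    let ptr = slice.as_mut_ptr();
    unsafe {
        // SAFETY: the ranges [0, mid) and [mid, len) are disjoint, so the
        // two &mut slices never alias, preserving mutable-xor-shared.
        (
            std::slice::from_raw_parts_mut(ptr, mid),
            std::slice::from_raw_parts_mut(ptr.add(mid), len - mid),
        )
    }
}
```

The specification a verifier would have to discharge is exactly the claim in the SAFETY comment, which is why these small libraries seem tractable.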
You correctly identified compositionality as a key issue. Fortunately, verifying correct use of 'unsafe' is inherently compositional: knowing nothing but the code of the library and its dependencies, we must prove that the necessary safety properties hold in all possible contexts.
On the benefit side: proving that libraries do not undermine Rust's safety guarantees is important, and the importance grows with the popularity of the library. And of course those guarantees aren't just "memory safety" but data-race-freedom and the general "mutable-xor-shared" property --- the importance of which is the subject of the OP here!
> You correctly identified compositionality as a key issue. Fortunately, verifying correct use of 'unsafe' is inherently compositional
Yes, it is an inductive property. Sadly, not a very powerful one.
> On the benefit side: proving that libraries do not undermine Rust's safety guarantees is important, and the importance grows with the popularity of the library.
It is only important if you take it as an axiom that all programs must be proven to be memory-safe. That may be an axiom accepted by some Rust fans, but it is obviously not an axiom of software. What software needs are methods that reduce dangerous bugs (where "dangerous" means different things for different programs) in the most cost-effective way. It may well be the case that, say, soundly eliminating 95% of bugs caused by memory unsafety and using the rest of the budget on unsound methods is more effective than spending it on eliminating the last 5% of those particular bugs. Software's needs do not currently justify complete sound memory safety at all costs, so the question of cost is still the most relevant one. There is no justification to focus on one cause of bugs just because it can be soundly eliminated.
> And of course those guarantees aren't just "memory safety" but data-race-freedom and the general "mutable-xor-shared" property --- the importance of which is the subject of the OP here!
Neither of these are worthwhile enough to move the needle. This is precisely what I mean by the myopic focus on what can be done as opposed to what needs to be done. If the gap is 10 orders of magnitude, narrowing it by 1 doesn't have a big impact, unless it were completely free, which it isn't.
Your focus on "the most cost-effective way" is fine in principle, but in practice we don't know ahead of time what bugs exist and what their impact is. Experience has shown that the prevalence and impact of memory safety and data race bugs is very difficult to predict, and the impact has very high variance. There are so many examples of memory safety bugs and races that lie dormant for years and suddenly become very important (e.g. because someone malicious discovered them, or because the software started being used in a different context).
Note that these safety bugs are really different from most other kinds of bugs in that their potential impact is unbounded: they can cause the program to behave in arbitrary ways. Logic errors, even stuff like non-UB integer overflow, have impact that is still constrained by the expected semantics of the programming language.
Given that uncertainty, it makes sense to take precautions. If you have cheap tools to eliminate those bugs, do it just in case. That might even be cheaper than trying to figure out ahead of time how important those bugs are going to be over the lifetime of the software (which is often an impossible task anyway).
Ah, you seem to think I proposed using formal verification to eliminate 100% of memory safety bugs no matter what the cost. In fact I didn't say that, and I don't recommend that. There will certainly be cases, typically in very large libraries that use 'unsafe', where the invariants you have to prove are very complicated and fragile and difficult to verify, and therefore the cost outweighs the benefit.
What I'm looking forward to is formalisms and tools to pick the "low hanging fruit" --- as I said, verifying the correctness of small libraries that use 'unsafe'. In small libraries, in general, the invariants and proofs are smaller and proof automation is correspondingly more effective.
> Remember, the goal isn't to prove some technical property, but to find the greatest number of bugs that matter to you
No. And this has been a particular challenge for poor C++. If we prove a property then the property holds, but if we just found and meticulously fixed 846 of the crucial holes in our system, the bad guys exploit #847 to break in just the same. Is there a #848? #849? Maybe.
We as an industry are significantly under-delivering and we ought to take responsibility for that.
I think you're correct about the gap but wrong about what the gap is. WUFFS exists today. Lots of people on HN have written or even are writing codecs in a general purpose programming language and don't even realise that's a terrible idea. "I'm pretty careful" substitutes for a machine checked proof and the results can be seen easily.
> And this has been a particular challenge for poor C++. If we prove a property then the property holds, but if we just found and meticulously fixed 846 of the crucial holes in our system, the bad guys exploit #847 to break in just the same. Is there a #848? #849? Maybe.
That is an unattainable goal for most properties we care about, particularly if they're as complex as "an unauthorised user cannot obtain some information". Even a complete and total elimination of all memory safety bugs and all concurrency bugs won't get you anywhere near that goal (and you can verify that for yourself by looking at common server vulnerabilities). I should have said that the attainable goal is to find the greatest number of bugs that matter to you, within budget.
> particularly if they're as complex as "an unauthorised user cannot obtain some information". Even a complete and total elimination of all memory safety bugs and all concurrency bugs won't get you anywhere near that goal
Actually that's not really true. If you take that as a base and remove ambient authority, then you have capability security which does get you as close to fulfilling that property as is possible to achieve. I'd say that's pretty near that goal.
1. You need to verify that all the sensitive data is indeed guarded with the appropriate capabilities.
2. You need to verify that the capability objects are created only after proper authorisation.
3. You need to verify that the capability objects cannot escape their relevant context.
4. You need to verify that the data is not accessible by "extra-program" means, e.g. the file containing the data may be read via a simple path-traversal attack.
Some simple sound techniques by the programming language can help with 3 and a part of 2, but not the rest. Indeed the current #1 top vulnerability [1] is a misconfigured or incorrect access control.
1. "An unauthorised user cannot obtain some information" is not a property that can be achieved by any access control system, even in theory, eg. an authorized user can just provide a copy to an unauthorized user. I will outline the only achievable property below.
2. All resources in a capability system are designated by capabilities, just like all objects in a memory safe language are designated by references.
3. You can obtain more capabilities by accessing capabilities contained within resources designated by other capabilities, just like you can obtain more references by accessing fields of objects that contain references.
4. Capabilities intrinsically combine designation with authorization. If a user has a capability to some resource, they are by definition authorized to access it. There is no such thing as a "context" in which a resource may or may not be accessed, you only either have or do not have a capability to it.
5. Capabilities are not ambient in the environment, and so the only path to obtaining capabilities is through the capability set you're given. An example of an ambient authority is the ability to obtain a handle to any file in the system from an arbitrary path string you can simply construct. Another example is being able to convert an arbitrary number into a reference, i.e. violating memory safety.
Hopefully it's now clear how there's a formal correspondence between capabilities and memory safe languages that have had ambient authority removed. This does not in any meaningful way restrict our ability to construct programs, at worst we must pass around a few more parameters that we used to access using ambient authorities (like File.Open). Access to resources can now only flow along the reference graph and security analysis now becomes tractable, basically another type of dataflow analysis.
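To make the contrast concrete, here's a rough Rust sketch. `DirCap` is a made-up type, and a real capability system would enforce far more than this simple path check:

```rust
use std::io::{Error, ErrorKind, Result};
use std::path::PathBuf;

// Ambient authority: any code anywhere can name and open any file.
fn read_ambient(path: &str) -> Result<String> {
    std::fs::read_to_string(path)
}

// Capability style: file access flows only through a value you were handed.
// `DirCap` is a hypothetical capability to one directory, not a real API.
struct DirCap {
    root: PathBuf,
}

impl DirCap {
    fn read(&self, name: &str) -> Result<String> {
        // Reject traversal so only files directly under `root` are reachable.
        if name.contains('/') || name.contains('\\') || name.contains("..") {
            return Err(Error::new(ErrorKind::PermissionDenied, "path traversal"));
        }
        std::fs::read_to_string(self.root.join(name))
    }
}
```

With `read_ambient` removed from the language's standard environment, code can only touch files reachable from the `DirCap` values it was given.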
So the security property that you want to verify is that there is no way for user X to access resource Y if given capability set C. This means that the transitive closure of C doesn't include Y.
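So the check itself is just graph reachability. A toy sketch (all names invented for illustration):

```rust
use std::collections::{HashMap, HashSet};

// Treat the reference/capability graph as adjacency lists and check whether
// `target` (the resource Y) is reachable from the capability set `roots` (C),
// i.e. whether Y is in the transitive closure of C.
fn reachable(graph: &HashMap<&str, Vec<&str>>, roots: &[&str], target: &str) -> bool {
    let mut seen: HashSet<&str> = roots.iter().copied().collect();
    let mut stack: Vec<&str> = roots.to_vec();
    while let Some(node) = stack.pop() {
        if node == target {
            return true; // Y is in the transitive closure of C
        }
        for &next in graph.get(node).into_iter().flatten() {
            if seen.insert(next) {
                stack.push(next);
            }
        }
    }
    false // Y is unreachable from C: the security property holds
}
```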
There are numerous ways to achieve this, but the most trivial way is to use a typed language and create a type specifically to hold Y, and two capability set types for authorized and unauthorized users, and then you can ensure by simple inspection that a Y never appears in any reference placed in a C (or you can go more precise and only ensure that Y never appears in any return/escape position, thus allowing X to operate on Y's indirectly via some API). The type system is already a dataflow analysis, so this just piggybacks off of that to achieve the property you're after. Note that these are comparatively simple steps compared to getting memory safety to begin with.
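A quick Rust sketch of that type-based approach (every name here is invented):

```rust
// `Secret` holds the protected resource Y. Only `AuthorizedCaps` can contain
// one, and its API never returns the `Secret` itself, so simple inspection
// confirms Y appears in no return/escape position.
struct Secret(String); // the protected resource Y

// Capability set handed to authorized users: it owns the Secret.
struct AuthorizedCaps {
    secret: Secret,
}

// Capability set for everyone else: by construction it cannot mention Y.
struct UnauthorizedCaps;

impl AuthorizedCaps {
    fn new(data: String) -> Self {
        AuthorizedCaps { secret: Secret(data) }
    }

    // Operate on Y indirectly: derived data may escape, the Secret may not.
    fn secret_len(&self) -> usize {
        self.secret.0.len()
    }
}
```

The type system's existing dataflow analysis does the enforcement; no `Secret` value can ever flow into code that only holds an `UnauthorizedCaps`.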
That said, I agree that "extra-program" means of undermining the foundation of one's secure system obviously defeat its security properties. This is no more a compelling objection to me than noting that a proof of an algorithm's validity must assume your computer adds two numbers correctly.
You're thinking like a language designer rather than a black-hat hacker :)
> All resources in a capability system are designated by capabilities, just like all objects in a memory safe language are designated by references.
The problem is that data doesn't begin life as a typed object but as a string of bits. I.e. it passes some "IO barrier" between the outside world and the world controlled by the language. At some point, some code must assign those bits to an object, and this code could be vulnerable.
> If a user has a capability to some resource, they are by definition authorized to access it.
This is defining whatever the program does as being correct, but that is not what we usually mean by correctness [1]. It's like saying that anyone who has obtained root privileges is, by definition, authorised to have them.
I fully understand the point about what is attainable and what isn't, but this is another example of my point about how some who think about correctness focus on what can be done without properly considering how much what can be done is worth, which is defined entirely by what needs to be done (or what's desired). The thing that I, the application's owner, want is not to have my data protected by a programming language's type system or, say, by means of a fuzzer, but to have my data not leak to people I don't want to give my data to. In the case of security, the attacker will always go after the weakest point, whatever it is, and the question is by how much you've reduced the chances of a successful attack and at what cost.
Saying I can't give you what you want but here's what I can give you is perfectly fine, but how much what you can give me is worth to me depends entirely on how much it covers what I want. The value of reducing the chance of a successful attack by 70% is the same regardless of the means by which you achieve it, and then becomes purely a matter of cost.
I fully understand the difficulty of judging probabilities, but you need to understand that that is the sole source of value in correctness. But what I want is to lower the probability of some failure as much as possible for my $100 of investment. I don't care how it's done. That means that debates over assurance methods should be grounded in that question of cost and value. What makes an assurance method better or worse than another isn't how sound it is but how much it costs per unit of value, which is fault reduction (obviously the cost isn't linear). Soundness is a technique for the purpose of controlling value and costs, which may be more or less effective depending on circumstances. The point is that it's a means not an end.
I don't take issue with anything you say here, just your earlier comments about how expensive and non-compositional soundness necessarily is. I just described a programming language with built-in access control that only needs to be verified once for memory safety and absence of ambient authority, and then you can build any system with it and declare security policies, compose them, and have them enforced for no more cost than any other language. So why wouldn't you?
There is no cost-benefit here, it's just benefit. This is why I keep pushing back on how you characterise soundness, it just doesn't necessarily entail the kinds of costs and non-compositionality that you describe.
> how expensive and non-compositional soundness necessarily is
It is "deeper" properties (defined more robustly as those requiring a greater number of alternating universal and existential quantifiers) that are usually non-compositional and expensive to prove (and soundness means proof). Simple properties are easy to prove and can be made compositional, but they're also less valuable.
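To illustrate informally (with `bad` and `answered` as made-up predicates): a plain safety property has a single block of universal quantifiers, while a deeper property alternates them, which is what resists compositional, inductive proof.

```latex
% Shallow: one universal block over program traces -- amenable to
% compositional, inductive proof.
\forall t \in \mathrm{Traces}(P).\; \neg\,\mathit{bad}(t)

% Deeper: alternating quantifiers, e.g. "for every request there exists a
% response that holds up against every environment behaviour".
\forall r.\; \exists y.\; \forall e.\; \mathit{answered}(r, y, e)
```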
> So why wouldn't you?
No reason, except for the fact that you've cheaply defended against the things that are easy to defend against and haven't defended against the hard things. In other words, of course you should prove as many properties as you can for a low cost, but don't exaggerate what it is that you've gained by doing so. That is precisely the delta between "can" and "need" that I was referring to. It's cool that I can get some extra security cheaply, but if that extra security amounts to 5%, then its value is not that high.
> There is no cost-benefit here, it's just benefit.
Of course there is. When the benefit is low then it's particularly easy to offset by other concerns, e.g. the speed at which I can churn out code or the cost of writing a compiler for an exotic architecture, a relatively common concern in the embedded space.
In any event, I always prefer typed languages when building big/important software (I've rarely used untyped languages), but remember that the original post is not about the simple properties that we get cheaply in typed languages, but about how non-aliasing can make proving deeper properties easier. And what I was saying is that these proofs are nowhere near cost-effective in most cases, even after they've been made a little easier.
> and have them enforced for no more cost than any other language
By which I mean that you get the sound policy enforcement for free, without any additional programming effort, where in typical programming languages this isn't the case.
> to prove something underwhelming, like memory safety or other similarly low-level properties. After all that hard work you'd only end up at the same point where most of the software world -- which is already using memory safe languages -- is today.
I'm not sure why this is underwhelming. You reduce your hardware cost budget by using a low-level language while ensuring you don't suffer from the single greatest source of security vulnerabilities in real programs. These have significant real-world impact.
> I'm not sure why this is underwhelming. You reduce your hardware cost budget by using a low-level language while ensuring you don't suffer from the single greatest source of security vulnerabilities in real programs.
There are two problems with your statement. First, it's unclear whether what you save on hardware pays for what you spend on assurance. Second, the participation of memory unsafety in vulnerabilities is not binary. If you can cheaply reduce memory unsafety by 95%, that doesn't mean that paying a high price for the remaining 5% is worth it. Yes, memory safety is the single greatest source of security vulnerabilities in some languages (it is not in general, but then again most software is already written in memory-safe languages), but once you've already reduced it by 95% it no longer is. (BTW, I'm not even sure that soundly eliminating the 95% is clearly worth it compared to alternatives, but that's a separate discussion).
There's this obsession with soundness that is undoubtedly interesting to researchers of sound methods, but isn't actually justified by the real-world needs of software (sometimes it is and sometimes it isn't, but when and by how much is something that can only be answered empirically; it isn't something that can be simply asserted). Software correctness is at least as much art as it is science, and ignoring the empirical aspects that make it an art in favour of the clear-cut scientific aspects doesn't make the empirical aspects disappear. At the end of the day someone has a budget and they need to spend it in the way that yields the most correct software; how that is best done is an empirical question. After all, you started with an empirical finding about memory safety being a leading source of some bugs, but it simply doesn't follow that total memory safety soundness is the answer to the empirical question that we need answered. Soundness is clear-cut and mathematical (and that's also why it's attractive [1]); sadly, software correctness is the very opposite.
We must continue researching sound methods, but if anyone thinks -- knowing what we know now -- that this should be the main focus and the best answer to software correctness then I think they are misjudging the nature of the problem.
[1]: Ironically, people who focus on soundness love proofs, yet they try to skirt around the proofs of the futility of soundness by resorting to empirical -- though untested -- claims.
Here's the problem with this argument: the long history of security vulnerabilities demonstrates that people are complete shite at estimating both the likelihood of security problems, and the severity and scope of any unaddressed security problems.
So the only empirical data that we know to be absolutely reliable is this long history of colossal human failure on estimating this risk. That's why I don't trust your argument one bit, and thus, that's why soundness properties are important, because they take flawed human judgment completely out of the equation.
Which isn't to say that I want soundness proven for all properties no matter the cost, but I do take issue with your outright dismissal of soundness. There are low-hanging fruit like memory safety that handle the vast majority of these problems, and so unquestionably have a great bang for your buck.
But you are again trying to make a non-binary problem into a binary one. We will never be able to soundly verify all the properties we need. That dream is just getting further and further away. 100% correctness is unattainable. So given that, let's say we could only ever hope for 99%. In that case it doesn't matter whether we get rid of 99% of problems using unsound methods or soundly get rid of the cause of 99% of the problems -- the end result is the same. What matters is the cost of getting to that 99% correctness, not the technique.
My choice of technique -- sound or otherwise -- is driven by its cost-effectiveness, not principle. I don't care if you're using a sound technique or an unsound one; I only care about how much it costs to give me the most correctness I need and can get within my budget.
> I do take issue with your outright dismissal of soundness
I don't dismiss soundness. I dismiss the focus on soundness out of principle rather than effectiveness. That myopic focus is grounded in the preferences of researchers of sound methods; it is not grounded in the study of software correctness.
> But you are again trying to make a non-binary problem into a binary one. We will never be able to soundly verify all the properties we need.
I literally said I did not want soundness for all properties no matter the cost, but that soundness for some critical properties is absolutely warranted. The binary for some properties is absolutely there and completely justifiable based on the reasoning I laid out (memory safety in particular).
> but that soundness for some critical properties is absolutely warranted
A particular technique can only be warranted for practical use by its cost effectiveness, not on principle. If you justify soundness for a property because that property is at the root of 90% of bugs then you should still prefer another technique that reduces 90% of bugs more cheaply. It doesn't matter what technique is used to reduce that amount of bugs.
I understand the argument that nothing but soundness covers the full gamut of mistakes people make in practice around memory safety, but at the end of the day, if those mistakes amount to 800 bugs in your program out of 1000, then you should still prefer a technique that reduces 800 bugs -- the same or different ones of equal severity -- for cheaper.
Indeed, there are situations where sound methods are cost-effective, and that is where they tend to be popular.
> If you justify soundness for a property because that property is at the root of 90% of bugs then you should still prefer another technique that reduces 90% of bugs more cheaply. It doesn't matter what technique is used to reduce that amount of bugs.
This is a false equivalence, as not all bugs are the same and so cannot be trivially equated in this fashion. Memory safety violations permit classes of bugs of significantly higher severity than other types of bugs. Your off-hand disclaimer of "the same or different ones of equal severity" doesn't mean anything, because you cannot predict the severity or scope of even trivial memory unsoundness, so your heuristic arguments of preferring lower cost methods simply don't work. A memory safety bug can range from something as simple as "a process crashed" in the best case, to "we lost control of all our servers worldwide and all of our customer data was compromised".
So I would summarize my dispute with your position as follows:
1. Memory safety is NOT an underwhelming property. Arguably, all other properties of your program derive from memory safety.
2. If you have to use an unsafe language that permits violating memory safety, say for performance reasons, and you have any kind of attack surface, then ensuring memory safety is important. Probably more important than most other properties you otherwise think are important due to #1.
3. Heuristic approaches to memory safety do not necessarily entail a reduction in the severity of bugs and/or security vulnerabilities. Arguments that you eliminated 80% of trivial memory unsoundness is simply not compelling if it leaves all of the subtle, more dangerous ones in place.
4. The long history of security vulnerabilities is a main driver of the interest in soundness among researchers, contra your claims that this obsession has no association with real-world needs.
> This is a false equivalence, as not all bugs are the same and so cannot be trivially equated in this fashion.
Correct, and I tried hard not to make that equivalence.
> Memory safety violations permit classes of bugs of significantly higher severity than other types of bugs.
This is not known to be true after eliminating 95% of memory safety violations.
> Your off-hand disclaimer of "the same or different ones of equal severity" doesn't mean anything, because you cannot predict the severity or scope of even trivial memory unsoundness, so your heuristic arguments of preferring lower cost methods simply don't work.
Ah! Now that is 100% true, but it cuts both ways. Of course it's a problem that we can't know in advance what would be a severe bug or its probability, but saying let's shift resources into sound methods in areas where they happen to work implicitly makes such a guess. It's alright to acknowledge that measuring effectiveness is hard, but that also means we can't presuppose the effectiveness of something just because it has an easy-to-understand mathematical property. I'm not arguing for or against sound methods; I'm saying that since the question is empirical, it must be answered by empirical means, as that is the only way to answer it.
A pragmatic argument may be that you'd rather risk overspending on correctness than underspending, which essentially means: spend whatever you can on soundness whenever you can have it. The problem is that the world just doesn't work this way, because budgets are limited, and any work that is done to increase confidence in one property must, necessarily, come at the expense of other work that may also have value. There is simply no escape from measuring the cost/benefit of an economic activity that has a non-zero cost.
If I fly on a plane I want to know that the company doing the avionics spent its assurance budget in the way that eliminated most bugs, not that it spent it on ensuring memory safety (when I was working on avionics, we didn't use deductive methods even though they were available at the time precisely because their ROI was low and that's bad for correctness where you want every dollar to do the most good).
> Memory safety is NOT an underwhelming property. Arguably, all other properties of your program derive from memory safety.
Except this is the starting point for most software. If your goal is to get to where most software already is -- I would very much characterise it as underwhelming.
> If you have to use an unsafe language that permits violating memory safety, say for performance reasons, and you have any kind of attack surface, then ensuring memory safety is important. Probably more important than most other properties you otherwise think are important due to #1.
Yes, but how much should you invest in getting from 0 safety to 90% vs how much from 90% to 100%?
> Arguments that you eliminated 80% of trivial memory unsoundness is simply not compelling if it leaves all of the subtle, more dangerous ones in place.
I agree, but again -- it cuts both ways. If you invest a significant amount of effort in eliminating the last 5% of memory safety violations you cannot claim that it's more effective than spending your effort elsewhere. All I'm saying is that these are empirical questions and the fact that whether something is sound or not has a clear yes/no answer doesn't really help tackle the empirical problem.
> The long history of security vulnerabilities is a main driver in the interest in soundness among researchers, contra your claims that that this obsession has no association with real-world needs.
This is clearly not true, because the focus on soundness has only decreased over the past 50 years and continues to decrease. In the '70s, soundness was considered the only way; now there's more research into unsound methods.
> I'm not arguing for or against sound methods; I'm saying that since the question is empirical, it must be answered by empirical means as that is the only way to answer it.
And I say that the history of CVEs empirically shows that any important software with an attack surface must ensure memory safety, because no heuristic approaches can be relied upon.
> It's alright to acknowledge that measuring the effectiveness is hard, but that also means we can't presuppose the effectiveness of something just because it has an easy to understand mathematical property.
Except you can measure the effectiveness of memory safety in preventing vulnerabilities in domains with an exposed attack surface. Your argument just reduces to an overall focus on bug count, despite acknowledging that 1) most security vulnerabilities are memory safety related, and 2) that memory safety bugs and other types of bugs can't be equated; and your only "escape hatch" is a supposition that a heuristic approach that "eliminates 95% of memory safety violations" probably doesn't leave anything serious behind. Sorry, 40+ years of CVEs does not make this claim reassuring.
> Except this is the starting point for most software. If your goal is to get to where most software is already is -- I would very much characterise it as underwhelming.
It's not the starting point of software that has high performance or low power requirements, or expensive hardware, which is the context in which I took issue with your statement in my first reply. That's why memory safety is not underwhelming in this context.
> Now there's more research into unsound methods.
There's more research into those methods because security costs are still a negative externality, and so a cost reduction focus doesn't factor it into account.
> And I say that the history of CVEs empirically shows that any important software with an attack surface must ensure memory safety, because no heuristic approaches can be relied upon.
That no heuristic approach can be relied upon to eliminate all memory safety violations -- something that is obviously true -- does not mean that eliminating all of them is always worth it. If memory unsafety is the cause of 80% of security attacks, reducing it by 90% will make it a rather small cause compared to others, which means that any further reduction is a diminishing return.
> Except you can measure the effectiveness of memory safety in preventing vulnerabilities in domains with an exposed attack surface. ... and your only "escape hatch" is a supposition that a heuristic approach that "eliminates 95% of memory safety violations" probably doesn't leave anything serious behind. Sorry, 40+ years of CVEs does not make this claim reassuring.
Sorry, this doesn't make any logical sense to me. If X is important because it is a cause of a high number of bugs then reducing the number of bugs caused by X necessarily reduces its remaining importance. You cannot at once claim that something is important because it's the cause of many bugs and then claim that what's important isn't reducing the number of bugs. The number of bugs can't be both relevant and irrelevant.
> Sorry, this doesn't make any logical sense to me. If X is important because it is a cause of a high number of bugs
It doesn't make sense to you because you keep focusing on the number of bugs, and I keep talking about bug severity. As I've already explained, memory safety bugs are worse than other bugs. The number of bugs is generally not as important as severity, and doesn't typically impact how useful software is past some threshold for bug count. The severity of the bugs is important, always; e.g., across the set of programs with 1 bug, the subset of programs where that bug leads to memory unsoundness will be considerably worse than the rest.
> It doesn't make sense to you because you keep focusing on the number of bugs, and I keep talking about bug severity
No, I'm talking about number of bugs weighed by severity.
> As I've already explained, memory safety bugs are worse than other bugs.
Perhaps, but first, the post doesn't talk about memory safety but about deeper properties -- that's the more expensive kind of proof -- and second (since we've started talking about memory safety, which was only something I mentioned in passing as it's completely tangential to this subject) it is not clear just by how much memory safety bugs are worse. A $5 note is, indeed, worth much more than a $1 note, but you still wouldn't pay $50 for it.
Obviously, these things are hard to quantify precisely, but it's important to price things at least somewhat reasonably. At the end of the day, a memory safety violation results in some functional bugs/security vulnerabilities -- which form the actual loss in value -- and is worth their total but not more.
When MS said that memory safety violations cause 70% of security vulnerabilities, they meant that the total worth -- as far as security goes -- of memory safety is 70%, which is the same as any other 70% regardless of cause; i.e. that's the value after factoring in the "impact multiplier", not before.
For example, you can have 10 memory safety bugs that cause 70 severe vulnerabilities and 30 other bugs causing 30 more severe vulnerabilities. Each of the first ten is 7 times worse than each of the other 30, but eliminating only 8 of the first ten and half of the other 30 is just as valuable as eliminating all of the first ten and only one of the remaining 30.
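For a reader who wants to check the arithmetic, here's a small sketch using the hypothetical figures above (10 memory-safety bugs at 7 severe vulnerabilities each, 30 other bugs at 1 each):

```rust
fn main() {
    // Hypothetical figures from the example above.
    let memsafe_weight = 7; // severe vulnerabilities per memory-safety bug
    let other_weight = 1; // severe vulnerabilities per other bug

    // Strategy A: eliminate 8 of the 10 memory-safety bugs and 15 of the 30 others.
    let strategy_a = 8 * memsafe_weight + 15 * other_weight;

    // Strategy B: eliminate all 10 memory-safety bugs and only 1 of the others.
    let strategy_b = 10 * memsafe_weight + 1 * other_weight;

    // Both strategies remove 71 severe vulnerabilities, even though
    // strategy A leaves two memory-safety bugs unfixed.
    assert_eq!(strategy_a, 71);
    assert_eq!(strategy_b, 71);
}
```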
> The binary for some properties is absolutely there and completely justifiable based on the reasoning I laid out
As a spectator in this thread, I do not see an argument for this conclusion in any of your posts. You're just insisting on it rather than arguing based on data.
> You're just insisting on it rather than arguing based on data.
So the unbelievably long history of memory safety vulnerabilities is not an argument for distrusting human judgment on the prevalence or scope of possible unsafety in one's program, and thus an argument for memory soundness? This is literally the argument I presented here [1]. In your mind, this is not an argument based on data?
It's not an argument for always insisting on eliminating 100% of memory safety bugs via sound methods regardless of budget. That requires a more elaborate argument than just "memory safety bugs are really bad and therefore we should eliminate all of them with certainty".
> It's not an argument for always insisting on eliminating 100% of memory safety bugs via sound methods regardless of budget.
Firstly, I never said sound methods ought to be used. Secondly, I also never said that memory safety bugs always must be eliminated without qualification, I specifically couched this in a context of security. In that context, ensuring memory safety is absolutely critical. For instance, scientific simulations of the weather or something don't have such considerations.
The whole problem of memory unsoundness is that you cannot predict the scope or severity of unsoundness. They are simply not like other "bugs", so the heuristic arguments the OP presented simply don't work. All other desirable properties derive from memory soundness, so if you cannot rely on it then your system effectively has no guarantees of any kind.
So yes, any system with any kind of attack surface should absolutely ensure memory safety by any means necessary. If you need to use an unsafe language for performance reasons, as per my initial reply, then isolate the unsafety to trusted kernels (like checked arrays) that can be verified by model checking or other proof techniques.
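A minimal sketch of what such a trusted kernel might look like in Rust -- the type and method names here are illustrative, not from any real library. The point is that the single `unsafe` block is the only code that needs heavyweight verification; everything layered on top inherits its safety:

```rust
// Illustrative "trusted kernel": unsafe indexing wrapped behind a
// runtime bounds check. Only the unsafe block needs model checking
// or proof; all safe code built on top relies on this one argument.
struct CheckedSlice<'a, T> {
    data: &'a [T],
}

impl<'a, T> CheckedSlice<'a, T> {
    fn new(data: &'a [T]) -> Self {
        CheckedSlice { data }
    }

    // The safety argument is local: the branch guarantees idx < len,
    // so get_unchecked cannot read out of bounds.
    fn get(&self, idx: usize) -> Option<&T> {
        if idx < self.data.len() {
            // SAFETY: idx is checked against len on the line above.
            Some(unsafe { self.data.get_unchecked(idx) })
        } else {
            None
        }
    }
}

fn main() {
    let xs = [10, 20, 30];
    let s = CheckedSlice::new(&xs);
    assert_eq!(s.get(1), Some(&20));
    assert_eq!(s.get(5), None);
}
```

Because the safety argument is a couple of lines long, it's exactly the kind of kernel that model checkers and proof tools can handle today.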
> Firstly, I never said sound methods ought to be used.
You kind of did say this: "soundness for some critical properties is absolutely warranted". It seemed clear that you were taking memory safety to be one of the "critical properties".
> if you cannot rely on it then your system effectively has no guarantees of any kind.
In the real world you cannot rely on anything absolutely. All guarantees are probabilistic. Rust, for example, does not have a formally verified compiler, so you cannot be absolutely sure that your provably memory safe code will actually compile down to memory safe machine code (or indeed to machine code that does what it's supposed to do in any other respect). Does that mean we need to go off and formally verify the Rust compiler because otherwise none of our systems will have guarantees of any kind? If it doesn't, then note that you are making exactly the same kind of cost/benefit analysis that pron is talking about.
It's easy to list lots of things that systems 'should' do. In a realistic scenario there is a budget of time/effort/attention/money.
> You kind of did say this: "soundness for some critical properties is absolutely warranted". It seemed clear that you were taking memory safety to be one of the "critical properties".
You can achieve soundness without "sound methods" in the formal sense that you were using the term.
> If it doesn't, then note that you are making exactly the same kind of cost/benefit analysis that pron is talking about.
I disagree, there's at least an order of magnitude difference between what we're discussing (probably more). pron is tossing around numbers like eliminating 95% of memory safety bugs, whereas the safety of runtime-checked arrays will eliminate more like 99.99% of memory safety bugs, where that 0.01% allows for the chance that you made a mistake in your checked array implementation. There is simply no equating the two.
I think "sound methods" in this context just means methods that guarantee soundness. You can have soundness without sound methods, but you can't know (for sure) that you have it.
> to prove something underwhelming, like memory safety or other similarly low-level properties.
Memory safety is table stakes. If you can't even prove memory safety, your program doesn't really have any well-defined semantics after all.
> you'd only end up at the same point where most of the software world -- which is already using memory safe languages -- is today.
Which is another way of saying that the modern software world is already using lightweight formal methods in an end-to-end fashion. Type checking is one such use of lightweight formal methods.
Unsound methods can be useful to surface possible bugs quickly and efficiently, much like fuzzing does. But they're not a kind of verification - they provide no guarantee of correctness.
> Which is another way of saying that the modern software world is already using lightweight formal methods in an end-to-end fashion.
Sure, but only for specific properties that are inductive. Most properties that programmers care about aren't. It is precisely this kind of extrapolation from one property to another that doesn't work.
> But they're not a kind of verification - they provide no guarantee of correctness.
Correct, but: 1. nothing else does either (again, sound formal verification only works in very limited situations -- limited either in the properties it can guarantee or the programs it can guarantee them in -- and is falling further and further behind what the software world needs), and 2. unlike for algorithms, a "guarantee" is not and cannot be the bar for real software running on physical hardware, as you cannot guarantee the real behaviour of a software system even if the algorithm is guaranteed to be correct. Unlike an algorithm, a software system is ultimately a physical object, not a mathematical one. To crudely oversimplify, you don't care about any confidence beyond the probability of a bit flip.
In other words, you're pointing out that the current state-of-the-art in the field is not as good as something we don't (perhaps can't) have and don't need to begin with.
> Sure, but only for specific properties that are inductive. Most properties that programmers care about aren't.
Yes but in practice, we keep finding ways of making more such interesting properties "inductive" by placing appropriate restrictions on how the program can be designed. This is how the Rust borrow checker works, at a high level.
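A standard illustration of that restriction (a textbook borrow-checker example, not from this thread): by forbidding shared borrows from overlapping a mutation, the checker turns "no use of a dangling reference after reallocation" into an invariant that holds at every program point and can be checked locally:

```rust
fn main() {
    let mut v = vec![1, 2, 3];

    // Rejected by the borrow checker (won't compile if uncommented):
    // the shared borrow `first` would still be live across the
    // mutable borrow that `push` takes, and `push` may reallocate.
    // let first = &v[0];
    // v.push(4);
    // println!("{first}");

    // Accepted: the value is copied out, so no shared borrow overlaps
    // the mutation. The "no aliased mutation" invariant holds at every
    // program point -- that's what makes the property inductive.
    let first = v[0];
    v.push(4);
    assert_eq!(first, 1);
    assert_eq!(v.len(), 4);
}
```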
Not really, and this is what I meant: an order-of-magnitude improvement is very exciting to the researcher, even if in practice there are still 29 to go and it doesn't make a big impact. Even something as simple as the borrow checker -- which only brings you to about where most of software, with all its bugs, already is -- comes at the cost of only being able to use memory in certain ways. I.e. to make the property inductive, it had to be made more coarse (indeed, that is what is always done when looking for inductive properties [1]), which requires more effort on the part of the user.
As I've written above, over the past two or three decades we've only learned in theory that things are more difficult than we thought, and at the same time the capabilities of sound verification have fallen further away from the needs of mainstream software. The trajectory of both the theory and practice of sound methods has been negative. All the while, we've made some nice strides in unsound methods. Tony Hoare started recognising this in the nineties (which somewhat shifted his 70s view on the subject), and the trend has continued: yes, there are improvements in sound methods, but the problems become harder faster than those improvements, and the improvements in unsound methods are also coming faster.
[1]: That is seen very elegantly in TLA+, but I digress.
That would require a formal model of how 'unsafe' works, and we still don't have that. Pointer provenance is an especially difficult issue, also affecting C/C++ specification efforts.
Indeed, we don't have that model, and we definitely need it. I don't think it's intractable. We've made a lot of progress. I would like to see more focused effort on this particular problem from the academic community.
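A minimal sketch of the kind of code that makes the specification hard to pin down. The round trip below behaves as expected in today's compilers, but a provenance model has to say exactly *why* the reconstructed pointer is allowed to access `x` at all, rather than treating pointers as bare addresses:

```rust
fn main() {
    let x: i32 = 42;
    let p: *const i32 = &x;

    // Cast the pointer to an integer and back. Under a flat
    // "pointers are just addresses" model this is trivially fine;
    // under a provenance model, the question is whether q inherits
    // permission to access x. Settling that question precisely is
    // the open specification problem mentioned above.
    let addr = p as usize;
    let q = addr as *const i32;

    // SAFETY: q has the same address as p and x is still live; the
    // int-to-pointer cast acts as the "exposed provenance" escape hatch.
    let y = unsafe { *q };
    assert_eq!(y, 42);
}
```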