Rust -- like any other non-research language -- achieves this particular feature mostly through runtime checks (which, I'd guess, are elided in many circumstances). If you're willing to pay for those, it's much easier to use a C compiler that adds them (and removes them if it can prove they're unnecessary, as I'm sure the Rust compiler does, too). Even adding some of those checks manually is likely a far smaller effort than rewriting everything.
Now, I'm not saying that if you just do that (and not use more powerful verification tools) the result would be as good as rewriting everything in Rust, but the difference would certainly not be worth the cost.
But my main point is this: if the industry decides to make a significant portion of mission-critical C code safer, we should do a careful cost/benefit analysis, and find the best approach to tackle each problem. I'm not saying that manually or automatically injecting overflow checks is what we should do, but I doubt that a wholesale rewrite in any language is the best way to achieve this goal. A complete rewrite of nearly endless code with uneven quality and very partial tests has never been, and likely never will be, a cost-effective or a very productive way of improving its quality (but write your new code in Rust!)
> Rust -- like any other non-research language -- achieves this particular feature mostly through runtime checks
This isn't true, and it's a big part of what makes rust interesting for this kind of work. There are array bounds checks, yes, but only with the indexing operator which is not idiomatic rust (FWIW, Rust has idiomatic iterators which allow LLVM to remove bounds checks at compile time). Nearly all of the other memory safety guarantees are from compile time checks.
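To make the indexing-vs-iterator point concrete, here's a minimal sketch (my own illustration, not from any real codebase): the indexed version conceptually carries a bounds check per access, while the iterator version never indexes at all, so there's nothing to check.

```rust
// Summing a slice two ways.

// Indexing goes through a bounds check on each access; in a simple
// pattern like this LLVM can usually prove it safe and remove it,
// but the check is conceptually there.
fn sum_indexed(xs: &[i32]) -> i32 {
    let mut total = 0;
    for i in 0..xs.len() {
        total += xs[i]; // bounds-checked access
    }
    total
}

// The idiomatic iterator version never indexes, so there is no
// bounds check to remove in the first place.
fn sum_iter(xs: &[i32]) -> i32 {
    xs.iter().sum()
}

fn main() {
    let data = [1, 2, 3, 4];
    assert_eq!(sum_indexed(&data), 10);
    assert_eq!(sum_iter(&data), 10);
}
```

And to be clear, this only covers bounds checks; the borrow checker's guarantees (no dangling pointers, no data races) involve no runtime machinery at all.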
It's fine for a general discussion about rewrites to be uninformed about the specifics of the target language, but citing incorrect specifics about a language isn't helping your case.
Yes, I know that, but we are talking about bounds checks! In many places where there is simple iteration, a C compiler would be able to prove no over/underflow, and where the pattern isn't so straightforward, a Rust rewrite would be even more expensive. Now, I don't know how many times I need to say it, but I have no doubts that Rust is "interesting for this kind of work". I absolutely fucking love Rust! I am just seriously questioning the economic sense of rewriting every piece of legacy code out there (billions of LOC!) whenever a new language comes along in an effort to improve its safety/correctness just because we know that there are probably lots of bugs in there somewhere. Draining the ocean with a spoon is guaranteed to find every piece of sunken treasure, but it's probably not the approach with the greatest bang-for-the-buck.
What would you say if in ten years people would say, "Ugh, Rust?! That hardly guarantees any interesting properties. Fine, it ensures there are no data races and dangling pointers, but does it make sure the library/program does what they're supposed to? Let's rewrite every line of code out there in Idris or ATS! That would be the best use of our time!" Whenever a library needs to be replaced -- either it doesn't serve modern requirements, it's hard to maintain and needs to undergo changes, or it is simply shown to be broken beyond repair, by all means -- write it in Rust! Gradually, there will be more and more Rust code out there, which is no doubt much better than C code. But a wholesale rewrite?
Bounds checks aren't really about underflow and overflow though? At least not as typically discussed. They're about determining whether the access to array contents is within the predefined region of memory which is covered by the array. And while Rust has bounds checks when directly indexing into an array, idiomatic Rust relies on iterators, whose bounds checks are nearly always removed by LLVM -- generally eliminating the runtime checks you referred to.
> I absolutely fucking love Rust!
Me too!
> I am just seriously questioning the economic sense of rewriting every piece of legacy code out there (billions of LOC!) whenever a new language comes along in an effort to improve its safety/correctness just because we know that there are probably lots of bugs in there somewhere.
Me too!
I don't know anyone who is seriously advocating for "rewrite literally everything in Rust." I use that expression to describe those who specifically call out OpenSSL, glibc, etc. every time a new CVE comes up which requires a fire drill. Rewriting critical infrastructure is very different from "everything," and even then, I've not made any assertion that we should do so. Just that it's worth looking at how we might do so if we choose to, and that it's also useful to have exploratory projects which can discover some of the pitfalls here.
> What would you say if in ten years people would say, "Ugh, Rust?! That hardly guarantees any interesting properties.
That would be absolutely fantastic! Because if that's happening, perhaps Rust will have improved the overall situation and encouraged using better typing guarantees for systems work. If Rust is displaced by a language with better or more interesting safety properties, I will be elated (at least in part because to be displaced, Rust will have to have done OK in the meantime ;) ). But for now, I think it's worth finding out what mileage we can get out of Rust and its various properties and guarantees.
> Whenever a library needs to be replaced -- either it doesn't serve modern requirements, it's hard to maintain and needs to undergo changes, or it is simply shown to be broken beyond repair, by all means -- write it in Rust! Gradually, there will be more and more Rust code out there, which is no doubt much better than C code. But a wholesale rewrite?
So reading over your various responses, I think we're actually in general agreement. Any major change in a project requires careful risk assessment/management and justification. I'm just choosing to spend a small amount of my free time exploring the possibility space of the Rust rewrite process so I can know more (and also learn more about how my OS and most applications work).
> Bounds checks aren't really about underflow and overflow though?
Buffer overflow is possibly what is being referred to? That's definitely the province of bounds checking.
> I don't know anyone who is seriously advocating for "rewrite literally everything in Rust."
Personally, I assumed that was supposed to mean "some critical subset of infrastructure". I'm not sure why pron settled on the particular interpretation they did (everything in the literal sense), but they've been careful to define that over and over in their comments, and very few have bothered to clarify like you have. It's almost enough to make me wonder a bit if the interpretation you and I have is actually the minority and a sizable portion of the rust community, or at least those involved here, actually want it in a literal sense, but I think that would be silly...
> generally eliminating the runtime checks you referred to.
Yes, but I conjecture that this may be largely irrelevant (and would love to hear an explanation of why my conjecture is false) because, in a significant enough portion -- if not the majority -- of the cases where this applies, one of the following is true: 1/ the proof of no overflow could be determined by a C tool (it may not be a 100% proof that works in the face of, say, concurrent modification like Rust can guarantee, but something that's more than good enough), or 2/ making full use of Rust's contribution in that regard would require an API change, which you can't do. In most other cases, utilizing Rust's contribution would require a careful, very expensive study and analysis of the code (if you don't want to introduce new bugs), and so in that case, too, a C rewrite of that particular piece may still end up being cheaper and not (significantly) less useful (as the rewrite would reshape the code in such a way that a C tool would be able to verify its correctness to a sufficient degree).
Remember that a proof (and I don't necessarily mean a formal proof) that a piece of C code is correct in some specific regard (overflow) is just as much a guarantee as having that property verified by the Rust compiler, and is just as useful if you don't intend to make changes to the code anyway. If you're writing new code, then obviously it's great having the compiler help you, but when seeking to correct bugs in old code, most of which is likely correct, the justification is different.
> I don't know anyone who is seriously advocating for "rewrite literally everything in Rust."
Then why are people arguing? I honestly don't think I wrote anything even slightly contentious in my original comment. I wrote something that I think is pretty obvious (and meant as an emphasis of what you've written in your post; certainly not to express disagreement) yet worth noting on HN, where some people may be too new-language-trigger-happy but may have less experience with legacy code improvement and various (effective!) verification tools, or may be unaware that the majority of program-correctness work is done outside the realm of PL design.
Isn't it obvious that before deciding to rewrite large amounts of largely untouched and largely correct legacy code, we should weigh the relative cost and benefit of other approaches, like tools that can help find and fix lots of bugs relatively cheaply, and then wrap the thing and forget about it until compelled to rewrite by some critical necessity? It seems like some people disagree, and favor a preemptive rewrite of as much infrastructure code as possible, merely because it's "better".
I admit I made the mistake of mentioning "totally correct" programs. While it's true that they are currently more feasible in C than in Rust (thanks to really expensive tools like Why3, or even automatic code extraction from various provers), that is an entirely separate issue, one that applies only to a minuscule portion of software.
> Just that it's worth looking at how we might do so if we choose to, and that it's also useful to have exploratory projects which can discover some of the pitfalls here.
Absolutely. You're doing great work, and like I said, I hope you keep track of the effort you spend as well as the number and kind of bugs you find in the process. That would make your work truly invaluable.
> That would be absolutely fantastic! Because if that's happening, perhaps Rust will have improved the overall situation and encouraged using better typing guarantees for systems work.
Hmm, here we may be in disagreement. I think that making types ever more expressive has diminishing returns, and is, in general, not the best we can do in terms of correctness cost-effectiveness; indeed, most efforts in software correctness research look elsewhere. Types have some advantages when expressing more "interesting" program properties and some disadvantages (although I think Rust's borrow checking is some of the most useful use of advanced typing concepts I've seen in many years; I can't say the same about languages designed for industry use that try to employ general dependent types with interactive proofs).
> But for now, I think it's worth finding out what mileage we can get out of Rust and its various properties and guarantees.
I agree, but I believe more benefit will likely be gained by writing new libraries and programs, not from rewriting old ones. I think only concrete absolute necessity should guide any rewriting effort.
> I'm just choosing to spend a small amount of my free time exploring the possibility space of the Rust rewrite process so I can know more (and also learn more about how my OS and most applications work).
> Then why are people arguing? I honestly don't think I wrote anything even slightly contentious in my original comment.
I think your original interpretation was slightly pedantic, and even though, to your credit, you were very careful to repeat your exact phrasing multiple times in many comments, nobody commented on what was likely the crux of the difference (which, I have to admit, I'm confused as to why that was ignored). Should every program be converted? Definitely not. Is conversion even necessarily the goal? I think not. I think the goal is to get a Rust representative in each area to promote choice.
It's about trust, and not necessarily how much I trust the specific developer, but that in using a language that requires specific conformance to compile, I can transfer some subset of the trust I had to afford the developer to the tool I use to compile. Do I trust code written by DJB or the OpenBSD developers? Yes. Would I like a way to transfer some amount of that level of trust to the random author of a library on GitHub that would be really useful for my work? Hell yes. Does that replace my need to look for markers, whether in the code or based on project/author status, to determine whether the project and/or developers are competent and trustworthy? No, but it can reduce it.
> I agree, but I believe more benefit will likely be gained by writing new libraries and programs, not from rewriting old ones. I think only concrete absolute necessity should guide any rewriting effort.
Depending on what you mean by "new libraries" (new concepts or new versions?), I don't think that's sufficient. True, most of the real benefit of a rewrite or new implementation won't come from a language conversion of an existing library, but if it comes down to getting a new libc (for example), and our choice is between converting a small implementation so there's something, or waiting for the next project to come along and hoping they don't choose C, I'm happy with the conversion for now. If we're lucky, it will also show those about to start on the next libc-like project that Rust is a viable option, so it isn't done in C.
The C language is sufficiently broken that it's really hard to reliably retrofit any type of safety features on top of it. Since people assume that C is portable assembler, they expect it to map very closely to the underlying machine, which makes even specification-permissible changes effectively impossible (good luck getting fat pointers or non-wraparound integer overflow working on real programs!).
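For contrast, here's a minimal sketch (mine, just to illustrate the point) of how Rust surfaces the overflow choices that C code leaves implicit: each behavior is opted into explicitly at the call site, which is exactly what you can't retrofit onto C code that silently assumes wraparound.

```rust
fn main() {
    let x: u8 = 250;

    // C code routinely assumes two's-complement wraparound on unsigned
    // arithmetic; Rust makes each possible behavior an explicit choice:
    assert_eq!(x.wrapping_add(10), 4);     // opt in to wraparound
    assert_eq!(x.checked_add(10), None);   // detect the overflow
    assert_eq!(x.saturating_add(10), 255); // clamp at the type's max

    // Plain `x + 10` would panic in debug builds and wrap in release
    // builds (unless overflow-checks are enabled in the profile).
}
```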
The state of the art here for retrofitting C code is ASAN and tools like SoftBound, SAFECode, and baggy bounds, all of which use shadow memory techniques and impose about 100% overhead. And you can't use them simultaneously because there's a conflict on the shadow memory, which makes actually deploying software with security checks enabled impractical if not impossible.
Disclaimer: I've worked on research projects for automatically adding software protections to C/C++ code (albeit for integer overflows, not memory safety).
Once again: we are not talking about the brokenness of C, but about whether rewriting billions of lines of legacy code whenever a safer language comes along because surely it contains bugs is the most economically sensible way to improve software quality.
I am not talking about providing all the benefits a rewrite would, but I am suggesting something that would be orders of magnitude less expensive, and still very effective (and may actually have some benefits that a rewrite won't), with the net result of being an overall more effective strategy (and I am not talking about runtime tools but about whitebox fuzzing, concolic testing and static analysis tools).
> Rust [...] achieves this particular feature mostly through runtime checks
If you do a 1:1 translation then sure. But idiomatic rust probably uses zero-overhead (or very nearly so, I haven't checked in a while) iterators and other higher level constructs that generate extremely efficient code and don't require bounds checks.
You won't see nearly as many "for (size_t i = 0; i < BOUND; i++) { ... array[i] ...}" in Rust, which means even though array bounds checking might be no faster in Rust, the fact that you don't use it as often is a big win.
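For example, a C-style scaling loop like that translates into an element-wise iteration (a rough sketch; exact codegen depends on the optimizer):

```rust
fn scale(array: &mut [f64], factor: f64) {
    // C-style: for (size_t i = 0; i < BOUND; i++) { array[i] *= factor; }
    // Idiomatic Rust iterates over the elements directly, so there is
    // no index and therefore no bounds check to elide:
    for x in array.iter_mut() {
        *x *= factor;
    }
}

fn main() {
    let mut v = vec![1.0, 2.0, 3.0];
    scale(&mut v, 2.0);
    assert_eq!(v, vec![2.0, 4.0, 6.0]);
}
```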