Fearless Security: Memory Safety (hacks.mozilla.org)
137 points by feross on Jan 23, 2019 | 86 comments



When talking about Garbage Collection they claim that "Even languages with highly optimized garbage collectors can’t match the performance of non-GC’d languages" and link to this paper: http://greenlab.di.uminho.pt/wp-content/uploads/2017/09/pape... However I cannot see where this paper supports the assertion (although Rust comes out well).

The particular problem is that malloc/free is not free of charge. Different allocators have complex internal implementations, plus you lose easy sharing of complex structures and compaction. So if you're using a GC you probably program in a different way, eg using more shared immutable structures, exploiting the advantages of GC.

Edit: A bit later they brush off this old favourite for GC-haters: https://people.cs.umass.edu/~emery/pubs/gcvsmalloc.pdf claiming that GC requires 5x as much memory. The problem with this paper is it assumes that malloc/free is cost-free and instantaneous, and that your programmer always calls free at the exact moment that the memory is no longer needed, both of which are extremely unrealistic.


>The particular problem is that malloc/free is not free of charge. [...] The problem with this paper is it assumes that malloc/free is cost-free and instantaneous,

I think I get that you're trying to explain how GC overhead is overstated and malloc()/free() cost is understated, but your angle about "malloc not being free of charge" isn't really the evidence you want to use.

As an analogy, we can see that a Lamborghini is faster than a Hyundai. Let's say we state that the performance comparison is flawed because it still takes the Lamborghini a non-zero amount of time to accelerate to 60 mph (or 100 km/h), and the Lamborghini still consumes gasoline. While those cost data points are true, they don't change the macro observation that the Hyundai is slower.

Also, you're misrepresenting the 2nd paper you cited. The authors do mention the "overhead" of malloc; they do not assume that malloc is "cost-free". On page 4:

>3.2 malloc Overhead - When using allocators implemented in C, the oracular memory manager invokes allocation and deallocation functions through the Jikes VM SysCall foreign function call interface. While not free, these calls do not incur as much overhead as JNI invocations. Their total cost is just 11 instructions: six loads and stores, three register-to-register moves, one load-immediate, and one jump. This cost is similar to that of invoking memory operations in C and C++, where malloc and free are functions defined in an external library (e.g., libc.so).

In other words, even if malloc/free is not instantaneous and not cost-free, it can still be faster than GC. Neither paper's conclusions depend on malloc/free taking zero amounts of time or zero cpu instructions. If the papers are flawed, you have to explain it using different logic.


> >3.2 malloc Overhead - When using allocators implemented in C...

That's not even about malloc overhead, it's about the overhead of just calling C malloc. So actual malloc overhead is on top of that.

Malloc and free have pretty unpredictable runtime over time, once a lot of allocations and deallocations have been performed. That's why you don't use either in latency sensitive code, like with realtime requirements.

> In other words, even if malloc/free is not instantaneous and not cost-free, it can still be faster than GC.

Whoah, faster than GC in what regard? GC will probably win the throughput race, or average latency. Manual allocation will likely win the jitter race, lower latency standard deviation.

(I write low level code in C/C++, including kernel drivers and bare metal firmware with hard realtime requirements. Hopefully in the future in Rust or some other memory and concurrency safe language.)


>Malloc and free have pretty unpredictable runtime over time,

Yes, that's another true statement about malloc, but it also doesn't matter to the particular point I'm making. To continue your correct & true statements about malloc, we can add:

- malloc has to search the freelist; GC can be just a bump allocation, which is faster (see the sketch after this list)

- malloc leads to fragmented memory; GC can reorganize and reconsolidate

- malloc doesn't have extra intelligence to assign pointers to shared memory structures (e.g. Java string pool stores identical strings only once based on hashes)

- ... a dozen other true statements about malloc
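To make the bump-allocation point concrete, here's a minimal sketch in C (hypothetical, not any particular collector's code): a GC nursery can hand out memory by advancing a single pointer, whereas a freelist malloc has to search for a block that fits.

    /* Minimal bump-allocator sketch: allocation is a pointer increment
       plus a capacity check. A copying GC nursery works roughly like this. */
    #include <stddef.h>
    #include <stdint.h>

    static uint8_t heap[1 << 20];   /* fixed nursery, just for illustration */
    static size_t  top = 0;

    void *bump_alloc(size_t n) {
        n = (n + 15) & ~(size_t)15;             /* keep 16-byte alignment */
        if (n > sizeof heap - top) return NULL; /* a real GC would collect here */
        void *p = &heap[top];
        top += n;
        return p;
    }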

All those true statements (which most can agree on) aren't the misunderstanding. The issue is misusing those true statements as some type of convincing evidence to explain the papers' flaws. For example:

>Whoah, faster than GC in what regard?

Well, we can just use the total runtime of the two papers' benchmarks where there were lots of memory operations. (In other words, we can acknowledge that performance has multiple dimensions/axes, but we can also look at the simple measurement of total wall clock time of benchmark code that doesn't do database access or floating point calculations.)

The C/C++ programs ran faster and took up less memory.

Ok, were there flaws in the benchmarks? Then let's explain the specific flaws.

Yes, I can say "malloc runtime is unpredictable" but that true statement doesn't actually explain anything about GC running slower than malloc/free in the papers. We can also say that "malloc is not cost free" as another true statement -- but that also doesn't actually explain the GC's longer elapsed time.

See the problem with those attempted explanations? They're all non-sequiturs.


> Well, we can just use the total runtime of the two papers' benchmarks where there were lots of memory operations. (In other words, we can acknowledge that performance has multiple dimensions/axes, but we can also look at the simple measurement of total wall clock time of benchmark code that doesn't do database access or floating point calculations.)

...

> See the problem with those attempted explanations? They're all non-sequiturs.

I'm comparing GC vs manual memory management.

You (or the papers) are comparing different implementations of programs in different languages. That might be great for practical considerations for choosing implementation language, but is pointless when comparing those two different memory management strategies. Apples and oranges.

EDIT: I feel the "Quantifying the Performance of Garbage Collection vs. Explicit Memory Management" paper is a bit dishonest. From the paper:

> The culprit here is garbage collection activity, which visits far more pages than the application itself [61]. As allocation intensity increases, the number of major garbage collections also increases. Since each garbage collection is likely to visit pages that have been evicted, the performance gap between the garbage collectors and explicit memory managers grows as the number of major collections increases.

Pages got evicted – so their heap ran out of physical RAM and started swapping to disk. Wow.

Yeah, GC uses much more RAM, that's a well known downside. Setting the benchmark up in such a way that causes the system to start swapping is not a fair way to compare GC and manual allocation throughput.


>You (or the papers) are comparing different implementations of programs in different languages.

FYI... the 2nd paper uses the same language, Java. It just compares different allocation strategies: explicit vs GC. (I think that paper is written in a confusing way.)

My original point back to op (rwmj) was that the computer scientists were quite aware that malloc had a non-zero cost. And pointing that out really doesn't challenge the paper's findings.


Yeah, and the second paper said their GC scenario system was swapping to disk. Please read my edit to the previous comment.


Yet several companies on the world touring circuit happen to use semi-automatic gearboxes that outperform any human driving with a manual, so much for the typical car comparisons.


Agreed. I suspect a more useful distinction would be "...can't match the performance of languages that facilitate stack allocation" - the main advantage comes from locality of reference rather than tighter heap management. There's a separate point that could be made about the memory headroom needed for GC to work well, but (sadly) I don't think many people think of memory-efficiency as a performance metric these days.

I'm not sure about your

> So if you're using a GC you probably program in a different way, eg using more shared immutable structures

This is definitely not the case for mainstream GC languages like Java, which have truly awful support for immutability, to the point of having to write a separate type if you want it.


Sorry if OT, but the nand2tetris course really brought the benefits of stack vs heap allocation into stark relief. As part of the course, you have to implement a compiler and OS, including malloc/free [1].

You can see how local variables just push something onto the stack, which is a small number of CPU instructions, while malloc is a big function, each step of which translates into many instructions, and which also has to deal with iterating through a data structure to find a big enough block.

Part 8: Implementing the virtual machine, where you do stack allocation/calls: https://www.nand2tetris.org/project08

Part 12: Implementing the OS, where you add the malloc/free utilities (Memory.jack): https://www.nand2tetris.org/project12

[1] It calls them alloc and deAlloc in the course.
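For contrast with the stack's simple push, a toy first-fit allocator along the lines of what Memory.jack asks for might look roughly like this in C (a simplified sketch, not the course's actual code):

    /* Toy first-fit allocator sketch: each free block starts with its size
       and a pointer to the next free block. alloc() has to walk the list;
       a local variable is just a stack-pointer adjustment by comparison. */
    #include <stddef.h>

    struct block { size_t size; struct block *next; };
    static struct block *free_list;  /* assume initialized to one big block */

    void *toy_alloc(size_t n) {
        struct block **prev = &free_list;
        for (struct block *b = free_list; b; prev = &b->next, b = b->next) {
            if (b->size >= n) {          /* first block that fits wins */
                *prev = b->next;         /* unlink it (no splitting here) */
                return b + 1;            /* payload starts after the header */
            }
        }
        return NULL;                     /* nothing big enough */
    }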


Agreed about Java. However in OCaml or Haskell it allows elegant immutable tree structures with maximal sharing between versions.


> that your programmer always calls free at the exact moment that the memory is no longer needed

I think this is not so unrealistic if you use RAII.


Herb Sutter has a nice talk about how RAII can cause stack overflows or stop-the-world pauses if used badly.

https://www.youtube.com/watch?v=JfmTagWcqoE


Even then. There's nothing stopping you from placing a loop in the same scope as an object that's not used in or after the loop.


Also, another point is that even GC languages happen to support value types, stack and global memory allocation, manual memory management in unsafe code, and memory slices.

So putting them all in the same bag doesn't work quite well.


This can be easily illustrated using this example [1], which uses both the Boehm GC (v8.0.2) and jemalloc (v5.0.1), directly installed via Homebrew, as well as the system malloc on macOS to implement the binary trees benchmark from the benchmark game.

The benchmark keeps a large live set and allocates aggressively, making it an unattractive scenario for many GCs.

The Boehm GC is being run with the following four distinct configurations:

- Four marker threads in parallel, trading CPU time for wall clock time and lower pause times.

- Single-threaded marking.

- GC disabled, all memory is freed explicitly.

- Incremental collection (using virtual memory).

The results, run on a MacBook Pro with a six-core 2.6 GHz Core i7:

  $ make benchmark DEPTH=21
  # jemalloc explicit malloc()/free()
  /usr/bin/time ./btree-jemalloc 21 >/dev/null
         17.53 real        17.40 user         0.11 sys
  # Boehm GC with four parallel marker threads
  GC_MARKERS=4 /usr/bin/time ./btree-gc 21 >/dev/null
          8.50 real        10.87 user         0.09 sys
  # Boehm GC with single-threaded marking
  GC_MARKERS=1 /usr/bin/time ./btree-gc 21 >/dev/null
         10.40 real        10.33 user         0.05 sys
  # Boehm GC with explicit deallocation
  GC_MARKERS=1 /usr/bin/time ./btree-gc-free 21 >/dev/null
         11.75 real        11.70 user         0.04 sys
  # Boehm GC with incremental collection (single-threaded)
  /usr/bin/time ./btree-gc-inc 21 >/dev/null
         18.39 real        16.40 user         5.11 sys
  # System malloc()/free()
  /usr/bin/time ./btree-sysmalloc 21 >/dev/null
         64.43 real        63.69 user         0.71 sys
Obviously, one should not read too much into this, as this is a very specific scenario with its own very specific allocation behavior that will not match other use cases. And one can speed up this specific example easily with a specialized allocator (as all allocations have the same size and predictable lifetime). Plus, different GCs make different tradeoffs, and so do general purpose manual allocators.

But for throughput at least, the worries about GC overhead tend to be exaggerated.

In practice, any language with proper value types will also spend only a fairly small fraction on allocation and garbage collection, so overhead becomes less of a problem.

[1] https://gist.github.com/rbehrends/528fc713c24195b1c8aefda074...


For completeness' sake, there are C programs that never allocate memory dynamically, using pre-allocated variables only. Your microwave likely runs one.

There are also programs that allocate but never de-allocate memory. They are fast-terminating programs, ranging from a CLI utility to onboard software that controls a surface-to-air rocket.

But many programs still need more complex memory management done safely.
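For illustration, a sketch of the first pattern (names are made up): everything the program will ever need is declared up front, so there is nothing to allocate, free, or leak at runtime.

    /* All storage is statically pre-allocated; no malloc/free anywhere.
       Typical of small firmware: capacities are fixed at compile time. */
    #include <stdint.h>

    #define MAX_EVENTS 32

    struct event { uint32_t timestamp; uint8_t code; };

    static struct event event_queue[MAX_EVENTS]; /* fixed-capacity queue */
    static unsigned     event_count;

    int push_event(uint32_t ts, uint8_t code) {
        if (event_count == MAX_EVENTS) return -1;   /* full: drop the event */
        event_queue[event_count++] = (struct event){ ts, code };
        return 0;
    }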


I think it's worth noting that there's plenty of memory unsafety to be had without dynamic allocation.


Buffer overflow without any dynamic allocation:

    #include <string.h>
    #include <stdio.h>

    int main() {
        char hello[5];          /* too small: "hello" needs 6 bytes with the '\0' */
        strcpy(hello, "hello"); /* writes one byte past the end of the buffer */
        printf("%s\n", hello);
    }

(gcc warns about this; interestingly, clang does not)


Rust recently had an integer overflow in str::repeat permitting a buffer overflow. It was a classic bug, the kind that gave C a bad reputation. And the error(s) in arithmetic occurred outside any unsafe block.

If we want to be pedantic, all this talk of "fearless" wrt Rust is dangerous hyperbole. Once you move past juggling scalar values things get dicey, especially in C but also in so-called "safe" languages like Rust. And that, I think, was the previous poster's point--as you move away from dynamic allocation you tend to move toward using scalar (fixed-sized) objects.

One of the original points of emphasis of Rust was to favor scalar values rather than pointers and even references. To a limited but useful extent you can mimic this in C. Rust examples that simply use Vec miss the point--1) the str::repeat bug overflowed a Vec, and 2) just because C doesn't come with a built-in Vec doesn't mean you can't write one or use one.


> Rust recently had an integer overflow in str::repeat permitting a buffer overflow.

In case someone is interested of the details, like I was: http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2018-10008...


>And the error(s) in arithmetic occurred outside any unsafe block.

This is completely misleading. The unsound check for integer overflow[0] was in 'safe' code, but just because the code is not in an unsafe block does not absolve you of any potential bugs, and the subsequent writes to memory that were in an unsafe block clearly did not care about the prior mistakes that were made. The PR that introduced the vulnerability[1] was due to switching a safe method to one that uses unsafe for performance reasons. I suggest checking out the commits for both the introduction of this bug and the fix, as they're both short and the actual bug occurs on one short line.
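For readers unfamiliar with the bug class: the C analogue is a size computation that wraps around, an allocation that succeeds but is far too small, and writes that then run off the end. Here's a sketch of doing the check correctly with GCC/Clang's __builtin_mul_overflow (hypothetical helper, not the code from the Rust PR):

    #include <stdlib.h>
    #include <string.h>

    /* Returns a buffer holding `count` copies of `src` (len bytes each),
       or NULL if the total size would overflow or allocation fails. */
    void *repeat_bytes(const void *src, size_t len, size_t count) {
        size_t total;
        if (__builtin_mul_overflow(len, count, &total))   /* the easily-missed check */
            return NULL;
        unsigned char *buf = malloc(total ? total : 1);
        if (!buf) return NULL;
        for (size_t i = 0; i < count; i++)
            memcpy(buf + i * len, src, len);              /* safe: total didn't wrap */
        return buf;
    }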

It's also worth noting that the vulnerability could have been found using a fuzzer[2] that did not exist (for Rust) at the time but does now, and that integer overflows are checked on debug builds (but not on release builds by default, again for performance reasons), though in this particular case it would have been extremely unlikely that any legitimate code would have triggered the overflow.

The takeaway here isn't that 'even 'safe' languages get it wrong', but that unsafe operations are difficult to do correctly, and that's true in either C or Rust. There's further discussion[3] on this bug if anybody is interested.

[0] https://github.com/rust-lang/rust/pull/54397

[1] https://github.com/rust-lang/rust/pull/48657

[2] https://medium.com/@shnatsel/how-ive-found-vulnerability-in-...

[3] https://internals.rust-lang.org/t/pre-rfc-fixed-capacity-vie...


That's dynamic allocation on the stack, though. As the parent said, your microwave probably (maybe) doesn't do that.

Move the declaration out of main and declare it statically. And, of course, don't copy large buffers into small ones.
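Something like this, presumably (the buffer still has to be large enough, of course):

    #include <stdio.h>
    #include <string.h>

    static char hello[6];   /* file scope, statically allocated, room for the '\0' */

    int main() {
        strcpy(hello, "hello");
        printf("%s\n", hello);
    }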


Unfortunately, that does not completely prevent memory corruption bugs in C. But it's a step in the right direction. For example BearSSL is a C TLS library that does not use malloc.


"Another type of problem that can appear is memory leakage... This is a memory-related problem, but one that can’t be addressed by programming languages."

https://www.fos.kuis.kyoto-u.ac.jp/~tanki/papers/memoryleak....

Well, that took about a minute on DuckDuckGo. :) Key words were "memory leaks" and "type system." For those wanting to find CompSci work, adding "type system" in quotes to a property is a reliable way to find language research on that property. The word "language" can help too, but the wording in PDFs varies more on that.


Not bad but a little bit of a publicity stunt.

A major source of vulnerabilities is (still) the Javascript engine and that's (still) written in C++.

Even worse, as far as I know, Mozilla has no plans to rewrite even parts of Spidermonkey in Rust.

For some recent examples:

https://usn.ubuntu.com/3688-1/

https://usn.ubuntu.com/3749-1/


SpiderMonkey dev here. As others have mentioned, Cranelift is one component that's being written in Rust. Eventually we want to use Cranelift as compiler backend not just for WebAssembly but also for JS. After that it might be feasible to port more of the JS JIT to Rust. It might also make sense to use Rust for the parser or regular expression engine, who knows.

There will probably always be C++ code in Gecko, but I firmly believe that writing more components in Rust instead of C++ will (in general) improve security and developer productivity.

It still amazes me that we're actually shipping a browser with a CSS engine (and soon graphics engine!) written in Rust. Even more amazing is that these components are mostly shared with an entirely different browser engine.


A JS engine is a high-risk, high-reward problem for Rust. High-reward because JS engines are, to your point, a major source of vulnerabilities; high-risk because JS-engine theory is rather outside of Rust's wheelhouse.

One class of vulnerabilities in JS engines is use-after-move. A raw pointer is extracted, an allocating function is called (triggering a GC), then the raw pointer is used, pointing into nowhere. It's awkward to express in Rust that a function may modify state inaccessible from its parameters.
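The rough C analogue, for illustration (a sketch, not SpiderMonkey code): take a raw pointer into a buffer, call something that may reallocate it (the way an allocation may trigger a moving GC), then use the now-stale pointer.

    #include <stdlib.h>

    /* Sketch of the "use-after-move" pattern: elems may be reallocated
       (think: a GC that moves objects), invalidating any raw pointer
       previously taken into it. Error handling elided. */
    struct vec { int *elems; size_t len, cap; };

    void push(struct vec *v, int x) {
        if (v->len == v->cap) {
            v->cap = v->cap ? v->cap * 2 : 8;
            v->elems = realloc(v->elems, v->cap * sizeof *v->elems); /* may move */
        }
        v->elems[v->len++] = x;
    }

    void broken(struct vec *v) {          /* assume v already has elements */
        int *first = &v->elems[0];        /* raw pointer into the buffer   */
        push(v, 42);                      /* may reallocate: first dangles */
        *first = 1;                       /* "use-after-move": UB          */
    }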

A second class of vulnerabilities is type-confusion. A value is resolved to (a pointer to) some concrete type, but some later code mutates the value. Now the concrete type is wrong. Again this possibility is awkward to express in Rust.

The problem is complicated by the NaN-boxing and JIT aspects of JS engines, which interfere with Rust's tree-ownership dreams.

People smarter and way better at Rust than myself are working on it; I'm excited by the prospect of novel solutions that can defeat entire classes of problems.


I'm curious what proportion of vulnerabilities in JS engines are due to mis-generated JIT code vs direct errors in their compiled code. Rust allows you to express some nice properties not always directly related to memory safety (e.g. checked consumption, convenient and safe ADTs), but unless there is a novel application of these facilities to the structure of a JIT engine it won't help a ton with the former kind of vulnerabilities.

I'm excited to see a practical programming language that implements full dependent typing; languages like Idris are actually really good at dealing with precisely the kinds of situations you mention.


JS engines have many parts implemented natively, which may be called from JS, and in turn call back into JS. An example is CVE-2015-6764: this grabs an array length, which quickly becomes stale, because accessing one of the array's elements invokes a custom toJSON which in turn modifies the array's length.

This feels like a hopeless problem; can any of Rust's powers be brought to bear here? Could Idris?


F* is probably the best equipped at the moment to deal with situations like that CVE, since its library has a concept of heaps. Basically, any function that can access or modify the "heap" (which in F* is just a set of pointers that are guaranteed to point to a value and not alias any others outside of the same heap) must specify what properties of the state of the heap must be true at entry, and what properties are true afterwards. So in pseudo-types, the functions for accessing a JavaScript array would be something along the lines of

    fn arrayLength(x: JSArray*) -> n: uint (requires nothing) (ensures length of x = n, changes nothing)
    fn callToJson(x: JSValue*) -> JSValue* (requires nothing) (ensures nothing)
    fn arrayAccess(x: JSArray*, m: uint) -> JSValue* (requires length of x > m) (changes nothing)
(NB: F* syntax doesn't look much like this, but I'm guessing this will be readable to more people on HN)

The clauses in parentheses after each function type are the preconditions and postconditions, respectively. So if you do something like:

    let x = arrayLength(someArray)
    for i in range(x) {
      let element = arrayAccess(someArray, i)
    }
It will typecheck just fine. But if you add the call to toJSON:

    let x = arrayLength(someArray)
    for i in range(x) {
      let element = arrayAccess(someArray, i)
      let transformed = callToJson(element)
      // ERROR: (requires length of x > m) not satisfied for all runs of loop body
    }
Since callToJson cannot ensure any property of the heap after it runs. In this way you can elide range checks when needed for performance without worrying that you've sacrificed safety.

Covering all the cases a JS engine would need without adding 10 million lines of proofs to the size of SpiderMonkey is still an open problem, but this general approach (known as Hoare Logic[1]) is very enticing, and the type systems that languages like Idris and F* have are definitely the closest to realizing it in more places. There are real software engineering efforts using descendants of Hoare logic like TLA+ (notably Amazon IIRC), but it's rare to see it even in huge projects like browsers.

It's also critical to note that the heap concept of F* is not a totally fixed part of the language; most of the specification of how heaps work are actually in the standard library. That level of flexibility is what I think makes these languages likely to become capable of tackling these problems: something like a JS engine or any optimizing compiler is exactly the kind of place where being able to come up with your own type-level verification model is worth the effort.

[1]: https://en.wikipedia.org/wiki/Hoare_logic


> Mozilla has no plans to rewrite even parts of Spidermonkey in Rust.

Here's a substantial part being rewritten in rust by Mozilla: https://github.com/CraneStation/cranelift/blob/master/spider...


I am not sure I would call cranelift "substantial" in terms of exposure/usage. From what I gather, it's not used at all for normal, everyday Javascript.

I stand corrected though, every little bit helps. Here's hoping they'll start using Rust in more places where it counts.


I suppose "substantial" is subjective, but I really do thing it counts. Certainly their are unfortunately frequent vulnerabilities in the code it intends to replace. For example:

https://bugzilla.mozilla.org/show_bug.cgi?id=1493900

https://bugzilla.mozilla.org/show_bug.cgi?id=1493903

To be fair, I'm not actually sure Rust would fix either of the CVEs I linked, both being about problems in the generated code (as I understand them from a glance), which is something inherently unsafe to do.

Edit: I realized you might be picking out the word "ARM" on that page. I know Cranelift also works on x86, and I assume it's intended to replace IonMonkey everywhere, not just on ARM chips.


Rust isn't just for Firefox. I'm unsure if Safe Rust works with JIT compilation. Unless some new method comes around, JIT is king when it comes to JavaScript.


> I'm unsure if Safe Rust works with JIT compilation.

I mean, if you just want to compile the code, sure.

Executing arbitrary machine code not generated by the Rust compiler (i.e. by the JIT compiler you wrote in Rust) is basically the definition of unsafe, though...
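For anyone wondering what that boils down to, a minimal sketch in C (assuming x86-64 and a POSIX mmap; error handling elided): you copy bytes into an executable page and jump to them, and no compiler, Rust's included, can prove anything about what those bytes do.

    #include <string.h>
    #include <sys/mman.h>

    int main(void) {
        /* x86-64 machine code for: mov eax, 42 ; ret */
        unsigned char code[] = { 0xB8, 0x2A, 0x00, 0x00, 0x00, 0xC3 };

        /* Get a writable page, copy the code in, then make it executable. */
        void *page = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        memcpy(page, code, sizeof code);
        mprotect(page, 4096, PROT_READ | PROT_EXEC);

        int (*fn)(void) = (int (*)(void))page;   /* nothing can check this cast */
        return fn();                             /* exits with status 42 */
    }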


"Some languages (like C) require programmers to manually manage memory by specifying when to allocate resources, how much to allocate, and when to free the resources."

This is not required of programmers in C, because the programmer could choose to delegate memory management to a memory management library, such as the Boehm-Demers-Weiser conservative garbage collector. [1]

[1] - http://www.hboehm.info/gc/
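For reference, a minimal sketch of what that delegation looks like with the Boehm collector (a hypothetical example; build details vary by install): you allocate with GC_MALLOC and simply never call free.

    /* Build with something like: cc example.c -lgc */
    #include <gc.h>
    #include <stdio.h>

    struct node { int value; struct node *next; };

    int main(void) {
        GC_INIT();                       /* recommended before the first allocation */
        struct node *head = NULL;
        for (int i = 0; i < 1000000; i++) {
            struct node *n = GC_MALLOC(sizeof *n);  /* no matching free anywhere */
            n->value = i;
            n->next = NULL;
            head = n;     /* the previous node becomes unreachable; GC reclaims it */
        }
        printf("%d\n", head->value);
        return 0;
    }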


You're still required to manage memory allocation.

Even having to explicitly delegate memory management to a garbage collector is arguably manual memory management compared to other languages (It's a choice you have to make and adhere to, not a decision that's already been made for you).


Conservative garbage collection like Boehm inevitably leads to memory leaks in long running applications. It's awesome for some use cases but isn't a complete solution by any means.


Wrong. Only if the system failed to identify a root, which is a grave developer mistake, but in the general case boehm-gc does not generate any leaks.

Unlike manual memory management, which inevitably leads to memory leaks.


Failing to identify a root would cause the GC to free too much memory. Memory leaks happen when too little memory is freed. Boehm GC is called conservative because it can't distinguish pointers from other memory content, so it will determine some allocations to be reachable even if there's no pointer to them, because some random integer looks like a pointer into that allocation.


I see, that's what you meant. You are right. You need to use boehmgc in precise mode, which is not the default. You need to tag your pointers. Also use incremental mode for shorter pause times.


Conservative collection is unfortunately terrible when you're dealing with data that looks random, like cryptography. It generates many false "pointer live" positives, which ends up retaining a lot more memory which looks like it's leaking.


True if you don't cooperate at all with the collector, but easily avoided in many cases. E.g., when using Boehm, try using GC_malloc_atomic() for data that you know cannot contain pointers, like your encryption buffers.
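Concretely, something like this (a sketch using the Boehm API; GC_MALLOC_ATOMIC is the macro form of GC_malloc_atomic): data allocated as "atomic" is never scanned, so random-looking bytes in it can't be mistaken for live pointers.

    #include <gc.h>
    #include <stddef.h>

    /* Buffers holding ciphertext or keys contain no pointers, so allocate
       them as "atomic": the collector will not scan their contents. */
    unsigned char *alloc_crypto_buffer(size_t n) {
        unsigned char *buf = GC_MALLOC_ATOMIC(n);   /* not scanned for pointers */
        /* ... fill with high-entropy data without creating false retentions */
        return buf;
    }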


Neat. Doesn't really change anything, except that you aren't required to manually manage memory but, practically speaking, you will end up manually managing memory. Especially if you're choosing C for performance requirements.


Not necessarily.

My first experience with the boehm-gc was a long time ago, when I was using a very performance-intensive AI library. As an experiment, I modified it to use the boehm-gc and, surprisingly, it actually became faster.

I've since learned that such a speed improvement when manual memory management is replaced with the boehm-gc is not uncommon.


'Performance' isn't just raw averaged throughput, but often includes worst-case latency. There's a lot of perf-sensitive applications where a GC isn't a great fit.


This is true, but a lot of those latency-sensitive applications have hard latency requirements, where you have to do something in X time, but doing it faster than X isn't actually useful.

There are realtime GCs that can meet these hard requirements.


Those hard real-time GCs, every time I've seen them, come with overall throughput compromises. And many times, real-time constraints and overall throughput requirements aren't mutually exclusive.

And even on the soft real-time side, like rendering a modern GUI, GC pauses causing frame skips (you have ~16 ms to render each frame) make your app look janky.


Totally, that could absolutely be the case. But it probably won't be. The reality is that most C programs do use manual memory management, by a very wide margin.


Unfortunately the defaults are insecure, and so are most of the programs written in C.


[flagged]


Current systems programming in unsafe languages has already resulted in many disasters. Rust is too complex to be the language of script kiddies, nor does the article imply it's aimed at them. Perhaps if the devs of old had changed with the times, we would already have had Rust 10 years ago.


I've been writing and maintaining BSPs and RTOSes for a myriad of embedded systems, every day, for the last 15 years. I can't remember the last time I had a bug like the ones mentioned in the article...

I look at Rust with enthusiasm, but... if they are targeting "system" developers, the ones already doing that job, the last thing they (we) worry about is making memory management errors. At some point, memory management becomes an automatic process, you just don't think too much about it.

But on the other side, perhaps the scope of a language like Rust is to bring more people from higher layers into developing engines that require performance. The Rust sales pitch may work on people like a JS programmer but won't click yet with people like me: I just don't want to be baby-sat by a compiler.

I have also noticed, throughout my entire life, that (reasonable) "fear" that C/C++ produces in developers, and the many recipes for the cure. "Script kiddies", as you call them, will never adopt the C/C++ state of mind, let alone write system-related software: it's just too difficult and boring, and they don't want to deal with truly complicated problems.

You can't solve a "lazy programmer" problem with a "language" solution. I think they're targeting the wrong audience.


> [It] becomes an automatic process, you just don't think too much about it. (...) I just don't want to be baby-sat by a compiler.

Interestingly, as a Javascript developer, I see an interesting parallel here with TypeScript. You hear the argument "TypeScript is for people who don't know Javascript" relatively often - which I think is similar to referring to a compiler as "baby sitting". Sure, making sure I use the correct types everywhere and don't rely on implicit type casting is an automatic process for me, but that doesn't mean I'm perfect, nor that it doesn't subconsciously still create a cognitive load - hence why I still greatly appreciate TypeScript, even though it only helps me with things I supposedly already know very well how to do.


> the last thing they (we) worry about is making memory management errors. At some point, memory management becomes an automatic process, you just don't think too much about it.

I think the large number of memory-safety bugs in even popular C libraries indicates that this is not the case.


> At some point, memory management becomes an automatic process, you just don't think too much about it.

The thing with Rust is that this becomes literally true, because the borrowck pass runs automatically with every Rust compile and tells you where it could not prove that you're managing memory correctly. And yet, the single biggest obstacle for C/C++ developers trying to move on to Rust is that they keep "fighting with the borrow checker", with only a very low understanding of what it would take to fix their code so that it can pass the automated checks. This does not inspire much confidence.


Exactly. But in the "system" industry that reason alone is not worth the effort. I already know how to ride a bicycle "by instinct". Why would I use the training wheels again? Just in case?

I really hope libraries like libssl and other foundational (but not considered "system") libraries are rewritten in Rust, but also believe that they should not push lower than that (kernel, peripherals, bare-metal).


I'm not sure I'd compare a type system to training wheels. In general, a type system will be able to tell you that you're doing something wrong, but won't be able to correct it. You're still managing the memory/lifetimes yourself. Something like garbage collection would be more like training wheels, since it does the work for you.


I agree; and there's nothing wrong with not wanting to deal with that level of complexity and hair pulling, I don't blame them.

I don't mind Rust either; it doesn't appeal to me, but then many languages don't. If it helps someone move forward, that's a good thing.

What isn't working very well is having Rust stuffed down my throat while being lectured by ignorant assholes who don't even know which side is up. That's going out of fashion fast.


Programs have bugs, period.

Some of us have been around long enough to see the pattern, go back 15 years and you'll find me preaching the same gospel in the name of C++.

One thing that would help is to stop dumbing down programmers and languages, and start valuing experience and powerful tools again. The full stack developer role is a joke, and not a very funny one.


One can have valid criticisms of Java, the language. But there is no disagreement that the JVM is very good in terms of performance.


The problem is that by the time it's warmed up enough to actually run as fast as advertised, the world has already moved on. And when it still doesn't, figuring out why means digging through a giant pile of complexity that you didn't ask for. Thanks, but no thanks; I wasted enough years on Java.


Unless you happen to use a JVM from IBM, Azul, or many other JVM vendors that support AOT compilation and caching JIT code between executions.

A way to save years on Java is to use the right implementations.


When your life depends on something, nerves will kick in and prevent you from doing well. Ok? I can't find the article on Google now, but the experiment was done in India, where workers would be given the equivalent of one year of salary if they could perform a simple task. Before being told the stakes were so high, many could complete it, but after being told about the potential reward, almost none could.



I know. Just pointing out that it's not scientific. People use that to mean it's important, and hence will bring out your best. But the opposite is true scientifically.


> Most of them couldn't write a page of decent C if their life depended on it.

Do you have any evidence of this claim?


[flagged]


Quite true, they just contribute to new entries on the CVE database.


I suggest learning C properly before you make more of a fool out of yourself. It's more constructive, too; because it actually improves your situation. Right now you're just digging a bigger hole.


Should we do a quiz about which one of us knows C best?

Rest assured, except for the C11 changes to the standard, I do know C pretty well.

I majored in systems programming with a focus on distributed computing, graphics and compiler design, so I got to use it a lot, and even teach it to first-year students.

My first job was writing secure server code across AIX, HP-UX, Solaris, and Windows NT/2000.

So what about that quiz?


I don't do quizzes, too much code to write to waste time on alpha bullshit. It's not up to me to prove anything either; plenty of people prefer C, plenty of them more experienced than you. Calling them all stupid based on your quiz skills doesn't look very intelligent from here.


That's not evidence, that's restating the claim with more words. (And different ones - your claim was about decent C. There are plenty of experienced C programmers who have been writing indecent C for decades.) Do you have any evidence of the claim?


Evidence? I have my own authentic experience; if that's not good enough for you, it's really your problem.


Unfortunately most people can't write decent C. Actually nobody understands what decent C is.


True, and False.

You're projecting your own experience.

I have a pretty good idea what decent C looks like, as does anyone else who spent 30 years using it.


In spite of all those quality gates before a C patch lands in the Linux kernel, the CVE reports just keep increasing.

Source: the Linux Kernel Summit 2018 and the Google sessions on kernel security.


Programming is difficult, writing kernels even more so; hence there will be bugs. It's not a language issue.


Google and a large majority of Linux kernel developers think otherwise, hence the Kernel Self Protection Project.

According to Google, 68% of 2018 CVEs were caused by C's lack of bounds checking.

Google is also collaborating with ARM on their memory tagging extensions to tame C.

Like everyone else, you can go watch the Linux Kernel Summit 2018 talks.

Oracle also thinks otherwise, hence Solaris with SPARC ADI memory tagging turned on by default.

The DoD has a report where typical UNIX exploits weren't possible in Multics, thanks to PL/I instead of C.

https://multicians.org/b2.html


it is a language issue


In related news, the C/C++ development community is finally catching up to what Java has been providing for over 20 years.


This is not really a good take. There are reasons why Java was able to do what C/C++ couldn't, and there were also performance implications that kept people from moving away from C/C++.


What is it that the C/C++ development community is catching up to?

(And also, I'm not convinced that programming languages should be treated as having their own isolated developer communities, considering there is often a lot of overlap. Are we talking individual users? Companies? Language designers? Etc.)


Good support for multi-core programming comparable to what java.util.concurrent, language-level threads, and network async IO offer.

C++11 introduced std::thread, with a couple of issues rectified in later revisions, and apparently executors just failed to make C++20, delaying the introduction of a major part of async networking.

Language safety as well.

For me, in spite of the safety improvements in C++, I see the language being tailored for specific niches and no longer a full stack language, similar to how it is handled on modern desktop and mobile OSes.

And C will never catch up in security.


Interesting that your metaphor involves comparisons of speed...


I mean, this post is more about Rust than C and C++.


Java is the worst language ever invented. Go and Erlang both have annihilated it.

C and C++ are definitely being replaced by Rust now.



