I have met Shapiro and he is a very nice and smart guy, but I find it rather shocking that somebody could invest so much time and effort in designing a new language before building a library to do something as fundamental as I/O.
It is interesting to contrast his approach to building BitC with that of Ken Thompson and Rob Pike (both former colleagues of his at Bell Labs) building Go.
Go has been criticized for "not paying attention to PL research", which might be true, but it is already extremely useful, and even before its first 'stable' release people had already used it to build large systems that run in production.
I think what applies to building other kinds of software applies to designing programming languages too: you need something that you can start using to build real systems as soon as possible and then iterate from there based on your experience.
The problem with languages is that changing the language involves throwing away the code you wrote, which discourages one from building real systems with a language that is still changing.
That puts even more value in keeping languages simple. And tools like gofix are also quite nice to help with this.
They have fairly different goals, though. BitC's main goal was to be a verifiable systems language, so it makes some sense that a lot of effort up front was put into investigating the "verifiable" part. Rather than building libraries and trying to get it to catch on, the effort was focused on trying to figure out if it could be designed in a way that simultaneously supported low-level, systems-ish programming while also supporting verification. If the answer on the verifiability part turned out to be no, then that would mean it wouldn't necessarily be worth building the rest (or trying to get people to use it), since that was the point of the project existing in the first place.
Let's define success as new + useful. You can build something new then make it useful, or you can start with useful and add something new.
I suggest that the second approach is better, because we already have a good sense of how to measure useful, so you will always know where you are. Keep your eyes on the prize, as they say.
If the goal was to make a C-like language that's verifiable, though, and you end up with a C-like language that isn't verifiable, in what sense is that useful? The whole usefulness of the project, at least as it was conceived, hinged on it adding some notion of verifiability to C. Otherwise it's just a C clone with a smaller installed base, worse tool support, etc.
There are degrees of verifiability. A C-like language with imperfect but stronger verifiability (for example, with a strong, formalized type system) is still more verifiable (admits the proof of more theorems via static analysis) than C.
>The whole usefulness of the project, at least as it was conceived, hinged on it adding some notion of verifiability to C.
That is exactly what I am saying. It sounds like they trimmed away too much of C in the process and didn't check to make sure those parts could be added back. I haven't really followed the whole saga though.
I think you're missing that this was a research project, and they were trying to answer the fundamental question: can we build a verifiable systems language?
>Go has been criticized for "not paying attention to PL research", which might be true, but it is already extremely useful, and even before its first 'stable' release people had already used it to build large systems that run in production.
Is that a factor of being a better language (for some value of "better"), or of being marketed by famous people working for a famous company?
I really do mean this. Pike et al have Google behind them, and thus not only the Google publicity machine but a whole team of Googlers working actively on making Go work. BitC was a creature of, IIRC, 8 people working on research, and possibly their grad-students if that number doesn't include grad-students already.
The core Go team was smaller than 8 people, at least at launch, although it has lots of peripheral contributors now. But you could look at that as the value of launching. :-)
Java, now that had marketing. As in million-dollar campaigns, ads in trade magazines, tons of articles and PR even in NON-tech outlets about the "next big language", a company betting most of its strategy on the language, and so on. C# too.
Go? Being made by Google guys who are also battle-tested programmers helps, but it's not like the language had any marketing. A website, a mailing list, a blog that isn't updated all that frequently, and a few appearances at some tech conferences. That about sums it up.
As a language, Dart is more heavily touted by Google. And it is made by the famous V8 (and more) guy. But no one is using it.
So, yes, Go is used for its pragmatic merits, not because of marketing.
Go might have had less marketing than Java, but it still had a decent amount of support. There were several high quality promotional videos when it was first released. The web page is really full-featured; it has a ton of documentation, including translations for several languages. And all those appearances at tech conferences take time & money (and I'm sure that Google sponsorship helped to secure some of them).
All that stuff adds up in terms of manpower and budget, and it's out of reach of what a small independent team can do. It's not all about technical merit.
There's active marketing and there's passive marketing (brand association). Go is popular because of its association with the brand of Google and the brand of Pike and Thompson. A lot of people like the language on its merits (although I disagree with pretty much every decision they made), but I believe the reason the language got traction is because of its brand association.
>Go is popular because of its association with the brand of Google and the brand of Pike and Thompson.
A (non-factual) jump to a conclusion.
For one, Go isn't that popular. From the latest TIOBE notes:
"Another interesting observation is that while big software-related companies such as Oracle, Microsoft and Apple all have one of their programming languages in the top 10, Google seems to be incapable to achieve the same.Google's Go language dropped out of the top 50 this month".
Second, Dart is also associated with the brand of Google and the brand of Lars Bak. Yet, it's totally unpopular.
Same goes for Unladen Swallow. Touted by Google, got nowhere, died. (Hell, even things Google spends tons of marketing money on go nowhere: Google Buzz, Google Wave, Google Wallet, ...).
>A lot of people like the language on its merits (although I disagree with pretty much every decision they made), but I believe the reason the language got traction is because of its brand association.
That might get some early interest in the language, but not continued traction (and not that it has much traction, anyway).
People like it because it's fast, the type system gets out of the way, it's not as weird as the lisp-like languages for those used to C syntax, has some cool modern features (from closures, to easy concurrency) and has a big and current standard library.
I also happen to like the language on its merits. I also believe that most of those who disagree don't share the design goals of the language designers and wanted mostly just another derivative functional language or some fads-of-the-day-in-academia thing like Scala.
>I also happen to like the language on its merits. I also believe that most of those who disagree don't share the design goals of the language designers and wanted mostly just another derivative functional language or some fads-of-the-day-in-academia thing like Scala.
Oh, goody, jumping for joy, another Industry Versus Academia holy war!
Look here. I think Go has some very, very nice ideas in it. I also think it's fundamentally broken by a couple of stupid decisions the designers made: no exceptions, no parametric polymorphism. I also think it's fundamentally pointed at the wrong problem domain: systems languages don't have runtime libraries, don't have garbage collection, and don't require an operating system beneath them, so that you can write free-standing systems code in them.
Go is a brilliant, but also broken, competitor against C++, Python, Java, and Erlang (especially Erlang) for the applications-language space, especially highly-concurrent applications (as sometimes handled by Erlang, which I will be happy to see Go murder and replace). Go is not remotely a replacement for C.
And the exact reason for these things is that Go ignored parametric polymorphism (aka generics) and left out exceptions (which are messed up in Java but quite widely useful in real programming!).
Now, as to "derivative functional languages", there mostly aren't that many. Most derivative work in functional languages gets fed back into Haskell, and nobody expects you to use Haskell for doing Real Work. Haskell does, after all, have the motto, "Avoid success at all costs."
And as to "fads-of-the-day-in-academia", Scala is the single best language I've ever used. And I don't even use all of it. I just use the subset of Scala that is Everything Java Should Have Been.
It has lambdas and closures. It has just enough type inference. It has mutability control for slots. It has object orientation and case-classes/pattern-matching and type-classes via implicit parameters. Scala lets me figure out what language paradigm I need to Get Stuff Done. It's the Python of statically-typed languages, and it interoperates with a vast outside ecosystem to boot.
I can't say as much for every fad-of-the-decade-in-industry thing like Java or Go.
This is one of my favorite features of Go. Exceptions are a horrible mess in every language I have seen that has them. They make following control flow basically impossible. They are equivalent to COME FROM.
Go's multiple returns and defer/panic/recover are a much cleaner way to handle errors, not perfect, but way superior to exceptions.
> Go ignored parametric polymorphism (aka generics)
Go didn't ignore parametric polymorphism; Go already has features (interfaces and slices/maps) that provide most of what people use generics for. Go might even get generics some day if a good design for them is found. But the people using Go every day don't miss generics, and that is because the language itself already provides what people need.
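For concreteness, here is a minimal sketch of that style (the Describable interface and the User/Server types are invented for the example): an interface plus the built-in slices and maps covers a lot of the ground people usually reach for generics for.

    package main

    import "fmt"

    // Any type with a Describe method satisfies this interface implicitly;
    // the interface and the types below are purely illustrative.
    type Describable interface {
        Describe() string
    }

    type User struct{ Name string }
    type Server struct{ Host string }

    func (u User) Describe() string   { return "user " + u.Name }
    func (s Server) Describe() string { return "server " + s.Host }

    // PrintAll works for every Describable type without a type parameter.
    func PrintAll(items []Describable) {
        for _, it := range items {
            fmt.Println(it.Describe())
        }
    }

    func main() {
        PrintAll([]Describable{User{"ana"}, Server{"db1"}})

        // The built-in maps and slices are already generic over their
        // key/element types, which covers the common container cases.
        counts := map[string]int{"go": 1}
        counts["bitc"] = 2
        fmt.Println(counts["go"] + counts["bitc"])
    }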
Also note that C lacks both exceptions and generics, and many people still consider it the best language around to build many kinds of systems.
This has been explained to you time and time again. You keep posting this exact same comment, complete with the "COME FROM" analogy. Then it gets explained to you that panic/recover are an exception system. Then you ignore it. It's really tiresome.
Additionally, people get around the lack of generics in C with macros. Go doesn't have those.
If panic/recover is the same as exceptions, why does everyone whine about the lack of exceptions?
The fact is that while in theory panic/recover are mostly equivalent to exceptions, in practice they are very, very different. Panic is only used in truly extreme situations (like running out of memory) or programmer errors, and you can pretty much write code ignoring it.
Most programs will need to call recover() at most once. Compare this with how exceptions are used in most languages, where they are a fundamental aspect of a library's API.
Also, the way defer reliably and cleanly handles cleanup is much more pleasant than exceptions and 'finally'.
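A small sketch of that pattern (the file path and function name are just for illustration): errors are ordinary return values, defer does the cleanup, and a single recover sits at one top-level boundary.

    package main

    import (
        "fmt"
        "os"
    )

    // readHeader shows the pattern: errors are ordinary return values and
    // defer handles cleanup on every exit path, with no finally block.
    func readHeader(path string) ([]byte, error) {
        f, err := os.Open(path)
        if err != nil {
            return nil, fmt.Errorf("open %s: %v", path, err)
        }
        defer f.Close() // runs on normal return and on panic alike

        buf := make([]byte, 16)
        n, err := f.Read(buf)
        if err != nil {
            return nil, fmt.Errorf("read %s: %v", path, err)
        }
        return buf[:n], nil
    }

    func main() {
        // One recover at a top-level boundary is typically all a program
        // needs; everything else just checks returned errors.
        defer func() {
            if r := recover(); r != nil {
                fmt.Fprintln(os.Stderr, "unexpected panic:", r)
            }
        }()

        h, err := readHeader("/etc/hostname")
        if err != nil {
            fmt.Fprintln(os.Stderr, err)
            return
        }
        fmt.Printf("%q\n", h)
    }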
As for C macros, good C programmers have known for a long time that they are best avoided as much as possible.
Because panic/recover are bad exceptions (recover can't specify the types of the exceptions it wants to recover from), and the libraries don't use them. Having to check errors is annoying since most of the time you want to bubble the error up to your caller. This is exactly what exceptions (or the error monad) do for you. In Go you have to do it manually.
Anyway, Go does have exceptions. It just doesn't make much use of them. Seems a shame to not use a useful feature, but oh well.
defer is not reliable if you forget to call it. RAII is much more foolproof than defer in this way. You can forget to call defer, but you can't easily forget to call the destructor.
If the answer is "panic", note that exceptions are superior here, since exceptions let you "panic" without having to litter your code with those calls.
If the answer is some form of "log an error message", exceptions are superior, since you can group multiple related calls into one try-catch block and avoid having to check at the site of each and every call.
If the answer is some form of "return a different error code to your caller", exceptions are superior, because you can catch many different types of exceptions in one try-catch block and re-throw a different exception.
>If the answer is "panic", note that exceptions are superior here, since exceptions let you "panic" without having to litter your code with those calls.
If the answer is some form of "log an error message", exceptions are superior, since you can group multiple related calls into one try-catch block and avoid having to check at the site of each and every call.
If the answer is some form of "return a different error code to your caller", exceptions are superior, because you can catch many different types of exceptions in one try-catch block and re-throw a different exception.
How about locally handling the problem --which could be some non-issue-- and continuing without the OVERHEAD of an Exception?
How about not using the verbose and convoluted try-catch-finally idiom for common and not at all exceptional error conditions?
How about THINKING about the error handling, instead of blindly catching "many different types of exceptions in one try-catch block and re-throw a different exception", which we know from Java results in a mess...
How about avoiding the exception stack penalty that happens when you "group multiple related calls into one try-catch block and avoid having to check at the site of each and every call" to merely log errors?
>Go has exceptions. They are called panic/recover.
>This has been explained to you time and time again. You keep posting this exact same comment, complete with the "COME FROM" analogy. Then it gets explained to you that panic/recover are an exception system. Then you ignore it. It's really tiresome.
He might ignore the similarity of panic/recover with Exceptions, but you also ignore:
panic/recover is not used for 99% of exception control in Go. We just check the out-of-band error code and act accordingly.
Panic/Recover is only used for TRULY EXCEPTIONAL situations.
Wish we could say the same for, say, Java's exceptions.
Odd that you say that, because the language that's most similar to Go here is Java, with its checked exceptions. The same reasoning that led to Go's error codes (that you should always handle your errors) led to checked exceptions. The fact that people generally think checked exceptions are a bad idea is a point against Go, not in favor of it.
Checked exceptions are superior to Go's error return codes anyway, since they make it easier to bubble your errors up to your caller.
This is exactly what crossed it off my list. I am interested in a better language than C++ for writing really fast, low-level code but I already have tons of good options for higher-level code.
>I also think it's fundamentally broken by a couple of stupid decisions the designers made: no exceptions, no parametric polymorphism.
Is C "fundamentally broken" too? It doesn't have exceptions or parametric polymorphism either, but people seem to get a lot of work done. Fuck, people even get work done with Javascript, that's it's a bona fide example of a language with broken semantics (Harmony fixes some of those).
The thing is, "fundamentally broken" to me sounds like "it's totally unusable", and that is far from the truth. You might call it "not as convenient as it could be", but not "fundamentally broken". Broken implies "it doesn't work". Well, Go works fine.
Parametric polymorphism would be nice -- at least they are already thinking of adding generics in some v2.0 edition. But it's not a showstopper either, as in "fundamentally broken".
>I also think it's fundamentally pointed at the wrong problem domain: systems languages don't have runtime libraries, don't have garbage collection and don't require operating systems beneath them so that they can write free-standing systems code.
Well, by the "systems" moniker, they don't mean OSs and drivers, they mean what other people call large scale application systems. What you'd might use Erlang for, for example. Server stuff with high concurrency and big memory/cpu etc needs, but without all the C pain. From stuff like, say, Solr, to Hadoop, to Cassandra, etc, ...
>I can't say as much for every fad-of-the-decade-in-industry thing like Java or Go.
I wouldn't call Go "fad of the decade". For one it's not a fad, as it's not even popular.
>And as to "fads-of-the-day-in-academia", Scala is the single best language I've ever used. And I don't even use all of it. I just use the subset of Scala that is Everything Java Should Have Been.
Well, I like Scala too. But it lacks some things that Go has, things that help me in a few cases: no static binaries. No control over memory alignment. No dead easy FFI to C. No dead simple syntax I can understand in a weekend.
C was designed and released in, IIRC, 1972. If someone designed and released C in the year 2012 as a new, state-of-the-art systems programming language, I would tear their guts out for releasing something so fundamentally broken, unsafe, obsolete and simplistic!
>people seem to get a lot of work done.
Quite to the contrary, the entire shift to virtual machines and scripting languages happened because people didn't like using C and C++ for application domains where their advantages don't count.
In fairness I think it's a bit early to say that Dart is totally unpopular. The target market for it isn't tech bloggers and proggit. The people who will potentially use it are those who are primarily driven by the need to get some work done, as opposed to evaluating new, evolving languages. They won't use it until all the tools and infrastructure are in place.
Guilty as charged. I do want a functional language like Scala, because I believe that school represents a better design. That school of language design saves me the headache of null pointers; it helps me write unit tests with QuickCheck; it allows me to define custom generic collections; it places emphasis on having a fast GC; it features monadic style to help with error handling.
It's not a reasonable comparison. Go lacks any features that haven't already been thoroughly explored in other languages. BitC was an attempt to do something that had never really been done before.
If you read the BitC documents, Shapiro mentions that one of the goals of the project was to avoid innovation.
>The BitC goal isn't to invent a new language or any new language concepts. It is to integrate existing concepts with advances in prover technology, and reify them in a language that allows us to build stateful low-level systems codes that we can reason about in varying measure using automated tools. The feeling seems to be that everything we are doing is straightforward (read: uninteresting). Would that it were so.
Avoiding innovation in a particular narrow area, while combining others' new ideas from other areas in a novel way, is innovation. In fact, it's very rare to see any other form of innovation.
BitC was started when Shapiro was at a research university. No research institution would fund Go, because Go is not a research language. BitC was, however, designed as a research language.
For that to be true, the Go developers would have to be open to adding research ideas to the language. Java is (for example, generics). Right now Go's developers are not open to this. As a result, Go is not publishable.
I have had an eye on the BitC project for a long time; the idea seemed very compelling. When Jonathan Shapiro quit Microsoft and resumed @ Coyotos, I was really hopeful that the pace of development would kick up a notch or two and we would have a usable BitC shortly.
Now it is clear that is not going to happen, but if in the process we get a better language, that is perhaps good (though honestly I can't hide the disappointment that it is going to be a longer wait). For those who are interested in similar things, follow the development of decac.
ATS is a bit similar to, and can be used like:
- C, which offers you very fine control over memory allocation as well as the management of other kinds of resources.
- ML, except you have to annotate types for at least the function arguments and return types.
- Dependent ML, using dependent types to help you reason about data structure integrity, from something as simple as requiring two lists to be zipped to have the same length, to preserving the tree-size invariant when rotating AVL trees to the left or to the right in order to guarantee the well-balanced property of AVL trees.
...and it is mighty fast.
An interesting email. It's a shame, and it seems like we have a long way to go (still) in order to have a normal-ish programming language that supports formal verification, but Shapiro brings up some really insightful points about language design and why things are the way they are.
Sidenote: This article isn't about BitCoin, it's about a programming language called BitC that's focused on making low-level programming simpler and formally verifiable. (I personally thought it was about BitCoin until I saw coyotos.org next to the title.)
The following passage is interesting because it points away from the most widely held belief about concurrency and FP. I'd like to know more about what he means.
>The other argument for a pure subset language has to do with advancing concurrency, but as I really started to dig in to concurrency support in BitC, I came increasingly to the view that this approach to concurrency isn't a good match for the type of concurrent problems that people are actually trying to solve, and that the needs and uses for non-mutable state in practice are a lot more nuanced than the pure programming approach can address. Pure subprograms clearly play an important role, but they aren't enough.
I imagine he means that pure programming can only address concurrency in some of the many areas of programming where we may wish to leverage some kind of concurrency. I personally see immutable state as an important fundamental building block to better concurrency support in languages, but certainly not the only building block.
One example of something which I would like to see is a linear type system for pointers to mutable data so that you can reference mutable shared state, but only one task/thread/function may access it at any one time and assigning or copying the pointer causes ownership to be transferred. I think this would be useful in a pipeline or dataflow system where the pipeline stages operate on a blob of memory directly, but it's guaranteed that only one stage can ever access it at any one time. I guess it's more or less an optimization on message-passing immutable memory from stage to stage, so not exactly necessary, but still another building block I would like to see.
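Roughly the pipeline shape I have in mind, sketched in Go for concreteness; Go cannot enforce the uniqueness statically, so the ownership hand-off at the channel send here is only a convention:

    package main

    import "fmt"

    // Two pipeline stages mutate a block in place and then hand it to the
    // next stage over a channel. By convention a stage never touches the
    // block after sending it; nothing in the type system enforces that
    // single-owner rule.
    func fill(out chan<- []byte) {
        buf := make([]byte, 4)
        for i := range buf {
            buf[i] = byte(i)
        }
        out <- buf // ownership handed off here, by convention only
        close(out)
    }

    func double(in <-chan []byte, out chan<- []byte) {
        for buf := range in {
            for i := range buf {
                buf[i] *= 2 // in-place mutation, safe while we are the sole user
            }
            out <- buf
        }
        close(out)
    }

    func main() {
        a := make(chan []byte)
        b := make(chan []byte)
        go fill(a)
        go double(a, b)
        for buf := range b {
            fmt.Println(buf) // [0 2 4 6]
        }
    }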
I'd also like to hear what Shapiro has in mind though.
>so that you can reference mutable shared state, but only one task/thread/function may access it at any one time and assigning or copying the pointer causes ownership to be transferred
That sounds kind of like Rust's "unique pointers".
Yes, I think this is exactly Rust's unique pointers. Rust has a lot of interesting language features and is one of the up-and-coming languages I'm watching with interest.
Almost, but not quite. I'll quote a few key phrases from [1] to distinguish between what I want and transients, though I imagine in real life transients probably capture most use cases fairly well.
>The second feature of transients is that creating one does not modify the source, and the source cannot be modified via use of the transient. Your source data is immutable and persistent as always.
>Note in particular that transients are not designed to be bashed in-place.
The unique linear pointer would allow direct in-place mutation of the source. As it is guaranteed to be unique, no synchronization is needed. Transients do not modify in-place and do not allow you to mutate the source type directly, but allow you to add to the source and the added data is mutated directly for performance. There is a lot of overlap in use cases and transients are an excellent middle ground for Clojure, since it allows you to code against immutable types but with the performance benefits of partial mutation. The pointers I described would be less flexible in that you cannot have references to an immutable portion as you can in Clojure, but at the same time the owning code has completely mutable access.
In general, I love Clojure's concurrency model, and I don't think unique pointers would really mesh well with the rest of Clojure.
It's interesting to compare the issues that Shapiro found with the issues that we found during the development of Rust. It turns out that we ran into many of the same issues. Since Go is being mentioned here too, I'll compare the way Go dealt with the issues as well.
(0) Objects and inheritance: Rust had support for objects (via prototype-based inheritance) in the first version of the compiler, but we found that they weren't being used. We attempted to be as minimalist and simple as possible regarding objects, and as a result we ended up with a system that didn't have enough features to be useful. It also didn't fit well with the rest of the language, so we scrapped it and added typeclasses instead. Those worked a lot better, and now most of our standard library is using typeclasses for OO-like patterns. Recently, we've found that we really do want objects, but mostly as a way to achieve code reuse, privacy, and a more direct association between a type and its method than typeclasses alone provide. The current system that is being implemented is a nice way, in my opinion, to unify typeclasses and object-oriented programming. There are just a few concepts to learn, and it all meshes together quite nicely.
Go's interface system is quite similar to Rust's typeclass system. The main things that Rust has that Go doesn't are first-class constructors, the trait system (not yet implemented) to facilitate method reuse, and the ability to define multiple implementations of an interface per type. The things that Go has that Rust doesn't are duck typing (which is a quite good idea, and I'd like to add it to Rust as well) and downcasting (which we don't want to support because we have more type-safe mechanisms for the same thing).
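As a small illustration of those two Go features (the Sizer interface is made up for the example): bytes.Buffer satisfies it purely structurally, and a checked type assertion is Go's form of downcasting.

    package main

    import (
        "bytes"
        "fmt"
    )

    // bytes.Buffer satisfies this illustrative interface structurally: it
    // was written with no knowledge of it, but its method set matches.
    type Sizer interface {
        Len() int
    }

    // describe uses a checked type assertion, Go's form of downcasting.
    func describe(v interface{}) {
        if s, ok := v.(Sizer); ok {
            fmt.Println("has a length:", s.Len())
        } else {
            fmt.Println("no length:", v)
        }
    }

    func main() {
        describe(bytes.NewBufferString("hello")) // *bytes.Buffer has Len() int
        describe(42)
    }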
(1) The compilation model: Rust uses dynamic linking pervasively, because OS's have good support for it and it helps keep binaries small. It also has strong support for separate compilation, because we want to make compile times fast. So far, so good, but, just like BitC did, we discovered that type abstraction (which you use generics in Rust to achieve) doesn't mix well with separate compilation. We didn't want to have a uniform value representation like the MLs do (supporting only 31-bit ints and boxing everything else doesn't fly in a systems language), so we tried to use dynamic size calculations for all of the values. It resulted in a huge amount of complexity (we never shook out all the bugs), and it also had a large runtime performance penalty. Unlike C#, we couldn't fall back on a JIT, because Rust is an ahead-of-time-compiled language. So we moved to a "monomorphization" scheme for Rust 0.2, which is basically like C++ template instantiation, only without the overhead of reparsing all the code from scratch. Even with this scheme, you only pay for monomorphization when you use generics, you can still dynamically link all non-generic code (which is most of it), and your runtime performance is unaffected by your use of generics.
Go, of course, doesn't have generics. I don't personally believe that buys them much though; the programmer ends up working around it in a way that either involves boxing (via interface{}) or by-hand monomorphization (by duplicating the code for each type). To me, generics are just a way for the compiler to do work the programmer would end up having to do anyway.
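For example, the boxing workaround looks roughly like this (Stack is a made-up type); the assertion at each use site is the check the compiler would otherwise do for you with generics:

    package main

    import "fmt"

    // Without generics, a reusable container boxes its elements as
    // interface{} and callers assert back to the concrete type at each
    // use site.
    type Stack struct{ items []interface{} }

    func (s *Stack) Push(v interface{}) { s.items = append(s.items, v) }

    func (s *Stack) Pop() interface{} {
        n := len(s.items) - 1
        v := s.items[n]
        s.items = s.items[:n]
        return v
    }

    func main() {
        var s Stack
        s.Push(41)
        s.Push(1)
        fmt.Println(s.Pop().(int) + s.Pop().(int)) // the checks generics would do
    }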
(2) Insufficiency of the type system regarding reference and by-reference types. It's spooky to read this, because it's precisely the problem we ran into with Rust. At the moment we have by-value and by-reference modes for parameters, and we've found that this isn't sufficiently expressive. (We also tried making the only difference between by-value and by-immutable-reference internal to the compiler, which didn't work well due to higher-order functions and interoperability with C code.) We also found that parameters really aren't the only place you want by-reference types; you really want to be able to return references and place them within other data structures. Whenever we wanted to do this, we had to fall back onto heap allocation, and that was significantly hurting our performance, especially when unique types were involved (since aliasing a unique type is impossible, you have to copy it). Profiling Rust programs showed an alarming amount of time spent in malloc and free. So we're in the process of bringing up a new regions system that I'm excited about: it's too early to say for sure, but I think we've stumbled upon a way to make regions not require a doctorate in type theory to understand. Regions allow you to have safe pointers into the stack and into the heap and pass them around as first-class values.
Go doesn't have zero-cost reference types at all; it just does simple escape analysis to allocate structures on the stack when it can and falls back on tracing GC for the rest (note that this is what Java does nowadays too). This is one of the most significant differences between Go and Rust; Go's memory model is essentially identical to that of Java, plus the ability to allocate structures inside other structures, while Rust has a much more C++-like memory model (but safe, unlike C++). This decision is based on our experience with Firefox; fine-grained control over memory use is so important that we didn't want to place our bets on pervasive use of GC.
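To illustrate the "structures inside other structures" point with a made-up example:

    package main

    import (
        "fmt"
        "unsafe"
    )

    // Unlike in Java, a struct-typed field is stored inline, not as a
    // reference to a separately heap-allocated object.
    type Point struct{ X, Y int32 }

    type Rect struct {
        Min, Max Point // two Points embedded by value: one contiguous block
    }

    func main() {
        r := Rect{Min: Point{0, 0}, Max: Point{3, 4}}
        fmt.Println(unsafe.Sizeof(r)) // 16: four int32 fields, no pointers
        fmt.Println(r.Max.Y)          // plain field access, no indirection
    }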
(3) Inheritance and encapsulation: Rust still has no concept of inheritance; it's our hope that a combination of enums (datatypes like in Haskell or case classes from Scala) and traits will allow us to avoid introducing the complexity of inheritance into the language. Time will tell, of course. As for encapsulation, we thought we didn't need it, but it turns out that we really did want private fields. This we're solving with the class system, mentioned above.
Go achieves inheritance through anonymous fields. Anonymous fields are multiple inheritance in all but name, complete with the "dreaded diamond" problems of C++. We were hoping to avoid that. Go has standard privacy through "unexported fields".
(4) Instance coherence. Since you can define multiple typeclass implementations for a given type, and the caller chooses which implementation to use, you have to make sure in many contexts (for example, the hash method of a hash table) that the same implementation gets used for the same data. That's one of the reasons we introduced classes -- they tie implementations of methods to a type.
Go doesn't have this problem, because it only permits one implementation of an interface for each type, and it has to be defined in the same module as the type. Basically, Go's interfaces are much like our classes in that regard. We wanted to allow people to add extra methods to types -- for example, to add extension methods on vectors (think what Rails does to the Ruby standard library, but in a way that doesn't involve monkey patching) -- so we didn't want to force this restriction on users.
I think that one of the most important things to underscore is that we would have never found these things so early unless we had written the Rust compiler in Rust itself. It forces us to use the language constantly, and we quickly find pain points. I highly encourage all languages to do the same; it's a great way to find and shake out design issues early.
>Go achieves inheritance through struct nesting. I find this a bit clever for my taste, and it only admits single inheritance, but it's certainly simple.
Is that really so? I thought that Go could nest multiple structs at once, and would complain if there were method name conflicts, but allow these to be resolved by defining a new method (based on the "outermost method wins" resolution rule) that could decide how to explicitly call included methods.
> Go has standard privacy through "unexported fields"
And via the "all methods must be defined with the struct" rule which prevents out of control access to internals.
Hmm, hairy how? You get forced to resolve clashes by defining a new top level method, and you can always call the included anonymous methods directly.
Seems to me, "in type T, keep included type A's implementation of X and rename B's implementation as Y" is identical in effect to "define T.X as calling self.A.X() and define T.Y as calling self.B.X()"
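Spelled out in Go (A, B and T are invented for the example):

    package main

    import "fmt"

    type A struct{}
    type B struct{}

    func (A) X() string { return "A.X" }
    func (B) X() string { return "B.X" }

    // T embeds both A and B as anonymous fields. The two promoted X methods
    // clash, so T defines its own X that delegates explicitly; the embedded
    // versions stay reachable through the field names.
    type T struct {
        A
        B
    }

    func (t T) X() string { return t.A.X() } // keep A's implementation as X
    func (t T) Y() string { return t.B.X() } // expose B's implementation as Y

    func main() {
        var t T
        fmt.Println(t.X(), t.Y())
    }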
I suppose they aren't that hairy if you introduce type-specific namespacing like C++ does; i.e. allowing method names to be qualified with the name of the class or trait they belong to to resolve conflicts (A::X and B::X). Renaming seems like a way to avoid having to create an extra delegation function in case you want to delegate to one or the other, though.
It's important to keep in mind that traits aren't the same as inheritance, though. Traits combine methods and fields into one final class instance (they're like mixins), while anonymous fields are more like C++ inheritance: there's actually an instance of each superclass embedded inside each subclass. This starts to matter especially when you have diamond patterns (B and C both derive from A, and D derives from both B and C). When you have an instance of D and are calling methods of A on it, does each method of B's embedded A clash with that of C's embedded A in Go?
Go's anonymous fields are always accessible as a struct member named after their type, basically. It isn't namespacing, it's syntactic sugar that kinda-sorta removes their anonymity. Diamond inheritance in Go means you have physical nesting like [D [C [A]] [B [A]]] so necessarily the A's will clash, but you can override their methods and manually resolve it - yeah, at the cost of a delegating call (which the compiler could probably strip out).
Interfaces subsume the feature of traits to define an operation abstractly against the methods of its operand without defining its representation. Unlike traits though they can't mess with member variables directly (of course methods/interfaces can be defined on things that lack members, such as a renamed int32).
Traits are actually about code reuse, not about polymorphism. Indeed, in the original traits literature (and in Rust), traits are not types at all. You use interfaces in Rust for that, just as you do in Go (they work very similarly).
Think of traits as little pieces you can glue together to make a class.
Pure-code mixins and methods on Go interfaces are sort of duals of one another. Interface methods are parameterized with a type when called ("calling outwards"), mixins are parameterized by a type when compiled, and merged into it ("calling inwards").
Wow that was very interesting to read and shows you guys are pushing boundaries even when the goal is a pragmatic language.
One last big thing that Shapiro mentioned was criticizing purity alone as a useful way of attacking concurrency and losing faith in verification. Can you or someone else knowledgeable comment on this?
>The last reason we left objects out of BitC initially was purity. I wanted to preserve a powerful, pure subset language - again to ease verification.
>The object languages that I knew about at the time were heavily stateful, and I couldn't envision how to do a non-imperative object-oriented language. Actually, I'm still not sure I can see how to do that practically for the kinds of applications that are of interest for BitC. But as our faith in the value of verification declined, my personal willingness to remain restricted by purity for the sake of verification decayed quickly.
>The other argument for a pure subset language has to do with advancing concurrency, but as I really started to dig in to concurrency support in BitC, I came increasingly to the view that this approach to concurrency isn't a good match for the type of concurrent problems that people are actually trying to solve, and that the needs and uses for non-mutable state in practice are a lot more nuanced than the pure programming approach can address. Pure subprograms clearly play an important role, but they aren't enough.
Sure. Regarding concurrency, we're trying to eliminate data races by avoiding shared mutable state. Right now, our unique pointers attack the "shared" part of that -- since data in the exchange heap is localized to a single task, you don't have data races, regardless of mutability. So mutable structures aren't a concern there.
There is the concern that unique pointers are too limited to do a lot of the kind of tasks you want to do in a parallel way, and that's where the experiments we want to do with the "patient parent" model come in -- you can see my colleague Niko's blog here [1] for info on that. That approach allows shared immutable data.
Verification has never been a goal of ours, except for the kinds of invariants you can address relatively simply in a type system. Dependent types are cool, but at the moment they're so complicated as to place the language out of reach of most programmers. We have more interest in making dynamic checks (assertions, design-by-contract) easy to use.
>Go achieves inheritance through anonymous fields. Anonymous fields are multiple inheritance in all but name,
Anonymous fields in Go are syntactic sugar for delegation. Because anonymous fields do not create a "subtype" relationship between types, anonymous fields are not inheritance.
It seems to me that it's pretty impossible to actually replace C (although I fully support the people trying to do so). All the things that "make C fast" are really ways of making the "underlying machine" fast through C, and trying to come up with new language constructs that facilitate these techniques seems like a losing game. Any new languages are going to have a hard time catching up to C when it comes to compiler optimizations and can never really catch up when it comes to ubiquity. (Would your 5-year-old compiler target dsPIC and have support for its IO ports and interrupts? Clearly not.)
So, what about a modern safe language that's meant to be used alongside C? For example, the language presents a modern approach to modules and imports, but a function 'bar' in a module 'foo' is compiled down to a straightforward foo___bar() in the generated C. The generated C has straightforward type declarations and readable code, with a minimum of macro abstractions. The source language emphasizes features that allow one to write concise reasonably performant code (H-M, local inference, sum types, some kind of object polymorphism). It has implicit GC, but mostly relies on linear references and stack allocation, falling back to real GC only for explicit unknown-life allocations (with swappable collectors for different runtime overhead). Do-everything capabilities like unsafe pointer arithmetic are completely left out, as the programmer can drop back to C in a neighboring file or inline. The per-project ratio of this new language to C would vary depending on the type of the project - something like a network server would use a minimum of C, perhaps just in cordoned-off performance critical code.
(I've been mulling on this idea for a few weeks. I ran across Felix the other day, and found myself nodding a lot, but the language seemed quite complicated and the array details in particular left me wondering if memory safety was even a design goal.)
>(Would your 5-year-old compiler target dsPIC and have support for its IO ports and interrupts? Clearly not.)
C doesn't have support for dsPIC and its IO ports and interrupts. People writing real kernels or embedded code have to write little shims in platform-specific assembly code to hook their C code up to the hardware.
>All the things that "make C fast" are really ways of making the "underlying machine" fast through C, and trying to come up with new language constructs that facilitate these techniques seems like a losing game.
Yes and no. C's so-called speed comes from its transparency: everything you write in C can be cleanly mapped down to its constituent assembly code without extra code being inserted by the compiler or runtime library. C has a fundamental principle: you only pay for what you use. Other languages can, but mostly choose not to, emulate this.
>Any new languages are going to have a hard time catching up to C when it comes to compiler optimizations
GCC and LLVM make this not so much of a problem. Most of the major compiler optimizations nowadays are done on intermediate representation code, not C source itself.
>The source language emphasizes features that allow one to write concise reasonably performant code (H-M, local inference, sum types, some kind of object polymorphism).
Local inference is only necessary in situations where Hindley Milner or another HM-based algorithm doesn't work (see: Scala).
There's bunches of languages like this. In addition to BitC, there's also Cyclone, Rust, ATS, and Deca -- all of which are in various stages of development. ATS has a stable release, IIRC, Rust is in alpha, Cyclone has made a release but development stopped, and Deca is in active pre-alpha development.
> In addition to BitC, there's also Cyclone, Rust, ATS, and Deca -- all of which are in various stages of development. ATS has a stable release, IIRC, Rust is in alpha, Cyclone has made a release but development stopped, and Deca is in active pre-alpha development.
There is also Clay (http://claylabs.com/clay/), which keeps a low profile, but there is a lot of work being done on it. What I like best there is that it has no complicated run-time system, which is an obvious plus if you want to supplant C, and that it does whole-program optimization right out of the box, no old-fashioned notions about separate compilation units.
Yes, sorry, I love Clay, but this field has gotten large enough that I don't always remember every contender when I list them out. For example, there's also Habit and Tart.
> People writing real kernels or embedded code have to write little shims in platform-specific assembly code to hook their C code up to the hardware
Platform-specific extensions to C, but not assembly. IIRC declaring a register on the hitech picc18 compiler was a cast of an address to a specific type, and on another compiler was a pragma. But that happens in compiler-supplied include files, so the C programmer ends up with a PORTA macro. Point being, the algorithmic code written in C isn't entirely beholden to a specific implementation and what platforms it chooses to support.
> C has a fundamental principle: you only pay for what you use
Unfortunately, there is a lot you are unable to use even when you are more than willing to pay. I'm basically proposing augmenting C with modern statically analyzable language features, so that one is able to pay for more capabilities.
The problem with the current bunches of languages is their extreme lack of maturity. Rust currently looks the most promising to me, but I can't be sure that at version 0.4 the main contributors won't move on to other things, leaving its popularity to be eclipsed by Deca 1.0. A working application can always be recompiled with a frozen version of the compiler, but if your goal is to write a major library that exports its abstractions as more than a C FFI, language community is a major concern.
My full and proper response to this was sent to the mailing list. Since we appear to have at least one Rust person here and the mailing list may have blocked the email or something, I'm CC'ing here.
Hey, I saw this linked on Hacker News, so I thought I would give it a read. And boy, this is a strange one. It's like watching a painful portion of my life flash before my eyes, and realizing someone else lived it before I did. I can't say how much I and my work owe to you and BitC. Seriously, thank you for all your effort and for having the courage to do a public autopsy of your own language.
I've seen everything you said about the compilation model happen in my own attempts. My only way of slinking around that issue has been to output to a bitcode able to do link-time resolution of opaque/abstracted type representations. This means that I have to allow for some whole-program tracking of abstracted representations. It leaves a bad taste in one's mouth, but it can be... workable. Its downside is that representation sizes have to be specified for abstracted types exported as binary blobs from libraries, because the library client's dynamic linker (to my knowledge) cannot see inside the existential. I need the same whole-program tracking to perform polyinstantiation (although theoretically the typed IR language would admit code to copy it and then resolve an Opaque Type to an instantiation).
Ideally, I think we'd compile a New Systems Language down to a Typed Assembly Language, and the portion of the compiler dealing with polyinstantiation and type abstraction would retain access to generated TAL code. Work on TALs is ongoing.
On the fortunate side, C++ already had an ABI problem with exporting classes from shared libraries. C++ requires that both programmers use C++, and that both use the same C++ ABI. Yick.
I have ended up having to "add back" regions into my own language just this year. Pointers have ended up being just another type constructor, and they do have to be associated with both a region and a mutability. The good news is that once you do associate them, the region can be a universal variable; a function can be polymorphic in the region it touches by taking a pointer as a parameter.
The good news is: inference, polymorphism and subtyping do get along. They get along just fine, if you're willing to toss Top, Bottom and principal typing out the window in the special case where HM unification would end with a polymorphic generalization. Consistently viewing this problem as impossible (either a full semi-unification problem or a failure of the existence of principal types) has, I've noticed, been holding back certain strands of PL thought for a long time.
Once you have subtyping, you can introduce existential types (possibly with a restriction such as "existentials can only be used in the this-pointer place") and recover a fair approximation to objects, interfaces, BitC capsules, etc. Since we're dealing with existentials, there's also a strong resemblance and possibly even a full morphism to first-class modules, thence to type-classes and instances... And then we all get lost in how deep our thoughts are.
The lexical resolution issues of type-classes (especially with respect to their "looking like" object instances) can be dealt with as by Odersky et al in "Type-Classes as Objects and Implicits". That leaves us with the same instance-coherence issues you've already mentioned (was my ordered set built with this partial ordering or that partial ordering?), but no problem of ambient naming or authority. There might be a good hybrid approach in allowing only one truly global type-class instance per type, accessible via a special way of capturing the instance as an implicit parameter/value, but lexically resolve instances when we don't want the One True Instance. Then we could use object identity for testing instance equality to check, when given data built based on multiple instances, that they agree.
Moving on, I've got a question: why did you tackle the problem of inference and mutability together by trying to treat mutability as part of the data type?
Look at it from the C/C++ point of view rather than the ML point of view, and it sort of stops making sense. An integer value can be passed to a parameter expecting an integer, whether or not the procedure will modify the parameter slot in which it received the integer. In an imperative language, if we want to actually modify the parameter, we pass it by reference ("var i: integer" in Pascal) or by passing a pointer ("int i" in C). In ML or Haskell, I'd say it makes sense to classify mutable slots/references to mutable memory cells as their own data-type because everything other than referenced memory cells is immutable. Those mutable cells are not real values: the reference to one is a real value, and the contents of one is a real value, but the memory cell itself is not a first-class citizen of the language.
I sidestepped the issue for my own work by treating mutability as the property of the "slot" (the memory cell itself), which is a second-class citizen of the language, and then building pointer types that track the mutability of the slot to which they point. This is (once again) much easier to do once you already know you can handle subtyping ("submuting", haha, a vastly incomplete start of a short paper). For a systems language, I admit to having gone "the C way" with a little more formalism rather than "the ML/Miranda/Haskell way".
But why go the ML/Haskell way in the first place? Just because doing things that way makes you more publishable in PL? You seem to be implying that, but I'm fairly sure the OO-PL community has their own literature on constancy and mutability ("Ownership and Immutability in Generic Java" being a paper I've seen referenced fairly often) that you could have dragged in to "shield yourself" against the ML/Haskell-style FP-PL community (and its horrifying armies of conference reviewers!).
Anyway, my condolences for the death of your language by starvation of funding. It really sucks to see BitC go when you've learned so much about how to do BitC right, but you definitely did manage to make a ton of important contributions. Though, since I should ask, did you ever talk to or collaborate with the Cyclone folks? Habit folks? ATS folk(s)?
The line between self-promotion and useful contextual information can be a fine one. But in this case, I think mentioning the source of your own experience is OK, since it's directly relevant to the discussion and it establishes the context for your comment.
I dunno about that. This clarifies to me that what I just read was not about "some guy's language", but "that language I saw on LtU, that's on the top of my stack of languages to look at." :-)
TLDR: "BitC is a new systems programming language. It seeks to combine the flexibility, safety, and richness of Standard ML or Haskell with the low-level expressiveness of C."
That doesn't follow at all. Programming language research and crypto research are different fields. There are likely lots of mathematicians who could provide opinions on bitcoin that have never heard of BitC.
BitC is often referenced from object-capability literature, which is one way of decomposing problems to mutually-trustable solutions. It's certainly possible to understand adversarial system design without having come across the ocap model, which is why I said 'most likely'.
Question: What is the fundamental problem that bitcoin solved in a novel way? If your answer is something from cryptography, you don't know its context.
I would say the fundamental problem is some sort of distributed knowledge proof system, and in my taxonomy of learning, that'd definitely be near the cryptology department.
To be honest, I think I'd been lumping bitc in too much with PL research, and hadn't considered that other people may have an interest.