Graal and Truffle could accelerate programming language design (medium.com/octskyward)
478 points by cosbas on July 19, 2016 | 195 comments



There's actually a fairly long history of cross-language VMs, with various degrees of success. What usually happens is that they work fine for languages that look, semantically, basically like the native language on the VM. So LLVM works well as long as your language is mostly like C (C, C++, Objective-C, Rust, Swift). Parrot works if your language is mostly like Perl 6 (Perl, Python, PHP, Ruby). .NET works if your language is mostly like C# (C#, F#, Boo, IronPython). The JVM works if your language is mostly like Java (Java, Clojure, Scala, Kotlin). Even PyPy has frontends for Ruby, PHP, Smalltalk, and Prolog.

"Mostly" in this case means semantically, not syntactically. It's things like concurrency model; floating point behavior; memory layout; FFI; semantics of basic libraries; performance characteristics; and level of dynamism. You can write a Python frontend for both the JVM and CLR, and it will look like Python but act like Java and .NET, respectively, with no guarantee that native CPython libraries will work on it.

The problem is that this is where basically all the interesting language-design research is. I wouldn't use Rust for the syntax; I'd use it because I want properties like complete memory safety, manual control over memory allocation, easy (and fast!) access to C libraries, and short startup time. These are all things that Truffle explicitly does not deliver.

It's a great tool if you work in the JVM ecosystem and want to play around as a language designer. But most of the interesting languages lately have created their own ecosystems, and they succeed by solving problems so fundamental that people will put up with having to learn a new ecosystem to gain the benefits they offer.


To add to the examples, Truffle currently has interpreters for unmanaged languages such as C/Fortran (with Sulong), dynamic languages such as Ruby and JS, statistics/math languages like R, and functional languages like Clojure. The interoperability is done in a way that does not need a common representation or object layout.

> I wouldn't use Rust for the syntax; I'd use it because I want properties like complete memory safety, manual control over memory allocation, easy (and fast!) access to C libraries, and short startup time. These are all things that Truffle explicitly does not deliver.

Truffle itself does not do it, but it is possible to have a Truffle interpreter with these requirements. Memory safety can be enforced at the frontend compiler level, as I believe Rust does. Memory can be allocated directly with Unsafe or by calling libc functions. GNFI [1] allows for fast calls to native libraries. Startup can be improved radically with SubstrateVM, but it is currently closed-source. There are many ways to improve JVM startup, though not all of them are convenient (but then again, you don't need to wait for your program to compile before it starts running either).

[1] http://dl.acm.org/citation.cfm?id=2500832


The big question is "Would you want to use this in a mission-critical project?" Emscripten lets you write webapps in C, for example, but outside of sharing common libraries across server/Android/iOS/web clients, few people use it.

Pretty much every professional job I've had used polyglot programming of some sort - when I was in financial software it was mixed Java/C/Fortran numerics, when I founded my first startup it was polyglot Javascript/ActionScript, and at Google it was a large server that was first Python/C++ and then Java/C++. The boundary between languages is always problematic. And the reason for that is because you choose memory layout to make certain trade-offs around the access patterns for that data. Do you inline data structures in a vector, or chase pointers? Do objects carry their type information with them so you can manipulate them dynamically, or do they throw it away at compile-time for greater efficiency? Do you get O(1) string indexing or native UTF-8? Do you allocate on the stack or the heap? Do you copy, borrow, move, or COW?
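To make the layout trade-off concrete, here's a toy sketch in Java (class and method names are mine, purely illustrative): the same 2-D points stored as an array of objects versus flattened into one primitive array. Both compute the same answer; what differs is exactly the physical representation you'd have to convert at a language boundary.

```java
// Toy illustration of one layout trade-off: array-of-objects
// (pointer-chasing, per-object headers) vs. a flattened primitive
// array (inline, cache-friendly). Same logical data, different
// physical representation -- which is what makes language
// boundaries expensive.
public class Layouts {
    static final class Point {
        final double x, y;
        Point(double x, double y) { this.x = x; this.y = y; }
    }

    // Array-of-objects: each element is a heap reference to chase.
    static double sumXBoxed(Point[] pts) {
        double s = 0;
        for (Point p : pts) s += p.x;
        return s;
    }

    // Struct-of-arrays style: x at even indices, y at odd, no indirection.
    static double sumXFlat(double[] xy) {
        double s = 0;
        for (int i = 0; i < xy.length; i += 2) s += xy[i];
        return s;
    }

    public static void main(String[] args) {
        Point[] boxed = { new Point(1, 2), new Point(3, 4) };
        double[] flat = { 1, 2, 3, 4 };
        System.out.println(sumXBoxed(boxed) == sumXFlat(flat)); // true
    }
}
```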

And then when you go to switch representations to call into another language, you pay a cost to convert all the data structures you might be touching. In many cases, that cost could be more than you saved by using optimized representations in the first place.


That's exactly how Truffle's cross-language interface works, though. Instead of paying the high cost of conversion, data stays in its existing representation, and it is the interface code that is recompiled to fit.

An example is JRuby+Truffle where in its C extensions, pointers to Ruby objects act to the C code as if they are pointers to MRI data structures, but behind the scenes Graal compiles code that accesses them like the JRuby objects that they are.

The C-memory to Java GC transition doesn't work as well, but you can get around that by implementing C memory using the Java GC, like the Managed-C paper does; alternative implementations only have to keep the same behaviour, not the same underlying data representation.


Interesting. What happens if the native memory layout doesn't support a feature that the extension code requires? For example, what if a JRuby+Truffle script calls into a C library, which allocates a struct and invokes a Ruby callback with it, which then wants to access the fields via reflection? C doesn't normally include type information with its structs, so how would the Ruby code know the memory layout of the object?

What if it's not the C extension in your project that does the allocation, but a closed-source third-party library that your C code calls into? Do you need to recompile all source code to generate the appropriate typemaps?


Yes. Sulong assumes that at least the LLVM bitcode including debug information is available. It does not work with third party machine code binaries where no source information is available.


The idea of Truffle language interoperability is to avoid switching the representation at the language boundary. Objects carry their type information with them that includes the semantics on how to access their properties. This also allows Truffle to fake up the existence of an object without actually performing allocations (e.g., a table in raw buffer format can be interpreted as an array of objects). This clear separation of logical and physical layout enables efficient data representation in the context of higher level languages that typically suffer from pointer chasing and object header overheads.


Right, and my point is that this won't work unless you dictate a common memory layout for all Truffle objects. And if you do that, then you lose the ability to make trade-offs about object representation that depend upon access patterns that only the programmer can know. The whole reason we have different programming languages is because not all domains of computation face the same trade-offs.


It works without common memory layout between Truffle languages. It even simplifies the ability to use diverse physical memory layouts within the same language. The programmer specifies the logical layout based on the semantics of a language. The runtime decides how to map this logical layout onto the physical hardware. It can take into account the trade-offs the programmer decided to choose.


Right, and my point is that the reason people continue to use C++ or Rust over JVM languages is because there are some use-cases where the JVM's decision about how to map the logical layout onto physical memory causes unacceptably high memory usage and/or cache misses, or prevents them from taking advantage of clever serialization formats (Cap'n Proto, for example, loses much of its performance benefits on the JVM without the use of sun.misc.unsafe). Will Truffle allow the programmer to override Graal's decisions on this and manually specify memory layout? And if so, how are conversions between different language formats specified?

If you already buy the JVM's premise of "just let the compiler do it, we'll figure out the most efficient representation", then this is a non-issue. But there are still programmers out there who believe that Rust or C++ or Go or CPython or whatever presents a better memory layout for the tasks that they wish to accomplish, or language designers who think they can do better than all of the above, and these are the users that Graal+Truffle is trying to win over. What's the story for them?


The default when running static languages like C++ or Rust or Go via Sulong is to stick to the memory layout chosen by the programmer. There is no conversion to typical JVM memory layouts. Nor does running on the JVM impose major restrictions on the framework (we are working on a version that does not run on the JVM in the SubstrateVM project). What Graal+Truffle allows is for library writers and language designers to invent new representations without any restrictions on the layout and have them interact with the rest of the system. We want programmers to use the best language for the task, and even to combine multiple languages within one program. Foreign objects can be passed around freely as parameters and local variables, but there are restrictions when building combined data structures - e.g., if you have a highly optimized native data structure, it is not possible to install a pointer to a JavaScript object in it without performance loss.


Interesting. So, if I'm understanding this correctly - Truffle allows a language designer to specify a certain logical layout for data structures, along with hints for how this will get converted to a physical memory layout. Language implementors also get the full power of Graal for lowering their AST to machine code and other compiler tasks. On cross-language boundaries, it generates automatic accessors for other languages to access that data, using the logical layout to identify how particular fields need to be pulled out and manipulated, but not requiring that the full data structure be converted across the foreign-call boundary. One consequence of this is that nesting & embedding of data structures may require an explicit conversion, since if an object is a hashtable in Javascript but a packed series of fields in C, it's obviously not going to fit.
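One way to picture the generated-accessor idea in plain Java (this is a caricature of the concept, NOT the actual Truffle interop API; all names are mine): each language keeps its native physical layout and merely answers a generic "read member" message over it.

```java
import java.util.Map;

// Plain-Java caricature of the logical-vs-physical-layout idea --
// not the real Truffle interop API. Each "language" keeps its own
// physical representation; only the accessor code differs, and no
// data is converted at the call boundary.
public class ForeignAccess {
    interface ForeignObject { Object readMember(String name); }

    // A "JavaScript-like" object: physically a hash map.
    static ForeignObject fromMap(Map<String, Object> fields) {
        return fields::get;
    }

    // A "C-like" struct: physically a packed int array plus a field table.
    static ForeignObject fromPackedStruct(int[] words, Map<String, Integer> offsets) {
        return name -> words[offsets.get(name)];
    }

    public static void main(String[] args) {
        ForeignObject js = fromMap(Map.of("x", 1));
        ForeignObject c  = fromPackedStruct(new int[]{1}, Map.of("x", 0));
        // Calling code is representation-agnostic:
        System.out.println(js.readMember("x").equals(c.readMember("x"))); // true
    }
}
```

In the real system the accessor isn't a virtual call like this, of course; it's specialized and compiled by Graal per call site.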

Sorta like SWIG++? If you could do SWIG but never require that an end-user write a typemap or debug a crash themselves, there'd probably be a big market for that.


Yes, this is an excellent description.


> Cap'n Proto, for example, loses much of its performance benefits on the JVM without the use of sun.misc.unsafe

Not actually true. `ByteBuffer` lets you read integers from arbitrary offsets, which is all Cap'n Proto really wants.
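A minimal sketch of what that looks like (class and method names are mine): `ByteBuffer` supports absolute-index reads at any byte offset, which covers the zero-copy access pattern without `sun.misc.Unsafe`.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Minimal sketch: absolute-offset integer reads on a ByteBuffer --
// the access pattern a Cap'n-Proto-style zero-copy format needs.
// Absolute get/put methods don't touch the buffer's position.
public class OffsetReads {
    // Read a little-endian 64-bit word at an arbitrary byte offset.
    static long readWord(ByteBuffer buf, int byteOffset) {
        return buf.order(ByteOrder.LITTLE_ENDIAN).getLong(byteOffset);
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(32).order(ByteOrder.LITTLE_ENDIAN);
        buf.putLong(8, 0xDEADBEEFL);          // write a word at offset 8
        System.out.println(readWord(buf, 8)); // absolute read, no seeking
    }
}
```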


I agree with your entire comment. However, I also agree with the point the OP's author makes about tooling being reimplemented time and time again, taking years to get anywhere near the point represented by the first 9-point wishlist. What the programming field could do, which would be awesome, would be to implement language-category backends that follow your "mostly like" categorization. These backends would be engineered to satisfy the wish lists, but be tailored to certain specific domains (systems programming, hard real-time, soft real-time, high-level dynamic programming...). From what I can see, we're a good part of the way there already. (Graal and Truffle, LLVM, Guile)

"Little languages" can have significant advantages over APIs in terms of flexibility, optimization, and conceptual fit with specific domains. Why not create infrastructure that de-duplicates the work of building tooling for languages, ensures everyone always has first-class tooling, and makes "little languages" as easy to create as APIs?


We already have that in the form of Lisp and Smalltalk. Sadly that's not what the majority of programmers want to use.


We need something with the capabilities of Lisp and Smalltalk but with syntax more like Ruby or Python.


We should be clear here by what we mean by language.

It's perfectly possible and manageable to create cross-language VMs and to translate pretty accurately between languages. However, what breaks down, as you alluded to, is that people expect the standard libraries, and the libraries of others, to work interchangeably, and that is orders of magnitude more complex (mostly because you are, by definition, working closer to the metal).

So, I need to ask, how much research is being done in giving the standard library as much consideration as the language? Do we have a grasp on what it would take to treat the stdlib as a "first class" citizen?


> Do we have a grasp on what it would take to treat the stdlib as a "first class" citizen?

One of the solutions to this problem could be having package management in your language be a first-class citizen, then treating each part of the stdlib as a separate package. If using stdlib code is just as simple as using 3rd-party libraries, then why need a stdlib at all?


It's because languages are communication tools as well as compilers, and the stdlib is part of the common vocabulary that the rest of the language ecosystem shares.

Without a decent stdlib, you end up with the pre-STL C++ situation, where every library & framework declares its own string, vector, and hashmap classes and you need slow, verbose, and error-prone routines to convert between them. The main benefit of having these in the stdlib isn't that you don't have to re-implement them, it's that every third-party lib knows exactly which class represents the concepts of text, sequences, and dictionaries.

The same goes for many other types, eg. Promises, Paths, URIs, Dates, Input/OutputStreams, Iterators, etc. Indeed, when one of the stdlib types is misdesigned (eg. java.util.Date), you get an utter mess as the community converges on an alternative (eg. JodaTime) and then the alternative API is folded back into the stdlib (Java 8).

Luckily, we're learning just which types need to be in the stdlib and which can be outsourced to a third-party package manager. In particular, a lot of serialization/parsing formats are better off in a package manager, as long as the language defines an annotation system that can be used to define which fields must be saved. And giant systems like webservers, webframeworks, or RPC formats are often better off as a third-party library.


> It's a great tool if you work in the JVM ecosystem and want to play around as a language designer.

Except that it's not really exposing much of the JVM from what the article mentions; it just uses the JVM as a base. That is, Truffle languages can talk to other Truffle languages, but I'm not sure how easy it is for a Truffle language to talk to a JVM language.

From everything I read, it seems like people thought Parrot was a good idea so decided to rewrite/reimagine it using the JVM instead of C as a base.


I liked Parrot, but it didn't move in the right direction. The lack of useful AOT and JIT from the start kind of turned me off.


Well, those were always planned features, they just never got around to them, or they never got enough attention before the project basically died out.

As for not moving in the right direction, it was probably far too ambitious to try to develop something like Parrot and something like Perl 6 at the same time, with dependencies on each other. Both projects would have been better served if they did their own things, and Parrot was just seen as "a good candidate for a second implementation" from the Perl 6 perspective, rather than the primary target. It's hard to target a VM that's not done, and it's hard as a VM to support a language that isn't finalized (and has crazy requirements).


Well, you have the mainframe model that used kernel JITs for all their official languages, like the OS/400, nowadays IBM i.

On IBM i the language surface is RPG, Cobol, C, C++ and Java.

On other ones you could eventually consider the micro-coded CPUs as a kind of cross language VMs.


Cross-language compatibility was a feature of OpenVMS systems. The AS/400 family had AOT and/or JIT for all object code in the system. They also had PL/S, which subsetted PL/I by removing the runtime. These technologies seem to combine all those capabilities in one stack.


> What usually happens is that they work fine for languages that look, semantically, basically like the native language on the VM.

> The JVM works if your language is mostly like Java (Java, Clojure, Scala, Kotlin).

I'm really curious why you're characterizing Clojure as "mostly like Java"?


Indeed. What if you're writing a Scheme interpreter? You're screwed.


As a specific example, Graal is built on the JVM. I'd be interested to see if they have support for languages that require tail call optimization (Haskell, Scheme), as the JVM does not eliminate tail calls natively.
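The usual workaround is a trampoline: each "tail call" returns a thunk instead of recursing, and a loop bounces until a final value appears. A minimal sketch (class and method names are mine), which is roughly how Scheme-style implementations emulate deep tail recursion on the JVM:

```java
import java.util.function.Supplier;

// Minimal trampoline sketch: tail-recursive code is rewritten so
// each step returns either a final value (Done) or a deferred next
// step (More). A driver loop bounces between steps in constant
// stack space, since the JVM won't eliminate tail calls itself.
public class Trampoline {
    sealed interface Bounce permits Done, More { }
    record Done(long value) implements Bounce { }
    record More(Supplier<Bounce> next) implements Bounce { }

    static long run(Bounce b) {
        while (b instanceof More m) b = m.next().get();
        return ((Done) b).value();
    }

    // Tail-recursive sum 1..n in trampolined style; written as plain
    // recursion this would overflow the Java call stack for large n.
    static Bounce sumTo(long n, long acc) {
        return n == 0 ? new Done(acc) : new More(() -> sumTo(n - 1, acc + n));
    }

    public static void main(String[] args) {
        System.out.println(run(sumTo(1_000_000, 0))); // 500000500000, no StackOverflowError
    }
}
```

The cost is an allocation per bounce and the loss of the direct call-stack shape, which is part of why people say the JVM "can't do" TCO languages cleanly rather than "can't do them at all".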



F# isn't much like C#, and in fact it does suffer a bit for it, as some of the back end causes compromises in the F# language. Mutable fields in records have secret baggage, functions with arithmetic operators can't be automatically genericized unless you mark them inline, etc.


My biggest letdown with F# is that NullReferenceException is still a thing. Also, C# functions are called as if they're typed to take tuples, but then you can't create a tuple and pass it along.


Judging by LuaTruffle, the Lua example, this framework is useless.

LuaTruffle appears to implement the syntax of Lua but none of the semantics. For example, no metatables and no coroutines. Lua is unparalleled in its support of coroutines, and metatables are what make optimizing Lua _difficult_. Without implementing either of those things you've neither demonstrated the ability of the framework to implement novel control-flow semantics (asymmetric coroutines that can yield across multiple, unadorned function invocations) nor its performance capabilities (JIT-optimizing Lua tables is non-trivial).

I'm not even sure LuaTruffle implements Lua's lexical closures or tail call optimization, both critical to Lua's ability to support functional programming patterns.

Of course, it definitely doesn't implement the Lua C API, which is part-and-parcel of what makes Lua preferable as an extension and glue language. But I was willing to overlook that if it could easily implement the former.

The beauty of a good DSL isn't the syntax (I know, hard to believe!), but in novel approaches to code flow execution and other deep semantics. Golang's goroutines are beautiful. Rust's ownership analyzer and prover are what _defines_ the language. Haskell's lazy evaluation opens up whole new dimensions for problem solving (and headaches). If your language framework doesn't at least preserve the ability to implement those things cleanly and performantly, it's not adding much value and is basically a toy. It's not like writing a lexer for a DSL is a serious impediment. (Using Lua's LPeg, for example, you can write parsers and build an AST for most major languages in a few hundred lines of code.)


LuaTruffle represents a few hours' work by a casual, non-expert external contributor.

If you want to know about how well we can implement the semantics of an existing language then look at JRuby+Truffle - it passes more Ruby language tests than any other alternative Ruby implementation. The only stumbling block is co-routines as the JVM doesn't have these and they're hard to implement efficiently with other constructs.


Does that kinda throw off the whole concept about pushing language design if you can't do coroutines?

That's a pretty critical feature of Lua that somewhat defines it (along with its low overhead and embeddability, both of which you won't get with the JVM).

It seems like engineers always want to build One Tool to Solve Them All (tm), and yet there are always tradeoffs to be made. The reason 90% of this industry is still employed is because we don't have the one-size-fits-all problems that we'd so love to solve.


There was a patch for coroutines on the JVM some time ago. Unfortunately it did not make it into the platform. There's no reason why it could not be added to the JVM.

SubstrateVM solves the embeddability part.


Info out there on SubstrateVM looks a little sparse.

For context we use to run VM + tuneable data + game logic code in a 400kb block allocated to Lua on the PSP. I'd love to hear how SubstrateVM compares. Lua has a really rich history of being embedding in some pretty small targets.


400 KB is a very aggressive target. I don't think anything in SubstrateVM would absolutely prohibit that, but currently our images are larger.

Some of this is just simply a design trade-off. E.g., in JRuby+Truffle, we've implemented much of the core library in Ruby. This has allowed us to achieve a high level of language compatibility in a fairly short period of time -- implementing 3,000 core library methods in Java would be quite the undertaking. As a consequence, the static binary must include those full Ruby sources (compressed, but they're not in an optimized bytecode format).

Having said that, if we wanted to optimize for size over a restricted subset of the language (think mruby), that would be straightforward to do.


Oh, only co-routines? So you've implemented callcc already?


Truffle+Graal could actually easily implement all of those things, LuaTruffle is just one of the less developed ports. JRuby+Truffle has highly efficient analogues of most of the things you mention.


> Since the dawn of computing our industry has been engaged on a never ending quest to build the perfect language.

Except it really hasn't. The industry is going to use whatever language has the bare minimum feature set they value at the moment and no more.

Call me cynical but my point of view is: if the industry were really striving for perfection, there would be no COBOL or BASIC, for instance. Lisp had already been invented. Garbage collection was a thing already. We had macros. A REPL. A little later, OOP and multiple dispatch. The list goes on.

They had to ditch the (almost) perfect wheel for a cheaper square one. Fast forward a few years: the square wheel is now polished but just as square, and the insanely great wheel is now just as cheap, but we won't use it, because the axles expect the square wheel.

And then we invent Java and XML...


if the industry were really striving for perfection, there would be no COBOL or BASIC, for instance.

COBOL was a huge improvement at the time, an innovation in readability. BASIC is a good demonstration of why the platonic ideal language is wrong: it was designed for beginners (that's what the acronym stands for), and it did a reasonably good job at that. Beginners might be overwhelmed by OOP, multiple dispatch and a list of design patterns. Sometimes experts are too.

Garbage collection is a nice feature, but it doesn't fit everywhere. Sometimes you want to manage your own memory. That's why we have more than one language.


> COBOL was a huge improvement at the time, an innovation in readability. BASIC is a good demonstration of why the platonic ideal language is wrong: it was designed for beginners (that's what the acronym stands for), and it did a reasonably good job at that.

Sure. But these systems were created from the ground up, every single time. Given a good foundation, which could be Lisp or something even better, you can whip up the "simpler" languages in no time. Even ditch the parentheses if you so desire. While getting all the benefits of the underlying system.

> Garbage collection is a nice feature, but it doesn't fit everywhere. Sometimes you want to manage your own memory. That's why we have more than one language.

Sure. But I disagree that that's the reason why we have multiple languages. There's no single reason; it is probably a mixture of preferences, previous knowledge, budget, and time to market (JS!), among others.

We can manage our own memory from higher level languages. We can compile them. We can even make them spit out assembly custom-made for a given, niche application (see the Galileo magnetometer patch).

Sorry if I sound like a smug lisp weenie. But even that description is not accurate: had we focused our collective resources towards "perfection", as that sentence implies, we would have something way better than Lisp itself. Or anything else that we currently use.

It is as if we had invented jet engines before planes. But then placed steam engines on them because they were cheaper. Or that more people understood them.


Lisp didn't miss the train because it was too good or because people were idiots. For a foundation to be good, it has to work for most use cases. Lisp didn't at a time when C did, when hardware was small and expensive.

Times have changed now? Sure, but history didn't wait.


That's a significant part of it. Check out PreScheme, though, for how that might have worked:

https://en.wikipedia.org/wiki/PreScheme

It's a Scheme created as a C alternative for low-level programming used in a verified Scheme48, VLISP. It was quite efficient. LISP's doing more of that kind of thing might have fueled more use of the language.


UNIX and C were free as in beer, because AT&T was forbidden to sell them and initially gave them away.

Lisp usually required expensive mainframes and workstations.

Of course C got the adoption of the 80's hipsters with such price tag.


Lisp worked well before C was even conceived. It's just that C could be fitted onto less expensive hardware, which was in turn easier to market to companies. C and UNIX's success is a literal "worse is better" example - crappy solutions (compared to other contemporary ones), but with smaller up-front costs, so easier to sell. Buyers didn't seem to care that it cost more down the line.


The reality is: programming languages are fun, but the programming language is not the biggest barrier to writing good code, or even getting it done quickly.

Good programmers can write good code in any language, lousy programmers will write lousy code in every language. Malbolge might be excluded from that, but with a decent macro pre-processor, even brainfuck can be workable.


Most programmers are average though, so it helps to have languages that make it harder for you to introduce bugs.


> Given a good foundation, which could be Lisp or something even better, you can whip up the "simpler" languages in no time.

In practice, that foundation (for many programs) is Java byte code.


> BASIC is a good demonstration of why the platonic ideal language is wrong

That's often my point when people bring up arguments against a language that seems a bit more complex (Perl), or for language simplicity (Python). Some complexity falls into a sliding scale where there is more cognitive load while learning, but it pays off in everyday usage. A simple example of an extreme end of this is APL. If you can internalize and understand the extremely concise and powerful syntax and semantics of the language (I haven't), you can do amazing things very quickly[1].

Now, doing amazing things very quickly isn't the only metric by which we judge a language, but it is a useful metric, and it's probably worth having a language on that end of the spectrum in your tool set.

1: https://www.youtube.com/watch?v=a9xAKttWgP4


> If you can internalize and understand the extremely concise and powerful syntax and semantics of the language (I haven't), you can do amazing things very quickly

Not to mention figuring out how to type APL -- that's where I got stuck.


Try J then. It uses the ASCII character set, and has added some concepts that APL seems to be adopting [1]. And it is amazing what you can do once you have put some time into it. Jd and Jdb, both database apps for J, beat Spark/Shark and others at the big-data game. I'd call that pretty powerful. Similar to the way kdb+/q is used a lot in the finance sector for ticker data, but J is open source.

[1] jsoftware.com


Keyboards are available, but they're expensive:

http://www.dyalog.com/apl-font-keyboard.htm


Don't worry too much about typing it. The syntax of APL isn't all that powerful and actually limits how you can leverage the aggregate operation semantics.


> Some complexity falls into a sliding scale where there is more cognitive load while learning, but it pays of in the every day usage.

I think quite a lot of complexity falls into this scale, and the quest for simplicity - or rather, "user-friendliness" - in computing is utterly misguided. The difference between a toy and a tool is that a tool exchanges more upfront learning requirement for ability to get more done. Not just more efficiently, but more. We're doing ourselves, and the world, a huge disservice by expecting that everything - be it a website, an IDE or a programming language - should be able to be mastered in first 5 seconds of exposure. The only way to achieve that is to dumb the product down.

In other words, we need more learning culture (or even RTFM culture), less "user-friendliness".


The majority never picks the best and markets do not optimize as much as some people think. Whether it's movies, arts, books, music, politicians, keyboards or programming languages, the most popular choices are practically never the best ones.

Which is probably why a technocracy in which every voter must pass a fact-based proficiency test before being allowed to vote might trump democracy. Not sure, just wanted to chime in with being cynical. :/


Who gets to determine which facts?

Why would that group not use that to their advantage, instead of to the benefit of everyone?

> The majority never picks the best and markets do not optimize as much as some people think. Whether it's movies, arts, books, music, politicians, keyboards or programming languages, the most popular choices are practically never the best ones.

The question is not whether the market picks the optimal outcome, but whether a committee of experts would do better. The experience of central planning within the Warsaw Pact should hint at the answer...


> The question is not whether the market picks the optimal outcome, but whether a committee of experts would do better. The experience of central planning within the Warsaw Pact should hint at the answer...

The experience of the entire Internet seems to suggest a different answer, so it's not that central planning is always bad. Note that the best, most reliable, most stable parts of the Internet were invented and codified long ago, when there weren't many people on-line and there wasn't that much commercial interest in it. Today we can rarely if ever agree on any kind of protocol or standard, and if we do, it's usually a huge bloated mess.


> Today we can rarely if ever agree on any kind of protocol or standard, and if we do, it's usually a huge bloated mess.

So the question becomes, how do you know whether you're going to get a huge bloated mess or the Internet, and how do you influence the outcome?

As the number of potential 'experts' increases and the number of people involved in choosing them increases, you get something that looks more and more like a market, but without price transparency...


Randomly chosen from a democratically determined set, e.g. by a double-blind peer review committee.

Also, the facts can be discussed openly and if there was a problem with some of them, there could be a system of revoking or modifying the set of questions similar to the recounting of ballots.

As for the committee of experts: that was your idea, not mine. I never suggested it, because it obviously makes no sense.


You used 'trump' next to 'democracy' ;)

The majority and optimizing markets are at odds in your statement. Look at kdb+/q: it is very expensive, and programmers cost a bunch, but it is used by the big financial houses. I am sure they are glad it is not too popular and affordable.

I do agree that I don't like central planning, and design by committee, but those are different than your market statement or technocracy comment.

Movies, arts, books, music...horses for courses, best for what moment? I like action films, I like high-brow, arthouse stuff too. Same with books. I am glad I have a choice for the moment at hand.

When I was a young intellectual, I succumbed to that elitist attitude of 'the best' and 'they don't know what's good for themselves', which is funny because I grew up in a poor, working-class neighborhood in Brooklyn in the late 60s/early 70s. The elitists were the guys from NY as well, but the ones who wore black berets to film class at TSOA/NYU (three in my class alone!).


> "They had to ditch the (almost) perfect wheel for a cheaper square one."

Lisp was slower than C when speed mattered most. If you can't see why that would be an issue, I suggest trying to write a game for a home computer from the 1980s.


Some languages are better than others at certain things. By default, C is better than anything at low-level systems programming. Fortran can crunch numbers like nothing else. COBOL and BASIC are easy for novices, and BASIC is relatively simple to implement on low-end hardware, making it a natural choice for the early micros. Java is better than anything else at... well, no, Java sucks. And XML is even worse. Steve Yegge once noted that somehow, Ant and Jelly were less verbose than Java. And honestly, I'd believe it.


Before anyone gets too excited, make sure you look at what a compiler using Truffle actually looks like, and remember that Java is far from an ideal language for writing a compiler. I'll use their SimpleLanguage (their primary tutorial lang) as an example.

Parser [1] shows how lack of sum types makes code messy. Lots of "factory.createStringLiteral" kind of calls. At least Java has a mature parser generator.

Implementing interpreter nodes [2] requires a pretty large amount of boilerplate, each individual node is relegated to its own file, some classes are littered with probably autogenerated getters/setters. While functional languages can be too terse, Java shows its exceptional verbosity here.

[1] https://github.com/graalvm/truffle/blob/master/truffle/com.o...

[2] https://github.com/graalvm/truffle/tree/master/truffle/com.o...


Partial evaluation requires very precise knowledge of the bytecodes that are fed into it. Having e.g. Scala abstractions suddenly showing up would be quite annoying. Java is closest to the bytecodes; that's why we use it for Truffle interpreters.


Urrgh. This just reminds me of why I gave up on android app development, and the JVM ecosystem as a whole: it tests my ability to put up with bullshit.


I agree. I wanted to really write native Android apps, not HTML5 or even use the NDK with C/C++. It is just too much stuff. The downloads, the verbosity of Java, the lack of a slick toolchain (Android Studio is getting close, I guess), and a slew of other incoherencies.

I don't cut slack for Obj-C on iOS either.

I've had great fun with two products for Android/iOS and Mac/PC/Linux:

[1] Godot game engine. I followed the introduction and had an app on my Android phone in 10 minutes. A little hack in the background, but hey, for gaming it is great.

[2] 8th - a dialect of Forth - you write Forth-like 8th, and voilà it's running on your Android, PC, Mac, or iOS device!

[1] https://godotengine.org/

[2] http://8th-dev.com/about8th.html#cross


I have not used godot up to now, as I am wary of engines that require me to learn a scripting language used nowhere else. Is it any good?

Oh, and I've heard React Native's not so bad for mobile.


I have not tried more than about 8 game engines, so my opinion is limited. Godot is very good:

* Very customizable interface
* Lots of tutorials on YouTube and elsewhere
* Good examples provided
* Script is very similar to Python, so not too hard to pick up, and again, lots of examples
* Easy packaging for Android, Windows, and Linux on my end
* MIT license, commercial or non-commercial
* 2D/3D and live debug and scripting on the fly

Really worth trying out, since it is easy to install and dive in.

I'll have a look at React Native, but I am not a JS programmer, and all of these 'hot' JS libs, modules, etc... seem more cluttered than plain old JS. I think if I were to branch out, I'd try Purescript instead.


If you're thinking along the lines of Angular, React is very different (less clutter, nice design, emphatically not a framework), and has inspired many re-implementations of its ideas.

It's probably more worth checking out than most of what's out there.


Well, can't we just use another language for the JVM (e.g. Kotlin or Scala)?


Hopefully! I would love to see a Truffle language written in Scala, although some of the DSL stuff I've seen in Scala (e.g. Delite) is still pretty verbose. I feel like the missing piece is quasiquotations--writing the code generation step would be a lot easier with those.


The Oracle flavors of the JVM/JDK probably scare too many away with regard to redistribution of their product. That, coupled with things like the AOT engine being closed source and the weight of the JVM for anything besides daemons (both mentioned towards the end of the article), probably keeps language designers away these days.

I would love a lightweight toolkit that made for easy language development. VMKit[0] is dead, MicroVM[1] is a nice idea but not full fledged. Many like myself looking to implement a toy language would love to be fast out of the gate and would rather not mess with the OS-specific constructs and specifics of LLVM IR. So we end up cross-compiling to existing languages or marrying ourselves to a certain runtime (e.g. the CLR). Any other lightweight VMs or AOT compilers that are cross-platform that language designers can use these days?

0 - http://vmkit.llvm.org/
1 - http://microvm.github.io/


QBE[1] seems to be what you're looking for. It "aims to be a pure C embeddable backend that provides 70% of the performance of advanced compilers in 10% of the code".

Previous discussion on HN: https://news.ycombinator.com/item?id=11555527

[1]: http://c9x.me/compile/


Neat, I missed this when originally posted. Sadly, it falls apart on the "cross-platform" requirement (does not appear to support Windows).


Graal/Truffle is not a proprietary part of the JDK, i.e. it's also part of OpenJDK. The article mentions this.

Some tangential things mentioned are proprietary though.


I think we should start to see a less resource-intensive JVM post-Java 9, even more so with Java 10. The problem boils down to whether there exists an alternative ecosystem that is going to be as featureful and as powerful in the meantime. I am bullish on Oracle here, as much as I hate to say it.


yeah java9/10/11 stuff is probably a little bit late.


Yeah, the problem is that it will only appear around 2020 and meanwhile the competition doesn't stand still.

Also we still have deployments using EOL Java versions that don't get upgraded, because IT doesn't want to change those servers.


Also, some time back there used to be the OpenJDK Maxine VM project, which is gone, and now a closed-source option, SubstrateVM, has appeared.

It may be unlikely, but Oracle can simply remove highly paid staff from the above-mentioned projects and put them on some closed-source alternative/analogous projects. The open-source projects may then be left to rot. As it is, almost every single developer on Graal etc. is from Oracle Labs or an Oracle-funded research group at a university.


check out RPython.


I should have mentioned one of my wants was a pluggable type system. Maybe not a full type inferencing component (that would be nice!) but at least static typing at the IR level.


Doesn't RPython do all the same things? Also there are language workbenches (http://www.languageworkbenches.net/) that allow one to get a bunch of things like IDE support and even basic compilers by properly expressing the language syntax and semantics.

I agree there is a lot of interesting stuff going on here, but let's not forget the prior art. Even OMeta, I think, already did a lot of these things way back when. Going further back there's META II (https://en.wikipedia.org/wiki/META_II).

I think building self-specializing interpreters is the main trick. If that becomes easy then a whole bunch of other magic is also possible. But I'm just an amateur so all of this is still very impressive.


Since you mention OMeta, let me remark here that for some reason, this whole Graal+Truffle thing looks to me like an "enterprisey" version of OMeta+COLA.


RPython is very similar to Graal+Truffle, except that RPython is based on tracing and Graal+Truffle is based on partial evaluation.

There's a nice paper comparing the two approaches and their relative merits:

http://stefan-marr.de/papers/oopsla-marr-ducasse-meta-tracin...


Okay, very cute, but what if you want your language to have significant semantic differences from C and the like? What if you want to write a Scheme interpreter, for instance? Can I have Continuations? Can I have Tail Call Optimization? I very much doubt it.

If you want to write a language that is semantically like C for the most part, then go ahead and use Truffle, but that's not where interesting language design is happening. Show me that Truffle's Ruby implementation hasn't removed callcc (yes, Ruby has callcc), and maybe I'll reconsider.


You can do TCO relatively simply. You throw a special exception class for nonlocal control flow (e.g. tail calls, breaks from loops, continues, etc.), and Graal knows how to optimize the cost away to a normal jump.

http://cesquivias.github.io/blog/2015/01/15/writing-a-langua...

ZipPy (an incomplete Python implementation) supports Python's coroutines, which it runs faster than any other implementation. In theory these might be directly generalizable to get continuations.

http://thezhangwei.com/documents/oopsla113-zhang.pdf


The TCO sounds like an ugly hack to me, but I'll take it. How would this interact with the magical AST merging? I would assume that languages with different calling conventions can't just merge like that.


Imagine you have an AST. You execute it directly, by running a node. This node will run its sub-nodes, etc. The implemented language has its own stack frames stored in specific objects. Thus the stack is actually split between those variables in the Java call stack and those in the explicit object.

Let's say you're in a particular function at a tail call node. Since the tail call node is semantically (from the point of view of the Java program) many function calls deep into the implemented language's function and we want to get back to the start, we want some way of exiting a whole stack of function calls in one jump.

Exceptions offer a simple and neat way to do this. You throw from the call site and catch back at the start of the function. Graal will inline all of the internal function calls up to this point, since Graal uses inlining extremely heavily. This means that Graal will see the throw site and the call site next to each other in one block of code, and consider it to just be a jump.

Because Graal is just inlining Java functions, and the actual target language's call stack is stored separately as a particular object, the target language's calling convention for a large part doesn't matter. The jump to the first node doesn't invalidate any of the target language's stack - that has to be done by the language implementor manually at that point.
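To make that concrete, here is a minimal plain-Java sketch of the exception-as-jump trick (no Truffle dependency; all class and method names here are mine, not the Truffle API). A tail call throws a cheap, stack-trace-free exception carrying the next target, and a dispatch loop at the function boundary catches it, so the Java stack never grows:

```java
import java.util.function.LongUnaryOperator;

// Lightweight control-flow exception: no message, no stack trace, cheap to throw.
final class TailCall extends RuntimeException {
    final LongUnaryOperator target;
    final long argument;
    TailCall(LongUnaryOperator target, long argument) {
        super(null, null, false, false);
        this.target = target;
        this.argument = argument;
    }
}

public class TcoSketch {
    // countdown(n) is tail-recursive; instead of calling itself it throws.
    static long countdown(long n) {
        if (n == 0) return 0;
        throw new TailCall(TcoSketch::countdown, n - 1);
    }

    // The "function prologue": catch tail calls and loop instead of recursing.
    // After inlining, a JIT like Graal can see throw site and catch site in
    // one block and treat the pair as a plain jump.
    static long call(LongUnaryOperator f, long arg) {
        while (true) {
            try {
                return f.applyAsLong(arg);
            } catch (TailCall tc) {
                f = tc.target;      // jump back to the top with the new target
                arg = tc.argument;
            }
        }
    }

    public static void main(String[] args) {
        // Deep enough to overflow the Java stack with ordinary recursion.
        System.out.println(call(TcoSketch::countdown, 1_000_000)); // prints 0
    }
}
```

This is only the control-flow half; as the comment says, cleaning up the implemented language's own stack frames at the catch site is still the implementor's job.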


If you do it in a language-specific way, you would probably need to catch those TC exceptions when calling into functions of your language. That is very easy to do, as every Truffle language controls how its functions get called from foreign languages.


Ah.


Don't know why you get downvoted. That's exactly what I thought.

Java is my main lang and currently I'm working on my hobby project - implementing Scheme r5rs in Java. That is hard! And I doubt I'll be able to implement full r5rs even close.

I have no idea (yet) how to implement continuations. In theory, yes, you can implement continuations in any language that has exceptions (the only way in Java to unwind the stack).

Like: https://www.politesi.polimi.it/bitstream/10589/108685/3/2015...

Proper TCO:

AFAIK it is impossible in the general case in an object-oriented language. That is one of the reasons why Rich Hickey decided to have explicit recur in Clojure.

Yeah, you can implement your own stack and stop using Java's stack, but then you lose performance and interop: "Interop, speed, TCO -- Pick two." Rich Hickey

See https://news.ycombinator.com/item?id=4922848
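For the "implement your own stack" option, the usual shape is a trampoline: a tail call returns a description of the next call instead of making it, and a driver loop keeps bouncing. This also illustrates the interop cost Hickey's quote mentions, since callers must go through the driver rather than making a plain Java call. A plain-Java sketch (all names are mine):

```java
import java.util.function.Supplier;

// A Bounce is either a finished result or a thunk describing the next call.
interface Bounce {
    boolean done();
    long value();                 // valid only when done()
    Bounce next();                // valid only when !done()

    static Bounce result(long v) {
        return new Bounce() {
            public boolean done() { return true; }
            public long value() { return v; }
            public Bounce next() { throw new IllegalStateException(); }
        };
    }

    static Bounce call(Supplier<Bounce> thunk) {
        return new Bounce() {
            public boolean done() { return false; }
            public long value() { throw new IllegalStateException(); }
            public Bounce next() { return thunk.get(); }
        };
    }
}

public class TrampolineSketch {
    // Tail-recursive sum of 1..n, safe at depths that would overflow the JVM stack,
    // because each step returns immediately instead of recursing.
    static Bounce sum(long n, long acc) {
        if (n == 0) return Bounce.result(acc);
        return Bounce.call(() -> sum(n - 1, acc + n));
    }

    // The driver loop: this replaces Java's call stack for our language.
    static long run(Bounce b) {
        while (!b.done()) b = b.next();
        return b.value();
    }

    public static void main(String[] args) {
        System.out.println(run(sum(1_000_000, 0))); // prints 500000500000
    }
}
```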

Then the full numeric tower is hard, mostly due to the fact that Java's type system is not strong enough.

I think Kawa is the best implementation of Scheme for JVM you can get now.

But even Kawa has some limitations (due to JVM limits): https://www.gnu.org/software/kawa/Compatibility.html


You're not compiling to Java though. You're writing an AST walking interpreter, so you should be able to implement continuations and TCO just fine.


Can you please elaborate? Because I don't see how it is different.

Are you still using Java stack, Java calling convention, still able to do Java interop?

When you are writing your own language on top of Java (Scheme, for example), you have full control of your lang AST (S-expressions, for example). How does it help to implement TCO and continuations (assuming you don't want to lose Java interop, Java calling convention, Java stack and performance)?


I'm definitely not an expert on this, but as the article describes it, you basically implement an AST with "execute" methods on its nodes. So that's just an interpreter that happens to use the Truffle framework to implement that AST. You can then implement continuations or TCO however you see fit. As I understand it, Graal is able to 'magically' turn that into a JIT. Again, I'm not an expert (I just read the article), but that seems very similar to how PyPy does it. The actual optimizations are different; PyPy uses tracing, while Graal uses partial evaluation on top of HotSpot, I think.
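As a rough illustration of the "AST with execute methods" shape (plain Java, not the actual Truffle API; a real Truffle node would extend Truffle's Node class and use its specialization DSL):

```java
// Each AST node knows how to evaluate itself; the interpreter is just
// the root node's execute() call.
interface Node {
    long execute();
}

final class Literal implements Node {
    private final long value;
    Literal(long value) { this.value = value; }
    public long execute() { return value; }
}

final class Add implements Node {
    private final Node left, right;
    Add(Node left, Node right) { this.left = left; this.right = right; }
    public long execute() { return left.execute() + right.execute(); }
}

final class Mul implements Node {
    private final Node left, right;
    Mul(Node left, Node right) { this.left = left; this.right = right; }
    public long execute() { return left.execute() * right.execute(); }
}

public class AstSketch {
    public static void main(String[] args) {
        // (2 + 3) * 4
        Node program = new Mul(new Add(new Literal(2), new Literal(3)),
                               new Literal(4));
        System.out.println(program.execute()); // prints 20
    }
}
```

The promise of Truffle/Graal is that an interpreter written in this style, plus some annotations and rewriting hooks, gets partially evaluated into compiled code, rather than staying a slow tree walker.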

Now, as you point out, you lose Java interop that way. However, you can imagine using Truffle + Graal to also build a similar interpreter for Java. For interop, your interpreter 'simply' has to merge in the AST for the Java interpreter (with its own 'execute' methods). As I understand it the RubyTruffle project uses a similar trick to do C interop.


But if the calling and stack semantics are different from Java's (and they kind of have to be), then you can't just merge the ASTs. Ruby, provided you don't implement callcc, is almost, if not entirely, identical in calling convention to C. So if Truffle supports magical AST merging, as described, I don't think you can implement anything with a different set of conventions from C/Java.


Wait, nevermind. See above.


There are a lot of weasel words for the other languages, but the comment about Ruby is extremely interesting:

"For example, the TruffleJS engine which implements JavaScript is competitive with V8 in benchmarks. The RubyTruffle engine is faster than all other Ruby implementations by far. The TruffleC engine is roughly competitive with GCC."


Chris Seaton gave this talk recently about JRuby+Truffle and some specifics about how it optimises so well compared to other Ruby implementations.

https://youtu.be/b1NTaVQPt1E

Previous discussion

https://news.ycombinator.com/item?id=12062454


Chris is also reasonably active here.

I've had the pleasure of discussing some of his JRuby work with him in the past (I occasionally plod along with my own Ruby compiler in Ruby - it's nowhere near complete), and they've done a lot of impressive work to demonstrate how compiling Ruby into fast code is largely about assuming people will behave sanely (e.g. people won't usually give Fixnum#+ side effects, or make it return weird stuff), but adding fallbacks for when they do.

This is the big challenge with Ruby: The subset of Ruby that people actually tend to work in is largely very predictable and possible to compile efficiently, but there's a lot of stuff around the fringes that people expect to work on the very rare occasions that we use it.
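That "assume sanity, fall back when violated" strategy can be sketched in plain Java; note these names are mine, not the JRuby+Truffle internals. The node speculates that + means plain machine addition until someone redefines it, then switches itself to a generic path:

```java
import java.util.function.LongBinaryOperator;

public class SpeculativeAdd {
    // The speculation guard: in a real JIT this would be a compile-time
    // assumption whose invalidation triggers deoptimization and recompile.
    private boolean plusRedefined = false;
    private LongBinaryOperator redefinedPlus;   // slow generic path, if installed

    long add(long a, long b) {
        if (!plusRedefined) {
            return a + b;                       // fast path: plain machine add
        }
        return redefinedPlus.applyAsLong(a, b); // fallback for the weird case
    }

    // Called on the rare occasion someone monkey-patches +.
    void redefinePlus(LongBinaryOperator impl) {
        plusRedefined = true;
        redefinedPlus = impl;
    }

    public static void main(String[] args) {
        SpeculativeAdd plus = new SpeculativeAdd();
        System.out.println(plus.add(2, 3));     // fast path: prints 5
        plus.redefinePlus((a, b) -> a - b);     // someone redefines + to mean -
        System.out.println(plus.add(2, 3));     // generic path: prints -1
    }
}
```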


I believe those benchmarks are for throughput, though. JS VM startup times (including v8) are highly optimized, while Graal+Truffle run on the JVM which is more optimized for sustained speed.

JS VMs also tend to have multiple tiers for that reason - interpreter, baseline, and full JIT, etc. - while AFAIK the Graal+Truffle/JVM approach has just 2 (which is optimal for Java, but not for startup times of JavaScript and other dynamic languages).


Startup times are mostly important for scripting or running things in the browser. If you start applications once on the server or desktop and keep them running then steady-state performance is more important.

There are also projects underway to bring AOT compilation / caching of compiled code to the JVM. Right now they're only available to commercial customers due to their experimental nature and the needed support, but if I remember the talk correctly they're supposed to be added to OpenJDK eventually.


True about AOT coming to the JVM, however, dynamic languages can't be AOT'd well in general, so that won't help Graal+Truffle languages much.

Instead, dynamic languages tend to do tiering, which in principle is something Graal+Truffle could do, but it might be a lot of work.


Well at least it would help you avoid warm-up for the interpreter/compiler itself, so it might help a bit.


> The TruffleC engine is roughly competitive with GCC

It is not surprising, in the sense that Java is competitive with C/C++ according to many Java developers.


I work with .NET, Java and C++ stacks.

In many use cases, C++ might win the benchmark game, but the end user will not notice any difference in human time.

The biggest issue, of course, is that one needs to learn how to optimize code and use the right algorithms and data structures in the first place.


> In many use cases, C++ might win the benchmark game, but the end user will not notice any difference in human time.

If the program runs on the user's machine, I'm pretty certain the users WILL notice the difference, memory wise.


That's not true. Do you know Cyberduck? It actually takes less memory than most other FTP/SFTP etc. programs, and probably feels a dozen times smoother than most clients.


As I said, "use the right algorithms and data structures in the first place".


I mostly agree. Any Java application that does not need a GUI other than the web is just fine in most use cases.


UIs though...

I haven't worked with a single Java-based GUI application that felt smooth, fast, and efficient.

I wonder why that is.


Because most developers don't care and code everything on the UI thread.

Additionally Swing has bad defaults, so it requires effort to make the required set of calls to make them look better.

Back in the Sun glory days, there were Sun blogs like Filthy Rich Clients, which eventually led to a book: http://filthyrichclients.org/.

It was a consequence of Sun not getting what it means to develop GUIs for consumer systems; consider that their Solaris UIs (SunView, OpenWindows, CDE) weren't that great in that area either.

So most Java developers stick with the defaults and as such the majority of Java based applications send the wrong message.


Android does a bunch of things to keep you off the UI thread (IO throws an exception, handlers, AsyncTasks, etc.).

You still have to GC at some point and when that happens chances are you're going to drop frames. Unless you're very aware of the garbage you're creating you'll take more than ~5ms which is usually enough to push you over 16ms with the other work that goes on during a frame.


Measuring Java performance by how Android works is not a good measure.

Dalvik is well known in the Java world for having JIT and GC implementations that are worse than what most commercial embedded JVMs are capable of.

Things have improved with ART, but even there there are quite a few performance improvements that Google could eventually make.

Soft real time Java GCs for embedded devices are being used in ground station controls of missiles and a couple of US Navy weapon systems. They surely don't want GC glitches in battle situations.

The Android team seems to only bother with "good enough" in what concerns Android performance.

As a side note, just check how many Android releases they have gone through and yet real-time audio support isn't quite there. Even Windows Phone has better support for it, and they do have WinRT and .NET on them.


> As a side note, just check how many Android releases they have gone through and yet real-time audio support isn't quite there

Actually, there is. As of Android 6.0, Google's CDD includes a section for Professional Audio devices:

If a device implementation meets all of the following requirements, it is STRONGLY RECOMMENDED to report support for feature android.hardware.audio.pro via the android.content.pm.PackageManager class.

The device implementation MUST report support for feature android.hardware.audio.low_latency.

The continuous round-trip audio latency, as defined in section 5.6 Audio Latency, MUST be 20 milliseconds or less and SHOULD be 10 milliseconds or less over at least one supported path.

If the device includes a 4 conductor 3.5mm audio jack, the continuous round-trip audio latency MUST be 20 milliseconds or less over the audio jack path, and SHOULD be 10 milliseconds or less over the audio jack path.

The device implementation MUST include a USB port(s) supporting USB host mode and USB peripheral mode.

The USB host mode MUST implement the USB audio class.

If the device includes an HDMI port, the device implementation MUST support output in stereo and eight channels at 20-bit or 24-bit depth and 192 kHz without bit-depth loss or resampling.

The device implementation MUST report support for feature android.software.midi.

If the device includes a 4 conductor 3.5mm audio jack, the device implementation is STRONGLY RECOMMENDED to comply with section Mobile device (jack) specifications of the Wired Audio Headset Specification (v1.1).


I saw the Google I/O 2016 presentation where Google acknowledged that Android 6.0 didn't actually fully deliver, but Android 7.0 finally gets there.

They even had Samsung on stage to talk about their Android extensions for real-time audio.

So with Android 6.0 barely over 10%, and still with issues only fixed in Android 7.0, it is as I said: "real-time audio support isn't quite there".


>Even Windows Phone has better support for it, and they do have WinRT and .NET on them.

No, they don't. Point me to one Windows Phone audio app that has low-latency audio.



You linked to an app that measures audio dB levels. There are no low-latency audio / music apps because Windows Phone isn't very good at it.


Or maybe because no one capable of writing such applications wants to target 1% market share?

Sometimes being good at something isn't enough.


I wonder if BB users also use that excuse.


I didn't use it as an excuse.

I am no real time audio expert, but as someone that does work on Windows, I saw the information I have posted back when it went live.

So it is possible, that in spite of the support being there, no one cares about it due to the sad state of the Windows Phone market and not because they suck.

Or I suck at understanding how real time audio is supposed to work and the support isn't really there in spite of what Microsoft says.


Do you use or know Cyberduck?


I think another key part of the dream is IDE support. So maybe the dream is complete with: Graal + Truffle + Nitra[1].

[1] Nitra - https://github.com/JetBrains/Nitra


How do Nitra and MPS fit together? They both come from JetBrains and both describe themselves as "language workbenches".


So why is Oracle doing this?

Also, anything from Oracle which is partly open and partly closed is scary. They've done that with Java and MySQL, which drives people away from both.


That was one of my counterpoints. We've seen Oracle fight to claim that APIs are copyrighted and try to take down Android. All that kind of crap makes me think the safest route is pushing ETH/Wirth languages and/or Racket if you want DSLs and stuff.


Graal + Truffle are very, very cool. The main downside is that it only runs on the JVM, which is a substantial limitation.

However, I assume one goal of the project is to make the JVM more competitive, which it certainly does.


Have there been any recent languages/environments that (have reasonably) succeeded but aren't completely open source? I feel a general resistance to building a dependency on something when you don't have complete access to it, but maybe that is just me.


Have there been any recent languages/environments that (have reasonably) succeeded but aren't completely open source?

Xamarin. iOS too, although that depends on what you mean by 'recent'


My comment was because I found it interesting that they are trying the partially open source model, when it seems like that hinders adoption. So 'recent' is the competition for adoption.

Didn't a lot of Xamarin get open sourced after the acquisition? Swift is also open source, although Xcode isn't.


> Swift is also open source

True, though practically speaking, it's not very usable without the (closed-source) Cocoa libraries (much like Objective-C).


Interesting. But I feel like Racket is a better framework for quickly prototyping new languages.


Totally, Truffle is not nearly as easy as a simpler interpreter in a terser language, but unlike a Racket interpreter your code can actually run faster than raw Racket, and have incredible tooling and integration support.


When I first heard about this it was promoted as a faster ruby implementation. How far away is it from running rails?

Interesting that they've gone as far as running C extensions "internally", making it easier to run existing ruby apps without running up against the barrier of a native gem and having to change the code.



> Since the dawn of computing our industry has been engaged on a never ending quest to build the perfect language.

...unfortunately. I see PLs as mere material - sure, you can improve on them, but far more important is how we architect our systems: the PL-independent ways we create and organize our systems into interfaces and components. That is where I see the software practitioners of today flailing - and no PL is going to save us there.

We need to have a better way to analyze systems on these architectural measures and a better way to train people to build better architectures, not more PLs.


unfortunately. I see PLs as mere material

That's only part of their point. It's as if we were in the building industry, and we needed new tooling for each specific material. When I'm at a hackspace, the same chop saw will let me cut wood, delrin rod stock, steel linear rails, and aluminum extrusion. The same drill press and bits will operate just fine on most of the above as well. What we have in the programming world is a situation where I'd need a different toolset for every material I'd listed above.


> A way to create a new language in just a few weeks

As a Lisp programmer, I regularly create new languages in a matter of hours.


Do you mean macros? Could you re-implement a language such as C as a macro? And have it be about as fast as GCC? I didn't think they were that sophisticated.


A Lisp macro is simply a function from an AST (in s-expression form) to an AST. Other than that, it's an ordinary function and has full access to the power of the language.

Implementing C as a parser that builds s-expressions followed by a collection of macros to translate it into Lisp is certainly possible; I did it in 1984. I wouldn't say it was trivial, though; nor was the performance anything special (even then).

I will definitely take a closer look at Graal and Truffle.


Yes, you could. You'd implement a program to take C code and emit an equivalent Lisp program (i.e., parse C and emit the tree into Lisp), then compile the Lisp program. The annotations and care the parser/semantic analysis code uses will provide the semantics of C.

This will, however, not "play" like C without a great deal of work. You will be dragging an entire Lisp VM around and have to shake all of that out to actually produce something as lean as a C program at the end. You're probably best off taking the Lisp compiler and building an "independent code" generator to produce binaries that don't rely on Lisp. But I'm waving my hands; you'd need to talk to the SBCL or CCL development teams there.


That doesn't sound significantly easier than what we've done to implement C using Truffle. I'm not sure I buy this argument that what we've done could have been done trivially using Lisp macros.


Honestly, the front end would be about as hard, unquestionably! But the Lisp ecosystem is already in play, whereas you seem to have had to build that ecosystem. That is to say, it's not a major effort to scrape through the Lisp systems to understand what is going on, whereas your ecosystem with Graal etc. required a great deal of work, it seems to me. Would that be correct? Or was it pretty plug and play after you had JVM bytecode?


Why compile to lisp in the first place? Build a nanopass framework (https://www.google.com/url?sa=t&source=web&rct=j&url=http://...) that acts on sexprs, and then emit the thing to a target of your choice. The advantage is that Lisp is really good at transforming sexprs, and macros make writing the infrastructure for that sort of compiler just that much easier.


SBCL can compile lisp to native code. :-)


Well, yeah, but as you said, you have to haul the runtime around with you.


It's pretty small, as runtimes go these days.


Could you re-implement a language such as C as a macro?

You can do that pretty quickly in the Io language (admittedly, that's not Java)


You cannot. Because Lisp is homoiconic, you don't code in (what I personally believe to be) reasonable syntax, you code in abstract syntax trees. Lisp is so easy to metaprogram because everything is parentheses and the burden is on the user, not the parser, to determine the program's abstract syntactic structure. So, Lisp can only metaprogram its own syntax, you can't introduce a C-like syntax.


> You cannot.

You can. Reader macros let you put whatever syntax you want on top of Sexprs. For example:

    Welcome to Clozure Common Lisp Version 1.10-r16479M  (DarwinX8664)!
    ? (spark-init)
    #P"/Users/ron/devel/spark/ergolib/init.lisp"
    ? (require :parcil)
    ...
    
    ? (in-readtable infix)
    |NIL|
    ? infix(x=1.23+4.56)
    5.79
    ? sin(x*x+1)
    0.03341171
    ? 
The code for this is here: https://github.com/rongarret/ergolib


For sure, I was just considering normal macros.


No, you were just flat-out wrong.

> Lisp can only metaprogram its own syntax, you can't introduce a C-like syntax.

This is a flat-out false statement, and it reflects a deep but sadly common misunderstanding of how Lisp works and why it's cool. Even if Lisp did not have reader macros as a standard feature, you could still write a C compiler in Lisp more easily than you could write one in any other language.

The whole concept of "metaprogramming a syntax" (so that you can talk about whether or not a language can "only metaprogram its own syntax") is nonsensical, a category error. Metaprogramming is simply writing programs that write programs. What makes Lisp cool is that it separates the syntax from the program. A Lisp program, unlike programs in all other languages, is not text. A Lisp program is a data structure. Lisp happens to define a textual surface syntax (S-expressions) that allows you to easily convert text into the particular data structures that are Lisp programs (and also, incidentally, data structures that are not Lisp programs), but you can also produce these data structures in other ways, like, for example, writing programs that produce them.

The textual surface syntax is a detail. Writing an infix parser in Lisp is an elementary exercise, and you could use that parser to parse C-like code whether or not you had reader macros. All reader macros let you do is seamlessly integrate that C-like syntax into the Lisp REPL rather than having to embed your new language in strings or read it from files.


Yes, I agree with you; I am not trying to say it is impossible to write a C interpreter/compiler in Lisp. Any Turing-complete language can do that. I'm more interested in the built-in metaprogramming capabilities, and it's evident that reader macros go far in that direction.

On a different note, would you say that Lisp's syntax is better because it makes an easy compile target (i.e. it's easier to compile C into Lisp than into x86), or because it makes writing macros/AST transformations easier? I ask because my biggest beef with Lisp has always been that I found its S-expressions less readable than other languages' syntax, but if you're arguing for Lisp's value as a compile target, then that makes a lot of sense.


> Lisp's syntax is better because it makes an easy compile target

Not quite. Lisp's syntax is (mostly) irrelevant. Lisp's design is better because you don't have to worry about syntax to use Lisp as a compile target.

There's a huge conceptual difference between

    (eval '(some code))
and (using Javascript or Python as an example)

    eval("Some code")
In the JS/Python case you are passing a string to EVAL. In the Lisp case you are passing a data structure to EVAL. That data structure has been generated by the reader (by virtue of using QUOTE), but that is not the important part. You could as easily have written:

    (eval (cons (quote some) (cons (quote code) nil)))
or

    (eval (some-function-that-generates-code))
with the point being that some-function-that-generates-code does not return a string; it returns a data structure, so there is no syntax. The whole concept has evaporated because you're not dealing with strings any more. Syntax is for humans. When you have programs writing programs, syntax just gets in the way. But when your EVAL function takes a textual representation of a program as a string, you have no choice but to muck with syntax. This is the reason the misconception that syntax is essential is so widespread: it seems essential only because most programming languages don't distinguish between READ and EVAL.
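The distinction can actually be demonstrated in Python too, using the standard `ast` module (a sketch for illustration, not a claim that Python makes this as natural as Lisp does): one `eval` consumes text, the other consumes a program built directly as a data structure.

```python
import ast

# String-based eval: the program arrives as text, so syntax is unavoidable.
print(eval("1 + 2"))  # 3

# Data-structure-based eval: build the AST directly; no text is parsed.
tree = ast.Expression(
    body=ast.BinOp(left=ast.Constant(1), op=ast.Add(), right=ast.Constant(2))
)
ast.fix_missing_locations(tree)  # supply the line/column info parse() would add
print(eval(compile(tree, "<generated>", "eval")))  # 3
```

The difference is that in Lisp the second form is the normal, ergonomic path, whereas in Python it is a verbose back door.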


Point being, you can trivially parse code of any syntax, just like in any other language. What makes Lisp different is that once you have a parse tree of sexprs, you can traverse that tree very easily: Lisp has the tools built right into the language. You can then very easily transform that tree. In fact, you can write a hook whereby the code being compiled provides arbitrary transformers, which the compiler calls when it sees them in code. That is all a macro is: an API for code to hook into its own compiler, much like Rust's compiler addons. Unlike compiler addons, however, macros don't depend on compiler internals. In fact, if a Lisp implementation didn't have macros and readtables, you could write a preprocessor to add them (though gensym/hygiene would be a pain to do).

However, this would all be very confusing if the internal tree didn't look like the external representation. This is why Lisp is written in sexprs. The reason you see so many DSLs in Lisp written in a lispy syntax is that Lisp hackers, like all other hackers, are fundamentally lazy: since there's already a Lisp syntax parser built into the language, they don't have to write their own, and macros often suffice for the most common reasons to write a DSL, which means they don't even have to write a compiler; they just hook into the existing one. Besides, sexprs provide advantages even if you are writing your own language, and you can overload Lisp's syntactic shortcuts for your own purposes.
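For readers without a Lisp handy, Python's standard `ast.NodeTransformer` is a rough analogue of such a tree transformer (a sketch only; a real macro system runs these hooks automatically at compile time rather than on demand):

```python
import ast

# A transformer that rewrites every addition into a multiplication --
# the kind of whole-tree rewrite a Lisp macro performs on sexprs.
class AddToMul(ast.NodeTransformer):
    def visit_BinOp(self, node):
        self.generic_visit(node)  # transform children first
        if isinstance(node.op, ast.Add):
            node.op = ast.Mult()
        return node

tree = ast.parse("result = 3 + 4")
tree = ast.fix_missing_locations(AddToMul().visit(tree))

ns = {}
exec(compile(tree, "<transformed>", "exec"), ns)
print(ns["result"])  # 12, because 3 + 4 was rewritten to 3 * 4
```

The Lisp advantage the comment describes is that the tree being transformed looks like the code you wrote, so such rewrites are everyday idiom rather than a trip through a compiler-internals API.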


If you want to handle a C-like language, you just need to make a parser; once you've got an AST, that is your "Lisp syntax", so you can do all of your fancy metaprogramming, interpretation, compilation, etc.

Using a parsing framework (e.g. parser combinators, parser generators, ometa, etc.) to parse a language which has a well-specified syntax (e.g. a BNF grammar) is pretty routine and mechanical; usually it just requires a one-to-one translation of the BNF form into the parser framework's syntax, then fiddling with ordering and precedence rules until your tests pass.

Starting from scratch, I could knock out a C-like parser in maybe half an hour, e.g. using parsec or ometa. I'm sure Lisps have equivalents (I know Racket has built-in support for defining new concrete syntaxes).


> Because Lisp is homoiconic, you don't code in (what I personally believe to be) reasonable syntax, you code in abstract syntax trees.

There's no reason a regular macro couldn't implement a basically C-like syntax (though it would have Lisp, not C, tokenization rules, to the extent that those are different) on what is passed as its argument. The actual call to the macro would have typical Lisp syntax, but what the macro consumes would be interpreted with whatever syntax was implemented by the macro.

(And, a reader macro could implement more deeply C-like syntax.)


Well, for regular macros, only if your C syntax was in a string. Otherwise, Lisp would try to tokenize it and choke. Heck, if your Lisp of choice offers an equivalent of Tcl's uplevel, you wouldn't even need a macro, just a function.


> Well, for regular macros, only if your C syntax was in a string. Otherwise, lisp would try to tokenize it, and choke.

Which is why I said normal (rather than reader) macros could do a "basically C-like syntax" but with Lisp tokenization rules.


What would that look like?


  ((:include "stdio.h")
  
  (const char * message = "Like this.\n")
  
  (int main ((int argc) (char * * argv))
    (printf message)
    (return 0)))


Ah.


I've never seen something like this before--do you know of any examples of this kind of macro usage?


This article provides a simpler (but conceptually similar, where it comes to implementing syntax different than standard Lisp S-expression syntax) example of using reader macros to implement JSON literals in Lisp.

https://gist.github.com/chaitanyagupta/9324402


I'm vaguely familiar with Lisp, but I'm not a Lisp programmer. Can you explain this comment?


Lisp languages are easy to build because of their simple syntax. It's also easy to use a Lisp to build a Lisp, since it is homoiconic (this is what you're doing with the macro system).


It's much the same in Smalltalk. Often, one could create a domain-specific superset of Smalltalk, complete with its own control structures.


I was thinking that myself. If they had started with a modern Lisp implementation, they would be worlds ahead. :-/


Here is a video of a Graal/Truffle talk given by a researcher at Oracle (presented at Mozilla in 2013):

https://air.mozilla.org/one-vm-to-rule-them-all/


This is quite outdated. Here is an overview of Graal/Truffle papers and tutorials: https://wiki.openjdk.java.net/display/Graal/Publications+and...



Yes, this is brand new. Have fun!


> Interpreted dynamic languages like Python, JavaScript, PHP and Ruby look the way they do because building such a language is the path of least resistance when you start from a simple parse tree.

This looks... backwards?


Especially if you want to talk about ruby as a language with a simple parse tree.


So, judging by the slides for that "one vm to rule them all" talk [1], it's Java all the way. One has to generate the AST in the form of Java code and deal with the Java ecosystem. And it all feels very complicated for all the wrong reasons.

I guess some people from the Java world could find something useful there, but it's very unlikely to attract anyone else.

[1] https://lafo.ssw.uni-linz.ac.at/pub/papers/2016_PLDI_Truffle...


Well now, I have got to find time to implement Joy (lang by Manfred von Thun) in this and see if it works. (I can't tell just by thinking about it, at least not so far. It will either work and be great, or not-work and be really really interesting why not.)


This article, while very interesting, completely omitted the .NET CLR. Does anyone know how the CLR compares to the JVM, especially the new CoreCLR? How mature is it in terms of using best-of-breed JIT optimizations?


FWIW, the Io language is a pretty good prototyping language too; good for experimenting with different syntaxes, etc.


One word: Oracle.


To be fair, Oracle gives you things such as VirtualBox, MySql, Java, Netbeans etc for free. That's not too bad for an 'evil' company.


No, it doesn't. They were made by Sun Microsystems; they are free because of Sun, not Oracle.


That's just not true.

Consider all the improvements to the above software that came after Oracle's purchase of Sun. There was a boatload of them and they keep coming. Oracle is paying for that.

They are all free, and that is because of Oracle, not because of Sun, as Sun does not exist anymore.


Improvements? How much? They ruined OpenOffice, triggering the LibreOffice fork. They ruined MySQL, triggering the MariaDB fork. The original Sun would have done 10X better improving those things they made.


They were projects from Sun Microsystems originally, but Sun Microsystems doesn't exist anymore. Oracle has chosen to keep some of the Sun products open/free (it has closed support for others, such as OpenSolaris, which was continued through an open-source fork).


Funny, I posted this yesterday and got no up votes. Can you post a duplicate of some post and get up voted? I didn't realize that was possible.


Take a look at the Medium URLs and you'll see they have a fragment ID and so count as different submissions.


I wondered why this didn't get merged with my submission. Makes a neat example of how random getting on the HN front page is though. Although I'm sad that randomness chose to give the second posting karma instead of mine.


What was your title? Without a good one, your post might have been just skipped by most.


Exactly the same except including the "radically" from the blog post. Wow I missed out on 400+ karma at this point... Oh well, at least they are only internet points.


These people and their gamified minds... ;)


Like Digg before it, HN can be easily gamed:

you have sizable groups of people who work like one team, one person submits and others up vote

In the Digg days I was approached by some guys in Eastern Europe who offered to get me diggs for $$. I declined. I was hoping HN is harder to game, but I don't see any evidence that supports that.


[flagged]


Do a little googling and you won't get a downvote: https://news.ycombinator.com/item?id=7494732

dang's profile https://news.ycombinator.com/user?id=dang with contact information.


At the very bottom of every Hacker News page, there is a contact link: hn@ycombinator.com


There is a contact link at the bottom of the page : hn@ycombinator.com



