Hacker News
Carp: a statically typed lisp, without a GC, for high performance applications (github.com/eriksvedang)
196 points by adamnemecek on July 6, 2016 | hide | past | favorite | 67 comments



Is there an explanation of the memory allocation used? There seems to be a GC (https://github.com/eriksvedang/Carp/blob/master/src/gc.c), maybe used in the compiler only, coupled with lifetime analysis (https://github.com/eriksvedang/Carp/blob/8665a7f9a19d9347cc0...), which as far as I know is used once (https://github.com/eriksvedang/Carp/blob/6107e619d4f632acace...) in the compiler. This analysis seems tied to the manual "reset!" special form, which apparently frees any existing data bound to a symbol. So, what happens when lifetime analysis cannot determine a lifetime? What about stack allocation? What should I worry about when programming in Carp w.r.t. memory?


As far as I understand from listening to Erik talk about Carp, there is an interactive interpreter which does use GC. In contrast, the compiler generates GC-free code.


> In contrast, the compiler generates GC-free code.

Yes, but the question was how it does that. Does it reject code where it cannot reliably detect lifetimes? Or does it generate code that just does not deallocate (i.e., that leaks) objects whose lifetime the compiler can't determine? Both of these are "GC-free"...


The compiler does lifetime analysis, and there is a "reference borrowing" concept like in Rust.


So it rejects code for which it can't prove lifetimes. Why not just say so? There's an uncomfortable degree of beating around the bush in this thread.


That's not quite how I'd put it. The documentation is pretty scarce, but if it's anything like Rust, you have to be explicit about ownership and somewhat explicit about lifetimes.

That is, it's not just "rewrite your program until the prover succeeds".


I'm curious whether the semantics of this subdomain of "allowed/legal" programs will be easy to understand, or whether it will be "try and see, on a case-by-case basis, if your code compiles". Still, one way or the other, users will probably have to gradually learn and build intuition, as with any other aspect of any programming language.

Also: is the lifetime checking done (or will it be done) in the same way in the interpreter, even though it uses a GC? Otherwise these are, in effect, two significantly different dialects of the language.


It will require roughly the same amount of practice as when learning Rust, so a pretty significant learning effort I'm afraid.

I'll try to do as much checking as possible in the interpreter so that the transition from interpreted to compiled code is smooth. Right now there is none though.


Very simple. The compiler, written in Lisp (with a GC), just generates C code, which is then compiled and executed. The generated code has no GC; it just tracks the variables.


No, not simple. Analyzing an arbitrary lisp program to determine the lifetimes of all allocated structures reduces to the halting problem, so it's undecidable. You must either restrict the input program or tolerate some leaks.


>arbitrary lisp program

Well, Carp is a dialect of Lisp so it's free to impose whatever semantics it wants. It's not a dynamically-typed language (where it would indeed be impossible to statically manage memory).


Yes, only the statically compiled code is GC free.


Very simple. Each C-allocated structure is allocated at the start of its C scope and freed at its end. Since the compiler knows the scope of each variable, it's trivial.

The FFI is of course harder: the compiler generally can't know whether an extern call allocated something. That is undecidable for the compiler, so it has to trust the user/program. Other Lisps have an attribute for such externally malloc'd args.


No, really no. First of all the values of the variables can escape their scope, e.g. when used as a return value. Secondly, some child scopes outlive their parents, e.g. closures. As the parent comment correctly points out you'd literally need to solve the halting problem to infer all the lifetimes for arbitrary code.


tracking the scope of vars is a simple compiler job (yes, also across functions, when used as byval or byref args or return values) and not an uncomputable problem. please stop spreading fud.

you only need a GC to recycle cyclic data structures, with links back to previous nodes. since Carp does not create such data structures in the generated C code the reclamation scheme can be kept simple.

heap vars that must be kept global (such as the closed-over upvals you are complaining about leaking) are kept global and released at the end. every dynamic language keeps its functions and closed-over values global.

however, lexical closures are released at the end of scope, and remaining tracked heap vars are copied.


> heap vars to be kept global (such as closed-over upvals you are complaining about to be leaking) are kept global and released at the end.

In other words, they leak if the program no longer references them.

> every dynamic language keeps its functions and closed-over values global.

Yes, but most of them collect closed-over values when they are no longer used. Using a garbage collector. To avoid memory leaks.


Oh my, have you ever implemented a dynamic language?

Tracked global vars are of course free'd at the end of the program. There's no leak.

Second, closed-over values inside global closures can only be freed at the end. A common problem in non-GC'd languages, which can only be fixed by adding a GC. Yes. The goal here is to avoid a GC. So don't reference big memory inside closures (as an upval or return value; local is OK), or undef them explicitly, as in every dynamic language.

Using local closures would fix that, but Carp has no syntax for that, I think. It would need assignment from lambdas. Most langs prefer a simple global defun variant instead. Lua got that right; that's why Lua is the preferred game VM. But being able to use the simpler Lisp syntax should be preferred.


> problem [...] which can only be fixed by adding a GC

Cool, we're agreed then.


Thanks. Is this talk online? I can't find it.


Sorry, it was not recorded.


I understand this is marked as "a research project", so any answer here is possible and must be accepted; that said, I'd really love to learn at least about the plans with respect to the following questions:

• Regarding FFI, is it possible to pass pointers to Carp functions/closures as arguments to external dynamically loaded functions? Are those features available in the REPL?

• What's the story/approach regarding struct inheritance vs. composition? Also custom types and implicit type casting?

• What are the plans regarding multithreading/parallelism? Esp. interesting with respect to memory handling.

At its current state of claims, the language sounds awesomely close to the "holy grail" for me at first glance — though with a fair bit of vagueness and uncertainty.


1. Yes, and yes!

2. No plans for inheritance at the moment, probably some kind of interfaces, but that's not a top priority right now. Custom types are limited to structs; I will add union types soon.

3. This has not been decided yet; I want to do more research before settling on any particular solution. It will be a high priority though, since I want to use it for the games I write.

Thanks, Erik


Thanks!

Re 1: are there any examples of this in the codebase? I haven't seen anything in the "examples/" subdir, did I overlook, or should I dive somewhere into the compiler/runtime codebase?

Also, some more questions still, if you wouldn't mind:

• Could you possibly answer junke's question? https://news.ycombinator.com/item?id=12041555 I didn't repeat it, as I hoped you'd see it too;

• Do you have tail-recursion? Also, to tell the truth, a list of "what we don't have yet" and also "what we don't plan to have" would help to confront my dreams with reality...

Regarding (2), personally I like Go's/Rust's approach, but I'm not an expert in the domain by any means, and I understand everybody has one's personal taste and view. Just couldn't resist, sorry :)

edit: Also I see you're the author of Else Heart.Break() - awesome, congrats!


No tail recursion at the moment but it's something I'd be interested in adding later.

I'm being very conservative about adding different type constructs, but sure, Rust & Go certainly get a lot of things right in that department, and I'll consider their solutions for the future development of Carp.


Neat to see a contender to replace what should've had plenty of adoption in this area:

https://en.wikipedia.org/wiki/PreScheme

I still don't see a clear answer to junke's question despite a huge thread. Let's be specific: "Where no GC is used, how exactly do you handle the issue of memory safety? Is it no safety like C, a specific analysis like Rust, or something different entirely?"

Far as concurrency, you probably know about Rust's scheme. The ones that came before it that you might want to consider were Ada Ravenscar and Eiffel's SCOOP.

https://en.wikipedia.org/wiki/Ravenscar_profile

http://www.sigada.org/ada_letters/jun2004/ravenscar_article....

https://en.wikipedia.org/wiki/SCOOP_(software)

http://cme.ethz.ch/publications/

SCOOP might be the more interesting of the two for you. It was deployed in real-world apps in Eiffel first. Then a team ported it to Java. It had a performance penalty. One of the many works in publications... don't remember which... did something ("slices?") that knocked out almost all of that overhead. Another model-checked it to eliminate a few errors in the SCOOP model. I see one on message-passing. So much badass work and results on the SCOOP model that I've been annoyed it was ignored by the mainstream for so long.

So, I hope those help in your thinking on tackling concurrency. Personally I'm glad I looked it up for you, as I found this very interesting project for safe use of GPUs:

http://se.inf.ethz.ch/people/poskitt/publications/Kolesniche...


Thanks for the links!

The short answer is that safety is handled similarly to Rust, using lifetime analysis. It's quite a bit simpler than Rust at the moment, though. Also, some checks are done at runtime (like bounds checking on arrays, when turned on).


Using the information on dependent types and some inference to eliminate run-time type checks in those cases would be a nice type-checker job, e.g. for the sicp/solver.


Ok. There we go. It's actually what I had in mind for PreScheme. You just gave me more confidence in it.


Is the plan for this to be a stand-alone gaming language or a language for embedding into a gaming engine or framework like Lua? Obviously built-in threading constructs are less essential for the latter.


Primarily stand-alone.


Could you pretty, pretty please allow for an option to have automatic parentheses insertion, similar to how javascript will insert semicolons, making them optional? This is almost exactly the language that I've been planning for the last six months, and if you would be so kind as to implement that one feature, it would save me a lot of reinventing the wheel.

I was originally envisioning having the compiler being able to infer parentheses based on code indentation.

Also have you considered compile time AST evaluation and simplification?


If it takes off, and fulfills all peoples' dreams and brings rainbow-farting pink ponies to Earth, then I suppose there might come lots of parser butchering by people and fitting it to their likes via forks/frontends - me, for example, I'd love it to look more Rebol-like ;)


Maybe you can use "Readable Lisp S-expressions".

http://readable.sourceforge.net/


If you don't want the parentheses, then why not use reverse Polish notation?


unless you're thinking of something that I'm not thinking of? RPN only solves this when the operators are known and used exclusively in operator context and can be relied on as implied close-parens


Looks like just what I have been trying to find, since I am incapable of writing my own. I did start with the 'Learn C & Build Your Own Lisp' book [1], and I will finish it for educational purposes; it is a great, short book to work through, and should help with grokking Carp.

The closest I have come to finding something similar for livecoding visuals and audio, or games, is Extempore [2], which has xtlang, a Scheme with manual memory management and types. The only two criticisms I have, and they are small, are the build and number of dependencies, and how to distribute standalone executables. In all fairness there are binaries now available for all platforms, but standalones are still problematic.

I curse Fluxus [3] for starting me down this crazy road many years ago in search of the best creative coding environment. Shame it is not refreshed every so often, because the interface is brilliant.

I'll have to look over Carp more closely this weekend. I am intrigued by how it resembles Rust's semantics. And if it is manual memory management all the way, then I guess that is where it differs from Rust's safety features?

Now I don't have to try Clojure again! I don't like the JVM, and prefer going to C.

[1] http://www.buildyourownlisp.com/

[2] http://extempore.moso.com.au/

[3] http://www.pawfal.org/fluxus/


You should have a look at the Hypergiant library for Chicken Scheme. It compiles to C as well. https://github.com/AlexCharlton/Hypergiant


I did. I'll have to revisit it. I gave it up since I work mainly on Windows and Linux, and I think I couldn't get one or more of the dependencies working on Windows. I really like Chicken, but with Chez Scheme having gone open source, I use it now when I am working in a Scheme.

I've also looked at Lobster, which is now statically typed by default, not optional.


I saw this a month or two ago and submitted a pull request that I still need to fix up to have it accepted. My original intention was to benchmark a bit against SBCL, but I got sidetracked with the pull request, then lost time and then forgot about it. I need to get back to it and see about getting it submitted.

Anyway, it sounds neat, but even after playing with it a bit, I'm not sure how necessary it is.

SBCL (and supposedly the commercial Common Lisps) are pretty fast nowadays. Not C fast, but I can usually get about 1/2 - 1/3 C speed and 4-8x faster than Python without doing anything special.

For example, I made an animation app (https://github.com/jl2/qt-fft-viz) that reads an MP3 and generates an animation while it plays back, and I get 90+ fps. It's not a big graphics/CPU intense video game, but it's doing a lot of FFT calculations and some basic graphics.

So, not to diminish the Carp project, but I think regular old Common Lisp is faster than most people think, is better supported, has more libraries, and works on more platforms and operating systems.


I believe one major motivation behind Carp is the lack of garbage collection. Or rather, avoiding GC pauses. So maybe you could benchmark Carp vs. SBCL in that area?


Interesting.

> Carp borrows its looks from Clojure but the runtime semantics are much closer to those of ML or Rust.

> – https://github.com/eriksvedang/Carp/blob/c762e9ff07544b40c8d...


To the author: please, please, please port (or reuse) LuaJIT's C-header parser for FFI; it's the best FFI interface specification method I've seen among JNI, Python, V8 and even C# (because you can just use the C header, with support for typedefs, unions and packing).

It really looks like a great language!


Interesting, thanks a lot for the tip!


Also check out Pixie's - towards the end of https://www.youtube.com/watch?v=1AjhFZVfB9c. It's a really interesting technique, and for a Lisp too.


This is fantastic. AFAIK there are no Lisps out there that are designed with games in mind, which is a huge shame, because I often think how perfect one would be for game dev (Arcadia and Nightmod spring to mind, but IMO they're shoehorns. Good ones, but still).


There are a couple of high-performance, low-latency lisps, either with a low-latency GC or refcounted or compiling to C as Carp does.

ECL (also typed), Vicare, Larceny, Ypsilon, the new guile, Gambit-C, Chicken, Chez, Bigloo, stalin, Corman CL, or fast typed CL: sbcl, franz, lispworks and few more.

Or the ones compiling to LLVM, the JVM, or .NET. Most prominently Clojure, with a very unlispy syntax.


And note that you can simply get a low-latency GC with libgc (Boehm-Demers-Weiser) by using the incremental GC via GC_enable_incremental(), and SMP with parallel marking phases via -DPARALLEL_MARK. http://www.hboehm.info/gc/scale.html This is not some voodoo that nobody has ever managed to do, even if most blog posts for popular not-invented-here languages state the opposite.

Of course, libgc is pretty simple and slow. It's good for foreign code (i.e., the FFIs needed in games), but slower than Cheney two-finger copying allocators, which I got down to 10ms, compared to 100-300ms for mark-sweep. But copying collectors need 2x memory, so they are not usable for small devices such as phones. And they are a bit complicated for foreign memory, which prefers conservative mark & sweep.


There's (or was) Game Oriented Assembly Lisp: https://en.wikipedia.org/wiki/Game_Oriented_Assembly_Lisp


PreScheme was designed to replace C in general. It's just been sitting there with no love or further development.

https://en.wikipedia.org/wiki/PreScheme

I imagine some techniques behind Chicken Scheme in area of optimization might help it too.


Hmm I actually like the fact that the arguments of a function are defined using square brackets. It just reminds me of Clojure, especially the "defn" macro, which sounds pleasant to me.

I wonder what Lispers would say about that, since I heard that not having a homoiconic syntax might cause some trouble at the metaprogramming (macro) level.


It is homoiconic, because the data structures used to describe code are also data structures provided by the language. However, brackets mean that you pass arguments as arrays instead of lists: there is little reason except aesthetics to have this distinction in source code. I prefer parentheses everywhere, but some people really dislike that.


Yes, it's still homoiconic.

I think vectors for argument lists is mainly a usability affordance. Which gives you aesthetics for free. (If you happen to find it more aesthetic.)

> Common LISP and Scheme are not simple in this sense, in their use of parens because the use of parentheses in those languages is overloaded. Parens wrap calls. They wrap grouping. They wrap data structures. And that overloading is a form of complexity by the definition I gave you. (https://github.com/matthiasn/talk-transcripts/blob/master/Hi...)

Elsewhere, he critiques Clojure's use of vectors for argument lists: because a vector implies order, every caller must put arguments in the right order. (To decomplect, you'd use maps instead of vectors. But vectors win you brevity. Many notice that as argument lists grow longer, maps become increasingly attractive.)


Take C:

    struct {
       int x;
    } ...;


    int foo () {
        return 0;
    }

Should braces in both cases represent the same internal data structure? Probably not: the first is for structure members and the other for a block of statements. But in Lisp, the same characters parse to the same kind of data, which means they are the same structure in all contexts. The Lisp reader has a simple approach to parsing; you don't generally change the readtable's bindings based on which form you are reading (though strings, comments and code are not read the same way). Using different characters for semantics has its limits.

In Clojure, take [a b] out of context. Which element is evaluated, a, b, none or both? The fact that it is a vector does not help you, because it could be a binding, an argument list or a vector constructor.

Also, how something is stored inside the AST has nothing to do with its meaning at runtime. Argument lists could be passed using the stack, but you don't actually write a stack inside the source code. The fact that they are written as vectors or lists does not matter either, you still have to know the semantics. And semantics is almost never context-free.


I found that definition, applied liberally, lost its effectiveness.

I've never found the use of parentheses to be an inhibiting factor in the complexity of Lisp code. It's a piece of syntax that has a single use and isn't ever overloaded. A parameter list... is a list. A form... is a list with a specific structure. Surprise. The HyperSpec is very clear on this (i.e., there is no overloading).

I don't think his justification for [] was really necessary. If you want a different syntax for the defun macro, you're free to have at it in Lisp. Clojure did nothing special there. Most Schemes let you interchange brackets if you want to. You could easily write your own defn macro with your preferred syntax. It is no great innovation to use a different syntax, so I don't know why he bothered making that remark about Lisp.


In most Scheme implementations, which this Lisp somewhat resembles, it's permissible to use "(" or "[" as long as the opening bracket matches the closing one. With that choice, each user can have the "esthetic" that is preferred. Personally, I agree that using parentheses exclusively is nicest, but it's not that big a deal either way.


You are right about Scheme. Note that Carp, like Clojure, treats brackets as arrays (https://github.com/eriksvedang/Carp/blob/master/src/reader.c...).


I'm not familiar with Clojure, but briefly reviewed the Carp documentation, sparse as it is. The part about "[]" meaning array hadn't quite sunk in when I commented about the syntax. Thanks for pointing that out, the Carp syntax makes perfect sense. I think it's a project worth keeping an eye on as it looks like it could have interesting capabilities.


That's brilliant... I'd always assumed lisps were so tightly bound to the GC that any other memory model was infeasible.


you might enjoy this classic then: http://www.pipeline.com/~hbaker1/LinearLisp.html


https://github.com/vsedach/Thinlisp-1.1

  ThinLisp is not a typical Lisp implementation in that it does not implement a
  garbage collector or many of the other run-time development features of other
  Lisps.  ThinLisp is designed for producing high quality deliverable C libraries
  and executables from Lisp sources.  Originally designed for real-time control
  applications, ThinLisp stresses run-time performance at the expense of some
  development time conveniences.
Prescheme

  http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.3.4031
Schlep

  http://people.csail.mit.edu/jaffer/Schlep/scm2c.html
BitC

  http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.570.5677&rep=rep1&type=pdf
and various others...


The idea of getting rid of GC is very exciting, as GC has a lot of drawbacks. I want to study Carp to see how this was achieved. Incidentally, today I also found Bone Lisp https://github.com/wolfgangj/bone-lisp/blob/master/README.md which has the same goal.


This looks amazing and very close to a project I've been working on myself - a clojure-ish lisp-1 with Rust style memory management. Very cool work!


I'm fishing for a pun in there but just can't catch a big one.


Is there a compiled binary for Windows yet?


Are all those round brackets necessary? I understand the goal is Lisp-like syntax, but while reading the OpenGL sample I found the enclosing brackets quite unintuitive. Since this is still supposed to be a new language, maybe a "statically typed Lisp with Python-like blocks" would appeal to a larger audience?


One of the nice things about s-expressions ("all those round brackets") is that they're well suited to meta-programming.

For example, you can automatically convert back and forth between s-expressions and something else, like i-expressions http://srfi.schemers.org/srfi-49/srfi-49.html


This sounds like a good idea at first, but I dare you to take a non-trivial Lisp function and "pythonize" it. It instantly becomes an indentation nightmare. Lisp function calls are highly nested and rarely one per line.





