A Haskell Programmer Tries to Learn Racket (artyom.me)
257 points by jackhammer2022 on April 15, 2014 | 156 comments



"...but nothing beyond that. As long as Node.js exists in this world, I can't truly hate anything else."

I found this hilarious. I am also rather underwhelmed (to put it nicely) by Node.js and a little bothered by its wide adoption.

I have also been learning Racket recently; my formal languages and functional programming class uses it. I had some previous experience with Common Lisp, but the raw nature of Scheme still pleasantly surprised me a bit.

EDIT: From what I remember, JavaScript was "inspired" by Scheme. Obviously that went well...


I keep flipping back and forth, thinking I don't get it, then thinking others don't get it. Single-threaded JS with callbacks. OK? What am I missing?

The only thing Node has going for it is a lot of networking libraries. It's not fast, the async style leads to callback hell, JS is a terrible language, and their package management system is slow and bloated (or at least the packages are)[1]. Yet people think it's some new kind of thing that's just so awesome, despite MS doing server-side JS in the mid-90s.

Sure, people can use whatever makes them happy. I'm just confused as to why anyone is particularly impressed with Node or JS.

1: Seriously, to automate builds, I ended up saving the node_modules directory and then symlinking it into the build environment. Otherwise a simple CSS/JS/etc. grunt build took 15 minutes instead of 1. And it seems every module pulls in its own copy of its dependencies, so you end up with 300-character deep paths. There's no apparent reason it needs to be like this.


Straight from the horse's mouth:

"Node is popular because it allows normal people to do high concurrency servers. It's not the fastest or leanest or even very well put together - but it makes good trade offs in terms of cognitive overhead, simplicity of implementation, and performance. I have a lot of problems with Node myself - but the single event loop per process is not one of them. I think that is a good programming model for app developers. I love Go so much (SO MUCH), but I cannot get past the fact that goroutines share memory or that it's statically typed. I love Erlang but I cannot get the past the syntax. I do not like the JVM because it takes too long to startup and has a bad history of XML files and IDE integration - which give me a bad vibe. Maybe you don't care about Erlang's syntax or static typing but this is probably because you're looking at it from the perspective of an engineer trying to find a good way to implement your website today. This is the source of our misunderstanding - I am not an app programmer arguing what the best platform to use for my website--I'm a systems person attempting to make programming better. Syntax and overall vibe are important to me. I want programming computers to be like coloring with crayons and playing with duplo blocks. If my job was keeping Twitter up, of course I'd using a robust technology like the JVM. Node's problem is that some of its users want to use it for everything? So what? I have no interest in educating people to be well-rounded pragmatic server engineers, that's Tim O'Reilly's job (or maybe it's your job?). I just want to make computers suck less. Node has a large number of newbie programmers. I'm proud of that; I want to make things that lots of people use. The future of server programming does not have parallel access to shared memory. I am not concerned about serialization overhead for message passing between threads because I do not think it's the bottleneck for real programs." https://news.ycombinator.com/item?id=4310723


> The future of server programming does not have parallel access to shared memory.

Yeah, but there can still be I/O parallelism. Two instances of a callback chain C1->C2->C3 can be started such that the second starts before the first one has finished: C1->C2 runs, then C1 gets called again for the second instance. If those callbacks update a shared data structure (a database record, say), you have effectively accessed that data in parallel, and you have to protect against it with some kind of lock/mutex. Sure, context switching can't happen at every assembly instruction, the granularity is much coarser, but it is still there.
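
Here's a minimal sketch of that lost-update scenario (written in Racket to keep with this thread's language rather than JS, with a toy queue of thunks standing in for the event loop; the queue, withdraw handler, and balance are all made up for illustration):

  #lang racket
  ;; A toy single-threaded "event loop": a queue of thunks, run one at a time.
  (define queue '())
  (define (enqueue! thunk) (set! queue (append queue (list thunk))))
  (define (run-loop)
    (unless (null? queue)
      (let ([next (car queue)])
        (set! queue (cdr queue))
        (next)
        (run-loop))))

  ;; Shared state touched by a two-step callback chain: C1 reads, C2 writes later.
  (define balance 100)
  (define (withdraw amount)
    (define seen balance)                 ; C1: read the current balance
    (enqueue! (lambda ()                  ; C2: write, based on the stale read
                (set! balance (- seen amount)))))

  (withdraw 10)   ; first chain's C1 runs...
  (withdraw 20)   ; ...then the second chain's C1, before either C2
  (run-loop)
  balance         ; => 80, not 70; one update is lost even though there is only one thread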


There's some need to synchronize, but hot damn is it simpler when you're dealing with sensible blocks of high-level statements than when you're dealing with out-of-order parts of assembly instructions.


That's true, but personally I still don't find this model ideal because the synchronization points are implicit. The cleanest concept I know of for dealing with synchronization is transactions.


If you don't want shared memory... then don't use it. Other languages are not stopping you from copying objects and using a message passing system.

And it seems rich to complain about syntax when the language of choice has "function" as its lambda operator.


Such a wonderfully confusing and self-contradictory quote. I love it.


Why do you think it's self-contradictory? I found it fairly consistent.

I think the most important line for me is about wanting computers to be like lego blocks. I think that's an admirable goal and I think that if Node as a community of programmers and projects owned that more ("We're Mindstorms, not NASA, here!") it'd get a great deal of love from the rest of the community.

Sadly, that probably won't happen, since Mindstorms sounds diminutive and no collection of people has its ego so well in check.


Here are some of the bits I read as inconsistent.

"Node is popular because it allows normal people to do high concurrency servers." vs "If my job was keeping Twitter up, of course I'd using a robust technology like the JVM." ==> So Node is not actually good for high concurrency servers?

"I want programming computers to be like coloring with crayons and playing with duplo blocks." ==> The wonderful wonderful thing about Duplo is it composes. Continuation-passing style (i.e. callback hell) is the paradigmatic example of a non-compositional whole program transform.

"Node has a large number of newbie programmers." vs "Node is popular because it allows normal people to do high concurrency servers." ==> Newbies are writing high concurrency servers?

"I'm a systems person attempting to make programming better." and "Node is popular because it allows normal people to do high concurrency servers." vs "If my job was keeping Twitter up, of course I'd using a robust technology like the JVM." ==> If Node isn't actually good for its intended use case how exactly are you making programming better?


I guess that I don't feel these so painfully because I don't think Node is necessarily successful at its goal, though I believe its goal is admirable. CPS might not be the most elegant way to schedule threads, but it is fairly simple, which is one kind of boon.

I'm also pretty sure that high concurrency means a variety of things allowing a distinction between Node High Concurrency capability and Twitter High Concurrency need. If Node allows newbies to achieve higher concurrency than other newbie tools (PHP, Rails) then I think it can achieve that task to some degree.


I mostly agree with everything you say. I think it's sad that so many in our industry use such poor tools. Racket, for example, is an excellent tool for beginners. The PLT group that produces it has long had a focus on introductory programming and has produced resources such as HtDP (http://htdp.org/) and Bootstrap (http://www.bootstrapworld.org/) yet Racket is far more capable than Node, Ruby, and the like.


Complete agreement... A world where Racket was in browsers instead of Javascript would be wonderful.


> I am not an app programmer arguing what the best platform to use for my website--I'm a systems person attempting to make programming better.

What is a "systems person", in this case? I guess it makes sense if you know who it was who said this, but I don't. Is it a programmer working with code for servers?


> I'm just confused as to why anyone is particularly impressed with Node or JS.

Low barrier to entry means anyone who used to animate jumping monkeys on webpages can now write distributed back-end systems (all web-scale, of course).

Kids haven't seen anything else. Maybe they took C++ or Java in college; then they see a kid with piercings and a messenger bag and tight pants talking about these awesome asynchronous callback-futures-based frameworks. There is really nothing to compare it to. So they are impressed.

Next level up, write a benchmark. Compute fibonacci, compare with Java, hey not too bad. Even faster than Ruby! Clearly this is the framework of the future.

Maybe read some place about how threads are bad and callbacks are great. And so on.

> so you end up with 300-character deep paths.

Hehe, too funny. Actually it just needs to be 260 characters, so 'git checkout' blows up on your unfortunate colleague who happens to use Windows.


=> Low barrier to entry means anyone who used to animate jumping monkeys on webpages can now write distributed back-end systems (all web-scale, of course).

There is nothing distributed about Node. It has IO concurrency, that is all.


I was being sarcastic ;-)


You got me!


My theories:

1. Using the same programming language on both sides of HTTP often appeals to a developer's visceral sense of elegance, or tidiness, even though this alone inherently solves no existing problems.

2. It's powered by Google's V8 engine, which means people already associate it with this "super fast JIT" thing that lives in their favourite browser. It must be efficient, right?

3. The crowned alternative is PHP, which is slowly becoming Java and alienating (with no disrespect to anyone intended) the less experienced, "let's just hack a script together" crowd.


1. There is a concrete advantage: shared code.


There are multiple concrete advantages.

I've recently worked on a few web apps which use ClojureScript/Clojure on the client/server. Prior to that I mostly worked on web apps that were CoffeeScript/Python.

The first difference I took note of was how easy and seamless it was to move EDN structures back and forth between the client and the server. Of course this is a pretty minor difference as it's not exactly hard to pass JSON back and forth from CoffeeScript (or Javascript) to Python.

But then I started using Prismatic's Schema library. Schema is essentially a lightweight (gradual/optional) type system. Schema is written in cljx, which means it works with both Clojure and ClojureScript. This means I could write my schema definitions once (also in cljx) and use them on both sides of the pipe. Schemas end up being similar to class hierarchies in an OO language. (So, for example, in one project I had a User schema and a Project schema, and Projects had an "owner" field that was a User, etc.)

This ended up being really nice because now I still get the full dynamicity of Clojure(Script) but everywhere that I'm dealing with these core "types" I can validate the schemas. So I validated them going in/out of database access functions, I validated them going in/out of the server, in/out of the client, in/out of various specific functions on both the client and the server, etc. Wherever it made sense.

Of course, in my old CoffeeScript+Python days there was nothing stopping me or anyone else from writing something like Schema and implementing both CoffeeScript and Python versions, but it would be a lot more work upfront and in perpetuity to maintain 2 versions. But be that as it may, for whatever reason, this type of tool doesn't actually get created that often in more heterogeneous environments. (And I would argue they are, on balance, not as powerful/effective/simple/etc. when they are created in those environments.)

But honestly, for me personally, I think the biggest benefit of using a single language is just the reduction in mental baggage and context switching you have to deal with. I never found CoffeeScript+Python (or Javascript+Ruby or Javascript+Java or whatever) particularly challenging or wearisome while I was working with them, but having now accumulated a significant number of hours on the ClojureScript+Clojure combo, it's very noticeable when I go back and work on old heterogeneous projects.

Of course the plural of anecdote is not data, and my experience is definitely biased by the fact that Clojure and ClojureScript are just such a pleasure to work with all around. In the end though, while I'm not personally a fan of Javascript or Node.js, I now fully understand why the people that like Javascript love Node.js for giving them the gift of single language development.


Technical nitpick: Schema does not implement a type system of any kind, but instead a contract system. Typed Clojure would be a type system.

Outside of that (+1) because yes, I completely agree.


Which you can get with far better platforms and languages that compile to JS.

And a lot of the use cases aren't sharing code or even have a client/server setup. Like crappy non-parallel build systems being written from scratch using Nodejs.

This thread's been a bit illuminating. I guess I'm not missing anything technical. Some people like JS I guess and Node exists so why not use it.


How does one share server HTTP-serving code and client DOM-manipulating code?


Facebook's React lets you write DOM rendering code that can easily run on both client and server. The framework discourages DOM manipulation, instead promoting a near-declarative style of stateless rendering. Updates to the client DOM are then done by efficient tree diffing. It's very nice.


Most likely by abstracting both into business concerns which appear in the client (manipulation for implementing a UI) and the server (manipulation for persistence and client-to-client communication).


There's a library called browserify that does it. It's not useful for direct DOM manipulation, but there are other libraries for which it is nice.


You only share the data model between the server and the client.


PHP best practices might be becoming like Java, but the language is far, far, far from anything close as far as I can see.

Again, as far as I can see it's as simple to hack together a script in PHP as it ever was. What's changing about the language that makes it more like Java?[1]

1. I tried and failed to make this sentence seem as non-passive-aggressive as possible. I'm genuinely interested, as it's not something I've come across and I'm by no means a PHP expert.


Last I looked it was class definitions and access modifiers that made it look java-esque. I'm not in the know however.


It's popular because people think other people think it's popular.

There's no accounting for taste, but, it won't last. :)


Javascript is fast enough to be usable on a server, Javascript has grown client-side, and you can share code across both. That adds up to something at least reasonably useful and unique.



Not sure what you're saying with this? That they're fairly equivalent except when you need heavy regex jiggling?


Yeah, but I've basically never seen anybody suggest that Racket is a great language for prod.


Naughty Dog has used Racket as a scripting language in several of their video games for the PS3.


Slightly misleading, they don't actually run Racket on the PS3. IIRC, they created a DSL for game logic in Racket and wrote a special compiler for that DSL.

That's still a good example of a "real" program written in Racket, but it doesn't prove that Racket is a good choice for programs that need all the resources they can get.


Ah, I didn't realize that. Very interesting.


Also, Hacker News is written in Arc, which runs on the Racket runtime. It handles quite a bit of traffic every day.

It's an ugly-ass buggy website, but that is largely to do with the implementation and PG's propensity for live-modifying production sites in the REPL and not the runtime itself.


The one good thing about the packaging system is that each dependency gets its own copy of its dependencies, so you can easily load different versions of the same package.


I'd have to check again, but I looked for specific versions of a JS file and found it repeated multiple times throughout the forest of node_modules.

If side-by-side is desired, then just suffix the folder with a version name. foo.js-1.1 and foo.js-1.0 can be in the root and reused by every package that needs either.

Maybe it does that and I didn't understand what I was looking at.


It's quite naive about it, so it will copy them over and over again. Supposedly there are tools that make everything in node_modules symlinks to the same files.

It's not exactly an advanced and polished system, but it still has a feature that is hard to get with others.


I also hate NodeJS for its package manager, NPM. I don't understand the way it manages dependencies and their folders. What the hell is 'node_modules/express/node_modules/connect/node_modules/multiparty/node_modules/readable-stream/node_modules/debuglog/'?

As a result, when updating dependencies with npm, the same dependency is downloaded multiple times.


That is a good thing when multiple libraries you depend on depend on different versions of some other library.

See http://blog.izs.me/post/1675072029/10-cool-things-you-probab...


It solves the nasty versioning issues that plague the other package managers like pip and rubygems.

Every package gets the correct versions of all of its dependencies at the cost of taking more hard drive space. Hard drive space is ridiculously cheap, so it's a good trade-off.


I wasn't going to bother reading the source article, up until I saw this excerpt in your comment. Now it is added to my reader. :-)


This sums up quite a lot about Lisps in general. I'm amazed OP got to this "insight" so fast :)

    (And this is probably Lisp's greatest weakness as well – with this level of
    possible diversity, everyone has to use the “common lowest denominator”
    simply because nobody can agree on what alternative syntax / library / etc.
    is better and should be used.)

    Off-topic: it's not enough to give everyone opportunity to improve the
    language; you have to choose the winners and promote them heavily. The rules
    of free market don't work here; people won't use the best thing, they'll use
    the one which is available out of the box and which is used by their peers.


As a lisper, I can say that this insight is very true, but it is still just a simplification. Most lisps deal with it in different ways.

In the scheme world, the way they dealt with this was by wasting a decade on making decisions about the language that should have been made in the 80s. The result is that scheme essentially split into scheme and racket. Now we have an awesome language and a nice little ecosystem that's good for teaching and research, and possibly even real work(tm). Classic scheme unfortunately is fragmented into implementations which all do things slightly differently and occupy their own niches.

In the clojure world, they have a) a BDFL who sets the course of the language. b) A very strong core community of very smart people who managed to build a nice culture based on common ideas about software and design.

In the common lisp world, because we have a very high-quality standard, implementations are almost completely compatible. Compatibility libraries make it easy to write very portable code, avoiding the scheme problem. At the same time implementations are free to experiment. The other problem of everybody developing their own little universe tends to be rare. Because common lisp is so old, we have a long history and traditions that guide future design, but don't constrict it. There is a subtle balance here. We have a lot of old examples to learn from, but we are not locked in by too many bad old decisions(not always the case, but good enough in practice).

A few examples where this does not work include utility libraries and things like JSON libraries, libraries for outputting HTML, etc. Since we don't have a BDFL we are left to figure things out amongst ourselves, and sometimes, as with utility libraries (there are dozens of them, which is ridiculous), it doesn't work. In other cases it works very well; for example quicklisp, ASDF, bordeaux-threads, closer-mop, hunchentoot etc. are either de-facto standards, or sufficiently popular to be a very good default. As with clojure, there are a lot of common ideas about what is good design in the community, and we have a lot of examples to learn from, as I mentioned.

In the end, at least in the case of common lisp and clojure, I see it as an advantage to have this "level of possible diversity"; it's what has kept lisp alive for 50+ years! The fact that lisp can adapt to each new era of software development philosophy is a great reason to study it. It will be with us for many more decades because of this.


At this point we can safely say that any programming language will be with us for many decades. Cobol (http://en.wikipedia.org/wiki/Cobol#COBOL_20XX), Fortran (http://en.wikipedia.org/wiki/Fortran#Fortran_2015), Basic (http://en.wikipedia.org/wiki/Visual_Basic_.NET#2013_.28VB_12...) are still evolving. Heck, Python is already 23 years old and I don't see it going anywhere. I'm not even going to talk about the ubiquitous C/C++ or Java, as they'll probably be around long after we're dead.

Adaptation is neat, but I don't see this as being a very solid argument in favor of Lisp - it's just the norm.


I think you're just conveniently forgetting all the languages that didn't go anywhere. https://en.wikipedia.org/wiki/Timeline_of_programming_langua...


To me, "almost completely compatible" sounds similar to "just a tiny bit pregnant". In practice, either it is compatible, or it isn't.


Show me a language with >1 implementation where at least two are completely compatible. Common Lisp has one of the best track records in this aspect. C, C++ and JavaScript are all far worse.


Not necessarily.

Java 6 and 7 are "almost completely compatible".

Most programs written in Java 6 will run in Java 7, and vice versa. But not all.


C++ is similar with regards to C code. They're compatible enough that a lot of people will write "C/C++", but they're still incompatible enough that you'll get yelled at by a lot of people if you write "C/C++".


That's very true and I think that it's related to the topic of Rich Hickey's talk Simple Made Easy [1].

Maybe what we need is to study the economics of software and come up with a system in which market outcome is promotion of good libraries. I think that the social/economic dynamics of software development play a huge role in building a successful product, both free and commercial. Has anyone studied the subject in greater detail?

[1] http://www.infoq.com/presentations/Simple-Made-Easy


I think this was a cached thought – a year ago I was reading a bit about Lisp and probably had stumbled upon what is known as the Lisp Curse.


I dream of a modular language in which languages, or dialects, can be built from a small base language, which can then be extended, and so on. Of the languages that I've seen, Racket looks the most promising. Forth might be good for this, too, but to build large hierarchies of languages seems the antithesis of Forth - or at least, Chuck Moore's - philosophy.

On the other hand, such a language might just end up as an incredibly fragmented mess of different languages, with little interoperability between them - someone makes a 'typed' version of the language, another takes that version and tunes it just enough for it to be incompatible with the original version and, in turn, all languages that are built on top of that original typed language, and so on.

Maybe incredible modularity is fundamentally at odds with (social) interoperability.


It's ironic that Haskell is such a language: it grows from a tight, small core into larger and greater abilities as you learn it even without using its extensions.

I agree that Common Lisp is a very powerful language, but I can't live with all that power uncontrollably thrown at me. Common Lisp grossly lacks self-discipline and self-limitation where it's needed.


Ugh, please let Common Lisp be. There's enough bondage & discipline languages already so it is nice to have CL on the opposite side of the spectrum.


Why would anyone need CL when we have JavaScript?! I think the latter has even less bondage & discipline in it.


And you can even sweeten it with hygienic macros ;) (http://sweetjs.org/)

...but to be honest, in everyday work Javascript feels a lot like a bondage and discipline language, because it lacks so many features and in practice you always use a restrictive "coding guideline" and a linter configured for the maximum strictness you can have, so you end up with a pretty verbose, retarded and restricted dynamic language. In order to keep your sanity and be able to work in a team in Javascript you basically have to throw away the baby and keep the bathwater to work with :)


I suspect the parent thread was dreaming of a framework where languages with arbitrary syntax can be mixed and matched in user-defined ways.

Haskell isn't necessarily that language, partly because it still requires centralized coordination of development of these "extensions" to ensure they're interoperable - that is, there is only one parser for the language and its supported extensions, and many of them are built into the compiler rather than added as libraries, except those extensions which are done through quasi-quotation, such as with MetaHaskell, or some EDSL. Even that has its own problems, and you'll have issues parsing if your quoted language happens to have delimiters which conflict with Haskell's quasiquoting delimiters `[| |]` - producing syntax which cannot be parsed unambiguously (perhaps very rare or unlikely though).

Perhaps the biggest hurdle of having a modular language is that we do not understand how to unambiguously parse the combination of two or more syntaxes. We only know that composition of two CFGs results in another CFG, but with no guarantee of unambiguity, and other parsers such as PEG rely on ordered choice, where the computer can't decide which choice you really want.

What makes lisps great for composition of languages (or "EDSLs" in market speak), is that it bypasses the parsing problem by asking you to just write your language directly in terms of the syntax tree which a parser would generate - and perhaps use macros or other functions to simplify the use of that tree. Instead of a language being vocabulary+syntax, we create new vocabulary for what would be done through syntax in other languages - and we can thus refer to it unambiguously. Similar can be done in haskell too, through regular functions and quotation.
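
A toy Racket illustration of that point: the "program" below is written directly as the tree a parser would otherwise have produced, and a small interpreter walks it (the mini-language and its forms are made up for the sketch):

  #lang racket
  ;; The "source code" is already the syntax tree: no parser needed.
  (define prog '(seq (move 10) (turn 90) (move 5)))

  ;; A tiny interpreter over that tree.
  (define (run expr)
    (match expr
      [(list 'seq steps ...) (for-each run steps)]
      [(list 'move n)        (printf "move ~a\n" n)]
      [(list 'turn deg)      (printf "turn ~a\n" deg)]))

  (run prog)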

The parse problem is only really a problem because we're stuck with this silly model of "sequential text files" to describe code, and we're required to limit our languages such that a parser can take one of these text files and make sense of it. When we break out of this model, and use intelligent editors, we can reach the point where syntaxes can be composed arbitrarily, because we can indicate where each new syntax begins and ends. Diekmann and Tratt have demonstrated how this can be done while still appearing much like traditional text editing, which they call Language Boxes.[1][2]

Language Boxes only provide the means to compose syntaxes, but handling the semantic composition of languages is left to the authors of the languages which are being composed. Haskell is perhaps a good choice of language for providing the kind of glue needed here, where we can decide where languages can be composed based on the types returned by their parsers.

[1]:https://www.youtube.com/watch?v=LMzrTb22Ot8, [2]:http://lukasdiekmann.com/pubs/diekmann_tratt__parsing_compos...


I don't think the problem stops at syntax. It's possibly an even bigger issue that mixing different language semantics can be awkward. As a big obvious example, a language where all objects are nullable will interface awkwardly with one that only has option types. Similarly, interfacing with something like Smalltalk (which uses methods for flow control) or Forth (which…is Forth) would be awkward from a language that's more like C++.

Even in an environment like the JVM which specifies a lot of stuff for you, it's awkward to call into Clojure from Java because of the semantic differences.


I wasn't implying there is no problem with the semantics, just that it's much easier to deal with when you already have the parsed trees, because they're easier to reason about with code - and we can project them unambiguously.

We already do write tools for such language interoperability for specific pairs of languages, which is often really awkward because it requires us to re-implement the parsers, and only deals with entire code files rather than specific productions in the syntax.

It's pointless composing languages unless it makes sense semantically, which would need to be done on a per-language basis (or per-production-rule basis), which is where I was hinting at using Haskell as the glue for such interoperability - because if we encode the semantics into the type system, such that one syntax expects a language box of type T in its grammar, then one should be able to use any other language whose parser returns a T, and the semantics will be well-defined for it.

It could also provide the glue for converting between nullable types and option types for example too, by requiring that a language returning a "Nullable T" be wrapped in some function "ToOption", which converts "Nullable T" into "Option T". Attempting to use the Nullable where an Option is expected would fail to parse. How ToOption is implemented is left to the author of the code.

It's much easier to have interoperability between individual production rules in different languages (which share many parts in common) versus "whole text files" which we currently have, which basically require the languages be almost equivalent to convert between them.

Also, as a result of storing the semantic information as opposed to sequential text, it would be possible for the user to choose his preferred syntax for any semantic elements in the tree, since they're just working on a pretty-printed version. Most of the concerns about "code style" disappear because they're detached from the actual meaning that is stored.


Yes, Haskell does not allow free composition of syntax extensions as easily as Lisp does. But it gets semantic composition right, using monads to encapsulate semantics and monad transformers to compose effects easily and in a controlled way. I think the latter is much more valuable.


Maybe you dream of Rebol?

[http://www.rebol.com/rebolsteps.html]


Is this... a Let's Play of learning a programming language? Because it's surprisingly effective. I learned and I was entertained.


I would definitely watch a video series in which an experienced and talented programmer tried out various programming languages for an hour and gave first impressions, comparisons, etc as they hacked something together during the video.


The idea of a twitch.tv stream sounds interesting, with viewers throwing small problems at the coder to solve as they fumble around with a new language.


In this vein I've been pondering doing a livestream of a LFS install for a while, just for funsies. I wonder if anyone would actually watch it.


I'd watch it!


Do it!


Twitch learns themselves a Haskell for great lulz?


Anything that increases the already considerable weirdness of http://www.twitch.tv/directory/all can't be bad. ;)


As others have said, this does justice to the idea of actually learning a new language... or perhaps, because it is Racket, a new family or ecosystem of languages. Anyway, if you're still curious about pairs versus lists and why anyone would use dotted pairs, I like to think of it as a place where Lisps show that they are from the age when running close to the metal was a given.

And it all goes back to car and cdr and the fact that they are (or rather were) embedded assembly language, there to give raw access to Lisp's linked memory model (as opposed to the sequential memory model of Fortran). A dotted pair has two efficiency advantages over a proper list, and both stem from the fact that the last cell of the last pair of a proper list contains 'nil (or null in Racket).

Storing two values in a proper list requires two cons cells - the first with the first value and a pointer to the second cons cell and a second cons cell containing the second value and a null pointer. In contrast, a dotted pair holds two values in a single cons cell - halving the memory requirement.

The second advantage is that when there are only two values there's no need to walk the list and test for 'null (or 'nil) on the cdr. This saves an instruction step.

Philosophically, dotted pairs allow for car and cdr to be used symmetrically. Calling cdr on a dotted pair returns the second value directly just as calling car on any list returns the first value directly. Lastly, one of the things that is awesome about Lisp is the way in which lists can model data structures, and in the case of a dotted pairs their efficiencies are available to all those structures which consist of or rely on paired values.
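
To make the difference concrete, here is a small Racket illustration (results shown in the comments):

  (define p (cons 1 2))    ; one cons cell: '(1 . 2)
  (define l (list 1 2))    ; two cons cells: '(1 2), i.e. (1 . (2 . ()))

  (cdr p)    ; => 2, the second value directly
  (cdr l)    ; => '(2), another pair that still has to be taken apart
  (pair? p)  ; => #t
  (list? p)  ; => #f   (an improper list)
  (list? l)  ; => #t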

Of course, this may be obvious and on a machine with 10+ GB of RAM not really applicable, but I find it fun to think about anyway.


My question is, given the cons pair is not a meaningful primitive datatype to modern CPUs (can't fit in a register unless you're using 32 bit atoms on a 64 bit CPU, the individual cells cannot be meaningfully manipulated while packed into a single register even in that case...), is there a good reason to use an unrestricted pair/2-tuple/2 element vector for the sexp representation? Would it be "wrong" to remove dotted pair notation and make sexps a restricted type whose cdr could only be another pair or the empty list? You'd have to give up alists, but those would probably be better served by doing as Clojure does and introducing a hash table syntax.

Other than alists, simplifying the implementation[1], and tradition, I can't think of a good reason to keep improper lists and dotted pair notation around. They aren't really that useful, confuse newcomers, and encourage the use of inefficient data structures. I haven't written much Lisp, but when I see a dotted pair (usually in the context of an alist or some hideous approximation of a struct[2]), it comes across as a "code smell" to me, like "this person is taking SICP too literally."

[1] You pay for extra reader syntax, but gain uniformity in your treatment of the car and cdr cells. On the other hand, maybe you could optimize the cdr cell a bit if it only needed to address pairs and not arbitrary atoms.

[2]

  ;; barfage
  (define (make-thing a b c) (cons a (cons b c))) ; barf
  (define (thing-field-a x) (car x))
  (define (thing-field-b x) (cadr x))
  (define (thing-field-c x) (cddr x)) ; more barf


Going a little Abelson & Sussman: the utility of cons persists because it is a powerful abstraction. Fitting into registers was, as a scholastic might say, as accidental a feature as the input line voltage of a toaster [110V @ 60Hz makes things easier in this part of the world but isn't intrinsic to making crisp English muffins].

cons is useful for dealing with linked memory structures, including singly linked lists and particularly singly linked lists where the stored value in each list element is a pointer, because it abstracts over all the pointers. Writing code to process and manipulate and pass pointers is idiomatic in C. Using pointers is idiomatic Lisp.

Your barfing code shows why cdr and car persist. They are also powerful abstractions due to their rules of combination, and probably because their rules of combination aren't constrained by a bias toward the English language. There ain't no synonyms for "caddr" and "cdddr" in English or Python.

The idea that lists should be everywhere replaced by structs is "assembly in any language" thinking (also known as C). Lists provide a standard interface. Each struct definition creates a new data type.

It is better to have 100 functions operate on one data structure than 10 functions on 10 data structures. Alan J. Perlis, Epigram 9.


I think the question is more along the lines of "Should the core data type be the cons cell rather than the list?" Clojure takes the approach the GP was talking about. It has lists with heads and tails, but it does not have dotted pairs. The tail of a list is always a list.


> There ain't no synonyms for "caddr" and "cdddr" in English or Python.

In Python:

    x[1][1][0] # caddr
    x[1][1][1] # cdddr


It has pedagogical value to define things in terms of a very small kernel language. You will find things like alists in fairly advanced books on lisp (e.g. Lisp In Small Pieces) simply because it reduces the number of extra concepts you need to explain. It's assumed that the reader can figure out how to make the code cleaner and more idiomatic as long as they understand the ideas being presented.

Also I have made use of improper lists in my code, usually when I need to do something with multiple values and I don't want to mess with multiple return values, or if I just want to create a list of tuples for some reason and don't feel like defining an actual struct (can't think of a good example right now...)
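
For instance, something like this (a made-up Racket example of using bare pairs where defining a struct would feel like overkill):

  (define people (list (cons "alice" 34) (cons "bob" 27)))
  (assoc "bob" people)         ; => '("bob" . 27)
  (cdr (assoc "bob" people))   ; => 27, no struct definition needed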


>Also I have made use of improper lists in my code, usually when I need to do something with multiple values and I don't want to mess with multiple return values, or if I just want to create a list of tuples for some reason and don't feel like defining an actual struct (can't think of a good example right now...)

I get what you're saying, but isn't this just a shortcoming of the language? It's pretty common to want to write something like

  (define (string-lookup-that-might-fail)
    (if (random) <"string you were trying to look up" nil> <nil "It wasn't there.">))
  (define <some-string error-msg> (string-lookup-that-might-fail))
without wanting to destructure a list or use the call-values hack. That's a bad example, just pretend there were multiple reasons the lookup could fail.

I totally agree about there being pedagogical value in having a simple core. I'm torn between appreciating minimalism and believing that it's ok to introduce complexity to a language that people will use to get things done every day, so long as it's useful complexity and not just historical accident.


Okay yeah I agree with you there. The pain of destructuring cons cells is somewhat mitigated by the various forms that allow pattern matching, but it's still basically the same.

So I think that is actually a good argument for introducing option types (like Haskell's Either a b) into the language. I think Typed Racket has something like this, but in either case you'd want some nice syntax to handle it as well.

Hash tables make this sort of thing fairly easy though (at least in Racket)

  (define h #hash{["a" . 3] ["b" . 5]})

  (hash-ref h "a" "failed")  ; => 3
  (hash-ref h "c" "failed")  ; => "failed"

I think most lisps support this right? I remember seeing something similar in pg's bayesian spam filter code.


> So I think that is actually a good argument for introducing option types (like Haskell's Either a b) into the language. I think Typed Racket has something like this, but in either case you'd want some nice syntax to handle it as well.

I need to look into Typed Racket more. Ever since I learned the tiny bit of Haskell that I know, I can't help but think that I need ADTs in my parentheses.

Right at the top of the docs, though, it says that they designed the Typed Racket type system to provide static typing for existing untyped Racket programs. That makes me wonder what a statically typed lisp that wasn't trying to be backwards compatible would look like. I've heard of Shen and Qi but at first glance they (especially Qi) seemed to be drifting too far from Lispy prefix notation goodness for my taste.

> Hash tables make this sort of thing fairly easy though (at least in Racket)

It doesn't really help if you're trying to look up a string, though, because then you can't differentiate between a "good" string and an error string based on their types. That means it won't help for similar string fetching operations either, like making a network request and expecting either a response or a string indicating that the request timed out, the connection couldn't be made, etc.

According to the docs, you can also pass a failure procedure to hash-ref instead of a failure value. That's interesting, though I have to admit I have trouble thinking in those terms.
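
Something like this, if I read the docs right (the hash contents here are made up):

  (define h (hash "a" 3 "b" 5))
  (hash-ref h "a" (lambda () (error "no such key")))  ; => 3, the thunk is never called
  (hash-ref h "c" (lambda () (error "no such key")))  ; calls the thunk and raises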


   #lang typed/racket
Isn't bound by backwards compatibility in the sense [I think] you are implying. What the docs are saying is that modules which enjoy static typing may be called transparently from modules which don't - though as is common with functions in the Math library there may be a performance penalty. The backwards compatibility is that the implementation of static typing does not impede dynamically typed code. Statically typed code works just fine with dynamic type checking.
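
A minimal sketch of that boundary (the file names are made up):

  ;; add-one.rkt
  #lang typed/racket
  (provide add-one)
  (: add-one (-> Integer Integer))
  (define (add-one n) (+ n 1))

  ;; client.rkt, plain untyped Racket
  #lang racket
  (require "add-one.rkt")
  (add-one 41)       ; => 42
  ;; (add-one "41")  ; blows up at the typed/untyped boundary with a contract error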

+ regexp-match-evt from racket/port might be a better alternative for handling network connections than hash table lookup. http://docs.racket-lang.org/reference/port-lib.html#%28def._...


> It's pretty common to want to write something like [. . .] without wanting to destructure a list or use the call-values hack.

This makes no sense. You want feature X (the ability to return multiple values) without having to either (a) pick apart a simple structure containing those values or (b) assign names to those values, letting the runtime system pick them apart (or never stick them together, its choice) for you. So what more do you want?!

You want an option type, right? Well what's an option type? Let's consider Haskell:

data Maybe a = Nothing | Just a

No matter what kind of cleverness you do in memory or whatever, in the end this type is a tuple of two values: (constructor, data). The only special thing going on is that the type system makes sure that (a) constructor == Nothing or constructor == Just, and (b) if constructor == Nothing the data field is meaningless (has no type, can't be read, does not affect equality, whatever).

But it's still a tuple. The cons cell is the exact same thing in the absence of a strict compile-time type system, except for your code example we have it flipped: (data, constructor) (well, actually you seem to have a flipped 'Either String a' going on there, but let's stick with option for now) where data is actual data or nil when meaningless, and constructor is like Just when nil and like Nothing when true (btw, lisp convention is the other way around; second value is true like Just and false/nil like Nothing).

Just like the original linked article, what your complaint boils down to is "non-compile-time-typed systems allow you to do things which would break compile-time-typed systems"—and while I love compile-time typing as much as the next Haskell aficionado, I don't demand compile-time typing in a language [family] whose very essence is its absence.

EDIT: if you could reply and explain what would resolve this "shortcoming of the language", maybe I could understand better what you're trying to say...


> This makes no sense. You want feature X (the ability to return multiple values) without having to either (a) pick apart a simple structure containing those values or (b) assign names to those values, letting the runtime system pick them apart (or never stick them together, its choice) for you.

I was actually aiming for (b) here, just incorporated directly into the normal function call and return syntax. Why shouldn't you be able to return multiple values in registers or on a results stack without "tupling them up"? It's clearly supported by the hardware. By "call-values" (sp) hack, I meant that needing to use "call-with-values" and "values" is the hack. Lua, for instance, lets you (with very clean syntax) return multiple values from a function, and assign the results to separate variables, during which at no point is an intermediate table constructed:

  thing, err = getthing()
  if thing == nil then print(err) return nil end
  do_stuff(thing)
I don't think I got it quite right with my angular brackets, but I was trying to think of a nice, clear syntax for multiple return values that wouldn't look out of place in a Lisp.
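
For what it's worth, Racket's values/define-values get fairly close to that shape already (get-thing here is a made-up stand-in for the lookup):

  (define (get-thing)
    (if (zero? (random 2))
        (values "the thing" #f)
        (values #f "it wasn't there")))

  (define-values (thing err) (get-thing))
  (if err
      (displayln err)
      (displayln thing))   ; do something with the thing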

Maybe I was being misleading by using an example that resembled option types. I wasn't trying to bring static typing into this, though I am interested in static type systems and curious about static Lisps, it's just that emulating simple option types for error handling purposes is a very common usage for multiple return values in dynamically typed languages (and, cough, static ones like Go that have lame type systems).

> data Maybe a = Nothing | Just a

> No matter what kind of cleverness you do in memory or whatever, in the end this type is a tuple of two values: (constructor, data).

Is that true? "Maybe" seems like the perfect candidate for a nullable pointer representation, or at least some kind of small tag. I'd be surprised if GHC represents Maybe as a full blown tuple.


I still don't see how you're advocating a solution of anything other than call-with-values under a different name.

I don't know Racket itself, so let me switch to common lisp which has the same "problem"... except it comes with a super-simple macro to do exactly what you want, apparently:

  (multiple-value-bind (quotient remainder) (floor x y)
    ;; some code referring to quotient and remainder
    )

which (excepting perhaps edge-cases I'm not considering, and definitely excepting some error-checking) is just

  (defmacro mvb (vars form &body body)
    `(multiple-value-call (lambda ,vars ,@body) ,form))

The same thing should work in Racket, modulo syntax and names.

Asking a lisp programmer to do everything with special forms/primitive language elements is like asking a Haskell programmer to do everything in the IO monad. Sure, it's possible, it will work just fine; but... it's just wrong.

By the way, the existence of the #'values function by no means implies there's some data structure being created; indeed, that's the whole point of the #'values construct. Anything you could do with #'values you could do in a list which you then destructured; #'values is to be used when the common-case is to throw away the "structure" and just use each value individually or not at all, like with the return values of #'floor. So in fact #'values is exactly what you want; I'm not sure why it's a "hack" when lisp programmers write it under one name and "the solution" when you write it under a different name.

Ah, it occurs to me that maybe racket doesn't work like this, but in common lisp at least, the transiency of values can be seen in that when you say, e.g., (+ (floor a b) c), the remainder of a/b is thrown away; just the first value is used. That's another huge benefit over structures-to-be-destructured, but obviously doesn't work if the extra values are always important.

The point of my discussion with Maybe is that /even if/ it's just a nullable pointer, conceptually it's essentially a [restricted] tuple. (pointer-is-null,data). There's still two pieces of data there, even if the representation of one of them is cleverly folded into generic object representation or what-have-you. By definition it can't be done with a single piece of data: (Maybe a) is not the same type as (a), for all a. Even if (Maybe Word16), e.g., were represented in a single value at a single place in memory, that value would have to hold more than 16 bits. Then it's viewable as a... tuple, again; one element in 16 bits and the other in the remaining bits.


I come from a background using Common Lisp for system and web development so I may see things differently than people who were introduced to it academically. Your code looks great, mine just needed to compile :)

The cons pair is a really powerful primitive data structure; it is the linked list of Lisp. You can build a lot of powerful structures given this great base. I really don't think it matters what architecture you're running on.

Do they really confuse people new to Lisp? car+cdr is a pretty simple concept, when talking to new Lispers they usually don't complain about this.

Hash tables are part of the Common Lisp HyperSpec, so that is a non-issue; they have been first class citizens since before Clojure existed (and are nicely integrated into things like loop).


I think the parent was saying car and cdr are confusing because they don't actually map to registers (i.e. address register and decrement register). On the other hand you don't have to think too hard about the composed versions of them (caddr, cadar, etc...) but they can easily make your code seem obfuscated. Racket offers both that and first/rest anyway so you can choose which style you prefer.


I don't mind car, cdr, and their composed versions at all. `caddr` is much easier for me to read than `(head (tail (tail ...`

I think their usage is sometimes a code smell (the ad-hoc structs I mentioned), but they're really useful when you want to deal with an "unlispy" list of tokens (say, a parser for a language that isn't Lisp).


I mean, it's neat that you can build everything out of cons, in the sense that it's neat that Turing machines and the lambda calculus can perform any computation, but that doesn't mean it's a good idea to do so in your code, any more than encoding numbers with Church numerals. Why use alists when you can use hash tables, why use... weird SICP-style struct list things when you could just use structs/vectors, etc.

>Hash Tables are part of the Common Lisp Hyperspec so that is a non issue, they have been first class citizens since before Clojure existed (and are nicely integrated into things like loop)

I'm not that familiar with Common Lisp, but a quick search didn't reveal a syntax for hash table literals, which I would consider to be a prerequisite for calling them first class citizens. There are other funny uses of lists (not improper ones, but still) in Common Lisp, like named procedure arguments[1]. I guess my point is, why encourage these weird constructs in the first place? Sexps are good at representing recursive structure, tables are good at mapping names to values, why not just have both and let them do what they're good at? Obviously Common Lisp is very old and can't be changed after the fact, I'm just trying to imagine the "ideal Lisp," whatever that is.

>Do they really confuse people new to Lisp? car+cdr is a pretty simple concept, when talking to new Lispers they usually don't complain about this.

I wasn't, but I've seen it confuse other people. Either way, it's a wart on the syntax that seems out of place in Scheme (not so in Common Lisp, which has lots of other warts ;))

>I come from a background using Common Lisp for system and web development so I may see things differently than people who were introduced to it academically.

Well, I was introduced to Scheme through SICP, but my motivation for picking it up was more practical than theoretical; I'd heard all the message board talk, Paul Graham essays, etc. proclaiming Lisp to be the most powerful, productive family of programming languages that will turn you into an expert programmer and so on, and wanted in on that ;) Maybe it's more practical to be Getting Shit Done™ instead of worrying about syntactic warts, but I think choosing to not have dotted pairs runs a bit deeper, by forcing the language to promote at least tables to a first class syntactic citizen.

[1] For the unfamiliar, it's something like:

  (message "Text" :style bold :color red)
Reads pretty well, but it still seems wrong to me (ymmv), like it's another spot in the language begging for first class hash tables. In Lua, you'd emulate named arguments by passing a table to the function. I wonder what Clojure programmers do here. This?

  (message "Text" {style: bold, color: red})
Which doesn't make a big deal for readability, but I bet if you want to use a macro or whatever accepts named arguments, it'd be much easier to deal with the one with the real hash table.


I don't understand the argument for using a hash-table for kw-args. Hash tables only win against plists when you don't need to read every single element; when you always need to read every single element, then it's a losing proposition to use a hash-table instead of a plist.

Yes I agree with your later comment about syntactic blessing and all, but I think you picked a terrible example by choosing the single place in lisp where plists make more sense than hash tables.


Maybe it was a bad example, but in either case, I would expect a "sufficiently smart compiler" to transform

  (defun fun (a &key (b 0) (c 0)) ...)
  (fun 5 :b 3)
into something like

  (defun fun (a b c) ...)
  (fun 5 3 0)
Whether it does so as a regular macro or as a special form of the compiler doesn't matter, there's no need for there to be any difference in efficiency between representing keyword args with tables or plists.

As I later mentioned in another comment, it was a mistake to talk about hash tables when I was really more interested in the abstract map datatype.


Named arguments are there to be interfaces to procedures. With hash-tables you know nothing about them. With named arguments, we can ask for argument lists, check for missing arguments, complete arguments, prompt for arguments, ... Much of that can be done in the IDE or at compile time.

Named arguments were introduced to Lisp with MDL (a Lisp dialect), brought to Lisp Machine Lisp and then to Common Lisp.

Using hash-tables for it is a step back from the viewpoint of development support.


  ; CL keyword arg syntax, taken from Practical Common Lisp
  (defun foo (&key (a 0) (b 0)) (+ a b))
  (foo :a 1) -> 1
  (foo :a 1 :b 2) -> 3

  ; hypothetical table keyword arg syntax
  ; clojure defines commas to be whitespace, you can pretend they aren't there if you want
  (defun foo ({a: 0, b: 0}) (+ a b))
  (foo {a: 1}) -> 1
  (foo {a: 1, b: 2}) -> 3
I fail to see why the table syntax would be any less usable or introspectable by compilers or editors. They'd have the advantage (which would be shared with alists, if Common Lisp had used them for keyword args) of sharing the syntax for keyword args with a syntax for optional structure properties (think XML's attributes)


Basically the syntax for specifying keyword arglists is based on assoc lists. The calling syntax is based on property lists.

There is nothing to be gained by hashtables.

The Common Lisp keyword syntax actually is a bit more powerful than what your PCL example shows:

(defun foo (&key ((keyword var) init var-arg-supplied-p)) ...)

Generally I think it is a slight mistake to use specific data types in arglists - basically mixing syntax and data types.


> Generally I think it is a slight mistake to use specific data types in arglists - basically mixing syntax and data types.

That's kinda what alists and plists do! They mix up the abstract "map" or "table" data structure (something that maps keys to values) with some concrete implementation of it. '((key . value) (key . value)) and '(key value key value) are literal representations of two particular map-like data structures, whereas {key: value} or {key = value} could have any concrete representation that the language implementer desires.

Maybe I should have dropped the "hash" from "hash table" in my above posts, so that it was clear that I was interested in the syntactic benefit of using one syntax for mapping names to values, and not some imagined efficiency gains from having (read keyword-function-call) include a hash table instead of an alist.


    (defun foo (&key foo) ...)
In Common Lisp an implementation can implement a call like (foo :foo 10) how it wants. There is nothing said about lists, hash tables, property lists or assoc lists.

Whereas having a syntax for hash table {} and using this syntax in definitions/calls clearly indicates a conflict. What is it? Coincidence? Purpose?


> I wonder what Clojure programmers do here.

About the same.

    (defn message [text & {:keys [style color]}]
      ...)

    (message "Text" :style :bold :color :red)
Because typing those extra two brackets is apparently too much trouble.


I would just accept the hash, I think a lot of us prefer it to the pseudo keyword args.

    (defn message [text {:keys [style color]}] ...)
    (message "Text" {:style :bold :color :red})
To the person asking: that {:keys [style color]} is destructuring. It's not required, but it's nice; it's equivalent to doing:

    (defn message [text opts]
       (let [style (:style opts)
             color (:color opts)]
          ...))


I can't say I expected it to look like that.


In Common Lisp you can create your own syntax for hash table literals fairly easily with a reader macro:

http://frank.kank.net/essays/hash.html

Mind you - reader macros should be used with a degree of care, otherwise you basically create a completely new language. I remember working on a project where I got a drop of some code from one of the other organizations in the project (by tape, this was a long time ago), opened the file, and wondered what language the code was written in; it took me a while to realize that it was indeed Lisp - but one that had used reader macros to do strange and terrible things.

Mind you, once I learned about reader macros I went on to do my own crazy stuff with them - they are almost too powerful a feature (almost!).


> but a quick search didn't reveal a syntax for hash table literals, which I would consider to be a prerequisite for calling them first class citizens

http://en.wikipedia.org/wiki/First-class_data_type

Hash-tables are part of the standard, and available in conforming implementations out-of-the-box. (make-hash-table) produces an instance of a table which can be passed as an argument to a function, be assigned to variables, ... I don't think syntax defines whether a feature is or isn't first-class, even though it has an impact on its usage.

Adding support for literal hash-tables is possible, as was already said: define a reader macro; that macro could even simply use the utility function "plist-hash-table", from Alexandria, which is used like this:

     (alexandria:plist-hash-table '(:a 1 :b 2))
So that you could write #H{:a 1 :b 2} and get what you want.
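
For instance, a minimal sketch of such a dispatch macro, assuming the Alexandria library is loaded (the function name READ-HASH and the brace handling are illustrative, not taken from any particular library):

     ;; Illustrative sketch: make #H{...} read as a call to
     ;; ALEXANDRIA:PLIST-HASH-TABLE.  A real version would also handle
     ;; nesting and restore the readtable more carefully.
     (defun read-hash (stream subchar arg)
       (declare (ignore subchar arg))
       (assert (char= (read-char stream) #\{))   ; expect an opening brace
       (let ((pairs (read-delimited-list #\} stream t)))
         `(alexandria:plist-hash-table (list ,@pairs))))

     (set-dispatch-macro-character #\# #\H #'read-hash)
     (set-macro-character #\} (get-macro-character #\)) nil)

     ;; #H{:a 1 :b 2} now reads as
     ;; (ALEXANDRIA:PLIST-HASH-TABLE (LIST :A 1 :B 2))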

On the one hand, people say that CL is bloated, but on the other hand, they want to have everything available by default. Do you expect Python or C++ implementations to come with regular expressions built in, or is it okay if you need to add an 'import' statement or link with Boost?

CL allows you to define libraries that can change even the basic surface syntax, for your convenience. Considering that there is more than enough defined in the specification, what benefit would there be in having an "official" hash literal syntax? Consistency across different projects? Many people use the "cl-ppcre" regex library without problems.

Yes, regular expressions come in a library even though they could easily be labelled as a fundamental feature of a language, and hence be expected to be "first-class", or "built-in"; but again, that is not the case in CL, which is neither Perl nor Lua.

And about keyword parameters, this is just the surface syntax, which is on purpose based on a simple, basic representation of the syntax tree as lists of expressions.

Then, the compiler will work with the argument list in your function definition and take one approach or another to represent it in the object code. And I doubt that the arguments are stored in a hash-table, which is unlikely to be an appropriate way to handle a couple of keyword arguments.

Similarly, when you define a class, everything is declared by a simple list of slots, where slots are lists of parameters: but in fact, the class might store the actual instance slots in a hash-table, an array, a list, a database or a memory mapped file.

The fact that you are mainly manipulating lists does not mean that everything is eventually represented as a supposedly inefficient linked list at runtime.


> http://en.wikipedia.org/wiki/First-class_data_type

Fair enough, pretend I said "first class syntactic object."

> CL allows you to define libraries that can change even the basic surface syntax, for your convenience. Considering that there is more than enough defined in the specification, what benefits would be in having an "official" hash litteral syntax?

Reader macros are powerful, no doubt. The benefit of building this into a language would mainly be syntactic regularity. CL as it stands has keyword args, alists, property lists, and real hash tables (probably amongst other things I don't know about), that all fill basically the same purpose of mapping names to values. If the creators of Common Lisp and its parent lisps had started with hash tables with standard reader syntax, they probably wouldn't have felt the need to introduce so many subtly different but conceptually identical concepts. Lua, for instance, uses its single table data structure both as syntax and as an internal representation for pretty much any case where it needs to map some names to some values; from keyword arguments to environments to "metatables" that provide fairly powerful metaprogramming support, they're all just tables that you can write literally and use the same functions to manipulate. You could say that alists fill this purpose and are even more syntactically regular since they're still sexps, but then why doesn't CL use alists for keyword args, and why don't Scheme's keyword arg extensions use alist syntax? Historical accident, or are alists just ugly? If you can admit that they're a little too ugly and verbose to include in function calls, then maybe you can see why I don't like writing alist literals.

> The fact that you are mainly manipulating lists does not mean that everything is eventually represented as supposedly inefficient linked-list at runtime.

I'm well aware that a compiler can easily expand keyword arguments into regular ones. I strongly doubt your compiler will transform all the literal alists in your program into hash tables, and replace all the alist-ref functions and so on operating on them with the equivalent hash table functions. As long as alists are more syntactically blessed than hash tables, people will use them where another data structure would be more appropriate.


Assoc lists make little sense as argument lists. Named arguments are basically like property lists. The point of argument lists for functions in Common Lisp is that they enable a contract. This requires special built-in support in the language. I want to see at compile time whether an arg is missing or superfluous.

Example: the function WRITE takes several keyword arguments, but not :BAR. The compiler complains about wrong use.

    * (defun test () (write "foo" :bar 10))
    ; in: DEFUN TEST
    ;     (WRITE "foo" :BAR 10)
    ; 
    ; caught WARNING:
    ;   :BAR is not a known argument keyword.
Now if you allow a hashtable or some other data structure to be passed, then it would also be great if one could specify at definition time which keys it takes, their default values, etc. We also may want to find out which values were defaults and which were actually passed.

The keyword argument facility for functions in Common Lisp provides all of that.

This gives you compile-time checks for code and makes interfaces easier to use.

Common Lisp also allows access to arguments as lists. This is simpler and more Lispy than using hash tables. For most purposes lists are useful enough and hash-tables would just add overhead.
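
A small sketch of that list-level access, using nothing beyond standard &REST and &KEY (the function name MESSAGE is made up for the example):

    ;; &REST alongside &KEY captures the raw keyword/value list as well,
    ;; which is convenient for inspecting or forwarding arguments.
    (defun message (text &rest args &key style color)
      (declare (ignore style color))
      (list text args))

    (message "Text" :style :bold :color :red)
    ;; => ("Text" (:STYLE :BOLD :COLOR :RED))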


> I strongly doubt your compiler will transform all the literal alists in your program into hash tables,

Time for an experiment. SBCL doesn't, as far as I know. But you seem to imply that it is always better to use hash-tables, and I don't think so (even though hash-tables can be implemented in a smart way, like in Lua).

Under roughly 10 elements, alists result in shorter code.

     (defun my-fun (x)
       (declare (optimize (speed 3) (debug 3) (safety 0))
                (type (member a b c) x))
       (let ((a '((a . 10) (b . 21) (c . 31))))
         (cdr (assoc  x a :test #'eq))))

     (disassemble #'my-fun)
     ; disassembly for MY-FUN
     ; Size: 53 bytes
     ; 04A7FF2F:       483B1592FFFFFF   CMP RDX, [RIP-110]         ; 'A
                                                                   ; no-arg-parsing entry point
     ;       36:       7511             JNE L1
     ;       38:       488B0D91FFFFFF   MOV RCX, [RIP-111]         ; '(A . 10)
     ;       3F: L0:   488B5101         MOV RDX, [RCX+1]
     ;       43:       488BE5           MOV RSP, RBP
     ;       46:       F8               CLC
     ;       47:       5D               POP RBP
     ;       48:       C3               RET
     ;       49: L1:   488B1D98FFFFFF   MOV RBX, [RIP-104]         ; '(B . 21)
     ;       50:       488B0D89FFFFFF   MOV RCX, [RIP-119]         ; '(C . 31)
     ;       57:       483B157AFFFFFF   CMP RDX, [RIP-134]         ; 'B
     ;       5E:       480F44CB         CMOVEQ RCX, RBX
     ;       62:       EBDB             JMP L0
This is roughly equivalent to a "case" construct (not shown here). Then, this is what I have with a hash-table:

     (let ((table (make-hash-table :test #'eq)))
       (declare (optimize (speed 3) (debug 3) (safety 0)))
       (setf (gethash 'a table) 10)
       (setf (gethash 'b table) 21)
       (setf (gethash 'c table) 31)
       
       (defun my-fun-hash (x)
         (declare (optimize (speed 3) (debug 3) (safety 0))
                  (type (member a b c) x))
         (gethash x table)))

     (disassemble #'my-fun-hash)
     ; disassembly for MY-FUN-HASH
     ; Size: 115 bytes
     ; 05467EB0:       .ENTRY MY-FUN-HASH(X)                       ; (FUNCTION (#)
                                                                   ;  (VALUES T # ..))
     ;      EE8:       8F4508           POP QWORD PTR [RBP+8]
     ;      EEB:       488D65E8         LEA RSP, [RBP-24]
     ;      EEF:       488B7805         MOV RDI, [RAX+5]
     ;      EF3:       4C8BC7           MOV R8, RDI
     ;      EF6:       498BF8           MOV RDI, R8
     ;      EF9:       BE17001020       MOV ESI, 537919511
     ;      EFE:       488B05FBFDFFFF   MOV RAX, [RIP-517]         ; #<FDEFINITION object for SB-IMPL::GETHASH3>
     ;      F05:       B906000000       MOV ECX, 6
     ;      F0A:       FF7508           PUSH QWORD PTR [RBP+8]
     ;      F0D:       FF6009           JMP QWORD PTR [RAX+9]
     ;      F10:       6A20             PUSH 32
     ;      F12:       B9604F4200       MOV ECX, 4345696           ; alloc_tramp
     ;      F17:       FFD1             CALL RCX
     ;      F19:       59               POP RCX
     ;      F1A:       488D490B         LEA RCX, [RCX+11]
     ;      F1E:       E917FFFFFF       JMP #x1005467E3A
The function with a hash-table has a constant size relative to the number of elements in the table, which is fully known at compile time. The size of the code with a case/alist grows linearly with the number of elements.

Now, let's compare speed.

In the following versions, I have added more elements to the alist, just to be sure the code with a hash-table is the shortest (from 'a to 'l).

     (time
      (dotimes (i 10000000)
        (dolist (x '(a b c d e f g h i j k l))
          (my-fun-hash x))))

     (time
      (dotimes (i 10000000)
        (dolist (x '(a b c d e f g h i j k l))
          (my-fun-case x))))

     (time
      (dotimes (i 10000000)
        (dolist (x '(a b c d e f g h i j k l))
          (my-fun x))))
Results:

   HASH:
    2.491 seconds of real time
    33,072 bytes consed
  
   ALIST:
    0.852 seconds of real time
    0 bytes consed
  
   CASE:
    0.778 seconds of real time
    0 bytes consed  
 
Even though in terms of footprint the code tends to get larger with an alist quite rapidly (around 10 elements), the resulting code is faster and does not allocate memory.

Also, just to clarify, alists and property lists have a different behavior than hash-tables, namely that the sequential access allows you to shadow values from another list: if you write (cons (cons 'a b) older-alist), you have a new list where the value for key 'a is b, and where values for other keys are those found in older-alist (even though older-alist also contains key 'a).
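
A tiny illustration of that shadowing, with made-up names and values:

     (defparameter *older-alist* '((a . 1) (b . 2)))
     (defparameter *newer-alist* (cons (cons 'a 10) *older-alist*))

     (assoc 'a *newer-alist*)   ; => (A . 10) -- the new pair shadows the old one
     (assoc 'b *newer-alist*)   ; => (B . 2)  -- other keys still come from the older list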

I don't quite remember what my point was anyway ;-) but it was fun to test the different behaviors.


Microbenchmarks are fun but silly. I tried to write a similar one for Lua, but LuaJIT ultimately recognized the program was useless and boiled it down to a 3 instruction loop:

  loop:
    add ebp, 1
    cmp ebp, 10000000
    jle loop
Anyway, in regards to your results, I could be wrong, but I think I read somewhere that LuaJIT uses linear search for tiny tables for this reason. Dunno what the threshold is, if there is one. The performance then would be similar to an alist, but saving a few bytes and cycles by not needing to deal with the extra list pointers and indirection.

> Also, just to clarify, alists and property lists have a different behavior than hash-tables, namely that the sequential access allows you to shadow values from another list: if you write (cons (cons 'a b) older-alist), you have a new list where the value for key 'a is b, and where values for other keys are those found in older-alist (even though older-alist also contains key 'a).

Oh yes, doesn't Emacs make good use of this trick for its configuration variables? You can get the same effect with Lua's metatables, however, with a little elbow grease:

  parent = { a = "beep", b = "boop" }
  print(parent.a) --> beep
  print(parent.b) --> boop
  child = { a = "poing" }
  print(child.a) --> poing
  print(child.b) --> nil
  mt = { __index = parent }
  setmetatable(child, mt)
  print(child.a) --> poing
  print(child.b) --> boop
Not quite as easy as "cons", but very flexible; __index can be another table to be searched if the lookup on the child fails (which in turn can have its own parent and so on), but it can also be a "metamethod" that is called whenever a lookup fails and can then do arbitrary things. A neat example off the top of my head: OpenGL has the peculiarity that you do not know the address of any of its functions until runtime, requiring a program to call a lookup function for each function to get a usable pointer. Declaring FFI function prototypes for many OpenGL functions is not such a big deal, but looking up hundreds of functions that you will never use can add significantly to startup time. So someone (possibly an HN user?) wrote an OpenGL FFI library that uses the __index metamethod in a clever way; the first time a function like "GL.CreateShader" is called, the lookup fails, and the __index metamethod mangles the index name a bit and in turn calls (on Windows) the C function wglGetProcAddress to look up its address, which it then stores in the original table. Using this library, you can write code that uses GL functions willy-nilly, and their addresses will automatically be looked up at runtime the first time they are used.

Sure, you can do the same thing in any language with macros or a preprocessor, but is that cool or what?


> ... recognized the program was useless and boiled it down to a 3 instruction loop

The loop itself is quite useless; why wasn't it also removed? (Just kidding)

I don't have anything against Lua or hash tables in principle. And of course tables are used in practice in CL code, but they aren't the primary data structure.

> __index can be another table to be searched if the lookup on the child fails (which in turn can have its own parent and so on)

> the first time a function like "GL.CreateShader" is called, the lookup fails, and the __index metamethod mangles the index name a bit and in turn calls (on Windows) the C function wglGetProcAddress to look up its address, which it then stores in the original table. Using this library, you can write code that uses GL functions willy-nilly, and their addresses will automatically be looked up at runtime the first time they are used.

So, it is an association list implemented using tables, where links are given by the __index property, using a metatable.

So maybe it is convenient after all to have a very simple data-structure like cons in a language and let more complex data be implemented with it, instead of the opposite.


> The loop itself is quite useless, why wasn't it also removed ? (Just kidding)

Good question! I guess LuaJIT isn't optimized for programs that don't do anything.

> So, it is an association list implemented using tables, where links are given by the __index property, using a metatable.

I think that's a matter of opinion. It's interesting that being able to form these simple hierarchies is an emergent property of alists, but just because Lua provides another mechanism to implement hierarchical lookups doesn't mean that the language designers were trying to ape alists. If anything, I'd assume that Roberto and company were inspired by Smalltalk's doesNotUnderstand message when they implemented __index metamethods.


In some cases it makes more sense to store things as an a-list.

A good example would be simple key->value pairs. When working with Hunchentoot, it represents HTTP headers and form POSTs as a-lists. It would be easier to use a hash table, though.


I probably shouldn't have focused on data structures when talking about dotted pairs, but some days it's just hard to be a Lisp head...

     > (define (f g)
         ((car g) (cdr g)))

     > (define my-val (cons sqrt 4))

     > (f my-val)
     2
...and go down the rabbit hole where code and data merge.


As another Haskeller...

    f :: (a -> b, a) -> b   -- a.k.a. uncurry ($)
    f (a, b) = a b

    myVal :: (Double -> Double, Double)
    myVal = (sqrt, 4)

    > f myVal
    2 -- I hope
In other words, what's different about (a . b) and (a, b)? But then: why not (a, b, c)?


To a man with a Haskell all the world is a nail...and every nail has a type and requires a special hammer. I.e.:

    (define side-effect (cons print "This is not an IO monad"))

    (define its-already-generic (cons number->string 4))
The more I think about Lisp the more I believe that the best reference language is C. [1] Lisp took form as a first step away from assembly, not a first step toward the singularity. What is garbage collection but some lazy programmer automating memory management?

The mistake is to take the ideological fanaticism of Lispheads for ideological purity of the language. There is none. Even the great schism of Lisp-1 versus Lisp-2 is rooted largely in arguments from precedent...and the precedent for Lisp is largely that it can shave a few precious bytes and register loads by allowing a single symbol to point to more things.

[1] Or Cobol.


Yeah, nobody is ever going to agree about language statics. From each person's POV: "you either get it or you don't".

But totally agree about the pragmatism of Lisp. Its self-interpreter is clever, but not terrifically interesting as a proof of the simplicity of the language; rather, it's a proof of the simplicity of self-implementation.


It's not that I think statically typed code isn't advantageous. It's that it often seems to impede finding better abstractions by locking our thinking around local minima.

   'a . 'b -> 'b
Is a great place for a procedure to end up in production code. Yet, it's not necessarily a great place for me to start because I'm lazy and might try to force

   'c . 'd -> 'd
into working kludgetastically, simply because like most people I find it hard not to love my own ideas, and inertia makes it easy to stick with them even when I might find better alternatives.

Static typing, like any heuristic, does well in some situations and poorly in others. The bad thing isn't that it's not a free lunch. The bad thing is thinking that it is.

The same of course is true for dynamic typing and strong typing and duck typing and Haskell and Lisp. Though I'm not sure that Lisp was ever intended to prove anything as a programming language [1] in the way that Haskell was, which might explain why after 50 years we can graft enough onto it to argue about its comparison to Haskell.

[1] As distinct from its invention as a mathematical formalism for describing lambda calculus.


Personally, I think that's the place where genericity and inference live. I never write types as concrete as

   'a . 'b -> 'b
but instead write code and let the inferencer tell me how general it might be, then use that information to peel apart layers which don't need to be conflated.

This isn't tied to this example, of course. I think static types generally get misused when the thought is that they don't let you experiment with code. They don't let you experiment with code in the same way that dynamic types do, but I don't think they're at all impeded for that. It just takes investment in a different kind of experimentation.


I like the approach you took to writing this post. This wasn't the typical attempt to summarize an entire programming language in a page (which IMO is done too often, too poorly), but rather an exploration. As you went through each section, I could see how you were approaching various problems and learned a lot about Racket in the process. I would love to see more of this style of post.


Of course I couldn't summarize Racket in a page – I don't know it! And there's so much to learn about it that I could probably write half a hundred posts of this length and still consider myself a beginner.


This is how I usually implemented quicksort in Racket:

    (define (quicksort xs)
      (if (null? xs)
          xs
          (let* ([hd (car xs)]
                 [tail (cdr xs)]
                 [smaller (filter (lambda (x) (< x hd)) tail)]
                 [bigger (filter (lambda (x) (>= x hd)) tail)])
            (append (quicksort smaller) (list hd) (quicksort bigger)))))
It's great to see a hugely superior implementation, and I love that people are writing about and using Racket like this, because the more people do that, the more resources there will be for people like me to learn from.


I love Scheme. And I love Clojure, which is IMHO Scheme plus some great ideas from Haskell.

I regret there's no Scheme or Clojure running on LLVM, which I think is a much better platform than the JVM. Julia, which resembles Dylan (another Lisp), is the perfect example.


I too wish there were such a thing as a JIT-compiled Clojure on LLVM. The JVM is amazing, but it only seems to be amazing for long-running processes. LLVM, IMO, is far more versatile in its amazingness.


There was an attempt at it, https://github.com/halgari/clojure-metal


Not sure what Lisp family it compares to, but there is Hylas:

https://github.com/eudoxia0/Hylas-Lisp


That was intellectually honest. Good read.


It seems as if several of the author's problems come, oddly for someone who's used both strict (Pascal) and non-strict (Haskell) languages, from being confused about Racket's strictness. Why is time a special form? Because otherwise (time (expensive)) would just get the result of (expensive). Why doesn't (list 1 (2 3) 4) work? Because list isn't a special form, it's a function. Why doesn't quote turn '(list 1 2 3) into '(1 2 3)? (Well, this one isn't about evaluation order, admittedly.) Because if it did it ... wouldn't be quote.
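
The same constraint shows up in Common Lisp, where TIME is likewise a macro; a rough sketch of why a plain function can't do the job (MY-TIME is a made-up name, and a real version would use GENSYM to avoid variable capture):

    ;; A function receives an already-evaluated value, so there would be
    ;; nothing left for it to measure.  A macro sees the unevaluated form
    ;; and can wrap it with clock reads before and after.
    (defmacro my-time (form)
      `(let* ((start (get-internal-real-time))
              (value ,form)
              (elapsed (- (get-internal-real-time) start)))
         (format t "~&elapsed: ~d internal time units~%" elapsed)
         value))

    ;; (my-time (expensive)) times the call; (my-time 42) reports roughly 0.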


Pascal is strict, but it still has stuff like “move(var1, var2)”, out parameters, etc., so it's also call-by-reference.

I didn't know at the time that Racket was call-by-value.

Quote was confusing at first, yes.


A friend of a friend has considerable code in Scheme, Common Lisp and Clojure; when you ask him which is the true Lisp, guess what he answers? Haskell!

Haskell really does make a big impact on some developers.

Just an anecdote ;)


I think it can't be "the true Lisp" without cleaner macros. Of course, that doesn't stop Haskell from being a great Haskell.


I do prefer Racket macros to Template Haskell, but you can do some of the same kinds of things in Template Haskell. Go ahead and play with it for a bit.

Mostly, you need to work with the data structure it defines for the parse tree, and it's harder to convert than with syntax->datum. Also there's the Q monad, to generate new safe names. But it works pretty well.


I'm not sure there's anything you can't do in TH, but it feels messier than Lisp macros. I'm hoping typed TH will improve things (not in precisely the same direction as lisp, of course).


Haskell is the algebra of programming - the way to handle code expressions as Immutable Truths.

I suppose that makes Lisp the Principia Mathematica.


One note:

> Racket's default IDE is better than GHCi and probably on par with Emacs (you almost certainly can configure Emacs to be better than anything, but it's not trivial and people don't bother, while DrRacket provides autocompletion and documentation out of the box).

Last I used it (a few years ago), DrRacket was very laggy, so I would find it very hard to use for a serious project. YMMV, maybe it's improved.


I remember DrScheme, as it was called before; it was snappy even on old AMD Duron boxes. DrRacket is very heavy. As others suggested, Emacs + Geiser gets you quite far. Maybe less noob-friendly, because, well, Emacs: no fancy GUI artefacts (arrows and such).


Try the Geiser plugin for Emacs. It allows you to connect to the Racket REPL (either it starts a new instance of Racket, or it can connect to a socket/port).

I'm a Vim user and went through elaborate pains to get Emacs working almost like Vim just so I could go through SICP this way.


Could you please outline your final setup besides using Geiser? I also come from Vim-land, have almost zero experience with Emacs and already understood that, to learn SICP, setting up and learning Emacs is going to be easier than setting up Vim for the task.


Well, I'd suggest if you know nothing about Emacs to install Emacs Prelude: https://github.com/bbatsov/prelude

Next, enable the lisp/scheme modules in prelude-modules.el [1]. You need the Evil plugin to get Vim modal editing and probably Evil Leader (so you get Vim style <leader> commands).

You can look at my config [2]; there's nothing really unusual except that I map Ctrl-H to backspace and Ctrl-[ to ESC, as that's what I also use in my Vim. There's also a slight modification to make Smartparens' parenthesis highlighting behave like in Vim.

One other thing is if you're in insert mode then Emacs keystrokes should also work (Ctrl-Z to force switching between Vim/Emacs keystrokes).

If you install the SICP plugin for Emacs (it's in my custom.el) and you type M-x info-display-manual sicp, or if you have Evil installed you can use Vim's command :info-display-manual sicp, you'll get the SICP book as a manual inside Emacs.

[1] https://github.com/humana/dotfiles/blob/master/emacs/prelude... [2] https://github.com/humana/dotfiles/blob/master/emacs/persona...


Wow, thank you very much for your suggestions and the .els! I guess now I'm really out of excuses not to start with SICP.


I find DrRacket sufficiently fast and sufficiently frustrating.

It's very useful for interactive debugging but dependent on the mouse, and many Emacs key combinations simply cannot be mapped because of its CUA interface...and so far as I can tell there is no key combination that switches focus between the REPL pane and the Editor pane, short of closing the other pane.

The syntax analysis gets to be a bit much too.

On the other hand, that's just the price of an IDE over a text editor, and for a batteries-included IDE it's pretty good, even if I wish the effort had been put into Emacs support, while knowing that doing so would not meet the needs of the PLT group's target audience of students.


By default Ctrl+F6 will switch focus. You can remap that to something more convenient, though (the command is called shift-focus).


I think you can switch panes with C-x C-o, same as the Emacs combo.


That's C-x o; C-x C-o in Emacs runs delete-blank-lines, and I'm not sure what it does in DrRacket.


Yeah DrRacket is fairly slow unless you have a powerful computer. I recommend just using a text editor and making use of XREPL, see: http://docs.racket-lang.org/xrepl/index.html?q=

tl;dr: open up the normal REPL by typing "racket", then type (require xrepl), and then ,install! (the leading comma denotes an xrepl command).

Then, when you want to work with a file, you can do racket -i $file to interact with it using xrepl.

If you're wondering why this isn't just the default, I think it has to do with the readline license.


     racket -il xrepl
will launch Racket with xrepl already loaded.


I love Racket but I have only ever had the same experience with DrRacket. I don't know what it is trying to do, but it brings my modest laptop to its knees.


Interesting... I've never had any problems with DrRacket. What qualifies as a "modest laptop"?


I actually found DrRacket very fast and Emacs very slow, on every computer I've tried them on. Why might this be?


Emacs Lisp is interpreted (or byte-interpreted), and (GNU) Emacs is written almost exclusively in it, so startup can be dog-slow. Once it's up and running, though, I've always found it pretty snappy.


I love the play-by-play of exactly what you're thinking as you're doing something new. I find these posts quite informative.


Web apps are a hack on a hack, yes. But it's kind of ironic he hates Javascript so much when Javascript is a pretty cool programming language heavily influenced by Scheme, which is awfully similar to Racket.

But hey, I like people who call it like they see it, and there certainly are some drawbacks to Javascript too.


> Javascript is a pretty cool programming language heavily influenced by Scheme.

I couldn't agree with you more.

There is a Lisp-to-JS compiler (ParenJS). https://bitbucket.org/ktg/parenjs


He will be amazed when he discovers custodians. http://docs.racket-lang.org/reference/eval-model.html#%28par...


There are many cool features that Racket has out of the box, like a very nice module system, delimited continuations, objects and classes, and of course macros (both hygienic define-syntax-rule and syntax-case, and unhygienic defmacro), and more.

But if I had to show one feature of Racket to make someone amazed, it wouldn't be any of those. It would be a simple program composed of a couple of files, where every file starts with a different #lang. Like #lang racket, #lang lazy, #lang typed/racket, #lang datalog.

It's sufficiently mind-blowing that there are this many languages on top of Racket, but the real "killer app" is how seamlessly they integrate with each other.

The next thing I'd show would probably be Danny Yoo's tutorial on how you can create even more languages like this: http://hashcollision.org/brainfudge/


It's almost like a practical Mozart/Oz with all Lisp awesomeness. CTM shows how programming with many different paradigms can be extremely powerful.


"Someone who knows Haskell learns some Racket"

"Tonight at 11: this rose gardener learns to plant tulips"


Which is exactly why I was very surprised when it got posted to Reddit and wasn't immediately downvoted.


Nicely written, and a fun exploration.



