
While I definitely agree Rust is a much faster language than Clojure, I'd be interested to see benchmarks showing just how much faster your Rust code was on the same data.

I also noticed you said that avoiding lazy sequences is not idiomatic in Clojure. I disagree, since using transducers is still idiomatic. I wonder whether you'd have seen some speed improvement moving your filters/maps to transducers. Though I doubt this would get you to Rust speeds, it might be fast enough.
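
For illustration, a rough sketch of that move (process and valid? are hypothetical stand-ins):

    ;; lazy pipeline: builds an intermediate lazy seq per step
    (->> data (map process) (filter valid?) (into []))

    ;; transducer pipeline: fuses the steps into one eager pass
    (into [] (comp (map process) (filter valid?)) data)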




I just moved a medium-sized codebase from Clojure transducers to JS, and after having used Clojure for 7+ years, professionally at times, I don't wanna go back, ever. The JS solution is shorter, faster, and easier to understand. I'm thankful for the insights into reality and programming that Clojure has provided, but highly optimised Clojure is neither idiomatic nor pretty: you end up with eductions everywhere. Combine that with reaaaallllyy bad debuggability from all those nested inside-out transducer calls (the stack traces have also gotten worse over the years, I don't know why) and a splintered ecosystem (lein, boot, clj-tools), and I'd pick Rust and Deno/JS any day for a greenfield project over Clojure. Sadly.
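
For readers who haven't seen it, a minimal sketch of the eduction style being described:

    ;; eduction bundles the transformation but only runs it when reduced
    (reduce + 0 (eduction (map inc) (filter even?) (range 1000)))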


Yup, it's like the leadership is actively hostile towards community building.

* Prismatic Schema, immensely popular, was "replaced" by spec, which is not yet complete and still in the research phase

* leiningen (some of the best language tooling out there) was "replaced" by the Clojure CLI, which can't do half of what leiningen can

* transducers (a brilliant concept) are not easy (as in close at hand), because the code is quite different from normal lazy-sequence based code (I wrote a library [1] to address this)

I still prefer Clojure for all my side projects, but it is very clear that the community is tiny and fragmented.

[1] https://github.com/divs1210/streamer


Schema and spec do not target the same functionality.

In general I agree that it would be best if a small community does not spread itself too thin, but on the other hand, can't Hickey, Miller et al. work on what they are interested in? They have published rationales for why those "splintering" works are of interest, and they make sense. It seems incongruent to happily use a tool that was born out of an opinionated design, and then complain when the authors keep pushing their opinionated PL design.


I have limited exposure to Clojure transducers but I spend most of my time writing JS/TS and I've found thi.ng/transducers[0] a pleasure to work with and super elegant for constructing data processing workflows.

[0] https://github.com/thi-ng/umbrella/tree/develop/packages/tra...


Debugging performance problems is the reason I stopped using `cljs`. Those stack traces are so painful.


That's a very common scenario for Clojure users to go through, and it's one of the reasons it has so many abandoned/unfinished libraries (although 7 years is a lot). After they have gathered all the insights they can from Clojure and its ecosystem (which is a worthy endeavor IMO), they go back to their big-ecosystem mainstream programming language because of all the benefits you get from it, even if that language is worse. It also doesn't help Clojure that JS in 2020 is way better than JS in 2010, and that you can easily bring all your Clojure insights/concepts to JS.


I have the opposite experience.

Every time I need to use another language outside of Clojure, it feels like most other languages are... confused. A bad case of "design by committee".


> After they have gathered all the insights they can from Clojure and its ecosystem (which is a worthy endeavor IMO)

Just out of curiosity, what do you mean by this? I've never used Clojure, but have done a fair bit of hacking in other Lisp dialects. Do you (or anyone with an opinion) think there's some insight benefit to Clojure specifically vis-a-vis Racket/Scheme/etc?


Production implementations of persistent data structures on an industrial VM, plus abstractions for state management, polymorphism, concurrency, the sequence abstraction, etc. It just gives you more for day-to-day programming. While Scheme gives you good foundations, you have to build a lot of stuff yourself; it's too primitive (I haven't followed Scheme since R5RS). But it's really mostly about the literal data structures and leveraging them anywhere you can to represent information: it's maps everywhere. "Data-oriented" is the common term used in the community. This answer by one of the Clojure maintainers sums it up better than I can: https://news.ycombinator.com/item?id=25377022
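
A small sketch of that maps-everywhere style, with hypothetical data:

    (def order {:id 1 :items [{:sku "a" :qty 2} {:sku "b" :qty 1}]})

    ;; generic functions work directly on plain data, no classes needed
    (update-in order [:items 0 :qty] inc)
    ;; => {:id 1, :items [{:sku "a", :qty 3} {:sku "b", :qty 1}]}

    (transduce (map :qty) + 0 (:items order))
    ;; => 3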

Racket extends Scheme with useful stuff for everyday programming too, but Clojure's immutable data structures, with their big library of functions for manipulating them in a nice abstract generic way, plus the fact that it runs on the JVM, give it a big edge for "real world" programming IMO. You do need strong knowledge of Java and the JVM for critical services.

Almost all your knowledge of Racket and Scheme will transfer and be valuable for Clojure, so you already know most of it and have a big head start if you plan to learn it.


Some of the top reasons for me:

* immutability with persistent data structures

* csp with core.async (see the sketch after this list)

* encourages designing a data-driven functional core with an imperative shell

* powerful REPL (better than python / scheme / racket, worse than SBCL)

* multi-platform (backend / browser / mobile through react native)

* easy and comprehensive testing because of several factors: functional nature, dynamic binding of vars, generative testing, etc.
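
On the core.async point, a minimal CSP sketch (assumes the org.clojure/core.async dependency; one producer, one consumer over a channel):

    (require '[clojure.core.async :as async :refer [chan go go-loop >! <! <!!]])

    (let [c (chan 10)]
      (go (doseq [i (range 5)]
            (>! c i))
          (async/close! c))
      ;; go-loop returns a channel; <!! blocks until its result arrives
      (<!! (go-loop [sum 0]
             (if-some [v (<! c)]
               (recur (+ sum v))
               sum))))
    ;; => 10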


Also, one of the amazing aspects of the JS ecosystem is TypeScript: a structural type system on top of an open object system is such a flexible and pragmatic tool.

Last time I used Clojure (probably 5 years ago, to be honest) the lack of static typing combined with the functional nature made complex imperative code (which you're sometimes forced to write, and there are examples of such code in the standard library) almost impenetrable.


> reaaaallllyy bad debuggability

Odd, I can just step through my code with Cursive if I need to.


How is the experience with transducers? I haven't tried it, but I fear you'd first see the transducer being constructed into a heap of anonymous functions and Clojure implementation details, and only afterwards could you step into how that gets executed for each element of the stream. I have had unpleasant experiences with Clojure debugging... so I feel what the parent post is saying.


Perhaps it's just a different way of doing the same thing, but I never feel the need to reach for a debugger in the presence of the REPL and tools like Timbre. Just wrap what you want to see in (timbre/spy ...) and you're good to go.
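
For anyone who hasn't used it, a quick sketch (assumes the com.taoensso/timbre dependency):

    (require '[taoensso.timbre :as timbre])

    (->> (range 10)
         (map inc)
         (timbre/spy)     ; logs the intermediate value, returns it unchanged
         (filter even?))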


Please stop moving the goalpost.

"Debugging clj / cljs is hardpartly due to transducers" -> "I can step though my code with cursive" -> "how does cursive handle transducers?" -> "you don't need a debugger"


The person who mentioned using the Cursive debugger is different from the person who mentioned using the REPL.


It's not about the person, but the idea.


I doubt it will bring much. If properly implemented, there is nothing that makes generator-like laziness slower than transducers, and since laziness is pretty central to Clojure, I doubt you will see much speed gain from using transducers.

In Scheme, the srfi-158-based generators are slower than my own transducer SRFI (srfi-171) only in Schemes where set! incurs a boxing penalty and where the eagerness of transducers means mutable state can be avoided.

Now, I know very little clojure, but I doubt they would leave such a relatively trivial optimization on the table. A step in a transducer is just a procedure call, which is the same for trivial generator-based laziness.


When I was reading the article, I thought the author of the post was probably pointing more in the direction of Clojure's immutable data being slower, rather than laziness specifically.

IME (admittedly in a different context, doing UI development) Clojure's seqs and other immutable data can be a huge performance drag due to the additional allocations needed. If you're in a hot loop where you're creating and realizing a sequence immediately, it's probably much faster to bang on a transient vector. Same with creating a bunch of immutable hash maps that you then throw away; better to create a simpler data structure (e.g. a POJO or Map) which doesn't handle complicated structural sharing patterns if it's just going to be discarded anyway.
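
A rough sketch of that trade-off (squares is a hypothetical example):

    ;; persistent: allocates fresh vector nodes on every step
    (defn squares [n]
      (reduce (fn [acc i] (conj acc (* i i))) [] (range n)))

    ;; transient: mutates a private copy, then freezes it; the function
    ;; stays pure from the outside
    (defn squares! [n]
      (persistent!
       (reduce (fn [acc i] (conj! acc (* i i)))
               (transient [])
               (range n))))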

Transducers would help in the author's first case: take the map/filter piped through the `->>`, which is going to do two separate passes and realize two seqs, and combine it into one.


I stand by what I said (even though I am partial to transducers): there is no reason for lazy sequences' overhead to be much more than a procedure call, which is exactly what a step in the reduction is in the case of transducers. At least when implemented as in clojure or srfi-171.

I understand that there might be some overhead to creating new sequence objects, but removing intermediate sequence object creation should be simple, at least for the built-in map-filter-take-et-al.

Edit: I seem to be wrong. People are recommending transducers over Clojure's lazy map-filter-take-et-al because of overhead, which to me seems off, but there might be something about Clojure's lazy sequences that I haven't grokked.


I probably didn't make it any more clear in my reply. Transducers don't win based on allocations IME, but because they reduce the number of iterations required.

Take the case: (->> (for ,,,) (map ,,,) (filter ,,,))

`for`, `map` and `filter` will all create lazy sequences, each holding a reference to the previous one. The problem here is that when the `filter` seq gets realized, it will first realize the `map` seq, which will realize the `for` seq. Each sequence waits for the one before it to realize before iterating over it. So in this case it will iterate twice: once for the `map`, and then again for the `filter`.

As you know, transducers combine these steps together so that you only iterate over the initial collection once.

My other comment was making the point that the author has conflated "laziness" with "immutable data" AFAICT. The lazy seqs in the first example they give will be slower w/o transducers, but the other problem is that the overhead from all the allocations required for creating a bunch of immutable hash maps that are then destroyed immediately after is also non-negligible, and seems to be a source of the author's performance problems.


I think what bjoli is getting at is that in an ideal world with lazy sequences you don't have any iteration over any of the intermediate "collections" until some step occurs that requires the entire collection. I put collections in quotes because they aren't really collections, they're generators that produce an element on demand.

So you never really iterate over more than one collection; you only have one iteration where you successively ask each generator to produce a single element and then apply a function to it. For example, if you only asked for the first element of a lazy sequence formed by a series of maps, you would (in theory, in practice see my note about chunks) never iterate over any of the intermediate sequences and only ever examine the first element of each of them.

However, the act of asking a generator to produce an element (that is unwrapping a thunk) has overhead of its own and that's the overhead that a transducer would be removing (not iteration itself in the case of lazy sequences). This can have far more overhead than a procedure call because of bad cache locality (in the absence of fusion optimizations you're pointer chasing and potentially generating garbage for each thunk). Clojure tries to get around that by often (but not always) chunking its thunks, in which case we do have multiple rounds of iteration on smaller chunks, but never the entire sequence (unless the entire sequence is smaller than a chunk).
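
Chunking is easy to observe at the REPL, since range produces 32-element chunks:

    ;; asking for one element realizes the whole first chunk
    (first (map #(doto % prn) (range 100)))
    ;; prints 0 through 31, then returns 0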


What I am saying is that lazy sequences in my world should mean you don't have to realize any intermediate collections. In the case of the srfi-158 generator (gmap square (gfilter odd? (list-generator big-list))) the overhead for getting one element from the generator would be 3 procedure calls. Without any intermediate steps. The same transducer would have one procedure call less, but would be in the same vicinity.

Do clojure's sequences not work similarly? That seems like a very lax definition of laziness.


Sounds like I was mistaken. What I was seeing in my tests was due to chunking, not realizing the whole seq.


I mean, Clojure is still running on the JVM, so there will be at least that difference. AFAIK Clojure is slower than Java, so there's that also.


Yes and no. There are real production cases where even default Clojure results in the same or faster performance than Java. One hypothetical case is a system that does a lot of complex in-memory reads and a few writes to persistent data structures. That kind of system could be faster than the Java equivalent, out of the box, in Clojure.

A write-heavy system would favor Java's mutable collections out of the box. Clojure can get pretty close to Java with transients and a good dose of type hinting in all the right places.
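
A sketch of the kind of hinting that matters in those hot spots:

    ;; surface reflective interop calls at compile time
    (set! *warn-on-reflection* true)

    ;; ^String avoids a reflective lookup of .length
    (defn strlen [^String s]
      (.length s))

    ;; primitive hints avoid boxing in numeric code
    (defn sq ^long [^long x]
      (* x x))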

When we say "faster" or "slower" it's equally important to specify "faster" or "slower" when and where. It's a complex question with no easy answer.


> One hypothetical case is a system that does a lot of complex in-memory reads and a few writes to persistent data structures. That kind of system could be faster than the Java equivalent, out of the box, in Clojure.

Why would Java be slower in a read-mostly regime? Your hypothetical is not convincing. Btw, you mention "real" and then move on to "hypothetical" as an example.

Are there actually OSS "production" cases of this subset of systems where Java lags behind Clojure in performance?

> When we say "faster" or "slower" it's equally important to specify "faster" or "slower" when and where. It's a complex question with no easy answer.

These sorts of subtle distinctions only matter to language wars and debates like this. For actual systems that need to do work and need to be maintained, we can in fact have metrics on efficiency.

That said, fundamentally, Java affords much greater facilities to "optimize" and approach white hot performance than Clojure.


> Why would Java be slower in a read-mostly regime? Your hypothetical is not convincing. Btw, you mention "real" and then move on to "hypothetical" as an example.

Because of Clojure's default persistent in-memory data structures, which allow you to obtain a stable reference to your in-memory data even when the data is being updated live. With Java's default mutable data structures, you'd have to use locking or copying to obtain a stable reference, not to mention the huge complexity of those solutions. I said "real" because I've built similar solutions in Clojure. Substitute "hypothetical" with "example"; sorry for the word confusion.
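
A sketch of what that stable reference looks like in practice (db is a hypothetical atom):

    (def db (atom {:users {"a" {:visits 0}}}))

    ;; writer: swap! atomically installs a new immutable value
    (swap! db update-in [:users "a" :visits] inc)

    ;; reader: deref takes a snapshot; later writes never mutate it,
    ;; so no locks or defensive copies are needed
    (let [snapshot @db]
      (get-in snapshot [:users "a" :visits]))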

> That said, fundamentally, Java affords much greater facilities to "optimize" and approach white hot performance than Clojure.

Taken to the extreme, the resulting Clojure would not be idiomatic; it would effectively be Java-in-Clojure. But Clojure has all the facilities that Java has, by definition.


In read-mostly regimes, the performance considerations are typically systems-level considerations. Data locality, page swaps, and cache-line misses are the typical concerns of high-performance systems engineering.

Regarding "complexity", not sure what you mean. IIRC, someone, possibly even me ;), may have pointed out to Rich Hickey in the early days that Java code could also use those (Java or was it Scala) STM libraries. So, there is your "complexity" behind an API, just like Clojure.

> Clojure has all the facilities that Java has, by definition.

Ergo, a situation where Java can -not- be made as fast or faster than Clojure code seems unlikely.

I think all we have here is your claim of "complexity". I remember Rob Pike making a comment along the lines of "prefer libraries over language semantics" in the context of Go language design. I find it a very compelling argument, based on my overall experience to date.


Transducers are definitely idiomatic. They generalize over "similar things to transform in steps" (including sequences, messages and so on), so you can apply them to collections ("I have the whole data in advance") or channels ("I get the data piece by piece") and so on.
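
Concretely, the same transducer value can be reused across those contexts (the channel line assumes core.async):

    (require '[clojure.core.async :as async])

    (def xf (comp (map inc) (filter even?)))

    (into [] xf (range 10))   ; eager collection => [2 4 6 8 10]
    (sequence xf (range 10))  ; incrementally computed sequence
    (async/chan 16 xf)        ; channel that applies xf to values in flight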

Another idiomatic way to improve performance is transients[0]. From the outside your function is still a function, but on the inside it's cheating by updating in place instead of using persistent data structures. See the frequencies function for a simple example[1].
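
The linked frequencies source looks roughly like this:

    (defn frequencies [coll]
      (persistent!
       (reduce (fn [counts x]
                 ;; assoc! mutates the transient map in place
                 (assoc! counts x (inc (get counts x 0))))
               (transient {})
               coll)))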

Clojure and Rust are both very expressive languages, and even though both can be considered niche, they have _massive_ reach: Clojure taps into the JVM and JS ecosystems, and Rust can also compile to WASM or be integrated with the JVM via JNI.

The big difference between the two, and why I think they complement each other nicely, is that Clojure is optimized for development and does its best at runtime, while Rust is optimized for runtime and tries its best at development. (There's a similar take in the article.) In other words: they both achieve their secondary goal well, but resolve trade-offs by adhering to their primary goal in the vast majority of cases.

[0] https://clojure.org/reference/transients

[1] https://github.com/clojure/clojure/blob/clojure-1.10.1/src/c...



