
I'm sorry, but this is not a benchmark, it is a farce.

One run, 7 ms, 2 ms? That is statistical fluctuation, not data, especially if it was run on a "typical" developer laptop in a "typical" session, with browsers and other heavy hitters running in the background and turbo boost and frequency governors left on.

You need an OS where almost all software (including most system services) is killed and the CPU frequency is fixed, with all power-saving technology turned off in both firmware and OS. I'm not sure that is even possible on Apple Silicon laptops, and many Intel-based laptops with stripped-down BIOSes are not suitable either.

You need warm-up loops to warm the caches, or special code to flush them, depending on what you want to measure.

You need a tight loop calling your function (and you must be sure the compiler does not inline it into the loop and optimize the call away) that runs enough iterations to take at least several seconds of wall time.

You need several such loop runs (ideally 10 or more) to have something that looks like statistics.

You need to calculate the standard deviation and check that it is small enough (and, if it is not, you need to understand why).
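For concreteness, a minimal sketch of that procedure (in Julia, since that's the language under discussion; `f` and `x` are placeholders for whatever is being measured, and it assumes `f(x)` returns a number so an accumulator keeps the calls from being optimized away):

    # Sketch of the procedure above: warm-up, a tight timed loop,
    # several repetitions, and a standard deviation check.
    function crude_bench(f, x; runs = 10, iters = 10_000_000)
        sink = f(x)                      # warm-up call: compiles f and warms caches
        per_call = Float64[]
        for _ in 1:runs
            t0 = time_ns()
            for _ in 1:iters
                sink += f(x)             # accumulate so the call is not dead code
            end
            push!(per_call, (time_ns() - t0) / iters)
        end
        n = length(per_call)
        mu = sum(per_call) / n
        sigma = sqrt(sum((t - mu)^2 for t in per_call) / (n - 1))
        println("mean ≈ $(round(mu; digits = 2)) ns/call, std ≈ $(round(sigma; digits = 2)) ns (sink = $sink)")
        return mu, sigma
    end

    # Example: crude_bench(sin, 1.2)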

Then it is a benchmark.

Otherwise it is FUD.




I think you might not be familiar with the package used to benchmark Julia [1].

It does not pin processes to CPUs or set the kernel governor to performance, and there are fluctuations from normal use of the computer. But it does run the function for several seconds and returns the distribution of the runs (the little graphic underneath the benchmarks). It calculates the standard deviation, and if some runs are too fast (sub-nanosecond) it emits warnings saying the results might be caused by inlining and constant propagation.
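For anyone who hasn't used it, usage looks roughly like this; the workload here is a placeholder, not the Mandelbrot code from the thread:

    using BenchmarkTools, Statistics

    A = rand(1000, 1000)               # placeholder workload
    b = @benchmark sum($A)             # `$` interpolates A so global lookups are not timed
    display(b)                         # distribution of samples, histogram, memory, allocations
    minimum(b), median(b), mean(b), std(b)   # estimators over all collected samples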

The differences in runtimes you refer to come from using different machines or different routines, which is completely expected. They also argue that the Mojo code needs to run on the same machine as the Julia code to give meaningful results and comparisons.

While to an outsider it might look like this was done without care, I can assure you that these people are used to taking extreme care in how they benchmark. Again, it might just be that you're not familiar with the tooling developed for it.

I do think more benchmarks need to be done, as the Mojo code hasn't been optimised yet and no one in that thread (besides the OP) was able to run both the Julia and Mojo code on the same machine. But I'm sure this will be done (I guess sooner rather than later). :)

[1] Documentation of the package used for benchmarking: https://juliaci.github.io/BenchmarkTools.jl/stable/ There you can find everything you raised in your comment, and more, about reproducibility of benchmarks in different environments. White paper on the strategies the package uses: https://arxiv.org/abs/1608.04295


Are you sure the Mojo code hasn't been optimized? It seems to be hand-tuned with SIMD operations and multithreading.


I cannot edit my comment, but you're completely right about it being optimised. I will say that I'm not familiar with which idioms the Mojo compiler is able to optimise better (a problem arising from the Mojo compiler still being closed source, in contrast with Julia's open source nature), and, in that sense, I don't know whether there is more "compiler friendly" code with the same semantics for Mojo that might allow it to get nearer to the Julia results.


> I'm not familiar with which idioms the Mojo compiler is able to optimise better

> I don't know whether there is more "compiler friendly" code with the same semantics for Mojo

The Mojo code here is from the official docs [1], so it's from the people best placed to know what the most "compiler friendly" code for Mojo would be, and what idioms they should use to get the best performance Mojo can provide.

[1] https://docs.modular.com/mojo/notebooks/Mandelbrot.html


Thanks for the link! Then I guess there is not a lot more that can be said currently, except that maybe Mojo still needs more compiler work. Still, I prefer how readable the Julia code is. :)


You should read a little more closely before such strong condemnations.

The Julia macros @btime and the more verbose @benchmark are specially designed to benchmark code. They perform warm-up iterations, then run hundreds of samples (ensuring there is no inlining) and output the mean, median and standard deviation.
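Roughly, with a placeholder function rather than the thread's Mandelbrot code:

    using BenchmarkTools

    v = rand(10_000)
    @btime sort($v)           # prints the minimum time and allocations, returns sort's result
    t = @benchmark sort($v)   # returns a Trial with all samples, shown as a histogram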

This is all in evidence if you scroll down a bit, though I’m not sure what has been used to benchmark the Mojo code.


This is such a classic strongly opinionated, inflammatory, and fundamentally ignorant Hacker News comment.


I returned to HN recently after a few years away, and I swear the average commenter has gotten much worse. Simultaneously more arrogant, more ignorant, and with worse reading comprehension.

I used to be able to count on commenters understanding what was written even if they disagreed, but lately I see many comments confidently responding to something that wasn't relevant or even present.


I suspect there's a bit of an Eternal September effect going on as a wider audience ends up here (possibly fleeing the continuing "enshittification" of almost all for-profit online fora).


It’s not one run: they are using BenchmarkTools, which provides macros to compute exactly the kind of statistics that you expect.

One can say anything, but the Julia community takes performance and benchmarking really seriously.


> One run, 7ms, 2ms?

These are not single runs of the code. The Julia code uses `@btime` from BenchmarkTools, which runs many iterations of the code until a certain number of seconds or samples is reached. The Mojo code uses `Benchmark` from a `benchmark` package, which I assume does similar things.
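If it matters, the sampling budget is adjustable; a small sketch (the workload is a placeholder):

    using BenchmarkTools

    x = rand(1_000_000)
    # Collect up to 2000 samples or run for up to 10 seconds, whichever limit is hit first.
    t = @benchmark sum($x) seconds=10 samples=2000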

Beyond that, this is one person getting curious about how a newly released language compares to an existing language in a similar space, and others chiming in with their versions of the code. If you have higher standards for benchmarks and think it will make a difference, you're welcome to contribute some perfect benchmarking results yourself.


Come on, of course this is not a thorough benchmark, just a random forum thread where someone wants to get a feel for the performance of a new technology.

They could have used @benchmark instead of the @btime macro, though. The first gives you the statistics you asked for, whereas the second is a thin wrapper around @benchmark that just prints the minimum time across all runs.

Nevertheless, the takeaway of this thread is pretty clear, even without @benchmark: the performance difference mainly stems from SIMD instructions.
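For reference, this is the kind of annotation that exposes SIMD in Julia; a toy reduction, not the actual Mandelbrot kernel from the thread:

    # @inbounds drops bounds checks and @simd lets LLVM vectorize the loop.
    function vsum(x::Vector{Float64})
        s = 0.0
        @inbounds @simd for i in eachindex(x)
            s += x[i]
        end
        return s
    end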


This is using BenchmarkTools, which does most of these things ;) And the numbers are very nicely reproducible. But yes, those benchmarks are hard to compare, especially if they don't run on the same machine! Still, knowing Julia's compiler and architecture, there is absolutely no reason to believe that Julia can't get optimal performance on given hardware... This is much more a benchmark of Mojo, which hasn't been proven in that regard.



