
[note: see more nuanced comment below]

The Julia benchmark two links deep at https://github.com/h2oai/db-benchmark doesn't follow even the most basic performance tips listed at https://docs.julialang.org/en/v1/manual/performance-tips/.




What specifically are you thinking of?

The non-const global variables stand out to me, but I'm not experienced enough to tell whether that would make a large difference.
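For anyone unfamiliar with the tip, here is a rough sketch of what the performance manual means by non-const globals. All names (`scale`, `SCALE`, the `sum_scaled_*` functions) are made up for illustration and are not taken from the benchmark scripts, and how much it matters depends on where the time is actually spent.

```julia
# Hypothetical example, not code from the benchmark.
scale = 2.0          # non-const global: its type could change later
const SCALE = 2.0    # const global: the compiler can rely on its type

function sum_scaled_global(xs)
    total = 0.0
    for x in xs
        total += scale * x   # reads an untyped global on every iteration
    end
    return total
end

function sum_scaled_const(xs)
    total = 0.0
    for x in xs
        total += SCALE * x   # type is known, so the loop can be specialized
    end
    return total
end

xs = rand(10^6)

# Call each function once first so compilation is not part of the timing.
sum_scaled_global(xs)
sum_scaled_const(xs)

@time sum_scaled_global(xs)
@time sum_scaled_const(xs)
```

The gap shows up when a global is read inside a hot loop like this one. If the script only passes globals into library calls that do the heavy lifting, the effect is likely much smaller.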


Non-const globals could be an issue, but it's possible they don't matter much for this particular benchmark. I'm a little worried about compilation time (apart from precompilation) being counted in the results (would that also be done for C++ code?). But I must confess I probably posted my comment a bit too soon, partly because of the time of day and partly because of the semicolons at the end of each line in the code, which made me quickly assume the benchmark writer was using Julia for the first time. While I have a good amount of experience with Julia, I don't have much experience with DataFrames.jl itself, so I don't know for sure whether the reported benchmark times are reasonable.
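To make the compilation-time concern concrete, here is a small sketch that times the first and second call of the same function. `work` is a hypothetical stand-in for one benchmark query, not anything from the actual scripts.

```julia
# Hypothetical workload standing in for a single benchmark query.
work(xs) = sum(x -> x^2, xs)

xs = rand(10^6)

# The first call includes JIT compilation of `work` for this argument type.
first_call = @elapsed work(xs)

# The second call reuses the compiled code, so it is mostly run time.
second_call = @elapsed work(xs)

println("first call:  ", first_call, " s")
println("second call: ", second_call, " s")
```

Whether the first-call time should be reported depends on what the benchmark is meant to measure, but it helps to report the two separately.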


From the git history it seems like DataFrames.jl maintainers contributed at least some fixes to the scripts, so I guess that means they aren't opposed to it.


I often end every line with a semicolon, so that it doesn't flood a REPL if I run it there.

IIRC, groupby hasn't been optimized in DataFrames.jl yet.



