Hacker News
Fast Ruby – A collection of common Ruby idioms (github.com/juanitofatas)
111 points by daviducolo on Aug 5, 2015 | 24 comments



I would advise against concluding anything from a < 20% gain: the changes often impact readability (the intention becomes less clear), and the differences might just be measurement error or insignificant for any sort of real application. Not to mention, of course, that these measurements are specific to a given system, implementation, etc.


Absolutely. Moreover, many of these minor run time differences are bound to change from one Ruby version to another[1].

I would even consider it harmful to remember many of these numbers as some sort of practical knowledge. I've actually known cases where a programmer used some extraneous idiom and, when told about a more idiomatic solution, it turned out that they already knew the more idiomatic way; they used the more obfuscated alternative because "it was more performant". But that knowledge was obsolete (it only applied to an old VM version) or incomplete (it only applied in some very specific cases).

So, beware of "knowing" that `arr.last` is slower than `arr[-1]`. It might not be for too long[2].

[1]: I'm speaking about MRI versions here; of course all those measurements are off if you use JRuby, rbx or Opal.

[2]: It is useful to remember that `arr.bsearch` on sorted arrays is faster than `arr.find`. That probably won't change in the near future ;)
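
For the curious, the bsearch point in a nutshell (the array contents here are made up). On a sorted array, #bsearch halves the search space on each step, while #find scans linearly:

```ruby
sorted = (1..1_000_000).to_a

# find-minimum mode: the block returns true for the target and everything
# after it, so bsearch can binary-search in O(log n) comparisons.
via_bsearch = sorted.bsearch { |x| x >= 999_999 }

# find walks from the front, so here it inspects ~10^6 elements.
via_find = sorted.find { |x| x >= 999_999 }

via_bsearch == via_find # same answer, very different amount of work
```

This is why, as noted downthread, the bsearch/find gap is algorithmic rather than an implementation detail any VM could optimize away.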


I agree. Before you start implementing all these changes, you should probably also spot the time-consuming parts of your software using a profiler; otherwise you might be making a bad trade-off. My software engineering professor always used to say: "Never start optimizing before you have measured [using a profiler]."


Generally I agree with this thinking, but a number of the idioms in this repo (respond_to? rather than begin/rescue) do have a fairly significant perf benefit and are easier to read. And some of the other lessons, like "don't use method_missing if you can define a method instead", are well worth considering as well.
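
A minimal sketch of the two styles (variable names made up):

```ruby
obj = "hello"

# Ask first: the miss path is just a boolean check.
via_check = obj.respond_to?(:upcase) ? obj.upcase : nil

# Rescue style: a miss builds, raises, and discards a NoMethodError,
# which is far more expensive than the respond_to? check.
via_rescue =
  begin
    obj.not_a_real_method
  rescue NoMethodError
    nil
  end
```

When misses are common, the respond_to? version wins on both speed and intent.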


A minor nitpick:

> And some of the other lessons, like "don't use method_missing if you can define a method instead", are well worth considering as well.

I'd argue that this one also falls into the category of "simpler/more readable and, incidentally, more performant".

When defining methods you get the correct behaviour of respond_to? for free, whereas when overriding method_missing, you also have to take care of defining a corresponding respond_to?.

The main reason for choosing def/define_method over method_missing (when possible) should be that it is generally simpler to do so.
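
A sketch of that trade-off (class names invented for illustration):

```ruby
class ViaDefine
  [:foo, :bar].each do |name|
    define_method(name) { name.to_s }
  end
end

class ViaMissing
  def method_missing(name, *args)
    [:foo, :bar].include?(name) ? name.to_s : super
  end

  # Without this, respond_to?(:foo) would wrongly return false:
  def respond_to_missing?(name, include_private = false)
    [:foo, :bar].include?(name) || super
  end
end

ViaDefine.new.respond_to?(:foo)  # true, for free
ViaMissing.new.respond_to?(:foo) # true, but only thanks to the extra method
```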


Depends on which way you look at it, I think. Like, when building a DSL, I generally take my inputs and do the work up front to use define_method--but I know people who feel that it's simpler to just use method_missing and check against some bit of data here or there. The perf argument isn't going to change their minds overnight, obviously, but I think it helps to make sure you have all the information.
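
A tiny sketch of the up-front style (the Settings class and its data are invented): derive the method names once at class-definition time instead of dispatching through method_missing on every call.

```ruby
class Settings
  DEFAULTS = { host: "localhost", port: 3000 } # hypothetical data

  # One-time cost at load; every later call is a plain method call.
  DEFAULTS.each do |key, value|
    define_method(key) { value }
  end
end

Settings.new.port # => 3000
```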


Yes. That is sane and important advice when talking about any optimization in any language.

Unless it's an execution hotspot in your code, value clarity and maintainability over performance.

Most of the speedups in these optimizations are very small in absolute terms (fractions of a microsecond each), so they will only provide a real-world benefit if they're being called in a tight loop or something.

That all said:

1. A lot of these optimizations are also a win when it comes to clarity (Array#sample is faster and clearer than Array#shuffle.first)

2. Knowing that the "bang" version of a method is always destructive and nearly always faster is a good thing to remember in general for Ruby
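
Both points in one quick snippet (data made up):

```ruby
arr = (1..10).to_a

# Array#sample picks one random element directly, instead of building
# a fully shuffled copy just to take its first element.
arr.sample # same distribution as arr.shuffle.first, one array allocation fewer

# Bang methods mutate the receiver in place rather than allocating a new object:
s = "hello"
s.upcase!  # modifies s itself; the non-bang upcase would build a new string
s # => "HELLO"
```

The usual caveat applies: bang methods are destructive, so only reach for them when nobody else holds a reference to the object.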


Shouldn't some of those items be opened as a ticket to Ruby 2.x and standard library to ask them for a faster implementation?


Yes, a lot of these can be solved: JRuby+Truffle, for example, removes the overhead of at least parallel assignment vs normal assignment, define_method vs def, send vs a normal call, and Proc#call vs yield.
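
The Proc#call vs yield case, for reference (method names made up); in MRI it's capturing the block as a Proc object that costs:

```ruby
def via_yield
  yield 2 # the block stays a block; no Proc is materialized
end

def via_proc(&blk)
  blk.call(2) # &blk forces a Proc allocation in MRI
end

via_yield { |x| x * 2 } # => 4
via_proc  { |x| x * 2 } # => 4
```

Same semantics either way, which is exactly why an implementation like Truffle can erase the difference.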

Some of the others though, such as Array#bsearch vs Array#find, are just algorithmic complexity and not the fault of the implementation.


Really nice project, and there's a lot to learn from it.

But I think if your choice is Ruby, then you are already putting clean code above micro-optimizations. And it often pays off.


Most examples aren't really about micro-optimizations. They're more about calling the appropriate method in the first place (count vs. size/length, gsub vs. sub/tr).
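
A quick illustration of "the appropriate method in the first place" (example data made up):

```ruby
nums = [1, 2, 2, 3]

# When you just want the element count, #size / #length are direct;
# #count's real job is filtering:
nums.size     # => 4
nums.count(2) # => 2, a single pass with no intermediate array,
              # unlike nums.select { |n| n == 2 }.size

# Likewise #tr handles plain character substitution; #gsub accepts
# regexps, so it brings heavier machinery than this case needs:
"foo-bar".tr("-", "_")   # => "foo_bar"
"foo-bar".gsub("-", "_") # => "foo_bar", same result, more work
```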


Choose readability over speed unless speed is a problem.


I like to visualize code quality as a point on a triangle with corners labeled "readable", "extensible", and "performant." Most of what is considered "low-quality" code is biased towards one side of the triangle, and most code quality advice tends towards over-correcting that bias towards another corner of the triangle.

What I love about this presentation is that it's not advice -- it's a collection of tools for your perusal. Use at your own discretion. A lot of quote-unquote "high-quality" Ruby code I've seen is either biased towards readability, biased towards extensibility, or sitting somewhere on the edge between the two. So I really do think every Ruby developer should at least glance at this collection and be familiar with it.


If you are writing Ruby you already decided you didn't care about speed.

Don't get me wrong, I think Ruby is a great language and I use it every day to get paid, but it is not a speed queen. Ruby's strengths lie in flexibility, fast iteration, readable code and permitting a functional style.

In most practical web applications the big bottlenecks will be either view rendering or database. Choosing to put any focus at all on performance of something like parallel vs serial assignment (unless you are doing something truly pathological) is a complete waste of time.

It will have no noticeable difference to the end user and distracts from the far more important job of making your code modular, extensible and readable.

If you need your code to be fast and you are running Ruby, you already lost. Use Java or a compiled language instead.


This is neat! I think the "why" section that's on some of them is the most valuable part; absorbing that "why" into your lizard brain can add up to huge changes in your natural style :)


In situations where it's unlikely that calling a method will result in a NoMethodError, I prefer rescuing the exception over checking whether the method exists first. This is faster when no NoMethodError is actually raised.

In the "Enumerable#select.last vs Enumerable#reverse.detect" benchmark it would be interesting to know what the result of Enumerable#reverse_each.detect would be.
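
A quick sanity check on that last idea (the multiple-of-7 data is made up): #reverse_each returns an enumerator that walks the array backwards, so #detect can stop at the first hit without first allocating a reversed copy the way #reverse.detect does:

```ruby
nums = (1..1_000).to_a

nums.select { |n| n % 7 == 0 }.last         # builds the whole filtered array
nums.reverse.detect { |n| n % 7 == 0 }      # allocates a reversed copy first
nums.reverse_each.detect { |n| n % 7 == 0 } # no intermediate array at all
```

All three return the same element; whether the last one is actually fastest would still need benchmarking.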


I'm surprised at the speed difference between parallel and sequential assignment styles. I prefer the sequential, so that's a nice bonus that it's also more performant.


Hi! Actually the speed difference between parallel and sequential assignment styles was not correct. Please read more details here: https://github.com/JuanitoFatas/fast-ruby/pull/50. Thank you.


Awesome comparisons. Extremely good to know. Would love Rubinius on Linux, and comma-delimited or underscore-delimited results. Maybe I'll run and post those that way.


This is a great collection, but any listing of microbenchmarks needs a caveat.

Consider the first example, parallel assignment vs sequential assignment. As we can see by the results, parallel assignment is 2.25x slower, which seems like a monumental impact to performance, right? If all your application does is assign a few variables and exit, sure, but very few applications are this simplistic. In order to make a good judgement call on this optimization, you have to understand the impact within the context of your application:

What is the total execution time of your application?

What portion of that execution time is spent on assignment?

What portion of that execution time is spent on the extra allocation of an array due to parallel assignment?

At the bottom of the benchmark, we can see the iteration rate for each. Parallel assignment managed a rate of 2521708.9 iterations per second. We can work out the total execution time per iteration from this number:

Single iteration as a fraction of a second: 1/2521708.9

In decimal form: 0.000000396556478 s

Converted to milliseconds: 0.000397 ms

The same conversion for sequential assignment gets us: 0.0001758783 ms.

In each iteration, we save 0.0002206782 ms.

Circling back to my list of questions: what is the total execution time of my application? If my app makes any I/O calls — and especially network I/O — it could be hundreds of ms. At this delta, it would take over 4,500 iterations of this optimization to achieve an improvement of 1 ms. If we're talking about an operation that runs locally and is 100% in-memory, execution times may be <50 ms; you'd still need those same ~4,500 iterations per 1 ms saved, but that 1 ms is now a much larger fraction of the total.
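
The arithmetic above, redone in Ruby for anyone who wants to check it (the i/s figure is the one quoted from the benchmark, not re-measured):

```ruby
parallel_ips  = 2_521_708.9                  # quoted iteration rate
parallel_ms   = 1.0 / parallel_ips * 1000    # ~0.000397 ms per iteration
sequential_ms = 0.0001758783                 # quoted above
delta_ms      = parallel_ms - sequential_ms  # ~0.00022 ms saved per iteration

iterations_per_saved_ms = (1.0 / delta_ms).ceil # ~4,532
```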

At this point, I have to tattle on myself: this is an obtuse method of analysis. Microbenchmarks are hard, and at the i/s rates we're seeing here, there could be confounding factors that the author (and I) haven't accounted for. Things like garbage collection and object caching have an impact at these time scales. We also have to ask whether our microbenchmark reflects reality: what real-world application repeatedly assigns literals to variables millions of times per second? Extrapolating any meaningful decisions from the microbenchmarks alone is a fool's errand.

The lesson is that microbenchmarks can only tell you so much. A comprehensive approach to optimization involves looking at the total run time and apportionment of time in an actual application. This process is called profiling, and the tools for profiling Ruby applications have improved in recent times.

Looking at the parallel vs sequential assignment difference, what you really want to know is whether parallel vs sequential assignment is impacting your application, and to what degree. Profiling tools will tell you where your application spends its time, and where it's allocating memory. This tells you where to look. Microbenchmarks will tell you which idioms you pay a penalty for. The combination of the two allows you to make smart decisions.

If you have a parallel assignment wrapped in a loop that executes hundreds of thousands of times every time your application runs, it will show up during profiling, and moving to sequential assignment will likely pay dividends. Otherwise, the penalty paid for parallel assignment is probably minimal. Profiling is a good way to tell the difference.


Good job. It would be nice if benchmarks could be arranged by the gains in ascending order.


Is there a similar collection for other languages, like Python?


Super interesting... thanks!


Good read!



