Improving Ruby Performance with Rust

pmontra · on Nov 23, 2017

If one needs speed one does everything it takes. That said, I read the code of the example. It's short and even if I don't know Rust I've been paid to write C in the 90s. They look similar enough.

I see we're back to stuff like

    let ptr = path.as_ptr();

    let c = unsafe { *ptr.offset(i) };

    if c == SEP {
      return &path[(i + 1) as usize..end];
    };

It's as low level as it can get. I remember that I wrote a small web app in C in 1994 but I wrote the next one in Perl and never looked back. It was some hundreds of lines, countless core dumps and many hours vs much less code and pain and time.

Again, Ruby and Python are all about connecting those small pieces of C code that implement their builtin and library methods / functions. Writing them in C or Rust makes no difference. I personally won't write code in those kind of languages again unless I find myself in a scenario where CPU time is worth more than my time. Maybe we'll be back to that with functions in the cloud billed by the millisecond. Programming is going to be a pain again. At least modern languages like Rust don't crash like C and have a better concurrency story. Meanwhile I'll keep using Ruby and Python with C or Rust extensions from GitHub written by somebody else (a big thank you!), and Elixir where concurrency matters.

__s · on Nov 23, 2017

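    let mut bytes = path.bytes();
    // Index of the last byte that is not a separator (skips trailing separators).
    if let Some(end) = bytes.rposition(|x| x != SEP) {
        // `bytes` now covers path[..end]; find the separator before the last segment.
        if let Some(idx) = bytes.rposition(|x| x == SEP) { &path[idx + 1..=end] } else { &path[..=end] }
    } else { "" }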

There are other ways to do this, like using a single iterator where we scan manually, but I just threw this together to demonstrate that this code needn't use unsafe, & by extension needn't use pointers. All that unsafe stuff is more so that if you need to drop down to C from safe-Rust you can instead use unsafe-Rust

sambe · on Nov 23, 2017

The real Rust basename implementation is not so low-level:

https://doc.rust-lang.org/src/std/path.rs.html#1801-1808

I don't see how Rust vs C makes no difference. Both languages have trade-offs and I'm pretty sure that Rust advocates would claim the increased safety of the Rust part as a big advantage, and it will certainly reduce those core dumps. Probably the reason there are so many of those advocates is because they don't have to write such error-prone low-level code. Expressive code is more fun, in my experience.
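
For comparison, a minimal sketch of getting the last path component through the standard library rather than by hand. The wrapper name `basename` is made up for illustration; `Path::file_name` and `OsStr::to_str` are the real std calls.

```rust
use std::ffi::OsStr;
use std::path::Path;

/// Hypothetical wrapper: last component of `path`, if it has one
/// and it is valid UTF-8.
pub fn basename(path: &str) -> Option<&str> {
    // `file_name` ignores trailing separators and returns None for
    // paths such as "/" or ones ending in "..".
    Path::new(path).file_name().and_then(OsStr::to_str)
}
```

No pointers and no `unsafe` appear at this level; whether it is fast enough for the article's use case is a separate question.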

viach · on Nov 23, 2017

> I personally won't write code in those kind of languages again unless I find myself in a scenario where CPU time is worth more than my time

It is very much possible that if you run code on a big enough number of CPUs, it'll be worth more than an average developer's time.

oconnor663 · on Nov 23, 2017

For what it's worth, I don't think that line needs to be unsafe. I think you can just do this:

    let c = s.as_bytes()[i];

In unoptimized/debug mode that's going to do array bounds checks that the unsafe code doesn't, but I think the compiler is smart enough here to see that you're doing the exact same checks in the loop condition too, and it can optimize them out in release mode?
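
A rough sketch of what the scan could look like with plain slice indexing instead of pointer offsets. The function name `last_segment` and the choice of `/` as `SEP` are assumptions, not the article's code, and whether the optimizer really removes every bounds check here would have to be confirmed by looking at the generated assembly.

```rust
const SEP: u8 = b'/';

/// Hypothetical safe rewrite: returns the last path segment,
/// ignoring trailing separators.
pub fn last_segment(path: &str) -> &str {
    let bytes = path.as_bytes();

    // Skip trailing separators.
    let mut end = bytes.len();
    while end > 0 && bytes[end - 1] == SEP {
        end -= 1;
    }

    // Walk backwards until the separator that precedes the segment.
    let mut start = end;
    while start > 0 && bytes[start - 1] != SEP {
        start -= 1;
    }

    // Both indices sit next to an ASCII separator or at an end of the
    // string, so they are valid UTF-8 boundaries.
    &path[start..end]
}
```

Each index is guarded by the loop condition right before it is used, which is the pattern the compiler is usually able to reason about.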

rapsey · on Nov 23, 2017

Of course it looks like C, it's using a C interface to interface with Ruby. Normal Rust does not look like that. Your comment is quite frankly entirely pointless.

whyever · on Nov 23, 2017

Your comment is misleading. The code in question is inside a function of the signature

    extract_last_path_segment(&str, &str)

This is pure Rust, no C interface involved.

sambe · on Nov 23, 2017

Per my comment above linking the real implementation: the comment is perhaps misleading about the cause but its claim is largely true. Rust code tends not to look like this.

whyever · on Nov 24, 2017

Well, unsafe Rust does tend to look like this. You were claiming that it is only for interfacing with C code, but unsafe is also used for performance, as in the snippet above. You were IMHO incorrectly dismissing the parent comment based on that. (On the other hand, the code in question could probably be rewritten in safe Rust while still avoiding the bounds checks. It would be less straightforward though.)

vvanders · on Nov 23, 2017

Not just Ruby, Python too.

Was doing some work of converting [0,1] signals to 16-bit PCM data with 550Hz tone in numpy.

Python version took ~15 minutes to generate 5,000 4 second files. Broke out the inner loop into Rust with FFI via ctypes and cut that time to ~10 seconds with nearly identical code.

kodablah · on Nov 23, 2017

Not just Python, JVM langs too (and other runtimes w/ fast C interfacing, e.g. node w/ neon). JNI is actually incredibly easy with Rust. I've even used some advanced features of the JVM via JVMTI and exposed it via JNI all in Rust as a learning exercise [0]. Contemplating writing a fuzzer using similar tech soon.

0 - https://github.com/cretz/stackparam

fulafel · on Nov 23, 2017

A 100x difference to native and having an inner loop in Python code sound exceptionally slow for a numpy app. I guess you could not leverage numpy array operations?

vvanders · on Nov 23, 2017

Nope, I agree that base numpy operations are snappy. However I needed to both multiply by a sine wave of 550 Hz (changing input) and taper the edges of the signal on the transition from 0->1 to not generate a ton of harmonics.
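
To make the description concrete, here is a sketch of that kind of inner loop in Rust. It is not the code from this comment: the function name, the parameters, and the linear ramp are all assumptions chosen for illustration.

```rust
use std::f32::consts::TAU;

/// Hypothetical helper: turns a 0/1 gate signal into 16-bit PCM that
/// carries a sine tone, with linear ramps on every 0->1 and 1->0 edge.
pub fn gate_to_pcm(gate: &[f32], sample_rate: f32, tone_hz: f32, ramp_samples: u32) -> Vec<i16> {
    let phase_step = TAU * tone_hz / sample_rate; // radians per sample
    let env_step = 1.0 / ramp_samples.max(1) as f32; // envelope change per sample
    let mut envelope = 0.0f32;
    let mut phase = 0.0f32;
    let mut out = Vec::with_capacity(gate.len());

    for &g in gate {
        let target = g.clamp(0.0, 1.0);
        // Move the envelope toward the gate value by at most `env_step`,
        // which smooths the edge instead of switching instantly.
        if envelope < target {
            envelope = (envelope + env_step).min(target);
        } else if envelope > target {
            envelope = (envelope - env_step).max(target);
        }
        let sample = envelope * phase.sin();
        out.push((sample * i16::MAX as f32) as i16);
        phase = (phase + phase_step) % TAU;
    }
    out
}
```

The per-sample state (envelope and phase) is what makes this awkward to express as whole-array numpy operations and cheap to express as a compiled loop.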

gravypod · on Nov 23, 2017

Have you tried using Julia for this task? It's compiled and provides many of the math-y facilities you'll need for something like this (I'd assume).

vvanders · on Nov 23, 2017

Yeah, I've heard of Julia before but looks like it doesn't have a very clean[1] FFI story. I'm not super-interested in spinning up a full-blown interpreter for an inner loop. Plus I already had Rust installed :).

Seriously though Rust was 43 lines of code + cargo build vs 33 lines of Python and a simple C FFI so it was pretty straightforward to drop in.

[1] https://docs.julialang.org/en/release-0.4/manual/embedding/

chroem- · on Nov 23, 2017

I think the whole idea behind using Julia in this situation would be that you get good enough performance "out of the box" so that you don't have to resort to using an FFI in the first place.

vvanders · on Nov 23, 2017

But that's also the brilliance of Rust supporting a clean C ABI.

I can write in anything I want since every language supports C ABI for system calls.

I don't want to have to learn a whole new language and new API at the same time.

Also a cursory glance at Julia's numpy support looks like it requires deep copies which is painful when you want things to be fast.

I'm sure it's a great language but needing to drag in a whole new VM means it doesn't fit my needs.

gravypod · on Nov 23, 2017

Julia is JIT-compiled via LLVM to your native machine code. Correctly written Julia is within 1.5x the speed of C.

It removes the complexity of needing to do the FFI at all.

vvanders · on Nov 23, 2017

You make it sound like ffi is a nasty 3 letter word ;).

It took me 3 lines of code to import the dll, assign the signature and call the function, I'd hardly call that complex.

mlevental · on Nov 23, 2017

This is probably a dumb question, but did you just compile the Rust binary and then expose the functions? How? I don't remember, but does Rust have header files?

vvanders · on Nov 23, 2017

Yup, just declare the function like so:

  use std::os::raw::{c_float, c_short, c_uint};

  #[no_mangle]
  pub unsafe extern "C" fn transform_pcm(
        data: *mut c_short,
        len: c_uint,
        ramp_time: c_uint,
        tone_delta_frame: c_float) {
      // ... function body not shown ...
  }

then call from python:

  gen_pcm_dll = ctypes.cdll.LoadLibrary("gen_pcm_rust/target/release/gen_pcm.dll")
  gen_pcm_dll.transform_pcm.argtypes = [ctypes.c_void_p, ctypes.c_uint, ctypes.c_uint, ctypes.c_float]
  gen_pcm_dll.transform_pcm(ptr, len(pcm), ramp_time, tone_delta_frame)

Rust doesn't generate headers (although there are helper libs if you want) and you don't need them to FFI.
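
As a sketch of what the inside of such a function tends to look like, the example below turns the raw pointer and length into a slice and edits the samples in place. `halve_pcm` is a made-up function, not the `transform_pcm` body, which the comment does not show. The export attribute is left off so the sketch compiles under any edition; a real library would add `#[no_mangle]` as above (spelled `#[unsafe(no_mangle)]` in the 2024 edition).

```rust
use std::os::raw::{c_short, c_uint};
use std::slice;

/// Hypothetical example: halves every sample in a caller-owned buffer.
///
/// Safety: `data` must point to `len` valid, writable samples.
pub unsafe extern "C" fn halve_pcm(data: *mut c_short, len: c_uint) {
    if data.is_null() {
        return;
    }
    // SAFETY: the caller promises `data` points to `len` valid samples.
    let samples = unsafe { slice::from_raw_parts_mut(data, len as usize) };
    for sample in samples.iter_mut() {
        *sample /= 2;
    }
}
```

Once the slice exists, everything after the `unsafe` line is ordinary bounds-checked Rust, which keeps the unsafe surface to a single statement.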

the_mitsuhiko · on Nov 23, 2017

Here is how we do it: https://blog.sentry.io/2017/11/14/evolving-our-rust-with-mil...

one-more-minute · on Nov 23, 2017

This is fine for new projects, but OP already has a large Ruby codebase. "Just rewrite the whole thing in Julia and you won't need FFI" isn't a reasonable ask.

Rust's ability to compile to a small C-compatible binary is definitely an advantage here, although Julia will have similar capabilities soon.

jamii · on Nov 23, 2017

> although Julia will have similar capabilities soon.

Do you have a link to further info on that?

one-more-minute · on Nov 23, 2017

See https://github.com/JuliaComputing/static-julia

A lot of this stuff has worked for ages, but it's rough in various places and needs a better interface. I'm not sure to what extent exporting C functions is actually supported in that wrapper script, or whether you'll need to fiddle with compiler options.

coldtea · on Nov 23, 2017

No, but you would have to make do with a subpar/immature third-party ecosystem AND you'd have to port all your Python code to Julia before you could rely on it entirely.

_dcwr · on Nov 23, 2017

What did you do exactly and on how much input data? Do you have that Rust and Python code available somewhere or is it too sensitive?

I'm very tempted to compare that with LuaJIT and/or its FFI[0] (just to use structs, not to call any native code) just to see the results.

[0] - http://luajit.org/ext_ffi.html

rplnt · on Nov 23, 2017

Have you looked at cython? Might have been a great compromise. I saw similar gains in my code eventually, but the process to get there was very smooth. At first I could just use the original python code and convert it piece by piece (of course sometimes you need to refactor). You can drop the GIL too.

kroltan · on Nov 23, 2017

If one is to write such simple code in unsafe mode, might as well just use C!

We have https://doc.rust-lang.org/std/primitive.str.html#method.as_b... for accessing a str bytewise. No need for pointers and `unsafe`.

The whole point of Rust is being safer and higher-level than C, and if you don't want to use its features, there is no improvement over C.

kzrdude · on Nov 23, 2017

There is plenty of improvement over C: generics, ADTs, type inference, and toolchain. Which you can combine with raw pointer hackery if you want.

kroltan · on Nov 23, 2017

Yes, I'm just nitpicking his example code. It might pass the wrong impression to people who never saw Rust before.

rqs · on Nov 23, 2017

Sorry for my ignorance, but since you're already doing Rust, why not just use Rust?

I mean, what's the benefit you get from calling Rust code from Ruby over FFI that you cannot get by writing everything directly in Rust?

ajmurmann · on Nov 23, 2017

It's a lot easier to write Ruby. For lots of things Ruby also has much more mature libraries. Of course there are other benefits to writing everything in Rust as well. Everything is a trade-off. It's quite disappointing. I would have liked a free lunch...

kenhwang · on Nov 23, 2017

By avoiding rewriting all of an existing ruby project too. This way you can just speed up the slow parts and convert piecemeal when necessary.

Twirrim · on Nov 23, 2017

Depending on the size of the original project, it could take months to rebuild the entire thing in Rust from Ruby.

By leveraging this you could quickly rewrite some slow parts in Rust, gaining the benefits with only a few days' work. It's certainly worth exploring.

rapsey · on Nov 23, 2017

Rust is a fantastic C replacement. It is nowhere near being a replacement for higher level languages in terms of productivity and ease of use.

bluejekyll · on Nov 23, 2017

I think you might have that slightly incorrect. Rust is a high level language, capable of C and C++ speeds. As you get to know the language, I find it just as productive as Java, Ruby and Python. In some ways more so, because the compiler catches so many bugs before you even run (not to imply you won’t have bugs, just different ones).

YMMV.

kenhwang · on Nov 23, 2017

I find Rust to be about as productive as Scala (which I find to be generally more productive than Java). Both are generally functional, with a powerful type system, with a lot of flexibility in the level of abstraction, with an expansive preexisting ecosystem you could shim in (C or JVM).

However, I don't think Rust is a more productive general programming language than Ruby/Python. I think Rust is more productive when solving systems-level concerns, but most problems aren't systems problems. Crystal is also looking to be a nice compromise for a productive, C/C++ level performance language (if the ecosystem catches up).

Thaxll · on Nov 23, 2017

Crystal is a language that doesn't support multi-threading, talking about performance ...

kenhwang · on Nov 23, 2017

It's also an alpha language, but it seems like they're planning on using M:N threading (like Go/Erlang).

rapsey · on Nov 23, 2017

My work has morphed from Erlang/C to Erlang/Rust and it's mostly Rust atm. I have enough experience to see the good and bad parts of Rust.

The mental effort of writing Erlang is an order of magnitude lower compared to Rust. While Rust is more fun, it does not and can not come close.

I'm not much for Java/Python/Ruby. I don't see those as productive as Erlang so Rust may be closer.

Thaxll · on Nov 23, 2017

Rust as productive as Python lol ... things you read on HN. Even if the compiler catches more bugs it doesn't mean it's more productive. If I give you a problem and you have to solve it in Python and Rust, the Python version will be done much faster because the language has higher-level constructs for day to day programming.

steveklabnik · on Nov 23, 2017

It depends on what you mean. You're optimizing for time to first solve; what about when you find bugs that would have been caught at compile-time in Rust? What about existing codebases, rather than writing new code? It's not that simple of a question.

I find personally that if there are libraries to help me out, I'm only slightly slower in Rust than I am in Ruby, but then there's no debugging time in Rust, and invariably, there will be tons of work I have to do later with the Ruby to shake out bugs.

YMMV.

jgraham · on Nov 23, 2017

FWIW my last project was in Rust, and the current one is in Python (with which I have more experience), mostly due to Python having some useful libraries for what is mostly an exercise in gluing together different external systems.

Despite the fact that I chose to do this in Python, I'm not at all sure I wouldn't have had a working solution faster in a language that had a good type system and an error handling strategy that made errors explicit and forced them to be handled. I'm almost certain that writing in Rust would have ended up with something that was more reliable in the long term.

(FTR I imagine that Go may also have been a reasonable choice for this kind of work, but I haven't used it).

zaarn · on Nov 23, 2017

That's not really my experience, my Rust projects tend to take a lot longer to gain critical features while projects I develop in Go can be MVP way faster.

Rust is quite a good language though, I'll admit that, I'm currently learning it by way of writing a kernel in it. But there are more productive languages out there.

bluejekyll · on Nov 23, 2017

I wonder if what you're working on is a major contributor to the complexity and speed of the project.

When working that low level, there are a lot of lifetime issues you're going to need to manage, and that can definitely make Rust have a higher cognitive load.

On a side note have you looked at this blog series? https://www.tockos.org/blog/2017/apsys-paper/

They’re definitely paving an interesting path in the kernel space for Rust.

zaarn · on Nov 27, 2017

One major roadblock I had lately was to manage paging in interrupts. Rust doesn't really like static globals but that is the only way to sanely manage paging in interrupts. So I spent probably a day to figure out the correct way to write a small interface to handle a global kernel state variable.
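
A minimal sketch of what such an interface can look like in hosted Rust, assuming a made-up `KernelState` struct and helper names. It uses `std::sync::Mutex`, which a `no_std` kernel would not have; there a spinlock taken with interrupts disabled would presumably play the same role.

```rust
use std::sync::Mutex;

/// Hypothetical global state; the fields are placeholders.
#[derive(Debug)]
pub struct KernelState {
    pub page_faults: u64,
    pub last_fault_addr: usize,
}

// A `static` needs a constant initializer, so the state is built in place.
static KERNEL_STATE: Mutex<KernelState> = Mutex::new(KernelState {
    page_faults: 0,
    last_fault_addr: 0,
});

/// The only way to touch the global: run a closure while holding the lock.
pub fn with_state<R>(f: impl FnOnce(&mut KernelState) -> R) -> R {
    // Keep going even if a previous holder panicked.
    let mut guard = KERNEL_STATE.lock().unwrap_or_else(|e| e.into_inner());
    f(&mut *guard)
}

/// Example caller, standing in for an interrupt handler.
pub fn record_page_fault(addr: usize) {
    with_state(|state| {
        state.page_faults += 1;
        state.last_fault_addr = addr;
    });
}
```

Funnelling every access through one function avoids `static mut` and keeps the locking policy in a single place.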

There is also plenty of unsafe stuff that won't go away or stuff that is unsafe not because of the code but because x86. For example, you can't properly handle a null pointer in rust, however, at low-level, the pointer 0x00 is completely valid and I hate to have to waste that address because Rust and LLVM won't allow it.

The major reason I even bothered to learn Rust for this was because I didn't want to manage all of this in C on top of having only printf as my debugging tool.

adamnemecek · on Nov 23, 2017

Not quite. Rust embraces "zero cost abstractions" which means you can build up pretty high level abstractions without losing any speed. TBH I would much rather write Rust vs Ruby.

rapsey · on Nov 23, 2017

I'm talking about developer productivity not computing efficiency. Those high level abstractions are nowhere near as easy to use or easy to understand as they are in higher level languages.

adamnemecek · on Nov 23, 2017

They are related. I can build higher abstractions in rust than in ruby.

mmun · on Nov 23, 2017

Depending on the application, I might agree. However, if there were a Ruby-with-types language (like TypeScript) I would prefer that over the other two options.

shakna · on Nov 23, 2017

You mean Crystal? [0]

[0] https://crystal-lang.org/

bluejekyll · on Nov 23, 2017

When you become comfortable with Rust and start relying on type inference heavily, Rust often feels very much like Ruby.

ShinTakuya · on Nov 23, 2017

Like Python using mypy?

Doctor_Fegg · on Nov 23, 2017

In a similar vein: https://github.com/phoffer/crystalized_ruby

Write the speed-sensitive bits in Crystal. Aka 'Improving Ruby Performance with Almost-Ruby'.

(Unfortunately development seems to have stopped.)

prh8 · on Nov 23, 2017

Author here. Development is indeed on hold for now. However, Crystal core team has plans to build a DSL into Crystal to allow for creating Ruby native extensions.

There are some technical limitations with how I handled defining functions, which has also become obsolete with recent changes to Crystal. However, there is another approach (macro based) demonstrated at https://github.com/spalladino/crystal-ruby_exts. The macro approach is what is needed.

At time of first development, this was just an experiment and I didn't feel like redo-ing it this way. Now that core team has plans for similar functionality, I'd rather not finish a half baked solution that distracts from what theirs will likely be.

Doctor_Fegg · on Nov 23, 2017

Very intriguing. Look forward to seeing what the core team do.

turboladen · on Nov 23, 2017

Does anyone know of ruru’s status? I’ve used it quite a bit and love it, but haven’t seen any action in the repo in a while. I also pinged the Gitter channel a few weeks ago on the topic and got no response. It's a great tool; would hate to see it die.

steveklabnik · on Nov 23, 2017

The maintainer is in and out, life happens.

rubyfan · on Nov 23, 2017

I might have missed it but is there a particular advantage Rust has over C in this specific use case? Performance, ease of interoperability, etc. or is it preference for Rust over C?

currymj · on Nov 23, 2017

There are lots of advantages in general, especially around multithreaded code, and doubly so for anything with relatively simple data parallelism, where you can just use the rayon library.
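
To show the shape of "simple data parallelism" without pulling in a dependency, the sketch below uses the standard library's scoped threads as a stand-in for rayon. The function name and the chunking scheme are assumptions; with rayon the same computation would be a one-line `par_iter()` chain.

```rust
use std::thread;

/// Hypothetical example: sums the squares of `data` using up to
/// `workers` threads, each handling one contiguous chunk.
pub fn parallel_sum_of_squares(data: &[u64], workers: usize) -> u64 {
    let workers = workers.max(1);
    // Ceiling division, and never zero because `chunks(0)` panics.
    let chunk_len = ((data.len() + workers - 1) / workers).max(1);

    thread::scope(|scope| {
        // Each worker borrows its chunk; the scope guarantees the
        // borrow ends before this function returns.
        let handles: Vec<_> = data
            .chunks(chunk_len)
            .map(|chunk| scope.spawn(move || chunk.iter().map(|x| x * x).sum::<u64>()))
            .collect();

        handles
            .into_iter()
            .map(|handle| handle.join().expect("worker panicked"))
            .sum::<u64>()
    })
}
```

The compiler rejects this if a worker tries to mutate shared data without synchronization, which is the property being praised above.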

Another advantage is that there are Rust packages for Ruby interop -- the article ended up using ruru. There are similar libraries for Python interop -- I believe the most actively developed right now is PyO3.

I'm sure people are sick of Rust evangelists on HN but it really is a wonderful language, and in certain cases it really does seem to empower people to do things they wouldn't have been able to do otherwise.

unrealhoang · on Nov 23, 2017

Safety, I guess. A Ruby developer will likely shoot themselves in the foot with C's freedom. They can use help from Rust's guidance system.

biokoda · on Nov 23, 2017

History has proven that anyone no matter their abilities is going to shoot themselves in the foot regularly while using C. Some just less frequently.

vvanders · on Nov 23, 2017

Rust also has a much nicer package ecosystem. It's pretty trivial to wrap existing C libraries as well so I find that I can get up and running much faster rather than messing around with makefiles.

bpicolo · on Nov 23, 2017

Yeah, this is the killer feature. cmake is a nightmare. Cargo does what it needs to do and gets out of your way, and provides easy access to the whole ecosystem.

yarapavan · on Nov 23, 2017

github code repo used in the article: https://github.com/danielpclark/bench_ruby_ffi_example

andrewfromx · on Nov 23, 2017

Where is a full Rust implementation of Ruby on this list? https://github.com/cogitator/ruby-implementations/wiki/List-... Does such a thing just not exist yet? Let's build it tonight!

kibwen · on Nov 23, 2017

Rather than building from scratch, you'd probably just want to start by replacing bits of MRI with Rust one component at a time (assuming MRI is appropriately modularized) and see how far that takes you. :P

steveklabnik · on Nov 23, 2017

Hi http://www.codemesh.io/codemesh2016/steve-klabnik ;)

https://github.com/steveklabnik/ruby/tree/rust

jlebrech · on Nov 23, 2017

I wrote a script in Go and used it in ruby with popen3.

iamleppert · on Nov 23, 2017

Am I really reading it right that he went to the trouble of doing all this to speed up the string parsing of file paths??? Talk about barking up the wrong tree.

I guess it's cool to use something as hipster as Rust to speed up something as hipster (albeit waning in popularity these days) as Ruby.

However, a better approach would be a straightforward implementation of caching (memcache/redis/static CDN) or to simply rewrite that part of the code to not rely on constantly parsing file paths? Or use something like https://github.com/google/re2, you probably won't be able to do any better than that when doing any kind of string matching and parsing.

If you're concerned at all about performance you shouldn't be doing anything with the file system period.