Hacker News
Weld: Accelerating numpy, scikit and pandas as much as 100x with Rust and LLVM (notamonadtutorial.com)
592 points by unbalancedparen on Sept 21, 2019 | 110 comments



"the first implementation was in Scala, which was chosen because of its algebraic data types and powerful pattern matching. This made writing the optimizer, which is the core part of the compiler, very easy. Our original optimizer was based on the design of Catalyst, which is Spark SQL’s extensible optimizer. We moved away from Scala because it was too difficult to embed a JVM-based language into other runtimes and languages."

There is an important "contemporary history of computing" article to write about the evolution of the Spark project from "let's build a distributed filesystem for MapReduce in Java because we read those early Google papers" to "SQL is the right model for working with data so DataFrames" to "meet data scientists where they are: Python (and R)" to "make machine learning easy" and now to "LLVM, but for crunching big numeric arrays".


Not that you should replace Rust with Haskell, but Haskell would've been a better choice than Scala.

It has its own runtime, but it's not difficult to call Haskell code from C or ATS or whatever.


Not difficult to call from C? How does that work, exactly? Wouldn't you need to properly setup the whole runtime (incl. GC) first?


Yes, but I think the parent meant it's 'just' an #include and hs_init() away.


Very interesting. Do you have any references to share relevant to the article you suggest to be written?


No, I'm basing that on attending SparkConf between 2015 and 2017. You could probably assemble half of it just by reading summaries of Matei's keynotes.


Ah, thanks for the suggestion.


> it was too difficult to embed a JVM-based language into other runtimes and languages

In addition to the JVM, Scala has had JS [1] and native (via LLVM) [2] targets for years.

(And that's not even mentioning any second-order compilations; e.g. Scala -> JVM bytecode -> native)

There's a number of reasons to not choose Scala, but portability is far from one of them.

[1] https://www.scala-js.org

[2] http://www.scala-native.org


The library ecosystem is different between JVM/JS/Native. Porting across runtimes may require more work than just changing the compilation target.


That's true. If you want it across all platforms, you are restricted to the Scala ecosystem, and can't use Java and JS ecosystems.


See also this interesting talk on Weld at RustConf 2019: https://www.youtube.com/watch?v=AZsgdCEQjFo&t=1430s


There's also RustPython, a Rust implementation of CPython 3.5+: https://news.ycombinator.com/item?id=20686580

> https://github.com/RustPython/RustPython


Is this basically what Cython and PyPy are trying to do, but with Rust?


PyPy is a JIT compiler. RustPython is an interpreter.


And Cython is an AOT compiler for a superset of Python.

RustPython seems to be modestly aiming for a reimplementation of CPython.


Maybe "dialect" would be more accurate than "superset"? I don't think Cython is technically a superset of Python, since I think runtime metaprogramming features like __dict__ and monkey-patching are significantly altered or restricted?


Cython is a reification of the interpretation of a Python program. I.e., it converts the Python code into the equivalent CPython API calls (which are all in C), thereby allowing the developer to intersperse real C code. Anything you could do in Python you could technically do in Cython, although it would be much more verbose.
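The "equivalent CPython API calls" point can be made concrete from Python itself: CPython's C API is reachable through ctypes.pythonapi, so we can invoke the same entry point (PyNumber_Add) that Cython-generated C would emit for a `+`. This is only an illustrative sketch of the correspondence, not how Cython itself works, and it assumes you are running on CPython.

```python
import ctypes

# Grab CPython's own C API. Cython-generated C calls functions like
# PyNumber_Add directly; here we call the same entry point from Python.
api = ctypes.pythonapi
api.PyNumber_Add.restype = ctypes.py_object
api.PyNumber_Add.argtypes = [ctypes.py_object, ctypes.py_object]

# `2 + 3` in Python source lowers (roughly) to this C-level call:
result = api.PyNumber_Add(2, 3)
print(result)  # 5
```

The verbosity the parent mentions is visible even here: one operator becomes a function call plus type plumbing, which is exactly what Cython writes out for you.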


Yeah, I was going to make a similar comment. It's a dialect of CPython, and certainly there are extensions required to make it usable. But I'm not sure it is a strict superset of the full Python language.


Why didn't they call it Rython


RPython exists (it’s used to implement PyPy) so I imagine that would be confusing


Or RytOn ;-)


Or IronPython...oh, wait. ;-)


Fe2Py3


FeOPy3


Convention would lean towards calling it RPython


Why?


Also worth checking out OmniSci (formerly MapD), which features an LLVM query compiler to gain large speedups executing SQL on both CPU and GPU: https://github.com/omnisci/omniscidb . And here's a link to a blog post giving a high level overview of the advantages of JIT compilation of queries over an interpreter: https://devblogs.nvidia.com/mapd-massive-throughput-database... .


That tweetmap is impressive


This post combines pretty much every technology I'm obsessed with right now: Python, Rust, Pandas, Numpy, and LLVM. Yess!!!


If these things interest you, check out Julia


how do you know someone uses Julia? don't worry they'll tell you.


The same could be said for every Rust fanboy that feels the need to mention Rust whenever an article about C or anything implemented in C is brought up.


Or every C fanboy brings up C whenever Rust is mentioned.

As somebody who uses both, I don’t understand the whole territorial conflicts in this space. Any C programmer can learn a ton from the paradigm Rust uses. Even if Rust were to fade into oblivion tomorrow, the lessons I learned by using it will remain valuable enough for me not to regret having learned it.

C is always going to be needed, given the size of the codebase and the amount of embedded stuff written in it. It comes, however, with a wagon full of dangerous traps and gotchas, and in practice very few people are good enough to always avoid them or mitigate the risks created by them. I don’t see any reason why C shouldn’t get better in these areas for the benefit of everyone involved.


Let's generalize it further: people will take chances to bring up what they are passionate about.

I can relate to this, it is not unreasonable for me.


No disrespect to you, but while passion can come from deep knowledge of a subject, IME it much more commonly comes from lack of experience with alternatives.

I've noticed the younger people are the more likely they are to be passionate, which I put down mainly to not knowing any better. Once one has more of the experience that places you higher up where you can see further, suddenly one's own plot of land doesn't seem so special.


Does that justify the snide remarks? We should show more tolerance, there might not be a tool in existence that does not have its disadvantages along with its advantages. Healthy discussion around facts is beneficial, dismissing people for being passionate is not.


That did sound snide but it wasn't meant that way. I'm not sure how I should have done it better.

> dismissing people for being passionate is not

Seems I didn't say that well either. Passion that comes from knowledge and experience is good. If it comes from inexperience then maybe not so good because it isn't "Healthy discussion around facts" but around existing biases.

Edit: genuinely no offence intended.


None taken :), and my comment was about the snide in great (x3) grand parent comment(s). The whole fanboy/vegan/how do you know chain.


The vegans of the programming community.


Just a word of caution: always obsess over product and customer needs first :) In ML/data science, tech-first normally won't end up well


It's important to enjoy your work, which is, among other things, about having the right tools. Also, some of us actually get to have some influence over what language we write our projects in.


I think you misunderstand the parent. The obsession with always focusing on tools is, I think, what they described. At the end of the day what matters is what you produce, not what tools you used. Nobody cares about what you used, apart from engineers.


It’s not mutually exclusive. We are craftsmen and craftswomen. Definitely obsess over the problem, but also obsess over the tools used to carve out the most usable, elegant, and efficient solution.

It’s really asking the same questions over and over again. Can we do better? Does this tool allow me to be efficient and write safer, faster code? How good are the adjacent libraries and ecosystem? What other kinds of things does it make possible to solve?


You care.


Are you coding as a hobby or a profession?

If you are a professional, you will use the most effective tool for the job - to get results. What tool will produce the best results - schedule, budget, quality, maintainability, scalability, portability, etc.?

Other than outliers that will crush your productivity, or multiply it, your feelings are pretty irrelevant.

Similarly, when you get into a racecar, your feelings about your preferred driving style are irrelevant. If you can change the setup to accommodate your style without slowing it down, great; but if not, your job is to adapt to the situation and reliably get the best possible result.

Either way: if you have fun and produce a crap result, you will not be congratulated (or re-hired), and if you have little fun and produce a great result, you'll get both.

If it's a hobby, do whatever you want.

Obviously, in terms of professional development, you want to use more forward looking tools, but what is the best measure of that - your feelings or results?


I am a professional and I do not choose my tools based on the job.

I choose the best tools for me, invest a lot of time in getting better at them, and choose jobs I can do with minimal changes to my toolset.

To expand your analogy, if I have spent the last few years of my life getting better at driving bulldozers, I will not take any job requiring me to get into a racecar.


> What tool will produce the best results - schedule, budget, quality, maintainability, scalabi, portability, etc.?

Old, boring, mature tools where the limitations are well understood.


> Are you coding as a hobby or a profession?

This isn’t mutually exclusive, surely. If anything you’d expect it to be quite closely correlated.


I sort of disagree with your main assertion. I do big data for a living and what I have seen is that our architecture is dictated to us from above for reasons of "fashion" not really for any reasons of practicality.

I'm actually looking for a different job for that reason.

We are required to use Java on K8s, Kafka & Cassandra for every single solution big or small because it is fashionable, not because it gets the job done well or for any other reason. I can even demonstrate how a couple of Python scripts and Pandas could do all the same work with far less overhead and achieve the same results. Crickets. Python is not sexy where I am, it is the language of peasants, apparently. Not sure what to make of it all, but that is my reality right now.

Also, I don't think you know anything at all about driving race cars. The driver has a tremendous amount of input into the car's setup because it's his life on the line out on the track. "Adapting to the situation" gets finishes, not wins.


Looks like we agree more than disagree. Seems like whoever is deciding on the tools is failing to do so based on the job/project, instead opting for 'fashion' or whatever.

Good reason to seek a new situation, since you have neither appropriate tools selected for you nor input to select better ones.

Racecars? Yeah, I've only won some SCCA super-regional championships. Yes, the driver does have a very large input into the setup, BUT within the constraint that the combination of the setup change and the improved driver feel must make the car/driver combination faster. And yes, sometimes a change that makes the car technically a bit slower but gives the driver more confidence will result in faster net lap times -- and those are OK. But whatever the setup is, at the end of the test sessions, whether the car feels great or feels like crap, it's the driver's job to get the most out of it.

And I've had many situations both in the racecar and in international alpine ski racing where something felt weird/odd/unfamiliar/scary, but was fast as heck, so it was my job to adapt, rather than go back into my comfort zone.

Better to keep pushing outside your comfort zone, use tools/setups that get better results, and change your 'feel' to appreciate the better setup.


People aren't all or always working on products or serving customers.


Unless they are just talking about personal projects?


This project has similar goals to the MLIR project:

https://github.com/tensorflow/mlir

https://www.youtube.com/watch?v=qzljG6DKgic

Exciting times for the future of parallel computing!


Very bizarre there is no discussion of numba here, which has been around and used widely for many years, achieves faster speedups than this, and also emits an LLVM IR that is likely a much better starting point for developing a “universal” scientific computing IR than doing yet another thing that further complicates it with fairly needless involvement of Rust.

https://numba.pydata.org/


I'm one of the developers of Weld -- Numba is indeed very cool and is a great way to compile numerical Python code. Weld performs some additional optimizations specific to data science that Numba doesn't really target right now (e.g., fusing parallel loops across independently written functions, parallelizing hash table operations, etc.). We're also working on adding the ability to call Python functions from within Weld, which will allow a data science program expressed in Weld to call out to other optimized functions (e.g., ones compiled by Numba). We additionally have a system called split annotations under development that can schedule chains of such optimized functions in a more efficient way without an IR, by keeping datasets processed by successive function calls in the CPU caches (check it out here: https://github.com/weld-project/split-annotations).

Overall, we think that accelerating the kinds of data science apps Weld and Numba target will involve not only tricks such as compilation that make user-defined code faster, but also systems that can schedule and call code people have already hand-optimized in a more efficient and transparent way (e.g., by pipelining data).
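The cross-function loop fusion described above can be illustrated with a toy example (plain Python, made-up function names): two independently written functions each make a full pass over the data, and an optimizer with a common IR, like Weld, can combine them into one pass with no intermediate allocation. Here the fused version is written by hand purely to show the effect.

```python
# Two independently written "library" functions, each scanning the data once.
def add_one(xs):
    return [x + 1 for x in xs]

def total(xs):
    s = 0
    for x in xs:
        s += x
    return s

# Composed naively: two passes over the data plus a temporary list.
def pipeline(xs):
    return total(add_one(xs))

# What fusing the loops across both functions yields: one pass, no temporary.
def pipeline_fused(xs):
    s = 0
    for x in xs:
        s += x + 1
    return s

data = list(range(10))
print(pipeline(data), pipeline_fused(data))  # both print 55
```

The point is that neither `add_one` nor `total` can do this fusion alone; it only falls out when a single optimizer sees both loops, which is exactly what expressing them in a shared IR enables.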


Although to be fair, there is no reason why numba couldn't gain those capabilities, it just hasn't been a focus of the project. It should be possible to build a lightweight modular staging system in python/numba similar to Scala's (https://scala-lms.github.io/) or Lua's (http://terralang.org/).


Did you read the article? If you know how Numba works, you know it can't just pick up different functions from sklearn and scipy and do interprocedural optimization (IPO). For Numba to do that, it'd need all functions involved to be written in Numba @jit style, whereas Weld would work directly on the pre-existing functions.

Rust is just an IPO driver of sorts here.

I'm not criticizing Numba btw, I use it regularly, but your comment seems a little off here, considering that Weld has a different goal in mind.


I don’t agree that the purpose of the article is misaligned from my criticism. This is based on reading the article.


Hi, I am the interviewer. I think I saw numba once but forgot about it. I will check it and probably ask to interview them too. We are preparing interviews about RAPIDS and other similar projects too.


Numba is the option used in Lectures in Quantitative Economics with Python, posted and highly upvoted here yesterday: https://news.ycombinator.com/item?id=21022620.


Numba is amazing. +1 for numba


But does it really speed up numerical libraries like numpy and pandas? I thought it only works on pure python code.


I have said multiple times that Rust has an incredible potential in the data analysis world. And Weld is a great example.


Weld is a compiler/JIT/runtime though, something Rust is very well suited for, and which is very different code from data analysis/ML.

I think Julia is a more interesting language for this space, with the built in matrix support, easier prototyping, a REPL, etc...


Rust is great, but this is an important comment! We used it to implement Weld's compiler and runtime, but we don't expect data scientists who use languages such as Python, Julia, or R to switch over to it; the idea is that these data scientists continue using APIs in these languages, and under the hood, Weld will perform optimizations and compilation for decreasing execution time (and these "under the hood" components are the ones that we wrote in Rust).


Would Weld be able to do a better job if these scientists were using a Rust library instead?

A lot of people would like to use Rust for data-analysis / machine learning, but there are not really any good batteries-included frameworks for getting started with this.


Lots of languages are well-suited to compilers. See e.g. Haskell accelerate.

I don't really know why you'd use Rust instead of a GC language from the ML family.


Ahem... ahem..., D?


If only you could get paid for porting open source libraries, maintaining them and trying to get a community to use it

Have fun entertaining your Patreon


True that, hence the reason why I do it so little. In fact I personally do it out of pure boredom.


Sounds a lot like Gandiva (part of Apache Arrow) as well. https://www.dremio.com/announcing-gandiva-initiative-for-apa.... Cool!


So... this requires cooperation from the underlying libraries (numpy, pandas...) - what is the likelihood of said libraries adopting this upstream vs Weld having to maintain their own shadow implementations for the foreseeable future?

Numpy et al of course already have N python acceleration frameworks hammering at their doorsteps to integrate more closely...


How much cooperation is needed though? It seems to me that all that numpy pandas etc. need to do is maintain a stable API, which they already do AFAIK.


I saw a performance comparison with XLA, and it's interesting that Weld is faster, because XLA is supposed to optimize the code using the known tensor sizes during compile time.

Weld and XLA seem to have similar optimization steps though.


XLA and Weld do have similar optimizations -- at their core, one of the main things they do is removing inefficiencies like unnecessary scans over data, common subexpressions, etc. across many operators. The speedup in the benchmark you're referring to actually involved some NumPy code too for pre-processing, and the reason Weld outperformed XLA is because Weld could perform those kinds of optimizations across TensorFlow operators and NumPy functions (whereas XLA only optimizes the TensorFlow part of the application).

I also want to mention that this benchmark is from a while back (around 2017 I believe), so it's possible improvements in both XLA and Weld will make the numbers look different today :)


For what it's worth, Jax (github.com/google/jax) now lets you use XLA to compile Numpy code. It'd be cool to see how that would stack up in a modern comparison.


What's the benefit of Rust over Julia or C for computation?


Not to sound like a member of the Rust evangelism strike force, but after using Rust for a couple years, I don't have any desire to go back to C - sum types alone are worth the switch to me, not to mention iterators, concurrency story, etc.


I can’t decide if Rust Evangelism Strike Force sounds like an awful or awesome cartoon.


Rust United Strike Team would have a better acronym.


C++ has sum types as of several years ago with std::variant.


The way Python has macros, sure :)

You would have to keep everything in variants, or wrap/unwrap manually all over the place to get similar functionality.

And C has tagged unions.


Can you elaborate more? What do other languages have over std::variant/visit?


That would be like explaining C++ Concepts to an assembly programmer from the 60s that had never used a "function" as a way of abstracting code.

If you really want to know, spend one afternoon learning any programming language with built-in support for that (Rust, OCaml, Haskell, ...). ADTs are one of the first things one learns.

In Rust, the features you'd need to learn are enums, patterns, and pattern matching.

But be warned that using C++ variant and std::visit will feel like you are being forced to only write C instead of C++ for the rest of your life, knowing that life could be much better. Once you learn this, there is no way to un-learn it.


I know how these things work in Rust, and I'm still failing to see your point. It's not at all as complicated as what you are saying given that you can find many blog articles that explain it succinctly in a couple paragraphs.

It's not at all helpful to say that there's something so complicated on these other languages that you can't possibly get the idea across without using them.


Try doing any of this with variant and visit

    #[derive(Default)]
    struct B { z: f32, w: (f64, f32) }
    enum A { Foo { x: i32, y: f32 }, Bar(B), Baz([u32; 4]), Moo(i32), Mooz(i32) }

    let b = B { z: 42.0, ..Default::default() }; // create a B with z == 42 and default w
    let B { w: (first, _), .. } = b; // get the b.w.0 field
    let a = A::Foo { x: 42, y: 0.0 };
    if let A::Bar(B { z, .. }) = a {
      // if a is an A::Bar, get the z field of the inner B
    }
    if let A::Baz([0, 1, 2, 3]) = a {
      // if a is an A::Baz containing an array
      // with value [0, 1, 2, 3]
    }
    match a {
        A::Moo(v @ 1..=2) => {
           // if a is an A::Moo whose value is in [1, 2], bind it to the local v
        }
        A::Moo(x) | A::Mooz(x) => {
           // either A::Moo or A::Mooz, gives you the value of x
        }
        // ERROR: I forgot to match some patterns (the compiler catches this)
    }

    foo(b);

    fn foo(B { z, .. }: B) -> f32 {
      // get the z field of the first function argument
      z
    }
What in Rust are one-liners, and can be used anywhere (let bindings, constructors, match, if-let, while-let, function arguments, ...) is a pain to write in C++ using `std::visit` and `std::variant`. The error messages of `std::visit` + `std::variant` are quite bad as well. And then there are other fundamental problems with variant, like the extra valueless-by-exception state, or `variant<int, int>`, where you can't reach the second `int` by type (only by index), etc.

You can translate all the code above to C++ to use std::visit + std::variant instead. I personally find that C++ is unusable for programming like this, and almost never use std::variant in C++ as a consequence, while I use ADTs in Rust all the time.


One difference is that std::variant is run-time dispatched and uses memory equal to the max of all the variants, while Rust's sum types could potentially be compile-time dispatched and memory optimized to the exact type being used through various code paths.


Pattern matching syntax (eg in Rust) makes a big difference.


> C++ has sum types as of several years ago with std::variant.

Haha, that's like saying that C has templates because it has the _Generic macro.


from TFA:

>> We chose Rust because:

>> It has a very minimal runtime (essentially just bounds checks on arrays) and is easy to embed into other languages such as Java and Python

>> It contains functional programming paradigms such as pattern matching that make writing code such as pattern matching compiler optimizations easier

>> It has a great community and high quality packages (called “crates” in Rust) that made developing our system easier.


Did they have to write the runtime in the same language?

They could've used an ML with GC and it would've been better (for a compiler).

It doesn't really have any functional programming paradigms. Pattern matching is present in imperative languages like past versions of ATS


> Did they have to write the runtime in the same language?

No, they originally wrote the runtime in C++, but ended up re-implementing it in Rust because the C++ runtime had too many bugs.

> They could've used an ML with GC and it would've been better (for a compiler).

They originally wrote the compiler in Scala with the JVM GC, and they said it was much slower and much harder to embed.

> It doesn't really have any functional programming paradigms. Pattern matching is present in imperative languages like past versions of ATS

When choosing a language for such a project, there are many engineering trade-offs that must be evaluated.

The Weld project has dozens of developers that need onboarding, documentation, examples, tooling, etc. Rust has a lower barrier of entry than ATS. One of the main things Weld does is interface with LLVM: this is one of the main things the Rust compiler does, and Rust has great libraries for it. Another thing Weld does is interface with many dynamic languages (Python, R, etc.). Rust does not just have "C FFI"; it also has _a lot_ of great tooling for automatically generating and validating all the boilerplate. Finally, performance and code size are among Weld's main advantages over the alternatives. Rust generates reasonable code with LLVM; ATS has its own machine code generator, which, while reasonable, isn't as good.

Finally, it is hard to find people who enjoy writing ATS code. They exist, but are not many. Even when you do find them, they often don't like collaborating with people (I only know one ATS user, vmchale on github, and they don't really like interacting with others). OTOH it is trivial to find lots of people that enjoy writing Rust with others. It doesn't matter if this is due to technical reasons, marketing, or hype, but it's a fact that you have to consider if you want a project to grow fast.


Did you realise you are responding to the same Github user you're mentioning?


Rust's performance is pretty much on par with C/C++, but it promises memory-management stability thanks to the borrow checker. Rust is very impressive on its own. If you haven't taken the time to research Rust because "it's yet another language", you really ought to, honestly.


The computation here is actually not done in Rust. The Rust code is performing stages of compilation of the original source code, into an intermediate representation that LLVM finishes compiling. The fully compiled code is what does the computation.


SIMD / intrinsics?


That's part of what such compilation can do, yes.


I like the memory safety, concurrency, and zero-based indices.


This is different from Julia.

I don't really know why you'd write a project of this sort in C.


[flagged]


The Rust developers are working hard to ensure that no breaking changes ever appear in the language. As such, old code will always compile, even if there are better ways to do it with modern features.

Rust learns the lessons that older languages taught us. Why would you think they'd fall into such an obvious pitfall given how long they took to get to v1?


Rust guarantees stability on any feature in the stable release of the compiler. And anything in the standard library will remain in there in perpetuity, even if deprecated.


In this particular case it's meant to be a drop-in replacement for the Python data scientist; the rest is irrelevant, I believe.


> The motivation behind Weld is to provide bare-metal performance for applications that rely on existing high-level APIs such as NumPy and Pandas.

With regard to Pandas this makes me pause slightly, since, while pandas contains lots of high quality and high performance implementations, the API of pandas in some places doesn’t feel well-designed (the most obvious example is indexing of data frames via square brackets and the various properties like iloc).


This is awesome. The quality of the work behind it is incredible.

I think there might be something interesting for this strategy also in the WebAssembly space :)


Can anyone pls tell me if there are any other tools out there to increase performance of pandas?


Modin is an alternative pandas implementation for distributed processing using Ray or Dask:

https://github.com/modin-project/modin


Interesting. It looks like Rust and Swift will be competitors in this field.


This would have made so much of my work so much faster


you had me at keras


Interesting that the initial implementation was in Scala but then they switched to Rust because of minimal runtime, language embeddability, functional paradigms, community and high quality packages. The hype bandwagon is so real in here. So basically one could say the same for several other well established languages, e.g. Haskell. Also what saddens me is that everyone forgets about D, which has the same benefits and a syntax that does not make you scratch your eyes out, especially when it comes to FP. Also D has not actually "skipped the leg day" ;)



