When we started the Numerical Elixir effort, we were excited about the possibilities of mixing projects like Google's XLA (from TensorFlow) and LibTorch (from PyTorch) with the Erlang VM's ability to run concurrent, distributed, and fault-tolerant software.

I am very glad we are at a point where those ideas are coming to life, and I explore part of it in the video. My favorite bit: making the tensor serving implementation cluster-distributed took only 400 LOC (including docs and tests!): https://github.com/elixir-nx/nx/pull/1090
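For the curious, here is roughly what the single-node API looks like, adapted from the Nx.Serving docs (exact options may differ by version):

    serving =
      Nx.Serving.new(fn opts -> Nx.Defn.jit(&Nx.multiply(&1, 2), opts) end)

    # Run it inline:
    Nx.Serving.run(serving, Nx.Batch.stack([Nx.tensor([1, 2, 3])]))

    # Or start it under a supervisor; batched_run then batches concurrent
    # requests and, once nodes are clustered, routes them to whichever
    # node runs the serving:
    children = [
      {Nx.Serving, serving: serving, name: MyServing, batch_size: 10, batch_timeout: 100}
    ]

    Supervisor.start_link(children, strategy: :one_for_one)
    Nx.Serving.batched_run(MyServing, Nx.Batch.stack([Nx.tensor([4, 5, 6])]))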
I'll be glad to answer questions about Nx or anything from Livebook's launch week!
Nothing specific about this project, and don't feel obligated to respond, but I just wanted to thank you for all the work you've done with Elixir and the related ecosystem. Great language, great tools, and a helpful, welcoming community. It was a perfect introduction to practical functional programming.
Haven't found a big project for it yet, but I've done a bunch of little side projects since a friend who worked at Appcues gave me the hard sell on it around 2018.
The distributed ML currently seems focused on model execution. I see another commenter's excitement about "Looking forward to NX transformations that take distributed training to the next level", which, I agree, will be quite interesting.
Where/how do you see Nx being used effectively in distributed training? Is distributed training a realistic path for open-source models to compete against big tech models?
For distributed training, one important feature is to be able to do GPU-to-GPU communication, such as allreduce, allgather, and all-to-all. Those are not supported at the moment, but they are on our roadmap. At this level, however, the language runtime itself plays a reduced role, so I don't expect the experience to be much different from, say, Python/JAX.
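To give a sense of the semantics: allreduce combines a value (say, gradient shards) across all devices and hands every device the same result. Simulating it with plain tensors, no actual GPUs involved:

    # Four "devices", each holding its own gradient shard:
    grads = [
      Nx.tensor([1.0, 2.0]),
      Nx.tensor([3.0, 4.0]),
      Nx.tensor([5.0, 6.0]),
      Nx.tensor([7.0, 8.0])
    ]

    # allreduce(sum): every device ends up with the elementwise sum.
    reduced = Enum.reduce(grads, &Nx.add/2)
    #=> Nx.tensor([16.0, 20.0])

The real thing runs as collective operations on the accelerators themselves (e.g. over NCCL), which is why the host language matters less at that layer.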
For the second question, my understanding is that all big tech models rely on distributed training, so distributed training is a prerequisite for competing, really.
Do you ever think about why you’re probably a 100x programmer? Is it just working memory and pure intelligence, or some strategy or tactics that make you so good at this? Asking for a friend :-)
Is anyone working on audio libraries that will enable streaming audio chunks for Whisper processing? Saving audio files to a local file system, running ffmpeg to chunk them, and then sending them off to Whisper is very tactical...
The current pipeline expects PCM audio blobs and, if the data is coming from a microphone in the browser, you can do the initial processing and conversion in the browser (see the JS in this single-file Phoenix app speech-to-text example [0]).
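For reference, a sketch of what such a pipeline looks like with Bumblebee's Whisper serving (function names and options have shifted a bit between releases, so treat this as approximate):

    {:ok, whisper} = Bumblebee.load_model({:hf, "openai/whisper-tiny"})
    {:ok, featurizer} = Bumblebee.load_featurizer({:hf, "openai/whisper-tiny"})
    {:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "openai/whisper-tiny"})
    {:ok, generation_config} = Bumblebee.load_generation_config({:hf, "openai/whisper-tiny"})

    serving =
      Bumblebee.Audio.speech_to_text(whisper, featurizer, tokenizer, generation_config,
        defn_options: [compiler: EXLA]
      )

    # The input is the PCM blob as a mono 16 kHz f32 tensor of samples
    # (a second of silence here as a stand-in):
    Nx.Serving.run(serving, Nx.broadcast(0.0, {16_000}))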
On the other hand, if you expect a variety of formats (mp3, wav, etc.), then shelling out to or embedding ffmpeg is probably the quickest path to achieve something. The Membrane Framework [1] is an option here too, and it includes streaming. I believe Lars is going to do a cool demo with Membrane and ML at ElixirConf EU next week.
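For the shelling-out route, something like this is enough to get Whisper-ready PCM out of any format ffmpeg understands (sketch; assumes ffmpeg is on $PATH, no error handling):

    {pcm, 0} =
      System.cmd("ffmpeg", [
        "-i", "input.mp3",  # any ffmpeg-readable input
        "-ac", "1",         # downmix to mono
        "-ar", "16000",     # resample to 16 kHz
        "-f", "f32le",      # raw little-endian 32-bit floats
        "-v", "quiet",
        "pipe:1"            # write to stdout
      ])

    audio = Nx.from_binary(pcm, :f32)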
I am using it for this talk I am putting together for ElixirConf EU, so if you want to see it used in context, that might be helpful: https://github.com/lawik/lively
Neither is polished to release-worthy levels, but if the interest is there I should make a proper library out of it.
That is to say, streaming chunks works great already. I would love two things: stitching the edges of the chunks, which would probably require overlapping them; and building chunks based on silence. That's more DSP than I know, though.
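For the first one, something like this naive overlap chunking is what I have in mind (untested sketch, sizes made up):

    chunk_len = 16_000 * 25  # 25s of 16 kHz audio per chunk
    overlap   = 16_000 * 5   # 5s shared between neighbours

    # `audio` is a mono f32 tensor of samples; stand-in data here:
    audio = Nx.iota({16_000 * 60}, type: :f32)

    chunks =
      audio
      |> Nx.to_flat_list()
      |> Enum.chunk_every(chunk_len, chunk_len - overlap, :discard)
      |> Enum.map(&Nx.tensor(&1, type: :f32))

Transcribe each chunk, then deduplicate the text that falls in the shared 5s windows.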
Hey Lars! Building chunks on silence is a worthy cause! Why stitch the edges of the chunks? Is that because there isn't a clean chunk on silence?
I think this work is very important. I don't understand whether I actually needed to install the native dependencies (mad, ffmpeg, portaudio) for Membrane's sake or specifically for this use case. Doesn't feel right...
You may be able to incorporate the Membrane Framework (https://membrane.stream/) to do that. Built in Elixir, it deals in exactly those types of multimedia problems.
I'm not an expert here, but I'd expect that capturing a sample using Membrane and piping it into Whisper should be doable.
Even after reading the blog, once the Windows app is installed it's not obvious how to get to the machine learning demos.

Also, after I found the +Smart button from another page, on Windows it fails due to the lack of make (and presumably a set of compiler tools). This was frustrating when trying to demo for someone on their computer.
It is bonkers how little code and need-to-know is necessary to deploy cutting-edge models in an Elixir app these days.
I didn't realize just how much progress had been made in Nx until, on a recent side project, I started implementing parts of Nx.Serving myself, only to find the Nx libraries already have distributed batched serving, faiss and pgvector support, and more.
Makes me want to quit all work obligations to hit the books and build product with Nx.
It's pretty clear that Joe Armstrong's respect for the fact that the speed of light is a thing, and that data locality/data gravity are real, is starting to pay off in big ways.
I do wonder if maybe streaming large data chunks over Erlang distribution might be a problem and a secondary data channel (e.g. over udp or sctp) might be worth playing with.
Looking forward to NX transformations that take distributed training to the next level.
> I do wonder if maybe streaming large data chunks over Erlang distribution might be a problem and a secondary data channel (e.g. over udp or sctp) might be worth playing with.
You may want to take a look at the partisan [0] library, written in Erlang. It is basically that: a reimagining of distributed Erlang, except that it can be multiplexed over multiple connections.
Yeah, but partisan gives you a "ton of stuff you might not need"; plus, the point is to treat distribution as a control plane and keep its concerns separate from the data plane. There used to be things to worry about when using Erlang distribution in general, irrespective of backend, IIRC, like head-of-line blocking (I think those are resolved now).
> It's pretty clear that Joe Armstrong's respect for the fact that the speed of light is a thing, and that data locality/data gravity are real, is starting to pay off in big ways.
I'm familiar with Joe Armstrong and Erlang/Elixir, but do you have a particular reference in mind where he was specifically discussing this? Is it one of his papers or talks? Just looking for another interesting thing Joe Armstrong said or thought. :)
I don't have a reference offhand, but I have seen it. It's mostly a vibe. Remember that Joe was a physicist before he was a programmer: the synchronicity problem is pervasive in the design of the platform. Local, immediate access to data is generally a special-cased situation via an escape hatch with tons of big red warning signs.
I have used Elixir for a few personal projects. I would like to know: is there a point where network latency becomes too high relative to the overall ML execution time?