Making each webapp target & optimize ML for every possible device target sounds terrible.
The purpose of MLIR is that most of the optimization can be done at lower levels. Instead of everyone figuring out & deciding on their own how best to target & optimize for js, wasm, webgl, and/or webgpu, you just use the industry-standard intermediate representation & let the browser figure out the tradeoffs. If there is onboard hardware, neural cores, they might just work!
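For concreteness, the shared IR in question here would be something like StableHLO, the MLIR dialect OpenXLA standardizes on. An element-wise add lowered from any framework looks roughly like this (illustrative fragment, not a complete module):

```mlir
module {
  // Same textual IR regardless of whether PyTorch, JAX, or TF produced it.
  func.func @add(%arg0: tensor<4xf32>, %arg1: tensor<4xf32>) -> tensor<4xf32> {
    %0 = stablehlo.add %arg0, %arg1 : tensor<4xf32>
    return %0 : tensor<4xf32>
  }
}
```

The point is that the browser (or any backend) only ever has to consume this one dialect, and each framework only has to emit it.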
Good to see WebML has OpenXLA on their radar... but also a bit afraid, expecting some half-assed excuses for why of course we're going to build some brand new other thing instead. The web & almost everyone else has such a bad NIH problem. WASI & the web file APIs being totally different is one example, where there's just no common cause, even though it'd make all the difference. And with ML, the cost of having your own tech versus being able to re-use the work everyone else puts in feels like a near-suicidal decision: an API that will never be good, never perform anywhere near where it could.
> Making each webapp target & optimize ML for every possible device target sounds terrible.
Yes it does.
Did something I said imply that?
OpenXLA is an intermediate layer that frameworks like PyTorch or JAX can use. It has pluggable backends, and so if there was a web-compatible backend (WebGL or WASM) then everyone could use it and all models that were built using something that used OpenXLA[1] would be compatible.
[1] Not 100% sure how low-level the OpenXLA intermediate representation is. I know it's not uncommon when porting a brand new primitive (eg a special kind of transformer etc) to a new architecture (eg CUDA->Apple M1) that some operations aren't yet supported, so this might be similar.
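To illustrate the pluggable-backend point (a toy sketch with invented names, not OpenXLA's actual API): frameworks lower to one shared op-level IR, each backend consumes that IR, so the integration cost is frameworks + backends rather than frameworks × backends.

```python
from dataclasses import dataclass
from typing import Callable, List

# A toy "IR": a flat list of ops, standing in for StableHLO/XLA HLO.
@dataclass
class Op:
    name: str
    args: tuple

def trace_model() -> List[Op]:
    """Any framework (PyTorch, JAX, ...) would lower to this same op list."""
    return [Op("add", ("x", "y")), Op("mul", ("t0", "w"))]

# Pluggable backends: each only needs to know the shared IR, not the frameworks.
def wasm_backend(ir: List[Op]) -> str:
    return "\n".join(f"wasm_call {op.name}({', '.join(op.args)})" for op in ir)

def webgpu_backend(ir: List[Op]) -> str:
    return "\n".join(f"dispatch {op.name}({', '.join(op.args)})" for op in ir)

BACKENDS: dict = {
    "wasm": wasm_backend,
    "webgpu": webgpu_backend,
}

# One traced model compiles for every registered backend.
ir = trace_model()
for name, backend in BACKENDS.items():
    print(f"--- {name} ---")
    print(backend(ir))
```

Adding a web-compatible backend to a scheme like this makes every model built through the shared IR work on the web, which is the compatibility argument above.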
I support having web targets. It'd be a good offering.
But it feels upside down to me from what we should really all want: a safe way to let the web target any backend you have. WebGPU or WebGL or wasm are going to be OK targets, but with limited hardware support & tons of constraints that mean they won't perform as well as openxla.
Also, how will these targets get profiled? Do we ship the same WebGL to a 600W monster as to a Raspberry Pi?
There's a lot of really good reasons to want OpenXLA under the browser, rather than above/before it.
> WebGPU or WebGL or wasm are going to be OK targets, but with limited hardware support & tons of constraints that mean they won't perform as well as wasm.
I don't understand. "WebGPU or WebGL or wasm".. "won't perform as well as wasm".