
I have no idea what inference means here, but I hope the shift happens - and perhaps it will. That said, companies like Sun (with its Solaris workstations) or Intel (on the desktop and, in the last 10-ish years, servers) had the world under their thumb for 10+ years. So Nvidia might have quite a good reign ahead of it, even if it eventually fades like everyone else's.



By "inference" they mean that purchasing dominance will shift from the producers of machine learning models to the consumers. Right now everyone is buying hardware to produce machine learning models (aka training), and the author predicts the market will at some point shift to buying hardware to consume them (i.e. run inference).

I don't think I agree that this is a significant shift guaranteed to happen. We may go over some sort of hump where less training happens than at the peak, but who knows when that hump will be? It's such a new field and there are so many low-hanging improvements to be made. We could train new models for years and see steady, significant improvements every time, even with no fundamental breakthroughs on the horizon.

And even if there were a cooldown on new training, training is so many orders of magnitude more expensive than inference that inference demand would have to be extreme, against an unrealistically low rate of training, for inference to dominate.
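To put rough numbers on that, here's a back-of-envelope sketch using the common approximations of ~6·N·D FLOPs to train an N-parameter model on D tokens and ~2·N FLOPs per token served at inference. The model size and token counts below are illustrative assumptions, not figures from the thread:

  import math

  params = 70e9        # hypothetical 70B-parameter model
  train_tokens = 2e12  # hypothetical training corpus size

  training_flops = 6 * params * train_tokens   # one-off training cost
  flops_per_token = 2 * params                 # per-token inference cost

  # Tokens of inference needed to match the training cost:
  breakeven = training_flops / flops_per_token
  print(f"{breakeven:.2e} tokens")  # ~6e12, i.e. 3x the training corpus

So per request, inference is vastly cheaper; aggregate inference demand only rivals training compute once you've served on the order of trillions of tokens.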


Yea, if the author is only thinking about text data, then maybe they'd have a point. But the world in which 'intelligence' exists is only a tiny bit textual. Visual and audio data represent most of what humans interpret. And who knows what continuous learning will look like.

If you believe we are moving towards the more 'Star Trek'-like future of AI, where AI observes and interprets the world as humans see and experience it, then a massive amount of compute will still be needed for the foreseeable future.

If you believe we are capping out on AI capability soon for some time, then you'll see AI as more a part of the "IBM toolkit", offered as an additional compute service, and it will more likely 'fit' into our existing computer architectures.


"Inference" - getting the predictions out of the model. While training you need to run: Input -> Model -> Output (Prediction) - Compare with True Output (Label) -> Backpropagation of Loss through the Model. Which can highly batched & pipelined. (And you have to batch to train in any reasonable amount of times, and GPUs shine in batch regime)

When a single user request comes in, you just want the prediction for that single input, so there's no backpropagation and no batching. That is more CPU-friendly.
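To make the contrast concrete, here's a minimal PyTorch-style sketch of the two regimes (the model, shapes, and optimizer settings are placeholder assumptions, not anything from the thread):

  import torch

  model = torch.nn.Linear(128, 10)  # stand-in for a real model
  loss_fn = torch.nn.CrossEntropyLoss()
  opt = torch.optim.SGD(model.parameters(), lr=0.01)

  # Training: large batches, forward + loss + backward, repeated many times.
  def train_step(inputs, labels):      # inputs: (batch, 128)
      preds = model(inputs)            # forward pass
      loss = loss_fn(preds, labels)    # compare with true labels
      opt.zero_grad()
      loss.backward()                  # backpropagate loss through the model
      opt.step()
      return loss.item()

  # Inference: one request, one forward pass, no gradients, no batching.
  @torch.no_grad()
  def infer(single_input):             # single_input: (1, 128)
      return model(single_input).argmax(dim=-1)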


Wow, I just learned something new. Even though statistics and machine learning overlap a lot, a word as simple as "inference" has totally different meanings in each. In statistics it usually refers to determining the influence of an input in a multi-input model; getting predictions is simply called prediction.
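A small sketch of the statistical usage, using statsmodels (the dataset is made up for illustration):

  import numpy as np
  import statsmodels.api as sm

  rng = np.random.default_rng(0)
  X = rng.normal(size=(100, 2))  # two inputs
  y = 1.5 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(size=100)

  results = sm.OLS(y, sm.add_constant(X)).fit()

  # "Inference" in the statistical sense: which inputs matter, and how much?
  print(results.params)   # estimated coefficients
  print(results.pvalues)  # significance of each input

  # What ML calls "inference" is here just "prediction":
  print(results.predict(sm.add_constant(X)[:1]))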



