
I have no idea what inference means here, but I hope the shift happens - and perhaps it will. That said, companies like Sun (with its Solaris workstations) or Intel (on the desktop and, in the last 10-ish years, servers) had the world under their thumb for 10+ years. So Nvidia might have quite a good reign ahead of it, even if it eventually fades like everyone else's.



By "inference" they mean that purchasing dominance will shift from the producers of machine learning models to the consumers. Right now everyone is buying hardware to produce machine learning models (aka training), and the author predicts the market will at some point shift to buying hardware to consume them (i.e. run inference).

I don't think I agree that this is a significant shift guaranteed to happen. We may go over some sort of hump where less training happens than at the peak, but who knows when that hump will be? It's such a new field and there are so many low-hanging improvements to be made. We could train new models for years and see steady, significant improvements every time, even with no fundamental breakthroughs on the horizon.

And even if there were a cooldown on new training, training is so many orders of magnitude more expensive than inference that inference demand would have to be extreme, against an unrealistically low rate of training, for inference to dominate.
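To put rough numbers on that, here's a back-of-envelope sketch using the common approximations of ~6·N·D FLOPs to train an N-parameter model on D tokens and ~2·N FLOPs per token served at inference. The model size and token counts below are illustrative assumptions, not figures from the thread:

  import math

  params = 70e9        # hypothetical 70B-parameter model
  train_tokens = 2e12  # hypothetical training corpus size

  training_flops = 6 * params * train_tokens   # one-off training cost
  flops_per_token = 2 * params                 # per-token inference cost

  # Tokens of inference needed to match the training cost:
  breakeven = training_flops / flops_per_token
  print(f"{breakeven:.2e} tokens")  # ~6e12, i.e. 3x the training corpus

So per request, inference is vastly cheaper; aggregate inference demand only rivals training compute once you've served on the order of trillions of tokens.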


Yea, if the author is only thinking about text data, then maybe they'd have a point. But the world in which 'intelligence' exists is only a tiny bit textual. Visual and audio data represent most of what humans interpret. And who knows what continuous learning will look like.

If you believe we are moving towards the more 'Star Trek'-like future of AI, where AI observes and interprets the world as humans see and experience it, then a massive amount of compute will still be needed for the foreseeable future.

If you believe we are capping out on AI capability soon for some time, then you'll see AI as more a part of the "IBM toolkit", offered as an additional compute service, and it will more likely 'fit' into our existing computer architectures.


"Inference" - getting the predictions out of the model. While training you need to run: Input -> Model -> Output (Prediction) - Compare with True Output (Label) -> Backpropagation of Loss through the Model. Which can highly batched & pipelined. (And you have to batch to train in any reasonable amount of times, and GPUs shine in batch regime)

When a single user request comes in, you just want the prediction for that single input, so there's no backpropagation and no batching. That is more CPU-friendly.
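To make the contrast concrete, here's a minimal PyTorch-style sketch of the two regimes (the model, shapes, and optimizer settings are placeholder assumptions, not anything from the thread):

  import torch

  model = torch.nn.Linear(128, 10)  # stand-in for a real model
  loss_fn = torch.nn.CrossEntropyLoss()
  opt = torch.optim.SGD(model.parameters(), lr=0.01)

  # Training: large batches, forward + loss + backward, repeated many times.
  def train_step(inputs, labels):      # inputs: (batch, 128)
      preds = model(inputs)            # forward pass
      loss = loss_fn(preds, labels)    # compare with true labels
      opt.zero_grad()
      loss.backward()                  # backpropagate loss through the model
      opt.step()
      return loss.item()

  # Inference: one request, one forward pass, no gradients, no batching.
  @torch.no_grad()
  def infer(single_input):             # single_input: (1, 128)
      return model(single_input).argmax(dim=-1)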


Wow, I just learned something new. Even though statistics and machine learning overlap a lot, a word as simple as "inference" has totally different meanings in each. In statistics it usually refers to determining the influence of an input in a multi-input model; getting predictions is simply called prediction.
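A small sketch of the statistical usage, using statsmodels (the dataset is made up for illustration):

  import numpy as np
  import statsmodels.api as sm

  rng = np.random.default_rng(0)
  X = rng.normal(size=(100, 2))  # two inputs
  y = 1.5 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(size=100)

  results = sm.OLS(y, sm.add_constant(X)).fit()

  # "Inference" in the statistical sense: which inputs matter, and how much?
  print(results.params)   # estimated coefficients
  print(results.pvalues)  # significance of each input

  # What ML calls "inference" is here just "prediction":
  print(results.predict(sm.add_constant(X)[:1]))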



