Yes, you are correct! Training benefits much more from available memory than inference does, because of batching, and since in many cases you only need to train once, it usually makes sense to train on beefy GPUs.

TensorFire is useful in situations where you want to perform inference but don't want to ship user-supplied data to your servers, whether because you would run out of bandwidth, because you would run out of compute power, or because your users want to keep their data private.