If you assume an A100 lives at the cutting edge for 2 years, that's about a million minutes, or $0.01 per minute of amortized HW cost.
In the crazy scenarios, I've heard 10 A100s per query, so assuming that takes a minute, maybe $0.1 per query.
Add an order of magnitude on top of that for labor/networking/CPU/memory/power/utilization/general datacenter stuff, you get to maybe $1/query.
So probably not $10, but maybe if you amortize training, low to mid single digits dollars per query?
If you assume an A100 lives at the cutting edge for 2 years, that's about a million minutes, or $0.01 per minute of amortized HW cost.
In the crazy scenarios, I've heard 10 A100s per query, so assuming that takes a minute, maybe $0.1 per query.
Add an order of magnitude on top of that for labor/networking/CPU/memory/power/utilization/general datacenter stuff, you get to maybe $1/query.
So probably not $10, but maybe if you amortize training, low to mid single digits dollars per query?