How to Make More Money Renting a GPU Than Nvidia Makes Selling It (nextplatform.com)
124 points by ankitg12 5 months ago | 59 comments



The math seems off to me here. In particular:

>> So here is the deal: If you have 16,000 GPUs, and you have a blended average of on-demand (50 percent), one year (30 percent), and three year (20 percent) instances, then that works out to $5.27 billion in GPU rental income over four years with an average cost of $9.40 per hour for the H100 GPUs.

This makes a very strong assumption that the rental cost of an H100 will not change over the next four years. This is wildly optimistic. Instead, we can infer expected prices by looking at the differential rates for one and three-year commitments:

>> We estimated the one year reserved instance cost $57.63 per hour, or $7.20 per GPU-hour, and we know that the published price for the three year reserved instance is $43.16, or $5.40 per GPU-hour.
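
Sanity-checking the article's blend first (a rough Python sketch, assuming the $12.29/GPU-hour p5.48xlarge on-demand rate quoted later in this thread, and four 365.25-day years):

    # Reproduce the article's blended rate and 4-year rental revenue.
    on_demand, one_year, three_year = 12.29, 7.20, 5.40   # $/GPU-hour
    blend = 0.5 * on_demand + 0.3 * one_year + 0.2 * three_year
    hours = 365.25 * 24 * 4            # 35,064 hours in four years
    revenue = 16_000 * blend * hours
    print(blend, revenue)              # ~$9.40/hr and ~$5.27B

So the article's arithmetic is internally consistent; my objection is to the constant-price assumption.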

On the margin, the cloud provider should be indifferent between an on-demand rental, a one-year reservation, and a three-year reservation. That implies that three consecutive one-year reservations should provide about the same income as the three-year reservation.[1]

Someone who places a three-year commitment pays the equivalent of $16.20 per hour-year over the contract ($5.40/hr × 3 years). The one-year commitment costs $7.20 per hour-year. Subtract the two to get a residual of $9.00, and divide by the two years remaining in the contract to get $4.50.

With this rough calculation, the one-year-forward, two-year price of the H100 is about $4.50/hr. If we further assume that the price falls each year by a constant ratio (≈0.72), we can break the per-hour, one-year reservation prices into $7.20 (today), $5.22 (one year from now), and $3.78 (two years from now).

Going further into speculation and applying this ratio to rental revenue on the whole, that "$5.27b over four years" instead becomes $3.47b. Still a reasonable multiple of the purchased infrastructure cost, but it's less outlandish and emphasizes the profit potential of moving first in this sector (getting those long-term commitments while it's still a seller's market).
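
The whole calculation as a quick Python sketch, under the same indifference and constant-ratio assumptions:

    # Implied forward price from the reserved rates quoted above.
    one_year, three_year = 7.20, 5.40        # $/GPU-hour
    residual = 3 * three_year - one_year     # $9.00 left for years 2-3
    print(residual / 2)                      # $4.50/hr two-year forward average

    # Constant yearly ratio r satisfying r + r^2 = residual / one_year.
    r = (-1 + (1 + 4 * residual / one_year) ** 0.5) / 2   # ~0.72
    print([round(one_year * r**n, 2) for n in range(3)])  # [7.2, 5.22, 3.78]

    # Apply the same decay to the article's constant-price $5.27b figure.
    print(5.27 / 4 * sum(r**n for n in range(4)))         # ~3.47 ($b)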

[1] I'm ignoring the option value in the one-year commitment, which allows the customer to seek a better deal after twelve months. This option value is minimal if the GPU cloud is expected to be at capacity forever, such that the provider can replace customers easily.


The H100 prices seem wrong based on today's prices. I'm looking at our data on H100 prices across ten cloud providers: on-demand prices range between $3 and $6 per hour, with an average of $3.83 USD.

The p5.48xlarge with 8 H100s is indeed $12.29 per GPU-hour as the article states, but their claim that the blended hourly rental price for H100s could be $9.40 USD over the next 4 years appears ludicrous.


These prices are insane; you can do much, much better than those prices if you negotiate. If you're paying $7.20 per GPU-hour for an H100, you should find another supplier.


The rule of thumb over the past few decades is that enterprise initial-quote pricing is usually 200% of the lowest negotiated price. That final price should land between 9% (PC commodity margins) and 50%+ profit over cost of sales.
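
Purely illustrative, applying that rule of thumb to the numbers upthread: a $7.20/GPU-hour quote would imply a lowest negotiated price around $3.60, which lines up with the $3-6/hr market range mentioned above.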

Also, I really would like to know the real (unit) cost of goods of the H100, but that's proprietary info no one outside of Nvidia execs and accountants will ever see.


Yeah, it's absurd. Current market pricing is more like $4.50 for hourly on-demand and around $2 for long term reservations.


Agreed, that’s what I’m seeing too.


There are already providers today which claim to give you H100 access at $2-4 per hour, with little commitment.

So those prices seem wildly optimistic.


Great article. I'm in the process of building this business myself, so I'm intimately aware of everything in the article. Just keep in mind that a lot of the math is back-of-the-napkin guessing. All of this is managed through personal relationships and deals with the vendors, and much of the pricing that actually gets set isn't made public.

Our first offering is the AMD MI300x instead of Nvidia or Intel products. But, unlike all of my competitors, we are not picking sides. Our unique value prop is that we eventually plan to offer the best-of-the-best compute, whatever our customers want. Effectively, we're building a varied supercomputer for rent, with bare-metal access to everything you need and white-glove service and support. In essence, we are the capex/opex for businesses that don't want to take on this risk themselves.

What people don't understand is how difficult and capital-intensive it is to deploy and manage large-scale compute, especially on the high-end cutting edge. Gone are the days of just racking and stacking a few servers. This stuff is way more complicated and involved. It is rife with firmware bugs, limitations, and hardware failures.

The end of the article says some nonsense about a glut of GPU capacity. I do have to call that out. It isn't going to happen for a long while at least. The need for compute is growing exponentially. Given the complexities of just deploying this stuff, it isn't physically possible to get enough of it out there, fast enough, to satisfy the demand.

I love every challenge of this business. Happy to answer questions where I can.


> It is rife with firmware bugs, limitations, and hardware failures.

Is it? I've been out of the space for a while but I have only seen firmware bugs/limitations/failures rise to strategic importance with AMD installs, which is why the space has such a strong preference for NVidia despite spec sheet cost disadvantage.

> Our first offering is the AMD MI300x instead of Nvidia or Intel products.

Oh no. I'm sorry.


> Is it? I've been out of the space for a while but I have only seen firmware bugs/limitations/failures rise to strategic importance with AMD installs, which is why the space has such a strong preference for NVidia despite spec sheet cost disadvantage.

You're right. I'm not doing this the easy way, at all! Is there any easy way though?

Without a direct connection to Jensen, there is a 50+ week lead time on Infiniband, so what other options do I have? Who wants to compete with $1.1b worth of orders from CoreWeave?

I'm fine with the bet on Lisa Su, and I'm not trying to beat CoreWeave. I'd just like to carve out my own niche, and I think there are enough people out there who want compute that they're willing to look beyond a single provider for everything. What Fortune 500 puts all their eggs in one basket anyway?

Oh, and one more thing: this business isn't just about AMD/Nvidia. There are 1,000 other components in the system, all with their own issues. We just discovered VRF is not fully working on our very expensive 400G switches running SONiC.

Cheers.


> Without a direct connection to Jensen, there is a 50+ week lead time on Infiniband

Oof, that's new to me; sounds like you're between a rock and a hard place. Here's hoping you get the wrinkles ironed out and make it to the other side; the ecosystem will certainly be better for it.

Cheers.


>Our first offering is the AMD MI300x instead of Nvidia or Intel products.

But why? Is there even a market for this? People want GPUs for DNN training and inference and the AMD ecosystem for that is absolute trash. Basically unusable at scale.


I'd like you to back up that statement with some sort of basis. Given that there are zero reliable and detailed public benchmarks on MI300x, how would you know it is unusable at scale?

On the other hand, we've got two hyperscalers offering these things, at scale.

https://techcommunity.microsoft.com/t5/azure-high-performanc...

https://www.tomshardware.com/news/amd-scores-two-big-wins-or...

Oh, and I have three other competitors in the smaller CSP realm offering the MI300x that I know of today.


> the AMD ecosystem for that is absolute trash

> there are zero reliable and detailed public benchmarks on MI300x

I'm not very familiar with the space, but as an outsider looking in, your data point appears to support the parent poster's opinion that the space is not mature. Wouldn't a complete lack of public benchmarks suggest that the ecosystem just isn't there yet?


They didn't say it wasn't mature; they said it is absolute trash.

My point is to agree that it isn't mature. Why isn't it mature? Well... nobody has access to these things. Today, they are only available behind HPC / hyperscaler walls. In order to gain some maturity, someone needs to offer them out in the wild. That's where I come in.


I challenge you to try to get it working and train a basic LLM yourself on multiple AMD GPUs. It is far worse than the situation with Nvidia GPUs/drivers and Linux circa the mid-2010s. Which is saying a lot...

But don't take my word for it. Geohot, maybe one of the best hackers of our generation, can't even get it working: https://wccftech.com/tinycorp-ditches-amd-tinybox-ai-package...

It IS trash and it's clearly NOT mature. These are not mutually exclusive. There have been attempts to get AMD to open-source their firmware to let people work on it, since they themselves seem disinterested in doing so. Until then, AMD is not a real choice for serious ML work.


Quoting George is a bad look and instantly discredits you as a parrot.

Did you know... George has never used an MI300x and refused my multiple offers to let him try mine, calling me an "AMD shill." If you know me personally, that is hilariously the furthest thing from the truth one can get.

Again, before you can challenge me, I have already challenged you to back up your claim that "it is trash." Quoting George certainly is not that.


Fair enough.


Just adding some information: the article claims Tesla has 15k H100s, but they actually have 40k H100s and will have 85k by the end of the year. https://www.reddit.com/r/NVDA_Stock/comments/1cbwvnr/tesla_4...

As usual, third-party estimates deserve a bit of doubt.


Meta plans to have 600,000 "H100 equivalents" of compute capacity by the end of the year. https://www.datacenterdynamics.com/en/news/meta-to-operate-6...

Microsoft, Alphabet, Amazon have similar plans. They make up 40% of Nvidia sales.


The AI boom, combined with bottlenecks in foundry capacity and advanced packaging, has created contango in GPU sales: the value of a used H100 has gone up after sale.

The GPU supply-demand curve is nowhere close to where Nvidia would like it to be. Demand is so high that Nvidia would make at least 2X the profit if it could sell 3X more GPUs. TSMC just can't build new fabs fast enough.


I'd say this is the perfect moment to quote Taleb:

"I've seen gluts not followed by shortages, but I've never seen a shortage not followed by a glut."


I suggest you try buying some out-of-print Blu-rays.

Often there's enough demand to make the cost of used copies go sky-high, but not enough demand to make a new print run profitable.

(This applies double to out of print video games.)


Assuming demand is more or less constant.


The used GPU market is also likely being driven by grey-market sales to China, since Chinese buyers are prevented from buying directly by export regulations. Some random eBay seller is not going to check whether the person they are selling to is some sort of complex front for a mainland-China-based company.


Great article. If this sustainable arbitrage exists from renting GPU time instead of selling and shipping GPUs, why doesn't Nvidia become a cloud provider itself?


Why are Ford and Enterprise Rentacar different companies?

(ok, this does actually require a substantial amount of business theory to explain, but the shorthand is "core competence". And different risk profiles.)


Because Enterprise don't have to offer you a Ford? I'm in the UK and your chances of getting a Ford are slim to none.


If Elon gets his way with his idiotic robotaxi idea, Tesla would be the equivalent of "Nvidia being a cloud provider".


> why doesn't Nvidia become a cloud provider itself?

DGX Cloud exists: Nvidia sells the GPU to a hyperscaler and then builds a cloud on top of it and charges a markup.

Gets paid twice.


Simple: because Nvidia can charge twice for the same GPU that way.

Building a cloud data center means paying the cost up front and then earning it back through rent.

But if you sell the GPU to a CSP, the cloud provider takes on that cost and has to rent the GPU out, while you make money on the sale immediately.

But what about getting paid continuously? Simple: the end customer is no longer the cloud provider but the cloud's customer. Develop software for your GPU and attach a license to it, and the end customer then pays rent to the cloud provider and a license fee to the GPU maker. That's Nvidia's business model for now.

Eventually, demand for GPU sales will drop, but by that time Nvidia will have a large installed base of end customers using its enterprise software. Then Nvidia will start competing with the CSPs, and it can do so easily because it can deploy data centers at cost and offer better pricing. But today, Nvidia needs the CSPs to spread its software. The primary goal is to reach all the end customers via DGX Cloud and to create mindshare and moats. The CSPs are only a means to an end for Nvidia, which could easily outcompete them on AI compute if it wanted to.


Nvidia is fond of and successful at biting the hand that feeds it, so maybe...


Elaborate?


Apple and EVGA maybe?


Why doesn't every chip vendor pull a reverse-Apple and replace all the OEMs in the value chain?

They don't do this because they'll annoy all those folks they're trying to replace (see also 3dfx's move from selling chips to being a card vendor as precedent), and have to build a whole set of skills, relationships, and infrastructure they don't have, and there's risk involved in all of that.

See also why Nvidia (and every other GPU vendor) didn't just mine cryptocurrency instead of selling cards during the last boom.


They're hiring for roles that look like cloud and infrastructure engineers, although that may be for internal infrastructure for training their own models.


GeForce NOW is kind of that.


It's not an impossibility that they could eventually do that, but it would be very difficult for them for a number of reasons that other folks have listed.

Bitmain, a manufacturer of bitcoin mining hardware, was using their machines to mine bitcoin before eventually shipping them to customers.


This was a horrible business practice that they got called out heavily on. When I was mining Litecoin, we received a bunch of units that were obviously used. They would claim they were being "tested" first... uh huh.


What were the indications of prior usage?

Edit: @latchkey oh, how delightful! :-/


Swapped fans, and the insides of the boxes were filled with dust and crap.


It's not a risk-free investment; you're basically making a bet on future GPU demand. If demand dips, you're left with a bunch of hardware gathering dust (or no longer operating at a profit).


There’s speculation that they could become one, should it be sustainable for them to do so. I don’t think it will be.


They've invested heavily in CoreWeave.


So should we buy CoreWeave and Lambda shares after IPO or not?

For enthusiasts, even renting from Runpod, Salad, or Vast.ai, which are an order of magnitude cheaper than established cloud providers, is much more expensive than buying an RTX 3090 or 4090. Which got me thinking: why don't companies that need training put their money together, buy some H100s, and share them?


> Which got me thinking: why don't companies that need training put their money together, buy some H100s, and share them?

Because that's a lot of capex to take on, not to mention accounting hassles, depreciation, and install/maintenance expenses. Your average bunch of startups does not want to tie up a giant chunk of money in hardware, and most established companies don't want to either; hence the giant "move to the cloud" epidemic of recent years. On top of that, you need some serious expertise to properly wire up an AI training cluster so you can actually use the GPUs to their full potential, and such experts are rare.

The problem is that MBA beancounters don't have a way to put a number behind "what if the cloud provider suddenly decides to terminate us", "what if creds leak and someone mines shitcoins for a week or two", or (for everyone outside FVEY) "what if the NSA decides to compel the cloud provider to turn over our server data containing our trade secrets", and so this gets all ignored.


This is a good answer. Don't forget that H100-class compute is US export-controlled, and there are some hard limitations we sign agreements on, covering what the GPUs can and cannot be used for (like building bombs and such).

CSPs are taking on that KYC risk too.


Fat citation needed on "order of magnitude cheaper".

OCI is a hyperscaler/established provider and offers GPUs for quite cheap compared to the others. They are competitive with Runpod, Vast, and Salad, especially if you negotiate (which everyone who knows this business does).


They will when federated learning kicks off.


Nvidia is a manufacturer. They’ve read all the books of all of the giants of logistics. They know that throughput means nothing if it doesn’t include sales. They know how to reduce in-process work and get from raw materials to boxes on store shelves as quickly as possible.

Which is to say: they know inventory is a liability and they know how to get rid of it.

Renting out equipment is an inventory management problem, which manufacturers don’t understand on purpose. That’s somebody else’s domain.


> there are only 35,064 hours in a year with 365.25 days

Wut. 96 hours per day? I don't trust the maths in this article.


Jeez, I didn't pick that up when reading, and it's an egregious error. But they probably mixed it up with the number of hours in 4 years, since the 4-year timeline is used throughout the article.
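
For what it's worth, the arithmetic supports that reading:

    hours_per_year = 365.25 * 24   # 8,766 hours in an average year
    print(hours_per_year * 4)      # 35064.0 -- the quoted "35,064" is four years
    print(35_064 / 365.25)         # 96.0 -- hence the "96 hours per day"

So the number itself is right for the article's 4-year window; only the label is wrong.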


Probably done with GPT.



Is there a thread there, or just that one tweet?


Twitter does not show you the full thread unless you are logged in.

It's very, very stupid.


So, is there a thread there, or is it just the one tweet?


> The extracted value figures are opportunity costs, not direct revenue.

What does this mean?


Utilization is never going to be 100%. Discounts are often given, servers break down, etc.




