Introducing Preemptible GPUs (googleblog.com)
240 points by boulos on Jan 4, 2018 | 80 comments



A month ago I ran benchmarks on CPUs vs. GPUs on Google Compute Engine, and found that GPUs were now about as cost-effective as using preemptible instances with a lot of CPUs: http://minimaxir.com/2017/11/benchmark-gpus/

That was before preemptible GPUs: with the halved cost, the cost-effectiveness of GPU instances now doubles, so they're a very good option for hobbyist deep learning. (I did test the preemptible-GPU instances recently; they work as you'd expect.)


It's also a huge gain to a number of science disciplines that don't have access to a mainframe or supercomputer, but have development resources.


Yes, because as we all know, university teams don't have access to hardware, but are swimming in money. (obviously the whole point of doing cloud for the cloud providers is that owning the hardware is cheaper)

Whilst I would agree that university teams probably should use the resources the cloud providers make available freely, they should probably stay away from actually using capacity on the cloud and instead have their own hardware.

Besides, what I keep hearing from machine learning researchers is that no matter where you work, developing on your own machine ... there's no beating that, time and productivity wise.


I used to do IT for a large research university. I can confirm that we were practically drowning in funding for new hardware. If we made a business case for upgrading our infrastructure it wasn't uncommon to get an extra million or so dollars in the budget without much thought.


Can't disagree with that, but for some genomics work it's just not a realistic option when you can get output from a slice of a mainframe in 1/10th the time, and that time is still two-and-change days.


Related HN post to your blog post on the benchmarks on CPUs vs. GPUs on Google Compute Engine: https://news.ycombinator.com/item?id=15940724


I'm always disappointed in these comparisons since they never look at AWS spot instances, which are the real competitors for hobbyists.


I think the real competitors for hobbyists are physical GTX 1060s. Hobbyist cloud-computing is a different question though.


Doesn't Nvidia's new EULA make that difficult? Anything requiring more than a handful of GPUs would be classified as a data center deployment, which is against the EULA.

IANAL though so I might have interpreted this incorrectly.

http://www.nvidia.com/content/DriverDownload-March2009/licen...


I doubt that any court would consider a half-rack in someone's closet to be a "datacenter", nor would I expect Nvidia to enforce that EULA term against a hobbyist.


But if that hobbyist ends up creating a billion dollar business, that's leverage Nvidia has for a lawsuit.

It's a similar strategy to Adobe's: they won't sue a single user for pirating Photoshop, but the second that user has a successful business... that's a different story.


> ends up creating a billion dollar business, that's leverage Nvidia has for a lawsuit.

A great problem to have. Maybe first concentrate on creating a billion dollar business and by that time you can afford to get some 'approved' cards.. ;)


They (Nvidia) don't have any recourse beyond withdrawing support. The EULA is on the (free) drivers, so a court would find no monetary damages. (IANAL etc.)


I'm not getting it. Do they want to sell a product or a service?

And instead of restricting our rights, shouldn't we get a discount when buying multiple GPU cards?


When I bought a 2016 MacBook Pro, I was expecting to buy an external GPU + an nVidia card so I could do deep learning/video gaming.

Unfortunately, using nVidia GPUs with a Mac is still fussy even in High Sierra. And with the GPU instance price drops making deep learning pay-as-you-go super affordable, it's no longer worth the physical investment in a card, especially because they depreciate quickly.


Depends on how much you're using this. Looks like right now, for a K80 + 4 cores / 15 GB of RAM, you will have to pay $0.26 per hour. A PC with a 1060 would probably run you less than $600, so that's ~100 days of using the VM 24/7, excluding network traffic on one side and power consumption on the other (although the latter shouldn't be that much when looking at the whole cost).

And you'll still have a pretty powerful PC at home for everyday use/gaming.
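
A rough sketch of that break-even math, using the figures above (assuming the quoted $0.26/hour rate and a $600 PC; power and network traffic excluded):

    # Rough break-even: one-time cost of a 1060-class PC vs. renting a
    # preemptible K80 VM (4 cores / 15 GB RAM) at the rate quoted above.
    pc_cost_usd = 600.00          # assumed one-time PC cost
    vm_rate_usd_per_hour = 0.26   # preemptible K80 + 4 cores / 15 GB RAM
    hours = pc_cost_usd / vm_rate_usd_per_hour
    print("break-even: ~%.0f hours (~%.0f days of 24/7 use)" % (hours, hours / 24))
    # -> break-even: ~2308 hours (~96 days of 24/7 use)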


I wouldn't call someone who trains models 24/7 a hobbyist.


One benefit to owning the hardware is that you can use it to mine cryptocurrencies when you're not using it. I bought my GTX 1060 for $200 used in March 2017, and it's generated around $1300 worth of Ethereum...


What's your power bill like compared to before you started running a miner?


    0.3 kW * 0.20 $/kWh * 24 h/day * 360 days = $518.40 per year of electricity
Your mileage may vary with the costs of electricity in your region and whether you really run 24/7.

In my experience, people always operate around break even. They only make good money if they held their coins and the price increased over time.


How much did it generate before the price of cryptocurrency exploded in November?


It's been at around $40/mo net since March, briefly shooting up to over $100/mo in July (when the price of Eth increased faster than network Hashrate) and now it's back down to $40/mo. The overall profit is higher than the sum of the monthlies because I managed to hold some Eth rather than selling.


Whoa! Where/How can you get a powerful PC with a 1060 for less than 600?

I'm not even being sarcastic, I'm thinking of building my first gaming PC this year.


A short guide: https://www.reddit.com/r/buildapc/comments/6i9jbg/guide_used....

TL;DR: buy a used business-class desktop with a decent PSU for $300-$400, and stick in a GPU in the 1060 class.


I put a note about spot instances at the end (the economics of spot instances are a bit tricker for calculating cost-effectiveness due to price variability).

The cost of a K80 preemptible instance on GCP is now close to the approximate cost of a K80 spot instance on AWS, though, so there's a bit of competition.


Disclosure: I work on Google Cloud (and helped launch this).

Note that our pricing is flat regardless of the number of GPUs attached (and you don't need to buy as many cores to go with them). By comparison, Spot often charges more than on-demand pricing for anything other than single-GPU instances.

Thanks again for your write up, sorry about the confusion as we delayed announcement until the new year.


The comparison with AWS Spot instances is meaningless, since with the recent shift to per-second pricing the AWS spot market has changed significantly: there is very little price volatility, but it's also almost impossible to get a spot instance with a GPU allocated. I almost always get a spot-capacity-not-available message.

Google's preemptible model achieves a much fairer distribution of GPUs than AWS's model. AWS ended up with a spot market because you got charged for the entire hour up front, and if AWS evicted you, you got that entire hour for free; this made pricing spot instances very complicated for both users and AWS. With the shift to per-second billing this issue has been largely eliminated, since you now only get the first 10 minutes free if you get preempted within those 10 minutes.

tl;dr: GCP preemptible instances are superior to AWS along all possible dimensions: price, flexibility, IO, etc.


Disclosure: I work on Google Cloud (and helped launch this).

While I appreciate the sentiment, there are certainly things some people prefer about Spot. For example, you basically get up to 4 minutes notice compared to the 30 seconds I chose (see the other thread). Similarly, until this change we didn't offer preemptible VMs with GPUs attached.

You're right about the complexity of Spot. Preemptible (and Azure's copy, Low-Priority VMs) is all about a fair, predictable price. Not everyone hates markets, but the number of companies and customers burned by the Spot market gave us the conviction to push for a simpler, fixed price.

Again, thanks for the praise (but there are pros and cons!).


This is a great (read: inexpensive) resource for big-data modelling if someone is willing to build for it. By choosing to be the customer of least concern and designing software that can handle being bounced, you get a terrific deal, assuming your work isn't particularly time sensitive, such as in a bioinformatics research lab.

I wish they published the minimum time you'll be granted cycles, though, as that information seems like an important design constraint when choosing the size of the data chunks you design for.


> Compute Engine may shut them down after providing you a 30-second warning, and you can use them for a maximum of 24 hours.

That's a good start...looks like the model is "save your work often". I'm sure you could run a few instances for a while and model the distribution of time granted.


Right, my concern is more about them axing work 32 seconds in or something, which could be a problem for some data analytics tools I've seen that need a good 70 seconds or so to reach a steady state.


Quoting their docs: "Compute Engine doesn't bill for instances preempted in the first 10 minutes," so that scenario is harmless in most application designs where preemptible VMs would make sense anyway.

Disclosure: I used to work at Google on GCE, but not directly on preemptible VMs or this billing policy. I don't currently work or speak for Google.


Cloud is extremely expensive compared to your own hardware, especially for things like modelling (even at this price, every ~4 months of compute essentially pays for a great machine, ignoring power). Plus the experience of doing modelling on a machine sitting under your desk is far superior to doing it remotely.

Once you have a good pipeline and need to deploy it, sure. Then we can talk.


This is exactly how the AWS spot market works. If you don't mind getting bumped off, you can cut GPU costs to 30% of the usual.


What I would really love (and use heavily!) is a sort of "AWS Lambda" for GPUs.

I don't think anyone has such a product at this point.


Disclosure: I work on Google Cloud (and helped launch this).

I'm still sad about the PiCloud folks. They were trying to do this (sort of), but it's a tough business to be in as you sit between the cloud provider and the customer needing to make money in between. IIRC, they were an early Spot customer and as the cost of their infrastructure shot up, they got squeezed out. You could imagine something similar again for running CUDA programs in a "serverless" fashion, but it would probably have to come from a provider.


I asked an AWS solutions architect about this specifically and they said they had no plans to implement it yet (that they knew of). I’d like to see someone offer this too


what would you use it for?


Batch processing. Inference with NNs. Yes, even inference is much faster on a GPU. Perhaps a factor of 40.

Some jobs have time constraints and need to be completed as quickly as possible.

And I have very bursty workloads.

I would love such a service.


Bursty workloads are key here. In your opinion, what's the main challenge with existing hourly options (like Amazon's Elastic GPUs, or similar offerings from Google Cloud)? Is it the management/provisioning? So, if someone made a Lambda-like service, where one wouldn't need to worry about provisioning, you think there'd be a market for that?

Btw, I found some tasks (and some algos) don't parallelize well (e.g. wouldn't get any "faster" from employing 40 GPUs instead of 4), but it would definitely make sense to be able to run multiple models (or training runs) in parallel. Once your needs are well understood, it's almost always significantly cheaper to build your own GPU cluster.


Yes, it is precisely the management and provisioning that is difficult.

No one does hourly billing anymore, thankfully. GCE and AWS do per second or per minute...

What I do now is autoscale a group of GPU instances in AWS based on the size of the jobs queue. I create as many instances as will reasonably help, sometimes up to 100. Work is distributed to them. They turn off immediately once there is no more work to do.

The code runs in Docker containers, but I'm forced to maintain the base Linux system and NVIDIA drivers because no one provides a container or FaaS offering for NVIDIA GPU computation...

I get the sense that this is a common problem nowadays. The way NVIDIA manages software releases doesn't help anything. There's quite a bit of... churn. They don't play nice with anybody else, hence Linus's famous words.

There's totally a market for containers as a service with passthrough access to NVIDIA hardware. It best not be more than 30% more expensive than a raw instance though, or it won't be very exciting.
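
For reference, a minimal sketch of that scale-with-the-queue pattern, assuming an SQS job queue and an EC2 Auto Scaling group of GPU instances (the queue URL, group name, and jobs-per-instance ratio are hypothetical):

    # Sketch: set the desired size of a GPU Auto Scaling group from the
    # job-queue backlog. Names below are placeholders, not real resources.
    import boto3

    sqs = boto3.client("sqs")
    asg = boto3.client("autoscaling")

    QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/gpu-jobs"  # hypothetical
    GROUP_NAME = "gpu-workers"   # hypothetical Auto Scaling group
    JOBS_PER_INSTANCE = 4        # tuning knob
    MAX_INSTANCES = 100

    attrs = sqs.get_queue_attributes(
        QueueUrl=QUEUE_URL, AttributeNames=["ApproximateNumberOfMessages"]
    )
    backlog = int(attrs["Attributes"]["ApproximateNumberOfMessages"])
    desired = min(MAX_INSTANCES, -(-backlog // JOBS_PER_INSTANCE))  # ceil division

    asg.set_desired_capacity(
        AutoScalingGroupName=GROUP_NAME, DesiredCapacity=desired, HonorCooldown=False
    )

Run something like this from a scheduler; when the backlog hits zero the desired capacity drops to zero and the instances go away.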


The 30 second shut-off notice is pretty bad for deep learning training, IMO. You often will not be able to finish a mini-batch of training and save a model to disk in 30 seconds. There are probably ways to interrupt processing of a mini-batch, but it'll require some custom code (it probably will be hard to do in Keras, for example).


Disclosure: I work on Google Cloud (and helped launch this).

Actually, both Keras and raw TF have checkpointing built in (as does Torch). I believe it can be set up to run every epoch, but unless you're talking about many, many GBs of parameters, you can also stream lots of data to GCS in 30s.


You're talking about support for checkpointing every epoch, but a 30 second shutoff would require checkpointing every N mini-batches, which is not a typical use for checkpointing.


Losing some work in exchange for a large discount is still a good trade though. It's a non-goal to provide 100% efficiency :).


My point is, you would have to write a nontrivial amount of code to get this working. It would not work out of the box with Keras right now, for example.


I think you skipped over my first comment. This is built into Keras (and was the second result on Google for 'keras checkpoint'): https://keras.io/callbacks/#example-model-checkpoints

I don't consider that a nontrivial amount of code to write. Am I missing something?


I am well aware of Keras's checkpoint code. It only supports checkpointing every epoch. So no, it does not work out of the box with Keras.
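
For what it's worth, the missing piece is small: a rough sketch of a custom callback that checkpoints every N mini-batches instead of every epoch (the save path and interval here are made up):

    # Sketch: save every N mini-batches so a 30-second preemption loses at
    # most N batches of work. Path and interval are hypothetical.
    from keras.callbacks import Callback

    class BatchCheckpoint(Callback):
        def __init__(self, path, every_n_batches=100):
            super(BatchCheckpoint, self).__init__()
            self.path = path
            self.every_n_batches = every_n_batches
            self.batches_seen = 0

        def on_batch_end(self, batch, logs=None):
            self.batches_seen += 1
            if self.batches_seen % self.every_n_batches == 0:
                self.model.save(self.path)  # weights + optimizer state

    # model.fit(x, y, callbacks=[BatchCheckpoint("ckpt.h5", every_n_batches=100)])

From there you'd copy the file to GCS (e.g. with gsutil cp) so it survives the instance going away.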


Since you're getting a 50% discount, if you save every 5 minutes you still come out ahead; not to forget that if you get preempted within the first 10 minutes (assuming 4 minutes to spin up the training), you got those batches for free!


Great! Depending on the framework you use this will not be trivial to implement correctly.


Can anyone explain how you cope with the 30s shutdown notice? Do you save your network every few iterations to your SSD, or is it feasible to save the whole 16 GB from the P100 within the 30s window? I mean, we can freeze a running operating system (from RAM to disk); can we do the same with GPU memory?


See preemptible instances as something that should only run jobs that can be re-scheduled. You don't save what you're doing; you close what can be closed and mark all tasks that have to be re-done. If you are 3 minutes into a task that takes 5 minutes, stop it and put it back on your global queue; the gain of running on preemptible capacity will compensate for the loss of rerunning a few tasks twice.
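
On GCE, one way to detect the preemption is to watch the instance's "preempted" flag on the metadata server and push the in-flight task back onto the queue when it flips. A rough sketch (the queue interface is hypothetical):

    # Sketch: poll the GCE metadata server's preemption flag and requeue the
    # current task when it flips. The queue object is a hypothetical interface.
    import requests

    PREEMPTED_URL = ("http://metadata.google.internal/computeMetadata/v1/"
                     "instance/preempted")

    def is_preempted():
        resp = requests.get(PREEMPTED_URL, headers={"Metadata-Flavor": "Google"})
        return resp.text.strip() == "TRUE"

    def worker_loop(queue):
        while True:
            task = queue.pop()        # hypothetical global work queue
            if task is None:
                return
            if is_preempted():
                queue.push(task)      # mark the task to be re-done elsewhere
                return
            task.run()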


But isn’t it still way too expensive?

Performance-wise, that P100 is close to a GeForce 1080. That GPU currently retails for $550.

For the price of one month of Google's preemptible IaaS, you can buy the GPU and use it for as long as you want, paying only a very small amount (depending on where you are, $0.02-$0.10/hour) for electricity.


I think it's more about the flexibility of having access to more resources than you could ever need. If you buy a 1080 then you are limited by its performance, and if, say, you wanted to train two models at the same time, you are even more limited. Whereas with something like this you could train 10 models right when you need them trained, and you aren't stuck with 10 GPUs that you bought just so you could do that.


I understand the IaaS is way more flexible, e.g. in cases when you need lots of GPUs but not for long the cloud is a clear winner.

However, IMO the price is just not OK.

Compare with traditional servers. For $0.70/hour you can use e.g. a c4.4xlarge Amazon instance. The hardware is much more expensive than $550; just the RAM is already around $400. You won't be able to purchase an equally performing server for one month of that IaaS rent; I think it'll be more like 3-6 months (which is IMO reasonable).


That's because they can amortize the server over a longer time window than a GPU.


If that's true, then why are they still running the K80s?


They didn't add the K80s until February of 2017, did you really think they'd literally throw out hardware after less than 12 months? "Accelerated" means 18-36 month shelf life, not 10.


You'll need at least a few more components to run your GPU, like a CPU, motherboard, etc., not to mention a reliable network, power, and other ancillary services.

The real power of cloud is not a replacement of one instance but the ability to turn on tens, hundreds or even thousands of nodes.


Those GPUs need to be housed somewhere, with power, and won't include any of Google's nice software stack to use them.


> That GPU currently retails for $550.

If you can find one.


Nvidia GPUs are pretty easy to get your hands on. Plenty of 1080s from $550-$600 on Newegg. AMD GPUs are not as easy to find.


Are you sure about that? All sub-$600 GTX 1080 offers are "out of stock".

The ones that aren't are asking $1,200+ apiece.

https://www.newegg.com/Product/ProductList.aspx?Submit=ENE&N...


Did anyone work out whether it's worth it for mining anything?


If it is, that'll change in 2 weeks when the difficulty factors in that everybody moved to preemptible GPU deals.


Unlike with deep-learning tasks, datacenter-grade (Tesla) GPUs don't have an advantage in mining over much cheaper consumer GPUs. As an example, a cloud-based server with dual Tesla P100s would get you 120-125 MH/s on ETH and would cost $2,500 per month. Or I could build a rig with consumer GPUs (1060/1070) for under $2,000 (one-time).

Interestingly, you cannot run consumer GPUs in a datacenter, as Nvidia driver terms prohibit it.


Downvoted or not, it's a legit question and I would like to know.


My experience in this realm:

Out of curiosity, in June or July I ran some Ethereum mining stuff on Azure's GPU instances.

The ROI was around -90% (negative 90%, to be clear: for $1 spent, I mined ~$0.10). Even with the higher price of Ethereum now, you'd probably be around that same ROI since the difficulty has also increased a fair amount.


Usually the most popular coins are not profitable to mine due to increased difficulty. I would suggest trying http://whattomine.com/ to decide.


I was making $30 or so a day in profit mining Ethereum at home at that time, so there was plenty of profit out there, just not with the cloud GPU stuff.


Non-preemptible GCE K80s already were profitable for Ethereum and a number of other currencies. Note that Google forbids cryptocurrency mining specifically during the $300 free trial, and a billing history is required to get a GPU quota raise; you start with zero.


Hmm. I didn't come to the same conclusion. Can you show your calculations?


Also interested in seeing any profitable Google Cloud mining instance.


I wish Google would just adopt spot prices like AWS EC2, where you can look at the 3-month spot price history to estimate cost and to discover combinations of availability zone and instance type with extra capacity. GCE’s preemptible instances are “easy” in that there is no spot price to bid, but the downside is the total absence of information, since there is no way to look up capacity once your instance is shut down.


This is a known problem. You should ping your account team for more information on upcoming development.

Your account team can also advise you on which zones you should deploy preemptibles in. Googlers can internally see current preemptible capacity and preemption rates at a per-zone level. In certain zones you're more likely to be preempted due to large customer demand.


AKA: not us-central1-c


Well, we need a way to do this with FPGAs.


How much money do companies spend on average per month for GPU computing? (Looking for anecdotal figures.)


That sounds like a number that would be fairly closely guarded by anyone who could answer you :/


You can just watch the spot prices for g2.2xlarge and see them go up to 4 times the on-demand price in some regions.



