Hey peeps, full disclosure: I work as one of Linode's R&D engineers. I'll try to get to as many of these questions as I can.
One of the biggest questions is: why the Quadro RTX 6000? A few things:
1. Cost: it has the same performance as the 8000. The difference is 8 more GB of RAM that comes at a steep premium. Cost is important to us because it allows us to offer a more affordable price point.
2. We have all heard of or used the Tesla V100, and it's a great card. The biggest issue is that it's expensive. So one of the things that caught our eye is that the RTX 6000 has fast single-precision, Tensor, and INT8 performance. Plus, the Quadro RTX supports INT4.
https://www.nvidia.com/content/dam/en-zz/Solutions/design-vi... https://images.nvidia.com/content/technologies/volta/pdf/tes...
Yes, these are manufacturers' numbers, but they gave us pause. As always, your mileage may vary.
3. RT cores. This is the first time (TMK) that a cloud provider is bringing RT cores into the market. There are many use cases for RT that have yet to be explored. What will we come up with as a community?!
Now, with all that being said, there is a downside: FP64, aka double precision. The Tesla V100 does this very well, whereas the Quadro RTX 6000 does poorly in comparison. Although those workloads are important, the goal was to find a solution that fits the vast majority of use cases.
So, is the marketing true? To get the most out of ML/AI/etc., do you need a Tesla for the best performance? Or is the Tesla starting to show its age? Give the cards a try; I think you'll find these new RTX Quadros with the Turing architecture are not the same as the Quadros of the past.
If you really want low-cost compute for deep learning, need lots of it, and don't want to pay for V100s, then the AMD Vega R7 is the card for you: $700, 16GB of RAM, 1 TB/s of memory bandwidth (higher than the V100!), works with TensorFlow (pip install tensorflow-rocm), and about 60% of the performance on ResNet-50. FP64 is not fully gimped (it is halved, I think - so still quite good). Put lots of them in servers with PCIe 4.0, and you can do great things. Here's a recent talk on it:
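Separately, for anyone wanting to kick the tires, here is a minimal sanity-check sketch, assuming a recent tensorflow-rocm build (2.x-style API) and a ROCm-supported card; the point is that the ROCm build exposes the standard TensorFlow API, so existing model code runs unchanged:

    # Minimal sanity check, assuming `pip install tensorflow-rocm` (2.x-style API)
    # on a ROCm-supported card; only the backend build differs from stock TensorFlow.
    import tensorflow as tf

    # Should list the Radeon card as a GPU device.
    print(tf.config.list_physical_devices("GPU"))

    # Tiny smoke test: one ResNet-50 forward pass on random data.
    model = tf.keras.applications.ResNet50(weights=None)
    images = tf.random.uniform((8, 224, 224, 3))
    print(model(images, training=False).shape)  # (8, 1000)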
Two of my colleagues use high-end AMD GPUs to train RNNs and transformers with tensorflow-rocm. There are still some nasty bugs (e.g. [1]), so it is currently not for everyone. However, given how far they have come compared to 1-2 years ago, it is very likely that in a year or so they will be a real competitor to NVIDIA for compute. That competition was long needed.
Agreed, it is not quite prime time yet. They are trying to upstream all the ROCm stuff into TensorFlow, and when it gets into mainline and stabilizes, I agree that it has great potential to take off, particularly among price-sensitive researchers and large companies who need huge GPU farms.
This is a terrible suggestion/comparison. AMD has nowhere near the software support in the ML/AI space that Nvidia has. I wish that AMD would invest in a CUDA competitor and break Nvidia's monopoly, but that is not even close to being a reality, unfortunately.
> The difference is 8 more GB of RAM that comes at a steep premium
This is incorrect. The RTX 6000 has 24GB of VRAM and is $4000, and the RTX 8000 has 48GB of VRAM (double the amount) and is $5500. Is it worth the price increase? For a lot of people I know it is.
Also, the RTX Titan is $2500 and is identical to the RTX 6000 (at the chip level) and also with 24GB of VRAM, with the only difference being software enabling of additional H.264/5 encoding features on the Quadro. Definitely not worth the cost increase, especially for anyone doing ML.
If you reason as a consumer, the RTX Titan makes a lot more sense than the RTX 6000; however, Nvidia forbids datacenters from using consumer cards [1], so their choice makes sense.
Except "datacenter" is not defined by NVIDIA in their EULA at all, and plenty of large and small datacenters continue to use "consumer cards" regardless of NVIDIA's fear mongering. I know that Tesla, OpenAI, Microsoft, Apple, and many others all continue to primarily buy 2080 Tis, RTX Titans, and Titan Vs since the EULA change.
Companies make unenforceable claims all the time. That's why we've got courts. They're almost certainly never going to take anyone to court, because if they did, it would get tossed out. They can't pull the same "it's a license to a product" BS media services do, though they still try with the driver. I think for now they've just run the numbers and figured out it gives them slightly higher datacenter card sales.
What incident are you referring to? (genuine question)
As far as standards go, we use Linode and all of our customers (some of them quite demanding about internal security details) have been satisfied with the various acronyms they are accredited with... Although I understand this does not necessarily guarantee anything about response behavior, so interested to hear about past incidents.
We've also been making ongoing improvements to our application security and security infrastructure through the implementation of a DevSecOps culture. This is something we take very seriously.
A GTX 1080 for $100 a month. Granted, it is older, but it still works for DL. Let's say you do 10 experiments a month at ~20 hours each; that's $0.50/hour, and I don't think the RTX 6000 is 3 times faster.
If you then do even more training, the effective price goes down further.
//DISCLAIMER: I do not work for them, but I used it for DL in the past and it was definitely cheaper than GCP or AWS. If you have to do lots of experiments (more than a year's worth), go with your own hardware, but do not underestimate the convenience of >100 MByte/s downloads if you pull in many big training sets.
For traditional floating point workloads, the RTX 6000 will probably not be 3x faster. For workloads that can use the tensor cores (low-precision matrix multiplies, basically: FP16/INT8/INT4), the RTX 6000 may be as much as 10-100x faster.
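As a rough illustration of what "using the tensor ops" means in practice, here is a hedged sketch using the Keras mixed-precision API (TF 2.4+); plain FP32 code paths stay on the ordinary CUDA cores, while FP16 matmuls and convolutions can be routed to the tensor cores:

    # Sketch: letting TensorFlow route matmuls/convs to the tensor cores by running
    # them in FP16 (Keras mixed-precision API, TF 2.4+). Plain FP32 models do not
    # benefit from the tensor cores at all.
    import tensorflow as tf

    tf.keras.mixed_precision.set_global_policy("mixed_float16")

    model = tf.keras.Sequential([
        tf.keras.layers.Dense(4096, activation="relu", input_shape=(4096,)),
        tf.keras.layers.Dense(10),
        # Keep the final activation in FP32 for numerical stability.
        tf.keras.layers.Activation("softmax", dtype="float32"),
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")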
It is not a server card; however, it is much faster than any old AWS instance at $1k/month (if you happen to be an AWS user and did not want to upgrade because of the price going up 3x).
TBH, $100 per month is practically free. Most researchers do not have $1k/month for a server; at that point it is cheaper to buy hardware and put Linux on it.
There are of course other options and Linode is kinda late to the party, but I am happy they made this move.
>There are of course other options and Linode is kinda late to the party, but I am happy they made this move.
Considering their main competitors, DO, Vultr, UpCloud, none of them offers any GPU instances, I don't think they are late at all. If anything, they may be the first for their market segment.
How does data in/out work in practice with them? I see this 4 Tbit bandwidth but do you happen to know what that translates to and what happens if you exceed that?
Also, checking availability shows a current wait of up to 5 days:
“EX51-SSD-GPU for Falkenstein (FSN1): Due to very high demand for these server models, its current setup time is approximately up to 5 workdays.*”
Or maybe there are other regions/dcs.
I have like 18 of their auction servers that are unmetered at 1gbps and really make that bandwidth sweat. I've never had issues honestly, and they've never tried to dreamhost me. I love it.
So far I have not reached that limit (I used it to train networks for image segmentation), since I had mostly ingress and only downloaded large amounts to the machine, not from it (ingress is free, like with most providers).
But you can just ask them.
I have to say that not everything was 100% smooth. Sometimes the proprietary NVIDIA driver (you have to use the right CUDA and driver combination) crashed my Linux instance and hung the system, so I had to hard-reboot it (which is supported via their admin console), which takes a few minutes. However, that's not their fault, as I hear the driver is a big pile of crap anyway because NVIDIA is too embarrassed to post it to the LKML.
Technically speaking, it's NVIDIA's GeForce driver that restricts datacenter usage, not the card itself.
I haven't deep-dived into it, but maybe using nouveau instead of the GeForce driver works around that restriction.
You are allowed to use the driver in datacenters for cryptocurrency mining. The EULA's datacenter restriction hasn't really been challenged in court yet, and both sides would have an argument. NVIDIA is using the EULA to limit an activity that a user would be allowed to do if the location of that activity were different (and I'm not even talking about type of industry here, though that's probably in the EULA too). On the other hand, it's NVIDIA's software; they are free to license it how they like.
I've not deep-dived into nouveau for a while, so I wasn't sure if they had added CUDA support in the couple of years since I last played with it, which is why I only said "maybe".
It's a flat fee of $100/month, correct? What would be the best option if the amount of training you do is rather "occasional" (but simply using colab doesn't cut it anymore)?
I don't have specific experience with ML, but AWS spot pricing was by far the best deal for GPUs last time I checked. You can get something much more powerful than a GTX 1080 and get your task done more quickly. The downside is that your instance can be shut down at any time, with only a short warning signal to back up your progress, so it may or may not be suitable for what you're doing.
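If you go the spot route, the usual pattern is to poll the instance metadata endpoint for the interruption notice (AWS documents it as appearing roughly two minutes before the instance is reclaimed) and checkpoint when it shows up. A rough sketch, assuming IMDSv1, with save_checkpoint() as a placeholder for your own code:

    # Sketch: watching for the EC2 spot interruption notice so training can checkpoint.
    # The endpoint returns 404 until an interruption is actually scheduled.
    import time
    import urllib.request
    import urllib.error

    NOTICE_URL = "http://169.254.169.254/latest/meta-data/spot/instance-action"

    def interruption_pending() -> bool:
        try:
            with urllib.request.urlopen(NOTICE_URL, timeout=1):
                return True          # 200 response: a stop/terminate has been scheduled
        except urllib.error.HTTPError:
            return False             # 404: no interruption scheduled yet
        except urllib.error.URLError:
            return False             # metadata service unreachable (e.g. not on EC2)

    def save_checkpoint():
        ...  # placeholder: write model weights / optimizer state to S3 or EBS

    while True:
        if interruption_pending():
            save_checkpoint()
            break
        time.sleep(5)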
Does the price actually depend on whether you are using the GPU, or simply on the instance you choose? Let's say you need to do some work that will require a GPU, so you spend 5 hours setting up an environment, doing some light programming/experiments in a Jupyter notebook, downloading datasets, looking at the data. Then you train for an hour, then one more hour looking at the data, drinking coffee, stuff like that. Then train again.
So you were using the environment for 10 hours, but only 3 of them in total were using the GPU. Will you pay for 10 hours of GPU usage, or will only the 3 hours be expensive and the other 7 cheap?
If you use a GPU instance, you pay the cost for it whether or not you use the actual GPU. If the GPU time is short relative to the other stuff you are doing (like data cleanup), it might make sense to do your non-GPU setup on a different instance first.
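To put numbers on that, a back-of-the-envelope split for the 10-hour/3-GPU-hour scenario above, using the p3.2xlarge rate quoted elsewhere in the thread and an assumed ~$0.10/hr small CPU instance for the setup work:

    # Illustrative only: $3.06/hr GPU instance (p3.2xlarge), assumed $0.10/hr CPU instance.
    GPU_RATE, CPU_RATE = 3.06, 0.10

    all_on_gpu_instance = 10 * GPU_RATE                # $30.60
    split_instances     = 3 * GPU_RATE + 7 * CPU_RATE  # $9.88

    print(all_on_gpu_instance, split_instances)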
Still way too much money when a 2x 2080Ti comparably specced machine under my desk costs less than 2.5 months of their billing rate, and 4x 1080Ti servers in my garage cost about 1 month of their 4-GPU machine _and_ have more SSD storage. This pricing is totally insane, especially if not billed per-minute (which in Linode's case it is not) and if there are no cheaper preemptible/spot instances.
I'm starting to think one can adopt a simple rule: switch to a DIY build whenever there is enough work to keep a GPU busy for 2 months; otherwise, if the workload is intermittent, the better strategy is leasing, especially considering that the purchase cost per unit of performance is constantly dropping.
What's the cost for power? Serious question, and I'm not suggesting that this cost should account for a large percentage of the price, but genuinely curious. If your GPUs are working every hour of the month for you, how much is it costing you in electricity?
Quad-GPU machines draw about 1.3 kW each on average when under load. That's about $100/mo where I live, assuming they're 100% loaded 24x7 for the entire month. Realistically it's less. So it's not free, but it's not a crazy amount either.
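For anyone who wants to rerun that estimate with their own numbers, the arithmetic is just draw x hours x rate; the ~$0.11/kWh below is an assumption:

    # Worked check of the estimate above; the electricity rate is an assumption.
    draw_kw = 1.3        # quad-GPU box under load
    hours   = 24 * 30    # fully loaded for a month
    rate    = 0.11       # $/kWh, varies a lot by region

    print(draw_kw * hours * rate)  # ~103 dollars/month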
Looks amazing. Linode has worked really well for me over the years.
One thing I noticed when recently trying to get a GPU cloud instance, the high core counts are usually locked until you put in a quota increase. Then sometimes they want to call you.
So I wonder if Linode will have to do that or if they can figure out another way to handle it that would be more convenient.
I also wonder if Linode could somehow get Windows on these? I know they generally don't do anything other than Linux though. My graphics project where I am trying to run several hundred ZX Spectrum libretro cores on one screen only runs on Windows.
That pricing isn't too bad. They come with decent SSD storage too, which is key for the large datasets that make a GPU instance worthwhile.
Linode skews more towards smaller-scale customers with many of their offerings, so I think the GPUs here make sense. The real test will be how often they upgrade them and what they upgrade them to.
Interesting to see another cloud provider go with Quadro chips. NVIDIA repackages the same silicon under several different brands (GeForce, Quadro, GRID, Tesla) and we (https://paperspace.com) have found Quadro to offer the best price/performance value. Despite minor differences in performance characteristics, such as FP16 support in the Tesla family, Quadros can run all of the same workloads, e.g. graphics, HPC, deep learning, etc. If you're interested in a similar instance for less $/hr, check out the Paperspace P6000.
The RTX 6000 is significantly faster than a P100 outside of FP64, and is the fastest or second-fastest GPU for non-FP64 work [1] (the GV100 is sometimes faster, sometimes slower than the RTX 6000, but costs more). For FP64, GV100-based GPUs are quite a bit faster than P100s.
Also, you should really ignore pretty much all of the comparison sites that show up when you search for computer component comparisons, as they're nearly all awful. The one you posted doesn't show a single benchmark comparison between them, and compares numbers like clock speed, which isn't comparable between architectures, or memory clock speed instead of memory bandwidth, leading to the laughable conclusion that the RTX 6000 has "9.9x more memory clock speed: 14000 MHz vs 1408 MHz" vs the P100, when the P100 uses HBM2 and has 732.2 GB/s of actual memory bandwidth vs the RTX 6000's 672.0 GB/s.
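The actual bandwidth figures fall straight out of effective data rate times bus width, which is exactly why comparing raw memory clocks across GDDR6 and HBM2 is meaningless; a quick check using the publicly listed specs:

    # Bandwidth = effective data rate (MT/s) * bus width (bits) / 8.
    def bandwidth_gbs(effective_mts, bus_bits):
        return effective_mts * 1e6 * bus_bits / 8 / 1e9

    print(bandwidth_gbs(14000, 384))   # Quadro RTX 6000, GDDR6: ~672 GB/s
    print(bandwidth_gbs(1430, 4096))   # Tesla P100, HBM2:       ~732 GB/s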
edit: I could be wrong, but I thought I read that AWS was $0.65 an hour for deep learning GPU use.
edit2: Did a quick look; the $0.65 doesn't include the actual instance, so it's around $1.80 an hour on the low end. I think this is cheaper.
p2.xlarge comes with an NVIDIA Tesla K80 GPU for $0.90/hr, but this is now an "old" GPU and the Quadro RTX 6000 should have much higher performance (though I was unable to find any machine learning benchmarks).
p3.2xlarge has NVIDIA Tesla V100 GPU which is NVIDIA's most recent deep learning GPU, but it's $3.06/hr.
That said, AWS is among the most expensive providers if you just need a deep learning GPU (but obviously AWS offers a lot of other useful things). For example, OVH Public Cloud has Tesla V100 for $2.66/hr. And comparable NVIDIA GPUs that are not "datacenter-grade" should be even cheaper; AWS, GCP, Azure, etc. are unable to offer them because of contracts when they buy e.g. the Tesla V100.
The K80s are super outdated now. Google used to offer them for free for 12hrs a day on their Colab platform, but they upgraded them to using the Tesla T4s. Note you can get a K80 on GCP (unreserved) for $.45/hr.
The K80 is the crappiest card I used last year. It was a good choice 2 years ago for sure, but now you are better off upgrading, as any new desktop card is better than a K80.
It depends; for full-time usage it is a bit more expensive, I think by a few hundred dollars a month, probably less. We happily migrated from AWS, as just one GPU instance cost us nearly $1k/month. BTW, the newest, and now the only available, GPU instances should be the better RTX 6000 ones, even if more expensive.
GPUEater do, I think. Right now, though, they are a viable option for an on-premise use case where you have a budget of, say, $100k or more, need a huge amount of compute, and have larger models to train. The Vega R7 gives you 16GB of RAM (vs 11GB in the 2080 Ti) and is just slightly lower performance than the 2080 Ti (322 vs 302 images/sec for ResNet-50, from here: https://www.youtube.com/watch?v=neb1C6JlEXc ). And you have servers with PCIe 4.0 support, so distributed training scales (yes, the 2080 Ti supports NVLink, but NVLink servers cost way more).
Simple math example: a PCIe 4.0 server with 256GB of RAM and 8x Vega R7 should cost around $10K. With a couple of switches and racks, you can get hundreds of GPUs for just a couple of hundred thousand dollars (note: only 2 GPU servers per rack is normal for now, otherwise you have to buy non-commodity racks with a higher power budget).
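Rough version of that math, with every price taken from the parent comment except the switch/rack line item, which is an assumption:

    # All prices rough; the $30k for switches/racks is an assumption.
    server_cost  = 10_000   # PCIe 4.0 box, 256GB RAM, 8x Vega R7
    gpus_per_box = 8
    servers      = 24       # at 2 servers per rack, that's 12 racks

    gpus  = servers * gpus_per_box            # 192 GPUs
    total = servers * server_cost + 30_000
    print(gpus, total)                        # ~192 GPUs for ~$270k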
Testing, if nothing else. If you are shipping to people running AMD GPUs, then that would be useful without having to buy a card and another machine for it to go in.
Can these be used for crypto mining at any level of efficiency? I was able to mine GRLC back in the day on AWS spot instances at a VERY mild degree of profitability.
Doubtful, since these are just fully unlocked TU102 GPUs (same as the Titan RTX; the 2080 Ti is the same TU102 GPU but partially locked, at 4352 cores vs 4608 for the Quadro RTX 6000/8000 and Titan RTX). If you could be profitable with this at $1000/month, then people would be flocking to buy 2080 Tis for $1100 and getting 90-95% of the hashrate.
They wouldn't be available if they were profitable for that. Providers usually make you do extra verification to use these instances because, at one point, people were using them for that: not because it was profitable, but because they were using stolen cloud accounts/cards.
Not really; most cryptocurrencies are at the stage where the only effective setup is a combination of custom ASICs and nearly free electricity. About twelve months ago I looked into mining Ethereum with state-of-the-art GPUs, and it would not have had a reasonable ROI unless I was literally paying $0.00 per kWh. And that was before its value per coin dropped a lot.
When the value dropped, the network hashrate dropped and difficulty went down so things actually became profitable again.
The best time to mine is during the drops, not the highs, unless you follow buy high, sell low and don't believe the market will correct for the better again (which it has).
Of course, it depends on electricity prices, but it is profitable to mine ethereum, especially if you know how to tune the cards to maximize hash/consumption.
That said, mining is competitive and difficult and unless you are going to go really large, don't bother. If you are interested in learning about it, definitely experiment though don't expect to make a lot of money.