I’d be far more interested in seeing TensorFlow work on AMD GPUs or standard FPGAs.
Nvidia GPUs, with a profit margin in the double digits, aren’t very cost-effective for such a task. Especially when you’re a student.
For the paper I’m currently writing, I ended up running the models on the CPU for months, as I couldn’t afford a good Nvidia GPU and my university couldn’t provide me with compute time (they have many thousands more students than available servers).
I agree with your sentiment completely, it would be a win/win to be able to use TF and other similar frameworks on AMD hardware. But I would also like to point out that using online compute power for this should be possible even at a student budget (assuming you're in the west).
Amazon and Google have good options for those who can't afford dishing out $650x4 for a decent ML setup.
Online computing power is not the same as your own; it's like a taxi vs. your own car: you can't do what you want when you want. Turned on your Amazon machine? No sleep till you fix the bugs and load all the GPUs :) Every "downtime" costs. Personally, I find it quite stressful, and even though my employer pays for my Amazon time, I often prefer my laptop with its GeForce 840M :)
Agreed, as well as the evolving, nontrivial hunt for cost reductions, e.g. "Just use spot instances in US west 1 on weekends", stuff like that.
A Pascal GPU plus an older CPU platform that takes at least 32 GB of RAM is a decent combination. I'm looking to upgrade to an older Xeon and a 1080 (the price should be dropping any week now...) or a 1080 Ti. Even the Sandy/Ivy Bridge i7s have not-too-bad results on most benchmarks: http://www.anandtech.com/show/9483/intel-skylake-review-6700...
Where is AMD substantially beating Nvidia on performance per price? This looks pretty well split, to me; their $299 cards are roughly equal in performance, for example.
In August I bought an RX 480 8GB for 219€. At the same time, the GTX 1060 6GB was still above 300€.
That’s where they beat NVIDIA on performance per price.
Sure, with that card I won’t get great performance for training or running my models, but compared to running on a CPU it’ll still be a noticeable improvement (and the large VRAM helps, too).
The Nvidia 960 isn't quite as good as the AMD 480, but it's cheaper. I think if you look at them across their lines, you don't see a consistent advantage in price/performance.
Though that will take another year at least before it comes to fruition. What was your GPU budget? If you were able to run them CPU-based I think a mid-range Pascal GPU would do great. You can sell it later for at least half the price as well.
My budget was basically zero, since as a student I had to finance it from money I’d usually spend on textbooks, but I managed to carve out about 200€. That was enough for an RX 480 8GB, but not for any performant NVIDIA card (at the time, the GTX 1060 6GB was still hovering above 300€, and the normal price in Germany is still over 300€).
Give Google Cloud ML a try. With Google Cloud ML, you don't need GPU instances or any other cluster: just submit your ML job from your computer, Cloud ML trains your model and bills you for the training resources used. No need to spin up / spin down anything.
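To give a rough idea of what that looks like, here's a minimal sketch of a job submission through the v1 Cloud ML Engine API from Python. The project, bucket, job, and package names are just placeholders; this assumes you've installed google-api-python-client, set up application default credentials, and already packaged and uploaded your trainer code to Cloud Storage:

```python
# Minimal sketch: submit a training job via the Cloud ML Engine v1 API.
# All resource names below (my-project, my-bucket, trainer.task) are placeholders.
from googleapiclient import discovery

ml = discovery.build('ml', 'v1')

job_spec = {
    'jobId': 'my_training_job_001',
    'trainingInput': {
        # Trainer code, packaged and uploaded to Cloud Storage beforehand.
        'packageUris': ['gs://my-bucket/packages/trainer-0.1.tar.gz'],
        'pythonModule': 'trainer.task',
        'region': 'us-central1',
        'scaleTier': 'BASIC_GPU',
    },
}

request = ml.projects().jobs().create(
    parent='projects/my-project', body=job_spec)
print(request.execute())
```

After that, the job runs (and bills) on Google's side and you can poll its state the same way, without keeping any instance of your own alive.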
Except that (a) my training data is under EU privacy protections (if I gave it to Google, I'd be in jail), and (b) uploading large datasets from a home connection would take ages as well.
If I could just run the same at home, with affordable cards, it'd be a lot nicer solution.
> The amount of time to transfer hundreds of gigabytes will be dwarfed by the time savings from using GPU compute.
For me, it wasn’t about time savings – I had a limited timeframe, and wanted to get the models as accurate as possible within the time.
Transferring the data would have taken longer than the time I had to run the training, and while the model would have been a lot more accurate, that wasn’t too important for my paper.
Anyone know what the process in TensorFlow is like for taking your trained network and bringing it into another application (or whether this is even possible)? Just wondering whether it's easy/difficult/(in)flexible or whatever.
Edit: to clarify: by 'another application' I mean something else you're making that it serves as an algorithm within (e.g. a handwriting recognizing Android app).
It varies :) More specifically, it depends on your final platform and programming language. Android is fine; pure C++ without Python is more difficult though doable (e.g. the deepdetect server has support for it), and many other wrappers exist around TF.
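To make that a bit more concrete: a common route (at least with the TF 1.x-style graph API) is to freeze the trained graph into a single .pb file from Python and then load that file from whatever app you're building; the C++ and Android loaders read the same serialized GraphDef format. Rough sketch below, with toy ops and placeholder names and paths ("input", "output", /tmp/frozen_model.pb) rather than anything from a real project:

```python
import tensorflow as tf

# Toy graph with explicitly named input/output ops, so other apps can find them.
x = tf.placeholder(tf.float32, shape=[None, 4], name="input")
w = tf.Variable(tf.random_normal([4, 2]), name="weights")
y = tf.nn.softmax(tf.matmul(x, w), name="output")

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # ... training would happen here ...

    # Bake the variable values into the graph as constants and write one .pb
    # file; the C++/Android/Python loaders can all consume this single file.
    frozen = tf.graph_util.convert_variables_to_constants(
        sess, sess.graph_def, output_node_names=["output"])
    with tf.gfile.GFile("/tmp/frozen_model.pb", "wb") as f:
        f.write(frozen.SerializeToString())

# In the other application: load the GraphDef and run the named ops.
with tf.gfile.GFile("/tmp/frozen_model.pb", "rb") as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())

with tf.Graph().as_default() as g:
    tf.import_graph_def(graph_def, name="")
    with tf.Session(graph=g) as sess:
        out = sess.run("output:0",
                       feed_dict={"input:0": [[1.0, 2.0, 3.0, 4.0]]})
        print(out)
```

The second half is what your "other application" would do; on Android or in C++ the equivalent is loading the same .pb and running the session against the named input/output tensors.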