Thanks for getting back to me. That's what I thought. The magic seems to start happening in the low billions of parameters -- and I say "seems" b/c there's no consensus as to whether it's really truly magic! In any case, it's a shame that most of the human brainpower capable of improving SotA AI doesn't have access to large-scale resources.
Yes, CPUs are still the main workhorse for many scientific workloads. Sometimes just because the code hasn’t been ported, sometimes because it’s just not something that a GPU can do well.
Seems stupid to use millions of dollars of supercomputer time just because you can't be bothered to get a few phd students to spend a few months rewriting in CUDA...
>> just because the code hasn’t been ported, sometimes because it’s just not something that a GPU can do well.
> Seems stupid to use millions of dollars of supercomputer time just because you can't be bothered to get a few phd students to spend a few months rewriting in CUDA...
Rewriting code in CUDA won’t magically make workloads well suited to GPGPU.
It's highly likely that a workload that is suitable to run on hundreds of disparate computers with thousands of CPU cores is going to be equally well suited for running on tens of thousands of GPU compute threads.
Not necessarily. GPUs simply aren't optimized around branch-heavy or pointer-chasey code. If that describes the inner loop of your workload, it just doesn't matter how well you can parallelize it at a higher level, CPU cores are going to be better than GPU cores at it.
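To make "branch-heavy or pointer-chasey" concrete, here is a purely illustrative C sketch (invented for this comment, not taken from any real HPC code) of the kind of inner loop that parallelizes fine across CPU cores, one independent chain per core, but maps poorly onto GPU warps: every step is a data-dependent load plus a branch, so lockstep threads diverge and stall on memory latency.

    /* Illustrative sketch: a pointer-chasing, branch-heavy inner loop.
       Independent chains scale across CPU cores, but each step is a
       data-dependent load plus a branch, which GPU warps handle badly. */
    #include <stdio.h>
    #include <stdlib.h>

    struct node {
        struct node *next;
        double value;
        int flag;                       /* data-dependent branch condition */
    };

    static double walk_chain(const struct node *n)
    {
        double acc = 0.0;
        while (n != NULL) {             /* loop length depends on the data */
            if (n->flag)                /* would diverge within a GPU warp */
                acc += n->value;
            else
                acc -= 0.5 * n->value;
            n = n->next;                /* serial, cache-unfriendly load */
        }
        return acc;
    }

    int main(void)
    {
        /* Build a small chain just so the example runs end to end. */
        enum { N = 1000 };
        struct node *nodes = malloc(N * sizeof *nodes);
        if (nodes == NULL)
            return 1;
        for (int i = 0; i < N; i++) {
            nodes[i].next  = (i + 1 < N) ? &nodes[i + 1] : NULL;
            nodes[i].value = (double)i;
            nodes[i].flag  = i % 3;
        }
        printf("acc = %f\n", walk_chain(&nodes[0]));
        free(nodes);
        return 0;
    }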
A supercomputer might cost $200M and use $6M of electricity per year.
Amortizing the supercomputer over 5 years, a 12 hour job on that supercomputer may cost $63k.
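Back-of-the-envelope with those assumed numbers:

    $200M / 5 yr                  = $40M/yr (amortized hardware)
    + $6M/yr (electricity)        = $46M/yr
    $46M / 365                    ≈ $126k per day of whole-machine time
    12 hours of the whole machine ≈ $63k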
If you want it cheaper, your choices are:
A) run on the supercomputer as-is, and get your answer in 12 hours (+ scheduling time based on priority)
B) run on a cheaper computer for longer-- an already-amortized supercomputer, or non-supercomputing resources (pay calendar time to save cost)
C) try to optimize the code (pay human time and calendar time to save cost) -- how much you benefit depends upon labor cost, performance uplift, and how much calendar time matters.
Not all kinds of problems get much uplift from CUDA, anyways.
> I know governments have numerous Supercomputers that blow past $200MM in build price, but what universities do?
Even when individual universities don't, governments have supercomputing centers of which universities are a primary user; the value of computing time is often charged back to the university, or it's a separate item that is competitively granted.
Here we're talking about Jupiter, which is a ~$300M supercomputer where research universities will be a primary user.
Sometimes the code is deeply complex stuff that has accumulated over more than 30 years. To _just_ rewrite it in CUDA can be a massive undertaking that could easily produce subtly incorrect results, which end up in papers and could propagate far into the future by way of citations, etc.
That's the complete opposite of what is actually the case: some of that really old code in these programs is battle-tested and verified. Any rewrite of such parts would just destroy that work for no good reason.
Why don't YOU take some old code and rewrite it? I tried it for some 30+ year old HPC code and it was a grim experience and I failed hard. So why not keep your lazy, fatuous suggestions to yourself?
Sounds like a -terrible- job for LLMs, because this is all about attention to detail. Order of operations and the specific constructs of how floating point works in the codes in question are usually critical.
A human has to have the knowledge of what the code is trying to do and what the requisites are for accuracy and numerical stability. There's no substitute for that. Having a translation aid doesn't help at all unless it's perfect: it's more work to verify the output from a flawed tool than to do it right in this case.
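As a purely illustrative C example of the kind of detail meant here (invented, not from any of the codes in question): floating-point addition is not associative, so a port that merely regroups a reduction can silently change the result.

    /* Illustrative only: regrouping a floating-point sum changes the answer.
       Compile without -ffast-math so the compiler keeps the written grouping. */
    #include <stdio.h>

    int main(void)
    {
        float a = 1.0f, b = 1.0e8f, c = -1.0e8f;

        float left_to_right = (a + b) + c;  /* a is absorbed into b: prints 0 */
        float regrouped     = a + (b + c);  /* b and c cancel first: prints 1 */

        printf("(a + b) + c = %g\n", left_to_right);
        printf("a + (b + c) = %g\n", regrouped);
        return 0;
    }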
I'm not sure how I would do it otherwise; do you have an idea? The concept is that you can swap in the port you like. If that's USB-C, then you have to choose between leaving the adapter slot empty (which you can, I guess, even though it's probably awkward to use then) or using said "identity function".
I'm the first to complain about Tesla overselling their self-driving abilities, but I think the numbers used in this article are misleading. Comparing miles per disengagement between Waymo (to pick one) and Tesla is apples to oranges. Waymo operates in a closed fashion, in a laboratory-like setting (1), and I don't believe for one second that their miles per disengagement would be significantly better than Tesla's in the real-world open settings that Tesla's FSD is being tested in. These numbers mean nothing when the environments used to compute them are so vastly different.
(1) Here's what I think is the case for Waymo's operations:
- Phoenix, AZ only (maybe one more city?)
-- Amazing sunny weather
-- Clear line markings
-- Can overfit to said city with HD maps, handcoding road quirks, etc
- Waymo employees only
-- Not to sound too tinfoil-hat, but can we really trust this data?
- Even within Phoenix, some filtering will happen as to which routes are possible
For me, coming from bash, fish made my life instantly better without any real investment. There are some incompatibilities, but I felt productive right away.
As an undergrad in the 90s, I invested thousands of dollars of my own money to work on machine learning. Worth every penny, even if I had to miss out on some cool parties.
I'm not even sure what virtue you're trying to signal here. Just because someone used the limited funds they had available to obtain the software and educational resources they were interested in, they've exercised some sort of inappropriate privilege?
Isn't that the whole idea behind college? Spend money to learn cool stuff?
Ok, I'll spell it out. Presenting the idea that MuJoCo having a license fee rather than being free is better, not worse (or even that the difference doesn't matter because "well, I paid for stuff"), is the inappropriate part of the privilege. Given a choice between "I can learn X but I need to spend significant money on it" and "I can learn X for free", with everything else equal, arguing for the former is exclusionary.
> with everything else equal arguing for the former is exclusionary.
Sure. But who argued for that? They merely pointed out that they considered this particular physics engine worth spending money on. The implication was that because money is typically limited for a college student, the software must really be worthwhile. The implication was not "LOL, you must be broke and can't afford good software."
The implication I was responding to was "if you really cared, a mere few hundred dollars wouldn't matter because you'd be able to find the money." If you don't read that into the message chain, fine, but I do.
Well, if you're working to pay your rent, bills, and food expenses, and you're only working part time (because you're a full time undergraduate student), you may not have a lot left over to pay for software licenses.
Which is precisely the reason I went with Linux: so I could maximize my hardware resources while eliminating license costs. By not buying whatever the MS compiler was in those days, I was able to max out my workstation with 32MB of RAM.
Working in undergrad and having thousands left over after paying for food and a place to sleep certainly can't be assumed. That's where my pay went.
I was very fortunate to not have to pay tuition or board, but if I had to, I wouldn't have spent my money to pay for that (I would have taken out loans) and still spent my money on RAM.
But they can accept that the world is not geared towards amateur dabblers, and that a few hundred dollars for a license (a license meaning legal production usage; you can always copy it from somewhere to "dabble") is nothing.
The CS world has certainly been geared towards the hobbyist budget for some time now. It would be good to keep it that way, even if it means sacrificing a supposed 1% extra accuracy at the latest "top conference" (sure, if the sponsors are the companies publishing huge models, it won't happen even if it should).
The price was reasonable for a good piece of software, but for the way people want to do ML research now (launching tons of parallel jobs on lots of cloud machines), the price, and more importantly the burden of managing licenses, was prohibitive.
There was a free license for individual users, with the non-free license applicable to those receiving financial support (e.g. academic / industry researchers). The individual license was very popular and heavily utilized.