jefft255's comments

I did that a few years back: https://arxiv.org/abs/2011.11751


With a large model? How many parameters?

See my other comment here:

https://news.ycombinator.com/item?id=38536178


A couple million IIRC. Nothing "large" compared to modern transformer models.


Thanks for getting back to me. That's what I thought. The magic seems to start happening in the low billions of parameters -- and I say "seems" b/c there's no consensus as to whether it's really truly magic! In any case, it's a shame that most of the human brainpower capable of improving SotA AI doesn't have access to large-scale resources.


Yes, CPUs are still the main workhorse for many scientific workloads. Sometimes just because the code hasn’t been ported, sometimes because it’s just not something that a GPU can do well.


> just because the code hasn’t been ported,

Seems stupid to use millions of dollars of supercomputer time just because you can't be bothered to get a few PhD students to spend a few months rewriting in CUDA...


>> just because the code hasn’t been ported, sometimes because it’s just not something that a GPU can do well.

> Seems stupid to use millions of dollars of supercomputer time just because you can't be bothered to get a few PhD students to spend a few months rewriting in CUDA...

Rewriting code in CUDA won’t magically make workloads well suited to GPGPU.


It's highly likely that a workload that is suitable to run on hundreds of disparate computers with thousands of CPU cores is going to be equally well suited for running on tens of thousands of GPU compute threads.


Not necessarily. GPUs simply aren't optimized for branch-heavy or pointer-chasey code. If that describes the inner loop of your workload, it doesn't matter how well you can parallelize it at a higher level; CPU cores are going to be better than GPU cores at it.
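
To make that concrete, here's a toy Python sketch (purely illustrative, not from any real HPC code) of a pointer-chasing inner loop: each load depends on the result of the previous one, so the loop body stays serial no matter how many GPU threads you have available. Only independent chains, the outer problem, parallelize.

    # Toy example: a pointer-chasing loop. Each iteration's load address
    # depends on the previous load, so the body is inherently serial;
    # only independent chains can run in parallel.
    import random

    def build_chain(n, seed=0):
        # A random permutation standing in for "next" pointers.
        rng = random.Random(seed)
        nxt = list(range(n))
        rng.shuffle(nxt)
        return nxt

    def chase(nxt, start, steps):
        i = start
        for _ in range(steps):
            i = nxt[i]  # data-dependent load: can't be vectorized or prefetched easily
        return i

    chain = build_chain(1_000_000)
    print(chase(chain, 0, 10_000))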


They're not that disparate; the workloads are normally very dependent on the low latency interconnect of most supercomputers.


A supercomputer might cost $200M and use $6M of electricity per year.

Amortizing the machine over 5 years and adding in the electricity, a 12-hour job on that supercomputer may cost about $63k.
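
A rough back-of-the-envelope check of those figures (the inputs are just the assumptions above, not real procurement numbers):

    capex = 200e6          # machine cost, USD
    electricity = 6e6      # USD per year
    years = 5              # straight-line amortization
    hours_per_year = 365 * 24

    cost_per_year = capex / years + electricity      # ~$46M/year
    cost_per_hour = cost_per_year / hours_per_year   # ~$5.3k/hour
    print(f"${12 * cost_per_hour:,.0f}")             # 12-hour job: ~$63,000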

If you want it cheaper, your choices are:

A) run on the supercomputer as-is, and get your answer in 12 hours (+ scheduling time based on priority)

B) run on a cheaper computer for longer-- an already-amortized supercomputer, or non-supercomputing resources (pay calendar time to save cost)

C) try to optimize the code (pay human time and calendar time to save cost) -- how much you benefit depends upon labor cost, performance uplift, and how much calendar time matters.

Not all kinds of problems get much uplift from CUDA, anyways.


>> A supercomputer might cost $200M and use $6M of electricity per year.

I'm curious, what university has a $200MM supercomputer?

I know governments have numerous Supercomputers that blow past $200MM in build price, but what universities do?


> I know governments have numerous Supercomputers that blow past $200MM in build price, but what universities do?

Even when individual universities don't, governments have supercomputing centers where universities are a primary user; the value of computing time is often charged back to the university, or it's a separate item that is competitively granted.

Here we're talking about Jupiter, which is a ~$300M supercomputer where research universities will be a primary user.


University of Illinois had Blue Waters ($200+MM, built in ~2012, decommissioned in the last couple of years).

https://www.ncsa.illinois.edu/research/project-highlights/bl...

https://en.wikipedia.org/wiki/Blue_Waters

They have always had a lot of big compute around.


CUDA is buggy proprietary shit that doesn't work half the time or segfaults with compiler errors.

Basically, unless you have a very specific workload that NVidia has specifically tested, I wouldn't bother with it.


Sometimes the code is deeply complex stuff that has accumulated for over 30 years. To _just_ rewrite it in CUDA can be a massive undertaking that could easily produce subtly incorrect results that end up in papers and propagate far into the future by way of citations, etc.


All the more reason to rewrite it... You don't want some mistake in 30-year-old COBOL code giving your 2023 experiment wrong results.


That's the complete opposite of what is actually the case: some of that really old code in these programs is battle-tested and verified. Any rewrite of such parts would just destroy that work for no good reason.


Why don't YOU take some old code and rewrite it. I tried it for some 30+ year old HPC code and it was a grim experience and I failed hard. So why not keep your lazy, fatuous suggestions to yourself.


The whole point with these older numerical codes is that they're proven and there's a long history of results to compare against.


*FORTRAN.


Sounds like a great job for LLMs. Are there any public repositories of this code? I want to try.


Sounds like a -terrible- job for LLMs, because this is all about attention to detail. Order of operations and the specific ways floating point is handled in the codes in question are usually critical.

Have fun: https://www.qsl.net/m5aiq/nec-code/nec2-1.2.1.2.f
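
To see why order of operations matters: floating-point addition isn't associative, so merely reordering a reduction (which a parallelizing rewrite will happily do) can change the result. A quick Python illustration:

    from functools import reduce
    from operator import add

    # Addition order changes the rounded result.
    a, b, c = 1e16, -1e16, 1.0
    print((a + b) + c)   # 1.0
    print(a + (b + c))   # 0.0

    # Same data, different summation order, different answers
    # (the exact sum is 2.0; neither ordering gets it).
    xs = [1e16, 1.0, -1e16, 1.0]
    print(reduce(add, xs))          # 1.0
    print(reduce(add, sorted(xs)))  # 0.0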


Attention to detail can come later when there's something that humans can get started with. I did not mean that LLM could do it all alone.


A human has to have the knowledge of what the code is trying to do and what the requisites are for accuracy and numerical stability. There's no substitute for that. Having a translation aid doesn't help at all unless it's perfect: it's more work to verify the output from a flawed tool than to do it right in this case.


The JSC employs a good number of people doing exactly this.


CUDA? I thought Rust was the future. /s


Looking forward to an electric car that's not a gadget on wheels. Oh well, that's probably never going to happen.


I'm not sure how I would do it otherwise, do you have an idea? The concept is that you can swap in the port you like; if that's USB-C, then you have to choose between leaving the adapter slot empty (which you can, I guess, even though it's probably awkward to use then) or using said "identity function".


I'm the first to complain about Tesla overselling their self-driving abilities, but I think the numbers used in this article are misleading. Miles-per-disengagement comparisons between Waymo (to pick one) and Tesla are apples to oranges. Waymo operates in a closed fashion, in a laboratory-like setting (1), and I don't believe for one second that their miles per disengagement would be significantly better than Tesla's in the real-world open settings that Tesla's FSD is being tested in. These numbers mean nothing when the environments used to compute them are so vastly different.

(1) Here's what I think is the case for Waymo's operations:

- Phoenix, AZ only (maybe one more city?)
  -- Amazing sunny weather
  -- Clear line markings
  -- Can overfit to said city with HD maps, hand-coding road quirks, etc.
- Waymo employees only
  -- Not to sound too tinfoil-hat, but can we really trust this data?
- Even within Phoenix, some filtering will happen as to which routes are possible


So... Karaoke? Why do I have to read all of this to understand that they mean Karaoke?


For me, fish made my life instantly better without any real investment coming from bash. There are some incompatibilities, but I felt productive right away.


The parameters are the number of weights in a neural network, in this case.
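
For a rough sense of scale (the layer sizes below are hypothetical, not from the paper): each fully connected layer contributes inputs * outputs weights plus one bias per output, so even a small network adds up quickly.

    # Hypothetical fully connected network, just to show how counts add up:
    # each layer has (inputs * outputs) weights plus one bias per output.
    layers = [(784, 512), (512, 512), (512, 10)]
    n_params = sum(n_in * n_out + n_out for n_in, n_out in layers)
    print(n_params)  # 669,706 -- a few more or wider layers gets you to "a couple million"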


This is really good news. It was a really contentious issue in RL and robotics research to be so reliant on proprietary (and very expensive) software.


For more context on "very expensive": a MuJoCo license was several hundred dollars.


For undergrad students trying to dabble in reinforcement learning, several hundred dollars can be quite a lot.


I got MuJoCo for free as an undergrad and I'm pretty sure all I needed was a .edu email address.


As an undergrad in the 90s, I invested thousands of dollars of my own money to work on machine learning. Worth every penny, even if I had to miss out on some cool parties.


Wonderful that you had thousands of dollars as an undergrad to be able to do so.


I'm not even sure what virtue you're trying to signal here. Just because someone used the limited funds they had available to obtain the software and educational resources they were interested in, they've exercised some sort of inappropriate privilege?

Isn't that the whole idea behind college? Spend money to learn cool stuff?


Ok, I'll spell it out. Presenting the idea that it's better, not worse, for MuJoCo to have a license fee rather than be free (or that the difference doesn't matter because "well, I paid for stuff") is the inappropriate part of the privilege. Given a choice between "I can learn X but I need to spend significant money on it" and "I can learn X for free", with everything else equal, arguing for the former is exclusionary.


> with everything else equal, arguing for the former is exclusionary

Sure. But who argued for that? They merely pointed out that they considered this particular physics engine worth spending money on. The implication was that because money is typically limited for a college student, the software must really be worthwhile. The implication was not "LOL, you must be broke and can't afford good software."


The implication j was responding to was "if you really cared, a mere few hundred dollars wouldn't matter because you'd be able to find the money." If you don't read that into the message chain, fine, but I do.


Worked a job to pay for RAM. It's atypical.


Working in undergrad isn't a common thing?


Well, if you're working to pay your rent, bills, and food expenses, and you're only working part time (because you're a full time undergraduate student), you may not have a lot left over to pay for software licenses.


Which is precisely the reason I went with Linux: so I could maximize my hardware resources while eliminating license costs. By not buying whatever the MS compiler cost in those days, I was able to max out my workstation with 32MB of RAM.


Working in undergrad and having thousands left over after paying for food and a place to sleep certainly can't be assumed to be common. That's where my pay went.


I was very fortunate to not have to pay tuition or board, but if I had, I wouldn't have spent my own money on that (I would have taken out loans) and would still have spent it on RAM.


But they can accept that the world is not geared towards amateur dabblers, and that a few hundred dollars for a license (a license meaning legal production usage; you can always copy it from somewhere to "dabble") is nothing.


The CS world has certainly been geared towards the hobbyist budget for some time now. It would be good to keep it that way, even if it meant sacrificing a supposed 1% extra accuracy at the latest "top conference" (though if the sponsors are the companies publishing huge models, it won't happen even if it should).


Well now they don't have to, because it really is nothing.


The price was reasonable for a good piece of software, but for the way people want to do ML research now (launching tons of parallel jobs on lots of cloud machines), the price, and more importantly the burden of managing licenses, was prohibitive.


For reference, here is the price as of November 2020 (previous Internet Archive snapshot): https://web.archive.org/web/20201111235930/https://www.robot...

TL;DR: personal non-commercial: $500/year for usage on up to 3 computers; personal commercial: $2,000/year.


There was a free license for individual users, with the non-free license applicable to those who were receiving financial support (i.e. academic / industry researchers, etc.). The individual license was very popular and heavily utilized.


+1 - plus, it's an immediate barrier for attracting new people to the field.


All (I hope) gaming mice with fancy drivers will also just work fine without them.

