So it will now be cost-effective to connect the exhaust of ChatGPT to its inlet and watch as the quality of output deteriorates over time while making money off ads. Whatever floats your boat, I guess. How long before the answer to every prompt is "baaa baaa baaa"?
You’re sadly misinformed if you think training an LLM consists of dumping unfiltered sewage straight from the web into a training run. Sure, it’s been done in early experiments, but once you see the results you learn the value of data curation.
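To make "curation" concrete, here's a toy quality filter in Python. Every threshold is made up for illustration; real pipelines add deduplication, model-based classifiers, and a lot more:

    # Toy web-text filter: every threshold here is invented for illustration.
    def keep(doc: str) -> bool:
        words = doc.split()
        if len(words) < 50:                      # drop short fragments
            return False
        if len(set(words)) / len(words) < 0.3:   # drop repetitive spam
            return False
        alpha = sum(c.isalpha() for c in doc) / max(len(doc), 1)
        return alpha >= 0.6                      # drop markup-heavy pages

    corpus = ["baaa " * 100, " ".join(f"token{i}" for i in range(80))]
    print(f"kept {sum(keep(d) for d in corpus)} of {len(corpus)} docs")  # kept 1 of 2

Note that the "baaa baaa baaa" document from upthread is exactly the kind of thing the repetition check throws away.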
That article itself might be part of the degradation. It mentions at least four times that the contract was canceled, as if it were news each time. I wonder if someone just dumped a bunch of facts into an AI and ran them through a spin cycle a few times to get a long-form article they didn't expect anyone to read.
It's clearly working: the models are only getting better. Believing that their performance will fall off at some point in the future is just delusional.
Weren't they getting better mostly because they were being scaled up? There's no way to do that once you've exhausted all of the data. Besides, progress has slowed down at this point anyway.
Not only from scaling. Look at the subject of this thread, GPT-4o mini.
I'm optimistic about synthetic data giving us another big unlock anyway. The text on the internet is not that reasoning-dense. And they have a snapshot of the pre-2023 web that is fixed and guaranteed not to decay. I don't think one extra year of good-quality internet text is what will make or break AGI efforts.
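As a toy illustration of why synthetic data can be more reasoning-dense than scraped text: you can generate examples whose answers are verifiable by construction, so no amount of recycling corrupts them. (Everything below is invented for the sketch.)

    import random

    # Toy generator: every target is computed programmatically, so the
    # answer is correct by construction, unlike scraped web text.
    def make_example(rng: random.Random) -> dict:
        a, b = rng.randint(2, 99), rng.randint(2, 99)
        return {"prompt": f"What is {a} * {b}?",
                "target": f"{a} * {b} = {a * b}"}

    rng = random.Random(0)
    dataset = [make_example(rng) for _ in range(1000)]
    print(dataset[0])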
The harder bottleneck will be energy. It's relatively doable to go from 1 GW to 10 GW, but the next jump, to 100 GW, becomes insanely difficult.
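Back-of-envelope, assuming ~700 W per accelerator (roughly an H100 SXM's TDP) and a datacenter PUE of ~1.2, both ballpark guesses:

    # Back-of-envelope: how many accelerators each power tier could feed.
    GPU_WATTS = 700   # assumption: roughly an H100 SXM's TDP
    PUE = 1.2         # assumption: datacenter power overhead

    for gw in (1, 10, 100):
        gpus = gw * 1e9 / (GPU_WATTS * PUE)
        print(f"{gw:>3} GW ~ {gpus / 1e6:.1f}M GPUs")

100 GW is on the order of a hundred million GPUs' worth of power, which is why that last jump looks so hard.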
GPT-3 was 175B parameters and it's very bad compared to the much smaller models we have nowadays; the data and the compute play a giant role. Also, I doubt you'd need to keep training a model after you've trained it on absolutely everything (but we are very far from that).
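On the data/compute point, the Chinchilla result (Hoffmann et al., 2022) is the usual reference: compute-optimal training wants roughly 20 tokens per parameter, and GPT-3's ~300B training tokens were far short of that for 175B parameters:

    # Chinchilla rule of thumb (Hoffmann et al., 2022): ~20 tokens per
    # parameter for compute-optimal training.
    def optimal_tokens(params: float) -> float:
        return 20 * params

    for name, params in [("GPT-3 (175B)", 175e9), ("7B model", 7e9)]:
        print(f"{name}: ~{optimal_tokens(params) / 1e9:.0f}B tokens")
    # GPT-3 was reportedly trained on ~300B tokens, ~12x below the heuristic.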