
The scary part is that you can give it instructions in plain language, and it will follow those instructions, in a semi-intelligent manner, without necessarily having seen those specific instructions before.

That a language model is able to do that is surprising, and it puts it much closer to AGI - not sentient or sapient, just general - than one might think.

Sure, it's "just generating text". But if it can sensibly and correctly generate arbitrary text, it's an AGI - it can solve any problem presented to it, as long as you present it in text form and accept its output in text form.

The variety and complexity of tasks it can solve is what's surprising.

"Here is the VHDL description of a CPU. Optimize it to make it faster." is text in, text out. Almost certainly not something it can solve now, but it may already be able to produce valid output for toy-style versions of the problem.

GPT-4 apparently can take images. It probably still won't be able to usefully respond to "you are seeing an image from the camera of the robot you are mounted on, you can say 'left' to rotate 5 degrees to the left, 'right' to rotate 5 degrees to the right, or 'forward' to drive 5 cm forward. Respond with the sequence of words that will navigate the robot through the maze you see."

But it no longer seems obvious that the same approach, just with more training data and compute thrown at it, won't be able to solve this.

And from there, it's not far to add "fire machine gun" to the command list and replace "navigate through the maze you see" with "dominate the battlefield you see".
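To make the text-in/text-out framing concrete, the control loop around the model would be trivial. Roughly something like this, where all three hooks are imaginary stubs I made up (not a real robot or model API):

    # Sketch of the text-only control loop described above. All three
    # hooks are imaginary stubs, not a real robot or model API.

    VALID_COMMANDS = {"left", "right", "forward"}

    def capture_image():
        raise NotImplementedError("grab a frame from the robot's camera")

    def ask_model(image, prompt):
        raise NotImplementedError("send image + prompt to a vision-capable LLM")

    def move_robot(command):
        raise NotImplementedError("rotate 5 degrees or drive 5 cm forward")

    def navigate(max_steps=100):
        for _ in range(max_steps):
            reply = ask_model(
                image=capture_image(),
                prompt="You see the maze from the robot's camera. "
                       "Reply with exactly one word: left, right, or forward.",
            )
            command = reply.strip().lower()
            if command in VALID_COMMANDS:
                move_robot(command)  # the model's text output becomes an action

The hard part is entirely inside ask_model; everything around it is plumbing.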




Models that can do this, or that are used in robotics, have existed for YEARS in the military and presumably in industry. I'm not really buying that GPT-4 represents a revolutionary leap forward here, and it seems really backwards to use an autoregressive language model for this use case.

Does any reputable academic or expert in LLMs actually support the hype behind GPT/autoregressive models as much as the HN crowd seems to?

Percy Liang and particularly Yann LeCun are pretty meh about them, despite being thought leaders in this space and running leading NLP groups.

I'm not sure where along the way we started confusing next-token prediction, which by definition produces output that must SOUND really smart and coherent because it's optimizing for plausible-sounding text, with actual intelligence. To our knowledge (GPT-4 isn't really open) there is zero grounding happening with the outputs; it certainly wasn't part of ChatGPT.

Someone tweeted a thread about GPT-4 modifying a molecule in an anti-malarial, and it didn't even get the original base molecule or the substitution correct, something that can be trivially done without an LLM by querying open biochemical databases...
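For what it's worth, that lookup really is a couple of lines against an open database. A rough sketch using PubChem's PUG REST interface (endpoint and property name are from memory, so treat them as an assumption and double-check):

    # Rough sketch: look up the actual base molecule in an open database
    # instead of trusting the LLM's recollection. Endpoint and property
    # name are PubChem PUG REST as I remember it; verify before relying on it.
    import requests

    def lookup_smiles(compound_name):
        url = (
            "https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/name/"
            f"{compound_name}/property/CanonicalSMILES/TXT"
        )
        resp = requests.get(url, timeout=10)
        resp.raise_for_status()
        return resp.text.strip()

    print(lookup_smiles("chloroquine"))  # a common anti-malarial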

> Sure, it's "just generating text". But if it can sensibly and correctly generate arbitrary text, it's an AGI - it can solve any problem presented to it, as long as you present it in text form and accept its output in text form

No, it can't. It APPEARS to solve problems like these, where the actual task is either already solved or trivial. If you read OpenAI's disclosures, and other papers by the FAIR group, they all disclose that the answers are routinely incorrect and just SOUND right. 'Write a snake game in python' is like lecture 2 of a 'python for non-engineers' course.

> but it seems no longer obvious that the same approach, just with more training data and compute thrown at it, won't be able to solve this.

Is it? Says who? Once again there are many major experts criticizing the end-game of RLHF. The problem space is much larger than what you can correct with a reward function.

If anything, recent work by FAIR and Stanford NLP suggests that more compute is not the end game.

OpenAI themselves acknowledge that we still haven't figured out how to reliably ground a language model in truth and avoid spewing BS the moment you're not showing it some trivial thing.

At best, the current approaches seem like glorified STS/IR models with the ability to output reasonable-sounding text (again, by definition, given that they work by next-token prediction).


It is possible that it will hit a dead end. But I believe nobody expected the current architecture to be anywhere near this good.

I can literally tell it, in plain text with a single example, to become a home assistant to control the lights and output JSON, then prompt it with natural language like "make it look like a submarine on battle stations" and it will output a setting where all the lights are red. In the exact format I asked it to use.

That's a bit more than "appearing to solve a problem". That, right there, is a directly usable application. I could literally plug a speech to text -> ChatGPT -> something that filters invalid JSON -> light controls together, teach it a few more commands, and have a much better home assistant than anything I've seen commercially available, limited mainly by the speech recognizer.
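Roughly like this (send_to_llm and set_lights are stubs I made up; the JSON filtering in the middle is just the standard library):

    # Sketch of the speech -> ChatGPT -> JSON filter -> lights pipeline.
    # send_to_llm() and set_lights() are made-up stubs, not a real API;
    # only the JSON filtering step is concrete.
    import json

    SYSTEM_PROMPT = (
        "You control the lights in my home. For every request, reply ONLY "
        'with JSON like {"living_room": "#ff0000", "kitchen": "#ff0000"}.'
    )

    def send_to_llm(system_prompt, user_text):
        raise NotImplementedError("call ChatGPT or a similar model here")

    def set_lights(settings):
        raise NotImplementedError("talk to the actual light controller here")

    def handle_command(transcribed_speech):
        reply = send_to_llm(SYSTEM_PROMPT, transcribed_speech)
        try:
            settings = json.loads(reply)   # drop anything that isn't valid JSON
        except json.JSONDecodeError:
            return                         # model went off-script; ignore it
        if isinstance(settings, dict):
            set_lights(settings)           # e.g. everything red for "battle stations"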

The incredible thing is that it doesn't just do that with ONE problem, it can do that with MOST simple problems that don't require really extensive background knowledge, and it is clearly able to encode knowledge (battle stations -> red), so it seems plausible to me that more data will let it handle more knowledge.


I'm not sure that's as impressive as it seems. It's good at predicting sequences it has seen before, so STS on steroids.

It's good at producing that JSON output because it's seen how to do it many times, and it's basically acting as very good semantic text similarity with generative output.

If you ask it to act as something niche enough that it hasn't seen many examples of it, it fails horribly.

We haven't exactly figured out what it's encoding. I don't think this empirical example is proof of that; whether the model actually understands what 'red' is the way a human does is yet to be determined.


I asked ChatGPT to generate hex colours in a pleasing palette. The hex codes matched the colours, and the palette was alright, but one colour was off.

Then I told it to replace that colour but keep the others the same, and it picked new, slightly different hex colours.


And this demonstrates what exactly?



