
It is hard for me to square "This company is a few short years away from building world-changing AGI" and "I'm stepping away to do my own thing". Maybe I'm just bad at putting myself in someone else's shoes, but I feel like if I had spent years working towards a vision of AGI, and thought that success was finally just around the corner, it'd be very difficult to walk away.





It's easy to have missed this part of the story in all the chaos, but from the NYTimes in March:

Ms. Murati wrote a private memo to Mr. Altman raising questions about his management and also shared her concerns with the board. That move helped to propel the board’s decision to force him out.

https://www.nytimes.com/2024/03/07/technology/openai-executi...

It should be no surprise if Sam Altman wants executives who opposed his leadership, like Mira and Ilya, out of the company. When you're firing a high-level executive in a polite way, it's common to let them announce their own departure and frame it the way they want.


Greg Brockman, OpenAI President and co-founder, is also on an extended leave of absence.

And John Schulman and Peter Deng are out already. Yet the company is still shipping, like no other. Recent multimodal integrations and the benchmarks of o1 are outstanding.


> Yet the company is still shipping, like no other

If executives, high-level architects, or researchers are working on this quarter's features, something is very wrong. The higher you get, the further ahead you need to be working; C-level departures should only have an impact about a year down the line at a company of this size.


Funny, at every corporation I've worked for, every department was still working on last quarter's features. FAANG included.

That’s exactly what they were saying. The departments are operating behind the executives.

This is a good point. I had not thought of it this way before.

C-level employees are about setting the company's culture. Clearing out and replacing the C-level employees ultimately results in a shift in company culture, a year or two down the line.

You may find that this is true in many companies.

> the company is still shipping, like no other

Meta, Anthropic, Google, and others all are shipping state of the art models.

I'm not trying to be dismissive of OpenAI's work, but they are absolutely not the only company shipping very large foundation models.


Indeed, Anthropic is just as good, if not better, in my sample size of one. Which is great, because OpenAI as an org gives off shady vibes - maybe it's just Altman, but he is running the show.

Claude is pretty brilliant.

Perhaps you haven't tried o1-preview or advanced voice if you call all the rest SOTA.

If only they’d release the advanced voice thing as an API. Their TTS is already pretty good, but I wouldn’t say no to an improvement.

> Yet the company is still shipping, like no other.

I don't see it for OpenAI; I do see it for the competition. They have shipped incremental improvements; however, they are watering down their current models (my guess is they are trying to save on compute?). Copilot has turned into garbage, and for coding-related stuff, Claude is now better than GPT-4.

Honestly, their outlook is bleak.


Yeah, I have the same feeling. It seems like operating GPT-4 is too expensive, so they decided to call it "legacy" and get rid of it soon, focus instead on the cheaper/faster 4o, and chain its prompts together and call that a new model.

I understand why they are doing it, but honestly if they cancel GPT-4, many people will just cancel their subscription.


VP Research Barret Zoph and Chief Research Officer Bob McGrew also announced their departures this evening.

Greg’s wife is pretty sick. For all we know this is unrelated to the drama.

Sorry to hear that, all the best wishes to them.

Context (I think): https://x.com/gdb/status/1744446603962765669

Big fan of Greg, and I think the motivation behind AGI is sound here. Even what we have now is a fantastic tool, if people decide to use it.


Past efforts led to today's products. We need to wait to see the real impact on the ability to ship.

> like no other

Really? Anthropic seems to be popping off right now.

Kagi isn’t exactly in the AI space, but they ship features pretty frequently.

OpenAI is shipping incremental improvements to its ChatGPT product.


"popping off" means what?

A modern colloquialism generally meaning moving/advancing/growing/gaining popularity very fast.

Are they? In my recent experience, ChatGPT seems to have gotten better than Claude again. Plus their free limit is more strict, so this experience is on the free account.

It's just tribalism. People tend to find a team to root for when there is a competition. Which one is better is subjective at this point, imo.

The features shipped by Anthropic in the past month are far more practical and provide clearer value for builders than o1's chain-of-thought improvements.

- Prompt caching: 90% savings on large system prompts, cached for about 5 minutes between calls. This is amazing (rough usage sketch below)

- Contextual RAG: while not a groundbreaking idea, it is important thinking and a solid method for better vector retrieval
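
For anyone who hasn't tried the caching yet, here's a rough sketch of what using it looks like, based on Anthropic's beta docs at the time. The beta header, field names, model string, and the BIG_SYSTEM_PROMPT placeholder are from memory/illustrative and may have changed, so treat them as assumptions rather than gospel:

    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

    BIG_SYSTEM_PROMPT = "..."  # placeholder: the large instructions/corpus you want cached

    response = client.messages.create(
        model="claude-3-5-sonnet-20240620",
        max_tokens=1024,
        # opt into the prompt-caching beta (header name per the docs at the time)
        extra_headers={"anthropic-beta": "prompt-caching-2024-07-31"},
        system=[
            {
                "type": "text",
                "text": BIG_SYSTEM_PROMPT,
                # marks this prefix as cacheable for subsequent calls (~5 min window)
                "cache_control": {"type": "ephemeral"},
            }
        ],
        messages=[{"role": "user", "content": "First question about the corpus"}],
    )
    print(response.content[0].text)

Repeat calls within the cache window that share the same prefix get billed at the discounted cached-input rate, which is where the ~90% saving comes from.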


In my humble opinion you're wrong: Sora and 4o voice are months old with no sign they're not vaporware, and they still haven't shipped a text model on par with 3.5 Sonnet!

[flagged]


Is that your test suite?

Companies are held to the standard that their leadership communicates (which, by the way, is also a strong influencing factor in their valuation). People don't lob these complaints at Gemini, but the CEO of Google also isn't going on podcasts saying that he stares at an axe on the wall of his office all day musing about how the software he's building might end the world. So it's a little understandable that OpenAI would be held to a slightly higher standard; it's only commensurate with the valuation their leadership (singular, person) dictates.

To be fair, that question is one of the suggested questions that OpenAI shows themselves in the UI, for the o1-preview model.

(Together with 'Is a hot dog a sandwich?', which I confess I will have to ask it now.)


If you have a sandwich and cut it in half, do you have one or two sandwiches?

Mu.

https://en.m.wikipedia.org/wiki/Mu_(negative)

See Non-dualistic meaning section.


Assuming a normal cut, this isn't a question about how you define a sandwich, this is a question about the number of servings, and only you can answer that.

Yes, you do have one or two sandwiches.

Edit: oh dang, I wanted to make the “or” joke so badly that I missed the option to have zero sandwiches.


Depends on what kind of sandwich it was before, and along which axis you cut it, and where you fall on the sandwich alignment chart.

Quite interesting that this comment is downvoted when the content is factually correct and pertinent.

It's a very relevant fact that Greg Brockman recently left of his own volition.

Greg was aligned with Sam during the coup. So, the fact that Greg left lends more credence to the idea that Murati is leaving of her own volition.


> It's a very relevant fact that Greg Brockman recently left of his own volition.

Except that isn’t true. He has not resigned from OpenAI. He’s on extended leave until the end of the year.

That could become an official resignation later, and I agree that that seems more likely than not. But stating that he’s left for good as of right now is misleading.


> Quite interesting that this comment is downvoted when the content is factually correct and pertinent.

>> Yet the company is still shipping, like no other.

This is factually wrong. Just today Meta (which I despise) shipped more than OpenAI has in a long time.


> When you're firing a high-level executive in a polite way, it's common to let them announce their own departure and frame it the way they want.

You also give them some distance in time from the drama so the two appear unconnected under cursory inspection.


To be fair she was also one of the employees who signed the letter to the board demanding that Altman be reinstated or she would leave the company.

Does that actually mean anything? Didn't 95% of the company sign that letter, and soon afterwards many employees stated that they felt pressured by a vocal minority of peers and supervisors to sign the letter? E.g. if most executives on her level already signed the letter, it would have been political suicide not to sign it

She was second-in-command of the company. Who else is there on her level to pressure her to sign such a thing, besides Sam himself?

Isn’t that even worse? You write to the board, they take action on your complaints, and then you change your mind?

It means that when she was opting for the reinstatement of Altman, she didn't have all the information needed to make a decision.

Now that she's seen exactly what prompted the previous board to fire Altman, she fires herself because she understands their decision now.


Exactly. Sam Altman wants groupthink, no opposition, no diversity of thought. That's what petty dictators demand. This spells the end of OpenAI IMO. A huge amount of money will keep it going until it doesn't.

I think the much more likely scenario than product roadmap concerns is that Murati (and Ilya for that matter) took their shot to remove Sam, lost, and in an effort to collectively retain billion$ of enterprise value have been playing nice, but were never seriously going to work together again after the failed coup.

Why is it so hard to just accept this and be transparent about motives? It's fair to say "we were not aligned with Sam, we tried an ouster, it didn't pan out, so the best thing for us to do is to leave and let Sam pursue his path", which the entire company has vouched for.

Instead, you get to see grey area after grey area.


Because, for some weird reason, our culture has collectively decided that, even if most of us are capable of reading between the lines to understand what's really being said or is happening, it's often wrong and bad to be honest and transparent, and we should put the most positive spin possible on it. It's everywhere, especially in professional and political environments.

For a counterexample of what open and transparent communication from a C-level tech person could look like, have a read of what the spaCy founder blogged about a few months ago:

https://honnibal.dev/blog/back-to-our-roots


The stakes are orders of magnitude lower in the spaCy case compared to OpenAI (for the announcer and for the people around them). It's easier to just be yourself when you're back at square one.

This is not a culture thing, imo; being honest and transparent makes you vulnerable to exploitation, which is often a bad thing for the ones being honest and transparent in a highly competitive area.

Being dishonest and cagey only serves to build public distrust in your organization, as has happened with OpenAI over the past couple of months. Just look at all of the comments throughout this thread for proof of that.

Edit: Shoot, look at the general level of distrust that the populace puts in politicians.


Hypocrisy has to be at the core of every corporate or political environment I have observed recently. I can count the occasions or situations where telling the simple truth was helpful. Even the people who tell you to tell the truth are often the ones incapable of handling it.

From experience, unless the person mentions their next "adventure" or gig (within a couple of months), it usually means a manager or C-suite person got axed and was given the option to gracefully exit.

Judging by the barrage of exits following Mira's resignation, it does look like Sam fired her, the team got wind of this, and they are now quitting in droves. This is the thing about lying and being polite: you can't hide the truth for long.

Mira's latest one-liner tweet, "OpenAI is nothing without its people", speaks volumes.


true

It is human nature to use plausible deniability to play politics and fool one’s self or others. You will get better results in negotiations if you allow the opposing party to maintain face (i.e. ego).

See flirting as a more basic example.


not for two sigma

McKinsey MBA brain rot seeping into all levels of culture

That's giving too much credit to McKinsey. I'd argue it's systemic brainrot. Never admit mistakes, never express yourself, never be honest. Just make up as much bullshit as possible on the fly, say whatever you have to pacify people. Even just say bullshit 24/7.

Not to dunk on Mira Murati, because this note is pretty cookie cutter, but it exemplifies this perfectly. It says nothing about her motivations for resigning. It bends over backwards to kiss the asses of the people she's leaving behind. It could ultimately be condensed into two words: "I've resigned."


It's a management culture which is almost colonial in nature, and seeks to differentiate itself from a "labor class" which is already highly educated.

Never spook the horses. Never show the team, or the public, what's going on behind the curtain.. or even that there is anything going on. At all time present the appearance of a swan gliding serenely across a lake.

Because if you show humanity, those other humans might cotton on to the fact that you're not much different to them, and have done little to earn or justify your position of authority.

And that wouldn't do at all.


> Just make up as much bullshit as possible on the fly, say whatever you have to pacify people.

Probably why AI sludge is so well suited to this particular cultural moment.


“the entire company has vouched for” is inconsistent with what we see now. Low/mid ranking employees were obviously tweeting in alignment with their management and by request.

People, including East Asians, frequently claim "face" is an East Asian cultural concept despite the fact that it is omnipresent in all cultures. It doesn't matter if outsiders have figured out what's actually going on. The only thing that matters is saving face.

I'd imagine that level of honesty could still lead to billions lost in shareholder value - thus the grey area. Market obfuscation is a real thing.

It's in nobody's best interest to do this, especially when there is so much money at play.

A bit ironic for a non-profit

Everyone involved works at and has investments in a for-profit firm.

The fact that it has a structure that subordinates it to the board of a non-profit would be only tangential to the interests involved, even if that was meaningful and not just the lingering vestige of the (arguably deceptive) founding that the combined organization was working on getting rid of.


As I understand it, they are going to stop being a non-profit soonish now?

We lie about our successes; why would we not lie about our failures?

> Why is it so hard to just accept this and be transparent about motives

You are asking the question, why are politicians not honest?


Because if you are a high-level executive and you are transparent about those things, and it backfires, it will backfire hard on your future opportunities, since all the companies will view you as a potential liability. So it is always the safer and wiser option to not say anything if there is any risk of it backfiring. So you do the polite PR messaging every single time. There's nothing to be gained at the individual level by being transparent, only something to be risked.

I doubt someone of Mira or Ilya’s calibre has to worry about future opportunities. They can very well craft their own opportunities.

Saying I was wrong should not be this complicated, or saying we failed.

I do however agree that there is nothing to be gained and everything to be risked. So why do it.


Their (Ilya's and Mira's) perspective on anything is so far removed from your (and my) perspective that trying to understand their personal feelings behind their resignation is an enterprise doomed to failure.

"When you strike at a king, you must kill him." — Emerson

or an alternate - "Come at the king - you best not miss" -- Omar Little.

“the King stay the King.” — D’Angelo Barksdale

“Original King Julius is on the line.” - Sacha Baron Cohen

King Julien

“How do you shoot the devil in the back? What if you miss?”

the real OG comment here

"When you play the game of thrones, you win or you die." - Cersei Lannister

"You come at the king, you best not miss." - Omar

This is the likely scenario. Every conflict at exec level comes with a "messaging" aspect, with there being a comms team, and board to manage that part.

Failed coup? Altman managed to usurp the board's power, seems pretty successful to me

I think OP means the failed coup in which they attempted to oust Altman?

Yeah the GP's point is the board was acting within its purview by dismissing the CEO. The coup was the successful counter-campaign against the board by Altman and the investors.

The successful coup was led by Satya Nadella.

Let's be honest: in large part by Microsoft.

Does it matter? The board made a decision and the CEO reversed it. There is no clearer example of a corporate coup.

[flagged]


For fun:

> In the sentence, the people responsible for the coup are implied to be Murati and Ilya. The phrase "Murati (and Ilya for that matter) took their shot to remove Sam" suggests that they were the ones who attempted to remove Sam (presumably a leader or person in power) but failed, leading to a situation where they had to cooperate temporarily despite tensions.


>but were never seriously going to work together again after the failed coup.

Just to clear one thing up, the designated function of a board of directors is to appoint or replace the executive of an organisation, and OpenAI in particular is structured such that the non-profit part of the organisation controls the LLC.

The coup was the executive, together with the investors, effectively turning that on its head by force.


Highly speculative.

Also highly cynical.

Some folks are professional and mature. In the best organisations, the management team sets the highest possible standard, in terms of tone and culture. If done well, this tends to trickle down to all areas of the organization.

Another speculation would be that she's resigning for complicated reasons which are personal. I've had to do the same in my past. The real pros give the benefit of the doubt.


What leads you to believe that OpenAI is one of the best managed organizations?

Many hours of interviews.

Organizational performance metrics.

Frequency of scientific breakthroughs.

Frequency and quality of product updates.

History of consistently setting the state of the art in artificial intelligence.

Demonstrated ability to attract world class talent.

Released the fastest growing software product in the history of humanity.


We have to see if they’ll keep executing in a year, considering the losses in staff and the non-technical CEO.

I don't get this.

I could write paragraphs...

Why the rain clouds?


This feels naive, especially given what we now know about Open AI.

If you care to detail supporting evidence, I'd be keen to see.

Please, no speculative pieces, rumors, or hearsay.


Well, why was Sam Altman fired? It was never revealed.

CEOs get fired all the time and company puts out a statement.

I've never seen "we won't tell you why we fired our CEO" anywhere.

Now he is back making totally ridiculous statements like 'AI is going to solve all of physics' or 'AI is going to clone my brain by 2027'.

This is a strange company.


> This is a strange company.

Because the old guard wanted it to remain a cliquey non-profit filled to the brim with EA, AI Alignment, and OpenPhilanthropy types, but the current OpenAI is now an enterprise company.

This is just Sam Altman cleaning house after the attempted corporate coup a year ago.


When the board fires the CEO and the CEO reverses the decision, that is the coup.

The board’s only reason to exist is effectively to fire the CEO.


I think that's some rumor they spread to make this look like a "conflict of philosophy" type of bs.

There are some juicy rumors about what actually happened too. Much more believable, lol.


Did you also try to oust the CEO of a multi-billion dollar juggernaut?

Sure didn't.

Neither did she though... To my knowledge.

Can you provide any evidence that she tried to do that? I would ask that it be non-speculative in nature please.



Below are excerpts from the article you link. I'd suggest a more careful read-through. Unless, out of hand, you give zero credibility to first-hand accounts given to the NYT by both Murati and Sutskever...

This piece is built on conjecture from a source whose identity is withheld. The source's version of events is openly refuted by the parties in question. Offering it as evidence that Murati intentionally made political moves in order to get Altman ousted is an indefensible position.

'Mr. Sutskever’s lawyer, Alex Weingarten, said claims that he had approached the board were “categorically false.”'

'Marc H. Axelbaum, a lawyer for Ms. Murati, said in a statement: “The claims that she approached the board in an effort to get Mr. Altman fired last year or supported the board’s actions are flat wrong. She was perplexed at the board’s decision then, but is not surprised that some former board members are now attempting to shift the blame to her.” In a message to OpenAI employees after publication of this article, Ms. Murati said she and Mr. Altman “have a strong and productive partnership and I have not been shy about sharing feedback with him directly.”

She added that she did not reach out to the board but “when individual board members reached out directly to me for feedback about Sam, I provided it — all feedback Sam already knew,” and that did not mean she was “responsible for or supported the old board’s actions.”'

This part of NYT piece is supported by evidence:

'Ms. Murati wrote a private memo to Mr. Altman raising questions about his management and also shared her concerns with the board. That move helped to propel the board’s decision to force him out.'

INTENT matters. Murati says the board asked for her concerns about Altman. She provided them and had already brought them to Altman's attention... in writing. Her actions demonstrate transparency and professionalism.


> It is hard for me to square "This company is a few short years away from building world-changing AGI"

Altman's quote was that "it's possible that we will have superintelligence in a few thousand days", which sounds a lot more optimistic on the surface than it actually is. A few thousand days could be interpreted as 10 years or more, and by adding the "possible" qualifier he didn't even really commit to that prediction.

It's hype with no substance, but vaguely gesturing that something earth-shattering is coming does serve to convince investors to keep dumping endless $billions into his unprofitable company, without risking the reputational damage of missing a deadline since he never actually gave one. Just keep signing those 9 digit checks and we'll totally build AGI... eventually. Honest.


Between 1 and 10 thousand days, so roughly 3 to 27 years.

A range I'd agree with; for me, "pessimism" is the shortest part of that range, but even then you have to be very confident the specific metaphorical horse you're betting on is going to be both victorious in its own right and not, because there's no suitable existing metaphor, secretly an ICBM wearing a pantomime costume.


Just in time for them to figure out fusion to power all the GPUs.

But really. o1 has been very whelming, nothing like the step up from 3.5 to 4. Still prefer Sonnet 3.5 and Opus.


For 1, you use "1".

For 2 (or even 3), you use "a couple".

"A few" is almost always > 3, and one could argue the upper limit is 15.

So, 10 years to 50 years.


Few is not > 3. Literally, it's just >= 2, though I think >= 3 is the common definition.

15 is too high to be a "few" except in contexts of a few out of tens of thousands of items.

Realistically I interpret this as 3-7 thousand days (8 to 19 years), which is largely the consensus prediction range anyway.
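
For what it's worth, the raw conversions being argued over here are easy to check; a trivial sketch:

    DAYS_PER_YEAR = 365.25

    for thousands in (1, 2, 3, 7, 10, 15):
        years = thousands * 1000 / DAYS_PER_YEAR
        print(f"{thousands},000 days ~= {years:.1f} years")
    # 1,000 ~= 2.7 | 2,000 ~= 5.5 | 3,000 ~= 8.2 | 7,000 ~= 19.2 | 10,000 ~= 27.4 | 15,000 ~= 41.1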


While it's not really _wrong_ to describe two things as 'a few', as such, it's unusual and people don't really do it in standard English.

That said, I think people are possibly overanalysing this very vague barely-even-a-claim just a little. Realistically, when a tech company makes a vague claim about what'll happen in 10 years, that should be given precisely zero weight; based on historical precedent you might as well ask a magic 8-ball.


Personally speaking, above 10 thousand I'd switch to saying "a few tens of thousands".

But the mere fact you say 15 is arguable does indeed broaden the range, just as me saying 1 broadens it in the opposite extent.


You imply that he knows exactly when, which IMO he does not; it could even be next year for all we know. Who knows every paper yet to be published?

Because as we all know: Full Self Driving is just six months away.

Thanks, now I cannot unthink this vision: developers activate the first ASI, and after 3 minutes it spits out full code and plans for a working Full Self Driving car prototype :)

I thought super-intelligence was saying self-driving would be fully operational next year, for 10 consecutive years?

My point was that only super intelligence could possibly solve a problem that we can only pretend to have solved.

>Altman's quote was that AGI "could be just a few thousand days away", which sounds a lot more optimistic on the surface than it actually is.

I think he was referring to ASI, not AGI.


Isn't ASI > AGI?

Both are poorly defined.

By all the standards I had growing up, ChatGPT is already AGI. It's almost certainly not as economically transformative as it needs to be to meet OpenAI's stated definition.

OTOH that may be due to limited availability rather than limited quality: if all the 20 USD/month for Plus gets spent on electricity to run the servers, at $0.10/kWh, that's about 274 W average consumption. Scaled up to the world population, that's approximately the entire global electricity supply. Which is kinda why there's also all the stories about AI data centres getting dedicated power plants.
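
The back-of-envelope arithmetic behind that, with the population and global-supply figures being rough assumptions of mine:

    plus_usd_per_month = 20.0
    usd_per_kwh = 0.10                  # assumed electricity price
    hours_per_month = 730.5             # average month

    kwh_per_month = plus_usd_per_month / usd_per_kwh     # 200 kWh
    avg_watts = kwh_per_month * 1000 / hours_per_month   # ~274 W

    world_population = 8.1e9                              # rough 2024 estimate
    scaled_tw = avg_watts * world_population / 1e12       # ~2.2 TW

    # Global electricity generation averages roughly 3 TW (~30,000 TWh/year),
    # so ~2.2 TW is indeed the same order of magnitude as the whole supply.
    print(f"{avg_watts:.0f} W per subscriber, {scaled_tw:.1f} TW for everyone")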


Don't know why you're being downvoted; these models meet the definition of AGI. It just looks different than perhaps we expected.

We made a thing that exhibits the emergent property of intelligence. A level of intelligence that trades blows with humans. The fact that our brains do lots of other things to make us into self-contained autonomous beings is cool and maybe answers some questions about what being sentient means but memory and self-learning aren't the same thing as intelligence.

I think it's cool that we got there before simulating an already existing brain and that intelligence can exist separate from consciousness.


Is the S here referring to Sentient or Specialised?

Super(human).

Old-school AI was already specialised. Nobody can agree what "sentient" is, and if sentience includes a capacity to feel emotions/qualia etc. then we'd only willingly choose that over non-sentient for brain uploading not "mere" assistants.


Scottish.

Super, whatever that means

Actually, the S means hope.

Given that ChatGPT is already smarter and faster than humans on many different metrics, once the other metrics catch up with humans it will still be better than humans on the existing ones. Therefore there will be no AGI, only ASI.

My fridge is already smarter and faster than humans in many different metrics.

Has been this way since calculation machines were invented hundreds of years ago.


_Thousands_; an abacus can outperform any unaided human at certain tasks.

OpenAI is a Microsoft play to get into power generation business, specifically nuclear, which is a pet interest of Bill Gates for many years.

There, that's my conspiracy theory quota for 2024 in one comment.


I don't think Gates has much influence on Microsoft these days.

He controls approximately 1% of the voting shares of MSFT.

And I would argue his "soft power" is greatly diminished as well

It's kinda cool as a conspiracy theory. It's just reasonable enough if you don't know any of the specifics. And the incentives mostly make sense, if you don't look too closely.

> it's possible that we will have superintelligence in a few thousand days

Sure, a few thousand days and a few trillion $ away. We'll also have full self driving next month. This is just like the fusion is the energy of the future joke: it's 30 years away and it will always be.


Now it’s 20 years away! It took 50 years for it to go from 30 to 20 years away. So maybe, in another 50 years it will be 10 years away?

To paraphrase a notable example: We will have full self driving capability next year..

This was the company that made all sorts of noise about how they couldn't release GPT-2 to the public because it was too dangerous[1]. While there are many very useful applications being developed, OpenAI's main deliverable appears to be hype that I suspect when it's all said and done they will fail to deliver on. I think the main thing they are doing quite successfully is cashing in on the hype before people figure it out.

[1] https://slate.com/technology/2019/02/openai-gpt2-text-genera...


GPT-2 and descendants have polluted the internet with AI spam. I don't think that this is too unreasonable of a claim.

I feel like this is stating the obvious - but I guess not to many - but a probabilistic syllable generator is not intelligence; it does not understand us, it cannot reason, it can only generate the next syllable.

It makes us feel understood in the same way John Edward used to on daytime TV; it's all about how language makes us feel.

True AGI... unfortunately, we're not even close.


"Intelligence" is a poorly defined term prone to arguments about semantics and goalpost shifting.

I think it's more productive to think about AI in terms of "effectiveness" or "capability". If you ask it, "what is the capital of France?", and it replies "Paris" - it doesn't matter whether it is intelligent or not, it is effective/capable at identifying the capital of France.

Same goes for producing an image, writing SQL code that works, automating some % of intellectual labor, giving medical advice, solving an equation, piloting a drone, building and managing a profitable company. It is capable of various things to various degrees. If these capabilities are enough to make money, create risks, change the world in some significant way - that is the part that matters.

Whether we call it "intelligence" or "probabilistically generating syllables" is not important.


It can actually solve problems, though; it's not just an illusion of intelligence if it does the stuff we considered, mere years ago, sufficient to be intelligent. But you and others keep moving the goalposts as benchmarks saturate, perhaps due to a misplaced pride in the specialness of human intelligence.

I understand the fear, but the knee-jerk response “it's just predicting the next token and thus could never be intelligent” makes you look more like a stochastic parrot than these models are.


It solves problems because it was trained with the solutions to these problems that have been written down a thousand times before. A lot of people don't even consider the ability to solve problems to be a reliable indicator of human intelligence, see the constantly evolving discourse regarding standardized tests.

Attempts at autonomous AI agents are still failing spectacularly because the models don't actually have any thought or memory. Context is provided to them via prefixing the prompt with all previous prompts which obviously causes significant info loss after a few interaction loops. The level of intellectual complexity at play here is on par with nematodes in a lab (which btw still can't be digitally emulated after decades of research). This isn't a diss on all the smart people working in AI today, bc I'm not talking about the quality of any specific model available today.


You're acting like 99% of humans aren't very much dependent on that same scaffolding. Humans spend 12+ years in school, their brains being hammered with the exact rules of math, grammar, and syntax. To perform our jobs, we often consult documentation or other people performing the same task. Only after much extensive, deep thought can we extrapolate usefully beyond our training set.

LLMs do have memory and thought. I've invented a few somewhat unusual games, described them to Sonnet 3.5, and it reproduced them in code almost perfectly. Likewise, its memory has been scaling. Just a couple of years ago, context windows were 8,000 tokens maximum; now they're reaching the millions.

I feel like you're approaching all these capabilities with a myopic viewpoint, then playing semantic judo to obfuscate the nature of these increases as "not counting" since they can be vaguely mapped to something that has a negative connotation.

>A lot of people don't even consider the ability to solve problems to be a reliable indicator of intelligence

That's a very bold statement, as lots of smart people have said that the very definition of intelligence is the ability to solve problems. If fear of the effectiveness of LLM's in behaving genuinely intelligently leads you to making extreme sweeping claims on what intelligence doesn't count as, then you're forcing yourself into a smaller and smaller corner as AI SOTA capabilities predictably increase month after month.


The "goalposts" are "moving" because now (unlike "mere years ago") we have real AI systems that are at least good enough to be seriously compared with human intelligence. We aren't vaguely speculating about what such an AI system might be like^[1]; we have the real thing now, and we can test its capabilities and see what it is like, what it's good at, and what it's not so good at.

I think your use of the "goalposts" metaphor is telling. You see this as a team sport; you see yourself on the offensive, or the defensive, or whatever. Neither is conducive to a balanced, objective view of reality. Modern LLMs are shockingly "smart" in many ways, but if you think they're general intelligence in the same way humans have general intelligence (even disregarding agency, learning, etc.), that's a you problem.

^[1] I feel the implicit suggestion that there was some sort of broad consensus on this in the before-times is revisionism.


> but if you think they're general intelligence in the same way humans have general intelligence (even disregarding agency, learning, etc.), that's a you problem.

How is it a me problem? The idea of these models being intelligent is shared with a large number of researchers and engineers in the field. Such is clearly evident when you can ask o1 some random completely novel question about a hypothetical scenario and it gets the implication you're trying to make with it very well.

I feel that simultaneously praising their abilities while claiming that they still aren't intelligent "in the way humans are" is just obscure semantic judo meant to stake an unfalsifiable claim. There will always be somewhat of a difference between large neural networks and human brains, but the significance of the difference is a subjective opinion depending on what you're focusing on. I think it's much more important to focus on the realm of "useful, hard things that are unique to intelligent systems and their ability to understand the world" is more important than "Possesses the special kind of intelligence that only humans have".


This overplayed knee jerk response is so dull.

I truly think you haven't really thought this through.

There's a huge amount of circuitry between the input and the output of the model. How do you know what it does or doesn't do?

Humans brains "just" output the next couple milliseconds of muscle activation, given sensory input and internal state.

Edit: Interestingly, this is getting downvotes even though 1) my last sentence is a precise and accurate statement of the state of the art in neuroscience and 2) it is completely isomorphic to what the parent post presented as an argument against current models being AGI.

To clarify, I don't believe we're very close to AGI, but parent's argument is just confused.


Did you seriously just use the word "isomorphic"? No wonder people believe AI is the next crypto.

In what way was their usage incorrect? They simply said that the brain just predicts next-actions, in response to a statement that an LLM predicts next-tokens. You can believe or disbelieve either of those statements individually, but the claims are isomorphic in the sense that they have the same structure.

It's not that it was used incorrectly: it's that it isn't a word actual humans use, and it's one of a handful of dog whistles for "I'm a tech grifter who has at best a tenuous grasp on what I'm talking about but would love more venture capital". The last time I've personally heard it spoken was from Beff Jezos/Guillaume Verdon.

You know, you can just talk to me about my wording. Where do I meet those gullible venture investors?

I think we should delve further into that analysis.

Well, AI clearly is the next crypto, haha.

Apologies for the wording but I think you got it and the point stands.

I'm not a native speaker and mostly use English in a professional science related setting, that's why I sound like that sometimes.

isomorphic - being of identical or similar form, shape, or structure (m-w). Here metaphorically applied to the structure of an argument.


> There's a huge amount of circuitry between the input and the output of the model

Yeah - but it's just a stack of transformer layers. No looping, no memory, no self-modification (learning). Also, no magic.


No looping, but you can unroll loops to a fixed depth and apply the model iteratively. There obviously is memory and learning.

Neuroscience hasn't found the magic dust in our brains yet, either. ;)


Zero memory inside the model from one input (ie token output) to the next (only the KV cache, which is just an optimization). The only "memory" is what the model outputs and therefore gets to re-consume (and even there it's an odd sort of memory since the model itself didn't exactly choose what to output - that's a random top-N sampling).

There is no real runtime learning - certainly no weight updates. The weights are all derived from pre-training, and so the runtime model just represents a frozen chunk of learning. Maybe you are thinking of "in-context learning", which doesn't update the weights, but is rather the ability of the model to use whatever is in the context, including having that "reinforced" by repetition. This is all a poor substitute for what an animal does - continuously learning from experience and exploration.

The "magic dust" in our brains, relative to LLMs, is just a more advanced and structure architecture, and operational dynamics. e.g. We've got the thalamo-cortical loop, massive amounts of top-down feedback for incremental learning from prediction failure, working memory, innate drives such as curiosity (prediction uncertainty) and boredom to drive exploration and learning, etc, etc. No magic, just architecture.


I'm not entirely sure what you're arguing for. Current AI models can still get a lot better, sure. I'm not in the AGI in 3 years camp.

But, people in this thread are making philosophically very poor points about why that is supposedly so.

It's not "just" sequence prediction, because sequence prediction is the very essence of what the human brain does.

Your points on learning and memory are similarly weak word play. Memory means holding some quantity constant over time in the internal state of a model. Learning means being able to update those quantities. LLMs obviously do both.

You're probably going to be thinking of all sorts of obvious ways in which LLMs and humans are different.

But no one's claiming there's an artificial human. What does exist is increasingly powerful data processing software that progressively encroaches on domains previously thought to be that of humans only.

And there may be all sorts of limitations to that, but those (sequences, learning, memory) aren't them.


> It's not "just" sequence prediction, because sequence prediction is the very essence of what the human brain does.

Agree wrt the brain.

Sure, LLMs are also sequence predictors, and this is a large part of why they appear intelligent (intelligence = learning + prediction). The other part is that they are trained to mimic their training data, which came from a system of greater intelligence than their own, so by mimicking a more intelligent system they appear to be punching above their weight.

I'm not sure that "JUST sequence predictors" is so inappropriate though - sure sequence prediction is a powerful and critical capability (the core of intelligence), but that is ALL that LLMs can do, so "just" is appropriate.

Of course additionally not all sequence predictors are of equal capability, so we can't even say, "well, at least as far as being sequence predictors goes, they are equal to humans", but that's a difficult comparison to make.

> Your points on learning and memory are similarly weak word play. Memory means holding some quantity constant over time in the internal state of a model. Learning means being able to update those quantities. LLMs obviously do both.

Well, no...

1) LLMs do NOT "hold some quantity constant over time in the internal state of the model". It is a pass-thru architecture with zero internal storage. When each token is generated it is appended to the input, and the updated input sequence is fed into the model and everything is calculated from scratch (other than the KV cache optimization). The model appears to have internal memory due to the coherence of the sequence of tokens it is outputting, but in reality everything is recalculated from scratch, and the coherence is due to the fact that adding one token to the end of a sequence doesn't change the meaning of the sequence by much, and most of what is recalculated will therefore be the same as before.

2) If the model has learnt something, then it should have remembered it from one use to another, but LLMs don't do this. Once the context is gone and the user starts a new conversation/session, then all memory of the prior session is gone - the model has NOT updated itself to remember anything about what happened previously. If this was an employee (an AI coder, perhaps) then it would be perpetual groundhog day. Every day it came to work it'd be repeating the same mistakes it made the day before, and would have forgotten everything you might have taught it. This is not my definition of learning, and more to the point the lack of such incremental permanent learning is what'll make LLMs useless for very many jobs. It's not an easy fix, which is why we're stuck with massively expensive infrequent retrainings from scratch rather than incremental learning.
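
To make the statelessness concrete, here's a minimal toy sketch of the decoding loop being described; the toy model and the top-N sampling are illustrative stand-ins of mine, not any real model or API:

    import random

    def toy_model(tokens):
        # Stand-in for a frozen transformer: scores over a tiny vocabulary,
        # computed purely from the tokens passed in right now (no hidden state).
        vocab_size = 10
        return [(sum(tokens) + t) % vocab_size for t in range(vocab_size)]

    def generate(model, prompt_tokens, max_new_tokens=20, top_n=3):
        tokens = list(prompt_tokens)
        for _ in range(max_new_tokens):
            # The whole (growing) sequence is re-fed each step; a KV cache would
            # only avoid recomputing earlier positions, it adds no extra memory.
            scores = model(tokens)
            candidates = sorted(range(len(scores)), key=lambda t: scores[t])[-top_n:]
            tokens.append(random.choice(candidates))  # crude top-N sampling
        return tokens  # nothing in `model` changed: no weight updates persist

    print(generate(toy_model, [1, 2, 3]))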


>no memory, no self-modification (learning).

This is also true of those with advanced Alzheimer's disease. Are they not conscious as well? If we believe they are conscious then memory and learning must not be essential ingredients.


I'm not sure what you're trying to say.

I thought we're talking about intelligence, not consciousness, and limitations of the LLM/transformer architecture that limit their intelligence compared to humans.

In fact LLMs are not only architecturally limited, but they also give the impression of being far more intelligent than they actually are due to mimicking training sources that are more intelligent than the LLM itself is.

If you want to bring consciousness into the discussion, then that is basically just the brain modelling itself and the subjective experience that gives rise to. I expect it arose due to evolutionary adaptive benefit - part of being a better predictor (i.e. more intelligent) is being better able to model your own behavior and experiences, but that's not a must-have for intelligence.


LLMs are predictors not imitators. They don't "mimick". They predict and that's a pretty big difference.

I don't think that's a good example. People with Alzheimer's have, to put it simply, damaged memory, but not complete lack of. We're talking about a situation where a person wouldn't be even conscious of being a human/person unless they were told so as part of the current context window. Right ?

While it's true that language models are fundamentally based on statistical patterns in language, characterizing them as mere "probabilistic syllable generators" significantly understates their capabilities and functional intelligence.

These models can engage in multistep logical reasoning, solve complex problems, and generate novel ideas - going far beyond simply predicting the next syllable. They can follow intricate chains of thought and arrive at non-obvious conclusions. And OpenAI has now showed us that fine-tuning a model specifically to plan step by step dramatically improves its ability to solve problems that were previously the domain of human experts.

Although there is no definitive evidence that state-of-the-art language models have a comprehensive "world model" in the way humans do, several studies and observations suggest that large language models (LLMs) may possess some elements or precursors of a world model.

For example, Tegmark and Gurnee [1] found that LLMs learn linear representations of space and time across multiple scales. These representations appear to be robust to prompting variations and unified across different entity types. This suggests that modern LLMs may learn rich spatiotemporal representations of the real world, which could be considered basic ingredients of a world model.

And even if we look at much smaller models like Stable Diffusion XL, it's clear that they encode a rich understanding of optics [2] within just a few billion parameters (3.5 billion to be precise). Generative video models like OpenAI's Sora clearly have a world model as they are able to simulate gravity, collisions between objects, and other concepts necessary to render a coherent scene.

As for AGI, the consensus on Metaculus is that it will arrive in the early 2030s. But consider that before GPT-4 arrived, the consensus was that full AGI was not coming until 2041 [3]. The consensus for the arrival date of "weakly general" AGI is 2027 [4] (i.e. AGI that doesn't have a robotic physical-world component). The best tool for achieving AGI is the transformer and its derivatives; its scaling keeps going with no end in sight.

Citations:

[1] https://paperswithcode.com/paper/language-models-represent-s...

[2] https://www.reddit.com/r/StableDiffusion/comments/15he3f4/el...

[3] https://www.metaculus.com/questions/5121/date-of-artificial-...

[4] https://www.metaculus.com/questions/3479/date-weakly-general...


> its scaling keeps going with no end in sight.

Not only are we within eyesight of the end, we're more or less there. o1 isn't just scaling up parameter count 10x again and making GPT-5, because that's not really an effective approach at this point in the exponential curve of parameter count and model performance.

I agree with the broader point: I'm not sure it isn't consistent with current neuroscience that our brains aren't doing anything more than predicting next inputs in a broadly similar way, and any categorical distinction between AI and human intelligence seems quite challenging.

I disagree that we can draw a line from scaling current transformer models to AGI, however. A model that is great for communicating with people in natural language may not be the best for deep reasoning, abstraction, unified creative visions over long-form generations, motor control, planning, etc. The history of computer science is littered with simple extrapolations from existing technology that completely missed the need for a paradigm shift.


The fact that OpenAI created and released o1 doesn't mean they won't also scale models upwards or don't think it's their best hope. There's been plenty said implying that they are.

I definitely agree that AGI isn't just a matter of scaling transformers, and also as you say that they "may not be the best" for such tasks. (Vanilla transformers are extremely inefficient.) But the really important point is that transformers can do things such as abstract, reason, form world models and theories of minds, etc, to a significant degree (a much greater degree than virtually anyone would have predicted 5-10 years ago), all learnt automatically. It shows these problems are actually tractable for connectionist machine learning, without a paradigm shift as you and many others allege. That is the part I disagree with. But more breakthroughs needed.


To wit: OpenAI was until quite recently investigating having TSMC build a dedicated semiconductor fab to produce OpenAI chips [1]:

(Translated from Chinese) > According to industry insiders, OpenAI originally negotiated actively with TSMC to build a dedicated wafer fab. However, after evaluating the development benefits, it shelved the plan for a dedicated fab. Strategically, OpenAI instead sought cooperation with American companies such as Broadcom and Marvell to develop its own ASIC chips, and OpenAI is expected to become one of Broadcom's top four customers.

[1] https://money.udn.com/money/story/5612/8200070 (Chinese)

Even if OpenAI doesn't build its own fab -- a wise move, if you ask me -- the investment required to develop an ASIC on the very latest node is eye watering. Most people - even people in tech - just don't have a good understanding of how "out there" semiconductor manufacturing has become. It's basically a dark art at this point.

For instance, TSMC themselves [2] don't even know at this point whether the A16 node chosen by OpenAI will require using the forthcoming High NA lithography machines from ASML. The High NA machines cost nearly twice as much as the already exceptional Extreme Ultraviolet (EUV) machines do. At close to $400M each, this is simply eye watering.

I'm sure some gurus here on HN have a more up to date idea of the picture around A16, but the fundamental news is this: If OpenAI doesn't think scaling will be needed to get to AGI, then why would they be considering spending many billions on the latest semiconductor tech?

Citations: [1] https://www.phonearena.com/news/apple-paid-twice-as-much-for... [2] https://www.asiabusinessoutlook.com/news/tsmc-to-mass-produc...


> Generative video models like OpenAI's Sora clearly have a world model as they are able to simulate gravity, collisions between objects, and other concepts necessary to render a coherent scene.

I won't expand on the rest, but this is simply nonsensical.

The fact that Sora generates output that matches its training data doesn't show that it has a concept of gravity, collision between object, or anything else. It has a "world model" the same way a photocopier has a "document model".


My suspicion is that you're leaving some important parts in your logic unstated. Such as belief in a magical property within humans of "understanding", which you don't define.

The ability of video models to generate novel video consistent with physical reality shows that they have extracted important invariants - physical law - out of the data.

It's probably better not to muddle the discussion with ill defined terms such as "intelligence" or "understanding".

I have my own beef with the AGI is nigh crowd, but this criticism amounts to word play.


It feels like if these image and video generation models were really resolving some fundamental laws from the training data they should at least be able to re-create an image at a different angle.

"Allegory of the cave" comes to mind, when trying to describe the understanding that's missing from diffusion models. I think a super-model with such qualifications would require a number of ControlNets in a non-visual domains to be able to encode understanding of the underlying physics. Diffusion models can render permutations of whatever they've seen fairly well without that, though.

I'm very familiar with the allegory of the cave, but I'm not sure I understand where you're going with the analogy here.

Are you saying that it is not possible to learn about dynamics in a higher dimensional space from a lower dimensional projection? This is clearly not true in general.

E.g., video models learn that even though they're only ever seeing and outputting 2d data, objects have different sides in a fashion that is consistent with our 3d reality.

The distinctions you (and others in this thread) are making are purely ones of degree - how much generalization has been achieved, and how well - versus ones of category.


I'm not saying you're wrong but you could use this reductive rhetorical strategy to dismiss any AI algorithm. "It's just X" is frankly shallow criticism.

And you can dismiss any argument with your response.

"Your argument is just a reductive rhetorical strategy."


Sure if you ignore context.

"a probabilistic syllable generator is not intelligence, it does not understand us, it cannot reason" is a strong statement and I highly doubt it's backed by any sort of substance other than "feelz".


I didn't ignore any more context than you did, but I just want to acknowledge the irony that "context" (specifically, here, any sort of memory that isn't in the text context window) is exactly what is lacking with these models.

For example, even the dumbest dog has a memory, a strikingly advanced concept model of the world [1], a persistent state beyond the last conversation history, and an ability to reason (that doesn't require re-running the same conversation sixteen bajillion times in a row). Transformer models do not. It's really cool that they can input and barf out realistic-sounding text, but let's keep in mind the obvious truths about what they are doing.

[1] "I like food. Something that smells like food is in the square thing on the floor. Maybe if I tip it over food will come out, and I will find food. Oh no, the person looked at me strangely when I got close to the square thing! I am in trouble! I will have to do it when they're not looking."


> that doesn't require re-running the same conversation sixteen bajillion times in a row

Let's assume the dog's visual system runs at 60 frames per second. If it takes 1 second to flip a bowl of food over, then that's 60 data points of cause-effect data that the dog's brain learned from.

Assuming it's the same for humans, let's say I go on a trip to the grocery store for 1 hour. That's 216,000 data points from one trip. Not to mention auditory data, touch, smell, and even taste.

> ability to reason [...] Transformer models do not

Can you tell me what reasoning is? Why can't transformers reason? Note I said transformers, not LLMs. You could make a reasonable (hah) case that current LLMs cannot reason (or at least not very well), but why are transformers as an architecture doomed?

What about chain of thought? Some have made the claim that chain of thought adds recurrence to transformer models. That's a pretty big shift, but you've already decided transformers are a dead end so no chance of that making a difference right?


> to dismiss any AI algorithm

Or even human intelligence


And there's nothing wrong about that: the fact that _artificial intelligence_ will never lead to general intelligence isn't exactly a hot take.

It’s almost trolling at this point, though.

That's both a very general and very bold claim. I don't think it's unreasonable to say that's too strong of a claim given how we don't know what is possible yet and there's frankly no good reason to completely dismiss the idea of artificial general intelligence.

I think the existence of biological general intelligence is a proof-by-existence for artificial general intelligence. But at the same time, I don't think LLM and similar techniques are likely in the evolutionary path of artificial general intelligence, if it ever comes to exist.

That's fair. I think it could go either way. It just bugs me when people are so certain and it's always some shallow reason about "probability" and "it just generates text".

The only useful way to define an AGI is based on its capabilities, not its implementation details.

Based on capabilities alone, current LLMs demonstrate many of the capabilities practitioners ten years ago would have tossed into the AGI bucket.

What are some top capabilities (meaning inputs and outputs) you think are missing on the path between what we have now and AGI?


Regardless of where AI currently is and where it is going, you don't simply quit as CTO of the company that is leading the space by far in terms of technology, products, funding, revenue, popularity, adoption and just about everything else. She was fired, plain and simple.

You can also leave and be happy, with $30M+ in stock and good prospects of easily finding another job.

> leading the space by far in terms of technology, products, funding, revenue, popularity, adoption and just about everything else

I am not 100% sure that they are still clearly leading on the technology part, but I agree on all other counts.


Or you are disgusted and leave. Are there things more important than money? The OpenAI founders certainly sold themselves as not-in-it-for-the-money.

There is one clear answer in my opinion:

There is a secondary market for OpenAI stock.

It's not a public market so nobody knows how much you're making if you sell, but if you look at current valuations it must be a lot.

In that context, it would be quite hard not to leave and sell, or stay and sell. What if OpenAI loses the lead? What if open source wins? Keeping the stock seems like the actual hard thing to me, and I expect to see many others leave (like early Googlers or Facebook employees).

Sure it's worth more if you hang on to it, but many think "how many hundreds of M's do I actually need? Better to derisk and sell"


What would you do if

a) you had more money than you'll ever need in your lifetime

b) you think AI abundance is just around the corner, likely making everything cheaper

c) you realize you still only have a finite time left on this planet

d) you have non-AGI dreams of your own that you'd like to work on

e) you can get funding for anything you want, based on your name alone

Do you keep working at OpenAI?


Maybe she thinks the _world_ is a few short years away from building world-changing AGI, not just limited to OpenAI, and she wants to compete and do her own thing (and easily raise $1B like Ilya).

Which is arguably a good thing (having AGI spread amongst multiple entities rather than one leader).

The show Person of Interest comes to mind.

Samaritan will take us by the hand and lead us safely through this brave new world.

How is that good? An arms race increases the pressure to go fast and disregard alignment/safety; non-proliferation is essential.

Probably off-topic for this thread but my own rather fatalist view is alignment/safety is a waste of effort if AGI will happen. True AGI will be able to self-modify at a pace beyond human comprehension, and won't be obligated to comply with whatever values we've set for it. If it can be reined in with human-set rules like a magical spell, then it is not AGI. If humans have free will, then AGI will have it too. Humans frequently go rogue and reject value systems that took decades to be baked into them. There is no reason to believe AGI won't do the same.

Feels like the pope trying to ban crossbows tbh.

I think that train left some time ago.

I can't imagine investors pouring money into her. She has zero credibility, neither hardcore STEM like Ilya nor visionary like Jobs/Musk.

"Credibility" has nothing to do with how much money rich people are willing to give you.

She was the CTO, how does she not have STEM credibility?

Has she published a single AI research paper?

Sometimes with good looks and charm, you can fall up.

https://en.wikipedia.org/wiki/Mira_Murati

Point me to a single credential that would make you feel confident putting your money on her.


She studied math early on, so she's definitely technical. She is the CTO, so she kind of needs to balance the managerial side while having enough understanding of the underlying technology.

Again, it's easy to be a CTO for a startup; you just have to be there at the right time. Your role is literally to handle all the stuff researchers/engineers have to deal with. Do you really think Mira set the technical agenda and architecture for OpenAI?

It's a pity that the HN crowd doesn't go one level deeper and truly understand things from first principles.


Her rise didn't make sense to me. Product manager at Tesla to CTO at OpenAI with no technical background and a deleted profile?

This is a very strange company to say the least.


Agreed, when a company rises to prominence so fast, I feel like you can end up with inexperienced people really high up in management. High risk, high reward for them. The board was also like this - a lot of inexperienced, random people leading a super consequential company, resulting in the shenanigans we saw, and now most of them are gone. Not saying inexperienced people are inherently bad, but they either grow into the role or don't. Mira is probably very smart, but I don't think you can go build a team around her like you can around Ilya or other big-name researchers. I'm happy for her, riding one of the wildest rocket ships of at least the past 5 years, but I don't expect to hear much about her from now on.

>Product manager at Tesla to CTO at OpenAI with no technical background and a deleted profile?

Doesn't she have a dual bachelor's in Mathematics and Mechanical Engineering?


That's what's needed to get a job as a product manager these days?

Well that and years of experience leading projects. Wasn't she head of the Model X program at Tesla?

But my point is that she does have a technical background.


> Well that and years of experience leading projects. Wasn't she head of the Model X program at Tesla?

No idea, because she scrubbed her LinkedIn profile. But afaik she didn't have "years of experience leading projects" to get a job as lead PM at Tesla. That was her first job as a PM.


You have to remember that OpenAI's mission was considered absolutely batshit insane back then.

A significant portion of the old guard at OpenAI was part of the Effective Altruism, AI Alignment, and Open Philanthropy movement.

Most hiring in the foundational AI/model space is very nepotistic and biased towards people in that clique.

Also, Elon Musk used to be the primary patron for OpenAI before losing interest during the AI Winter in the late 2010s.


Which has zero explanatory power w.r.t. Murati, since she's not part of that crowd at all. But her previously working at an Elon company seems like a plausible route, if she did in fact join before he left OpenAI (since he left in Feb 2018).

Most of the people seem to be leaving due to the direction Altman is taking OpenAI. It went from a charity to him seemingly doing everything possible to monetize it for himself, both directly and indirectly, by trying to raise funds for AI-adjacent, traditionally structured companies he controls.

Probably not a coincidence that she resigned at almost the same time the rumors about OpenAI completely removing the non-profit board are getting confirmed - https://www.reuters.com/technology/artificial-intelligence/o...


Afaik, he's exceedingly driven to do that, because if they run out of money Microsoft gets to pick the carcass clean.

My take is that Altman recognizes LLM winter is coming and is trying to entrench.

I don’t think we’re gonna see a winter. LLMs are here to stay. Natural language interfaces are great. Embeddings are incredibly useful.

They just won’t be the hottest thing since smartphones.


LLMs as programs are here to stay. The issue is the expenses-to-revenue ratio all these LLM corpos have. According to a Sequoia analyst (so not some anon on a forum), there is a giant money hole in that industry, and "giant" doesn't even begin to describe it (iirc it was $600bn this summer). That whole industry will definitely see a winter soon, even if everything Altman says turns out to be true.

You just described what literally anyone who says "AI Winter" means; the technology doesn't go away, companies still deploy it and evolve it, customers still pay for it, it just stops being so attractive to massive funding and we see fewer foundational breakthroughs.

I just made an (IMHO) cool test with OpenAI/Linux/TCL-TK:

"write a TCL/tk script file that is a "frontend" to the ls command: It should provide checkboxes and dropdowns for the different options available in bash ls and a button "RUN" to run the configured ls command. The output of the ls command should be displayed in a Text box inside the interface. The script must be runnable using tclsh"

It didn't get it right the first time (for some reason it wants to insert a `mainloop` call, presumably confusing Tcl with Python's Tkinter), but after several corrections I got an ugly but pretty functional UI.

Imagine a Linux distro that uses some kind of LLM-generated interfaces to make its power more accessible. Maybe even "self-healing".

LLMs don't stop amazing me personally.


The issue (and I think what's behind the thinking of AI skeptics) is previous experience with the sharp edge of the Pareto principle.

Current LLMs being 80% of the way to 100% useful doesn't mean there's only 20% of the effort left.

It means we got the lowest-hanging 80% of utility.

Bridging that last 20% is going to take a ton of work. Indeed, maybe 4x the effort that getting this far required.

And people also overestimate the utility of a solution that's randomly wrong. It's exceedingly difficult to build reliable systems when you're stacking a 5% wrong solution on another 5% wrong solution on another 5% wrong solution...
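To make the compounding concrete, here's a quick sketch (the 5%-wrong-per-step figure is just the parent's illustrative number, and independence of errors is itself an optimistic assumption):

    # probability the whole chain is right if each step is right 95% of the time
    p_step = 0.95
    for n in (1, 2, 3, 5, 10):
        print(n, p_step ** n)
    # prints roughly 0.95, 0.90, 0.86, 0.77, 0.60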


Thank you! You have explained the exact issue I (and probably many others) am seeing trying to adopt AI for work. It is because of this that I don't worry about AI taking our jobs for now. You still need some foundational knowledge in whatever you are trying to do in order to get that remaining 20%. Sometimes this means pushing back against the AI's solution, other times it means reframing the question, and other times it's just giving up and doing the work yourself. I keep seeing all these impressive toy demos, and my experience (Angular and Flask dev) seems to indicate that it is not going to replace any subject matter expert anytime soon. (And I am referring to all three major AI players, as I regularly and religiously test all their releases.)

>And people also overestimate the utility of a solution that's randomly wrong. It's exceedingly difficult to build reliable systems when you're stacking a 5% wrong solution on another 5% wrong solution on another 5% wrong solution...

I call this the merry-go-round of hell mixed with a cruel hall of mirrors. The LLM spits out a solution with some errors; you tell it to fix the errors; it produces other errors or totally forgets important context from one prompt ago. You then fix those issues, and it introduces other issues or messes up the original fix. Rinse and repeat. God help you if you don't actually know what you are doing - you'll be trapped in that hall of mirrors for all of eternity, slowly losing your sanity.


and here we are arguing for internet points.

Much more meaningful to this existentialist.

It can work with things of very limited scope, like what you describe.

I wrote some data visualizations with Claude and aider.

For anything that someone would actually pay for (expecting the robustness of paid-for software) I don’t think we’re there.

The devil is in the details, after all. And detail is what you lose when running reality through a statistical model.


Why make a tool when you can just ask the AI to give you the file list or the files you need?

They're useful in some situations, but extremely expensive to operate. It's unclear if they'll be profitable in the near future. OpenAI seems to be claiming they need an extra $XXX billion in investment before they can...?

It’s a glorified grammar corrector?

TIL Math Olympiad problems are simple grammar exercises.

They do way more than correcting grammar, but tbf, they did make something like 10,000 submissions to the math Olympiad to get that score.

It’s not like it’ll do it consistently.

Just a marketing stunt.


If you consider responding to this:

"oi i need lik a scrip or somfing 2 take pic of me screen evry sec for min, mac"

with an actual (and usually functional) script to be the work of a "glorified grammar corrector", then sure.
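For reference, a sketch (in Python, assuming macOS, where the built-in `screencapture` utility exists) of roughly the kind of working answer that prompt produces:

    # take one screenshot per second for a minute on macOS
    import subprocess, time

    for i in range(60):
        subprocess.run(["screencapture", "-x", f"shot_{i:02d}.png"])  # -x: no capture sound
        time.sleep(1)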


Not really.

I think actually the best use case for LLMs is "explainer".

When combined with RAG, it's fantastic at taking a complex corpus of information and distilling it down into more digestible summaries.


Can you share an example of a use case you have in mind of this "explainer + RAG" combo you just described?

I think that RAG and RAG-based tooling around LLMs is gonna be the clear way forward for most companies with a properly constructed knowledge base, but I wonder what you mean by "explainer"?

Are you talking about asking an LLM something like "in which way did the teams working on project X deal with Y problem?" and then having it breaking it down for you? Or is there something more to it?


I'm not the OP, but I've got some fun ones that I think are what you are asking about? I would also love to hear others' interesting ideas/findings.

1. I've got this medical provider with a webapp that downloads GraphQL data (basically JSON) to the frontend, renders some of it in the template, and hides the rest. Furthermore, I see that they hide even more info after I pay the bill. I download all the data, combine it with other historical data I have downloaded, and dump it into the LLM. It spits out interesting insights about my health history, ways in which I have been unusually charged by my insurance, and the speed at which the company operates, based on all the historical data showing time between appointment and bill, adjusted for the time of year. It then formats everything into an open format that is easy for me to self-host (HTML + JS tables). It's a tiny way to wrestle back control from the company until they wise up.

2. Companies are increasingly allowing customers to receive a "backup" of all the data they have on them (thanks, EU and California). For example, Burger King/Wendy's allow this. What do they give you when you request your data? A zip file filled with a bunch of crud from their internal systems. No worries: dump it into the LLM and it tells you everything the company knows about you in an easy-to-understand format (bullet points in this case). You learn when the company managed to track you, how much they "remember", how much money they got out of you, your behaviors, etc.


#1 would be a good FLOSS project to release.

I don't understand enough about #2 to comment, but it's certainly interesting.


If you go to https://clinicaltrials.gov/, you can see almost every clinical trial that's registered in the US.

Some trials have their protocols published.

Here's an example trial: https://clinicaltrials.gov/study/NCT06613256

And here's the protocol: https://cdn.clinicaltrials.gov/large-docs/56/NCT06613256/Pro... It's actually relatively short at 33 pages. Some larger trials (especially oncology trials) can have protocols that are 200 pages long.

One of the big challenges with clinical trials is making this information more accessible to both patients (for informed consent) and the trial site staff (to avoid making mistakes, helping answer patient questions, even asking the right questions when negotiating the contract with a sponsor).

The gist of it here is exactly like you said: RAG to pull back the relevant chunks of a complex document like this and then LLM to explain and summarize the information in those chunks that makes it easier to digest. That response can be tuned to the level of the reader by adding simple phrases like "explain it to me at a high school level".
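A minimal sketch of that retrieve-then-explain flow; `embed` and `complete` are hypothetical stand-ins for whatever embedding model and LLM API is actually in use, and the chunking/storage is deliberately naive:

    # rank pre-embedded protocol chunks by cosine similarity to the question,
    # then ask the model to explain the top hits at the reader's level
    import numpy as np

    def explain(question, chunks, chunk_vecs, embed, complete, k=5, level="high school"):
        q = np.asarray(embed(question))
        sims = [float(np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v)))
                for v in chunk_vecs]
        top = sorted(range(len(chunks)), key=lambda i: sims[i], reverse=True)[:k]
        context = "\n\n".join(chunks[i] for i in top)
        prompt = (f"Using only the excerpts below from a clinical trial protocol, "
                  f"answer the question and explain it at a {level} level.\n\n"
                  f"{context}\n\nQuestion: {question}")
        return complete(prompt)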


What's your experience with clinical trials?

Built regulated document management systems for supporting clinical trials for 14 years of my career.

For the last system, I led one of the teams competing for the Transcelerate Shared Investigator Portal (we were one of the finalist vendors).

Little side project: https://zeeq.ai


A cash out

Looking at ChatGPT or Claude coding output, it's already here.

Bad?

I just tried Gemini and it was useless.


Starting to wonder why this is so common in LLM discussions at HN.

Someone says "X is the model that's really impressive. Y is good too."

Then someone responds "What?! I just used Z and it was terrible!"

I see this at least once in practically every AI thread.


It depends on what you're writing. GPT-4 can pump out average React all day long. It's next to useless with Laravel.

Humans understand the mean but struggle with variance.

You're the one that chose to try Gemini for some reason.

Google ought to hang its head in utter disgrace over the putrid swill they have the audacity to peddle under the Gemini label.

Their laughably overzealous nanny-state censorship, paired with a model so appallingly inept it would embarrass a chatbot from the 90s, makes it nothing short of highway robbery that this digital dumpster fire is permitted to masquerade as a product fit for public consumption.

The sheer gall of Google to foist this steaming pile of silicon refuse onto unsuspecting users borders on fraudulent.


It would definitely be a difficult thing to walk away from.

This is just one more in a series of massive red flags around this company, from the insanely convoluted governance scheme, through the board drama, to many executives and key people leaving afterwards. It feels like Sam is doing the cleanup, and anyone who opposes him has no place at OpenAI.

This, coming around the time when there are rumors of a possible change to the corporate structure to make it more friendly to investors, is interesting timing.


What if she believes AGI is imminent and is relocating to a remote location to build a Faraday-shielded survival bunker?

Then she hasn't ((read or watched) and (found plausible)) any of the speculative fiction about how that's not enough to keep you safe.

No one knows how deep the bunker goes

We can be reasonably confident of which side of the Mohorovičić discontinuity it may be, as existing tools would be necessary to create it in the first place.

This is now my head-canon.

Laputan machine!

What top executives write in these farewell letters often has little to do with their actual reasons for leaving.

Could also be that she just got tired of the day-to-day responsibilities. Maybe she realized that she hadn't been able to spend more than 5 minutes with her kids/nieces/nephews in the last week. Maybe she was going to murder someone if she had to sit through another day with 10 hours of meetings.

I don't know her personal life or her feelings, but it doesn't seem like a stretch to imagine that she was just done.


Nothing difficult about it.

1) She has a very good big picture view of the market. She has probably identified some very specific problems that need to be solved, or at least knows where the demand lies.

2) She has the senior exec OpenAI pedigree, which makes raising funds almost trivial.

3) She can probably make as much, if not more, by branching out on her own - while having more control, and working on more interesting stuff.


Another theory: it’s possibly related to a change of heart at OpenAI to become a for-profit company. It is rumoured Altman’s gunning for a 7% stake in the for-profit entity. That would be very substantial at a $150B valuation.

Squeezing out senior execs could be a way for him to maximize his claim on the stake. Notwithstanding, the execs may have disagreed with the shift in culture.


A couple of the original inventors of the transformer left Google to start crypto companies.

I think they have an innovation problem. There are a few signals around the o1 release that indicate this. It's not really a new model, but an old model with CoT. And the missing system prompt - because they're using it internally now. I'm also intermittently seeing 500 errors from their REST endpoints.

It's likely hard for them to look at what their life's work is being used for. Customer-hostile chatbots, an excuse for executives to lay off massive amounts of middle class workers, propaganda and disinformation, regurgitated SEO blogspam that makes Google unusable. The "good" use cases seem to be limited to trivial code generation and writing boilerplate marketing copy that nobody reads anyway. Maybe they realized that if AGI were to be achieved, it would be squandered on stupid garbage regardless.

Now I am become an AI language model, destroyer of the internet.


I'm sure this isn't the actual reason, but one possible interpretation is "I'm stepping away to enjoy my life+money before it's completely altered by the singularity."

Maybe she has inside info that it's not "around the corner". Making bigger and bigger models does not make AGI, not to mention the exponential increase in power requirements for these models, which would be basically unfeasible for the mass market.

Maybe, just maybe, we reached diminishing returns with AI, for now at least.


People have been saying that we've reached the limits of AI/LLMs since GPT-4. Using o1-preview (which is barely a few weeks old) for coding, which is definitely an improvement, suggests there are still solid improvements going on, don't you think?

Continued improvement still counts as returns, which makes it entirely compatible with a diminishing-returns scenario. Which I also suspect we're in now: there's no comparing the jump between GPT-3.5 and GPT-4 with the one between GPT-4 and any of the subsequent releases.

Whether or not we're leveling out, only time will tell. That's definitely what it looks like, but it might just be a plateau.


+ there are many untapped sources of data that contain information about our physical world, such as video

the curse of dimensionality though...


Among other perfectly reasonable theories mentioned here, people burn out.

Yeah, if she wasn't deniably fired, then burnout is what Ockham's Razor leaves.

This isn't a delivery app we're talking about.

"Burn out" doesn't apply when the issue at hand is AGI (and, possibly, superintelligence).


Burnout, which doesn't need scare quotes, very much still applies for the humans involved in building AGI -- in fact, the burnout potential in this case is probably an order of magnitude higher than the already elevated chances when working through the exponential growth phase of a startup at such scale ("delivery apps" etc) since you'd have an additional scientific or societal motivation to ignore bodily limits.

That said, I don't doubt that this particular departure was more the result of company politics, whether a product of the earlier board upheaval, performance related or simply the decision to bring in a new CTO with a different skill set.


That isn't fair. People need a break. "AGI"/"superintelligence" is not a cause with so much potential that we should just damage a bunch of people on the route to it.

Software is developed by humans, who can burn out for any reason.

Why would you think burnout doesn't apply? It should be a possibility in pretty much any pursuit, since it's primarily about investing too much energy into a direction that you can't psychologically bring yourself to invest any more into it.

Hint: success is not just around the corner.

But also, most likely she is already fully vested. Why stay and work 60 hours a week in that case?

unless you didn't see it as a success, and want to abandon the ship before it gets torpedoed

People still believe that a company that has only delivered GenAI models is anywhere close to AGI?

Success is not around any corner. It's pure insanity to even believe that AGI is possible, let alone close.


What can you confidently say AI will not be able to do in 2029? What task can you declare, without hesitation, will not be possible for automatic hardware to accomplish?

Discover new physics.

Easy: doing something that humans don't already do and haven't programmed it to do.

AI is incapable of any innovation. It accelerates human innovation, just like any other piece of software, but that's it. AI makes protein folding more efficient, but it can't ever come up with the concept of protein folding on its own. It's just software.

You simply cannot have general intelligence without self-driven innovation. Not improvement, innovation.

But if we look at much simpler capabilities, 2029 is only 5 years (not even) away, so I'm pretty confident that anything it cannot do right now it won't be able to do in 2029 either.


Maybe it is, but it's not the only company that is.

Easy for me to relate to that; my time is more interesting than that.

Being in San Francisco for 6 years, and "success" means getting hauled in front of Congress and the European Parliament.

Can't think of a worse occupational nightmare after already having an 8-figure nest egg.


It's corporate bullcrap; you're not supposed to believe it. What really matters in these statements is what is not said.

Maybe it has to do with Sam getting rid of the nonprofit control and having equity?

https://news.ycombinator.com/item?id=41651548


I doubt she's leaving to do her own thing; I don't think she could. She probably got pushed out.

I could see it being close, but also feeling an urgency to get there first / believing you could do it better.

A few short years is a prediction with lots of ifs and unknowns.


