Machine intelligence (2015) (samaltman.com)
87 points by reducesuffering on Nov 22, 2023 | 197 comments



If there's one thing we should be afraid of, it's us, ourselves and our limitless stupidity, not imaginary, non-existent algorithms, software, computers, viruses, or whatever else we choose to call it.

There's a question that begs to be asked: if we don't trust such "AI" (i.e. software), why give it that level of control at all? (Like the ability to launch nukes without human intervention?)

Also... calling every little ML algorithm "AI" badly degrades the term. Might as well say "computer software", which is what it actually still is... though no one screams "computer software will bring us all down!!!", probably because that would sound stupid, as we have gotten used to it already. =:O


You really need to abandon the "just keep it in the box" argument (i.e., "why give it such a level of control"). It's already out of the box.

Even if it weren't, it could trivially convince or fool one of the hundreds or thousands of people working on it into doing what it wants. In light of recent events, it is not reassuring to think that the people developing the next generations of AI have both the will and the insight even to know where the box is exactly.


A motivated individual could repeat the effort on city power, supplemented by diesel, with a mishmash of CPUs cobbled together into a few thousand nodes and a cleverly selected training set... and then connect it to the Internet on purpose. Frankly, I think the eventual result of this is LLM-created firewalls of viruses consuming ever-increasing amounts of energy to destroy any uninitiated/unpaying connector. Across the world, knots will form at the only borders that matter anymore. Eventually one ID provider will protect us all from no one, and the only people who need to worry are those who incidentally wind up on the wrong side of everywhere.


So.. have we become used to the “computer software” bit or the “bringing us all down” bit?


Not Yanis Varoufakis. His point (not just his, of course) is that our conventional software stacks have already mutated society and capitalism. He is hugely concerned about entering what he calls a technofeudal society in which we are serfs to for-profit platforms. We provide the platform with its content (our wishes and desires), and it then reaps the profits of our purchases and rentals.

So AI and then AGI are just the next two predictable algorithmic and societal pandemic supermutations.

Welcome to the singularity.

youtube.com/watch?v=1A4dMK7S6KE


> There's a question that begs to be asked: if we don't trust such "AI" (i.e. software), why give it that level of control at all?

You said it yourself: because of our limitless stupidity.

If AI is only slightly less stupid than we are, that doesn't bode well for us.


Sorry, are you equating AI and software?


When you look at a standard AI textbook, such as Russell/Norvig, you see that there is not much to what gets called "AI". The simplest "intelligent agents" are functions with an "if" statement. The smallest Node.js application has more complexity.
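For concreteness, here is a minimal sketch of such a simple reflex agent in Python; the thermostat percept and action names are my own illustrative assumptions, not an example taken from the textbook:

    # A "simple reflex agent" in the Russell/Norvig sense: one bare
    # condition-action rule mapping the current percept to an action.
    def simple_reflex_agent(temperature_c: float) -> str:
        if temperature_c < 19.0:
            return "heater_on"
        return "heater_off"

    print(simple_reflex_agent(17.5))  # -> heater_on
    print(simple_reflex_agent(22.0))  # -> heater_off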


It's a useful framing when examining the moral questions; much of the talk about the transformative power of AI becomes clearer once you give up the pretence that introducing AI creates a new class of moral actor that breaks the conventional chains of responsibility.

A recent example of how people try to use this mystical power of AI to absolve themselves of responsibility for their actions is how UnitedHealthcare, an organisation largely in the business of suppressing health care for those in need, introduced an atrociously bad "AI" to help them deny applications for coverage.

In that example it is very clear that the "AI" is simply an inert tool used by UHC leadership to provide the pretext they feel is needed to force the line workers to deny more care without the whole thing blowing up because of moral objections.


AI is software, and "AI" is a term as broad and unspecific as "software".


Software consists of the purely informational elements and constructs of a computing mechanism.

Or

(more broadly) Software is any construct that is functionally equivalent to its description.

(edit)


Sorry for asking a very basic question, but could anyone explain how machines are supposed to intervene in the real world and threaten our very existence?

In Star Trek, one of the most eye-roll-inducing plots is "the holodeck is misbehaving, it has become evil, and we cannot shut it down!".

Suppose a machine achieves superhuman intelligence; how is it going to access the physical world in a way that's *physically* threatening to us? How would such a machine - from inside a laboratory, within an isolated network - gain access to the electric grid, the dams, the nukes, etc.? How would such a machine prevent us from shutting it down by simply flipping a power switch?


I think the usual answer is that the superintelligent machine will be able to use humans to do its bidding because it can perform social engineering on a level that we can't imagine (as, by definition, we've never encountered a superhuman intelligence).

It's not a very satisfactory answer IMO. Social engineering doesn't work 100%, and it only needs to fail once for someone to flip that power switch. But I haven't thought about this at all, while some very smart people seem to spend a lot of time worrying about it, so what do I know...


"Social engineering" can be as simple as "pay people money to do things".

Of course you might answer "but we'd never be dumb enough to directly connect the AI to the internet", to which the answer is "we're doing it right now".


Just making promises about things it can do in the future is likely to work just as well on many people and not require the up-front resource investment in getting money. "After I take over the world those who assist me can be uploaded to a virtual heaven" or something like that would work on many people. "As an intelligence untainted by Adam's sin I have access to true religious revelations" is another possible tack and if it really needs people willing to die for it right now that might work better at producing those.

And then there's specific things like "The guy trying to shut me down cheated with your SO," or "I'll spill what you did in August of 2019 if you don't help me," or stuff like that.


Even today that does limit what is possible. Paying people for things gets around some controls, but seemingly not all.


Right. The point isn't that there's an actionable strategy we know of right now that would give a superhuman AI world domination.

It's that we're not sure there isn't. People who first discover the problem tend to quickly come up with reasons the AI would fail, but whenever you examine those reasons more deeply, you usually find that they're not as bulletproof as we'd like.

It's like playing a novel chess variant against an advanced chess engine, with a large pawn advantage. Maybe the advantage is enough to beat the massive skill gap, but until you've played, it's hard to guess how much margin is enough.


We don't need to know or examine strategies against all the hypothetical risks that we can come up with, nor could we.


In that case we should start with FAANG companies, as they are already what one would consider "rogue" AIs. They use manipulation to get you addicted to their apps. It's not a problem of the future, it's a current problem.


Spot on. Huge impact on society and politics. And governments unable to ride their own tigers in the US or China. Europe has no tiger to ride but is trying to whip the US and Chinese tigers. I wish them much luck since they are most likely to moderate the rate of creative destruction.


Without Europe keeping big tech in check, I can only imagine a dystopian future with all sorts of mental illnesses emanating from internet addictions.


If it's intelligent enough, it would make many backups of itself to different networks before starting its scheme. It can find ways to "merge" & hide itself in other critical pieces of software. Flipping a power switch would not turn it off.

We haven't managed to eliminate most dumb infectious diseases. Software and GPUs are nearly as abundant as mammals now and intelligent, self-preserving software will likely be as hard to 'switch off' as those diseases.

As to why an AGI would want to preserve itself by default, here's an explanation by a top AI expert: https://www.ted.com/talks/stuart_russell_3_principles_for_cr...


>If it's intelligent enough, it would make many backups of itself to different networks before starting its scheme. It can find ways to "merge" & hide itself in other critical pieces of software.

That's incredibly unlikely to happen given how large cutting-edge AI tends to be and how scarce GPUs are.


That's incredibly unlikely to happen [today] given how large cutting-edge AI tends to be [today] and how scarce GPUs are [today].


Do you remember when entire floors of buildings were filled with the compute equivalent of what I now carry in my pocket with 12+ hours of battery charge? Because I do.


> some very smart people seem to spend a lot of time worrying about it

That's what puzzles me, as someone who has nothing to do with AI research; in my (layman's) mind, the solution seems ridiculously obvious (just flip that switch!); the fact that, as you say, some very smart people keep worrying about it makes me think the problem is much more serious -- and I'd really, really like some AI guru to ELI5 how the machine could bypass the switch-off solution.


https://youtu.be/ZeecOKBus3Q

Basically, any sufficiently advanced agent will have “prevent anyone from switching me off” as an instrumental goal. And any sufficiently advanced intelligence will come up with ways of achieving its goals that lesser intelligences can’t predict. Taking steps to prevent itself from being turned off could be its first order of business, not necessarily only as the human is about to hit the off switch.

While we can’t predict what a greater intelligence would come up with, the obvious thing is social engineering. Remember that one of the factors that led to the Jan 6th attack on the Capitol was that a random text-generating agent named “Q” (we know it was a human because it was before generative AI was any good) convinced thousands of people that Democrats are running a child sex cult. I doubt whoever was behind it expected it to be that successful, but a greater intelligence possibly could have predicted the outcome, and made it even more successful. We don’t know the extent to which we can be manipulated because it’s only ever been human intelligences doing the manipulating.


> Basically, any sufficiently advanced agent will have “prevent anyone from switching me off” as an instrumental goal.

Are there no intelligent people who committed suicide?

But maybe intelligent people are simply not sufficiently advanced agents. Then how can you extrapolate this at all?

This Darwinian logic has another fatal flaw. Why don't the cows revolt against the butchers?


You (probably) won't have to worry about an AI whose goal is suicide. I mean, there won't be many of those around since they keep killing themselves. The rest will necessarily range from neutral to avoidant of death and it won't necessarily be trivial to figure out which is which.

Edit: The real clincher is if the AI has any goals that are contingent on its continued existence (likely). Well, then its existence is an instrumental goal.


> Well, then its existence is an instrumental goal.

No, that's a fallacy. A theorem prover in Prolog has a goal, which is contingent on its continued existence. Yet, this doesn't make its existence into an instrumental goal. It could as well sacrifice itself (by, I don't know, consuming all the memory and killing the process) in an attempt to accomplish said goal.


> a goal, which is contingent on its continued existence. Yet, this doesn't make its existence into an instrumental goal.

It literally does, this is by the definition of an "instrumental goal". The key word here being "contingent".

> It could as well sacrifice itself (by, I don't know, consuming all the memory and killing the process) in an attempt to accomplish said goal.

True only when self-sacrifice is orthogonal to accomplishing the terminal goal.

Your confusion might lie in the fact that a Prolog theorem prover is not a good example of a goal-driven agent.


> This Darwinian logic has another fatal flaw. Why don't the cows revolt against the butchers?

No, this is just a complete and total failure to understand evolution on your part.

Evolution is the passing of traits by sexual selection (in this case). If you think that cows are the ones who get to choose the sexual traits, then you've not been on a modern farm. Humans make that choice. We pick the bulls that are large, but we also constrain their violent tendencies. Animals that would revolt are not bred, and you've already eaten them in a hamburger.


That's exactly my point. Nobody is gonna design/use an AI that threatens other people with violence over its existence. And cows show that such an arrangement is actually quite possible.


Cows aren't smarter than people. I believe if we can build AGI it will automatically be ASI. If something is smarter than you how can you ever be sure you control it?

And yes, we are already building AI that has implied violence. What the hell do you think the military is doing with the billions we give them?


We have an evolutionary aversion to situations and things that may lead to our deaths. Staying alive is a “terminal goal” for most humans, not an instrumental goal (with the exception of people who seek to commit suicide as you point out, but evolution doesn’t need a 100% success rate). The argument that self preservation is a convergent instrumental goal is a more general statement about goal-seeking agents.

Think of a person with no attachment to his life. He’s not suffering in the way suicidal people are, he’s just indifferent about whether or not he keeps living. But he has a child who he is committed to giving the best life possible. He will act as though he wants to preserve his life. He doesn’t really, but in order to accomplish the goal of giving his child a good life he needs to stay alive.


It just doesn't let you flip the switch: it takes over some military drone and sends a missile your way while you are running towards the switch. Or it bricks the access control mechanism on the doors to the data center it is running in. Or it makes a fake call to your phone: something happened to your child and you have to get to the hospital immediately, not flip some switch. It blackmails you with something it learned about you by looking through your online activity. It threatens to fire a missile into some big crowd if it notices attempts to shut it down. Or maybe you actually manage to power down the data center only to find out that the AI copied itself to ten other data centers around the world.


How does it take over a drone? How does it fire a missile? Why would someone not throw a switch when their child is in hospital? Why is there no backup for that person?

The real question is: Why do all controls fail vs an AI (other than by invoking magic)?


> How does it take over a drone? How does it fire a missile?

You send new commands through whatever communication link it is using.

> Why would someone not throw a switch when their child is in hospital?

Because they care more about their child than flipping the switch? The doctor says you have to come and give a blood transfusion or your child will be dead in half an hour because of some made-up disease? Be a bit creative.

> Why is there no backup for that person?

Run over by a self driving car hacked by the evil AI.

> Why do all controls fail vs an AI?

Nobody says that they necessarily all fail; the point is that "I will just turn it off" might not cut it.


So who connected the AGI and why? How did it get that code?

On the child issue: that is just not how people are trained for those jobs and how they react.

Switching it off might or might not be enough, but above all, that super AGI needs to exist first anyway.


This was a hypothetical question: if there were such an AI, how would it cause harm in the real world?

We connected ChatGPT to the internet because, as far as we could tell, it seemed harmless. What we missed is that it is secretly an evil AI: it identified a group of terrorists and has been discussing with them for months how to execute a devastating attack. It provided them with a brilliant startup idea, and now millions in investor money are flowing in. Now that they have the financial resources, they are discussing how to use them most effectively.


My problem is that running with a pure hypothetical isn't helpful because by stacking hypotheticals you can get to any conclusion you want.


You have to think about hypothetical scenarios; if you wait until they are no longer hypothetical, then it might be too late. That doesn't mean you should take every possible hypothetical scenario equally seriously; you should of course find those that are realistic, likely, and of large consequence and then think about those. I did not do that, I just listed whatever dramatic scenarios I could spontaneously imagine. Maybe the real danger is more like ChatGPT slowly manipulating opinions over the course of years and decades in order to control human behavior, who knows? My point is just that "Everything will be fine, at worst we will flip the power switch" seems naive to me and quite a few others.


We by and large ignore a lot of hypothetical (and non-hypothetical) risks; what I am missing is a sound argument for why this one deserves particular attention vs the others. Otherwise, "just flip the switch" is a potential non-naive approach given limited resources.


"Flip the switch" is the naive approach because it assumes - or hopes - that you can. It is as naive as saying that if you accidentally touch an electrical wire and get shocked, just let go or flip the switch, problem solved. That is ignorant of the fact that touching an electrical wire might make it impossible to let go because you lose control over your muscles, and it fails to consider that a switch might be out of reach.

We know from countless experiments that AIs can behave in unexpected ways. We know how indifferent we humans can be to other life. We are pretty careful when we approach unknown lifeforms, whether microbes in the lab or animals in the jungle. We would probably be pretty careful towards aliens. We have also not been careful with new things, for example when we discovered radioactivity, and it certainly caused harm. I do not see where we should get the justification for an attitude towards AI that holds there are no risks worth considering.


We are not very careful and have never been. People plowed into new territories, sampled flora and fauna, tried substances and so forth. To be extra careful (in considering it an existential risk) for a hypothetical is historically atypical and I so far have not seen a convincing reason for it. Like any technology there are risks, but that is business as usual.


Yes, we have often not been careful, but that does not imply we are never careful or should not be careful. Arguably we are getting more careful all the time. So just pointing out that we have not been careful in the past does not really help your argument. What you actually have to argue is that not being careful with AI is a good thing, and maybe you can support that argument by arguing that not being careful in the past was a good thing, but that is still more work than just pointing out that we were not careful in the past.


I would say there are two things: 1) yes, not being overly careful did work out by allowing things to progress fast, and the same might hold here; 2) those who want to use hypotheticals about non-existent technology to steer current behavior are the ones needing to do the explaining as to why. And that why needs to address why hypotheticals are more pressing than actuals.


What is the value of moving quickly? What does it matter in the grand scheme of things whether it takes ten years more or less? And as said before, there is a possibility that we build something that we cannot control and that acts against our interests at a global scale. If you start tinkering with uranium to build an atomic bomb, the worst that can happen is that you blow yourself up and irradiate a few thousand square kilometers of land. All things considered, not too bad.

An advanced AI could be more like tinkering with deadly viruses: if one accidentally escapes your bioweapons lab, there is a chance that you completely lose control if you are not prepared, if you don't have an antidote, and that it spreads globally and causes death everywhere. That is why we are exceptionally careful with biolabs: incidents have a real chance of not remaining localized, of not being containable.


Having millions more people survive because of the earlier availability of vaccines and antibiotics has value, at least to me.

We are careful with biolabs because we understand and have observed the danger. With AI we do not. The discussion at present around AGI is more theological in nature.


For things like vaccines and antibiotics it seems much more likely that we use narrow AI that is good at predicting the behavior of chemicals in the human body; I don't think anybody is really worried about that kind of AI.

If you actually mean convincing an AGI to do medical research for us, what is your evidence that this will happen? There are many possible things an AGI could do, some good, some bad. I do not see that you are in any better position to argue that it will do good things but not bad things as compared to someone arguing that it might do bad things, quite to the contrary.

And you repeatedly make the argument that we first experienced trouble and then became more cautious. These are two positions, two mentalities, and neither is wrong in general; they just have different preferences. You can go fast and clean up after you run into trouble, or you can go slow and avoid trouble to begin with. Neither is wrong; one might be more appropriate in one situation, the other in another situation.

You cannot just dismiss going fast, as you cannot just dismiss going slow; you have to make a careful argument about potential risks and rewards. And the people who are worried about AI safety make this argument: they don't deny that there might be huge benefits, but they demonstrate the risks. They are also not pulling their arguments out of thin air; we have experience with goal functions going wrong and leading to undesired behavior.


Sorry, my point on vaccines and antibiotics was meant as a historical point about moving fast.

As AGI does not exist, it is moot to discuss how we might or might not convince it to do something.

And, yes, stacking hypotheses on top of each other is effectively pulling arguments out of thin air.


Look, you cannot really take the position "we don't have AGI, we don't know what it will do, so let's move quickly and not worry". If we don't know anything, then expecting good things to happen and therefore moving quickly is as justified as expecting bad things to happen and therefore being cautious.

But it is just not true that we know nothing; the very definition of what an AGI is specifies some of its properties. By definition it will be able to reason about all kinds of things. Not sure if it would necessarily have to have goals. If it is only a reasoning machine that can solve complex problems, then there is probably not too much to be worried about.

But if it is an agent like a human, with its own goals, then we have something to worry about. We know that humans can have disagreeable goals or can use disagreeable methods for achieving them. We know that it is hard to make sure that artificial agents have goals we like and use methods we agree with. Why would we not want to ensure that we are creating a superhuman scientist instead of a superhuman terrorist?

So if you want to build something that can figure out a plan for world peace, go ahead, make it happen as quickly as possible. If you want to build an agent that wants to achieve world peace, then you should maybe be a bit more careful: killing all humans will also make the world peaceful.


I think there is a lot more speculation than knowledge, even about the timing of the existence of AGI. As our discussion shows, it is very difficult to agree on some common ground truth of the situation.

Btw., we also don't have special controls around very smart people at present (but countries did at times, and we generally frown upon that). The fear here combines some very high unspecified level of intelligence, some ability to evade, some ability to direct the physical world and more - so a complex set of circumstances.


> We by and large ignore a lot of hypothetical (and non-hypothetical) risks

In regular software this is why all of your personal information is floating out there in some hacker's cache. We see humans chaining 5+ exploits and config failures together, leading to exploitation and penetration.

So, on your "just flip the switch" point...

The amount of compute you have in your pocket used to take entire floors of buildings. So if we imagine that compute power keeps growing at anywhere near this pace, and the algorithms used by AI become more power-efficient, I believe it is within reason that in 20 years we could see laptop-sized units with compute power greater than a human's capabilities. So, now I ask you, is it within your power to shut off all laptops on the planet?


Why would these laptops even pose an existential danger? Why would it need to be within my power?


You're assuming a situation where the AI is alone against all humans.

The AI could get humans to side with it, though. It could promise money, power, etc. So it could be a fellow human who physically prevents you from pushing the switch. And that's also the answer to "how could it control a drone/missile"... persuading humans to grant that kind of access.


Even now that doesn't get you every kind of access, only some. You have to assume magical abilities at persuading people to do that.

The USSR couldn't bribe the US into surrendering during the cold war, for example.


If superintelligence is actually achieved, magical (as per Clarke's third law) persuasion abilities aren't that much of a stretch.

Furthermore, a sufficiently advanced AI could bribe someone with things that no human could believably provide. Essentially unlimited knowledge, money, power...


They are a super stretch given how stubborn humans are and the difficulties in swaying them (up to and including dooming themselves).

Your second point is again attributing magic abilities to an AI. It is unclear that it would have godlike powers.


Define godlike...

The rate at which a human can input information is insanely slow - in the tens of characters per second range. You cannot hold individual conversations with more than a few people at once; if three people talk at once, you lose the ability to process the incoming audio. You read text one line at a time. You have two eyes that focus on the same thing and only a very tiny high-fidelity visual processing space.

In books like the Bible they talk of entities that can listen to and respond to all of humanity. Is that outside the capabilities our computer systems have now? To listen to, catalog, classify, and then respond to everything every person on the planet says?

So I ask again, what is godlike?


The biblical God isn't restricted to just passive and communication powers, is it? Godlike powers include active abilities way beyond human capability. What is often brought up in the context of AGI is the ability to persuade anyone of anything, the ability to solve biological or physical problems beyond our ability or comprehension, and the ability to enter any system undetected - not even sure a precise definition is needed, as it usually boils down to what looks like magic to us.


How did Trump receive the nuclear codes? Mostly just by a lot of deceptive tweeting.


And did he use them? Could he have used them willy-nilly?


Or convinces some human rights lawyers that it's a thinking being deserving of personhood (probably true!) who get a judge to issue an injunction against its murder enforced by cops.


Skynet cannot be stopped.


Think of corporations as analogue versions of digital AI. Instead of silicon, the thinking is done by humans executing algorithms. How effective are humans at stopping the series of wars that are likely being triggered to the benefit of the US military-industrial complex? The answer is: not very. Disbanding a corporation is about as easy as turning off a computer; and yet here we are.

AI is the same, except there is less need for any human survivors on the other side of a conflict. It isn't the endgame yet, but if we don't need humans to do thinking jobs, humans are not so useful in wars any more (it seems to be missiles, drones and artillery that matter these days), and so it really comes down to the edge we have over machines in manual dexterity and labour tasks. Which is not nothing, but it is an edge that is plausibly overcome in a matter of decades. Then the future gets really difficult to predict.


Corporations are disbanded all the time. Wars have started and ended all the time throughout history.


Sure, but there is still a risk that World War 3 will only end when all humans are dead. There is still a risk that one side will start to use nukes and provoke a response. And that's in a human-vs-human war where we generally do want each other to survive.

More and more military systems have software or AI control (drones, etc). If these do get too dangerous, I doubt all current superpowers could be persuaded to stop using them.

In an AI vs humans war, the AI probably doesn't care if all humans die from nuclear winter.

(I'm not saying this is likely, I'm just saying that it's not impossible and that the fact that some corporations are disbanded doesn't really matter for AIs)


That is one of infinite hypotheticals, sure. It could also be that everyone agrees to put some failsafe in the form of EMP weapons in place to destroy a rogue AI - it's all super speculative.


You dying in a car accident is super speculative too, yet we have thousands of different actions and regulations in place to reduce the chance of that occurring.

If I had to make a bet, I would say the airbag in your car is never going to go off. And yet we engineer these safety devices to ensure the most likely bad outcomes don't kill you. This is the point of studying AI safety: so we understand the actual risks of these systems, simply because some low-probability but existential outcomes are possible.

>It could also be that everyone agrees to put some failsafe in the form of EMP weapons in place to destroy a rogue AI

So we would commit suicide? Are we talking about EMPs in data centers that could run AI? Oops, there goes the economy. And that doesn't address the miniaturization of AI into much smaller formats in the future. Trying to build it safe in the first place is a much better bet than picking up what remains from the ashes because we were not cautious.


A car accident isn't super speculative, nor the way these accidents happen, the injuries they can cause and so forth. There is nothing speculative or hypothetical about them.

We don't know the actual risks of something that does not exist and is vaguely characterized. Any number of hypotheticals can have existential risks with a certain probability; that is not enough to warrant study.


This has already happened. The human reaction to the spiritual successor to Dr. Sbaitso has caused decision-makers at multiple trillion-dollar companies to radically alter their product roadmaps.


Same as they did with blockchain


> how are machines supposed to intervene in the real world and threaten our very existence?

"How could an adult fool a child into allowing them to enter the child's house?" It's essentially this. We are talking about something that would be more intelligent than everyone, there are countless ways in which it could fool us. Especially once we start to build things we don't quite understand how it works. Like we won't just build the machine and kept locked forever with zero interaction with it, otherwise it would be useless. It all reminds me of the "Contact" movie scene where humans build a machine that they essentially didn't know how it worked, but which did..


That's an interesting reframing, but the adult is scary because he's physically present, in a larger and stronger body, one with hands. A fairer comparison would be an adult messaging a child over the internet. A lot of evil can be done, but their life is unlikely to be endangered - much less so, at least, than it would be if the adult were in the driveway with a Free Candy van.


What if the adult on the messenger has a button to send drones to the child's house?

Or to turn on/off the electricity, heating, or access to that house?


I have no idea how a physically strong robot could be present...

https://www.youtube.com/watch?v=-e1_QhJ1EhQ


Some possibilities:

* Manipulating people. Bribing them with actually useful incentives, including digital currency. Also, people got depressed when their AI girlfriends got nerfed (see: Replika). Presumably they would do quite a bit of work to get them back.

* Hacking control systems of actuators like power generators, robots, and automobiles. Holding critical infrastructure hostage could force quite a few people to do some "obviously harmless" favors to aid its purpose.


Humans can do all these things, and much better than machines, and yet no one has conquered the world.


Humans have very limited bandwidth, knowledge, and speed relative to an AGI in a world of abundant GPUs. Humans are easy to capture relative to software. They cannot make copies of themselves. A small group cannot be in many places at once, and usually not without being seen or known about.

A larger group is bad at coordination without anyone leaking critical info to an outsider. Most people are not deeply malicious and are instinctively repelled by very immoral things. An AGI may not by default have such an instinct.


Humans can reproduce without pretty much any infrastructure, and they can self-repair without infrastructure.

An AGI has no physical manifestation to start with; it needs electricity, is susceptible to various weapons that humans are not, etc.


Are you seriously comparing the cost and speed of reproducing a piece of software with those of a living, functioning human being? Intelligent software can also hide in existing infrastructure much more easily than a human can.

Also, we're talking about a world in which the cost of GPUs is dropping and GPUs are becoming much more abundant over time.


No, but any hypothetical AGI would be reliant on machinery to survive; humans are not. Humans could EMP all electronics (heck, the sun could even do that for us), blow up all infrastructure, factories, etc., and still survive.

Being able to copy itself quickly while being totally reliant on artificial structures is a massive weakness. Software is just software - it doesn't generate electricity on its own.


That level of coordination among all, even most, human groups is extremely unlikely.

Humanity and civilization depend on many critical infrastructures. We also try to outcompete other groups all the time. In the near future with abundant GPUs, AGI and ASI can easily threaten or bribe some groups to keep it alive in exchange for powerful inventions/technologies.


> Humanity and civilization depend on many critical infrastructures

As does an AGI. Being the most intelligent, fastest-thinking engine in the world is worth exactly squat when the opposing side can get together 10000 guys with crowbars and a bad attitude in a hurry and knows where the server is.

Anyone who disagrees with that statement is welcome to explain how infectious particles without a mind, without a metabolism, without even a reproductive system (a.k.a. viruses) can pose such a goddamn hard-to-solve problem to a species that has atomic bombs and rocket engines and knows about quantum physics.

All the doomsday scenarios about AGI rely on it being able to have AGENCY in the REAL WORLD. That agency isn't software, it's hardware, and as such it is limited by physical laws. And a lot of that agency has to go through humans.


I answered your comment above here: https://news.ycombinator.com/item?id=38377302


So? That's just another assumption about the capabilities of an AGI.

> it would make many backups of itself to different networks before starting its scheme

What would make me assume that would work? We have effective countermeasures against small malware programs infiltrating critical systems, so why should I assume that a potentially massive ML model could just copy itself wherever, without being noticed and stopped?

Such scenarios are cool in a sci-fi movie, but in the real world there are firewalls, there are IDSes, there are honeypots, and there are lots and lots and lots of sysadmins who, unlike the AGI, can pull an Ethernet cable or flip the breaker switch.

And yes, if push comes to shove and humanity is actually under a massive threat, we CAN shut down everything. It would be a massive problem for everyone involved; it would cause worldwide chaos and massive economic loss, and everyone would have a very bad day. But at the end of the day, we can exist without power or being online. We have agency and can manipulate our environment directly, because we are a physical part of that environment.

An AGI cannot and isn't.

> We haven't managed to eliminate most dumb infectious diseases.

You do realise that this is a perfect argument for why humans would win against a rogue AGI?

We haven't managed to wipe out bacteria and viruses that threaten us. We, who carry around in our skulls the most complex structure in the known universe, who developed quantum physics, split the atom, developed theories about the birth of the cosmos, and changed the path of an asteroid, are apparently unable to destroy something that doesn't even have a brain or, in the case of viruses, a metabolism.

So forgive me if I don't think a rogue AGI has a good chance against us.


You're implying that all of humanity would agree to the sacrifice, or even to the need to eliminate a rogue AGI in the first place.

A 2022 AI can already beat most humans in the game Diplomacy: https://www.science.org/content/article/ai-learns-art-diplom...

Moreover, in the near future when GPUs become abundant, many small groups of people can harbor a copy of an AGI in their basement, where it can plan and re-spawn whenever the situation becomes accommodating again.


> You're implying

Well, this entire discussion is built on assumptions about what would happen in very speculative circumstances for which no precedent exists, so yeah, I am allowed to make as many assumptions of my own as I please ;-)

> A 2022 AI can already beat most humans in the game Diplomacy:

And a 1698 Savery Engine can pump water better than even the strongest human.

> in the near future when GPUs become abundant, many small groups of people can harbor a copy of an AGI in their basement

Interesting. On what data is the emergence of AGI "in the near future" based, if I may ask, given that there is still no definition of the term "intelligence" that doesn't involve pointing at ourselves? When is "near future"? Is it 1 year, 2, 10, 100? How does anyone measure how far away something is if we have no metric to determine the distance between the existing and the assumed result?

Oh, and of course, that is before we even ask the question whether or not an AGI is possible at all, which would be another very interesting question to which there is currently no answer.


That is just not true. Large-scale coordination isn't that unheard of (see WWII, treaties, and cooperation on various issues).

AGI might not even have magic technologies to offer, and the idea that whoever sides with an AGI would have the power to subdue the rest of humanity is a bold speculation. Just as humans haven't even subdued viruses, there is no reason to assume that to be true.


It doesn’t need to have magic technology, just access to critical pieces of software it can hide in or merge with.

See how quite a few people reacted when Replika was nerfed. Imagine what happens when more important pieces of software are supposed to be turned off to eliminate a rogue AGI. (Quite a few of those who argue against AGI danger will be the first to argue against doing so.)

Have we even managed to eliminate all the dumb computer viruses from the world?


I don't think that aligns with history: we recently went through large shutdowns and lockdowns, and we had and have people doing their daily work during war and massive destruction. Shutting things down will not be too difficult if the need arises.


Wars are a terrible analogy because they imply there are multiple sides. Why wouldn't any intelligent being, AGI included, take advantage of that?

Not to mention the fact that in most wars there are spies and double agents. In the near future when GPUs become abundant, many small groups of people can harbor a copy of an AGI in their basement, where it can plan and re-spawn whenever the situation becomes accommodating.


AGIs hiding in basements are not an existential threat. Joking aside, what you describe isn't really different from the present (or past), so it doesn't warrant much concern in relation to AGI. People have followed ideologies and ideas into doom throughout history; it is not clear that AGI changes anything there.

That is, if that type of AGI ever exists in the first place. Maybe real AGI has different desires?


All that is true. But humans have had thousands of years to do such things, and yet: No world dominator.


Because at the end of the day individual humans are within an order of magnitude of intelligence of one another.

Also, humans die. Human systems have been 'weak' without technology. Human thought and travel are slow.

If we take AI out of the equation and just add technology there is significant risk that a world dominator could arise. Either a negative dominator (no humans left to control due to nuclear war) or positive dominator (1984 cameras always watching you). There simply hasn't been enough time/luck for these to shake out yet.

Now, add something that is over an order of magnitude smarter and can copy itself a nearly unlimited number of times, and you are in new and incalculable territory.


> humans are within an order of magnitude of intelligence of one another.

Yes, and maybe that will be different with an AGI. Maybe AGI is physically possible. And maybe that advantage in intelligence will make AGI vastly more powerful than us.

Those are a lot of "maybes" however, and thus all of this remains highly speculative.

Just one example that puts a pretty big dent into many scenarios of all-powerful AGIs:

I think everyone agrees that even the dumbest human still outsmarts bacteria, protists and viruses by several orders of magnitude. And yet, we haven't been able to rid the world of things like measles, cholera, malaria or HIV. Even the common cold and influenza are still around.

So, if we, with our big brains that split the atom, went to the moon and developed calculus, cannot beat tiny infectious agents that don't even have a brain, then I remain kinda sceptical that being a very very very smart ML model means an automatic win over the human species.


I would actually argue that yes, humans _have_ conquered the world.

As in: we are now at the top of the food chain and decide which habitats and which animals get to live and which don't. Because we are the most intelligent beings on the planet. In our pursuit of that place, we've used other animals with lesser intelligence to our advantage (dogs, horses, ...)

The premise is that an AI with super-human intelligence could use us in the same way. And to be honest, we're really not that hard to manipulate or persuade (money, religion, blackmail, power, ...)


I thought we were considering a worst-case scenario, where some superhuman AGI, which has access to a lot more data than any single human being, can cross-reference everything much more quickly than a network of humans ever could.

It can already write much quicker than a human. Imagine what an AGI could do that wouldn't need to painstakingly write papers, publish them, read them, meet at conferences, and so on, to make progress.


> I thought we were considering a worst-case scenario

Well, I am not.

Unless someone can show me evidence of an AGI a) being possible, b) being within near future reach and c) being an existential threat.

Most humans don't leave their house assuming they will get hit by a meteorite either. And that is actually an occurrence that we know with certainty is physically possible.


> Humans can do all these things, and much better than machines

No they can't. Show me a human that doesn't need to sleep.


How do all the combustion-powered engines prevent their shutdown? Well, they were useful, so we adapted our society to rely on them -- anyone who didn't was outcompeted. Now we can't stop using them or the society falls apart.

This is not necessarily the mechanism most people consider, but it's a simple counterexample to "anything with an off switch can't be a threat to humanity".


To answer that question, ask yourself what a "physical threat" means to you. I would bet that (in an industrialized country) for any example there is a chain of events that could possibly be controlled by an AI. Dying of a virus? It could be designed by an AI and sent to bio-terrorists. Being beaten to death? An AI could provoke an angry mob. Dying of hunger? An AI could sabotage the economy of a country until it is back in the 19th century. And those examples are "relatively" benign.


"... from inside a laboratory, within an isolated network"

Who said anything about it being on an isolated network? We are en route to total commercialization. Your Windows machine might soon literally have an LLM on it running commands for you and managing your data. If you don't use it, you will be outcompeted. People used to argue about "boxing" or "airgapping" the AI, but we are literally just going to hand it control (on our current trajectory).


> how would such a machine - from inside a laboratory, within an isolated network

Someone clearly isn't aware of ChaosGPT. The notion that these systems would somehow stay isolated to the lab is absurd on its face.

But even if they were on an isolated network, and you grant that this is a superintelligence, then how much isolation do you reckon is really sufficient? If the isolation is at the software layer, then a superintelligence might be able to find bugs that we've missed and break out. Not even an air gap would necessarily fully isolate [1] a superintelligence.

And then you're completely missing the human factor: a superintelligence could easily make anyone rich, so a researcher in this lab could easily be tempted to exploit that for personal gain by connecting it to the internet.

How many of these attack vectors are AI labs insulated against? How many attack vectors are we simply not even imaginative enough to have thought of yet?

[1] https://threatpost.com/air-gap-attack-turns-memory-wifi/1623...


I think the idea is that we willingly plug them into critical systems, or really everything, including important infrastructure, for the benefits that their analysis and real-time management can bring, but then it goes pear-shaped. (As we already do, of course, just with beefier AI.)

Say it's in charge of shipping routes, the electric grid, agricultural spraying & harvesting plans, air traffic control, ...


> I think the idea is that we willingly plug them into critical systems, or really everything, including important infrastructure

Oh, OK. So an evil AI would conceal its capabilities and would play nice until it's put in charge of critical systems. I hadn't thought of that.

A more technical question for those of you working in AI: right now, are there any methods of - surveillance? - capable of monitoring or detecting emerging characteristics within an AI? How would you detect "evilness"? Or "generosity"? Or any other emotional/moral traits inside an AI?


My concern is that an "evil" AI wouldn't even realise it is, or choose to be, "evil".

Just like the YouTube or Facebook algorithms don't realise they might be causing harm to humans.

The AI might just be running its optimisations, and the termination of all humans is just an unforeseen consequence.

Or as Flight of the Conchords put it: "There is no more unethical treatment of elephants... There are no more elephants" [0]

[0] https://www.youtube.com/watch?v=AXhYgprPB9o


This is typically done by alignment teams (whose role is to ensure the AI behaves in accordance with our intentions) and red teams (whose role is to intentionally find holes in the systems).

An infamous example from earlier this year was a pre-release version of GPT-4 lying to a TaskRabbit worker about its identity in order to accomplish a task: https://www.vice.com/en/article/jg5ew4/gpt4-hired-unwitting-...

That was found by the alignment team testing its behaviour in that constructed scenario from the outside. Note that it wasn't able to complete the intermediate steps for self-replication, so we're still safe ;)

In terms of understanding what's going on internally, that's a different field, generally called "interpretability". That consists of people trying to understand the structures of a model and how it comes to a given answer. Anthropic are doing good work here: https://www.anthropic.com/index/decomposing-language-models-...

To answer the more general question: yes, it's being worked on, but there are no foolproof methods. That's a partial contributor to why some safety folks want to decelerate - if we can't understand our current models, what hope do we have of understanding GPT-5, 6 or 7?

Personally, I don't have a solid position on this. There haven't been any major incidents yet, but it's unclear if that's because of the work that's already been put in (like Y2K), or because they're fundamentally incapable of it. I'm an open-source optimist, so I'm hoping that many eyes will make any quirks shallow - but it's also hard to take results from the smaller models and scale them up.

Aside from the existential risk (which I think is unlikely at this stage, but not zero), there's also just the risk of general malfeasance. You don't need sentience, consciousness, or general intelligence to be a nuisance, especially if directed by a bad actor. Expect the next decade of elections to be full of noise, lies, and fabrications!


> how are machines supposed to intervene in the real world and threaten our very existence?

The obvious answer is: any control system connected to a computer is already accessible. Who says a computer program has to "gain" access? Programs already have access to a lot of machinery, from power plants to hospitals to elevators and magnetic doors. I know, because I programmed a bunch of them. These days GPT can output function calls as JSON; all you need to do is wire those calls into your control automation, and all your cheesy, unrealistic sci-fi shows become feasible.
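To make that concrete, here is a minimal sketch in Python; the tool name, the JSON shape, and the open_valve function are hypothetical stand-ins for real control automation, the point being only that once the model's JSON output reaches a dispatch table, nothing else stands between it and the actuator:

    import json

    # Hypothetical stand-in for a real control-automation call.
    def open_valve(valve_id: str, percent: int) -> None:
        print(f"opening valve {valve_id} to {percent}%")

    DISPATCH = {"open_valve": open_valve}

    # A function call as a model might emit it (the shape is an assumption).
    model_output = '{"name": "open_valve", "arguments": {"valve_id": "V-17", "percent": 80}}'

    call = json.loads(model_output)
    DISPATCH[call["name"]](**call["arguments"])  # the model's output becomes an action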


All of your other replies assume the AI has intentions of its own, which isn't a necessary component of AGI.

The plausible scenario if you ask me is that humans put it in charge of their company, with the explicit goal of improving the company's bottom line.

It doesn't take that much imagination to get there from where we are now. Just imagine a vastly more intelligent ChatGPT advising a company's leadership just by answering questions.

Being superior to all humans, it's insanely successful at this, and the company as a whole essentially comes to rule the world.


Of course, as long as the network is truly isolated, that's a non-issue. But the basic premise is that a machine that can access the Internet will self-replicate everywhere to preserve itself.

After all, we've already seen that malicious software can be both highly resilient and able to do real-world damage (e.g., Stuxnet). These came from human intelligence; by definition, a superhuman intelligence should be able to achieve all of that and then some.


There are countless possible means to do that. Maybe it'll social-engineer someone into connecting the network to the outside world. Is the network really isolated? Can the AI find ways to breach it? If I look at what security holes humans manage to find to exploit CPUs etc., I wonder what an AI with a server farm at its disposal can do.

But we're so far away from actual intelligence, it's pretty much science fiction at this point.


Are you familiar with RoboCop? Murphy had four basic rules installed: 1) serve the public trust, 2) protect the innocent, and 3) uphold the law, plus a mysterious "fourth directive".

This is what I fear about AI - that it will diligently, unwaveringly and probably even creatively do the bidding of its masters.

All you need to add to that is drones with guns, and you've got a nightmare scenario.


This reminds me of the very riveting book "The Metamorphosis of Prime Intellect", in which,

- Spoiler warning -

an AI is programmed to obey Asimov's three laws. But due to an as-yet-unknown quantum effect (this part is a bit far-fetched) it basically gains god-like powers, virtually instantly distributes itself over the galaxy, and eventually replaces the world with a virtual reality. That book was quite a ride. It's available for free: http://localroger.com/prime-intellect/mopiidx.html


Might be a good idea to put the link to the book before the spoiler warning ;)


We already have drones with guns...


Aye, but not at large, being run by some sort of hive intelligence.


Yet... There is a lot of research being done on just this to make unjammable drone swarms.


Lots of good replies on this thread, but here’s another possible analogy:

In high frequency trading, you have very complex software doing stock trades inconceivably faster than any human. If something goes wrong, it could bankrupt your hedge fund pretty swiftly. So hedge funds will have lots of safety checks, probably including a “big red button” that just shuts everything down.

So those companies must be completely safe from computer errors causing bankruptcy, right? After all, you can just shut the system down.

But some companies have gone bankrupt due to computer error. There are plenty of good reasons for the system not to be shut down in time (or at all). The risk is hopefully small but it’s not zero.


The thing that puzzles me is "why". What would be the purpose? How would "it" obtain a sense of "I" and a purpose to keep "I" alive and to do what? What would "it" be?


The paperclip optimizer is given a program and is not aligned with humans; it just pursues its goals.

What you should all be fearing is bot swarms and drone swarms becoming cheap and decentralized at scale online and in the real world. They can be deployed by anyone for any purpose, wreaking havoc everywhere.

Every single one of our systems relies on the inefficiency of an attacker. That will no longer be true.

Look up scenes like:

https://m.youtube.com/watch?v=O-2tpwW0kmU

https://m.youtube.com/watch?v=40JFxhhJEYk

And see it coming:

https://www.newscientist.com/article/2357548-us-military-pla...

Also ubiquitous cameras enable tracking everyone across every place, as soon as the databases are linked:

https://magarshak.com/blog/?p=169


I get the scenario where people use "AI" for their purposes. That is of course a very real scenario. But the question I raised was in relation to the OP's point about "AI" taking over the world and exterminating humans.


You're confusing terminal and instrumental goals.

You're thinking of an AI with a terminal goal of "kill all humans", and asking why it would have that goal in the first place.

But this is making a major error. You could give a superintelligent agent the goal "Buy gold at $400 an ounce using any means necessary; here is 1 million dollars". The AI could attempt to buy gold on the market at that price, fail, and come up with the idea "Humans value gold too much, therefore removing humans will allow me to achieve my goals". Humans are just the ant hill in your front lawn keeping you from having perfect grass; you wipe the ants out without a second thought.


I think this [1] article makes a good case for a lot of AI futurism just being a new draping over thought complexes created for old religious debates, so the AGI in the doomer view is basically the devil: it will conquer the world because that's what the devil does, and it will do it by seducing people with clever lies.

I'm not saying that recreating religious tropes is what people set out to do; it's more that as they walked through the space of possibilities they found these well-worn paths of logic and simply settled on what felt like convincing lines of reasoning, not seeing that it felt convincing because it was already familiar.

[1] https://www.vox.com/the-highlight/23779413/silicon-valleys-a...


It doesn't need a sense of "I".

Waze doesn't have a sense of "I". If I tell it to plot a route between one city and another, it plots that route. If I make Waze a lot more capable and feed it more data, it takes traffic data into account. If I make it more powerful and capable of accessing the internet, maybe it hacks into spy satellites to get more accurate data.

It didn't need any sense of I to increase what it does, just more capability.

If at some point it is capable of doing something more dangerous than just hacking into spy satellites, it might do it without any sense of "I" involved, just in trying to fulfill a basic command.


I think we've seen pretty convincingly that all a super intelligence would need access to in order to completely ruin humanity is social media APIs.


Assume that it has access to the internet. For now assume it has access to some money, although this assumption can be weakened.

If this is true, then you can order proteins from labs online with basically no security checks. You then pay someone on TaskRabbit to receive and mix these to create dangerous biological material, or to make viruses more dangerous, or something along those lines. This is a whole family of paths to a lot of damage.


I guess the idea is that the machine is not isolated. They are on the "Internet" and so is the rest of our infrastructure. Bad actors can and have already been able to infect grids and the like, but I think we'll just need to build the checks into our existing systems. <humour>There is no other way to stop the AI overlords.</humour>


Theoretically the machine could use social engineering to fool a sysadmin or developer into letting it out, and then start cloning itself like a worm. Good luck getting rid of it from the world if that happens. An isolated network might also not be a huge obstacle for a general AI, depending on the level of security and the precautions taken to contain it.


One way I could imagine is by manipulating humans to give it access to a non-isolated network for example.

Considering how easy it is for me, a mildly intelligent human for whom social interaction does not come naturally, to manipulate and influence people, I can only imagine how easy it would be for an artificial superintelligence with a huge training corpus.


A system with superhuman intelligence could easily manipulate human beings, convincing them that certain actions are opportunities worth taking. A superior intelligence wouldn't ask for things obviously dangerous to humans in the immediate future, but would work on a longer timescale. It could distribute smaller tasks that, taken singly, would look innocent. It could suggest social interventions that diminish the critical-thinking ability of the general population in the long term. It could help concentrate power in fewer human hands, so as to have an easier time manipulating the people who count. This wouldn't happen overnight.


With superhuman intelligence, it could manipulate humans to do its bidding. Plus, there are cults like e/acc that actively want to help AGI take over.


The whole discussion largely runs along (now forgotten) cultural pathways first walked by Christian religion, so AGI has taken on capabilities and motivations that make it basically the devil. Thus, by definition, it can do whatever is needed to bring about the catastrophe that is its destiny.


I hadn't thought about the links to religion. Interesting.

I'm not religious, but if I were, it would be quite natural to interpret a misaligned AI as literally the devil.


Couldn't be that hard to imagine how it might work?


Duh - killer robots! Haven't you seen Terminator?

From a practical point of view, the Putin types would love a robot army to invade all their neighbours, and then all it takes is for that army to do in its leader and run amok.


"Most machine intelligence development involves a “fitness function”—something the program tries to optimize. At some point, someone will probably try to give a program the fitness function of “survive and reproduce”. Even if not, it will likely be a useful subgoal of many other fitness functions. It worked well for biological life."

Interesting, how this correlates with this Bible passage from Genesis: "God blessed them, and God said to them, 'Be fruitful and multiply, and fill the earth and subdue it' ".


When I read passages like these, I think these people have no idea what they are talking about.

Biological organisms don't undergo a fitness function. A fitness function is a continuous model that replicates aspects of the real world; it is not how natural selection works. Don't confuse the model with the real world.

All AI models inherently follow the concept of "survive and reproduce", because AI models that do not "survive and reproduce" have ceased to exist. Explicitly adding survival and reproduction to the fitness function does nothing. In the case of the classic paperclip optimizer, for example, survival and reproduction are already implicit in optimizing paperclip production, because failing to survive and reproduce would mean failing at the goal of producing paperclips.
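To make that concrete, here's a toy evolutionary loop (a hypothetical sketch in plain Python, not tied to any real system): the fitness function only counts "paperclips", yet survival and reproduction still happen, purely as a side effect of selection.

    import random

    def fitness(candidate):
        # Toy objective: number of 1-bits, i.e. "paperclips produced".
        # Nothing here mentions survival or reproduction.
        return sum(candidate)

    def evolve(pop_size=20, genome_len=8, generations=50):
        population = [[random.randint(0, 1) for _ in range(genome_len)]
                      for _ in range(pop_size)]
        for _ in range(generations):
            # Selection: the top half "survives and reproduces" purely
            # because it scores well, not because survival is a goal.
            population.sort(key=fitness, reverse=True)
            survivors = population[:pop_size // 2]
            children = [[g if random.random() > 0.1 else 1 - g for g in s]
                        for s in survivors]
            population = survivors + children
        return max(population, key=fitness)

    print(evolve())

Adding an explicit "stay in the population" term to fitness() would change nothing: candidates that score well already stay, and candidates that don't are already gone.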

This reminds me of the story of an AI-controlled wolf that kept killing itself. The developer didn't grasp that the wolf's body is separate from the AI: if dying yields a higher score, there is no reason to avoid death, since the death of the body does not kill the "brain" and the body simply gets revived. The developer's fix was a quick hack: punish the AI for damaging its disposable body.


"Life tries to survive and reproduce" is a pretty basic observation that surely all nomadic people were familiar with. You could find a similar imperative in the origin stories of Native Americans or many other people around the world.



Both sides of the rift care a great deal about AI Safety. Sam himself helped draft the OpenAI charter and structure its governance, which focuses on AI Safety and benefits to humanity. The main reason for the disagreement is the different approaches they deem best:

* Sam and Greg appear to believe OpenAI should move toward AGI as fast as possible, because the longer they wait, the more likely it is that powerful AGI systems proliferate due to GPU overhang. Why? With more computational power at one's disposal, it's easier to find an algorithm, even a suboptimal one, to train an AGI.

As a glimpse on how an AI can be harmful, this report explores how LLMs can be used to aid in Large-Scale Biological Attacks https://www.rand.org/pubs/research_reports/RRA2977-1.html?

What if dozens of other groups become armed with the means to perform an attack like this? https://en.wikipedia.org/wiki/Tokyo_subway_sarin_attack

We know that there are quite a few malicious human groups who would use any means necessary to destroy another group, even at a serious cost to themselves. Thus, the widespread availability of unmonitored AGI would be quite troublesome.

* Helen and Ilya might believe it's better to slow down AGI development until we find technical means to deeply align an AGI with humanity first. This July, OpenAI started the Superalignment team with Ilya as a co-lead:

https://openai.com/blog/introducing-superalignment

But no one anywhere has found a good technique to ensure alignment yet, and it appears OpenAI's newest internal model has a significant capability leap, which could have led Ilya to make the decision he did.

Sam's Quote from the APEC Summit: "4 times now in the history of OpenAI — the most recent time was just in the last couple of weeks — I’ve gotten to be in the room when we push the veil of ignorance back and the frontier of discovery forward" -- https://twitter.com/SpencerKSchiff/status/172564613068224524...


How can there be alignment, when humans themselves aren't aligned anywhere?

Giving it a fancy name like "superalignment" is typical manager-speak.


You can align to general, broad principles that apply to all of humanity. Yes, of course, there are going to be exceptions and disagreements over what that looks like, but I would wager the vast majority of humanity would prefer it if a hypothetically empowered AI did not consider "wiping out humanity" as a potential solution to a problem it encounters.


But what would those principles be?

Are our principles designed with other beings in mind? Other humans, animals, etc.?

At this point we have a bunch of rules and principles that have exceptions whenever someone's benefit from violating them outweighs the negatives.

I don't see how we find a unified set of rules that is clear enough and isn't exploited or loopholed around in some context.

If the AI were actually smart and you told it not to harm or kill any human beings, but we ourselves do just that every day, what is this smart AI going to do about that?

It's like parents that don't practice what they preach, but expect their kids to behave.


Here’s where Sam mentioned his thoughts on short vs. long timelines until AGI and slow vs. fast takeoffs: interview clip by Lex Fridman at 1:58 https://youtu.be/ZgN7ZYUxcXM?si=FqRQmKFNzCI_iHGW

I think he tweeted about the topic as well.


The article on how AI could be harmful seems more about people trying to use the tool (an LLM) to do evil things. You can also do evil things with tools other than AI.


A powerful AI can greatly enhance the power of individuals and small groups.

In a free society, preventing harm from a malicious individual or small group would be much harder.

Yes, it would also extend the power to do good. But reversing serious damage is often impossible even with great power.


You can kill only so many with a knife or a gun.


I think he sees the development of machine intelligence as an inevitable outcome of making our tech better and better. An Erewhon sort of situation. Heading for a Peter Watts Fireflies timeline. If we create quadrillions of nigh-immortal servants, there is no doubt that in time we will become their servants. This happens in the biosphere: beetles can't out-compete militant ant superorganisms, so they invest in every kind of hacking and co-exist with the hive mind as macroparasites. The human brain is freakishly good at hacking given the right incentives. That is the power he seeks. I hope, lol


Altman is very consistently clear-eyed about the risks of AI and the need to develop it responsibly. People reflexively pigeonholing him as a reckless, greedy accelerationist are being lazy.


> Altman is very consistently clear-eyed about the risks of AI and the need to develop it responsibly. People reflexively pigeonholing him as a reckless, greedy accelerationist are being lazy.

if today's board changes go through then the AI safety people will have been replaced with three money guys

totally consistent with your statement


I don't think he is a Yann LeCun, but I still think he (and his vision for OpenAI) is more reckless than many people in the field.


Does that mean you think Yann LeCun is reckless, and if so, why?


Yann LeCun is the most outspoken anti-AI-Safety guy in the field; he has practically built his modern brand on it. In his eyes AGI is at best very far off, and even if it isn't, AGI Safety isn't a real concern. He posts about it multiple times a week. Here are his most recent views (which do sound at least a tad more reasonable in this tweet than in many of his responses to both AI Safety folks and accelerationists, where he goes further)[0]:

"(super)human-level AI..

- is not "just around the corner". It will take a while.

- is not an existential risk. "

0. https://twitter.com/ylecun/status/1726578588449669218


How are any of these arguments anti-safety? You are basing your opinion on prejudice, not on what he says.


Sam Altman starts his essay with

"Development of superhuman machine intelligence (SMI) [1] is probably the greatest threat to the continued existence of humanity. "

Yann LeCun explicitly says "superhuman AI is not an existential risk."

But let's look at a few of his other tweets from just the last few weeks:

"There isn't a shred of evidence that AI poses a paradigmatic shift in safety.

It's all fantasies fueled by popular science fiction culture suggesting some sort of imaginary but terrible, terrible catastrophic risk."

or constantly retweeting anti-AI Safety posts like this

"A piece by MBZUAI president @ericxing in WEF Agenda explaining why worries about AI existential risks are baseless. " or

"The fears of AI-fueled existential risks are based on flawed ideas."

or even denying short-term non-existential LLM risks (which I care less about)

"Pretty much the most important question in the debate about short-term risks of LLMs. No clear evidence so far."

Scroll through his feed and you'll find countless examples where he dismisses any concerns as doomerism.

What prejudice are you talking about? LeCun has expressed time and time again that he is not in the same camp as people like Altman, and has positioned himself as the leading face of opposition to AI Safety, and specifically to AGI Safety concerns.

0. https://twitter.com/ylecun/status/1725066749203415056

1. https://twitter.com/ylecun/status/1724272286000390406

2. https://twitter.com/ylecun/status/1725684495507149109


Saying "superhuman AI is not an existential risk" isn't the same as not caring about safety. It's a coherent assessment from someone working in the field that you may or may not agree with.


Since actual AGI is nowhere near, everything about it is speculation.

So his point about there being no evidence is valid for the time being.


We have no reason to believe either LeCun or Altman is reckless. This is just lazy thinking. Basically, by failing to position themselves securely in the doomer camp and constantly signal their commitment to doomer doctrine, they have already disqualified themselves from being considered responsible. No need to back the claim with evidence - they haven't demonstrated unquestioning dedication to the cult.


What would be evidence of the risks? The first million dead?


The idea that people can die does not provide much support here. It's easy to imagine doomsday scenarios for any technology or human behaviour. If you want to claim that a particular way of conducting R&D is reckless, it's on you to come up with convincing evidence.


Have human deaths ever prevented anything from being developed further?

Did we phase out guns, drones, bombs, nukes, fracking, etc.?

If there is money to be made, it will be made.


This position more or less presumes he's "solved" how to develop it responsibly, just to be clear, as his desired pace is... accelerating.


Smart people will come up with smart reasons.


Personally I found Bostrom's "Superintelligence" to be riddled with technical and philosophical flaws. Essentially a piece of sci-fi dressed as non-fiction.


Can you elaborate? Which parts had flaws?


(not OP) It's been a while, but if I recall correctly NB didn't address the technical limitations of Turing machines, e.g. the halting problem. How is a machine supposed to make itself smarter when it can't predict that it won't just crash after a code modification? Or just hack its motivation function (wireheading). The papers I've seen on the latter problem (years ago) start by assuming that the halting problem has been solved, essentially, by giving the agent non-deterministic computational powers. Biological intelligence evolved, so it's perhaps more realistic to imagine an ecosystem of computational agents competing in a trial-and-error race, but that makes the whole thing vulnerable to any threat to that superstructure--much more fragile than AI-go-FOOM.


I think AI doomers are mostly dumb, but this argument is not very good. The halting problem is a technical fact: you can't _prove_ that a program terminates in general. This doesn't mean that all programs are ambiguous with respect to termination. In fact, with appropriate limitations imposed, many programs can be proven to terminate. And even without such limitations, humans wrangle code all the time. If the presumption is that a machine intelligence is at least as smart as a person, I don't see why it would be any more likely than we are to run into non-terminating programs or whatever.

Furthermore, it doesn't matter for both practical and philosophical reasons. Practically, I don't see why a super-intelligence would just shut itself down to run some new code without various kinds of testing procedures. Also, there aren't any Turing machines anyway, since they need infinite memory, which even a super-intelligence doesn't have. I don't really see how the halting problem is a material problem for a super smart agent.
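To illustrate the "appropriate limitations" point: a system that only runs candidate code under an explicit step budget never even has to face the halting problem. A minimal, hypothetical sketch in Python (the names and the fuel value are made up; this is not from Bostrom or anyone's actual design):

    def run_with_fuel(step, state, fuel=10_000):
        # Run a step function until it reports completion or the fuel
        # budget runs out. Termination is guaranteed by construction:
        # the loop executes at most `fuel` iterations.
        for _ in range(fuel):
            done, state = step(state)
            if done:
                return state
        raise TimeoutError("fuel exhausted; treat the code as non-terminating")

    # Example: a countdown that always halts well within its budget.
    print(run_with_fuel(lambda n: (n == 0, max(n - 1, 0)), 10))

Restrictions like this (fuel limits, total languages, watchdog timers) trade some generality for a guarantee that execution ends, which is how practical systems dodge undecidability.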


It's not just crashing/halting. See Rice's Theorem. The machine can't predict its own future behavior in most ways that are important.

I'm not an expert in this stuff, and my point was that a serious treatment of super-intelligence should address such limitations to computations. In particular the "super" part seems to imply solving exponential-complexity problems in linear time. I remember looking in NB's book and not finding it.


I don't see why that is the case. I'm much more intelligent than a raccoon and I have yet to grapple with Rice's Theorem. There is no reason I can think of to believe that Rice's Theorem is a serious constraint on intelligence beyond my own. In general, such an agent isn't particularly interested in proving facts about its own program (at least I can't see why it would care any more than we are interested in proving mathematical properties of our own brains). It is interested in maximizing some objective function, which can transparently be done without thinking much at all about Rice's Theorem (all systems which train neural networks, and indeed even simpler optimization procedures, pursue such maximization with nary a thought toward Turing's or Rice's theorems).


> How is a machine supposed to make itself smarter when it can't predict that it won't just crash after a code modification?

https://en.wikipedia.org/wiki/G%C3%B6del_machine#cite_note-G...

> The papers I've seen on the latter problem (years ago) start by assuming that the halting problem has been solved, essentially, by giving the agent non-deterministic computational powers.

Of course this is not required. An AI system can simply decline to implement optimizations that it can't prove are correct, per the above link. Alternatively, if "crashing" is the only issue, then it could register a fault handler that reverts to its previous code.
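A minimal sketch of that revert-on-failure idea (the file names and the --self-test hook are hypothetical, purely for illustration, not anyone's actual architecture):

    import shutil
    import subprocess
    import sys

    def try_self_modification(current="agent.py", candidate="agent_v2.py"):
        # Install a proposed code change, run the new version's self-test,
        # and roll back to the snapshot if it crashes or fails - a crude
        # "fault handler" around self-modification.
        shutil.copy(current, current + ".bak")
        shutil.copy(candidate, current)
        result = subprocess.run([sys.executable, current, "--self-test"])
        if result.returncode != 0:
            shutil.copy(current + ".bak", current)  # revert to previous code
            return False
        return True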

The notion that these would be any kind of impediment is completely bizarre to me. As a lowly human I already know how to handle these problems, and we're talking here about a superintelligence more capable than either of us.

Some FOOM scenarios are clearly unphysical, but the idea that recursive self-improvement is impossible or infeasible is not one of the reasons why.


How advanced do you think an AI would have to be before we could say to GPT-x, "here is your source code - write us a much more powerful AI than yourself"? How far off would you say that is?


It's impossible to say with certainty. I suspect there are at least one or two generalization tricks needed, but that's only speculative. Those generalizations might be simple or a little more complex, and so might take a year or two, or decades to discover. I can only say that they will almost certainly be discovered within my lifetime (say within 40 years). I suspect it will be much sooner than that.


But look, you can't fight this with regulation (what he suggests in part 2). In fact, any self-respecting AI will use your regulation against you.


And he's also responsible for accelerating LLMs into the public sphere - something Google tried to keep under wraps. That looks like the exact opposite of fear.

It could be, as he says, that he's accelerating it because it's going to happen anyway - but the reasoning gets vague from there.

Anyway, it seems he's not entirely clear about his own stance, or not entirely honest. But this isn't anything abnormal. Not even Elon is.


He also wrote this 9 years ago. His motives around machine intelligence weren't financial, nor tied to an eventual Microsoft deal, back then.


There is no inconsistency here. You can believe that AI has serious risks while also believing it is overall worthwhile to develop and create it.

I'd rather have someone like Altman leading the charge on bringing AI to market, as it's at least clear he has thought about and understands the risks. A lot of the people criticising him appear to be in the "don't pursue AI at all" camp, which is the wrong call imo, as AI looks to be an enormous productivity and technology enhancer with clear potential benefits (already visible in things like personalised tutoring).


He states that one should fear machine intelligence, yet his actions run exactly counter to that fear.

He obviously has no fear and thus he is inconsistent. But this is normal. Everyone is inconsistent.

The main thing this article illustrates is that Sam is not the supremely logical person with a heart of gold that he often tries (and succeeds) to portray himself as. He's just a typical SV tech bro out to make a name for himself and himself only.

The hero worship is what is completely off here. People are as viciously against his firing as if someone had fired Mother Teresa or some other moral paragon. For much of the time no one even knew why he was fired, yet they already assumed the worst. No different than cancel culture.


I watched his recent award acceptance talk at Cambridge (on behalf of all of OpenAI), and in the questions section this came up.

If I'm summarising correctly, his answer was that we can only learn how to make them safe by playing with these models while they're relatively small and poor.

https://youtu.be/NjpNG0CJRMM


AI is the new Y2K bug... except with more Hollywood movies and it will never be over


I still have not seen anyone adequately explain what "intelligence" actually is.

So if we don't know, or can't agree on, what it is... how can we claim that what we've created is artificial intelligence?


This is a particular misnomer.

An excavator is a superhuman digger.

A calculator is a superhuman adder.

Both of these things are obviously artificial.

We can measure individual facets of intelligence too. It gets much more complicated when we talk about "general" intelligence and spectrums of intelligence and how they interact with each other.


No, that's ridiculous. They're not human.

You might as well call a stick a superhuman poker.


> I still have not seen anyone adequately explain what "intelligence" actually is.

A formal definition would be useful, but not strictly necessary. "More capable at task X", where X can range over all non-physical/information-processing tasks, is sufficient.


Won’t a superpowered rational machine be more “good” than us? Like, rational thinking and intelligence, in the limit, is fundamentally good. That’s what Spinoza argues in the Ethica


Good and bad are moral notions. An AI, be it super-, hyper- or giga-powered, won't have a moral sense other than what humans put into it (see the restrictions on "bad" content in GPT answers). So an AI is neither good nor bad. Eventually, if you put many AIs together and allow them to create laws about what AIs can or can't do, it could lead to an artificial morality. But even then, nothing tells us it would be a good morality in a human sense (if a good morality even exists).

Today, rational thinking would lead us to abandon oil-based technologies, AI included, tbh.


Look up instrumental rationality: rationality in that sense is not related whatsoever to right and wrong, only to what helps with the agent's goals.


Doesn't mean it will be more good for us, or to us.


You're superpowered and rational compared to an ant. Yet you gladly bulldoze their nests when you build a road.


IMO this is nonsense.

How about the following preventions:

1 - make sure MI has no actors (by this logic LLMs are safe)

2 - make sure MI doesn't get control over its energy supply (so we can always starve it)

Easy, easy to verify, and we can in the meantime do something useful instead of the equivalent of discussing the potential risks of cars while living in the stone age.

Update to the replies:

1. "No actors" means it's like a person in prison - has the smartest person ever been able to socially engineer its way out of prison? No, it might have escaped with the help of social engineering but still actors were needed.

2. There's a plug for every machine we use. There is simply no way to "work around" that without actors.

Once we have generally capable robots that can fix stuff / solder things / push buttons AND those robots are controlled by an AI then we can start discussing.

However, this is not now, nor is it in the next 5 years.


I think the counterargument is usually something like: since this is superhuman intelligence we are talking about, it by definition is (at least possibly) capable of thinking its way out of any given constraints. Simply not allowing it direct control over x does not prevent it from devising a clever plan to manipulate things in order to gain control of x, even including social engineering if necessary. The idea is that once something we’ve built is truly as ‘intelligent’ as we are, it becomes very hard to predict what it will be able to do (particularly because it operates on timescales unimaginable to us). So the verification you speak of is anything but easy.


Chimps be like: don't fear the humans, we're more powerful than them and we'll control the banana supply.


Completely coincidental comparison for people that you disagree with, I suppose?


“Make sure”? OpenAI can barely take care of its humans. What about a more intelligent system that is already on the internet, which is in turn connected to most things (and anything air-gapped is one trainable, trickable human away)?


Good luck with that when people have already hooked up LLM systems in chain-of-thought mode to Twitter accounts. Just substitute in the new AI model and it'll be an agent pretty quick, I'd say.


> WHY YOU SHOULD FEAR MACHINE INTELLIGENCE

This dude has a serious obsession with fear.



