AI safety is an apocalypse cult and we should recognize that everyone who has ever been in an apocalypse cult thinks that "no, this time it's really the end of the world". Everyone who believes crazy shit thinks they're being reasonable, so just because we're super sure we're being reasonable doesn't mean what we believe isn't crazy.
Do we believe that we alone, of all people who have believed in the coming end, have it right this time?
There have been people for a long time who think the world is going to end. We can use your argument to dismiss anyone who fears any existential risk like asteroids, nuclear war, climate change, AI, etc...
And since the world still exists, every such group so far has been wrong, and at most one of them can ever turn out to be right.
I think the risk of human extermination is much lower than many in this group believe, because I think AI is much more compute bound than algorithm bound. So I think a hard takeoff is very unlikely. But while the median case for climate change is probably worse than the median case for AI, the existential risk from AI seems much higher. And it's a concern worth taking seriously.
The issue is that the effective altruism crowd are focused exclusively on AI safety, at least in what they publish online. If they talked about all possible threats to humanity and how to solve them, from AI, to climate change, to nuclear bombs, to asteroids, to disease, they would be an interesting and important group.
They don't. They hardly talk about anything else except AI. From the outside it looks a lot like an obsession.
Personally I think they treat AI as an existential threat above all others because they consider themselves to be the pinnacle of intelligence in the universe and AI is showing that human intelligence might not be so special after all. So it's an existential threat to their sense of self. Other existential threats merely threaten their lives and are therefore not as important.
This would be fine, if acknowledged with a degree of self awareness. But it's not, instead it's wrapped up in so many layers of convoluted logic that most people, insiders and outsiders both, can't see the real reason for this obsession.
I've never been on this website before, but am I correct in understanding that the "biosecurity" category is about a month old?
It would be interesting to understand how much of the "existential risk" category is devoted to biosecurity or pandemics, and how recent those posts are.
Pending the above, my tentative takeaway from this is that COVID - arguably the greatest threat to human health and wellbeing in the past several decades - seems to have been a minor/non-existent blip on their radar until it actually happened. This raises serious questions in my mind about the community's predictive abilities and/or bias.
> am I correct in understanding that the "biosecurity" category is about a month old?
It looks like the Forum is confusing here: it has "Biosecurity & Pandemic Preparedness" [1] as one topic, and "Biosecurity" [2] as its largest subtopic. But when you click on the parent topic it's not showing the subtopic posts; I've filed a bug.
Thanks for the reading. The 2017 talk/article is interesting in light of COVID:
"Some of the reasons I'm skeptical of natural risks are that first of all, they've never really happened before. Humans have obviously never been caused to go extinct by a natural risk, otherwise we would not be here talking. It doesn't seem like human civilization has come close to the brink of collapse because of a natural risk, especially in the recent past.
"You can argue about some things like the Black Death, which certainly caused very severe effects on civilization in certain areas in the past. But this implies a fairly low base rate. We should think in any given decade, there's a relatively low chance of some disease just emerging that could have such a devastating impact. Similarly, it seems like it rarely happens with nonhuman animals that a pathogen emerges that causes them to go extinct. I know there's one confirmed case in mammals. I don't know of any others. This scarcity of cases also implies that this isn't something that happens very frequently, so in any given decade, we should probably start with a prior that there's a low probability of a catastrophically bad natural pathogen occurring."
I wonder if this is a case of extrapolation fallacy? Modern human population dynamics are significantly different from both animal behaviour and pre-modern human behaviour. Viruses spread more easily in the age of global travel; those that used to be too deadly to spread very far suddenly have more opportunity.
EDIT: Reading this more carefully, the speaker does actually address globalisation, but seems to dismiss it as a counterargument and I'm not really sure why.
Anyway I read through that article. There's a lot of guff about Newcomb's paradox and Eliezer, and a single paragraph referencing COVID, with two links supporting your statement. I've clicked through those links and have come out unimpressed.
Scott rightly highlights how the media and institutions got things wrong, but conversely gives only a few examples of "generic smart people on Twitter" getting some things right in the early days of the outbreak. He confesses that he himself did not predict the seriousness of COVID.
The second link is used to support the claim "the wider tech community, were using masks and stocking up on essential goods, even as others were saying to worry about the flu instead": https://putanumonit.com/2020/02/27/seeing-the-smoke/ .
The linked article was written at the end of February, when panic-buying had already firmly set in across the US.
There is nothing here about the reasoning or predictive abilities of the rationalist or EA community specifically. Nor is there any compelling comparison of its response with that of the wider public.
Climate change is unlikely to kill everyone on earth. Other risks, such as an engineered pandemic, asteroid impact, or an AI apocalypse have the possibility of killing ~everyone. This is not saying that climate change is not a real issue.
The AIpocalypse seems incredibly unlikely. I'm a lot more worried about the nukes, and even if we end up with robotic overlords, I'd bet they'll be a whole lot better at administration than meat-admins have proven.
I'm with you on engineered superviruses. Feasible, likely, and incredibly high impact.
What I kind of keep coming back to is the risk profile of all this stuff. That magic product of likelihood * impact. Global warming is happening now. And it's real bad - worse than I think we give it credit for.
What I worry about is that we're the proverbial frog in the pot. Things get just slightly hotter each year, so we'll miss it when we actually boil.
So what you're saying is that if a risk can totally upend our society, destroy most of our cities, make most of our farmland unusable, destabilize geopolitics in a way that's almost certain to lead to war between nuclear powers (who also have the ability to engineer deadly diseases), and massively disrupt every ecosystem on earth, and that there's enough evidence to say with almost complete certainty that this will come to pass without massive societal and political change and technological intervention... but it's unlikely to kill everyone... then it doesn't really deserve mention?
The cascading risks you mention are certainly real and serious, and are worthy of our best and urgent efforts to solve. Effective altruists are rightly concerned about these effects e.g. https://80000hours.org/problem-profiles/climate-change/
My comment was written with the summary of that article in mind (I didn't make this clear):
> Climate change is going to significantly and negatively impact the world. Its impacts on the poorest people in our society and our planet’s biodiversity are cause for particular concern. Looking at the worst possible scenarios, it could be an important factor that increases existential threats from other sources, like great power conflicts, nuclear war, or pandemics. But because the worst potential consequences seem to run through those other sources, and these other risks seem larger and more neglected, we think most readers can have a greater impact in expectation working directly on one of these other risks.
There is an excellent chapter on the existential risks associated with climate change in Toby Ord's book The Precipice, which you can get a free copy of at https://80000hours.org/the-precipice/
The way the EA forum handles parent topics is confusing, but this is only counting posts tagged with the top-level label. It has several subcategories, and there are hundreds of posts when you include them:
* 218: Biosecurity
* 176: Covid
* 54: Pandemic preparedness
* 40: Global catastrophic biological risk
* 39: Vaccines
* 20: Biotech
* 16: Life sciences
* 11: Dual-use
* 10: Biosurveillance
(Posts can have multiple labels, so the above list double-counts a bit. I don't see an easy way to extract the actual count from the forum, but it's at least 218, the count for the largest category.)
> The issue is that the effective altruism crowd are focused exclusively on AI safety.
How many EA people do you know?
That seems to get the most clicks, but I usually associate them with things like mosquito bed nets to help prevent malaria, because that's a far more common topic in my experience.
I'm sure you can find particular people who are all about the AI risk stuff that seems pretty wild to me, too.
But when you say "focused exclusively" I feel like you may have mostly seen news articles about wild AI stuff and not the average EA enthusiast who spends their time talking about picking efficient charities.
As someone who is not an EA, but has been peripherally aware of them for years: the bed nets stuff was their big thing until maybe three years ago, but it genuinely is the case that a lot of their energy has recently been redirected to AI safety and other X-risks.
That isn't to say that the malaria stuff isn't still a big topic, but there has been a shift.
Yep, I think the community is pretty evenly split between the "longtermist"-focused people (which includes all the X-risk stuff) and the more classical "Global Health & Happiness" people. At this point they are actually relatively distinct groups, I'd wager. One thing missing from the article is that, at least in my experience, the vast majority of the opportunists are in the longtermist camp.
Longtermist topics are the perfect combination of high-prestige and smart-sounding with a complete lack of accountability, because the impact of research or projects won't materialize for decades. Ironically, in practice this directly contradicts the founding tenets of EA, despite being a logical conclusion of the underlying moral philosophy.
AI safety and "longtermism" (I feel slightly more stupid just by writing this) is an escape route for the group's realization that (a) they cannot really make a difference and (b) even if they could, it would require a lot of actual work. The person who goes out in the middle of a cold night and cooks soup for homeless people can quite easily count how many mouths he fed; sounds quite effective and quantitative to me. Or maybe the volunteer who is helping fight malaria in a desert somewhere outside of America/Europe; actually, they don't need to count, their exhaustion at the end of a hard day is proof enough of their altruism.
But since the real goals of the EAs are not to do any of that (it's just a bunch of socially-awkward geeks looking to feel accepted in some kind of community and find sexual partners; in other words, trying to feel like normal humans), they fooled themselves into believing that such short-term actions are "ineffective"; indeed, any actually measurable hard work has to be ineffective, or else it would actually have to be done. So they keep pushing their agenda toward longer and longer term issues, as abstract and ill-defined as possible, which cannot actually be measured, so they can continue to build this weird, meaningless, ridiculous fantasy that they live in (and which supports their social hierarchy) without having to actually do anything concrete.
>The issue is that the effective altruism crowd are focused exclusively on AI safety.
That's really not true. There's a huge overlap in these groups, but the top EA charities are not shoveling all their money at AI alignment orgs... even if one can argue that they should.
Arguing that AI safety is needed only demonstrates that it's altruistic, but you have a lot farther to go if you want to say funding research in this decade can be effective.
I think they're worried about it because they honestly do care about existential risk. If you can come up with some more likely existential threats than AI, I'm sure they'll worry about that too.
> They don't. They hardly talk about anything else except AI.
Even if that were true, there are lots of people already looking at those other existential risks. The EA crowd was the only group concerned about AI for a long, long time. Seeing an existential risk that nobody else does leads to evangelism and hyper focus so people start paying attention.
> AI is much more compute bound than algorithm bound
Any reason to believe this? The human brain runs on 20 watts (!). Obviously biological hardware is not directly comparable to silicon; perhaps silicon is intrinsically less efficient by some factor, such that in order to obtain 20W of human-brain compute we need to expend K times more in silicon. But is that factor really so large that we are compute-bound? How big would K have to be for us to be compute-bound anyway? If you could have Einstein for 20kW (i.e. silicon is intrinsically 1,000x less efficient than brains), would that be compute-bound? It seems much more likely that we are algorithm-bound and simply not using our available compute efficiently.
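To put rough numbers on that, here's a back-of-the-envelope sketch (the 20W figure is the round number above; the electricity price and the K values are assumptions I'm inventing for illustration):

    # Rough sketch: if silicon needs K times more power than a 20 W brain to deliver
    # "one brain-equivalent" of compute, what does a year of electricity cost?
    # The $0.10/kWh price and the K values are assumptions, not estimates.

    BRAIN_WATTS = 20
    PRICE_PER_KWH = 0.10        # USD, assumed
    HOURS_PER_YEAR = 24 * 365

    for k in (1, 1_000, 10_000, 1_000_000):
        watts = BRAIN_WATTS * k
        kwh_per_year = watts / 1000 * HOURS_PER_YEAR
        cost = kwh_per_year * PRICE_PER_KWH
        print(f"K = {k:>9,}: {watts / 1000:>8,.2f} kW, ~${cost:>13,.0f}/year in electricity")

Even at K = 10,000 the power bill for one "Einstein" is in the ballpark of a single engineer's salary, which is the intuition behind thinking the bottleneck is the algorithms rather than the watts.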
Yudkowsky's hard-takeoff scenario mentions "solving protein folding", nanomachines, persuading everyone to let the AI connect to the internet, infinite adaptability, etc. I think those are unrealistic, but obviously they still have a non-zero chance.
More probable bad possibilities are AI-Hitler (monomaniacal populist absolute leader) or AI-Stalin (manipulative, smart, absolutely paranoid, rising through the ranks) ... so something that's human-like enough to be able to connect with humans and manipulate them, but at the same time less affected by the psychological shortcomings. (I.e. such an AI could spend enough time to cross-interrogate every underling, constantly watch them, etc.)
And yes, a very efficient immortal dictator is very bad news, but still bound by human-like limits.
And the big infinite dollar question is could this hypothetical AI improve on itself by transcending human limits? Let's say by directly writing programs that it has conscious control over? Can it truly "watch" a 1000 video streams in real-time?
Can it increase the number of its input-and-output channels while maintaining its human-like efficiency?
Because it's very different to run a fast neural network that spits out a myriad of labels for every frame of a video stream (YOLO does this already, but it's not 20W!) and to integrate those labels into actions based on a constantly evolving strategy.
Sure maybe the hypothetical AI will simply run a lot of AlphaZero-like hybrid tree-search estimator-evaluator things ...
Anyway, what I'm trying to say is that our 20W efficiency comes with getting tired very fast, and using "fast mode" thinking for everything. (Slow mode is the exception, and using it is so rare that we basically pop open a glass of champagne every time.)
I agree that the most likely way an AI would take control involves social/political engineering, but that doesn't mean it will have human-like morals making it keep humanity alive once it doesn't need us or that it will have human-like limits.
>And the big infinite dollar question is could this hypothetical AI improve on itself by transcending human limits? Let's say by directly writing programs that it has conscious control over? Can it truly "watch" a 1000 video streams in real-time?
Even if its mind wasn't truly directly scalable, it could make 1000 short or long-lived copies of itself to delegate those tasks to.
Asteroids, nuclear war, climate change, and AI. As the old song says "One of these things is not like the other". We know that asteroids exist and many large ones are uncomfortably close to our home. We know that nuclear weapons exist and have seen them used. We know that the climate is getting hotter (*).
AI is...well, people losing their shit over ChatGPT (**) aside, AI is not going to be real enough to worry about for a few more decades at least.
(*) Anyone who's about to regurgitate some fossil-fuel industry talking points in response, just save your breath.
Would you rather we wait until AIs are actually posing a threat before we study ways to align them with human values? Tons of money already goes into fighting climate change, and basically everyone on earth is aware of the threat it poses. The AI safety field is only about a decade old and is relatively unknown. Of course it makes sense to raise awareness there.
You seem to agree that at some point in the next few decades AI will be something we need to worry about, so I'm trying to figure out exactly what it is you oppose.
Would you have opposed research into renewable energy in the 1970s since global warming was still a few decades away from being something we needed to worry about?
What’s the upper bound for climate change’s existential risks? The end of human existence and society as we know it except for a relatively small number of survivors living in a world those of us alive today can barely imagine?
The scenario described in this article comes the closest, detailing a mechanism that may have been responsible for the end-Permian mass extinction, wherein warming oceans become anoxic and begin to release huge quantities of hydrogen sulfide gas. The gas is directly lethal to most life in the oceans and on land, and destroys the ozone layer as a bonus. It's hard for me to imagine any way in which an agricultural human civilization could survive that scenario.
By "climate change" people mean the sort of change that will be realistically caused by humans within the next few hundred years, not the sort of extinction caused by massive volcano activity that takes 10,000 years and happens once every 100 million years. Obviously if the climate changes drastically enough fast enough then everyone dies but nobody is suggesting that's the case for preset day human activity.
In terms of timeline, the key sentence would be "The so-called thermal extinction at the end of the Paleocene began when atmospheric CO2 was just under 1,000 parts per million (ppm)". The end-Permian was likely around 2500 ppm[1]. It doesn't really matter whether that carbon comes from supervolcanoes or from human emissions.
When the article was written, CO2 concentration was 385 ppm. Today it's 421 ppm. If CO2 concentrations were to rise linearly, the timeline for reaching 1,000 ppm would be 250 years. Achieving that would require emissions to stabilize at ~2014 levels. If emissions keep rising, then perhaps it's closer to 150 years. If strong feedbacks kick in, like methane from melting permafrost, then maybe 100 years or less, or maybe we'll reach that happy 2500 ppm mark and see what a real mass-extinction looks like.
Of course the ocean has a big thermal mass, so it will probably take quite some time to heat up enough to trigger the disaster even after we reach that level of atmospheric carbon. Hopefully everything will be fine?
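For what it's worth, the arithmetic behind those timelines looks roughly like this (the growth rates are assumptions; only the 421 ppm starting point comes from the measurements above):

    # Naive extrapolation of the ppm figures above. Growth rates are assumed
    # round numbers, not output from any climate model.

    CURRENT_PPM = 421
    TARGETS = (1000, 2500)   # end-Paleocene-ish and end-Permian-ish levels from the thread

    scenarios = {
        "linear at ~today's rate": 2.4,   # ppm/year, assumed
        "emissions keep rising":   4.0,   # ppm/year, assumed
        "strong feedbacks":        6.0,   # ppm/year, assumed (e.g. permafrost methane)
    }

    for name, rate in scenarios.items():
        for target in TARGETS:
            years = (target - CURRENT_PPM) / rate
            print(f"{name:<24} -> {target:>4} ppm in ~{years:>4.0f} years")

These are just constant-rate extrapolations, not projections; the real trajectory depends entirely on emissions and feedbacks.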
"It doesn't really matter whether that carbon comes from supervolcanoes or from human emissions."
It matters a lot. Whilst eating dinner I've now read the article you cited and the paper that it cites (Kump, Arthur & Pavlov, 2015). I also cross-checked the argument and numbers against several other papers. Ward's argument is extremely slippery. This is sadly what I'm coming to expect from academics with unverifiable models. You have to read everything they write adversarially.
"In [Kump2015]'s models, if the deepwater H2S concentrations were to increase beyond a critical threshold during such an interval of oceanic anoxia, then the chemocline separating the H2S-rich deepwater from oxygenated surface water could have floated up to the top abruptly. The horrific result would be great bubbles of toxic H2S gas erupting into the atmosphere."
Observe that his argument starts with a "critical threshold" for H2S levels, not CO2, and that he doesn't tell us what this critical level is. The obvious questions are thus: what is this level, is it realistic for our present day oceans to reach this critical level and if so, how? To get to those levels of H2S he hypothesises that:
"if ancient volcanism raised CO2 and lowered the amount of oxygen in the atmosphere, and global warming made it more difficult for the remaining oxygen to penetrate the oceans, conditions would have become amenable for the deep-sea anaerobic bacteria to generate massive upwellings of H2S."
In other words, in this theory the supervolcanoes have to come first, triggering a sharp drop in oxygen levels in the atmosphere, which in turn causes the chemocline to move, which then causes the H2S upwellings. It's all caused by a massive loss of oxygen, not an increase in CO2 levels, which is simply another result of the volcanism. Kump2015 also makes it clear that their hypothesis requires a truly massive drop in oxygen levels to occur. We'll look at how much in a moment.
But just a few paragraphs later Ward has forgotten all about the volcanoes and oxygen levels. Suddenly it's all about absolute CO2 levels - not even rates! Cause, effect and unrelated side effects have become entirely muddled, probably because he knows nobody would care about his article unless he ties it to global warming armageddon somehow and because hey, this entire field is nothing but assumptions, suppositions, and playing with numbers that can never be verified anyway so why not?
His argument is brittle in other ways. He asserts without any backing argument that whilst supervolcanoes explain all the other non-asteroid mass extinction events it doesn't for the Permian because, apparently, supervolcanoes are really great for plants on land which can "probably" survive the warming. And there was me thinking that large scale volcanic activity is supposed to be very bad for plants because it dims the atmosphere:
"Plant growth is restricted and mass extinction can be caused."
Back to the H2S claims. Turning our attention to the Kump2015 paper we find immediately that it's got an annoying structure in which they work backwards from their desired scenario to calculate the level of H2S that could trigger it:
"Thus, if the H2S of the deep sea increased during an anoxic interval beyond a critical value (1 mmol/kg), upwelling regions of the world ocean would become sulfidic, even with the modern [oxygen level at atmospheric pressure] .... A slightly more sulfidic ocean with H2S = 3 mmol/kg ... could sustain ... 2000 times the present-day flux and a critical value for the atmosphere (see following)."
But these critical values are never placed clearly in context. Is 3 millimoles per kilogram a lot or not much? Note that the 3 mmol/kg value is the absolute "best" case for their scenario; it can only be that low in (they estimate) 0.1% of the world's oceans, because the normal ocean requires levels 20x higher. They do admit that:
"The [H2S] condition is extreme, and thus likely to have been rarely achieved in Earth history. Is there any evidence that such conditions have occurred in the geologic past?"
How extreme is it? The value for the normal ocean would be important to have here, but they don't give it to us. The paper "Hydrogen Sulfide in the Black Sea" by Volkov & Neretin does give values though, and bear in mind this is by far the most anoxic basin in the world: at a depth of 1km the value is ~314 micromoles/kg. So the H2S levels that are claimed to trigger this process are roughly an order of magnitude higher than even the Black Sea, and vastly higher than the essentially H2S-free oxygenated open ocean.
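For concreteness, the unit conversion behind that comparison (both concentrations are the ones quoted above):

    # Comparing the paper's "slightly more sulfidic" scenario with the Black Sea
    # value quoted above. 1 mmol = 1000 umol, so this is just a unit conversion.

    kump_scenario_mmol_per_kg = 3      # Kump et al. scenario, as quoted above
    black_sea_umol_per_kg = 314        # Volkov & Neretin, ~1 km depth, as quoted above

    kump_umol_per_kg = kump_scenario_mmol_per_kg * 1000
    ratio = kump_umol_per_kg / black_sea_umol_per_kg
    print(f"{kump_umol_per_kg} umol/kg vs {black_sea_umol_per_kg} umol/kg -> ~{ratio:.0f}x")
    # ~10x: about an order of magnitude above the most sulfidic basin on Earth today,
    # and far above the essentially H2S-free oxygenated open ocean.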
The size of the oxygen drop needed is something they also don't seem to share directly, but in one section they're kicking around a figure like half of all today's oxygen having vanished, or, for a different event, 99% of all oxygen having gone.
Even putting aside that these papers are just piles of completely unverifiable suppositions stacked like a jenga tower, there is absolutely nothing even remotely close to realistic about this scenario happening to us. It requires oxygen levels to drop so much that if it were to ever become an actual threat we'd all have died of oxygen starvation long before.
The solubility of gasses in water decreases with rising temperature. For ocean absorption of O2 from the atmosphere, the surface temperature is particularly important. This is why even if atmospheric O2 does not decrease, this could still be a concern -- albeit one with a different threshold than a scenario where the ocean warms and atmospheric O2 concentrations decrease at the same time.
Still, I certainly do hope that you're correct! And thank you for the thoughtful and well-considered reply!
Personally, I find that the AI safety arguments sound pretty far-fetched, but I can’t come up with solid arguments against them. Also, when I look for a good “debunking”, I don’t find anything. Mostly all I find is mockery, ad hominem and the like. Frankly, the people who argue climate change isn’t real do a better job than the people who argue AI risk isn’t real, and they’re wrong!
I don't think it's impossible to come up with arguments against them. For one thing, extrapolating the current gradual rate of progress forwards means that we will get to see several minor "intelligence spills" before the hypothetical big and last one, and by observing what went wrong, humanity will have the opportunity to come up with solutions.
Since very smart human beings have in history done a lot of damage, but never ended the species or anything, we can conclude that the world has an "intelligence safety margin" that extends up to Alexander the Great or Napoleon. There have been some incredibly smart people in human history and none of them have ruined everything, a few countries at most.
I'm pretty sure they have stock arguments against what you said, even from people like Nick Bostrom, who I think has a bit better handle on the situation. In that vein though, I think making the world more robust in general might be a way forward: having multiple broad-spectrum antivirals for every class of virus ready to go and stockpiled, having flexible on-shore manufacturing infrastructure with supply chains that are on the same continent or at least massive component stockpiles at the national level, and having smart and vigilant people running these systems from the very top down to the janitors and security guards. Make a world where the machine stopping, because a few agents, human or otherwise, do something unexpected, is much less likely. At least to the point where EAs can start buying mosquito nets again.
Alexander the Great or Napoleon have been limited in how much damage they can do, because they're human too. They don't live for hundreds of years, they can't copy their brains to new hardware, and they can't modify their own minds to remove that pesky conscience when it slows them down. Perhaps most importantly, Napoleon's happiness, power, and continued survival depends on having a civilization of surviving humans working under his command, whereas that might only temporarily be true for AI.
At best, history serves as a counterexample to the idea that if an AI goes bad the attendants would just unplug it, seeing how often it is that dictators don't get stabbed by their aides as soon as they start causing mass deaths, instead often receiving broad popular support as the world burns.
>I don't think it's impossible to come up with arguments against them. For one thing, extrapolating the current gradual rate of progress forwards means that we will get to see several minor "intelligence spills" before the hypothetical big and last one, and by observing what went wrong, humanity will have the opportunity to come up with solutions.
That blog post makes some giant unsupported assumptions, like that the reason for the failure of a few activists to stop gain-of-function research was that the government is fundamentally bad at addressing risks, and not that biologists might know more about the risk/benefit tradeoff than amateurs.
From what I've seen, the AI safety arguments basically rely on assuming that the singularity is not merely real, but guaranteed: that we will create a full-fledged general AI, that it will be smarter than us, and that it will be fully capable of upgrading itself to be far beyond our control before we have any ability to prevent it from doing so.
Every step of this relies on assumptions that are not merely questionable, but unfalsifiable.
Whether AGI is possible or not, regardless of anyone's personal opinion, is as yet unprovable and unfalsifiable.
Assuming that AGI itself is possible, there is no way to tell whether we, as humans, can create an intelligence that is "smarter" than we are.
Assuming that we can create an AGI that is "smarter" than we are, there is no way to determine whether it would be able to upgrade itself to become exponentially smarter than that, and beyond human understanding and control.
If you have a hard time coming up with arguments against these things, maybe it's because they're fundamentally unfalsifiable, which makes them useless for trying to build any framework of understanding on.
Do you think it's a strike against these points to call them unfalsifiable? They seem no more so than any claim of technological possibility, such as "nuclear fission is possible", "fission chain reactions are possible", "nuclear weapons are possible", "nuclear fusion is possible", "nuclear fusion can be made economical", etc. Unless a technology violates laws of physics it's very hard to prove through math or experiment that something can't be done.
But the AI safety groups don't just assume these without any justification. The existence of the human brain itself is either a proof of, or a very strong launching point for, most of these. In 1930, maybe you could have convinced someone that a nuclear bomb was impossible, or at least an unfalsifiable worry, because one had never yet been built; but your reasoning is like trying to cast doubt on the possibility of artificial digestion when there are already billions of stomachs roaming the earth.
The idea that a computer can't possibly be made to accomplish whatever a brain can is losing plausibility with every passing day. And as for surpassing it: a human brain is powerful, but has so many surmountable limitations, like: requiring 20 years of education to get up to speed with existing experts; dying after 70-90 years; not being able to run copies of oneself in parallel; etc. We only need to imagine removing these limitations, and doing so violates no known scientific principles.
It's like having a 50-megaton warhead in your lab, and meanwhile the rocket scientists are gradually figuring out how to make rockets fly, and you're saying "yes this is a big bomb, but the idea that one of these could be more dangerous if mounted on a missile is an unfalsifiable assumption!"
Nuclear fission, nuclear fusion, digestion—all of these are reasonably well-understood phenomena. The physics and chemistry behind them are clear. Making nuclear fusion energy-positive and economically viable is an engineering problem.
Consciousness, intelligence, sapience—these are not well-understood phenomena. We don't know what makes us conscious. We don't know if other animals are, or to what degree. It's not even possible to determine with any scientific certainty that another human being is conscious.
As things stand, "we can build an AGI" is not a scientific statement. It is not grounded on a foundation that allows clear reasoning about it, one way or the other.
Your arguments are not incorrect; however, they do not bridge the gap to "and so we can definitely make a conscious, sapient, intelligent computer." Being able to replicate particular capabilities of the brain is not the same thing.
And, again, that's only the first unfalsifiable proposition that must be satisfied in order for the purported AI threat to be real. They also have to be capable of breaking free of our control, decide we're a threat to them for whatever reason, and have the means to carry it out.
Artificially sustained nuclear fission was, in 1923, not well understood and thought impossible; radioactive decay had been observed, but the idea of a chain reaction had not yet been conceived (it would take another 10 years). Natural nuclear fusion we know to be possible in the sun, but we don't yet know how to sustain it artificially. Digestion was not well understood in 1823; although they knew it was somehow similar in inputs and outputs to combustion, which could be imitated artificially, they did not know how to replicate digestion on the chemical level. Now we do.
Consciousness is irrelevant, as it's clearly not necessary for AGI nor even human intelligence; otherwise you wouldn't say "it's not even possible to determine with any scientific certainty that another human being is conscious."
On what grounds do you believe that there are capabilities of the brain that cannot be replicated by a computer? When I look at the 1.5 kgs of matter in a typical human brain, I don't see anything that jumps out and says "my operation is not computable!"
At least with fusion, the high temperatures and pressures are a clear barrier. Our brains don't need to be held at 3.8 trillion psi and 15 million Kelvin in order to enjoy poetry.
I can certainly argue on the other points as well (breaking free / deciding threat / acquiring means) but you need to pick your goalposts one at a time. I think the first step would be asking yourself: if you were a super-genius but being held by guards in solitary confinement on a remote island with only a supercomputer connected to the internet, how would you earn money?
Those conditions clearly would not have stopped even an ordinary Satoshi Nakamoto from gaining control of sufficient resources to hire private military contractors to arrange his escape. I'm not sure what a superhuman would do, but that's a human baseline.
> the AI safety arguments basically rely on assuming that the singularity is not merely real, but guaranteed
The argument goes like this: If you want to save the world, then doomsday scenario S is the best place for you to invest your resources if S has a nonzero probability, and currently we are investing less resources in S than in other existential risk scenarios (per basis point of probability).
"AI risk" is a pretty good candidate for S (especially back in 2010-2015 when the movement was just starting.)
> Whether AGI is possible or not, regardless of anyone's personal opinion, is as yet unprovable and unfalsifiable.
"AGI is impossible" is certainly falsifiable: All I have to do is build an AGI and show it to you.
Further, there is no theoretical reason AGI is impossible. Rather the reverse; consider these Well Settled Scientific Facts:
- (1) Human Mind = Human Brain
- (2) Human Brain obeys the laws of physics
- (3) We understand the laws of physics well enough to simulate them in a computer
If you accept these three facts, then in theory AGI is possible: You could implement AGI by building a machine implementation of the Human Brain by fully simulating the underlying physics.
> Assuming that AGI itself is possible, there is no way to tell whether we, as humans, can create an intelligence that is "smarter" than we are.
> Assuming that we can create an AGI that is "smarter" than we are, there is no way to determine whether it would be able to upgrade itself to become exponentially smarter than that, and beyond human understanding and control.
An AI safety person would say to this: "You're right, we don't know -- and that's exactly the problem."
What we can do is assign a probability based on our confidence. How likely do you think it is that we can do those things? 20%? 2%? 0.000002%?
If you say "0.000002%", what makes it so extraordinarily certain it's impossible? If you say "2%" or "20%" then as a matter of self-preservation, shouldn't our society be devoting a lot of money and smart people's time and attention to figuring out how to make sure it doesn't happen?
Most of the people who could debunk it have better things to do than to respond and give it more credibility.
As well, while the arguments are logical, most of them rely upon large assumptions to move between steps. If any of these assumptions fail, the entire thing fails. Especially the hard take off assumption.
As well, the assumption that AGI is happening in the next 3-10 years. I’d say most prominent people in the AI research space don’t think we’re much, if any, closer to AGI. Yet you have Yud and LW screaming that we will all be dead in a few years and AGI is right around the corner.
When people like Chollet and Ng say we aren’t close to AGI, I’m more likely to believe they’re right, vs. Yud who hasn’t contributed to any actual developments within the field besides theorizing about alignment and how AGI can go wrong.
Yeah, I can't really argue against the whole idea. My "P(doom)"(language from the article), at least as related to agentic AI, is pretty low. About 7 or 8 percent before I die, and I'm not really old. There's probably not a lot I could do about it anyway, given that most AI jobs are alleged to just make it worse. That said, I can't really go around telling people with different numbers they should be working on antiviral research or physical manufacturing, just because I (personally, without their massive and cohesive "evidence" (i.e. gamed out scenarios) bases) think that's more important.
I'm not one of those "all value in the universe" people, P(doom) for me does not have to be that bad. I would actually consider the inability to manufacture or unwillingness to manufacture certain things at scale in the US to count. Generally, my point is that agentic AI is not something I can really do much about, and other things have a combined P(doom) probability higher than it. The unwillingness in certain groups to use what they consider to be toxic or "forever" chemicals in the manufacture of physical products at scale that can make them world a better and less fragile place is something that could be worked in instead.
My argument against is that we can't do much about it. We can't predict how machines will interpret our rules if they are smarter than us. Nor do we have a shared set of values we can articulate. Nor can we control rogue users making rogue AI.
So unless one is proposing a global crackdown on AI research, AI safety is a lost cause.
Any uncontrolled group that creates an AI can either find a body of safety research accessible to them, or not. Preparing the former is hardly a lost cause.
AI safety reminds me a lot of regulatory capture, and I don't think it's a coincidence that its top proponents are in charge of the dominant players in the space. It's easy to preach about AI safety when you're sitting on all the GPU power.
There are many, but the biggest one is just that the possibility space is vast and unknowable, and while drawing a straight line through it to AI doom is of course possible, that doesn't say anything at all about likelihood, and it can't. It makes a ton of assumptions and expects everyone to just go along with them to reach its desired conclusion.
AI certainly has risks, I don't think any reasonable human being doubts that; it's just that the AI doom cult seems to think the worst outcomes are near certainties without really backing that up.
> AI doom cult seems to think the worst outcomes are near certainties without really backing that up
What are you basing this on? There have been tons of arguments written about why these worst outcomes are likely. Read Bostrom's Superintelligence for example, or Yudkowsky's Intelligence Explosion Microeconomics.
Many conversations with AI doomers. They gloss over and make assumptions about intelligence that aren't really backed by priors and when this is pointed out they hand wave and say "but computer".
> Read Bostrom's Superintelligence for example, or Yudkowsky's Intelligence Explosion Microeconomics.
I don't really have any interest in doing so, and if I'm honest have a particularly unfavorable read of Yudkowsky as a person based on his cultish following.
The AI safety arguments rely on something called the “orthogonality thesis”, which is a huge, unintuitive assumption. In real life, intelligence is associated with picking better goals. Generally, an entity with higher intelligence will pick different goals than a being with lower intelligence.
The orthogonality thesis is an unproven assumption that intelligence and goals are not correlated, meaning an intelligent being can pursue stupid goals. Stated like that, it’s obviously wrong and laughable. But by using complex language, EA cultists hide the ridiculous assumptions their system has so that they can maintain their feelings of superiority while gaining real power that enables them to abuse others.
No goals are intrinsically stupid. The only reason that you might think some particular goal is stupid is that it goes against your goals.
You could conceive of a super-intelligent AI that came into existence with the goal of terminating itself. That would be a "stupid goal" from our perspective, since we have the goal of self-preservation really ingrained in our brains.
But for a being for which self-termination is the absolute best thing ever, it's not stupid. It makes perfect sense, since, well, that's its goal. It doesn't care about self-preservation; it doesn't care about becoming more intelligent/rich/powerful, other than as an instrumental goal to help achieve self-termination, if it's not able to do so in its current state.
And most importantly, no amount of getting more intelligent would change this fundamental goal, just as humans getting more intelligent has not overridden our fundamental goals of "breathe, feed, have sex". It may have given us other goals as well, but those are very much still there.
Not only that, but we've managed to subvert our reward functions in exactly the way AI safety people fear. Evolution tried to get us to reproduce as much as possible, and we came up with ways to get the reward without producing offspring.
That’s a pretty naive understanding of sex. In human history sex has often played as important a role in bonding, community, and social hierarchy as it has in child production. These things are beneficial for survival. It’s very common for evolution to result in redirected drives.
I’m not sure what you mean by “stupid” but it’s definitely not the colloquial meaning. A human seeking suicide is not stupid. In fact humans often do seek suicide because humans are playing social games where the maximizing move is to commit suicide. Supporting these people requires changing their context so that they have a wider and better variety of options available.
This is just a specific example of why I reject the orthogonality thesis. You change the context, you educate the agent, you change the goals of the agent. I do not agree that humans only chase “breathe feed sex” and while I do believe many stupid behaviors do come from evolutionary history, It’s plainly obvious that education, training, and genes play a role in self restraint and goal redirection.
What do you mean by stupid? I used that word only because you used it first.
And I agree that humans do not only chase "breath feed sex", as I explicitly said that in my comment. We have other goals as well. But those are very much still there.
>In real life, intelligence is associated with picking better goals.
"Better" according to what metric?
It may be the case that there is a tendency for high-intelligence humans to pick "more enlightened" goals. Perhaps there is a natural "enlightened goals" attractor for our species.
However I don't think we can extrapolate from that to a fundamentally alien AI.
I think even if this statistical tendency exists, it has clear counterexamples -- consider that 2 genius chess players may have opposite goals, of beating one another. And we shouldn't bet the future of humanity on this statistical tendency extrapolating outside of the original distribution of human species.
Here are some intuition pumps on how diverse goals can be even across intelligent species:
As soon as you phrase goal selection in terms of metrics, you’re assuming that goal selection is based on some other goal - that is, you’re already assuming the orthogonality thesis. Your logic is fully circular.
One thing that’s interesting to note about all of the examples you picked: every single one of those species shows cooperative behaviors. They share many other behaviors that are more similar than they are different. To reject the orthogonality thesis it’s sufficient to show that there is an empirical general association between intelligence and certain goals - then we can extrapolate that an AI, although of course it will function differently and may have many unusual behaviors, will tend to follow those goals more directly. For instance, intelligence is associated with cooperation, empathy, inter- and intra-species communication, curiosity, etc. All of the species you mentioned exhibit these more than less intelligent species do. Meanwhile something like “hunting to eat” is observed across the intelligence spectrum.
I've known lots of intelligent people who pursue stupid goals.
I maybe agree with you that there's this belief that a maximally intelligent creature will blindly follow maximally obviously stupid goals, and that belief is under-argued, but your phrasing above isn't the slam-dunk argument that you seem to believe.
Remember, the term here is “associated”. If you spend a lot of time with people from diverse mental backgrounds, which is much harder to do than you might think, you’ll easily see that people who lack the education, common sense, or necessary context to act intelligently tend to pick much worse goals than people who aren’t in that position. Of course none of us humans are particularly intelligent, and all of us are stupid. Being a human is not intrinsically about intelligence - being a human is about being a social mammal. It’s funny seeing how rationalists often fail to exercise while emphasizing their own rationality - a perfect contradiction demonstrated by the difference between imagination and bodily realities.
> The orthogonality thesis is an unproven assumption that intelligence and goals are not correlated, meaning an intelligent being can pursue stupid goals. Stated like that, it’s obviously wrong and laughable.
To me it's obviously wrong when stated that way because 'stupid' is not an appropriate metric for goals. We may consider goals good or bad within our value system, but that has little to do with 'stupid' or 'intelligent.' E.g. a body builder may have a goal to get as ripped as possible, a VC to make as much money as possible, an ascetic to deny the flesh as much as possible. Which of these are stupid or intelligent goals? I don't think that's a question that makes sense.
We may consider the goals of the Athenians (to expand their power) "worse" than the goals of the Melians (to be left alone)[0], but I don't see how they were "stupider."
> The orthogonality thesis is an unproven assumption
A safety mindset would suggest that rather than disregarding it until proven true, we should worry about it until proven false.
> meaning an intelligent being can pursue stupid goals.
It would pursue very intelligent instrumental goals, but the terminal goal is a free variable and I don't think there exists any measure by which terminal goals can be considered smart or stupid. It would be whatever is implied by its programming.
> Stated like that, it’s obviously wrong and laughable.
Perhaps not so obviously wrong nor so laughable as you think?
> There’s no clear delineation in real entities between instrumental and terminal goals.
In healthy individuals, yes there absolutely is. Terminal goals are the ones you pursue for their own sake; instrumental goals are the ones you pursue as part of a plan to pursue a terminal goal, or another instrumental goal which connects to a terminal goal. Most people go to work in the morning not as a terminal goal, but as an instrumental goal; employment is in service of another instrumental goal of earning money; earning money is in service of a terminal goal of not starving to death. This is not exactly controversial stuff here. Some people do get so focused on an instrumental goal like "earning money" that they develop tunnel-vision and forget what terminal goal that money was originally in service of, but that's something most of them will eventually realize and then write a self-help book about.
Anyway, it takes intelligence to decide what your instrumental goals should be, such as whether there's perhaps a cleverer way to make money than by going to work for your boss each morning, but there's no way in which intelligence will help you choose your terminal goals. For the most part they aren't something you can even consciously choose.
Yes, you can stretch your model to try to explain why humans go to work.
In reality, people do not need to go to work to “not starve to death” as you say. There are a myriad of ways to survive without working a daily job.
Humans have to be socialized and trained to work a 9 to 5 job - there’s an entire education system structured to help create humans who view that as an acceptable goal.
No, what you are saying may not be controversial in your little community, but the AI panic is mostly isolated to a small community in a small corner of the USA.
Between "I go to the grocery store because I've been socialized and trained that going to the grocery store regularly is a Good Thing, and I have adopted it as a terminal goal to which I know I should dedicate efforts", vs "I go to the grocery store because I'm out of carrots, my dinner recipe calls for carrots, and I think I can get some there", the latter model is not the one that strikes me as being stretched to explain human behaviour.
As for there being "a myriad of ways to survive without working a daily job", congratulations! Your intelligence has allowed you to identify alternative instrumental goals that provide a path to your terminal goal; now you can rank them and choose the best option. You can also grow carrots in the garden or ask your neighbour if they have any, or ask your spouse to pick some up on the way home. Your intelligence will do the work and find a way. But your intelligence isn't what will guide you toward preferring carrot soup over parsnip soup, and preferring parsnip soup over fasting.
You’re arguing again from a position that assumes that entities have clearly defined terminal versus instrumental goals - which is precisely the position I reject. For instance, in your example of “terminal goal is groceries” versus “terminal goal is hunger”, neither of these describes how actual humans make decisions. Instead there’s a process, part biological and part environmental. The human checks the fridge, then thinks “oh, I feel hungry”. Is hunger the terminal goal or the fridge? That question doesn’t even make sense - it’s an interaction between the agent and the environment. Do Pavlov’s dogs have a terminal goal of “salivating to bells” or “salivating to food”? Again the question doesn’t make sense - the agent has built a habit within a certain environment, and the salivating is not goal-directed. That’s why training works for dogs, and putting the fridge out of sight reduces hunger in humans.
Think more carefully about the implications of multiple ways to survive here. Why do people pick one over the other? In a terminal/instrumental goal model, agents would pick the instrumental route that maximizes the return on the terminal goal. In reality we see that instead humans adopt habits, processes, and heuristics that guide them through daily life even when those do not lead to any specific goal.
Yeah, that's because re-evaluating your entire life plan and belief structure every second is expensive, and heuristics are cheap. People certainly have flaws in their thinking, which is why we fall prey to pyramid schemes, gambling, responding to pointless comment chains on HN, and so on. I don't disagree with this, and again, it's why we publish so many self-help books. But I believe it's our weakness and stupidity, not our superior intelligence and clear thinking, that traps us in bad habits.
So remind me of your original point? I believe you said it's "obviously wrong and laughable" that "an intelligent being can pursue stupid goals". Now here you are trying to convince me that humans are the ones who, like Pavlov's dogs, "pursue habits, processes, and heuristics that guide them through daily life even when those do not lead to any specific goal". Even when those habits involve repeatedly re-opening a fridge that you already know has no carrots in it, or salivating at a bell when you already know no food is coming.
So I'm confused how that proves your point about AGI. If I accept your view, it seems that if an AGI does merely no better than a human on this metric, I should anticipate all sorts of strange and irrational behaviour, including the pursuit of goals that would appear stupid, such as addiction to a reward channel. That does not seem to undermine the orthogonality thesis.
And the smarter the AGI gets, presumably the less it should lean on Pavlovian heuristics and the more it should make use of clear thought, which puts it more in my camp.
So that would apparently put the lower bound at "the AGI takes unexpected and irrational actions because it's not a rational agent and doesn't think coherently", and the upper bound at "the AGI takes unexpected and dangerous actions as rational steps toward an unaligned terminal goal".
I'm not sure where in this chain of thought it becomes laughably obvious that intelligence and goals are correlated, such that an AGI's increasing intelligence will tend it toward actions that we humans approve of, because anything else would be a "stupid goal"?
Humans + AI will always be stronger than AI alone. So why be afraid of AI? Nothing has changed, humans still have all the agency and remain the #1 tangible threat to other humans which has been the case throughout almost all of human history.
What have they actually done for AI safety? Written essays and held symposiums in grand Wytham Abbey. Attracted followers and forum posts. Where's their git repo?
"Apocalypse cults are always wrong, AI safety is an apocalypse cult, therefore it is (probably) wrong" is an argument which really only betrays the ignorance of the speaker regarding the specifics of the AI doomer argument. Why are apocalypse cults always wrong? Do the same specific reasons apply to the doomers here?
Humans are attracted to certain types of mythological stories - and there’s nothing wrong with that. It becomes a problem when the stories and reality are conflated, as is happening with AI. The idea that apocalypse cults are always wrong comes from two sources: (1) empirical evidence, every time there’s been one it’s been wrong, and (2) an understanding of human sociology: humans are susceptible to making certain types of thought errors as groups, and apocalypse cults such as the AI one demonstrate those errors everywhere.
So yeah, an apocalypse cult could be right, but it would necessarily be for the wrong reasons. Just like a broken clock that matches the current time for only 2 minutes of the full day.
I am well aware of the AI doomer argument, but you appear to assume that argumentation can be trusted just because it is convincing*. I believe you are not factoring in the possibility of epistemological error.
*for what it's worth I do not find it convincing but I am arguing "even if I found it convincing, I wouldn't believe it"
What's your approach? Do you only believe arguments that sound unconvincing? Or just ignore all arguments, and try to adopt the beliefs that the popular kids in the schoolyard talk about?
The first step would be to recognize the error bounds on epistemological error and redo one’s estimates, but the tl;dr is that one’s certainty on edge case predictions should drop by a lot.
Having low certainty on edge case predictions is fine, but it also should result in correspondingly larger updates when those edge-case predictions - the ones you previously doubted - come true. For me, that was the case with AlphaGo, AlphaFold, and now ChatGPT. In all three cases I was highly skeptical that, in my lifetime, AI would ever beat humans at Go, adapt the same architecture to problems like protein folding, or blow the Turing Test right out of the water.
I've had to update accordingly, and I'm now less confident that the barriers ahead will be any harder to break than the ones behind us.
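A toy Bayes update makes the same point from the other side; again, all of the numbers below are assumptions I've made up purely for illustration:

    # Hedged sketch: posterior on H = "AI capability gains will keep surprising us"
    # after evidence E = an edge-case prediction (e.g. superhuman Go) comes true.
    prior_h = 0.05          # assumed: I was highly skeptical beforehand
    p_e_given_h = 0.8       # assumed: if H is true, the result is likely
    p_e_given_not_h = 0.05  # assumed: if H is false, the result is a big surprise

    posterior_h = (prior_h * p_e_given_h) / (
        prior_h * p_e_given_h + (1 - prior_h) * p_e_given_not_h)
    print(round(posterior_h, 2))  # ~0.46 -- roughly a ninefold jump from the prior

The lower the prior I started with, the bigger the proportional swing when the doubted prediction lands, which is exactly why those three results forced large updates for me.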
If there had been an apocalypse cult in ancient Pompeii predicting the mountain would explode and bury everyone in ash, its members would not be here to tell us they got it right. And there easily could have been one.
This is not an argument, it is a tautology powered by the anthropic principle.
While you make a reasonable point in terms of human psychology, when you create an AI, you are leaving the realm of human psychology and dealing with something new that doesn’t have to play by the same rules. I’d argue that’s a big difference.
We’re creating a fundamentally alien intelligence from scratch and basically just crossing our fingers and hoping to hell it cares about keeping us alive.
On the other hand, we and we alone live in a time when a grapefruit-sized weapon could obliterate a reasonably sized town, AND when such weapons are increasingly within reach of non-state entities. I would imagine the chance of apocalypse rises as the number of people who have to agree in order to enact it shrinks.
Just because parts of the Effective Altruism / AI Safety community have developed aspects of a cult doesn't mean that their arguments are wrong. It just means that they're humans doing human things. It sounds like their main mistake was not realizing that the cult-like behavior of other cults was sociological - a product of the fact that they contained humans, not of their belief systems - and so they didn't take any steps to prevent their own movement from developing cult-like aspects.
That said, if you have a movement of thousands (?) of people all trying to figure out how to make AI safe for over a decade, and you still don't feel like you're any closer to that goal at the end of it, then you should probably step back and ask yourself what you're doing wrong.
There are top-level scientists warning of this apocalypse. Sam Altman, who used to run the very site you are posting on and is the CEO of OpenAI, has warned starkly in his latest interview that there is a real probability that humanity will go extinct from AI. Have you seen it? Do you think he is somehow bluffing, or is he part of the same cult?
Sam Altman isn't a scientist, but even if he were, so what? Scientists have a reputation for constantly claiming the world will end unless they get more grant funding. Fear is a great motivator and makes people feel important.
I'm sure a lot of people are basing their beliefs about AI on sci-fi interpretations.
If you're one of the folks doing this, at least acknowledge that the story would have been quite boring if it was 800 pages of "everything went well, the AI turned out really great, very helpful. Would use again"
If you're not blindly afraid of zombies you shouldn't be blindly afraid of AI. Give it thought.
If you're still scared after dismissing sci-fi bias, cool, I'm all ears.
The AI safety community has many people who were previously extremely optimistic about the sheer world-changing potential of AI, but then realized that getting AI right might be harder than getting it wrong, and that figuring out how to get it right must be a priority before a superintelligent AI is made. It's not people who just saw Terminator and made up their mind.
Can you back up those claims? I don't believe all of the nuance is contained in your comment.
I'm looking into MIRI, if that's not "the AI safety community" please do correct me.
Ever since ChatGPT started confidently saying inaccurate things, I've become very aware that humans have a much worse hit rate.
Seriously, keep an eye out for it, you'll see it everywhere. At least ChatGPT will double check if you ask if it's sure. People tend to just get annoyed when you don't blindly trust their "research" haha.
Eliezer Yudkowsky and the LessWrong forum popularized AI safety/alignment ideas. (The Effective Altruism community was originally mostly populated by people from LessWrong.) I think this article is a little awkwardly infatuated with LessWrong - I say this even as a fan - but it fits this discussion well as a picture of where they're coming at the subject from: https://unherd.com/2020/12/how-rational-have-you-been-this-y...
While MIRI is prominently connected to Yudkowsky, I wouldn't treat them as defining the AI alignment community. There are many people not involved with it who make substantive posts and discussions on LessWrong and the Alignment Foundations forum. There are other organizations too. OpenAI considers alignment important and has researchers concerned with it, though Yudkowsky argues the company doesn't do enough to prioritize it relative to the AI progress they make. Anthropic is an AI company prioritizing AI safety through interpretability research.
I think I'm at the edge of my ability with AI as I'm noticing I'm trying to argue against the usefulness of the concepts rather than the concepts themselves. At the very least I'm not smart enough to casually read these sites (lesswrong, AI alignment) at this time.
I remember feeling like this (brain CPUs pegged at 100%) trying to slog through HPMOR the first time too; in fairness, it's just too many concepts to take in in one sitting. I'll get there eventually if I keep at it, but not on my first read.
I'll consider my opinions on AI safety void for now due to lack of knowledge - I always try to jump over the first stage of competence. I'll start with the Wikipedia page for AI alignment, haha.
Thank you for your responses in any case, I'll dig into this further!
Oh, I think my last post was more about the people concerned with AI safety rather than the topic itself. If you want to get closer to the actual topic itself, this article is a surprisingly great resource: https://www.vox.com/future-perfect/2018/12/21/18126576/ai-ar...
IMO, AI Safety is to AI as HR is to Employees. You can't just learn it out of a book, and even then it's a pretty bad experience for everyone involved. The world is not a corporation, and the existence of "AI Safety" just guarantees that no "mere mortals" will ever get to touch AI.
Waiting for the day our Overlords make Graphics Card 2.0 and refuse to sell any to the plebs 'cause the overlords "know better".
You’re vastly overestimating the risks, but you’re also not crazy - humans naturally respect authority figures and follow their beliefs, which is in fact good for a healthy life.
My biggest problem with AI safety is that, simply, the problem they envisage doesn't exist yet (generally it relies, at a minimum, on the existence of "AGI"). Hence discussions about it have to make a huge number of assumptions about a whole range of aspects of the AI threat - what the AI will be capable of, what its impact will be - before getting on to what possible solutions might be relevant to preventing it. But given the first two are so undefined, the latter is pure speculation - one that is difficult to criticise directly, because any specific criticism can usually be easily deflected by adjusting any of the above assumptions without making a substantial change to the "inevitable" conclusion.
That's why it feels like an apocalypse cult to me - it's a conclusion that has little strong evidence today, stacked on top of a constantly shifting set of assumptions, allowing adherents to avoid backing their arguments with evidence.
For the foreseeable future, the real danger I see from AI is inferior, basic statistical models being jammed into production while pretending to be some sort of all-seeing, all-knowing AGI - a class of product we are not really that close to.
I don't think there is anything safe about ChatGPT regurgitating and hallucinating false information about someone's business based on real legal proceedings that it had been trained on. [0]
As long as it cannot also explain its own decisions transparently [1], there will always be a need for AI safety.
I don't have any solid arguments either, but I just get the feeling that after the first very public tragedy, like a killbot rampaging in New York City, we'll see a very, very sudden tilt in the public eye towards never wanting to touch this technology again. People are terrified of killer robots.
One of the likely outcomes of something like that is an even stronger pivot towards walled gardens and end-to-end encryption than we previously had on the Internet. Can't train a model on someone else's encrypted database without decrypting it first.