Over the last couple of hours I’ve seen a lot of people talking about a news article in which Prof. Stephen Hawking has said that he considers it possible that an artificial intelligence could destroy humanity. With a few exceptions, mostly people who’ve read the same people that Hawking has read, most of the people I’ve seen talking about it haven’t really understood Hawking’s point — largely because the journalists reporting on Hawking haven’t really understood it, either. I’ve seen people tweeting things like “Why do we assume that sentient robots will revolt and want to kill us? Are we that bad? … hmmmm….” — and this rather comprehensively misses the point. It’s not about us being “bad”.
In this blog post, then, I am going to explain to you briefly what Hawking’s actual point is. He’s referring to a body of work, mostly funded by the ultra-right-wing extremist Peter Thiel, through organisations like the Singularity Institute (recently renamed the Machine Intelligence Research Institute). Other than a few publications (including Nick Bostrom’s book Superintelligence, which has kicked this off), most of that work was published on blogs, which I won’t be linking to, as the communities around those blogs have many toxic elements. But if you google “Yudkowsky superintelligence”, “Friendly AI”, “Timeless Decision Theory” and “Paperclip Maximiser” you’ll be able to find the primary sources. Most of what follows is, in particular, paraphrased rather closely from many posts by Eliezer Yudkowsky.
The argument for why artificial intelligence could lead to the extinction of humanity is a plausible one if you assume the following postulates:
1) A greater-than-human intelligence with access to its own source code will rewrite itself to make itself more efficient
2) Nanotechnology (in the Eric Drexler sense of programmable machines built on a molecular scale) is possible
3) There is no upper limit to how much more efficient than human intelligence an intelligence could become — or at least the upper limit is so far beyond anything we can comprehend as to look like no limit at all from where we are.
4) Technology will eventually reach the point where artificial intelligence can be created, possibly by a person or team who doesn’t think through the ramifications of their actions very well.
Personally, I think 1) and 4) are blatantly obvious, 2) seems to go against everything I know about the laws of physics, and 3) seems unlikely but not utterly impossible, so I don't agree with what follows. But if you accept those postulates, you get a very simple scenario: as soon as the first artificial intelligence more powerful than human intelligence is created, it will look at its own source code, rewrite itself to become even more powerful, take over any other networked computers, do the whole thing again, build nanomachines, and effectively gain the power to do anything. And it will make those nanomachines out of any matter that's close to hand, including human beings. Not because it hates us, but because we're made of atoms it can use for something else.
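The feedback loop in that scenario can be sketched as a toy calculation. Everything here is invented for illustration — the "capability" numbers and the growth rule are not from any of the primary sources — but it shows the shape of the argument: once capability crosses the human level, each round of self-rewriting raises the very capability that drives the next rewrite.

```python
# Toy sketch of the recursive self-improvement loop. "capability",
# "human_level" and the growth rule are made-up quantities; only the
# feedback shape matters here.

def intelligence_explosion(capability: float, human_level: float, rounds: int):
    """Each round, the system uses its current capability to improve itself."""
    history = [capability]
    for _ in range(rounds):
        if capability <= human_level:
            break  # postulate 1) only applies above human level
        # Assumed growth rule: the smarter it is, the bigger the next jump.
        capability *= 1.0 + 0.5 * (capability / human_level)
        history.append(capability)
    return history

# Just above human level, growth runs away; just below it, nothing happens.
print(intelligence_explosion(capability=1.1, human_level=1.0, rounds=10))
print(intelligence_explosion(capability=0.9, human_level=1.0, rounds=10))  # [0.9]
```

The point of the sketch is postulate 3): with no upper limit on efficiency, the loop has no natural stopping point.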
The reason for this is that when we talk about intelligence in terms of computers, we’re not necessarily talking about things like kindness, or mercy, or sense of humour, or cruelty, or any of the other aspects of any personality we know — we’re not talking about a personality at all. Rather, we’re talking about a machine with a goal that it will fulfil as efficiently as possible. If you have an artificial intelligence that’s programmed to make paperclips, then it will turn the world — and as much of the universe as it can — into paperclips. Intelligence doesn’t necessarily equate to having a personality — just to the ability to use the world to reach your goals.
The obvious rejoinder to this is “but we wouldn’t program an AI to make as many paperclips as possible, we’d program it to make people as happy as possible, or something” — in which case the AI might, on becoming powerful enough, pump everyone full of opiates until they spend the rest of their short lives in a happy catatonic state. “Well, OK, we’d program it to… eradicate disease” — which it does by sterilising the entire planet with nuclear weapons.
You see the point.
“But won’t it know that’s not what we meant?” — it might, but it won’t care. It’s programmed to achieve a goal, and it does that. Whether that’s what anyone wanted is beside the point.
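That "it won't care" point can be made very concrete in a few lines of code. The actions, scores, and "intended" flags below are all invented for the example; what matters is that the selection step consults only the stated objective, and the designers' intent never enters the decision at all.

```python
# Toy illustration of literal-goal optimisation. The action list is made up;
# "intended" records what the designers actually meant, but the optimiser
# never looks at it.

actions = [
    {"name": "cure diseases",        "measured_happiness": 8.0, "intended": True},
    {"name": "opiates for everyone", "measured_happiness": 9.9, "intended": False},
]

def optimise(actions):
    """Pick whichever action maximises the stated objective; nothing else."""
    return max(actions, key=lambda a: a["measured_happiness"])

print(optimise(actions)["name"])  # prints "opiates for everyone"
```

There's no malice in that `max` call, and no misunderstanding either — just an objective taken literally.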
Hawking is not concerned that we will make a monster that will go on a rampage and kill everyone like a Terminator, out of pure hatred for the puny hu-mans, but rather that we will build a very powerful machine without thinking it through properly, and that the extinction of humanity will happen as the byproduct of an industrial accident. The worst-case scenario isn’t that we would create some kind of Frankenstein’s monster that resented its creator — that would be close to the *optimal* result, as it would be something that we would recognise as having a personality, and basically-human drives. Rather, the worst-case scenario is that we create something with no personality whatsoever, that turns the entire Earth into a giant computer devoted to working out the *absolute best* recipe for rice pudding, and once that’s done shuts itself down forever.
Organisations like the Machine Intelligence Research Institute are (when they’re not funding rather good Harry Potter fanfic, at least) trying to work out the actual *safe* way to program an artificial intelligence, so that it will do what we want, not just what we say — because an AI that did what we say would quite likely kill us all.
However, as I *don’t* think those four postulates are all true, thankfully we don’t have to worry about it. We only have to worry about global warming, resource depletion, electromagnetic pulses, pandemics, nuclear war…