I have come to the conclusion that anyone who talks about how easy it’s going to be to simulate a human brain in a computer either understands computers but doesn’t understand biology, or doesn’t understand computers but understands biology. I’m currently studying for a Master’s in Bioinformatics, so I have an equal lack of understanding of both subjects.
The argument seems to be “the genome is like a computer program – it contains all the information needed to build a person. The genome’s only a few gigabytes long, so the Kolmogrov complexity of the ‘create a brain’ program must be less than that. We have computers that can run programs that long, so it’s only a matter of time before we can run a ‘create a brain’ program on our computers”.
Now, firstly, I simply don’t believe that one can reduce the problem in this way. Intuitively, it doesn’t make much sense. I have a little over 20GB of Beach Boys MP3s/FLACs on my hard drive. They couldn’t be compressed much more than that without loss of information. The human brain is supposed to be the most complex known object in the universe. I simply don’t believe that the most complex known object in the universe has a lower Kolmogrov complexity than the surf-pop harmony stylings of the Wilson brothers. I mean, I’ve not even counted my Jan and Dean MP3s in there!
But let’s ignore the intuitive problems, and also ignore various practical issues like epigenetic inheritance, and assume for the moment that the human genetic code is a computer program, and this 3GB (or “3/4 of the size of the directory which just contains my very favourite Beach Boys live bootlegs” in human terms) program will, if run on the correct hardware, produce a human body, including the brain. Here is where we hit the problem with the concept of Kolmogrov complexity, so freely bandied around by a lot of these people.
Basically, Kolmogrov complexity is a measure of how small the smallest computer program that can produce a given output is. For example, say we want to run a program that outputs “Hello World!” and a line break. In Perl (the language with which I’m most familiar) this would be:
print “Hello World!\n”;
That’s 39 bytes long. This means that we know the Kolmogrov complexity of a Hello, World program must be 39 bytes or less. It might be possible to do it in fewer bytes in some other programming language, but we know that any program more than 39 bytes long isn’t the shortest possible program that does that.
Now, the reason Kolmogrov complexity is a useful measure is that it doesn’t vary *much* between languages and platforms. Say you have a program written in perl, but for some reason you want to run it in Java. ‘All’ you need to do is wrap it in another program, which converts perl to Java, so if your ‘convert perl to Java’ program is, say, 1.2 megabytes (that’s the size of the /usr/bin/perl program on my GNU/Linux system, which converts perl to machine code, so that’s a reasonable size), the length of the shortest Java program to do that thing must be at most the length of the perl program plus 1.2 megabytes.
As program size gets bigger, that ‘plus 1.2 megabytes’ gets swamped by the size of the program, so Kolmogrov complexity is a very good measure of the complexity of *getting a particular computer to perform a task*.
But the problem is that it doesn’t take into account the complexity of *the hardware performing the task*, so it’s not very good when moving between vastly different types of hardware.
Take a jukebox as an example. If you want it to play, say, Good Vibrations by the Beach Boys, you look for the code for that (say, A11) and punch that into the jukebox, which executes that ‘program’. Now, that proves that the Kolmogrov complexity of the ‘play Good Vibrations’ program is a couple of bytes long.
But if I want my computer to play Good Vibrations, the simplest program that will do it is ‘playsound /media/disk/Beach\ Boys/Smiley\ Smile/Good\ Vibrations.mp3’ – that’s thirty-five times the length of the jukebox ‘program’. But that’s not all – you have to count the size of the ‘playsound’ program (15 kilobytes) and the MP3 file (3.8 megabytes). Moving our ‘program’ from the jukebox to my computer has made it several million times as long, because we’ve had to take information that was previously in hardware (the physical Beach Boys CD within the jukebox and the ability of the jukebox to play music) and convert it into software (the MP3 file and the playsound program).
Now, I never normally talk about my day job here, because I don’t want to give anyone an excuse to confuse my views with those of my employers, but it’s almost impossible not to, here. The lab in which I work produces a piece of software which allows you to run programs compiled for one kind of computer on another kind. The product I work on allows you to take programs which are compiled for computers with x86 processors (such as, almost certainly, the one you’re using to read this) that run GNU/Linux, and run them on machines which have POWER chips, which also run GNU/Linux.
Now, this program took many person-decades of work by some very, very bright people, and a huge amount of money, to develop. It’s a very complex, sophisticated piece of software. Every time even something relatively small is changed, it has to go through a huge battery of tests because something put in to make, say, Java run faster might make, for example, the Apache web server break. (This is lucky for me as it means I have a job). Even given this, it’s still not perfect – programs run slower on it than they would on an x86 box (sometimes not very much slower, but usually at least a bit slower), and there are some programs that can’t work properly with it (not many, but some). It’s astonishingly good at what it does, but what it does is, by necessity, limited. (To the extent that, for example, the programs still have to be for the same GNU/Linux distro the POWER machine is running – you can’t use it to run, for example, a Red Hat 5 program on a Red Hat 4 box).
Now, this hugely complex, sophisticated, expensive-to-create program converts from one chip type to another. But both chips are Von Neumann architectures. Both use the same peripheral devices, and the same interfaces to those devices. Both are designed by human beings. And the people writing that program have access to information about the design of both types of chip, and can test their program by running the same program on an x86 box and on a POWER box with their program and seeing what the result is.
Now, when it comes to the ‘program’ that is the genetic code, none of that’s true. In this case, the hardware+operating system is the cell in which the genetic code is embedded, plus the womb in which that cell gets embedded, the umbilical cord that brings it nutrients, the systems that keep the mother’s body temperature regulated, the hormone levels that change at different times… basically, instead of having two chips, both of which you can examine, and leaving everything else the same and trying to get your three gig program to run (which I know from experience is in itself a massive problem), you have to simulate an entire human being (or near as dammit) in software in order to run the genetic code program – which we’re running, remember, *in order to simulate part of a human being!*
And you have to do that with no access to source code, with no way of testing like-for-like (unless there are women who are lining up to be impregnated with randomly-altered genetic material to see what happens), and with the knowledge that the thing you’re creating isn’t just a computer program, but at least potentially a sentient being, so a coding error isn’t going to just cause someone to lose a day’s work because of a crash, but it’ll give this sentient being some kind of hideous genetic illness.
And what are you left with, at the end of that effort? A baby. One that can’t interact with the physical world, and that is many, many times slower than a real baby. And that will undoubtedly have bugs in (any computer program longer than about thirty lines has bugs in). And that requires huge amounts of energy to run.
I can think of more fun, more reliable ways of making a baby, should I happen to want one.
But the point is, that even that would be a phenomenal, incredible achievement. It would dwarf the Manhattan Project and the moon landings and the Human Genome Project. It would require billions in funding, thousands of people working on it, many decades of work, and several huge conceptual breakthroughs in both computer science and biology.
Which is not to say that it’s impossible. I’ve never seen a good argument against it being theoretically possible to create artificial general intelligence, and I’ve not seen any convincing ones against the possibility of uploading and emulating a particular person’s brain state. And assuming that technology continues to improve and civilisation doesn’t collapse, it may well happen one day. But people like Kurzweil arguing that the relatively small size of the genome makes it a trivial problem, one that will be solved in the next couple of decades, are like those people who drew graphs in 1950 showing that if top speeds achieved by humanity carried on increasing we’d be at the speed of light in the 1990s. The first example of that I saw was in Heinlein’s essay Pandora’s Box. The retro-futurology page has an examination of some of the other predictions Heinlein made in that essay. Suffice it to say, he didn’t do well. And Heinlein was far more intelligent and knowledgeable than Kurzweil.
And of course, hidden in that paragraph above is a huge assumption – “assuming that technology continues to improve and civilisation doesn’t collapse”. It’s that, in part, that I want to talk about in the next part of this, coming up in a few hours.