How We Know What We Know 2: Occam’s Razor
(Continues from part one)
So far we’ve examined how we form a scientific theory. What we need to know now is what makes a *good* theory – how do we choose between two theories which make the same predictions?
The answer is a principle which has been known since the fourteenth century, but which is still widely misunderstood – Occam’s Razor.
What Occam’s Razor says is that when given two competing explanations, all other things being equal, we should prefer the simpler one.
Intuitively, this makes sense – if we have two explanations of why telephones ring, one of which is “electrical pulses are sent down a wire” and the other is “electrical pulses are sent down a wire, except for my phone, which has magic invisible pixies which make a ringing noise and talk to me in the voices of my friends”, we can be pretty confident in dismissing the second explanation and thinking no more about it – it introduces additional unnecessary complexities into things.
It is important, however, to note that this only applies if the two competing hypotheses make the same predictions. If the magic pixie hypothesis also predicted, for example, that none of my friends would remember any of the phone calls I remembered having with them (because they were really with the pixies) then if that were correct we would have a good reason for preferring the more complex hypothesis over the less complex one – it would explain the additional datum. (In reality, we would need slightly more evidence than just my friends’ forgetfulness before we accepted the pixie hypothesis, but it would be a way to distinguish between the two hypotheses).
Another example – “There is a force that acts on all bodies, such that they are attracted to other bodies in proportion to the product of their masses and in inverse proportion to the square of the distance between them”. Compare to “Angels push all bodies, in such a way that they move in the same way that they would if there was a force that acted upon them, such that they were attracted to other bodies in proportion to the product of their masses and in inverse proportion to the square of the distance between them”. The two hypotheses make the same predictions, so we go with Newton’s theory of universal gravitation rather than the angel theory. If we discovered that the angels would stop pushing when we asked them very nicely by name, we would have a good reason to accept the angel hypothesis.
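To make the “same predictions” point concrete, here is a minimal Python sketch. It assumes the standard inverse-square form of Newton’s law, and the function names (and the idea of modelling the angel theory as a wrapper around the gravity one) are invented purely for illustration.

```python
# Illustrative only: both function names are made up for this sketch.
G = 6.674e-11  # gravitational constant, N m^2 / kg^2


def newton_force(m1, m2, r):
    """Attraction in newtons between two masses (kg) a distance r (m) apart."""
    return G * m1 * m2 / r ** 2


def angel_force(m1, m2, r):
    """The angel theory: angels push exactly as hard as gravity would.

    It adds an extra entity (the angels) but no new predictions.
    """
    return newton_force(m1, m2, r)


# Every experiment we can run gives the same answer under both theories,
# so no observation can separate them; only simplicity can.
experiments = [(5.0, 10.0, 2.0), (70.0, 5.972e24, 6.371e6), (1.0, 1.0, 0.5)]
same_predictions = all(newton_force(*e) == angel_force(*e) for e in experiments)
print(same_predictions)
```

If the angel theory ever returned something different from the gravity one (say, when the angels were asked nicely to stop), the two would become distinguishable by experiment, and the Razor would no longer be what decides between them.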
A third, real-life example – “life-forms evolve by competing for resources, with those best able to gain resources surviving to reproduce. Over many millions of years, this competition gives rise to the vast diversity of life-forms we see around us.” versus “God made every life form distinctly, just over six thousand years ago, and planted fake evidence to make it look like life forms evolve by competing for resources, with those best able to gain resources surviving to reproduce and giving rise to the vast diversity of life-forms we see around us, in order to test our faith.”
Any possible piece of evidence for the first hypothesis is a piece of evidence for the second, and vice versa. Under those circumstances, we need to discard the second hypothesis. (Note that in doing so we are not discarding the God hypothesis altogether – this comparison says nothing about the God or gods believed in by intelligent religious people such as, say, Andrew Rilstone or Fred Clark, though of course there may well be equally good arguments against those deities. But it does give us more-than-ample reason to dismiss without further thought the vicious, evil deities worshipped by Tim LaHaye or Fred Phelps.)
But hang on, doesn’t it work the other way, too? Can’t we say “that big long explanation about masses and distances is far more complicated than just saying ‘angels did it’, so we should just say that”?
Well, no… remember what we’re trying to do is find the simplest explanation for a phenomenon. If you accept gravity as an explanation, that’s a single explanation for everything. If you use the angel explanation, you have to ask about every apparent act of gravity “Why did that happen?” and get the answer “angel number forty-nine trillion decided to push that molecule in that direction” – you’re just shifting all the complexity into the word ‘angel’, not getting rid of it.
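One rough way to see where the complexity goes is to write both explanations out in full and count characters. The Python sketch below does that. The length of a written description is only a crude stand-in for complexity, and the wording of the rule and of the per-event entries is invented for the example.

```python
# A rough sketch: the length of the written-out explanation stands in for
# its "complexity". All wording here is invented for illustration.

def gravity_description(events):
    """One rule covers every event, however many there are."""
    return "every body attracts every other body: F = G*m1*m2/r**2"


def angel_description(events):
    """The angel story needs a separate entry for every single event."""
    return "; ".join(
        f"angel number {i} decided to push {body} {direction}"
        for i, (body, direction) in enumerate(events, start=1)
    )


events = [("apple", "down"), ("moon", "toward Earth"), ("rain", "down")] * 100

print(len(gravity_description(events)))  # stays the same as events pile up
print(len(angel_description(events)))    # grows with every new event
```

“Angels did it” only looks short because the list of individual angelic decisions has been left unwritten.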
So the question now is what do we mean by ‘explanation’? After all, nothing is ever ultimately explained. We ask why things fall to the ground, we get ‘because gravity’. We ask why does gravity exist, and after a few centuries we discover it’s because mass warps space-time. We ask why that happens… and so far answer came there none. Ultimately with *any* question you can keep asking ‘why?’ and at some point we hit the boundaries of what is explicable. Does this mean that there’s no such thing as an explanation?
Clearly it doesn’t – we have an intuitive understanding of what the word ‘explanation’ means – but how can we formalise that understanding in a way that allows us to discuss it properly?
I would suggest this as a rough definition – something counts as an explanation if it is the answer to two separate questions.
By which I mean, if the force of gravity were *only* the answer to the question “why do things fall down?” then it would be no answer at all, really – it’s just shifting the problem across. “Things fall because there is a force of things-fallingness” sounds like an explanation to many people, but it doesn’t actually tell you anything new.
However, gravity is *also* the answer to the question “why do planets go in elliptical orbits around the sun?” – two apparently unrelated facts, things falling and planets going in orbit, can be explained by the same principle.
This kind of explanation can happen in all the sciences – and explanations can even cross sciences. Take cancer as an example. There are several diseases that we call cancer (lung cancer is not the same disease as leukaemia is not the same disease as a brain tumour), and they all have the same explanation – a cell starts replicating too much, and the replicated cells themselves also reproduce too fast. They compete for resources with the normal cells, and eventually starve them out, because they can reproduce faster. That explanation works for all the different diseases we call cancer, whatever their outcomes, and whatever their original cause.
But that explanation can then even be taken off into other fields. I once worked for a company that wasn’t making very many sales, and had the sales people on a salary, not just commission. They took on more sales staff, because they weren’t making very many sales – but the new sales staff didn’t make enough more sales to justify their salaries. So they took on more sales staff, because they weren’t making very many sales…
I realised, just looking at the organisation, that the sales department had literally become a cancer in the business. It was draining the business’ resources and using them to grow itself at a frightening rate while the rest of the business was being starved. I quit that job, and within six months the company had been wound up.
That’s the power of a really good explanation – it will be applicable to multiple situations, and tell you what is happening in all of them. The explanation “parts of a system that take resources from the rest of the system to grow at a rapid rate without providing resources back to the rest of the system will eventually cause the system to collapse” works equally well for biological systems and for companies. That principle is a powerful explanation, and it’s the simplest one that will make those predictions.
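That principle is simple enough to try out as a toy model. The Python sketch below is only an illustration under made-up numbers: the function name, the parameters and the monthly figures are all assumptions, and it doesn’t describe any real company or tumour.

```python
# Toy model with made-up numbers: a system holds some resources. The
# productive part brings in a fixed income each month; the draining part
# costs more every month because it keeps growing.

def months_until_collapse(resources, income, drain, growth, limit=1000):
    """Return the month in which resources run out, or None if they
    never do within `limit` months."""
    for month in range(1, limit + 1):
        resources += income - drain
        if resources <= 0:
            return month
        drain *= growth  # the draining part uses what it takes to grow itself
    return None


# A drain that stays the same size is sustainable...
print(months_until_collapse(resources=100.0, income=10.0, drain=5.0, growth=1.0))
# ...but one that grows by half every month is not.
print(months_until_collapse(resources=100.0, income=10.0, drain=5.0, growth=1.5))
```

Nothing in the function says whether `resources` means cash or nutrients, which is the point: the same explanation serves for both.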
So now we have the two most important tools of empiricism, the basis of science – we have the concept of the simplest explanation that fits the facts, and we have the idea of feedback. Those two are all you *need* for you to be doing science – and we’ll come back to both of them later, when we talk about Bayes’ Theorem, Solomonoff Induction and Kolmogorov Complexity – but if those are your only tools it’ll take you a while to get anywhere. We also need to be able to think rigorously about our results, and the best tool we have for that is mathematics. Next, we’ll look at proof by contradiction, the oldest tool for rigorous mathematical thinking that we know of.
How We Know What We Know: 1 – Feedback
One of the reasons I’ve started this series of posts is because I have a huge respect for the scientific method – in fact, I’d go so far as to say that I think the scientific method is the only means we have of actually knowing anything about the world, or indeed anything at all – but I think that even many other people who claim to believe science to be important don’t fully understand how it works. I also think that many of the people who do know how the scientific method works are not fully aware of the implications of this.
This is not to say, of course, that I am an authority or an expert – in fact, questioning authority and experts is one of the things that defines the scientific method – but it does mean that I’ve thought about this stuff a lot, and might have something worthwhile to say.
To start with, let’s look at what the scientific method isn’t. When I talk about the scientific method here I’m talking about what is, in effect, a Platonic ideal version of science. Science as it is actually practiced has all sorts of baggage that comes with being a human being, or with working in a university environment. Try and imagine here that I am talking about the things that a hypothetical alien race’s science would have in common with ours.
The most important thing for us to note as being unnecessary for science is peer review. That’s not to say peer review is a bad thing – in fact it can be a very good thing, a way to separate out crackpottery from real science, and more importantly a way to discover what your embarrassing mistakes are before you become committed to believing in them – but it’s not necessary for doing science. That can be shown rather easily by the fact that neither Newton’s Principia nor Darwin’s On The Origin Of Species was peer-reviewed, but it would be hard to argue that Newton and Darwin weren’t scientists.
More importantly, there’s some evidence that peer review actually doesn’t do any better at telling good science from bad than choosing at random. I have some problems with the methodology of that study (I think meta-analyses are, if anything, actively bad science rather than just being neutral as peer review is), but other studies have shown that in fact the majority of published studies in peer-reviewed journals are likely to be false.
So if I’m not talking about science-as-it-is-practiced, with all its flaws and human errors, what am I talking about? What is the core of the scientific method?
Well, the first, and most important, part is feedback.
Feedback may be the single most important concept in science – so much so that it’s been reinvented under different names in several different disciplines. Feedback is the name it’s given in cybernetics – the science of control systems, which is what I’m most familiar with – and in information theory and engineering. In computer programming it’s known as recursion. In biology it’s known as evolution by natural selection. And in mathematics it’s called iteration. All of these are the same concept.
Feedback is what happens when the output of a system is used as one of the inputs (or the only input) of that system. So musicians will know that if you prop an electric guitar up against an amp, or have your microphone too near a speaker, you quickly get a high-pitched whining tone. That’s because the tone from the speaker is going into the guitar’s pickups, or into the mic, in such a way that the low frequencies cancel out while the high frequencies add up. The sound goes straight out of the speaker and back into the pickup or mic, and can quickly become overwhelmingly loud.
That’s what we call ‘positive feedback’. Positive feedback leads to exponential growth very quickly – in fact it’s pretty much always the cause of exponential growth. We can see how easily this happens using a computer program:
#!/usr/bin/perl
use strict;
use warnings;

my $myNumber = 2;
# This says that as long as $myNumber is greater than 0 (which it always
# is) the program should print it to the screen and then multiply it by
# itself.
while ( $myNumber > 0 ) {
    print $myNumber . " ";
    $myNumber *= $myNumber;
}
This program starts with the number two, multiplies it by itself, and then takes the number it gets and uses that as its input, multiplying it by itself. When I ran this program on my computer, the numbers got so big that the computer couldn’t cope with them before I had a chance to blink – it just kept saying the answer was infinity. The first few outputs, though, were 2, 4, 16, 256, 65536, 4294967296, 1.84467440737096 x 10^19. That last number is roughly a two with nineteen noughts following it, for those of you who don’t know exponential notation.
So positive feedback can make things change a huge amount very, very quickly. So what does negative feedback do?
Negative feedback does the opposite, of course, which means that it keeps things the same. The easiest example of negative feedback at work I can think of is a thermostat. A thermostat is set for a temperature – say eighteen degrees – and controls a heating and a cooling device. When the temperature hits nineteen degrees, it turns the heater off and the cooler on, and when it hits seventeen it turns the cooler off and the heater on. Again, the output (the temperature) is being used as the input, but this time the output does the opposite of what the input is doing – if the input moves up the output moves down – and so it keeps it steady.
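Here is a minimal Python sketch of that thermostat, for anyone who wants to watch the temperature settle. Only the seventeen, eighteen and nineteen degree figures come from the description above. The half-degree heating and cooling steps, the `outside_drift` parameter and the function name are assumptions made up for the example.

```python
# Sketch of the thermostat described above. The drift and step sizes are
# invented; only the 17 and 19 degree switching points come from the text.

def run_thermostat(temperature, outside_drift, steps):
    """Simulate a thermostat that heats at 17 or below and cools at 19 or above."""
    mode = "off"
    history = []
    for _ in range(steps):
        # The output (the temperature) is fed back in as the input.
        if temperature >= 19:
            mode = "cooling"
        elif temperature <= 17:
            mode = "heating"
        # The device pushes against the direction the temperature has gone.
        if mode == "heating":
            temperature += 0.5
        elif mode == "cooling":
            temperature -= 0.5
        temperature += outside_drift  # the weather pushing the room off target
        history.append(temperature)
    return history


history = run_thermostat(temperature=25.0, outside_drift=0.1, steps=200)
print(min(history[50:]), max(history[50:]))
```

Wherever the room starts, once the thermostat has caught up the temperature just wobbles in a narrow band around eighteen degrees, which is the “keeps things the same” behaviour in action.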
Negative feedback is used in all control systems, because negative feedback looks just like an intelligence trying to find a particular goal. That’s because it is how intelligent agents (like people) try to get to their goals.
Imagine you’re driving a car – the input is what you see through the windscreen, while the output is the way your hands turn the steering wheel. You want to go in a straight line, but you see that the car is veering to the left – as a result, you turn the steering wheel slightly to the right. If it veers to the right, you turn the steering wheel to the left. If you’re a good driver, this feedback becomes almost automatic and you do this in a series of almost imperceptible adjustments. (If you’re me, you veer wildly all over the road and your driving instructor quits in fear for his life).
So what happens when you put positive and negative feedback together? The answer is you get evolution by natural selection.
A lot of people, for some reason, seem to have difficulty grasping the idea of evolution (and not just religious fundamentalists, either). Evolution by natural selection is actually a stunningly simple idea – if you get something that copies itself (like an amoeba, or a plant, or a person), eventually you’ll get tons of copies of it all over the place – positive feedback. But things that copy themselves need resources – like food and water – in order to make more copies. If there aren’t enough resources for everything, then some of them will die (negative feedback from the environment – the environment ‘saying’ “OK, we’ve got enough of you little replicators now”).
Only the ones that live will be able to make more copies of themselves, so if some of the copies are slightly different (giraffes with longer necks, or people who are clever enough to avoid being eaten by sabre-toothed tigers), the ones whose differences help them live longest will make the most copies.
And those differences will then be used as the starting point for the next rounds of feedback, both positive and negative – so the differences get amplified very quickly when they’re useful, and die off very quickly when they’re useless, so you soon end up with giraffes whose necks are taller than my house, and humans who can invent quantum physics and write Finnegans Wake, within what is, from the point of view of the universe, the blink of an eye.
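The whole process fits in a few lines of Python. This is a deliberately crude sketch: each replicator is reduced to a single number (think of it as neck length), parents are allowed to survive alongside their offspring to keep things simple, and every figure and name in it is invented for illustration.

```python
import random

# Toy replicators: each one is just a number (say, neck length). All the
# figures here are invented for illustration.

def evolve(generations, capacity=100, seed=1):
    """Return the average trait after the given number of generations."""
    rng = random.Random(seed)      # seeded so the run is repeatable
    population = [1.0] * capacity  # everyone starts out identical
    for _ in range(generations):
        # Positive feedback: every replicator makes two slightly varied copies.
        offspring = [
            trait + rng.uniform(-0.1, 0.1)
            for trait in population
            for _copy in range(2)
        ]
        # Negative feedback: the environment can only feed `capacity` of them,
        # and in this toy world a longer neck is what reaches the food.
        population = sorted(population + offspring)[-capacity:]
    return sum(population) / len(population)


print(evolve(generations=0))   # average trait before any selection
print(evolve(generations=50))  # average trait after fifty rounds
```

Note that nothing in the code ever tries to make a neck longer. The variation is random in both directions, and the only thing that gives it a direction is which copies get fed.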
But what has that to do with the scientific method?
Everything – in fact, in essence, it is the scientific method.
To do science, you need to do three – and only three – things. You need to have a hypothesis, perform an experiment to test that hypothesis, and revise your hypothesis in accordance with the result. It’s a process exactly like that of natural selection.
In particular, for science we want negative feedback – we desperately want to prove ourselves wrong. We come up with a hypothesis – let’s say “All things fall to the ground, except computer monitors, which float”. We now want to see if our hypothesis will survive, just like our giraffes or people did. So we want negative feedback. So we have to ask what test will prove us wrong?
What we don’t want is a test that seems to confirm our hypothesis – that’s boring. We got our hypothesis from looking at the world – maybe I dropped a cup on the floor and it broke (that’s where positive feedback from the environment comes in – we need something from the environment to start the ball rolling). So we don’t want to run a test where we already know the answer – we’re not trying to prove to ourselves that we’re right. So we don’t try dropping another cup.
A test that might go wrong there is dropping a computer monitor. If we try that, we discover that our initial hypothesis was wrong – computer monitors don’t float. So we revise our hypothesis – maybe to “All things fall to the ground, and if you put your foot under a monitor when you drop it, it really hurts” – and then we test the new hypothesis.
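The loop of hypothesis, test and revision can be sketched in Python too. In the sketch below the `world` dictionary stands in for actually running an experiment, and the function names and the list of objects are invented for the example.

```python
# A sketch of the hypothesise / test / revise loop. The `world` dictionary
# stands in for running a real experiment; all names are illustrative.

world = {"cup": "falls", "monitor": "falls", "feather": "falls"}


def hypothesis_v1(thing):
    """All things fall to the ground, except computer monitors, which float."""
    return "floats" if thing == "monitor" else "falls"


def hypothesis_v2(thing):
    """Revised after the monitor experiment: all things fall to the ground."""
    return "falls"


def find_counterexample(hypothesis, things_to_test):
    """Look for a test the hypothesis fails, rather than one it passes."""
    for thing in things_to_test:
        if hypothesis(thing) != world[thing]:
            return thing
    return None


print(find_counterexample(hypothesis_v1, ["cup"]))             # tells us nothing new
print(find_counterexample(hypothesis_v1, ["cup", "monitor"]))  # the test that can fail
print(find_counterexample(hypothesis_v2, list(world)))         # survives, for now
```

Dropping another cup can never return anything but `None` for the first hypothesis, which is exactly why it’s a boring test. Only the monitor gives the hypothesis a chance to be wrong.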
When your hypothesis matches experiment time and again – when everything you or anyone else can think to throw at it, that might prove it wrong, matches what your hypothesis says – then you’ve got a theory you can use to make predictions. You’ve suddenly got the ability to predict the future! That’s pretty impressive, for something that is, in essence, no different from what my guitar does when leaned against an amp.
You can also use it to ‘predict’ the past, in the same way – which is why things like paleontology are sciences, and why social sciences like history are called social sciences rather than arts. You can do the same thing there, except that the experiments involve looking for things that have already happened but you don’t know, rather than trying new things and seeing what happened. You might, for example, come up with the hypothesis “Tyrannosaurus Rex was actually a vegetarian.” Using that hypothesis, you’d make various predictions – that if you looked at a T. Rex skull it would have lots of flat teeth, suitable for grinding vegetation, for example. Then you’d go and look at the skull, and examine the teeth, and see that in fact it had tons of razor-sharp teeth suitable for ripping flesh, and revise your hypothesis, maybe coming up with “Tyrannosaurus Rex was actually not a vegetarian.”
(Apologies to my friends Mike and Debi, whose field I have grossly oversimplified there).
This is the big difference between scientists and other groups – like conspiracy theorists or a sadly-large number of politicians. Conspiracy theorists go looking for evidence that confirms their ‘theories’, and they find it. You can always find confirmation of anything, if you’re willing to ignore enough negative evidence. If you go looking for evidence that you’re wrong – and you do so sincerely, and invite others to aid you in your search – and you don’t find it, you’re probably right.
Next week – how to choose between alternative theories.
Linkblogging for 27/06/09
Just a quick one today as I’m visiting my parents…
Jess Nevins has the best piece I’ve read on the death of Michael Jackson, treating Jackson’s life as a Gothic text on which to perform literary analysis.
Patrick at Lib Dem Voice is calling for a repeal of section 141 of the Mental Health Act, which states that any MP who gets sectioned will be removed from their seat and not returned, no matter how brief their illness. This is something with which I absolutely agree – there is no reason why someone treated for, say, depression, can’t be an entirely productive MP later on.
The Mail are misogynist arseholes, film at eleven.
J.H. Williams and Todd Klein have collaborated on a print of the section of The Morte d’Arthur where Arthur pulls the sword from the stone. I own two of Klein’s earlier prints, the collaborations with Alan Moore and Neil Gaiman, and they’re really very good indeed. I’ll probably buy this one to go with the others.
microRNA appears to target cancer cells specifically and trigger apoptosis. Very promising, but the actual paper cited is behind a pay-wall.
And Jon Morris is putting up MP3s of some old 78s he’s found.
Hat And Beard 1 – Chuck D
As you probably know, today is the 200th anniversary of the birth of two of the greatest people who ever lived, Abraham Lincoln and Charles Darwin, and given that I write here about politics and science it seemed absurd not to mark this (those waiting for the last Final Crisis post will have to wait a bit more – I was going to post yesterday but my home net access was FUBAR, and I’m writing these two today. But I’ll have a very long, special post for you on Saturday). For Lanky Linc I’m just going to talk about freedom generally, rather than Lincoln’s own achievements specifically (though I’m sure this will disappoint my friend Tilt, the one person I know who has a recording of the Gettysburg Address on his MP3 player) as I think everyone reading this will be broadly in agreement that slavery was a bad thing. However, it is likely that people reading this will *not* know some things about Darwin that are worth knowing, so here’s a rough guide to some misconceptions about Darwin, and to what he *actually* did:
Misconception 1 – Darwin came up with the idea of evolution
As a matter of fact, the idea of evolution had been current in biology long before Darwin, most famously in the work of Jean-Baptiste Lamarck, and was supported by such notable biologists of the time as Buffon and Darwin’s grandfather, Erasmus Darwin. However, the reason Lamarck is considered to have been mistaken and Buffon and Erasmus Darwin’s names now survive only as footnotes to literature of the time (Buffon mentioned as being current in Shaw’s boyhood in the Lamarckian preface to Back To Methuselah and Erasmus Darwin’s experiments with vermicelli being an inspiration for Frankenstein) is that they hadn’t come up with a satisfactory explanation of how or why evolution happens; they had ‘just’ observed that it did. Their best guess (and it was a reasonable one) was that, for example, a proto-giraffe wanted to reach higher leaves, so stretched its neck out until it could, then passed that stretchy neck on to its descendants. This idea, of evolution having a purpose or being directed by a mind toward a goal, still seems very popular among the public (it resurfaces, for example, in Terry Nation’s Dalek stories and Grant Morrison’s X-Men run) but it’s almost universally regarded as wrong by the scientific community (except for a couple of people on the margins like Rupert Sheldrake, who is also almost universally regarded as wrong by the scientific community).
What Darwin came up with was the idea of natural selection.
Misconception 2 – Darwin’s work had something to do with genes
While genetics provides a lot of modern support for Darwin’s ideas, and a lot of popularisers like Richard Dawkins now explain his ideas in those terms, the science of genetics developed later than evolutionary biology, and to start with relatively independently. The ideas in genetics were originally the work of the Austrian monk Gregor Mendel, and the molecular basis of modern genetics was mostly worked out by Rosalind Franklin based on earlier work by Linus Pauling (then Crick & Watson put the finishing touches on Franklin’s work and took the credit for themselves). Genetics and evolutionary theory reinforce each other, but they developed independently.
Misconception 3 – Evolution is ‘just a theory’, it’s not been proved
This is literally true, but it’s also a misunderstanding of how science works. Contrary to media reports, there is no such thing as ‘scientific proof’ and nothing has ever been ‘proved’ scientifically – everything in science is contingent, and subject to change if new evidence comes in. That, more than anything else, is what makes science science.
However, the word ‘theory’ has a rather more technical meaning in science than it does in vernacular English. When speaking casually, we can say “I have a theory about that…” meaning just “I have an idea”. In science, on the other hand, that would be called a conjecture or (at best) a hypothesis. A theory is an idea which explains things that have been observed, that contains testable predictions, and that has been tested multiple times and found to be correct every time. It could still be wrong, but once something’s called a theory it’s extremely unlikely to be wrong, because it agrees with every test we can put it to. The mass of evidence for the idea of evolution is large enough that we can safely say we’re as sure that evolution happens as we are of anything. We might still improve some of the fine details, just as happens with the theory of (say) gravity, but just as we know that if we drop a heavy weight it’ll fall, even if we turn out to be wrong about the twentieth digit of the gravitational constant, we know that humanity came from an apelike ancestor which came from a monkeylike ancestor which came from a shrewlike ancestor and so on.
So what did Darwin do? Well, two things.
Firstly, and least importantly, he provided a greater mass of evidence for the idea of evolution than anyone had ever done before, and all in one place. On The Origin Of Species is a great book, but it’s also almost unreadable, and for much the same reasons as those other two great unreadable books of the 19th century, Capital and The Golden Bough – the sheer, unrelenting mass of detailed evidence he provides is enough to convince you very early on, but then he goes on, and on, and on and on and on, shooting down every possible objection and listing twenty-five bits of evidence for almost every sentence. By the time you get a third of the way through you will know more about the slightly-interestingly-shaped beaks of different species of finch than you ever thought possible.
This makes it sound like a bad book, but in fact it was a necessary book – if you want to convince people of a revolutionary idea, you have to overwhelm them with evidence, and Darwin spent literally decades of his life collating the evidence to prove his point.
But the most important thing he did (along with Alfred Russel Wallace, who came up with the idea independently when Darwin had nearly finished writing his book) was to come up with the idea of natural selection.
Like most great revolutionary new ideas, this was made up of a couple of old ideas that no-one else had ever thought of putting together before (this is not sarcasm – that is how most geniuses work). Darwin took Lamarckian evolution and added to it the idea, originally developed by Thomas Malthus, of competition for scarce resources. What he came up with was brilliant in its simplicity.
To take the example of the giraffe, used above when talking about Lamarck, imagine you have a load of horselike animals living in an environment where there’s not much grass, but there are plenty of bushes and trees. The horselike animals breed rapidly while living off the bushes, and when they breed some are naturally born with longer necks than others – not through any deliberate stretching, but just through normal variation in the same way some people are born with lighter or darker hair. And just like the colour of hair, their children will tend to inherit the slightly longer necks.
So eventually, so many horselike creatures are born that they eat all the bushes, and they start starving. However, the ones with the slightly longer necks can reach the lower leaves on the trees, and eat them and survive and breed. Eventually you have a population of longer-necked horselike creatures. They have then eaten so many of the lower leaves that only those who can reach the higher leaves will survive. Repeat this over many generations, and you have a giraffe.
And this simple process can explain, to the best of our current knowledge, all the near-infinite variety of lifeforms in the world, from the AIDS virus to the peacock to the plum tree to the mountain gorilla. All you need, to get to all that, is some stuff that breeds and eats, and not quite enough stuff for it to eat. Then wait a few hundred million years, and voilà! You get a species of ape that seems destined to destroy it all…
That is what Darwin explained, and that is why Darwin is one of the handful of most important people who ever lived.

