I’ve just posted a question at Mathoverflow challenging people to come up with examples of proofs that appear to require a sort of “cognitive leap” that humans are surprisingly good at and that it is hard to imagine computers ever being capable of. I don’t myself believe that there is some fundamental way in which humans are better at mathematics than computers are (or rather, could ever be). So this is the first of what may turn into a series, if I have time, of posts in which I shall try to demonstrate that examples that have been suggested are of proofs that a computer program could in principle find too. (Many of the examples suggested are in areas where I have absolutely no hope of doing this: I’m not going to be able to tell you how a computer could have invented Thom’s work on cobordism, for instance.)

In this post I want to tackle a proof that was a pretty radical departure when it first came out: Cantor’s indirect proof of the existence of transcendental numbers.

An initial problem is of course to specify what the computer can be assumed to know already. I suppose the short answer is that it can know the mathematics that Cantor knew before he came up with his argument. Should that include the harmonic analysis that he worked on before he turned to set theory? Perhaps, but I won’t make use of that in this account. However, what we *can* assume is basic facts about the real numbers and sequences, such as the convergence of bounded monotone sequences, the Bolzano-Weierstrass theorem, the nested-intervals property, and so on. And of course, the major thing that we *cannot* assume is anything that remotely smells of the definition of countability.

It’s very hard to get the proof (by which I mean the one that uses nested intervals, which was Cantor’s first proof) out of one’s head enough to think about this. I feel a strong temptation to give the usual steps and explain why they are sensible — but that is justifying with hindsight, which is easy.

Now one idea that was already around, since it was what Liouville did, was that you could produce a transcendental number as the limit of a sequence. In fact, and this is something that’s only just occurred to me, you can even think of Liouville’s argument as a sort of unnecessarily complicated diagonal argument. What do I mean by this? Well, he shows that for any height (by which I mean largest modulus of a coefficient of the minimal polynomial), any degree and any closed interval of positive length there is a closed subinterval of positive length that contains no algebraic numbers of at most that height and degree. His number is just an explicit number that lies in the intersection of some intervals that between them rule out all algebraic numbers. So the differences between Liouville’s argument and Cantor’s argument are (i) that Liouville rules out whole sets of algebraic numbers all at once, whereas Cantor picks them off one by one, and, much less interestingly, (ii) Liouville actually chooses an explicit collection of intervals that does the job, when it’s obvious that there are all sorts of numbers that are transcendental for the same reason.

Let’s suppose that Cantor was thinking hard about Liouville’s argument. (This has switched from “How could a computer do it?” to “How could a human do it?” but I regard these two questions as essentially the same, as long as you don’t appeal to moments of genius, flashes of inspiration, etc.) He would have noticed (perhaps not completely consciously) the following very important feature: Liouville does not produce the number all in one go, so to speak. For instance, he does not choose a number such as or Rather, he constructs a sequence such that its limit is algebraic.

Now how does constructing a sequence make things easier for Liouville? The answer is that it allows him to split the problem up into a whole lot of subproblems: make sure you avoid these algebraic numbers, then these ones, then these ones, and so on.

It’s not all plain sailing though, since if you are taking the limit of a sequence, then at stage you don’t know what’s going to happen later on. So you must make little promises as you go along: I promise that whatever I do later on, the limit will be close enough to this number here that the argument so far remains valid. But for that to be possible, you have to find not just a single *point* that rules out a whole bunch of algebraic numbers, but an entire *interval* of points. (Or rather, it isn’t obvious that you absolutely have to go for an interval, but it is a lot easier to ensure that the limit lies in an interval than it is to ensure that it lies in a more complicated set. And in any case, Liouville’s proof does give an interval.)

If you hadn’t seen Liouville’s argument either, then it is quite easy to imagine being stuck in the following rut. The proof that there is an irrational number goes, “Here’s an irrational number.” So it’s sort of obvious that if you want to find a transcendental number then you have to do the same sort of thing: you go out and find some numbers like and and you do your best to prove that they are transcendental. The idea of constructing a number *using a limiting process* is rather different. Having said that, it is not without precedent. By the time of Liouville, the intermediate value theorem had been proved, so anyone who had thought carefully about why exists would be familiar with the idea of limiting processes.

But to go back to Cantor, he might have had the following kind of thought: “Liouville created a sequence with a transcendental limit. What is the most general form of this argument I can come up with?” (It is an interesting question whether computers could do this kind of stepping back and generalizing. I think that under certain circumstances they should certainly be able to. A particularly simple example is generalizing hypotheses — e.g. when you’ve proved something additive about the integers mod , see if you can prove it for all groups — or removing them entirely.) “Well, if I’m creating an increasing sequence, I’ve got to make sure that the first term has some property that stops it being equal to some bunch of algebraic numbers, and that if I perturb it slightly then it still has that property. Hmm, which numbers did Liouville rule out by insisting that his number was very close to 0.1100010000…0001? If you look carefully at what he actually proved, you see that no number that is very close to this one can be a root of a polynomial of degree at most some d with coefficients that aren’t too large in modulus. What can we say about the set of such polynomials? Well, it’s obviously finite for a start. But hang on! If it’s finite, then that’s only *finitely* many roots of polynomials that Liouville has ruled out by this stage. We didn’t need some special number to do that! It’s *obvious* that we can find an interval that avoids a finite collection of numbers. What was Liouville *thinking*? Actually, if we pursue this thought we have a very nice abstract version of Liouville’s argument. By gradually increasing the degree and the maximum size of a coefficient, we admit all roots of polynomials, but we admit only finitely many at a time. So inside the first interval I can find another interval that misses another bunch of algebraic numbers, and inside that I can find another one — wow, I don’t need any of that rational-approximation stuff that Liouville went on about.

“In fact … I don’t even need the numbers to be algebraic! This is crazy! All I need to know is that I can divide up the numbers I’m trying to avoid into a sequence of collections of numbers, each of which is finite. Of course, if I can do that, then I can put my numbers in a sequence, since I can just write out those finite collections in increasing order. Conversely, a sequence of numbers is basically a sequence of collections of numbers where each collection contains just one number. What I seem to have proved here is that for *any* sequence of real numbers I can find a real number that is not equal to any term in that sequence. Wow, I quite like that — it seems to get to the heart of why there are transcendental numbers but be massively more general.

“But hang on a moment. That’s quite strange, because it’s sort of saying that if I match each positive integer to a real number, then I must miss out a real number. In some sense, there seem to be genuinely *more* real numbers than there are positive integers. It’s almost as though we’ve got an infinity that’s ‘bigger than infinity’. Hmm … I’ll think about that tomorrow.”

I think some kind of account like that shows that *even though* Cantor’s argument came as a big shock and was resisted by many, we do not have to make a lazy appeal to his genius (and still less his madness) to explain how he could have come up with the idea. It was a radically new approach, and yet it grows naturally out of the old.

That is not to belittle Cantor’s achievement. It is hard to imagine, writing comfortably in the twenty-first century, what kinds of background assumptions Cantor must have been taught and must have had to set aside. Perhaps that is what we really mean by genius in this instance — not that the idea popped into his brain by means of some utterly inexplicable process, but rather that he was sufficiently flexible to follow his thoughts where they naturally led, without letting himself be distracted by hidden assumptions. (This makes me think of special relativity — perhaps it is reasonable to say that the genius there was mainly daring to question “obvious truths” such as that velocities add and that simultaneity is an absolute concept. The mathematics itself is not hugely hard.)

A natural question at this point is how Liouville might have thought of *his* argument. I’ll leave that one for another day.

One final note: I did not introduce the word “countable” into my fictitious account of Cantor’s thought processes. Apparently Cantor didn’t do so for some time after he came up with the argument. If you present the argument in the usual way — first we prove that the algebraic numbers are countable, then that the reals are uncountable, and from that it follows that there exists a transcendental number — then you are making it look as though the idea of countability springs from nowhere. But in fact it can emerge naturally from the observation that Liouville’s sequence deals with finitely many algebraic numbers at a time and eventually deals with all of them.

December 10, 2010 at 2:36 am |

At the level of possibly unedifying generality, I would be interested in how analogies might be translated into computational routines. In your own account of Liouville’s proof, you realized that it can plausibly be described as a diagonal argument. It reminds me a bit of the Poincare quote about mathematics consisting of calling different things by the same name.

Another, possibly overused example, would be Lie looking for continuous symmetry groups as an analogue to Galois Theory. It seems like a fundamental thought process.

December 10, 2010 at 11:27 am |

[…] This post was mentioned on Twitter by ajlopez, blogs of the world. blogs of the world said: So this is the first of what may turn into a series, if I have time, of posts in which I s… http://reduce.li/qruyby #finding […]

December 11, 2010 at 9:35 pm |

In partial answer to Roy: Peter J. Freyd and Andrej Scedrov gave a graphical calculus for allegories. On n-Catgory Cafe the question was asked: how is that related to string diagram calculus for predicate logic? See, for example:

[book] Categories, Allegories, by Peter J. Freyd and Andrej Scedrov, Elsevier, 1990.

and

“Functorial polymorphism”,

Bainbridge, E S | Freyd, P J | Scedrov, A | Scott, P J

THEOR. COMP. SCI. Vol. 70, no. 1, pp. 35-64. 1990

“In the past several years types have become an important component of programming language design. They provide a logical framework to ensure that programs meet given specifications, support a partial correctness or verification mechanism, enhance software maintenance, and encourage the systematic building of complex modules from simpler ones. These features are crucial in large-scale programming projects requiring coordination among many teams of programmers. “

December 11, 2010 at 11:50 pm |

With an admixture of serious and tongue-in-cheek humor … an easy-to-implement strategy for identifying hard-to-compute mathematical results is to search MathSciNet for articles whose titles contain the phrase

“What is …”.This query returns a list of 838 articles.Terry Tao’s

“What is good mathematics?”asserts that “the concept of mathematical quality is a high-dimensional one, and lacks an obvious canonical total ordering” … such sets are generically hard to search.Of course,

Notices of the AMSregularly publishes“What is …?”articles, and in aggregate these span most of the range of modern mathematical inquiry.My favorite titles are plaintive:

“If multi-agent learning is the answer, what is the question?”or humorous:“What is the meaning of these constant interruptions?”or self-referential:“What is a question?”These

“What is …?”articles (informally) sample a class of mathematical questions that are very difficult for computers to answer … and still more difficult for computers to ask … and isn’t the same broadly true of humans?Thank you, Tim, for this fine weblog … and thank you too, for your many fine comments on many other forums. Happy holidays to all!

December 12, 2010 at 1:06 am |

“On Analogy”. Aristotle’s word was “paradeigma”, also known as “reasoning by example”. He analyzed analogical inference as a mixed syllogism that combined inductive and deductive components. Peirce later analyzed it as involving all three modes of inference: abductive, deductive, and inductive. Jon Awbrey collected a few fragments of source materials here:

http://mywikibiz.com/Talk:Inquiry

December 12, 2010 at 9:06 pm |

Jonathan, thanks; I have the dimmest notion of how categories work, but it’s something I’d like to learn more about.

December 12, 2010 at 11:20 pm |

I would not call Cantor’s proof “indirect”. In section 1 of his 1874 paper “Ueber eine Eigenschaft des Inbegriffs aller reellen algebraischen Zahlen” (On a property of the set of all real algebraic numbers) he first gives an explicit enumeration of the set of all algebraic (real) numbers, and in section 2 he gives an explicit (albeit impractical) construction of a real number that cannot appear in this enumeration. Thus, he shows that there are transcendental numbers by constructing a sequence of algebraic numbers that must converge to a transcendental number.

It is not clear to me what Cantor’s motivation was. The title of the article suggests he was more interested in the yet unnamed concept of countability (or some kind of effective countability); he calls his proof in section 2 an “application” of this concept. It does not sound as if this application was his main concern.

December 13, 2010 at 3:32 pm

It was sloppy of me to use the word “indirect”, I agree. I haven’t looked at Cantor’s paper, but I would hazard a guess that his thought process was quite similar to the one outlined above: that he started out thinking about why there were transcendental numbers, realized that his argument was very general, and then gave the general argument with the existence of transcendental numbers as an “application”. That is, perhaps the application was his main concern to start with, but not by the time he had finished the paper. I could be completely wrong though (and in particular have not read the article by Robert Gray mentioned in the next comment).

December 13, 2010 at 10:17 am |

Robert Gray, “Georg Cantor and transcendental numbers”, American Mathematical Monthly, vol. 101 (1994), pages 819-832

December 13, 2010 at 12:48 pm |

Dear Prof. Gowers,

I am seeing a naked woman’s back right at the end of your post, on the frontpage and in this page, with the legend “Ads by Google”.

Sincerely,

December 13, 2010 at 12:51 pm

Update: It has just disappeared, fortunately for me.

December 13, 2010 at 3:27 pm

I knew I shouldn’t have commented on Dick Lipton’s Wikileaks post …

December 13, 2010 at 5:31 pm |

Reading Frege’s work and coming up with Russel’s antinomy might be hard for a computer (although not exactly within the scope of your challenge).

However, as long as you are allowed to change the program it is not really a Turing machine we are up against. You might consider to commit on the axioms you feed into your machine. If mathematics is not like chess it should be easy for us to come up with proofs your machine will never be able to produce. On the other hand, if mathematics is just like chess, you will eventually win by presenting us better and better programs.

December 23, 2010 at 3:04 pm

The story has it that Russell discovered his paradox when pondering Cantor’s diagonal argument, as applied to the universal set. If we consider the identity mapping the set of all sets that are not members of themselves automatically pops out. So, given the diagonal argument, there’s a simple and plausible step-by-step story explaining how to get to Russell’s paradox.

December 28, 2010 at 8:10 am

The computer has to come up with the idea to apply the correct argument to the correct set. Brute-force would be a possible strategy however the huge amount of available information could make this impractible. Hence we search for rules to reduce the size of the search space. At one point we must commit on these rules (like in a chess machine). Is there a general rule or pattern reproducing Russel’s insight? Or does our code contain some a-posteriori insight like …. ApplyTo(DiagonalArgument, UniversalSet)?

December 15, 2010 at 3:57 am |

I think another important part of creativity is letting yourself make mistakes and unjustified assumptions. Since I only (partially) understand my own creative process, I’ll have to give an example from there. I was trying to prove a certain result one time. I had already proven a partial result on it using a certain equation, which I imagined as a diamond shape in my mind. However, it seemed that a new idea would be needed to solve it completely. After working on it for a few months, I finally had an idea that eventually allowed me to prove the complete result. I imagined that there was another equation that was a larger diamond. At this point, I had no idea how this larger diamond would help prove my result. It turned out that it didn’t exist, but a similar equation did exist, as long as I made a certain assumption which I really had no justification for making. Luckily, that assumption turned out to be true, and the equation ended up working to prove the full result.

The point is, I think that for a computer to be able to make a similar “creative” leap in a reasonable amount of time, it would have to be allowed to make random or unjustifiable assumptions, and “play” with unproven things. The trick would be figuring out how to keep these assumptions sufficiently “creative” while sufficiently relevant to the problem at hand.

December 27, 2010 at 2:36 pm |

Apologies for an off-topic comment. In base 12 let 10=a and 11=b. Then 131=ab.

December 27, 2010 at 3:42 pm |

There’s a terrific Chauvenet Lecture by Saunders Mac Lane,

Hamiltonian mechanics and geometry(1970), in which Mac Lane argues that a mathematician’s notion of “naturality” and a physicists notion of “physicality” are really the the same notion, and that this identity arises because both mathematicians and physicists require that their understanding be coordinate-independent.I posted about Mac Lane’s point-of-view on Dick Lipton’s weblog, under the topic

Unexpected connections In mathematics… but if you think about it, Mac Lane’s observation is relevant to Tim’s question too.Namely, for computers to derive insights into mathematical naturality from the direct experience of physicality—as mathematicians so often have done—then a necessary element of programming a mathematical computer is to provide them with hands, eyes, ears, mobility … and curiosity.

Example: for a computer to conceive of links, knots, and braids … it might be very helpful if the computer first learned to weave fabrics, smith chains, and tie shoes.

As Bill Thurston puts it:

We have a considerable ways to go, before mathematical computer programs “learn to sing by singing.”

January 18, 2011 at 4:53 pm |

Hi Prof. Gowers,

Speaking about numbers, you might find interesting the following result:

The set of all prime numbers is locally connected in Furstenberg’s topology, not locally connected in Golomb’s and Kirch’s topologies (Demonstratio Mathematica Vol. XLIII No 4 2010).

January 20, 2011 at 12:00 pm |

Hi Prof. Gowers,

Also, there is a result on Fibonacci integers (the multiplicative group generated by the Fibonacci numbers) in Journal of Number Theory 131 (2011) 440-457: a near-asymptotic formula for their counting function, the proof based on both combinatorial and analytic arguments.

April 6, 2011 at 11:58 pm |

Hallo!I want to thank you for this kind of articles. I’ve always thought is quite useful this work you often do to break an idea in many simple insincts, but i never had a really concrete example on myself of why it can help. Today i’ve one:some months ago i read on Ben Green page for student this problem(“for enthusiasts” :)):are there 2 real invertible matrices 2×2,that generates the free group on 2 generators?…after some frustrating attempt to construct the two matrices(that is the analogue of find an explicit trascendental number),i realized the potential analogy with your discussion:we want to find something free of relation,let’s look how much do we loose every step of relation;and i was very happy to realize that the situation is exactly in analogy with Cantor one(where here cardinality can be substituted with measure);you can describe every relation as subset of zeros of some real noncostant polinomial in 8 variables, then at every step you loose a set of measure 0, and at the end you have just a countable number of steps(the words),then you loose just a set of measure 0(all this part you can also do with Baire theorem)….sorry if maybe is a little bit OT but i found the analogy very nice, and also the fact that the proof can came up to mind very easier after having seen some decompositions of ideas such that you have done here…Bye

April 12, 2011 at 3:54 pm |

From Dauben’s biography and other sources I got the impression that Cantor got to the (un)countability question through his work on the uniqueness of trigonometric series and that the application to transcendentals was there to make the real result, the uncountability of R, more palatable to the mathematical community.

May 28, 2011 at 9:28 am |

The difference between a finite-state machine(computer) and a human is the capacity to make mistakes. This capacity underlies human creativity. It’s different from the ability of genetic algorithms to evolve because there’s no way at present to pre-determine human mistakes(good or bad) perhaps because humans exist in a dynamic equilibrium. On the other hand, from the inception of a computer each procedure is defined recursively and so every procedure must be predictable.

April 6, 2012 at 9:54 pm |

[…] – Rouse Ball chair, Cambridge U, Fields Medal 1998, see ICM 2010 or Finding Cantor’s proof that there are transcendental numbers, and he was piqued to comment Re: Steig Larsson, or perhaps the translator Reg Keeling in Wiles […]