Archive for Gauss

irrationals [guest post/book review]

Posted in Books, University life on June 6, 2012 by xi'an

When I received The irrationals: A story of the numbers you can’t count on by Julian Havil for reviewing for CHANCE, Pierre Alquier happened to be in my office at CREST and I suggested that he write the review, which he did within a few weeks (and thus prior to the book’s publication!). Here is his nice and comprehensive review:

This book is intended to be a short history of irrational numbers, from the discovery of the first irrational, √2, by the ancient Greeks to the first rigorous definitions of real numbers by Cantor and Dedekind. In addition to the historical aspect, the author does not hesitate to go into mathematical details and to provide some of the most remarkable proofs in the history of irrationals.

The book is essentially organized around the emergence of key mathematical concepts, rather than based on a strict chronological order. Thanks to the historical perspective, we learn a lot about some famous mathematicians like Pythagoras, Euclid, Gauss or Euler. The book is also full of amazing anecdotes. For example, it reveals the way to find the tomb of Roger Apéry, who proved that ζ(3) is irrational, in the labyrinth of Père Lachaise cemetery in Paris. All of this makes reading this book a real pleasure. The appendix contains more involved mathematical developments. The only weak point that I would like to point out is the absence of a bibliography that would allow the interested reader to go further into the history of number theory, or into number theory itself.

The book can roughly be divided into four parts: (1) the discovery of irrationals and the first calculations with square roots, in Chapters 1 and 2, (2) the proofs that some remarkable numbers like π and e are irrational, in Chapters 3, 4 and 5, (3) a classification of the irrationals based on approximations by rationals, and the discovery of transcendental numbers (Chapters 6, 7 and 8) and, finally, (4) the proper definition of the real numbers by several mathematicians, including Dedekind (Chapters 9 and 10).

Chapters 1 and 2 deal with the ancient world: the proof of the irrationality of √2, the influence of Pythagoras and Euclid, and the first algebraic manipulations of the irrationals by the Arabs, the Hindus, and European mathematicians like Fibonacci in the early Renaissance. A lot of information is provided about several Greek mathematicians and philosophers and the reader might sometimes get lost. However, both chapters contain valuable historical information, as well as some nice proofs based on geometry.

Chapters 3, 4 and 5 give the proofs of the irrationality of some remarkable numbers. The method of continued fractions is explained in Chapter 3, leading to the irrationality of e. A simpler proof due to Fourier is given in Chapter 4. The proof of the irrationality of π² (and thus of π) by Hermite is also given in detail in that chapter. Chapter 5 takes the reader to the seventies: it provides the striking proof that ζ(3) is irrational by Roger Apéry. Surprisingly enough, unlike most recent mathematical proofs, this one only requires a knowledge of elementary mathematics to be understood.

Chapter 6 is one of the most remarkable parts of the book, because of the number of results given there, and the elegance of the proofs. It focuses on approximations of irrationals by rationals. It is obvious that, given any number x and a positive integer q, one can find another integer p with |x-p/q|<1/q. However, is it possible to find infinitely many p and q such that |x-p/q|<1/q^{1+ε} for a given ε>0? One of the striking facts proved in this chapter is that for ε=1, the answer is yes if, and only if, x is irrational. In Chapter 7, a classification of irrationals based on various values for ε is described. The idea is to define a number x to be “more irrational” if the property still holds for larger values of ε. This leads to the introduction of a new family of irrationals: the transcendentals, studied in Chapters 7 and 8. Actually, if the property holds for ε>1, then x is a transcendental number. It had long been conjectured that π and e are transcendental. However, the first number L to be proved transcendental was specially designed by Liouville to fit the results of Chapter 6. This construction is explained in Chapter 7: L is built such that, for any ε>0, there are infinitely many p and q such that |L-p/q|<1/q^{1+ε}, and this proves that L is transcendental.
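The dichotomy for ε=1 can be checked numerically. The following is a minimal Python sketch, not taken from the book: the function name `good_approximations`, the 30-digit rational stand-in for √2, and the convention of counting only fractions in lowest terms that differ from x are all assumptions made for the illustration.

```python
from fractions import Fraction
from math import floor, gcd, isqrt


def good_approximations(x, max_q, eps=1):
    """List (p, q) in lowest terms, q <= max_q, with 0 < |x - p/q| < 1/q**(1+eps).

    x is a Fraction; eps is a non-negative integer (eps=1 gives the bound 1/q**2).
    """
    found = []
    for q in range(1, max_q + 1):
        bound = Fraction(1, q ** (1 + eps))
        nearest_below = floor(x * q)
        # Any p meeting the bound has |x*q - p| < 1, so only two candidates matter.
        for p in (nearest_below, nearest_below + 1):
            if gcd(p, q) == 1 and 0 < abs(x - Fraction(p, q)) < bound:
                found.append((p, q))
    return found


# Assumed stand-in for sqrt(2): a rational agreeing with it to 30 decimals,
# far more precision than comparisons with q <= 1000 require.
SQRT2 = Fraction(isqrt(2 * 10**60), 10**30)

irrational_hits = good_approximations(SQRT2, 1000)
rational_hits = good_approximations(Fraction(3, 7), 1000)

# Number of solutions found, and the largest denominator among them.
print(len(irrational_hits), max(q for _, q in irrational_hits))
print(len(rational_hits), max(q for _, q in rational_hits))
```

For √2 the list keeps growing with `max_q` (it contains the continued-fraction convergents 7/5, 17/12, 41/29, 99/70, and so on), whereas for 3/7 no solution has a denominator of 7 or more, since |3/7-p/q| ≥ 1/(7q) whenever p/q differs from 3/7.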

Finally, Chapters 9, 10 and 11 deal with more recent questions such as the problem of randomness in the decimal expansion of irrational numbers, and the first rigorous definitions of the set R of real numbers by Kossak, Cantor, Heine and Dedekind. Dedekind’s definition of a real number as a cut of the set of rationals became the classical one, but it is known that the other constructions are equivalent. The chapter about randomness is a bit short and unfortunately the recent approaches to defining random sequences by Chaitin, Solovay and Martin-Löf are not mentioned. This part ends with some conclusions on the role of irrationals in modern mathematics.

This book contains a lot of fun for whoever likes mathematics. As it goes into details, I would recommend The irrationals: A story of the numbers you can’t count on particularly to students or to mathematicians who do not specialize in number theory and who would like to learn about its history – or just to enjoy some remarkably elegant proofs. From that perspective, some chapters like Chapters 6 and 10 are particularly successful.

As a side note, here is a terrific biography of Roger Apéry by his son, who is also a mathematician. When I was a student in Caen, Apéry was famous, both for his result and for having once forgotten his son (the same one?) on his motorcycle in the parking lot of the university when supposedly driving him to school. (I [even more personally] find most interesting the description of the competition between the young ENS students, Roger Apéry and Jacqueline Lelong-Ferrand, for the first position at the agrégation final exam, since I had the privilege of having Madame Lelong-Ferrand as my professor of differential geometry in Paris 6, circa 1983…)

Bayes at the Bac’ [and out!]

Posted in Kids, Statistics on June 24, 2011 by xi'an

In the mathematics exam of the baccalauréat my son (and 160,000 other students) took on Tuesday, the probability problem was a straightforward application of Bayes’ theorem. Given a viral test with 99% positives for infected patients and 97% negatives for non-infected patients, in a population with 2% of infected patients, what is the probability that the patient is infected given that the test is positive? (It looks like another avatar of Exercise 1.7 in The Bayesian Choice!) A lucky occurrence, given that I had explained Bayes’ formula to my son earlier this year (neither the math book nor the math teacher mentioned Bayes, incidentally!) and even more given that, in a crash revision Jean-Michel Marin gave him the evening before, they went over it once again! The other problems were a straightforward multiple choice about complex numbers (with one mistake!), some calculus around the functional sequence x^n e^{-x}, and some arithmetic questions around Gauss’s and Bezout’s theorems. A few hours after I wrote the above, the (official) news came that this question had been posted on the web prior to the exam by someone and thus that it would be canceled from the exam by the Ministry for Education! The grade will then be computed on the other problems, which is rather unfair for the students. (On the side, the press release from the Ministry contains a highly specious argument that regulation allows for three to five exercises in the exam, hence that there is nothing wrong with reducing the number of exercises to three!) Not so lucky an occurrence, then, and I very dearly hope this will not impact in a drastic manner my son’s result! (Most likely, the grading will be more tolerant and students will not unduly suffer from the action of a very few….)
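For readers who want the number, here is a small Python sketch of the computation with the figures quoted in the question. The variable names are mine, and exact fractions are used only to avoid rounding.

```python
from fractions import Fraction

# Figures quoted in the exam question.
sensitivity = Fraction(99, 100)  # P(test positive | infected)
specificity = Fraction(97, 100)  # P(test negative | not infected)
prevalence = Fraction(2, 100)    # P(infected)

# Total probability of a positive test (true positives + false positives).
p_positive = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)

# Bayes' theorem: P(infected | test positive).
posterior = sensitivity * prevalence / p_positive

print(posterior, float(posterior))
```

The posterior is 33/82, about 0.40: despite the accurate test, a positive result is more likely than not to be a false positive, because only 2% of the population is infected.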

Keynes’ derivations

Posted in Books, Statistics on March 29, 2010 by xi'an

Chapter XVII of Keynes’ A Treatise On Probability contains Keynes’ most noteworthy contribution to Statistics, namely the classification of probability distributions such that the arithmetic/geometric/harmonic empirical mean/empirical median is also the maximum likelihood estimator. This problem was first stated by Laplace and Gauss (leading to the Laplace distribution in connection with the median and to the Gaussian distribution for the arithmetic mean). The derivation of the densities f(x,\theta) of those probability distributions is based on the constraint that the likelihood equation

\sum_{i=1}^n \dfrac{\partial}{\partial\theta}\log f(y_i,\theta) = 0

is satisfied for one of the four empirical estimates, using differential calculus (despite the fact that Keynes earlier derived Bayes’ theorem by assuming the parameter space to be discrete). Under regularity assumptions, in the case of the arithmetic mean, my colleague Eric Séré showed me this indeed leads to the family of distributions

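f(x,\theta) = \exp\left\{ \phi^\prime(\theta) (x-\theta) + \phi(\theta) + \psi(x) \right\}\,,

where \phi and \psi are almost arbitrary functions under the constraints that \phi is twice differentiable and f(x,\theta) is a density in x. (The score is then \phi^{\prime\prime}(\theta)(x-\theta), so the likelihood equation reduces to \phi^{\prime\prime}(\theta)\sum_i (y_i-\theta)=0, which is solved by the arithmetic mean.) This means that \phi satisfies

\phi(\theta) = -\log \int \exp \left\{ \phi^\prime(\theta) (x-\theta) + \psi(x)\right\}\, \text{d}x\,,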

a constraint missed by Keynes.
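The two classical cases recalled above (Gauss with the arithmetic mean, Laplace with the median) are easy to check numerically. Here is a minimal Python sketch, assuming an arbitrary illustrative sample and a crude grid search in place of solving the likelihood equation; none of it comes from Keynes’ text.

```python
from statistics import mean, median

sample = [1.0, 2.0, 4.0, 7.0, 11.0]  # illustrative data, not from Keynes


def gaussian_loglik(theta):
    # Gaussian location model with unit scale, additive constants dropped.
    return -sum((x - theta) ** 2 for x in sample) / 2


def laplace_loglik(theta):
    # Laplace location model with unit scale, additive constants dropped.
    return -sum(abs(x - theta) for x in sample)


# Grid search over theta in [0, 12] with step 0.01 (an assumed resolution).
grid = [i / 100 for i in range(0, 1201)]
gaussian_mle = max(grid, key=gaussian_loglik)
laplace_mle = max(grid, key=laplace_loglik)

print(gaussian_mle, mean(sample))
print(laplace_mle, median(sample))
```

On this sample the Gaussian maximiser coincides with the arithmetic mean (5.0) and the Laplace maximiser with the median (4.0).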

While I cannot judge the level of novelty in Keynes’ derivation with respect to earlier works, this derivation therefore produces a generic form of the unidimensional exponential family, twenty-five years before their rederivation by Darmois (1935), Pitman (1936) and Koopman (1936) as characterising distributions with sufficient statistics of constant dimensions. The derivation of the distributions for which the geometric or the harmonic means are MLEs then follows by a change of variables, y=\log x,\,\lambda=\log \theta or y=1/x,\,\lambda=1/\theta, respectively. In those different derivations, the normalisation issue is treated quite off-handedly by Keynes, witness the function

f(x,\theta) = A \left( \dfrac{\theta}{x} \right)^{k\theta} e^{-k\theta}

at the bottom of page 198, which is not integrable in x unless its support is bounded away from 0 or \infty. Similarly, the derivation of the log-normal density on page 199 is missing the Jacobian factor 1/x (or 1/y_q in Keynes’ notation) and the same problem arises for the inverse-normal density, which should be

f(x,\theta) = A e^{-k^2(x-\theta)^2/\theta^2 x^2} \dfrac{1}{x^2}\,,

instead of A\exp k^2(\theta-x)^2/x (page 200). Lastly, I find the derivation of the distributions linked with the median rather dubious since Keynes’ general solution

f(x,\theta) = A \exp \left\{ \displaystyle{\int \dfrac{x-\theta}{|x-\theta|}\,\phi^{\prime\prime}(\theta)\,\text{d}\theta +\psi(x) }\right\}

(where the integral ought to be interpreted as a primitive) is such that the recovery of Laplace’s distribution, f(x,\theta)\propto \exp\{-k^2|x-\theta|\}, involves setting (page 201)

\psi(x) = \dfrac{\theta-x}{|x-\theta|}\,k^2 x\,,

hence making \psi a function of \theta as well. The summary two pages later actually produces an alternative generic form, namely

f(x,\theta) = A \exp\left\{ \phi^\prime(\theta)\dfrac{x-\theta}{|x-\theta|}+\psi(x) \right\}\,,

with the difficulties that the distribution only vaguely depends on \theta, being then a step function times exp(\psi(x)) and that, unless \phi is properly calibrated, A also depends on \theta.

Given that this part is the most technical section of the book, this post shows why I am fairly disappointed at having picked this book for my reading seminar. There is no further section with innovative methodological substance in the remainder of the book, which now appears to me as no better than a graduate dissertation on the probabilistic and statistical literature of the (not that) late 19th century, modulo the (inappropriate) highly critical tone.