Since Stan Ulam is buried in Cimetière du Montparnasse, next to CREST, Andrew and I paid his grave a visit on a sunny July afternoon. Among elaborate funeral constructions, the Aron family tomb is sober and hidden behind funeral houses. It came as a surprise to me to discover that Ulam had links with France to the point of him and his wife being buried in his wife’s family vault. Since we were there, we took a short stroll to see Henri Poincaré’s tomb in the Poincaré-Boutroux vault (missing Henri’s cousin, the French president Raymond Poincaré). Another surprise was that someone had left a folder with the cover of 17 Equations That Changed the World on top of the tomb, even though the book covers Poincaré’s work on the three-body problem as part of Newton’s formula. There were other mathematicians in this cemetery, but this was enough necrophiliac tourism for one day.
Archive for Henri Poincaré
Yesterday, Luke Bornn and Pierre Jacob each gave a talk at our big’MC ‘minar. While I had seen most of the slides earlier, either at MCMski IV, Banff, Leuven or yet again in Oxford, I really enjoyed those talks as they provided further intuition about the techniques of Wang-Landau and non-negative unbiased estimators, leading to a few seeds of potential ideas for even more potential research. For instance, I understood much better the option to calibrate the Wang-Landau algorithm on levels of the target density rather than in the original space. Which means (a) a one-dimensional partition target (just as in nested sampling); (b) taking advantage of the existing computations of the likelihood function; and (c) a somewhat automatic implementation of the Wang-Landau algorithm. I do wonder why this technique is not more popular as a default option. (Like, would it be compatible with Stan?) The impossibility theorem of Pierre about the existence of non-negative unbiased estimators never ceases to amaze me. I started wondering during the seminar whether a positive (!) version of the result could be found. Namely, whether perturbations of the exact (unbiased) Metropolis-Hastings acceptance ratio could be substituted in order to guarantee positivity. Possibly creating drifted versions of the target…
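To make the idea of a partition on density levels concrete, here is a minimal sketch of a Wang-Landau sampler whose bins are level sets of the log-target rather than regions of the original space. Everything specific in it is my own assumption for illustration and not taken from the talks: the toy two-mode target, the thresholds in `edges`, the random-walk proposal, and the step-size schedule `t ** -0.6` (a plain stochastic-approximation schedule standing in for the flat-histogram criterion).

```python
import math
import random


def log_target(x):
    """Unnormalised log-density of a toy two-mode target (modes at -4 and +4)."""
    a = -0.5 * (x + 4.0) ** 2
    b = -0.5 * (x - 4.0) ** 2
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def level_bin(logp, edges):
    """Index of the level set holding a point with log-density logp.

    edges are decreasing thresholds on the log-density; bin 0 is the top level.
    """
    for i, edge in enumerate(edges):
        if logp >= edge:
            return i
    return len(edges)


def wang_landau(n_iter=20000, edges=(-2.0, -4.0, -6.0), step=1.0, seed=1):
    """Random-walk Wang-Landau aiming at equal time in each density level set."""
    rng = random.Random(seed)
    d = len(edges) + 1
    log_theta = [0.0] * d   # running log-penalty attached to each level set
    counts = [0] * d        # number of iterations spent in each level set
    x = -4.0
    lp = log_target(x)      # the bin is read off this value, at no extra cost
    k = level_bin(lp, edges)
    xs = []
    for t in range(1, n_iter + 1):
        y = x + rng.gauss(0.0, step)
        lq = log_target(y)
        j = level_bin(lq, edges)
        # Metropolis-Hastings log-ratio for the penalised target pi(x) / theta[bin(x)]
        log_ratio = (lq - log_theta[j]) - (lp - log_theta[k])
        if math.log(1.0 - rng.random()) < log_ratio:
            x, lp, k = y, lq, j
        # Penalise the bin just visited, relative to the desired frequency 1/d
        gamma = t ** -0.6
        for i in range(d):
            log_theta[i] += gamma * ((1.0 if i == k else 0.0) - 1.0 / d)
        counts[k] += 1
        xs.append(x)
    return log_theta, counts, xs


if __name__ == "__main__":
    log_theta, counts, xs = wang_landau()
    print("visits per level set:", counts)
    print("range explored:", min(xs), max(xs))
```

The partition is one-dimensional whatever the dimension of x, and the bin index comes for free from the log-density already computed for the acceptance step, which is what points (a) and (b) above refer to.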
One request in connection with this post: please connect the Institut Henri Poincaré to the eduroam wireless network! The place is dedicated to visiting mathematicians and theoretical physicists, so it should have been the first one [in Paris] to get connected to eduroam. The cost cannot be that horrendous so I wonder what the reason is. Preventing guests from connecting to the Internet for the sake of better concentration? Avoiding “parasites” taking advantage of the network? Ensuring seminar attendees are following the talks? (The irony is that Institut Henri Poincaré has a local wireless network available for free, except that it most often does not work with my current machine. And hence wastes much more of my time as I attempt to connect over and over again while there.) Just in connection with IHP, a video of Persi giving a talk there about Poincaré, two years ago:
In relation with the special issue of Science & Vie on Bayes’ formula, the French national radio (France Culture) organised a round table with Pierre Bessière, senior researcher in physiology at Collège de France, Dirk Zerwas, senior researcher in particle physics in Orsay, and Hervé Poirier, editor of Science & Vie. And myself (as I was quoted in the original paper). While I am not particularly fluent in oral debates, I was interested in participating in this radio experiment, if only to bring some moderation to the hyperbolic tone found in the special issue. (As the theme was “Is there a universal mathematical formula?”, I was for a while confused about the debate, thinking that maybe the previous blogs on Stewart’s 17 Equations and Mackenzie’s Universe in Zero Words had prompted this invitation…)
As it happened [podcast link], the debate was quite moderate and reasonable: we discussed the genesis, the dark ages, and the risorgimento of Bayesian statistics within statistics, the lack of Bayesian perspectives in the Higgs boson analysis (bemoaned by Tony O’Hagan and Dennis Lindley), and the Bayesian nature of learning in psychology. Although I managed to mention Poincaré’s Bayesian defence of Dreyfus (thanks to The Theory That Would Not Die!), Nate Silver’s Bayesian combination of survey results, and the role of the MRC in the MCMC revolution, I found that the information content of a one-hour show was in the end quite limited, as I would have liked to mention as well the role of Bayesian techniques in advances in population genetics, like the Asian beetle invasion mentioned two weeks ago… Overall, an interesting experience, maybe not with a huge impact on the population of listeners, and a confirmation I’d better stick to the written world!
“If you placed your finger at that point, the two halves of the string would still be able to vibrate in the sin 2x pattern, but not in the sin x one. This explains the Pythagorean discovery that a string half as long produced a note one octave higher.” (p.143)
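As a quick numerical check of the quote, here is a minimal sketch assuming a string of length π fixed at both ends, as implied by the sin x and sin 2x patterns. The helper `fundamental_frequency` and the unit wave speed are my own illustrative choices, based on the standard relation f = v / (2L) for the fundamental of a string with fixed ends.

```python
import math

LENGTH = math.pi          # string pinned at x = 0 and x = pi
MIDPOINT = LENGTH / 2.0   # where the finger is placed


def mode_shape(n, x):
    """Displacement profile sin(n x) of the n-th vibration pattern."""
    return math.sin(n * x)


def fundamental_frequency(length, wave_speed=1.0):
    """Fundamental frequency of a string with both ends fixed (hypothetical helper)."""
    return wave_speed / (2.0 * length)


if __name__ == "__main__":
    # sin x is at its maximum under the finger, so this pattern is blocked
    print("sin x at midpoint:", mode_shape(1, MIDPOINT))
    # sin 2x has a node under the finger, so this pattern survives
    print("sin 2x at midpoint:", mode_shape(2, MIDPOINT))
    # halving the length doubles the frequency, that is, one octave higher
    ratio = fundamental_frequency(LENGTH / 2.0) / fundamental_frequency(LENGTH)
    print("frequency ratio for half the length:", ratio)
```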
The following chapters are all about Physics: the wave equation, Fourier’s transform and the heat equation, Navier-Stokes’ equation(s), Maxwell’s equation(s) (as in The Universe in Zero Words), the second law of thermodynamics, E=mc² (of course!), and Schrödinger’s equation. I won’t go into much detail for those chapters, even though they are remarkably written. For instance, the chapter on waves made me understand the notion of harmonics in a much more intuitive and lasting way than previous readings. (This chapter 8 also mentions the “English mathematician Harold Jeffreys”, while Jeffreys was primarily a geophysicist. And a Bayesian statistician with major impact on the field, his Theory of Probability arguably being the first modern Bayesian book. Interestingly, Jeffreys also was the first one to find approximations to Schrödinger’s equation; however, he is not mentioned in the later chapter.) Chapter 9 mentions the heat equation but is truly about Fourier’s transform, which he used as a tool and which later became a universal technique. It also covers Lebesgue’s integration theory, wavelets, and JPEG compression. Chapter 10 on Navier-Stokes’ equation also mentions climate sciences, where it takes a (reasonable) stand. Chapter 11 on Maxwell’s equations is a short introduction to electromagnetism, with radio the obvious illustration. (Maybe not the best chapter in the book.)
I do not know if it is a coincidence or if publishers were competing for the same audience: after reviewing The Universe in Zero Words: The Story of Mathematics as Told through Equations, in this post (and in CHANCE, to appear in 25(3)!), I noticed Ian Stewart’s 17 Equations That Changed the World, published in 2011, and I bought a copy to check the differences between both books.
I am quite glad I did so, as I tremendously enjoyed this book, both for its style and its contents, both entertaining and highly informative. This does not come as a big surprise, given Stewart’s earlier books and their record; however, this new selection and discussion of equations is clearly superior to The Universe in Zero Words! Maybe because it goes much further in its mathematical complexity, hence is more likely to appeal to the mathematically inclined (to borrow from my earlier review). For one thing, it does not shy away from inserting mathematical formulae and small proofs into the text, disregarding the risk of cutting many halves of the audience (I know, I know, high powers of (1/2)…!) For another, 17 Equations That Changed the World uses the equation under display to extend the presentation much, much further than The Universe in Zero Words. It is also much more partisan (in an overall good way) in its interpretations and reflections about the World.
In contrast with The Universe in Zero Words, formulas are well-presented, each character in the formula being explained in layman’s terms. (Once again, the printer could have used better fonts and the LaTeX word processor.) The (U.K. edition, see tomorrow!) cover is rather ugly, though, when compared with the beautiful cover of The Universe in Zero Words. But this is a minor quibble! Overall, it makes for an enjoyable, serious and thought-provoking read that I once again undertook mostly in transit (planes and métros).
Larry Wasserman wrote a blog entry on the normalizing constant paradox, where he repeats that he does not understand my earlier point… Let me try to recap here this point and the various comments I made on StackExchange (while keeping in mind all this is for intellectual fun!)
The entry is somewhat paradoxical in that Larry acknowledges (in that post) that the analysis in his book, All of Statistics, is wrong. The fact that “g(x)/c is a valid density only for one value of c” (and hence cannot lead to a notion of likelihood on c) is the very reason why I stated that there can be no statistical inference nor prior distribution about c: a sample from f does not bring statistical information about c and there can be no statistical estimate of c based on this sample. (In case you did not notice, I insist upon statistical!)
To me this problem is completely different from a statistical problem, at least in the modern sense: if I need to approximate the constant c (as I do in fact when computing Bayes factors), I can produce an arbitrarily long sample from a certain importance distribution and derive a converging (and sometimes unbiased) approximation of c. Once again, this is Monte Carlo integration, a numerical technique based on the Law of Large Numbers and the stabilisation of frequencies. (Call it a frequentist method if you wish. I completely agree that MCMC methods are inherently frequentist in that sense, and see no problem with this because they are not statistical methods. Of course, this may be the core of the disagreement with Larry and others, that they call statistics the Law of Large Numbers, and I do not. This lack of separation between both notions also shows up in a recent general public talk on Poincaré’s mistakes by Cédric Villani! All this may just mean I am irremediably Bayesian, seeing anything motivated by frequencies as non-statistical!) But that process does not mean that c can take a range of values that would index a family of densities compatible with a given sample. In this Monte Carlo integration approach, the distribution of the sample is completely under control (modulo the errors induced by pseudo-random generation). This approach is therefore outside the realm of Bayesian analysis “that puts distributions on fixed but unknown constants”, because those unknown constants parameterise the distribution of an observed sample. Ergo, c is not a parameter of the sample and the sample Larry argues about (“we have data sampled from a distribution”) contains no information whatsoever about c that is not already in the function g. (It is not “data” in this respect, but a stochastic sequence that can be used for approximation purposes.) Which gets me back to my first argument, namely that c is known (and at the same time difficult or impossible to compute)!
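For what it is worth, the Monte Carlo approximation of c described above can be sketched in a few lines. The choice of g(x) = exp(−x⁴) and of a standard normal importance distribution is purely illustrative and mine; it has the convenience that the exact constant, 2Γ(5/4), is available for comparison.

```python
import math
import random


def g(x):
    """Unnormalised density; c is the integral of g over the real line."""
    return math.exp(-x ** 4)


def normal_pdf(x):
    """Density of the importance distribution, here a standard normal."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def estimate_constant(n=100_000, seed=42):
    """Unbiased importance sampling estimate of c: average of g(X) / q(X), X ~ q."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        total += g(x) / normal_pdf(x)
    return total / n


if __name__ == "__main__":
    print("importance sampling estimate:", estimate_constant())
    print("exact value 2 * Gamma(5/4):  ", 2.0 * math.gamma(1.25))
```

Note that the simulated sample comes from the importance distribution, which I chose and fully control; nothing in it indexes a family of densities by c, and its length is limited only by my patience.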
Let me also answer here the comments on “why is this any different from estimating the speed of light c?” “why can’t you do this with the 100th digit of π?” on the earlier post or on StackExchange. Estimating the speed of light means for me (who repeatedly flunked Physics exams after leaving high school!) that we have a physical experiment that measures the speed of light (like the original one by Rœmer at the Observatoire de Paris I visited earlier last week) and that the statistical analysis infers about c by using those measurements and the impact of the imprecision of the measuring instruments (as we do when analysing astronomical data). If, now, there exists a physical formula of the kind
where φ is a probability density, I can imagine stochastic approximations of c based on this formula, but I do not consider it a statistical problem any longer. The case is thus clearer for the 100th digit of π: it is also a fixed number, that I can approximate by a stochastic experiment but to which I cannot attach a statistical tag. (It is 9, by the way.) Throwing darts at random as I did during my Oz tour is not a statistical procedure, but simple Monte Carlo à la Buffon…
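The dart experiment fits this pattern, as the sketch below tries to show. It assumes a formula of the form c = ∫ ψ(ξ) φ(ξ) dξ, where ψ is a hypothetical integrand of my choosing (only φ is named in the text): with φ the uniform density on the unit square and ψ equal to 4 inside the quarter disc and 0 outside, the constant c is π.

```python
import math
import random


def psi(u, v):
    """Hypothetical integrand: 4 if the dart lands in the quarter disc, else 0."""
    return 4.0 if u * u + v * v <= 1.0 else 0.0


def darts_estimate(n=200_000, seed=7):
    """Average of psi over darts thrown uniformly on the unit square."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        total += psi(rng.random(), rng.random())
    return total / n


if __name__ == "__main__":
    print("dart approximation of pi:", darts_estimate())
    print("math.pi:                 ", math.pi)
```

The output gets closer to π as the number of darts grows, by the Law of Large Numbers, but at no point does π become a parameter of the distribution of the darts.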
Overall, I still do not see this as a paradox for our field (and certainly not as a critique of Bayesian analysis), because there is no reason a statistical technique should be able to address any and every numerical problem. (Once again, Persi Diaconis would almost certainly differ, as he defended a Bayesian perspective on numerical analysis in the early days of MCMC…) There may be a “Bayesian” solution to this particular problem (and that would be nice) and there may be none (and that would be OK too!), but I am not even convinced I would call this solution “Bayesian”! (Again, let us remember this is mostly for intellectual fun!)