structure and uncertainty, Bristol, Sept. 26

Another day full of interesting and challenging (in the sense that they generated new questions for me) talks at the SuSTain workshop. After another (dry and fast) run around the Downs, Leo Held started the talks with one of my favourite topics, namely the theory of g-priors in generalized linear models. He did bring a new perspective on the subject, introducing the notion of a testing Bayes factor based on the residual statistic produced by a classical (maximum likelihood) analysis, connected with earlier works of Valen Johnson. While I did not truly get the motivation for switching from the original data to this less informative quantity, I find that this perspective opens new questions for dealing with settings where the true data is replaced with one or several classical statistics. With possible strong connections to ABC, of course. Incidentally, Leo managed to produce a napkin with Peter Green’s intro to MCMC dating back to their first meeting in 1994: a feat I certainly could not reproduce (as I also met both Peter and Leo for the first time in 1994, at CIRM)… Then Richard Everitt presented his recent JCGS paper on Bayesian inference on latent Markov random fields, centred on the issue that simulating the latent MRF involves an MCMC step that is not exact (as in our earlier ABC paper for Ising models with Aude Grelaud). I already discussed this paper in an earlier blog post and the only additional question that comes to my mind is whether or not a comparison with the auxiliary variable approach of Møller et al. (2006) would make sense.

In the intermission, I had a great conversation with Oliver Ratmann on his talk of yesterday, about the surprising feature that some models produce as “data” a sample from a pseudo-posterior, opening once again new vistas! The following talks were more on the mathematical side, with James Cussens focussing on the use of integer programming for Bayesian variable selection, then Éric Moulines presenting a recent work with a PhD student of his on PAC-Bayesian bounds and the superiority of combining experts. Including a CRAN package. Éric concluded his talk with the funny occurrence of Peter’s photograph on Éric’s own Microsoft Research profile page, due to Éric posting our joint photograph at the top of Pic du Midi d’Ossau in 2005… (He concluded with a picture of the mountain that was the exact mirror image of mine yesterday!)

The afternoon was equally superb, with Gareth Roberts covering fifteen years of scaling MCMC algorithms, from the mythical 0.234 figure to the optimal temperature decrease in simulated annealing, and John Kent playing the outlier with an EM algorithm (however including a formal prior distribution) and raising the challenge as to why Bayesians never had to constrain the posterior expectation, which prompted me to infer that (a) the prior distribution should include all constraints and (b) the posterior expectation was not the “right” tool in non-convex parameter spaces. Natalia Bochkina presented a recent work, joint with Peter Green, on connecting image analysis with Bayesian asymptotics, reminding me of my early attempts at reading Ibragimov and Has’minskii in the 1990s. She then presented a second work, with Vladimir Spokoiny, on Bayesian asymptotics with misspecified models, introducing a new notion of effective dimension. The last talk of the day was by Nils Hjort about his coming book on “Credibility, confidence and likelihood” (not yet advertised by CUP), which sounds like an attempt at resuscitating Fisher by deriving distributions in the parameter space from frequentist confidence intervals. I already discussed this notion in an earlier blog post, so I am fairly skeptical about it, but the talk was representative of Nils’ highly entertaining and thought-provoking style! Especially as he sprinkled the talk with examples where the MLE (and some default Bayes estimators) did not work. And reanalysed one of Chris Sims’ examples presented during his Nobel Prize talk…

books versus papers [for PhD students]

Before I run out of time, here is my answer to the ISBA Bulletin Students’ corner question of the term: “In terms of publications and from your own experience, what are the pros and cons of books vs journal articles?”

While I started on my first book during my postdoctoral years in Purdue and Cornell [a basic probability book made out of class notes written with Arup Bose, which died against the breakers of some referees' criticisms], my overall opinion on this is that books are never valued by hiring and promotion committees for what they are worth! It is a universal constant I met in the US, the UK and France alike that books are not helping much for promotion or hiring, at least at an early stage of one’s career. Later, books become a more acknowledged part of senior academics’ vitae. So, unless one has a PhD thesis that is ready to be turned into a readable book without having any impact on one’s publication list, and even if one has enough material and a broad enough message at one’s disposal, my advice is to go solely and persistently for journal articles. Besides the above-mentioned attitude of recruiting and promotion committees, I believe this has several positive aspects. It forces the young researcher to maintain his/her focus on specialised topics in which she/he can achieve rapid prominence, rather than having to spend [quality research] time on setting out the background and compiling references. It provides an evaluation by peers of the quality of her/his work, while reviews of books are generally on the light side. It is the starting point for building a network of collaborations, as few people are interested in writing books with strangers (knowing that it is already quite a hardship with close friends!). It is also the entry to workshops and international conferences, where a new book very rarely attracts invitations.

Writing a book is of course exciting and somewhat more deeply rewarding, but it is awfully time-consuming and requires a high level of organization that young faculty members rarely possess when starting a teaching job at a new university (with possibly family changes as well!). I was quite lucky when writing The Bayesian Choice and Monte Carlo Statistical Methods to mostly be on leave from teaching, as it would otherwise have been impossible! That we are not making sufficient progress on our revision of Bayesian Core, started two years ago, is good enough proof that even with tight planning, great ideas, enthusiasm, sales prospects, and available material, completing a book may get into trouble for mere organisational issues…

Carnon [and Core]

I am now in Carnon, near Montpellier, for a few days, to work on the completion (!) of Bayesian Core, started two years ago not that far away, in Luminy… The small beach town is right on the Mediterranean Sea, located on a spit (or lido), itself carrying a canal between the Lez river and the sea. A quiet enough place, far from interruptions of all sorts! Although we are really not that far from completion, various commitments here and there kept Jean-Michel and myself from doing it over the past months. I am thus looking forward to those two and a half days of hard work (and not even a break to go climbing in the back country!).

Core not in CiRM

Despite not enjoying this year the optimal environment of CiRM, we are still making good progress on the revision (or the R vision) of Bayesian Core. In the past two days, we went over Chapters 1 (Introduction), 2 (Normal Models), 5 (Capture-Recapture Experiments), and 6 (Mixture Models), with Chapters 3 (Regression), 4 (Generalised Linear Models) and 9 (Image Analysis) being close to completion. While having a “last” go at the R tutorial part of Chapter 1, I came across this paragraph:

One of the most frustrating features of R is that the graphical device is not refreshed while a program is executed in the main window. This implies that, if you switch from one terminal to another or if the screen saver starts, the whole or parts of the graph currently on the graphical device will not be visible until the completion of the program. Conversely, refreshing very large graphs will delay the activation of the prompt >.

that I very gladly deleted, as the current 2.11.1 version of R no longer suffers from this painful freeze in the graphics (at least on my Kubuntu 10.10 version).

Actually, I do not think I mentioned it in a previous post: our new edition will be called Bayesian Essentials with R. Both to distinguish it from Bayesian Core (as it should be published in the Use R! series) and because it appeared (thanks to colleagues and readers) that core did not sound very appealing to English-speaking audiences looking for a statistics book…

Randomness [French version]

As a (weak) coincidence (with my review of Randomness through Computation), the French Mathematical Society chose as the theme of its yearly meeting (on June 17-18 at CIRM) “Qu’est-ce qu’un nombre au hasard ?” (“What is a random number?”), with talks by, among others, David Xiao and Laurent Bienvenu.

The talk by David Xiao is accompanied by a fairly interesting survey that goes much farther than what is found in Randomness through Computation. The same applies to the talk by Laurent Bienvenu, with a survey that is closer in spirit to the themes of Randomness through Computation (Kolmogorov and Chaitin complexities, Martin-Löf randomness).

Another (weak) coincidence is that I received yesterday in the mail Richard von Mises’ Probability, Statistics and Truth, which I hope to read carefully in the coming weeks, in connection with the above, but also with Bayesian statistics (“a misunderstanding of one of the classical formulas of probability calculus”).

le théorème de l’engambi

When I climbed in Luminy last year, one of the routes was called le théorème de l’engambi. Looking on the internet, I found this was the title of a book written by a local, Maurice Gouiran. The other evening, at the airport, the book was on sale in the bookstore, so I bought it and read it on the plane back to Paris. It is a local crime novel with highly local characters (to the point that I do not understand all they say), local places like l’Estaque, the OM football club, La Gineste, Luminy, and what is apparently the most appealing theorem in novels, Fermat’s last theorem! (Engambi means messy affair in local dialect.) Overall the book is more pleasant to read for the local flavour than for the crime enquiry per se, especially because it involves scenes that take place in CIRM itself (including the restaurant and the terrace outside under the old oaks!). There is of course no indication of the nature of the three-page proof produced by the first corpse of the book, but the description of the mathematical community is rather accurate, overall. The author mentions in a postnote that he is aware of Wiles’ proof, but believes (as a poet) in an alternative proof that Fermat had really found. (This book is not to be confused with Guedj’s parrot theorem, which is a novelesque story of mathematics, even though it ends up on the same premise that a parrot could recite Fermat’s proof…)

CoRe in CiRM [end]

Back home after those two weeks in CiRM for our “research in pairs” invitation to work on the new edition of Bayesian Core, I am very grateful for the support we received from CiRM and through it from SMF and CNRS. Being “locked” away in such a remote place brought a considerable increase in concentration and decrease in stress levels. Although I was planning for more, we have made substantial advances on five chapters of the book (out of nine), including a completely new chapter (Chapter 8) on hierarchical models and a thorough rewriting of the normal chapter (Chapter 2), which, along with Chapter 1 (largely inspired by Chapter 1 of Introducing Monte Carlo Methods with R, itself inspired by the first edition of Bayesian Core!), is nearly done. Chapter 9 on image processing is also quite close to completion, with just the result of a batch simulation running on the Linux server in Dauphine to include in the ABC section. The only remaining major change is the elimination of reversible jump from the mixture chapter (to be replaced with Chib’s approximation) and from the time-series chapter (to be simplified into a birth-and-death process). Going back to the CiRM environment, I think we were lucky to come during the vacation season as there is hardly anyone on the campus, which means no cars and no noise. The (good) feeling of remoteness is not as extreme as in Oberwolfach, but it is truly a quality environment. Besides, being able to work 24/7 in the math library is a major plus, as we could go and grab any reference we needed to check. (Presumably, CiRM is lacking in terms of statistics books compared with Oberwolfach, but it still provided most of the references we were looking for.) Lastly, the freedom to walk right out of the Centre into the national park for a run, a climb or even a swim (in Morgiou, rather than Sugiton) makes working there very tantalising indeed! I thus dearly hope I can enjoy this opportunity again in the near future…


