Archive for December, 2008

ABC-SMC redux

Posted in Statistics on December 31, 2008 by xi'an

Pierre Del Moral, Arnaud Doucet and Ajay Jasra just wrote a paper on ABC entitled “An Adaptive Sequential Monte Carlo Method for Approximate Bayesian Computation” that is more than welcome as it links the ABC algorithm with their foundational SMC paper of 2006 in JRSS Series B. It thus sheds new light on the SMC ABC-PRC proposal of Sisson, Fan and Tanaka, already discussed in this post, in that both papers build on the idea from the 2006 Series B paper of using a backward kernel L(z,z’) to simplify the importance weight and remove its dependence on the unknown likelihood.

The main point is that, despite a common framework, the weights of Del Moral et al. differ from those of Sisson et al. First, Del Moral et al. assume that the forward kernel K is invariant with respect to the true target (which is a tempered version of the true posterior in sequential Monte Carlo), a choice not explicitly made in Sisson et al. (there is a mention that “Choices of K include a standard smoothing kernel (e.g., Gaussian) or a Metropolis–Hastings accept/reject step” but the target of the Metropolis–Hastings kernel is not specified, with a potential difficulty in using a proposal including a Dirac mass for importance sampling, and the kernel chosen in the toy mixture example is the “Gaussian random walk”). Second, the ABC weights of Del Moral et al. reduce to the ratio of acceptance rates for the thresholded domain, while Sisson et al. end up with the ratio of the prior densities. Only when the prior is flat do both weights coincide, but again with an invalid argument in Sisson et al. if K is not invariant with respect to the true target.
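
To make the second point concrete, here is a short derivation sketch of how an invariant forward kernel leads to a ratio of acceptance counts. The notation is my own assumption rather than a quote from either paper: z = (θ, x_1, …, x_M) is the augmented state, A_ε is the set of pseudo-data within ε of the observations, and f(x|θ) is the (intractable) likelihood.

```latex
% Generic SMC-sampler incremental weight, with successive targets
% \pi_{n-1} and \pi_n, forward kernel K_n and backward kernel L_{n-1}:
\[
w_n(z_{n-1}, z_n) \propto
\frac{\pi_n(z_n)\, L_{n-1}(z_n, z_{n-1})}{\pi_{n-1}(z_{n-1})\, K_n(z_{n-1}, z_n)}
\]

% If K_n leaves \pi_n invariant and L_{n-1} is chosen as its time reversal,
\[
L_{n-1}(z_n, z_{n-1}) = \frac{\pi_n(z_{n-1})\, K_n(z_{n-1}, z_n)}{\pi_n(z_n)}
\quad\Longrightarrow\quad
w_n \propto \frac{\pi_n(z_{n-1})}{\pi_{n-1}(z_{n-1})}
\]

% With ABC targets indexed by the threshold \epsilon,
\[
\pi_{\epsilon}(\theta, x_{1:M}) \propto \pi(\theta) \prod_{k=1}^{M} f(x_k \mid \theta)
\; \frac{1}{M} \sum_{k=1}^{M} \mathbb{1}_{A_{\epsilon}}(x_k),
\]
% the prior and the likelihood terms cancel in the ratio, leaving
\[
w_n \propto
\frac{\sum_{k=1}^{M} \mathbb{1}_{A_{\epsilon_n}}(x_k)}
     {\sum_{k=1}^{M} \mathbb{1}_{A_{\epsilon_{n-1}}}(x_k)}
\]
```

The cancellation in the last step relies on both targets being evaluated at the same point z_{n-1}, which is exactly what the invariance assumption on K buys.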

notes on Adaptive SMC for ABC

A few comments on the paper itself:

One of the strengths of the paper is that Del Moral et al. rely on repeated simulations of the x’s given the parameter, rather than using a single simulation given the parameter as in Sisson et al. (where this choice M=1 is considered as “the most computationally efficient”). In that perspective, each simulated parameter gets a non-zero weight that is proportional to the number of accepted x’s. I wonder if the choice of the number M of replications could be adapted to the efficiency of the estimation of the acceptance probability or something like that. The limiting case in M brings an exact simulation from the (tempered) target so there is a convergence principle and the stabilisation of the approximation could be assessed to control M.
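
As a toy illustration of the role of M (not the algorithm of the paper), the sketch below weights each simulated parameter by the fraction of its M pseudo-observations that fall within ε of the data. The Gaussian model, the N(0, 9) prior and the names `simulate` and `acceptance_weight` are all assumptions of mine.

```python
import random


def simulate(theta, rng):
    # Toy model (assumption): x | theta ~ Normal(theta, 1).
    return rng.gauss(theta, 1.0)


def acceptance_weight(theta, y_obs, eps, M, rng):
    """Fraction of M pseudo-observations that fall within eps of y_obs."""
    hits = sum(abs(simulate(theta, rng) - y_obs) <= eps for _ in range(M))
    return hits / M


rng = random.Random(0)
# Parameters drawn from a N(0, 9) prior (assumption).
thetas = [rng.gauss(0.0, 3.0) for _ in range(200)]

# M = 1: every weight is either 0 or 1, as in plain accept/reject ABC.
w1 = [acceptance_weight(t, y_obs=1.0, eps=0.5, M=1, rng=rng) for t in thetas]
# M = 50: weights are graded estimates of the acceptance probability.
w50 = [acceptance_weight(t, y_obs=1.0, eps=0.5, M=50, rng=rng) for t in thetas]
```

As M grows, each weight stabilises around the probability that a pseudo-observation lands in the ε-ball given θ, which is the kind of stabilisation one could monitor to choose M.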

The adaptivity in the ABC-SMC algorithm is dual in that (a) the Markov kernel on the parameter can be adapted in a typical PMC argument and (b) the thresholds ε_t can also be constructed on-line. The argument is to keep decreasing those thresholds slowly enough to keep a large number of accepted transitions from the previous sample. The only drawback I see is that the final value of the threshold ε is set in advance. Apart from the ultimate choice of ε = 0, which can be done in some settings as illustrated by the paper, this final value is difficult to calibrate. I also think the adaptivity criterion on the threshold ε could be Rao-Blackwellised into the sum of the importance weights rather than the indicators themselves. As noted in the paper, the ESS cannot be used per se since it is not monotonic.
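
A minimal sketch of an on-line threshold rule in this spirit, under my own assumptions: a particle counts as "alive" if at least one of its pseudo-observations is within ε of the data, and the next threshold is the smallest ε that keeps a fraction `alpha` of the currently alive particles. The function names, the bisection search and the default `alpha` are illustrative choices, not taken from the paper.

```python
def alive_fraction(distances, eps):
    """Share of particles with at least one pseudo-observation within eps.

    `distances[i]` holds the distances to the data of the M
    pseudo-observations attached to particle i.
    """
    return sum(min(d) <= eps for d in distances) / len(distances)


def next_threshold(distances, eps_prev, alpha=0.9, eps_final=0.0, tol=1e-8):
    """Smallest eps in [eps_final, eps_prev] (up to tol) that keeps at least
    alpha times the current share of alive particles."""
    target = alpha * alive_fraction(distances, eps_prev)
    lo, hi = eps_final, eps_prev
    if alive_fraction(distances, lo) >= target:
        return lo  # the final threshold can be reached directly
    # alive_fraction is non-decreasing in eps, so bisection applies.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if alive_fraction(distances, mid) >= target:
            hi = mid
        else:
            lo = mid
    return hi


# Four particles with two pseudo-observations each (made-up distances).
distances = [[0.1, 0.5], [0.2, 0.9], [0.3, 0.4], [0.8, 1.0]]
eps_next = next_threshold(distances, eps_prev=1.0, alpha=0.75)
```

Replacing the indicator `min(d) <= eps` with the sum of the importance weights would give the Rao-Blackwellised version of the criterion suggested above, as long as the resulting quantity stays monotonic in ε.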

Faster till when?

Posted in Running, Statistics on December 28, 2008 by xi'an

In the December issue of Significance, there is an interesting article by Joseph Hilbe (who also wrote a nice review on our book Bayesian Core in the Journal of Statistical Software) on the prediction of the ultimate time in athletics. (There are plenty of interesting articles in this issue, actually! In a few years, this general audience Statistics journal from the RSS has truly turned into an excellent magazine.) The questions are whether or not there is a lower bound, called the ultimate performance time, to the performance times in athletic events like the 100m or the marathon, and if yes, what is the predicted value for this ultimate performance time. Hilbe’s paper is actually a criticism of a single paper by Kruper and Sterken published this year in Statistical Thinking in Sports. The shortcomings of Kruper and Sterken’s model are clearly exposed, the most glaring one being that the ultimate performance time for the women’s 5000m has been beaten this year by four seconds… For instance, the predicted ultimate time for the men’s marathon is 2:00:56, in relation to the current world record of 2:03:59 by Gebrselassie. There is no reason to believe that younger men like Nicolas Menza, who won the Boulogne-Billancourt half-marathon this year in 1:00:12, cannot improve this record if training earlier for marathons. So the model of Kruper and Sterken is limited and more factors should be taken into account like training facilities and techniques, morphological improvements, nutrition and medical advances, not to mention drugs or genetic manipulations… But I do not understand the perspective taken by the paper that does not go further into Statistics than factual descriptive statistics and that concludes that “experience in the sport and a basic knowledge of human physiology and biomechanics are better (to make) predictions than mathematical analysis”. Hardly fitting for enhancing the profession for the general public and for a journal whose motto is Statistics making sense!!!

Snow white X’mas

Posted in Kids, Mountains, Travel on December 25, 2008 by xi'an

This is the time of our traditional X’mas ski trip and the conditions are fairly good this year: enough (if not plenty of) snow, (mostly) blue skies, really warm weather, and far fewer people! (This may be an impact of the economic slowdown but, despite repeated announcements of those perfect snow conditions, there are indeed fewer people both on the slopes and in the local village, compared with earlier years.) This clearly makes for better skiing, of course, even though I somehow find skiing vacations the ultimate lesson in frustration: for skiing rather than climbing tantalising nearby peaks (or La Meije in the distance), for spending only 2 minutes going down a slope and 15 minutes or more going up again on a freezing lift, for queueing at the lift (moderately so this year!), for ending up the day with only 16 to 20 minutes of excellent rides (say, four in the morning at the opening when there is no one yet on the slopes and three to four in the afternoon when closing, but this is less enjoyable as the slopes are mostly frozen), for never finding the same excellent skis as the ones I had once in Switzerland at Flumserberg near Zurich, the same dry snow as we had at Mount Hutt in New Zealand, the same smooth and long (9 minutes!) runs as we had earlier this year at Adap’skii in Bormio, etc., etc.! Anyway, this is still an overall pleasant time spent in the mountains and I wish all of you celebrating X’mas a very happy Christmas 2008.

Ps– A game for the lift that drives my kids crazy but that somehow alleviates the boredom of sitting in the cold is to try to guess the number of chairs in a given lift as early as possible. (If you wait long enough, you can obviously get the exact answer.) While this may sound like the traditional tramcar problem, it is not because the chair numbers you see are obviously correlated with your own chair number.

Misconceptions on Bayesianism

Posted in Books, Statistics, University life on December 22, 2008 by xi'an

It seems to me that the most common attack against Bayesianism relates to its sectarian aspects. While unjustified, this criticism is grounded and long-lasting for several reasons. The first one is the Bayesian claim to universality: no other branch of Statistics attempts to cover so generically all branches of Statistics, from estimation to testing, from design to non-parametrics, from minimax theory to graphical modelling. That Bayesian principles can integrate so smoothly all kinds of statistical optimalities may feel like propaganda to non-Bayesians, even though there are many proofs of this efficiency, from consistency to admissibility, from Dutch-book arguments to exchangeability (see de Finetti).

The second reason is that no other (major) approach to Statistics is so strongly anchored on philosophical principles. This deep connection with Philosophy sounds to me like a strong added value, in particular for analysing the nature of learning (see Savage and Dawid) and the influence of a prioris (back to Laplace and Poincaré), but the threads linking modern Statistics to Mathematics and to Informatics may make this additional link (and the argumentative discussions involved in some Bayesian papers) seem old-fashioned and un-scientific. (It is also true that the literature abounds with philosophical arguments that are not always of the highest quality, see for instance some of the introductory paragraphs of the nonetheless fundamental and foundational Theory of Probability!) The essential fact that a Bayesian analysis relies on the choice of a prior distribution inevitably opens the door to the sectarian criticism, even though it is as well an (the!) inevitable part of the Bayesian principles. That two different statistical analyses of the same data could lead to two different conclusions is seen by some as a major defect in the theory, while it seems unavoidable (see again Poincaré). The criticism is that the use of a prior is un-scientific or un-objective (or un-falsifiable in Popperian terms) and that this choice is based on tenets only understandable to members of the sect…

A third and related reason is that Bayesians have developed over the years a real sense of community. For one thing, no other (major) branch of Statistics has members so naturally gathered under a common denomination (e.g., likelihoodists?! Basu had coined the Indian construct likelihood-wallah for those using likelihood, but this has obviously not stuck! Fiducians could be the closest to this fame, but I am not even sure the name exists.) There are many good things in having a feeling of community and, as mentioned on Saturday, this includes real benefits in terms of collaborative research and in keeping the unitarian perspective of the statistical approach, but one drawback of communities is that people outside the community may naturally feel dismissive, ostracised, excluded, suspicious, jealous, or, in the most extreme cases, antagonistic and belligerent, i.e. anti-Bayesians. (This is a point shared with religions and sects, most obviously, that those not “in” are automatically “out”.) This community has also developed some traditions that could be dubbed “rituals”, like having meetings in sea resorts (in Spain and elsewhere), and alas rarely in cold and mountainous places (even though MCMC’ski could be the start of a new tradition!), with a strong emphasis on partying! Again, nothing wrong with adding a few extra good reasons for attending conferences, but this may not seem right to outsiders who have never attended a poster session at a Valencia meeting that starts in the hotel bar at 10am and ends up at two in the morning with people still loudly arguing around papers. Having launched Bayesian Analysis was also a great idea, even though I remember it being fiercely debated at a Bayesian meeting (where I must admit I voted against it!), but it also strengthens the (wrong) impression of a closed group with its own agenda “only publishing in its own journals”.

The last reason I want to point out is the fact that Bayesianism draws its name from one man, Thomas Bayes, and that, while there are good reasons for this filiation, this is also a feature shared with sects! Like any other branch of Statistics, Bayesian theory has been built on the work of many and this singling out of one person as the founder of the theory is unfortunate, especially when considering that Bayes’ main posthumous work is not really in Statistics and was rediscovered by Laplace a few years later. While it seems a wee late to switch the denomination, I really think the abuse of the (maybe apocryphal) picture of the Reverend on our webpages and in our talks and of the corresponding cult of personality including caring for Bayes’ tomb in a London cemetery (!) should cease to be part of our attitude. It would certainly help in reducing the sectarian libels.

Ps—The column Dr Fisher’s casebook in the recent December issue of Significance is quite representative of these misconceptions on Bayesianism, ranking Bayesians as born-again fundamentalists…

Joining a statistical society, why and which society?

Posted in Statistics, University life on December 20, 2008 by xi'an

Being currently a member of four statistical societies, namely the American Statistical Association (ASA), the Institute of Mathematical Statistics (IMS), the International Society for Bayesian Analysis (ISBA), and the Royal Statistical Society (RSS), I sometimes get questions from colleagues as to why I joined those societies. The initial reason for joining was for me to get access to the journals from those societies. For instance, I joined the IMS in 1987 and I have received the Annals of Statistics since then (I also subscribe to Annals of Applied Probability and to the new and exciting Annals of Applied Statistics). I am actually a life member of the IMS and very supportive of this society for its truly international and strongly academic orientations that bridge Probability and Statistics. Its meetings are always of the highest quality and the society is very active in sponsoring joint events (like the Adap’ski and MCMC’ski meetings) and in supporting young members (through the Laha award and through free membership for students) and members from developing countries. The society is also very open in recruiting willing members for its various committees and I have served on several of them in the past years. So, from an academic point of view, joining the IMS makes a lot of sense, the sooner the better.

In contrast, the ASA is somehow too professionally oriented to give me the same feeling of community that goes with my membership of ISBA, the IMS, or the RSS. So the main incentive there for me is to get a subscription to JASA, which is one of the top Statistics journals. I sometimes attend the annual JSM meeting but it is such a huge gathering that by the end of the first day I usually get tired of the crowd and of the need to rush from one end of a big convention center to the other and I do not get the same benefits as in smaller IMS meetings. The only WNAR meeting I attended (in Fairbanks, Alaska) was however much more interesting, if only because of its smaller size. I guess the weaker link I have with ASA is that it is a national society and that I have looser connections with US Statistics these days than I had earlier… The connections with the ASA staff are also much more impersonal because of the size of the society and this also contributes to weakening the feeling of community, contacts being too business-like (if highly efficient).

This may also explain the difference with my attachment to the Royal Statistical Society, despite it being also a national society. My colleagues in Paris will most likely blame this on my anglophilia, but I do feel the RSS is a superb society that runs efficiently but that is also member-oriented, with a strong emphasis on methodology and a well-balanced link between academia and the professional world. Their publications are also of the highest quality (and I hope I did not contribute too much in bringing the quality of JRSS B down!). As a national society, I also find their opening to members from abroad quite remarkable. The Research Section and its handling of Read Papers is a unique example of academic excellence and the recent addition of Pre-ordinary meetings an illustration of the way the RSS keeps reinventing itself. The presidency of Peter Green was particularly influential in this respect. (This excellence and wealth of activities is also the reason why I belong to a British society rather than to a French one!)

Last but not least, being a member of ISBA is also very logical for me as the sense of community is obviously the strongest there. In a sense, it is formalising a long-lasting link with persons I have known for years, met at numerous meetings and whose work I have read and worked on for years too. This feeling of sharing more than just an interest in a specific statistical methodology is certainly puzzling for those outside the community, but it is truly there and in the end it does contribute to boost research (and publications!) by making collaborative work almost self-evident. When fun and work mix so seamlessly together, it is indeed fun to work longer and harder! The same applies to the involvement in the committee work related with the society: we still are a small enough group that reading PhD theses for the Savage award or books for the DeGroot award is within one’s abilities. There is no reason to join for publications, since the on-line and exciting new journal, Bayesian Analysis, is freely available, but there are so many other reasons that this is not a drawback! Most Bayesian meetings will offer a free ISBA membership with the registration fees. So if you feel the slightest interest in Bayes’ theorem, Bayesian methodology, Bayesian philosophy, Bayesian theory, Bayesian networks, computing methods, or in applying any of those in a particular field, you should join ISBA and contribute to its life. (The membership fees are ridiculously low, for one thing, and there is a joint membership with IMS.)

Meetings on mixtures

Posted in Statistics on December 19, 2008 by xi'an

While preparing a future meeting on mixtures, I (surprisingly) found a still active link to the very first meeting I organised. It took place in the CNRS vacation center of Aussois (French Alps) in 1995 and it already focussed on mixtures. It was a very exciting meeting with some great talks, different perspectives and schools, and lasting connections. I also took part later in organising with Mike Titterington a smaller scale meeting on mixtures held at the International Centre for Mathematical Sciences (ICMS) in March 2001, where the small size of the audience allowed for open (and sometimes heated!) discussions.

The next meeting will be held at ICMS, in Edinburgh, in March 2010. It is organised jointly with Kerrie Mengersen and Mike Titterington, and it will be of the same type as the previous one, with a restricted audience made of invitees, and a small number of long talks followed by discussions and exchanges. However, anyone who would desperately like to attend, in particular PhD students from the UK with a topic related to mixtures, should feel free to contact me. (The purpose of restricting attendance is of course to preserve the congenial and interactive feeling of small meetings, which are mostly those where “something” happens!, and to concentrate the talks on a limited number of topics.) The topics involved for the 2010 meeting are

  • frequentist and Bayesian advances in mixture inference, especially concerning identifiability and connection with non-parametric and semi-parametric statistics—this includes the specific Bayesian issue of the label-switching phenomenon and proposals for its resolution or dismissal;
  • use of mixtures in mis-specified problems, in particular for data that are not independent and identically distributed;
  • new theory and methodology for computational issues, identifying advances and bottlenecks, covering associated issues of convergence, mixing and ways of reducing computational expense;
  • new latent-structure models, including the use of covariates in connection with components;
  • cross-fertilization concerning these and other aspects among those who contribute to the statistical, computer science and other literatures.

The attached picture of the Ring of Steall, facing Ben Nevis, was taken in 2003 during one of the most beautiful day “hikes” I ever did. (Sadly, another hiker fell to his death on the same ridge the very same day.)

O’Bayes 09

Posted in Statistics, University life on December 15, 2008 by xi'an

The next Objective Bayes meeting, O-Bayes09, is taking place in Philadelphia, in the superb modern building of the Wharton Business School, from June 5th till June 9th, 2009, and is organised by Larry Brown, Ed George, Linda Zhao and Kai Zhang. It follows earlier meetings on objective Bayes methodology held in Purdue, USA, 1996, València, Spain, 1998, Ixtapa, Mexico, 2000, Granada, Spain, 2002, Aussois, France, 2003—the one that I organised and where the O’Bayes nn brand was officially launched!—, Branson, MO, USA, 2005, and Rome, Italy, 2007. To borrow from the main page of the conference website, “the principal objectives of O-Bayes09 will be to facilitate the exchange of recent research developments in objective Bayes methodology, to provide opportunities for new researchers to shine, and to establish new collaborations and partnerships that will channel efforts into pending problems and open new directions for further study. O-Bayes09 will also serve to further crystallize objective Bayes methodology as an established area for statistical research.” This has always been an exciting meeting with a small enough attendance to facilitate debates and exchanges in a most congenial and traditional Bayesian way!

The conference starts on the 5th of June with a series of tutorials on Bayesian Statistics in the morning and with lectures on Jeffreys’ book Theory of Probability in the afternoon. I am organising this afternoon session following a reading class I gave this year in Paris. I will also give an introductory talk based on the reading paper we wrote this Spring with Nicolas Chopin and Judith Rousseau. Anyone interested in contributing to this session should feel free to contact me, since there is room for posters and discussants. I originally wanted to organise this meeting in St John’s College, Cambridge, to celebrate the 70th anniversary of the publication of Theory of Probability, but it is more than fitting to have a session on Jeffreys and his lasting influence at an Objective Bayes conference.