## this issue of Series B

Posted in Books, Statistics, Travel, University life on September 5, 2014 by xi'an

The September issue of [JRSS] Series B I received a few days ago is of particular interest to me. (And not as an ex-co-editor since I was never involved in any of those papers!) To wit: a paper by Hani Doss and Aixin Tan on evaluating normalising constants based on MCMC output, a preliminary version of which I had seen at a previous JSM meeting; a paper by Nick Polson, James Scott and Jesse Windle on the Bayesian bridge, connected with Nick’s talk in Boston earlier this month; and yet another paper by Ariel Kleiner, Ameet Talwalkar, Purnamrita Sarkar and Michael Jordan on the bag of little bootstraps, a presentation I heard Michael deliver a few times when he was in Paris. (Obviously, this does not imply any negative judgement on the other papers of this issue!)

For instance, Doss and Tan consider the multiple mixture estimator [my wording, the authors do not give the method a name, referring to Vardi (1985) but missing the connection with Owen and Zhou (2000)] of k ratios of normalising constants, namely

$\sum_{l=1}^k \frac{1}{n_l} \sum_{t=1}^{n_l} \dfrac{n_l g_j(x_t^l)}{\sum_{s=1}^k n_s g_s(x_t^l) z_1/z_s } \longrightarrow \dfrac{z_j}{z_1}$

where the z’s are the normalising constants and where the Markov chains may be run for different numbers of iterations. An interesting starting point (that Hans Künsch had mentioned to me a while ago but that I had since forgotten) is that the problem was reformulated by Charlie Geyer (1994) as a quasi-likelihood estimation where the ratios of all z’s relative to one reference density are the unknowns. This is doubly interesting, actually, because it recasts the constant estimation problem in a statistical light and thus somewhat relates to the infamous “paradox” raised by Larry Wasserman a while ago. The novelty in the paper is (a) to derive an optimal estimator of the ratios of normalising constants in the Markov case, essentially accounting for possibly different lengths of the Markov chains, and (b) to estimate the variance matrix of the ratio estimate by regeneration arguments. Regeneration is a favourite tool of mine, at least theoretically, since practically useful minorising conditions are hard to come by, if available at all.
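Since the unknown ratios appear on both sides of the display above, the estimator is really the solution of a self-consistency equation. Here is a minimal sketch of one way to solve it by fixed-point iteration. It is not taken from the paper: the function name `mixture_ratio_estimates`, the Gaussian toy densities and the use of i.i.d. draws in place of genuine MCMC output are all assumptions made for illustration.

```python
import numpy as np

def mixture_ratio_estimates(samples, unnorm_densities, n_iter=500, tol=1e-12):
    """Fixed-point solution of the multiple mixture equations.

    samples[l] holds the n_l draws targeting the l-th density and
    unnorm_densities[s] evaluates the unnormalised density g_s.
    Returns estimates of z_j / z_1 for j = 1, ..., k (first entry is 1).
    """
    pooled = np.concatenate(samples)
    n = np.array([len(s) for s in samples], dtype=float)
    # G[s, i] = g_s evaluated at the i-th pooled draw
    G = np.stack([g(pooled) for g in unnorm_densities])
    r = np.ones(len(samples))  # current guess of z_s / z_1
    for _ in range(n_iter):
        # denominator of the display: sum_s n_s g_s(x) z_1 / z_s
        denom = (n[:, None] * G / r[:, None]).sum(axis=0)
        new_r = (G / denom).sum(axis=1)
        new_r /= new_r[0]  # the first density is the reference
        converged = np.max(np.abs(new_r - r)) < tol
        r = new_r
        if converged:
            break
    return r

# Toy check (assumed setting): g_s(x) = exp(-x^2 / (2 sigma_s^2)),
# for which z_s / z_1 = sigma_s / sigma_1.
rng = np.random.default_rng(0)
sigmas = [1.0, 2.0, 3.0]
sizes = [4000, 6000, 5000]  # unequal sample sizes n_l
samples = [rng.normal(0.0, s, size=m) for s, m in zip(sigmas, sizes)]
densities = [lambda x, s=s: np.exp(-0.5 * (x / s) ** 2) for s in sigmas]
print(mixture_ratio_estimates(samples, densities))
```

In this toy setting the output should be close to (1, 2, 3), up to Monte Carlo error. With actual Markov chain output the point estimate is computed in the same way, but the variance assessment is precisely where the regeneration arguments of the paper come in.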

Posted in Statistics, University life on November 8, 2012 by xi'an

Following last week’s reading of Hartigan and Wong’s 1979 K-Means Clustering Algorithm, my Master students in the Reading Classics Seminar course listened today to Agnė Ulčinaitė covering Rob Tibshirani’s original LASSO paper Regression shrinkage and selection via the lasso in JRSS Series B. Here are her (Beamer) slides

Again not the easiest paper in the list, again mostly algorithmic and requiring some background on how it impacted the field. Even though Agnė also went through the Elements of Statistical Learning by Hastie, Friedman and Tibshirani, it was hard to step back from the paper and analyse more widely its importance, the connection with the Bayesian (linear) literature of the 70’s, its algorithmic and inferential aspects, like the computational cost, and the recent extensions like Bayesian LASSO. Or the issue of handling n<p models. Remember that one of the S’s in LASSO stands for shrinkage: it was quite pleasant to hear again about ridge estimators and Stein’s unbiased estimator of the risk, as those were themes of my Ph.D. thesis… (I hope the students do not get discouraged by the complexity of those papers: there were fewer questions and fewer students this time. Next week, the compass will move to the Bayesian pole with a talk on Lindley and Smith’s 1973 linear Bayes paper by one of my PhD students.)

## Monte Carlo Statistical Methods third edition

Posted in Books, R, Statistics, University life on September 23, 2010 by xi'an

Last week, George Casella and I worked around the clock on starting the third edition of Monte Carlo Statistical Methods by detailing the changes to make and designing the new table of contents. The new edition will not see a revolution in the presentation of the material but rather a more mature perspective on what matters most in statistical simulation:

## The day I invented Bayesian Lasso…

Posted in Books, Statistics on August 16, 2010 by xi'an

George Casella remarked to me last month in Padova that, once he and Trevor Park published The Bayesian Lasso in JASA, they received many claims for prior discovery of “Bayesian Lasso”! So, as a joke, let me add my claim as well! Indeed, in the first (1994) edition of The Bayesian Choice, I included an example in Chapter 4 (Example 4.2) about the fact that using a double exponential prior along with a Cauchy likelihood was producing a zero MAP (maximum a posteriori) estimate. Isn’t that the essence of the Bayesian lasso?! Of course, as you can still check in the current edition, the example was intended as a counter-example to the use of MAP estimates, not as an argument about the parsimony induced by double exponential priors. (Exercise 4.6 in both editions builds upon this example to notice that, with a small enough scale parameter, the absolute shrinkage to zero vanishes.) I thus lost the opportunity of “inventing” the Bayesian Lasso! To my shame, I must add that in the even earlier 1992 French edition of the book, I made a mistake in the derivation of the MAP and hence completely missed the point!!!
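For a quick numerical check of the zero MAP, here is a minimal sketch under an assumed parametrisation, which may differ from the one used in the book: a single observation x from a Cauchy distribution with location θ and unit scale, and a double exponential prior proportional to exp(-rate·|θ|). The names `log_posterior`, `map_estimate` and `rate` are mine, and the MAP is located by a crude grid search rather than analytically.

```python
import numpy as np

def log_posterior(theta, x, rate=1.0):
    """Log posterior up to a constant: double exponential prior with
    density proportional to exp(-rate * |theta|) (assumed parametrisation)
    times a Cauchy(theta, 1) likelihood for the single observation x."""
    return -rate * np.abs(theta) - np.log1p((x - theta) ** 2)

def map_estimate(x, rate=1.0, half_width=20.0, n_grid=40001):
    """Grid-search approximation of the MAP estimate of theta."""
    grid = np.linspace(-half_width, half_width, n_grid)
    return grid[np.argmax(log_posterior(grid, x, rate))]

# Compare a unit-rate prior with a much flatter one (rate = 0.1)
for x in (-3.0, 0.5, 3.0, 10.0):
    print(x, map_estimate(x, rate=1.0), map_estimate(x, rate=0.1))
```

Under this parametrisation, the derivative of the log posterior on θ>0 is -rate + 2u/(1+u²) with u = x-θ, and the second term never exceeds 1. With rate equal to 1 the posterior is therefore maximised at zero whatever x, while a small enough rate lets the MAP move away from zero, towards the observation.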