Michael Betancourt found this street name in London and used it for his talk in Seattle. Even though he should have photoshopped the dead end symbol, which begged for my sarcastic comment during the talk…
Archive for STAN
[Heading off to mountainous areas with no Internet or phone connection, I posted a series of entries for the following week, starting with this brilliant trailer of Michael:]
My first session today was Markov Chain Monte Carlo for Contemporary Statistical Applications with a heap of interesting directions in MCMC research! Now, without any possible bias (!), I would definitely nominate Murray Pollock (incidentally from Warwick) as the winner for best slides, funniest presentation, and most enjoyable accent! More seriously, the scalable Langevin algorithm he developed with Paul Fearnhead, Adam Johansen, and Gareth Roberts, is quite impressive in avoiding computing costly likelihoods. With of course caveats on which targets it applies to. Murali Haran showed a new proposal to handle high dimension random effect models by a projection trick that reduces the dimension. Natesh Pillai introduced us (or at least me!) to a spectral clustering that allowed for an automated partition of the target space, itself the starting point to his parallel MCMC algorithm. Quite exciting, even though I do not perceive partitions as an ideal solution to this problem. The final talk in the session was Galin Jones’ presentation of consistency results and conditions for multivariate quantities which is a surprisingly unexplored domain. MCMC is still alive and running!
The second MCMC session of the morning, Monte Carlo Methods Facing New Challenges in Statistics and Science, was equally diverse, with Lynn Kuo’s talk on the HAWK approach, where we discovered that harmonic mean estimators are still in use, e.g., in the MrBayes software employed in phylogenetic inference. The proposal to replace this awful estimator that should never be seen again (!) was rather closely related to an earlier solution of ours for marginal likelihood approximation, based there on a partition of the whole space rather than an HPD region in our case… Then, Michael Betancourt brilliantly acted as a proxy for Andrew to present the STAN language, with a flashy trailer he most recently designed. Featuring Andrew as the sole actor. And with great arguments for using it, including the potential to run expectation propagation (as a way of life). In fine, Faming Liang proposed a bootstrap subsampling version of the Metropolis-Hastings algorithm, acknowledging the resulting bias in the limiting distribution.
My first afternoon session was another entry on Statistical Phylogenetics, somewhat continued from yesterday’s session. Making me realise I had not seen a single talk on ABC for the entire meeting! The issues discussed in the session were linked with aligning sequences and comparing many trees. Again in settings where likelihoods can be computed more or less explicitly. Without any expertise in the matter, I wondered at a construction that would turn all trees into realizations of a continuous model. For instance by growing one branch at a time while removing the MRCA root… And maybe using a particle-like method to grow trees. As an aside, Vladimir Minin told me yesterday night about genetic mutations that could switch on and off phenotypes repeatedly across generations… For instance the ability to glow in the dark for species of deep sea fish.
When stating that I did not see a single talk about ABC, I omitted Steve Fienberg’s Fisher Lecture R.A. Fisher and the Statistical ABCs, keeping the morceau de choix for the end! Even though of course Steve did not mention the algorithm! A was for asymptotics, or ancillarity, B for Bayesian (or biducial??), C for causation (or cuffiency???)… Among other gems, I appreciated that Steve mentioned my great-grand father Darmois in connection with exponential families! And the connection with Jon Wellner’s LeCam Lecture from a few days ago. And reminding us that Savage was a Fisher lecturer himself. And that Fisher introduced fiducial distributions quite early. And for defending the Bayesian perspective. Steve also set some challenges like asymptotics for networks, Bayesian model assessment (I liked the notion of stepping out of the model), and randomization when experimenting with networks. And for big data issues. And for personalized medicine, building on his cancer treatment. No trace of the ABC algorithm, obviously, but a wonderful Fisher lecture, also most obviously!! Bravo, Steve, keep thriving!!!
The BayesComp MCMski V [or MCMskv for short] now has its official website, once again maintained by Merrill Liechty from Drexel University, Philadelphia, and registration is even open! The call for contributed sessions is now over, while the call for posters remains open until the very end. The novelty from the previous post is that there will be a “Breaking news” [in-between the Late news sessions at JSM and the crash poster talks at machine-learning conferences] session to highlight major advances among poster submissions. And that there will be an opening talk by Steve [the Bayesian] Scott on the 4th, about the frightening prospect of MCMC death!, followed by a round-table and a welcome reception, sponsored by the Swiss Supercomputing Centre. Hence the change in dates. Which still allows for arrivals in Zürich on January 4th [be with you].
On Monday, I went to Amsterdam to give a seminar at the University of Amsterdam, in the department of psychology. And to visit Eric-Jan Wagenmakers and his group there. And I had a fantastic time! I talked about our mixture proposal for Bayesian testing and model choice without getting hostile or adverse reactions from the audience, quite the opposite as we later discussed this new notion for several hours in the café across the street. I also had the opportunity to meet with Peter Grünwald [who authored a book on the minimum description length principle], who pointed out a minor inconsistency of the common parameter approach, namely that the Jeffreys prior on the first model did not have to coincide with the Jeffreys prior on the second model. (The Jeffreys prior for the mixture being unavailable.) He also wondered about a more conservative property of the approach, compared with the Bayes factor, in the sense that the non-null parameter could get closer to the null-parameter while still being identifiable.
Among the many persons I met in the department, Maarten Marsman talked to me about his thesis research, Plausible values in statistical inference, which involved handling the Ising model [a non-sparse Ising model with O(p²) parameters] by an auxiliary representation due to Marc Kac and getting rid of the normalising (partition) constant by the way. (Warning, some approximations involved!) And who showed me a simple probit example of the Gibbs sampler getting stuck as the sample size n grows. Simply because the uniform conditional distribution on the parameter concentrates faster (in 1/n) than the posterior (in 1/√n). This does not come as a complete surprise as data augmentation operates in an n-dimensional space. Hence it requires more time to get around. As a side remark [still worth printing!], Maarten dedicated his thesis as “To my favourite random variables, Siem en Fem, and to my normalizing constant, Esther”, from which I hope you can spot the influence of at least two of my book dedications! As I left Amsterdam on Tuesday, I had time for an enjoyable dinner with E-J’s group, an equally enjoyable early morning run [with perfect skies for sunrise pictures!], and more discussions in the department. Including a presentation of the new (delicious?!) Bayesian software developed there, JASP, which aims at non-specialists [i.e., researchers unable to code in R, BUGS, or, God forbid!, STAN]. And about the consequences of mixture testing in some psychological experiments. Once again, a fantastic time discussing Bayesian statistics and their applications, with a group of dedicated and enthusiastic Bayesians!
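The probit example mentioned above, where the Gibbs sampler gets stuck as n grows, can be reproduced with a minimal sketch. Everything specific here is an assumption rather than Maarten's actual setup: an intercept-only probit model P(y=1)=Φ(θ), a flat prior on θ, and the augmentation z_i ~ N(0,1) with y_i = 1 when z_i ≤ θ, which makes the conditional of θ uniform on the gap between the largest z_i with y_i=1 and the smallest z_i with y_i=0. The function names (`simulate_probit`, `gibbs_probit`, `mean_abs_step`) are hypothetical.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for cdf and inverse cdf


def _clip(u, eps=1e-12):
    """Keep u strictly inside (0, 1) so that inv_cdf is always defined."""
    return min(max(u, eps), 1.0 - eps)


def simulate_probit(n, theta_true=0.3, seed=0):
    """Draw n binary outcomes with P(y = 1) = Phi(theta_true)."""
    rng = random.Random(seed)
    p = STD_NORMAL.cdf(theta_true)
    return [1 if rng.random() < p else 0 for _ in range(n)]


def gibbs_probit(y, n_iter=100, theta0=0.0, seed=0):
    """Data-augmentation Gibbs sampler for the assumed model (flat prior on theta)."""
    if 0 not in y or 1 not in y:
        raise ValueError("need both outcomes for a proper posterior")
    rng = random.Random(seed)
    theta, chain = theta0, []
    for _ in range(n_iter):
        p = STD_NORMAL.cdf(theta)
        lower, upper = float("-inf"), float("inf")
        for yi in y:
            if yi == 1:
                # z_i | theta, y_i = 1: N(0,1) truncated to (-inf, theta]
                z = STD_NORMAL.inv_cdf(_clip(p * rng.random()))
                lower = max(lower, z)
            else:
                # z_i | theta, y_i = 0: N(0,1) truncated to (theta, +inf)
                z = STD_NORMAL.inv_cdf(_clip(p + (1.0 - p) * rng.random()))
                upper = min(upper, z)
        # theta | z, y: uniform on the gap between the two groups of latents
        theta = rng.uniform(lower, upper)
        chain.append(theta)
    return chain


def mean_abs_step(chain):
    """Average absolute move of theta between consecutive iterations."""
    return sum(abs(b - a) for a, b in zip(chain, chain[1:])) / (len(chain) - 1)


if __name__ == "__main__":
    for n in (50, 1000):
        chain = gibbs_probit(simulate_probit(n, seed=1), n_iter=100, seed=2)
        # compare the typical move with the posterior scale 1/sqrt(n)
        print(n, mean_abs_step(chain), n ** -0.5)
```

In this sketch the gap available to θ at each iteration is set by the latent variables closest to the current value, so its width shrinks like 1/n, while the posterior spread only shrinks like 1/√n. The chain therefore needs more and more iterations to cross the posterior as n increases.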
Following the highly successful [authorised opinion!, from objective sources] MCMski IV, in Chamonix last year, the BayesComp section of ISBA has decided in favour of a two-year period, which brings the great item of news that next year we will meet again for MCMski V [or MCMskv for short], this time on the snowy slopes of the Swiss town of Lenzerheide, south of Zürich. The committees are headed by the indefatigable Antonietta Mira and Mark Girolami. The plenary speakers have already been contacted and Steve Scott (Google), Steve Fienberg (CMU), David Dunson (Duke), Krys Latuszynski (Warwick), and Tony Lelièvre (Mines, Paris), have agreed to talk. Similarly, the nine invited sessions have been selected and will include Hamiltonian Monte Carlo, Algorithms for Intractable Problems (ABC included!), Theory of (Ultra)High-Dimensional Bayesian Computation, Bayesian NonParametrics, Bayesian Econometrics, Quasi Monte Carlo, Statistics of Deep Learning, Uncertainty Quantification in Mathematical Models, and Biostatistics. There will be afternoon tutorials, including a practical session from the Stan team, tutorials for which the call is open, poster sessions, and a conference dinner at which we will be entertained by the unstoppable Imposteriors. The Richard Tweedie ski race is back as well, with a pair of Blossom skis for the winner!