Archive for STAN

Masterclass in Bayesian Statistics in Marseilles next Fall

Posted in Books, Kids, Mountains, pictures, R, Running, Statistics, Travel, University life with tags , , , , , , , , , , , , , , , , on April 9, 2018 by xi'an

This post is to announce a second occurrence of the exciting “masterclass in Bayesian Statistics” that we organised in 2016, near Marseilles. It will take place on 22-26 October 2018 once more at CIRM (Centre International de Recherches Mathématiques, Luminy, Marseilles, France). The targeted audience includes all scientists interested in learning how Bayesian inference may be used to tackle the practical problems they face in their own research. In particular PhD students and post-docs should benefit most directly from this masterclass. Among the invited speakers, Kerrie Mengersen from QUT, Brisbane, visiting Marseilles this Fall, will deliver a series of lectures on the interface between Bayesian statistics and applied modelling, Havard Rue from KAUST will talk on computing with INLA, and Aki Vehtari from Aalto U, Helsinki, will give a course on Bayesian model assessment and model choice. There will be two tutorials on R and on Stan.

All interested participants in this masterclass should pre-register as early as possible, given that the total attendance is limited to roughly 90 participants. Some specific funding for local expenses (i.e., food + accommodation on-siteat CIRM) is available (thanks to CIRM, and potentially to Fondation Jacques Hadamard, to be confirmed); this funding will be attributed by the scientific committee, with high priority to PhD students and post-docs.

StanCon in Helsinki [29-31 Aug 2018]

Posted in Books, pictures, R, Statistics, Travel, University life with tags , , , , , , , , on March 7, 2018 by xi'an

As emailed to me by Aki Vehtari, the next StanCon will take place this summer in the wonderful city of Helsinki, at the end of August. On Aalto University Töölö Campus precisely. The list of speakers and tutorial teachers is available on the webpage. (The only “negative point” is that the conference does not include a Tuesday, the night of the transcendence 2 miles race!) Somewhat concluding this never-ending summer of Bayesian conferences!

bridgesampling [R package]

Posted in pictures, R, Statistics, University life with tags , , , , , , , , , on November 9, 2017 by xi'an

Quentin F. Gronau, Henrik Singmann and Eric-Jan Wagenmakers have arXived a detailed documentation about their bridgesampling R package. (No wonder that researchers from Amsterdam favour bridge sampling!) [The package relates to a [52 pages] tutorial on bridge sampling by Gronau et al. that I will hopefully comment soon.] The bridge sampling methodology for marginal likelihood approximation requires two Monte Carlo samples for a ratio of two integrals. A nice twist in this approach is to use a dummy integral that is already available, with respect to a probability density that is an approximation to the exact posterior. This means avoiding the difficulties with bridge sampling of bridging two different parameter spaces, in possibly different dimensions, with potentially very little overlap between the posterior distributions. The substitute probability density is chosen as Normal or warped Normal, rather than a t which would provide more stability in my opinion. The bridgesampling package also provides an error evaluation for the approximation, although based on spectral estimates derived from the coda package. The remainder of the document exhibits how the package can be used in conjunction with either JAGS or Stan. And concludes with the following words of caution:

“It should also be kept in mind that there may be cases in which the bridge sampling procedure may not be the ideal choice for conducting Bayesian model comparisons. For instance, when the models are nested it might be faster and easier to use the Savage-Dickey density ratio (Dickey and Lientz 1970; Wagenmakers et al. 2010). Another example is when the comparison of interest concerns a very large model space, and a separate bridge sampling based computation of marginal likelihoods may take too much time. In this scenario, Reversible Jump MCMC (Green 1995) may be more appropriate.”

a conceptual introduction to HMC [reply from the author]

Posted in Statistics with tags , , , , , , , , on September 8, 2017 by xi'an

[Here is the reply on my post from Michael Bétancourt, detailed enough to be promoted from comment to post!]

As Dan notes this is meant as an introduction for those without a strong mathematical background, hence the focus on concepts rather than theorems! There’s plenty of maths deeper in the references. ;-)

 I am not sure I get this sentence. Either it means that an expectation remains invariant under reparameterisation. Or something else and more profound that eludes me. In particular because Michael repeats later (p.25) that the canonical density does not depend on the parameterisation.

What I was trying to get at is that expectations and really all of measure theory are reparameteriztion invariant, but implementations of statistical algorithms that depend on parameterization-dependent representations, namely densities, are not. If your algorithm is sensitive to these parameterization dependencies then you end up with a tuning problem — which parameterization is best? — which makes it harder to utilize the algorithm in practice.

Exact implementations of HMC (i.e. without an integrator) are fully geometric and do not depend on any chosen parameterization, hence the canonical density and more importantly the Hamiltonian being an invariant objects. That said, there are some choices to be made in that construction, and those choices often look like parameter dependencies. See below!

“Every choice of kinetic energy and integration time yields a new Hamiltonian transition that will interact differently with a given target distribution (…) when poorly-chosen, however, the performance can suffer dramatically.”

This is exactly where it’s easy to get confused with what’s invariant and what’s not!

The target density gives rise to a potential energy, and the chosen density over momenta gives rise to a kinetic energy. The two energies transform in opposite ways under a reparameterization so their sum, the Hamiltonian, is invariant.

Really there’s a fully invariant, measure-theoretic construction where you use the target measure directly and add a “cotangent disintegration”.

In practice, however, we often choose a default kinetic energy, i.e. a log density, based on the parameterization of the target parameter space, for example an “identify mass matrix” kinetic energy. In other words, the algorithm itself is invariant but by selecting the algorithmic degrees of freedom based on the parameterization of the target parameter space we induce an implicit parameter dependence.

This all gets more complicated when we introducing the adaptation we use in Stan, which sets the elements of the mass matrix to marginal variances which means that the adapted algorithm is invariant to marginal transformations but not joint ones…

The explanation of the HMC move as a combination of uniform moves along isoclines of fixed energy level and of jumps between energy levels does not seem to translate into practical implementations, at least not as explained in the paper. Simulating directly the energy distribution for a complex target distribution does not seem more feasible than moving up likelihood levels in nested sampling.

Indeed, being able to simulate exactly from the energy distribution, which is equivalent to being able to quantify the density of states in statistical mechanics, is intractable for the same reason that marginal likelihoods are intractable. Which is a shame, because conditioned on those samples HMC could be made embarrassingly parallel!

Instead we draw correlated samples using momenta resamplings between each trajectory. As Dan noted this provides some intuition about Stan (it reduced random walk behavior to one dimension) but also motivates some powerful energy-based diagnostics that immediately indicate when the momentum resampling is limiting performance and we need to improve it by, say, changing the kinetic energy. Or per my previous comment, by keeping the kinetic energy the same but changing the parameterization of the target parameter space. :-)

In the end I cannot but agree with the concluding statement that the geometry of the target distribution holds the key to devising more efficient Monte Carlo methods.

Yes! That’s all I really want statisticians to take away from the paper. :-)

Statistical rethinking [book review]

Posted in Books, Kids, R, Statistics, University life with tags , , , , , , , , , , , , , , , , , , , , , on April 6, 2016 by xi'an

Statistical Rethinking: A Bayesian Course with Examples in R and Stan is a new book by Richard McElreath that CRC Press sent me for review in CHANCE. While the book was already discussed on Andrew’s blog three months ago, and [rightly so!] enthusiastically recommended by Rasmus Bååth on Amazon, here are the reasons why I am quite impressed by Statistical Rethinking!

“Make no mistake: you will wreck Prague eventually.” (p.10)

While the book has a lot in common with Bayesian Data Analysis, from being in the same CRC series to adopting a pragmatic and weakly informative approach to Bayesian analysis, to supporting the use of STAN, it also nicely develops its own ecosystem and idiosyncrasies, with a noticeable Jaynesian bent. To start with, I like the highly personal style with clear attempts to make the concepts memorable for students by resorting to external concepts. The best example is the call to the myth of the golem in the first chapter, which McElreath uses as an warning for the use of statistical models (which almost are anagrams to golems!). Golems and models [and robots, another concept invented in Prague!] are man-made devices that strive to accomplish the goal set to them without heeding the consequences of their actions. This first chapter of Statistical Rethinking is setting the ground for the rest of the book and gets quite philosophical (albeit in a readable way!) as a result. In particular, there is a most coherent call against hypothesis testing, which by itself justifies the title of the book. Continue reading

MCMskv, Lenzerheide, 4-7 Jan., 2016 [news #2]

Posted in Mountains, pictures, Statistics, Travel, University life with tags , , , , , , , , , , , , on September 7, 2015 by xi'an

moonriseA quick reminder that the early bird registration deadline for BayesComp MCMski V is drawing near. And reminding Og’s readers that there will be a “Breaking news” session to highlight major advances among poster submissions. For which they can apply when sending the poster template. In addition, there is only a limited number of hotel rooms at the Schweizerhof, the main conference hotel and the first 40 participants who will make a reservation there will get a free one-day skipass!

STAN [no dead end]

Posted in Books, Statistics, Travel with tags , , on August 22, 2015 by xi'an

stanmoreMichael Betancourt found this street name in London and used it for his talk in Seattle. Even though he should have photoshopped the dead end symbol, which begged for my sarcastic comment during the talk…