Archive for banana

adaptive incremental mixture MCMC

Posted in Statistics on August 12, 2022 by xi'an

Sadly, I missed this adaptive incremental mixture MCMC paper by my friends Florian Maire, Nial Friel, Antonietta Mira, and Adrian E. Raftery when it came out in JCGS in 2019. The core of the paper is about building a time-inhomogeneous mixture independent proposal, starting from an initial distribution and adding one component whenever the chain hits a point where the ratio target/proposal is large, since such a point signals a part of the space that is not well-enough explored; the other components do not change, except for a proportional decrease in their weights. This proposal reminded me of Gåsemyr (2003), which in some ways inspired our population Monte Carlo sampler. Obviously, there is a what-you-see-is-what-you-get drawback to the approach, in that regions where this ratio is high may never be visited by the chain, and hence never trigger a new component, despite the adaptivity.

The added component is Normal, centred at the associated (accepted) proposed value ø, with a covariance matrix locally estimated from past iterations of the algorithm, and with weight proportional to the (powered) target density at ø, which does not require a normalising constant. The method however requires setting a certain number of calibration parameters, like the power γ for the weight, the lower bound M for the target-to-proposal ratio, and the rate of diminishing adaptation (also needed for ergodicity à la Roberts and Rosenthal, 2007), along with the implicit choice of a parameterisation for the Normal mixture that is close enough to the target. In the posted experiments, the number of components in the mixture does not grow to unmanageable figures, but a further adaptation could be to remove components that are inactive or lead to systematic rejection, as we did in the population Monte Carlo paper.
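For concreteness, here is a minimal Python sketch of the incremental-mixture idea as I read it, not the authors' implementation: an independence Metropolis-Hastings step with a Gaussian mixture proposal that gains a Normal component whenever the target-to-proposal ratio at the proposed point exceeds M, the new weight being proportional to the powered target density. The local covariance estimate and the diminishing-adaptation schedule are left out, and all names and parameter values are illustrative.

import numpy as np
from scipy.stats import multivariate_normal as mvn

def aimm_sketch(log_target, x0, n_iter=5000, M=10.0, gamma=0.5, sigma2=1.0, seed=0):
    rng = np.random.default_rng(seed)
    d = len(x0)
    means = [np.array(x0, dtype=float)]   # initial single-component proposal
    covs = [sigma2 * np.eye(d)]
    weights = [1.0]

    def log_q(z):
        # log density of the current mixture proposal at z
        w = np.array(weights) / np.sum(weights)
        parts = [np.log(wk) + mvn.logpdf(z, mk, ck)
                 for wk, mk, ck in zip(w, means, covs)]
        return np.logaddexp.reduce(parts)

    chain = np.empty((n_iter, d))
    x = np.array(x0, dtype=float)
    lp_x = log_target(x)
    for t in range(n_iter):
        # independent proposal drawn from the current mixture
        k = rng.choice(len(weights), p=np.array(weights) / np.sum(weights))
        y = rng.multivariate_normal(means[k], covs[k])
        lp_y = log_target(y)
        # independence Metropolis-Hastings acceptance step
        if np.log(rng.uniform()) < (lp_y - log_q(y)) - (lp_x - log_q(x)):
            x, lp_x = y, lp_y
        # incremental step: add a component where target/proposal is large
        if lp_y - log_q(y) > np.log(M):
            means.append(y)
            covs.append(sigma2 * np.eye(d))       # the paper uses a local covariance estimate instead
            weights.append(np.exp(gamma * lp_y))  # weight proportional to the powered target density
        chain[t] = x
    return chain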

return of the boomerang

Posted in Books, pictures, Statistics, Travel on January 26, 2021 by xi'an

Pagani, Wiegand and Nadarajah wrote a paper last Spring on the Rosenbrock distribution. Now, I did not know this distribution under that name, but as the banana benchmark distribution I met for the first time in the 1999 Haario, Saksman and Tamminen paper on adaptive MCMC, and that I used in several papers (the picture below being borrowed from Statisfaction!)

The Rosenbrock function was introduced by… Howard Rosenbrock in 1960 in a computer journal as a benchmark for optimisation. (Or by someone else, to keep up with Stigler’s Law of Eponymy.) It can be turned into a probability density by exponentiating its negative. It corresponds to a Normal N(μ,σ²) marginal on the first component, followed by T Normal

N(xₜ₋₁², σ²/10)

conditional distributions on the following components. The density is thus fully known, including its normalising constant, and easy to simulate from, hence convenient as a fat-tailed target for benchmarking MCMC algorithms. The authors propose an extension, the hybrid Rosenbrock, where several parallel sequences stem from the same component, but it is unclear to me how useful a generalisation this is…
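Since the density is explicit, simulating from it is a one-liner per coordinate; the following Python sketch (with illustrative parameter values) both draws from and evaluates the banana target as described above.

import numpy as np

def rosenbrock_sample(n, T=2, mu=1.0, sigma2=1.0, seed=0):
    rng = np.random.default_rng(seed)
    x = np.empty((n, T))
    x[:, 0] = rng.normal(mu, np.sqrt(sigma2), size=n)                    # N(μ, σ²) marginal
    for t in range(1, T):
        x[:, t] = rng.normal(x[:, t - 1] ** 2, np.sqrt(sigma2 / 10.0))   # N(xₜ₋₁², σ²/10) conditionals
    return x

def rosenbrock_logpdf(x, mu=1.0, sigma2=1.0):
    # fully known log density, normalising constant included
    lp = -0.5 * np.log(2 * np.pi * sigma2) - (x[:, 0] - mu) ** 2 / (2 * sigma2)
    for t in range(1, x.shape[1]):
        lp += (-0.5 * np.log(2 * np.pi * sigma2 / 10.0)
               - (x[:, t] - x[:, t - 1] ** 2) ** 2 / (2 * sigma2 / 10.0))
    return lp

With T=2 this produces the familiar banana shape.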

deep and embarrassingly parallel MCMC

Posted in Books, pictures, Statistics on April 9, 2019 by xi'an

Diego Mesquita, Paul Blomstedt, and Samuel Kaski (from Helsinki, like the above picture) just arXived a paper on embarrassingly parallel MCMC, following a series of papers discussed on this ‘og in the past. They use the deep learning approach of Dinh et al. (2017) to compute the probability density of a convoluted and non-volume-preserving transform of a given random variable, in order to turn multiple samples from sub-posteriors [corresponding to the k-th roots of the true posterior] into a sample from the true posterior. If I understand the argument [on page 4] correctly, the deep neural network provides a density estimate that apparently does better than traditional non-parametric density estimates, maybe by being more efficient than a Parzen-Rosenblatt estimator, whose cost is of the order of the number of simulations… For any value of θ, the estimate of the true target is the product of these estimates, and for a value of θ simulated from one of the sub-posteriors an importance weight naturally ensues. However, for a one-dimensional transform h(θ) of θ, I would prefer first estimating the density of h(θ) for each sample and then constructing an importance weight, if only to avoid the curse of dimensionality.
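To fix ideas, here is a bare-bones Python sketch of the recombination step, with scipy's Gaussian kernel density estimator standing in for the deep density estimator of Dinh et al. (2017) used in the paper: each sub-posterior density is estimated from its own sample, the product of the k estimates stands for the full posterior, and draws from one sub-posterior receive importance weights accordingly. Everything here (names, the KDE stand-in) is illustrative rather than the authors' procedure.

import numpy as np
from scipy.stats import gaussian_kde

def recombine_subposteriors(subsamples):
    # subsamples: list of k arrays of shape (n, d), one per sub-posterior
    kdes = [gaussian_kde(s.T) for s in subsamples]   # one density estimate per sub-posterior
    theta = subsamples[0]                            # candidate draws from the first sub-posterior
    # log of the product of the k estimated sub-posterior densities at theta
    log_prod = np.sum([kde.logpdf(theta.T) for kde in kdes], axis=0)
    # importance weight: estimated full posterior over the proposal (first sub-posterior estimate)
    log_w = log_prod - kdes[0].logpdf(theta.T)
    w = np.exp(log_w - np.max(log_w))                # stabilise before normalising
    return theta, w / w.sum()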

On various benchmarks, like the banana-shaped 2D target above, the proposed method (NAP) does better, even in relatively high dimensions. Given that overall computing times are not reported, only the calibration that the same number of subsamples was produced for all methods, it would be interesting to test the same performances in even higher dimensions and with larger population sizes.

Can we have our Banana cake and eat it too?

Posted in Statistics on February 10, 2018 by xi'an

gone banamaths!

Posted in pictures, University life on April 4, 2016 by xi'an
