## Archive for Ultimixt

## mixture modelling for testing hypotheses

Posted in Books, Statistics, University life with tags Bayes factor, Bayesian hypothesis testing, Christophe Andrieu, controlled MCMC, JRSSB, peer review, Read paper, revision, testing as mixture estimation, Ultimixt, University of Bristol on January 4, 2019 by xi'an

**A**fter a fairly long delay (since the first version was posted and submitted in December 2014), we eventually revised and resubmitted our paper with Kaniav Kamary [who has now graduated], Kerrie Mengersen, and Judith Rousseau on the final day of 2018. The main responsibility for this massive delay is mine, as I got fairly depressed by the general tone of the dozen reviews we received after submitting the paper as a Read Paper in the Journal of the Royal Statistical Society, despite a rather opposite reaction from the community (an admittedly biased sample!), including two dozen citations in other papers. (There seems to be a pattern in my submissions of Read Papers, witness our earlier and unsuccessful attempt with Christophe Andrieu in the early 2000s with the paper on controlled MCMC, leading to 121 citations so far according to G scholar.)

Anyway, thanks to my co-authors keeping up the fight, we started working on a revision including stronger convergence results, managing to show that the approach leads to an optimal separation rate, contrary to the Bayes factor, which has an extra √log(n) factor. This may sound paradoxical since, while the Bayes factor converges to 0 under the alternative model exponentially quickly, the convergence rate of the mixture weight α to 1 is of order 1/√n, but this does not mean that the separation rate of the procedure based on the mixture model is worse than that of the Bayes factor. On the contrary, while it is well known that the Bayes factor leads to a separation rate of order √log(n)/√n in parametric models, we show that our approach can lead to a testing procedure with a better separation rate of order 1/√n. We also studied a non-parametric setting where the null is a specified family of distributions (e.g., Gaussians) and the alternative is a Dirichlet process mixture, establishing that the posterior distribution concentrates around the null at the rate √log(n)/√n.

We thus resubmitted the paper for publication, although not as a Read Paper, with hopefully more luck this time!

## weakly informative reparameterisations

Posted in Books, pictures, R, Statistics, University life with tags Bayesian modelling, Edinburgh, Gaussian mixture, JCGS, location-scale parameterisation, moments, non-informative priors, publication, R package, Ultimixt on February 14, 2018 by xi'an

**O**ur paper, weakly informative reparameterisations of location-scale mixtures, with Kaniav Kamary and Kate Lee, got accepted by JCGS! Great news, which comes with perfect timing for Kaniav as she is currently applying for positions. The paper proposes a Bayesian modelling of unidimensional mixtures based on first and second moment constraints, since these turn the remainder of the parameter space into a compact set. While we had already developed an associated R package, Ultimixt, the current editorial policy of JCGS requires the R code used to produce all results to be attached to the submission, and it took us a few more weeks than it should have to produce directly executable code, due to internal library incompatibilities. (For this entry, I was looking for a link to our special JCGS issue with my picture of Edinburgh but realised I did not have this picture.)

## mixtures are slices of an orange

Posted in Kids, R, Statistics with tags CFE 2015, Gaussian mixture, hyperparameter, improper priors, invariance, Lenzerheide, location-scale parameterisation, London, MCMskv, Metropolis-Hastings algorithm, mixtures of distributions, non-informative priors, poster, R, reference priors, Switzerland, Ultimixt on January 11, 2016 by xi'an

**A**fter presenting this work in both London and Lenzerheide, Kaniav Kamary, Kate Lee and I arXived and submitted our paper on a new parametrisation of location-scale mixtures. Although it took a long while to finalise the paper, given that we came up with the original and central idea about a year ago, I remain quite excited by this new representation of mixtures, because the use of a global location-scale (hyper-)parameter doubling as the mean and standard deviation of the mixture itself implies that all the other parameters of this mixture model [besides the weights] belong to the intersection of a unit hypersphere with a hyperplane. [Hence the title above, which I regretted not using for the poster at MCMskv!]

This realisation that using a (meaningful) hyperparameter (μ,σ) leads to a compact parameter space for the component parameters is important for inference in such mixture models, in that the hyperparameter (μ,σ) is easily estimated from the entire sample, while the other parameters can be studied using a non-informative prior like the Uniform prior on the ensuing compact space. This non-informative prior for mixtures is something I have been seeking for many years, hence my on-going excitement! In the mid-1990s, we looked with Kerrie Mengersen at a Russian doll type parametrisation that used the “first” component as defining the location-scale reference for the entire mixture, and expressed each new component as a local perturbation of the previous one.
While this is a similar idea to the current one, it falls short of leading to a natural non-informative prior: because of the lack of compactness of the parameter space, we were forced to devise a proper prior on the variance that was a mixture of a Uniform U(0,1) and of an inverse Uniform 1/U(0,1). Here, fixing both mean and variance (or even just the variance) binds the mixture parameter to an ellipse conditional on the weights, a space that can be turned into the unit sphere via a natural reparameterisation. Furthermore, the intersection with the hyperplane leads to a closed-form spherical reparameterisation. Yay!
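To make the geometry concrete, here is a minimal numerical sketch, written in plain Python rather than R purely for illustration. It assumes the components are written as μᵢ = μ + σγᵢ and σᵢ = σηᵢ, with (μ,σ) the mean and standard deviation of the whole mixture; the helper name `standardise` and the example values are hypothetical and not taken from Ultimixt. Under that assumption, the weighted γᵢ sum to zero (the hyperplane) and the vector with entries √pᵢγᵢ and √pᵢηᵢ has unit norm (the sphere).

```python
import math

def standardise(weights, means, sds):
    """Map component (mean, sd) pairs to standardised (gamma, eta) pairs.

    Hypothetical helper for illustration: mu and sigma are the mean and
    standard deviation of the whole mixture, and each component is written
    as mean_i = mu + sigma * gamma_i and sd_i = sigma * eta_i.
    """
    mu = sum(p * m for p, m in zip(weights, means))
    var = sum(p * (s ** 2 + (m - mu) ** 2)
              for p, m, s in zip(weights, means, sds))
    sigma = math.sqrt(var)
    gammas = [(m - mu) / sigma for m in means]
    etas = [s / sigma for s in sds]
    return mu, sigma, gammas, etas

# Arbitrary three-component example (values chosen for illustration only).
weights = [0.3, 0.5, 0.2]
means = [-2.0, 0.5, 4.0]
sds = [1.0, 0.5, 2.0]
mu, sigma, gammas, etas = standardise(weights, means, sds)

# Hyperplane constraint: the weighted gammas sum to zero.
hyperplane = sum(p * g for p, g in zip(weights, gammas))
# Sphere constraint: (sqrt(p_i) * gamma_i, sqrt(p_i) * eta_i) has unit norm.
sphere = sum(p * (g ** 2 + e ** 2) for p, g, e in zip(weights, gammas, etas))
print(hyperplane, sphere)
```

Up to floating-point error, the first printed value is zero and the second is one, whatever the starting weights, means and standard deviations.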

While I do not wish to get into the debate about the [non-]existence of “non-informative” priors at this stage, I think being able to use the invariant reference prior π(μ,σ)=1/σ is quite neat here because the inference on the mixture parameters should be location and scale equivariant. The choice of the prior on the remaining parameters is of lesser importance, the Uniform over the compact space being one example, although we did not study this impact in depth, being satisfied with the outputs produced from the default (Uniform) choice.

From a computational perspective, the new parametrisation can be easily turned into the old parametrisation, and hence leads to a closed-form likelihood. This implies that a Metropolis-within-Gibbs strategy can be easily implemented, as we did in the derived Ultimixt R package. (I was not involved in the programming, solely suggesting the name *Ultimixt*, from ultimate mixture parametrisation, a former title that we eventually dropped for the paper.)
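As a rough illustration of why the likelihood stays in closed form, the sketch below maps the global pair (μ,σ) and standardised component parameters back to component means and standard deviations, then evaluates a Gaussian mixture log-likelihood. It is written in Python for illustration only, assumes Gaussian components, and uses hypothetical function names; it is not the Ultimixt code or its interface.

```python
import math

def to_component_params(mu, sigma, gammas, etas):
    """Return component means and sds from the global (mu, sigma) and the
    standardised (gamma_i, eta_i); hypothetical helper for illustration."""
    means = [mu + sigma * g for g in gammas]
    sds = [sigma * e for e in etas]
    return means, sds

def gaussian_mixture_loglik(data, weights, mu, sigma, gammas, etas):
    """Log-likelihood of a Gaussian mixture (Gaussian components are an
    assumption of this sketch), evaluated in the usual parametrisation."""
    means, sds = to_component_params(mu, sigma, gammas, etas)
    total = 0.0
    for x in data:
        density = sum(
            p * math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))
            for p, m, s in zip(weights, means, sds)
        )
        total += math.log(density)
    return total

# Two components satisfying both constraints:
# 0.5 * (-0.6) + 0.5 * 0.6 = 0 and 0.5 * (0.36 + 0.64) * 2 = 1.
data = [-1.2, 0.3, 0.8, 2.1]
loglik = gaussian_mixture_loglik(
    data, weights=[0.5, 0.5], mu=0.5, sigma=1.5,
    gammas=[-0.6, 0.6], etas=[0.8, 0.8],
)
print(loglik)
```

Since the likelihood can be evaluated pointwise in this way, a Metropolis-within-Gibbs sampler only needs proposals that keep the standardised parameters on the constrained space.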

Discussing the paper at MCMskv was very helpful in that I got very positive feedback about the approach, as well as better arguments to justify the approach and its appeal. It also led me to think about several extensions outside location-scale families, if not in higher dimensions, which remain a practical challenge (in the sense of designing a parametrisation of the covariance matrices in terms of the global covariance matrix).

## MCMskv #2 [ridge with a view]

Posted in Mountains, pictures, R, Statistics, Travel, University life with tags ABC, Gaussian mixture, hyperparameter, improper priors, Lenzerheide, MCMskv, Metropolis-Hastings algorithm, mixtures of distributions, non-informative priors, poster, R, reference priors, Switzerland, Ultimixt on January 7, 2016 by xi'an

**T**uesday at MCMskv was a rather tense day for me, from having to plan the whole day “away from home” [8km away] to the mundane worry of renting ski equipment and getting to the ski runs over the noon break, to presenting a poster on our new mixture paper with Kaniav Kamary and Kate Lee, as Kaniav could not get a visa in time. It actually worked out quite nicely, with almost Swiss efficiency. After Michael Jordan’s talk, I attended a Bayesian molecular biology session with an impressive talk by Jukka Corander on evolutionary genomics with novel ABC aspects. And then a Hamiltonian Monte Carlo session with two deep talks by Sam Livingstone and Elena Akhmatskaya on the convergence of HMC, followed by an amazing entry into Bayesian cosmology by Jens Jasche (with a slight drawback that MCMC simulations took about a calendar year, handling over 10⁷ parameters). I finished the day with more “classical” MCMC convergence results and techniques, with talks about forgetting time, stopping time (an undervalued alternative to convergence controls), and CLTs, including a multivariate ESS by James Flegal. (This choice of sessions was uniformly frustrating as I was equally interested in “the other” session. The drawback of running parallel sessions, obviously.)

The poster session was busy and animated, but I alas could not get an idea of the other posters as I was presenting mine. This was quite exciting as I discussed a new parametrisation for location-scale mixture models that allows for a rather straightforward “non-informative” or reference prior. (The paper with Kaniav Kamary and Kate Lee should be arXived overnight!) The recently deposited CRAN package Ultimixt by Kaniav and Kate contains Metropolis-Hastings functions related to this new approach. The result is quite exciting, especially because I have been looking for it for decades, and I will discuss it pretty soon in another post. I also had great exchanges with the conference participants, which led me to consider the reparametrisation on a larger scale and to simplify the presentation of the approach, turning the global mean and variance into hyperparameters.

The day was also most auspicious for a ski break as it was very mild and sunny, while the snow conditions were (somewhat) better than the ones we had in the French Alps two weeks ago. (Too bad that the Tweedie ski race had to be cancelled for lack of snow on the reserved run! The Blossom ski reward will again have to be randomly allocated!) Just not exciting enough to consider another afternoon out, given the tension in getting there and back. (And especially when considering that it took me the entire break time to arXive our mixture paper…)