dynamic mixtures and frequentist ABC

Posted in Statistics on November 30, 2022 by xi'an

This early morning in NYC, I spotted this new arXival by Marco Bee (whom I know from the time he was writing his PhD with my late friend Bernhard Flury) and found he has been working for a while on ABC-related problems. The mixture model he considers therein is a form of mixture of experts, where the weights of the mixture components are not constant but functions of the entry, with values in (0,1), as well. This model was introduced by Frigessi, Haug and Rue in 2002 and is often used as a benchmark for ABC methods, since it is missing its normalising constant as in e.g.

$f(x) \propto p(x) f_1(x) + (1-p(x)) f_2(x)$

even with all entries being standard pdfs and cdfs. Rather than using a (costly) numerical approximation of the “constant” (as a function of all unknown parameters involved), Marco follows the approximate maximum likelihood approach of my Warwick colleagues, Javier Rubio [now at UCL] and Adam Johansen. It is based on the [SAME] remark that, under a uniform prior and using an approximation to the actual likelihood, the MAP estimator is also the MLE for that approximation. The approximation is ABC-esque in that a pseudo-sample is generated from the true model (attached to a simulation of the parameter) and the pair is accepted if the pseudo-sample stands close enough to the observed sample. The paper proposes to use the Cramér-von Mises distance, which only involves ranks. Given this “posterior” sample, an approximation of the posterior density is constructed and then numerically optimised. From a frequentist viewpoint, a direct estimate of the mode would be preferable. From my Bayesian perspective, this sounds like a step backwards, given that once a posterior sample is available, reconnecting with an approximate MLE does not sound highly compelling.
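The steps described above (uniform prior draw, pseudo-sample, Cramér-von Mises comparison, density estimate, optimisation) can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation of the paper: the model is a toy N(θ,1) location family standing in for the dynamic mixture, the prior range (-5,5), the number of simulations, the rule of keeping the closest 5% of the draws (instead of a fixed tolerance), and the kernel bandwidth are all arbitrary choices, and every function name is hypothetical.

```python
import bisect
import math
import random


def cvm_distance(x, y):
    """Two-sample Cramer-von Mises statistic between samples x and y."""
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    total = 0.0
    for z in xs + ys:
        # empirical cdfs of both samples evaluated at the pooled point z
        fx = bisect.bisect_right(xs, z) / n
        fy = bisect.bisect_right(ys, z) / m
        total += (fx - fy) ** 2
    return n * m / (n + m) ** 2 * total


def simulate(theta, n, rng):
    """Toy stand-in model (assumption): n draws from N(theta, 1)."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def abc_sample(observed, n_sims, keep, rng, low=-5.0, high=5.0):
    """Keep the `keep` prior draws whose pseudo-samples are closest to the data."""
    draws = []
    for _ in range(n_sims):
        theta = rng.uniform(low, high)  # draw from the uniform prior
        pseudo = simulate(theta, len(observed), rng)
        draws.append((cvm_distance(observed, pseudo), theta))
    draws.sort()
    return [theta for _, theta in draws[:keep]]


def kde_mode(sample, grid_size=400):
    """Maximise a Gaussian kernel density estimate of the sample over a grid."""
    n = len(sample)
    mean = sum(sample) / n
    sd = math.sqrt(sum((t - mean) ** 2 for t in sample) / (n - 1))
    h = 1.06 * sd * n ** (-0.2)  # normal-reference rule-of-thumb bandwidth
    lo, hi = min(sample), max(sample)
    grid = [lo + (hi - lo) * i / (grid_size - 1) for i in range(grid_size)]

    def density(t):
        # unnormalised estimate: the constant does not change the argmax
        return sum(math.exp(-0.5 * ((t - s) / h) ** 2) for s in sample)

    return max(grid, key=density)


rng = random.Random(1)
observed = simulate(2.0, 100, rng)  # "observed" data with true theta = 2
accepted = abc_sample(observed, n_sims=2000, keep=100, rng=rng)
amle = kde_mode(accepted)  # approximate MLE = mode of the ABC "posterior"
```

Since the empirical cdfs only change at the pooled sample points, the distance depends on the two samples through their relative ranks alone, which is the property mentioned above. With a uniform prior, the mode of the kernel estimate is the approximate MLE in the sense of the remark on the MAP.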

it’s complicated…

Posted in pictures, Statistics on June 21, 2021 by xi'an

Yesterday saw a first round of the regional and departmental elections in France, with a terribly low participation (around 30% of the voters, except in Corsica where 56% of the voters voted)… [The map here is about the departmental elections, with departments being delineated in white and the subdivisions corresponding to the cantons. Corse/Corsica is “missing” because it is now a single entity. Same thing about Paris.] The only nice surprise about this outcome is that abstention particularly impacted the lepenist votes, which almost uniformly went down compared with the previous regional elections. And hence that the ill chances that a region gets a nazional majority are lowered. Although it is difficult to analyse why: polls were predicting a brown sludge tsunami, the gilets jaunes movement (or morass) is not yet over and mostly aligned with the populist themes of the RN, and (some) people seem unhappy with just about any decision taken by any level of authority during the Covid-19 crisis. It will be interesting to watch the second round final results, next week, but I doubt we will see a voting surge happening, esp. since the frontist danger is now downplayed.

posterior distribution missing the MLE

Posted in Books, Kids, pictures, Statistics on April 25, 2019 by xi'an

An X validated question as to why the MLE is not necessarily (well) covered by a posterior distribution. Even for a flat prior… Which in retrospect highlights the fact that the MLE (and the MAP) are invasive species in a Bayesian ecosystem. Since they do not account for the dominating measure. And hence do not fare well under reparameterisation. (As a very much to the side comment, I also managed to write an almost identical and simultaneous answer to the first answer to the question.)
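A toy illustration of the reparameterisation issue (chosen here as an assumption for the sake of the example, not taken from the X validated question): take $x$ successes out of $n$ Bernoulli trials, with a flat prior on the success probability $p$. The posterior is then

$\pi(p|x) \propto p^x (1-p)^{n-x}$

and its mode coincides with the MLE, $x/n$. Moving to the logit $\eta = \log\{p/(1-p)\}$ brings in the Jacobian $\text{d}p/\text{d}\eta = p(1-p)$, so that the prior is no longer flat in $\eta$ and

$\pi(\eta|x) \propto p^{x+1} (1-p)^{n-x+1}$

which is maximised when $p = (x+1)/(n+2)$ rather than $x/n$. The likelihood itself carries no Jacobian, hence the MLE of $\eta$ stays at $\log\{x/(n-x)\}$ (for $0<x<n$) and is no longer where the posterior density peaks, even though the prior started as flat.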

Bayesian maps of Africa

Posted in pictures, Statistics on March 21, 2018 by xi'an

A rather special issue of Nature this week (1 March 2018) as it addresses Bayesian geo-cartography and mapping childhood growth failure and educational achievement (along with sexual differences) all across Africa! Including the (nice) cover of the journal, a preface by Kofi Annan, a cover article by Brian Reich and Murali Haran, and the first two major articles of the journal, one of which includes Ewan Cameron as a co-author. As I was reading this issue of Nature in the train back from Brussels, I could not access the supplementary material, so could not look at the specifics of the statistics, but the maps look quite impressive with a 5×5 km² resolution. And inclusion not only of uncertainty maps but also of predictive maps on the probability of achieving WHO 2025 goals. Surprisingly close to one in some parts of Africa. In terms of education, there are strong oppositions between different regions, with the south of the continent, including Madagascar, showing a positive difference for women in terms of years of education. While there is no reason (from my train seat) to doubt the statistical analyses, I take quite seriously the reservation of the authors that the quality of the prediction cannot be better than the quality of the data, which is “determined by the volume and fidelity of nationally representative surveys”. Which relates to an earlier post of mine about a similar concern with the deaths in Congo.

mixtures of mixtures

Posted in pictures, Statistics, University life on March 9, 2015 by xi'an

And yet another arXival of a paper on mixtures! This one is written by Gertraud Malsiner-Walli, Sylvia Frühwirth-Schnatter, and Bettina Grün, from the Johannes Kepler University Linz and the Wirtschaftsuniversität Wien I visited last September. With the exact title being Identifying mixtures of mixtures using Bayesian estimation.

So, what is a mixture of mixtures if not a mixture?! Or if not only a mixture. The upper mixture level is associated with clusters, while the lower mixture level is used for modelling the distribution of a given cluster. Because the cluster needs to be real enough, the components of the mixture are assumed to be heavily overlapping. The paper thus spends a large amount of space on detailing the construction of the associated hierarchical prior. Which in particular implies defining through the prior what a cluster means. The paper also connects with the overfitting mixture idea of Rousseau and Mengersen (2011, Series B). At the cluster level, the Dirichlet hyperparameter is chosen to be very small, 0.001, which empties superfluous clusters but sounds rather arbitrary (which is the reason why we did not go for such small values in our testing/mixture modelling). Conversely, the mixture weights have a hyperparameter staying (far) away from zero. The MCMC implementation is based on a standard Gibbs sampler and the outcome is analysed and sorted by estimating the “true” number of clusters as the MAP and by selecting MCMC simulations conditional on that value. From there clusters are identified via the point process representation of a mixture posterior. Using a standard k-means algorithm.
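To see how a Dirichlet hyperparameter as small as 0.001 pushes towards empty clusters, here is a prior-only sketch (no data, no Gibbs sampler, hence not the analysis of the paper). It allocates observations sequentially through the Pólya urn induced by a symmetric Dirichlet prior on the weights and counts the occupied clusters. The 10 clusters, 100 observations, 500 replications and the comparison value 4.0 are arbitrary assumptions made for the illustration, and the function names are hypothetical.

```python
import random


def occupied_clusters(n_obs, n_clusters, alpha, rng):
    """Allocate n_obs points via the Polya urn of a symmetric Dirichlet(alpha)
    prior on the weights and return the number of non-empty clusters."""
    counts = [0] * n_clusters
    for _ in range(n_obs):
        # predictive allocation weights, proportional to (n_k + alpha)
        weights = [c + alpha for c in counts]
        k = rng.choices(range(n_clusters), weights=weights)[0]
        counts[k] += 1
    return sum(1 for c in counts if c > 0)


def mean_occupied(alpha, reps=500, n_obs=100, n_clusters=10, seed=0):
    """Monte Carlo average of the number of occupied clusters under the prior."""
    rng = random.Random(seed)
    total = sum(occupied_clusters(n_obs, n_clusters, alpha, rng)
                for _ in range(reps))
    return total / reps


sparse = mean_occupied(alpha=0.001)  # barely more than one occupied cluster
dense = mean_occupied(alpha=4.0)     # essentially all ten clusters occupied
```

Under the small value, nearly all prior mass sits on a single occupied cluster, so any additional cluster has to be strongly supported by the data, while a hyperparameter away from zero keeps every component filled, in line with the contrast between the two mixture levels described above.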

The remainder of the paper illustrates the approach on simulated and real datasets. Recovering in those small dimension setups the number of clusters used in the simulation or found in other studies. As noted in the conclusion, using solely a Gibbs sampler with such a large number of components is rather perilous since it may get stuck close to suboptimal configurations. Especially with very small Dirichlet hyperparameters.