Archive for map

it’s complicated…

Posted in pictures, Statistics on June 21, 2021 by xi'an


Yesterday saw the first round of the regional and departmental elections in France, with a terribly low turnout (around 30% of the voters, except in Corsica where 56% of them voted)… [The map here is about the departmental elections, with departments being delineated in white and the subdivisions corresponding to the cantons. Corse/Corsica is “missing” because it is now a single entity. Same thing about Paris.] The only nice surprise about this outcome is that abstention particularly impacted the lepenist votes, which almost uniformly went down compared with the previous regional elections. And hence that the chances of a region getting a nazional majority are lowered. Although it is difficult to analyse why: polls were predicting a brown sludge tsunami, the gilets jaunes movement (or morass) is not yet over and mostly aligned with the populist themes of the RN, and (some) people seem unhappy with just about any decision taken by any level of authority during the Covid-19 crisis. It will be interesting to watch the final results of the second round, next week, but I doubt we will see a voting surge happening, esp. since the frontist danger is now downplayed.

posterior distribution missing the MLE

Posted in Books, Kids, pictures, Statistics on April 25, 2019 by xi'an

An X validated question as to why the MLE is not necessarily (well) covered by a posterior distribution. Even for a flat prior… Which in retrospect highlights the fact that the MLE (and the MAP) are invasive species in a Bayesian ecosystem. Since they do not account for the dominating measure. And hence do not fare well under reparameterisation. (As very much of a side comment, I also managed to write an answer almost identical to, and simultaneous with, the first answer to the question.)
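To make the reparameterisation point concrete, here is a minimal sketch on a toy case that is not taken from the X validated question: a binomial observation (the values x=3 and n=10 are assumptions chosen for illustration) with a flat prior on the success probability p. In the p parameterisation the MAP coincides with the MLE x/n, but once the same posterior is written in terms of θ = logit(p) the Jacobian p(1-p) moves the mode to (x+1)/(n+2).

```python
import math

def log_post_p(p, x, n):
    # Flat prior on p: the posterior is proportional to the binomial likelihood.
    return x * math.log(p) + (n - x) * math.log(1.0 - p)

def log_post_theta(theta, x, n):
    # Same posterior expressed in theta = logit(p);
    # the change of variables adds the log-Jacobian log(dp/dtheta) = log(p(1-p)).
    p = 1.0 / (1.0 + math.exp(-theta))
    return log_post_p(p, x, n) + math.log(p) + math.log(1.0 - p)

def argmax_on_grid(f, lo, hi, steps=20001):
    # Crude grid search, good enough for a one-dimensional illustration.
    best_arg, best_val = lo, -math.inf
    for i in range(steps):
        arg = lo + (hi - lo) * i / (steps - 1)
        val = f(arg)
        if val > best_val:
            best_arg, best_val = arg, val
    return best_arg

x, n = 3, 10  # hypothetical data, chosen only for illustration

# MAP in the p parameterisation (equal to the MLE x/n under the flat prior)
map_p = argmax_on_grid(lambda p: log_post_p(p, x, n), 0.001, 0.999)

# MAP in the theta parameterisation, mapped back to the p scale
map_theta = argmax_on_grid(lambda t: log_post_theta(t, x, n), -6.0, 6.0)
map_theta_as_p = 1.0 / (1.0 + math.exp(-map_theta))

print(f"MAP in p:            {map_p:.3f}")
print(f"MAP in logit(p) -> p: {map_theta_as_p:.3f}")
```

The two modes disagree (about 0.30 versus about 0.33 on the p scale) even though both describe the same posterior distribution, which is the sense in which a mode depends on the dominating measure.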

Bayesian maps of Africa

Posted in pictures, Statistics on March 21, 2018 by xi'an

A rather special issue of Nature this week (1 March 2018) as it addresses Bayesian geo-cartography and mapping childhood growth failure and educational achievement (along with differences between the sexes) all across Africa! Including the (nice) cover of the journal, a preface by Kofi Annan, a cover article by Brian Reich and Murali Haran, and the first two major articles of the journal, one of which includes Ewan Cameron as a co-author. As I was reading this issue of Nature on the train back from Brussels, I could not access the supplementary material, so could not look at the specifics of the statistics, but the maps look quite impressive with a 5×5 km² resolution. And inclusion not only of uncertainty maps but also of predictive maps on the probability of achieving WHO 2025 goals. Surprisingly close to one in some parts of Africa. In terms of education, there are strong contrasts between different regions, with the south of the continent, including Madagascar, showing a positive difference for women in terms of years of education. While there is no reason (from my train seat) to doubt the statistical analyses, I take quite seriously the reservation of the authors that the quality of the prediction cannot be better than the quality of the data, which is “determined by the volume and fidelity of nationally representative surveys”. Which relates to an earlier post of mine about a similar concern with the deaths in Congo.

mixtures of mixtures

Posted in pictures, Statistics, University life on March 9, 2015 by xi'an

And yet another arXival of a paper on mixtures! This one is written by Gertraud Malsiner-Walli, Sylvia Frühwirth-Schnatter, and Bettina Grün, from the Johannes Kepler University Linz and the Wirtschaftsuniversität Wien I visited last September. With the exact title being Identifying mixtures of mixtures using Bayesian estimation.

So, what is a mixture of mixtures if not a mixture?! Or if not only a mixture. The upper mixture level is associated with clusters, while the lower mixture level is used for modelling the distribution of a given cluster. Because the cluster needs to be real enough, the components of the mixture are assumed to be heavily overlapping. The paper thus spends a large amount of space on detailing the construction of the associated hierarchical prior. Which in particular implies defining through the prior what a cluster means. The paper also connects with the overfitting mixture idea of Rousseau and Mengersen (2011, Series B). At the cluster level, the Dirichlet hyperparameter is chosen to be very small, 0.001, which empties superfluous clusters but sounds rather arbitrary (which is the reason why we did not go for such small values in our testing/mixture modelling). Conversely, the mixture weights have a hyperparameter staying (far) away from zero. The MCMC implementation is based on a standard Gibbs sampler and the outcome is analysed and sorted by estimating the “true” number of clusters as the MAP and by selecting MCMC simulations conditional on that value. From there clusters are identified via the point process representation of a mixture posterior. Using a standard k-means algorithm.
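As a rough illustration of why such a small Dirichlet hyperparameter empties superfluous clusters, the following sketch draws weight vectors from a symmetric Dirichlet prior and counts how many weights exceed a small threshold. This is a prior simulation only, not the authors' sampler: the number of clusters (10), the threshold (0.01), the number of draws, and the comparison value 10.0 are all assumptions made for the example; only the value 0.001 comes from the paper as described above.

```python
import math
import random

def sample_dirichlet(alpha, k, rng):
    """Draw one symmetric Dirichlet(alpha, ..., alpha) weight vector of length k."""
    # Uses the identity Gamma(alpha) = Gamma(alpha + 1) * U**(1/alpha),
    # computed on the log scale so that tiny alpha does not underflow to zero.
    log_g = [
        math.log(rng.gammavariate(alpha + 1.0, 1.0))
        + math.log(1.0 - rng.random()) / alpha  # 1 - U lies in (0, 1], so log is safe
        for _ in range(k)
    ]
    top = max(log_g)
    unnorm = [math.exp(v - top) for v in log_g]  # largest term is exactly 1
    total = sum(unnorm)
    return [u / total for u in unnorm]

def mean_active(alpha, k=10, draws=500, threshold=0.01, seed=1):
    """Average number of weights above `threshold` over prior draws (all defaults hypothetical)."""
    rng = random.Random(seed)
    count = 0
    for _ in range(draws):
        weights = sample_dirichlet(alpha, k, rng)
        count += sum(1 for w in weights if w > threshold)
    return count / draws

sparse = mean_active(0.001)  # hyperparameter value quoted from the paper
dense = mean_active(10.0)    # arbitrary comparison value, far from zero

print(f"alpha = 0.001: about {sparse:.2f} non-negligible weights out of 10")
print(f"alpha = 10   : about {dense:.2f} non-negligible weights out of 10")
```

Under the tiny hyperparameter, nearly all the prior mass sits on weight vectors with a single non-negligible entry, while a hyperparameter far from zero keeps every weight active, which matches the contrast between the two levels of the hierarchy.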

The remainder of the paper illustrates the approach on simulated and real datasets. Recovering in those low-dimensional setups the number of clusters used in the simulation or found in other studies. As noted in the conclusion, using solely a Gibbs sampler with such a large number of components is rather perilous since it may get stuck close to suboptimal configurations. Especially with very small Dirichlet hyperparameters.

Statistics slides (5)

Posted in Books, Kids, Statistics, University life on December 7, 2014 by xi'an

Here is the fifth and last set of slides for my third-year statistics course, trying to introduce Bayesian statistics in the most natural way and hence starting with… Rasmus’ socks and ABC!!! This is an interesting experiment as I have no idea how my students will react. Either they will see the point beyond the anecdotal story or they’ll miss it (being quite unhappy so far about the lack of mathematical rigour in my course and exercises…). We only have two weeks left so I am afraid the concept will not have time to seep through!