Archive for mixture of distributions

population quasi-Monte Carlo

Posted in Books, Statistics on January 28, 2021 by xi'an

“Population Monte Carlo (PMC) is an important class of Monte Carlo methods, which utilizes a population of proposals to generate weighted samples that approximate the target distribution”

A return of the prodigal son!, with this arXival by Huang, Joseph, and Mak, of a paper on population Monte Carlo using quasi-random sequences. The construct is based on an earlier notion of Joseph and Mak, support points, which are defined with respect to a given target distribution F as minimising the variability of a sample from F away from these points. (I would have used instead my late friend Bernhard Flury’s principal points!) The proposal uses Owen-style scrambled Sobol points, followed by a deterministic mixture weighting à la PMC, followed by importance support resampling to find the next location parameters of the proposal mixture (which is why I included an unrelated mixture surface as my post picture!). This importance support resampling is obviously less variable than the more traditional ways of resampling, but the cost moves from O(M) to O(M²).
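As a reminder of what the deterministic mixture weighting step looks like, here is a minimal one-dimensional sketch. It is not the authors' algorithm: plain pseudo-random draws stand in for the scrambled Sobol points, the resampling step is left out, and the function names, the Normal proposals, and the N(1,1) target are all assumptions made for the illustration.

```python
import math
import random

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def dm_weights(samples, mus, sigma, target_pdf):
    """Self-normalised deterministic mixture weights: each draw is weighted
    against the equal-weight mixture of all proposals, not only against
    the proposal that generated it."""
    raw = []
    for x in samples:
        mix = sum(normal_pdf(x, mu, sigma) for mu in mus) / len(mus)
        raw.append(target_pdf(x) / mix)
    total = sum(raw)
    return [w / total for w in raw]

rng = random.Random(1)
mus = [-2.0, 0.0, 2.0]          # hypothetical proposal locations
sigma = 1.5                     # hypothetical common proposal scale
# same number of draws from each proposal (pseudo-random, not Sobol)
samples = [rng.gauss(mu, sigma) for mu in mus for _ in range(2000)]
target = lambda x: normal_pdf(x, 1.0, 1.0)   # toy target, N(1,1)
weights = dm_weights(samples, mus, sigma, target)
est_mean = sum(w * x for w, x in zip(weights, samples))
print(est_mean)  # self-normalised estimate of the target mean
```

Weighting against the whole mixture costs one density evaluation per (sample, proposal) pair, which is the usual price paid for the lower variance of the weights.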

“The main computational complexity of the algorithm is O(M²) from computing the pairwise distance of the M weighted samples”

The covariance parameters are updated as in our 2008 paper. This new proposal is interesting and reasonable, with apparent significant gains, albeit I would have liked to see a clearer discussion of the actual computing costs of PQMC.

the strange occurrence of the one bump

Posted in Books, Kids, R, Statistics on June 8, 2020 by xi'an

When answering an X validated question on running an accept-reject algorithm for the Gamma distribution by using a mixture of Beta and drifted (by 1) Exponential distributions, I came across the above glitch in the fit of my 10⁷ simulated sample to the target, apparently displaying a wrong proportion of simulations above (or below) one.


It took me a while to spot the issue, namely that the output of


was favouring simulations from the drifted exponential by truncating. Permuting the elements of z before returning solved the issue (as shown below for a=½)!
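The mechanism can be reproduced with a short Python sketch. This is a reconstruction under assumptions, not the code from the original answer: the function name `gamma_ar`, the oversampling factor of 3, and the choice of placing the exponential draws first in `z` are all hypothetical. The proposal is the mixture of a Beta(a,1) on (0,1) and of 1+Exp(1) on (1,∞), with Beta weight e/(e+a), so that the acceptance probabilities are exp(-x) and x^(a-1), respectively.

```python
import math
import random

def gamma_ar(n, a=0.5, shuffle=True, seed=0):
    """Accept-reject for Gamma(a,1), a<1, with a Beta(a,1) / shifted Exp(1)
    mixture proposal. Accepted values are stored by component and truncated
    to n, optionally after a random permutation."""
    rng = random.Random(seed)
    m = 3 * n                        # oversampling factor (assumption)
    w_beta = math.e / (math.e + a)   # mixture weight of the Beta component
    from_exp, from_beta = [], []
    for _ in range(m):
        if rng.random() < w_beta:
            x = rng.random() ** (1.0 / a)     # Beta(a,1) draw by inversion
            if rng.random() < math.exp(-x):   # accept with probability exp(-x)
                from_beta.append(x)
        else:
            x = 1.0 + rng.expovariate(1.0)    # Exp(1) drifted by 1
            if rng.random() < x ** (a - 1.0): # accept with probability x^(a-1)
                from_exp.append(x)
    z = from_exp + from_beta   # exponential draws first: the source of the bias
    if shuffle:
        rng.shuffle(z)         # permuting before truncating removes the bias
    return z[:n]

def frac_above_one(z):
    return sum(x > 1.0 for x in z) / len(z)

n = 20000
biased = gamma_ar(n, shuffle=False)
fixed = gamma_ar(n, shuffle=True)
p_true = math.erfc(1.0)   # P(X > 1) when X ~ Gamma(1/2, 1)
print(frac_above_one(biased), frac_above_one(fixed), p_true)
```

Without the permutation, truncating keeps every accepted exponential draw and discards only Beta draws, hence the inflated proportion of simulations above one.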


Posted in Statistics on January 24, 2020 by xi'an

On 26 and 27 March 2020, the maths department of the University of Rouen, Normandy, France, organizes a (free) workshop on mixture distributions, with the following speakers:

    • Christophe Biernacki (Laboratoire Paul Painlevé, Univ. Lille 1 et INRIA)
    • Vincent Brault (Laboratoire Jean Kuntzmann, Univ. Grenoble Alpes)
    • Gilles Celeux (Laboratoire de Mathématiques d’Orsay, Univ. Paris Sud et INRIA)
    • Elisabeth Gassiat (Laboratoire de Mathématiques d’Orsay, Univ. Paris Sud)
    • Van Hà Hoang (Laboratoire de Mathématique Raphaël Salem, Univ. Rouen Normandie)
    • Hajo Holzmann (Philipps-University Marburg, Germany)
    • Dimitri Karlis (Department of Statistics, Athens University of Economics and Business, Greece)
    • Trung Tin Nguyen (LMNO, Univ. Caen Normandie)
    • Andrea Rau (Département de Génétique Animale, INRA, Jouy en Josas)
    • Pierre Vandekerkhove (Laboratoire d’Analyse et de Mathématiques Appliquées, Univ. Paris-Est Marne-la-Vallée)
    • Cinzia Viroli (Department of Statistical Sciences, Università di Bologna, Italia)

Unfortunately, since this is my former department, I will not be able to attend as I am taking part in the SIAM Conference on Uncertainty Quantification (UQ20) on the very same days, in a session on likelihood-free inference.

the three i’s of poverty

Posted in Books, pictures, Statistics, Travel, University life on September 15, 2019 by xi'an

Today I made a “quick” (10h door to door!) round trip visit to Marseille (by train) to take part in the PhD thesis defense (committee) of Edwin Fourrier-Nicolaï, whose title was Poverty, inequality and redistribution: an econometric approach. While this was mainly a thesis in economics, meaning defending some theory on inequalities based on East German data, there were Bayesian components in the thesis that justified (to some extent!) my presence in the jury. Especially around mixture estimation by Gibbs sampling. (On which I started working almost exactly 30 years ago, when I joined Paris 6 and met Gilles Celeux and Jean Diebolt.) One intriguing [for me] question stemmed from this defense, namely the notion of a Bayesian estimation of the three i’s of poverty (TIP) curve. The three i’s stand for incidence, intensity, and inequality, as, introduced in Jenkins and Lambert (1997), this curve measures the average income loss from the poverty level for the 100p% lower incomes, when p varies between 0 and 1. It thus depends on the distribution F of the incomes and, when using a mixture distribution, its computation requires a numerical cdf inversion to determine the p-th income quantile. A related question is thus how to define a Bayesian estimate of the TIP curve. Using an average over the values of an MCMC sample does not sound absolutely satisfactory since the upper bound in the integral varies for each realisation of the parameter. The use of another estimate would however require a specific loss function, an issue not discussed in the thesis.
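For the numerical cdf inversion step, a bisection on the mixture cdf is enough. The sketch below is only an illustration under assumptions: it uses a Normal mixture (the thesis may well rely on another family of components), and the function names, the search interval of ten standard deviations around the component means, and the tolerance are all hypothetical choices.

```python
import math

def mixture_cdf(y, weights, mus, sigmas):
    """Cdf of a Normal mixture at y."""
    return sum(w * 0.5 * (1.0 + math.erf((y - m) / (s * math.sqrt(2.0))))
               for w, m, s in zip(weights, mus, sigmas))

def mixture_quantile(p, weights, mus, sigmas, tol=1e-10):
    """p-th quantile of the mixture by bisection on its (increasing) cdf."""
    lo = min(m - 10.0 * s for m, s in zip(mus, sigmas))
    hi = max(m + 10.0 * s for m, s in zip(mus, sigmas))
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mixture_cdf(mid, weights, mus, sigmas) < p:
            lo = mid      # the quantile lies to the right of mid
        else:
            hi = mid      # the quantile lies to the left of (or at) mid
    return 0.5 * (lo + hi)

weights, mus, sigmas = [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0]
q30 = mixture_quantile(0.3, weights, mus, sigmas)
print(q30, mixture_cdf(q30, weights, mus, sigmas))
```

Within an MCMC run, this inversion has to be repeated for every simulated value of the mixture parameters, since the quantile, and hence the upper bound of the integral, changes with them.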

a jump back in time

Posted in Books, Kids, Statistics, Travel, University life on October 1, 2018 by xi'an

As the Department of Statistics in Warwick is slowly emptying its shelves and offices for the big migration to the new building that is almost completed, books and documents are abandoned in the corridors and the work spaces. On this occasion, I thus happened to spot a vintage edition of the Valencia 3 proceedings. I had missed this meeting and hence the volume for, during the last year of my PhD, I was drafted in the French Navy and as a result prohibited from travelling abroad. (Although on reflection I could have safely done it with no one in the military the wiser!) Reading through the papers thirty years later is a weird experience, as I do not remember most of the papers, the exception being the mixture modelling paper by José Bernardo and Javier Girón which I studied a few years later when writing the mixture estimation and simulation paper with Jean Diebolt. And then again in our much more recent non-informative paper with Clara Grazian. And Prem Goel’s survey of Bayesian software. That is, 1987 state of the art software. Covering an amazing list of eighteen packages. Including versions by Zellner, Tierney, Schervish, Smith [but no MCMC], Jaynes, Goldstein, Geweke, van Dijk, Bauwens, which apparently did not survive the ages till now. Most were in Fortran but S was also mentioned. And another version of Tierney, Kass and Kadane on Laplace approximations. And the reference paper of Dennis Lindley [who was already retired from UCL at that time!] on the Hardy-Weinberg equilibrium. And another paper by Don Rubin on using SIR (Rubin, 1983) for simulating from posterior distributions with missing data. Ten years before the particle filter paper, and apparently missing the possibility of weights with infinite variance.

There already were some illustrations of Bayesian analysis in action, including one by Jay Kadane reproduced in his book. And several papers by Jim Berger, Tony O’Hagan, Luis Pericchi and others on imprecise Bayesian modelling, which was in tune with the era, the imprecise probability book by Peter Walley about to appear. And a paper by Shaw on numerical integration that mentioned quasi-random methods. Applied to a 12 component Normal mixture. Overall, much less theoretical content than I would have expected. And nothing about shrinkage estimators, although a fraction of the speakers had worked on this topic most recently.

At a less fundamental level, this was a time when LaTeX was becoming a standard, as shown by a few papers in the volume (and as I was to find when visiting Purdue the year after), even though most were still typed on a typewriter, including a manuscript addition by Dennis Lindley. And Warwick appeared as a Bayesian hotspot!, with at least five papers written by people there permanently or on a long-term visit. (In case a local is interested in it, I have kept the volume, to be found in my new office!)