Archive for mixtures of distributions

LaTeX issues from Vienna

Posted in Books, Statistics, University life with tags , , , , , , , , , , , on September 21, 2017 by xi'an

When working on the final stage of our edited handbook on mixtures, in Vienna, I came across unexpected practical difficulties! One was that by working on Dropbox with Windows users, files and directories names suddenly switched from upper case to lower cases letters !, making hard-wired paths to figures and subsections void in the numerous LaTeX files used for the book. And forcing us to change to lower cases everywhere. Having not worked under Windows since George Casella gave me my first laptop in the mid 90’s!, I am amazed that this inability to handle both upper and lower names is still an issue. And that Dropbox replicates it. (And that some people see that as a plus.)

The other LaTeX issue that took a while to solve was that we opted for one chapter one bibliography, rather than having a single bibliography at the end of the book, mainly because CRC Press asked for this feature in order to sell chapters individually… This was my first encounter with this issue and I found the solutions to produce individual bibliographies incredibly heavy handed, whether through chapterbib or bibunits, since one has to bibtex one .aux file for each chapter. Even with a one line bash command, this is annoying to the extreme!

zurück nach Wien

Posted in pictures, Running, Statistics, Travel, University life, Wines with tags , , , , , , , , on September 16, 2017 by xi'an

Today, I am travelling to Vienna for a few days, primarily for assessing a grant renewal for a research consortium federating most Austrian research groups on a topic for which Austria is a world-leader. (Sorry for being cryptic but I am unsure how much I can disclose about this assessment!) And taking advantage on being in Vienna, for a two-day editing session with Sylvia Früwirth-Schnatter and Gilles Celeux on our Handbook of mixtures analysis project. Which started a few years ago with another meeting in Vienna. And taking further advantage on being in Vienna, for an evening at the Volksoper, conveniently playing Die Zauberflöte!

Jeffreys priors for mixtures [or not]

Posted in Books, Statistics, University life with tags , , , , , on July 25, 2017 by xi'an

Clara Grazian and I have just arXived [and submitted] a paper on the properties of Jeffreys priors for mixtures of distributions. (An earlier version had not been deemed of sufficient interest by Bayesian Analysis.) In this paper, we consider the formal Jeffreys prior for a mixture of Gaussian distributions and examine whether or not it leads to a proper posterior with a sufficient number of observations.  In general, it does not and hence cannot be used as a reference prior. While this is a negative result (and this is why Bayesian Analysis did not deem it of sufficient importance), I find it definitely relevant because it shows that the default reference prior [in the sense that the Jeffreys prior is the primary choice in nonparametric settings] does not operate in this wide class of distributions. What is surprising is that the use of a Jeffreys-like prior on a global location-scale parameter (as in our 1996 paper with Kerrie Mengersen or our recent work with Kaniav Kamary and Kate Lee) remains legit if proper priors are used on all the other parameters. (This may be yet another illustration of the tequilla-like toxicity of mixtures!)

Francisco Rubio and Mark Steel already exhibited this difficulty of the Jeffreys prior for mixtures of densities with disjoint supports [which reveals the mixture latent variable and hence turns the problem into something different]. Which relates to another point of interest in the paper, derived from a 1988 [Valencià Conference!] paper by José Bernardo and Javier Giròn, where they show the posterior associated with a Jeffreys prior on a mixture is proper when (a) only estimating the weights p and (b) using densities with disjoint supports. José and Javier use in this paper an astounding argument that I had not seen before and which took me a while to ingest and accept. Namely, the Jeffreys prior on a observed model with latent variables is bounded from above by the Jeffreys prior on the corresponding completed model. Hence if the later leads to a proper posterior for the observed data, so does the former. Very smooth, indeed!!!

Actually, we still support the use of the Jeffreys prior but only for the mixture mixtures, because it has the property supported by Judith and Kerrie of a conservative prior about the number of components. Obviously, we cannot advocate its use over all the parameters of the mixture since it then leads to an improper posterior.

exciting week[s]

Posted in Mountains, pictures, Running, Statistics with tags , , , , , , , , , , , , , , on June 27, 2017 by xi'an

The past week was quite exciting, despite the heat wave that hit Paris and kept me from sleeping and running! First, I made a two-day visit to Jean-Michel Marin in Montpellier, where we discussed the potential Peer Community In Computational Statistics (PCI Comput Stats) with the people behind PCI Evol Biol at INRA, Hopefully taking shape in the coming months! And went one evening through a few vineyards in Saint Christol with Jean-Michel and Arnaud. Including a long chat with the owner of Domaine Coste Moynier. [Whose domain includes the above parcel with views of Pic Saint-Loup.] And last but not least! some work planning about approximate MCMC.

On top of this, we submitted our paper on ABC with Wasserstein distances [to be arXived in an extended version in the coming weeks], our revised paper on ABC consistency thanks to highly constructive and comments from the editorial board, which induced a much improved version in my opinion, and we received a very positive return from JCGS for our paper on weak priors for mixtures! Next week should be exciting as well, with BNP 11 taking place in downtown Paris, at École Normale!!!

parameter space for mixture models

Posted in Statistics, University life with tags , , , on March 24, 2017 by xi'an

“The paper defines a new solution to the problem of defining a suitable parameter space for mixture models.”

When I received the table of contents of the incoming Statistics & Computing and saw a paper by V. Maroufy and P. Marriott about the above, I was quite excited about a new approach to mixture parameterisation. Especially after our recent reposting of the weakly informative reparameterisation paper. Alas, after reading the paper, I fail to see the (statistical) point of the whole exercise.

Starting from the basic fact that mixtures face many identifiability issues, not only invariance by component permutation, but the possibility to add spurious components as well, the authors move to an entirely different galaxy by defining mixtures of so-called local mixtures. Developed by one of the authors. The notion is just incomprehensible for me: the object is a weighted sum of the basic component of the original mixture, e.g., a Normal density, and of k of its derivatives wrt its mean, a sort of parameterised Taylor expansion. Which implies the parameter is unidimensional, incidentally. The weights of this strange mixture are furthermore constrained by the positivity of the resulting mixture, a constraint that seems impossible to satisfy in the Normal case when the number of derivatives is odd. And hard to analyse in any case since possibly negative components do not enjoy an interpretation as a probability density. In exponential families, the local mixture is the original exponential family density multiplied by a polynomial. The current paper moves one step further [from the reasonable] by considering mixtures [in the standard sense] of such objects. Which components are parameterised by their mean parameter and a collection of weights. The authors then restrict the mean parameters to belong to a finite and fixed set, which elements are coerced by a maximum error rate on any compound distribution derived from this exponential family structure. The remainder of the paper discusses of the choice of the mean parameters and of an EM algorithm to estimate the parameters, with a confusing lower bound on the mixture weights that impacts the estimation of the weights. And no mention made of the positivity constraint. I remain completely bemused by the paper and its purpose: I do not even fathom how this qualifies as a mixture.

a response by Ly, Verhagen, and Wagenmakers

Posted in Statistics with tags , , , , , , , , on March 9, 2017 by xi'an

Following my demise [of the Bayes factor], Alexander Ly, Josine Verhagen, and Eric-Jan Wagenmakers wrote a very detailed response. Which I just saw the other day while in Banff. (If not in Schiphol, which would have been more appropriate!)

“In this rejoinder we argue that Robert’s (2016) alternative view on testing has more in common with Jeffreys’s Bayes factor than he suggests, as they share the same ‘‘shortcomings’’.”

Rather unsurprisingly (!), the authors agree with my position on the dangers to ignore decisional aspects when using the Bayes factor. A point of dissension is the resolution of the Jeffreys[-Lindley-Bartlett] paradox. One consequence derived by Alexander and co-authors is that priors should change between testing and estimating. Because the parameters have a different meaning under the null and under the alternative, a point I agree with in that these parameters are indexed by the model [index!]. But with which I disagree when arguing that the same parameter (e.g., a mean under model M¹) should have two priors when moving from testing to estimation. To state that the priors within the marginal likelihoods “are not designed to yield posteriors that are good for estimation” (p.45) amounts to wishful thinking. I also do not find a strong justification within the paper or the response about choosing an improper prior on the nuisance parameter, e.g. σ, with the same constant. Another a posteriori validation in my opinion. However, I agree with the conclusion that the Jeffreys paradox prohibits the use of an improper prior on the parameter being tested (or of the test itself). A second point made by the authors is that Jeffreys’ Bayes factor is information consistent, which is correct but does not solved my quandary with the lack of precise calibration of the object, namely that alternatives abound in a non-informative situation.

“…the work by Kamary et al. (2014) impressively introduces an alternative view on testing, an algorithmic resolution, and a theoretical justification.”

The second part of the comments is highly supportive of our mixture approach and I obviously appreciate very much this support! Especially if we ever manage to turn the paper into a discussion paper! The authors also draw a connection with Harold Jeffreys’ distinction between testing and estimation, based upon Laplace’s succession rule. Unbearably slow succession law. Which is well-taken if somewhat specious since this is a testing framework where a single observation can send the Bayes factor to zero or +∞. (I further enjoyed the connection of the Poisson-versus-Negative Binomial test with Jeffreys’ call for common parameters. And the supportive comments on our recent mixture reparameterisation paper with Kaniav Kamari and Kate Lee.) The other point that the Bayes factor is more sensitive to the choice of the prior (beware the tails!) can be viewed as a plus for mixture estimation, as acknowledged there. (The final paragraph about the faster convergence of the weight α is not strongly

weakly informative reparameterisations for location-scale mixtures

Posted in Books, pictures, R, Statistics, University life with tags , , , , , , on January 19, 2017 by xi'an

fitted_density_galaxy_data_500itersWe have been working towards a revision of our reparameterisation paper for quite a while now and too advantage of Kate Lee visiting Paris this fortnight to make a final round: we have now arXived (and submitted) the new version. The major change against the earlier version is the extension of the approach to a large class of models that include infinitely divisible distributions, compound Gaussian, Poisson, and exponential distributions, and completely monotonic densities. The concept remains identical: change the parameterisation of a mixture from a component-wise decomposition to a construct made of the first moment(s) of the distribution and of component-wise objects constrained by the moment equation(s). There is of course a bijection between both parameterisations, but the constraints appearing in the latter produce compact parameter spaces for which (different) uniform priors can be proposed. While the resulting posteriors are no longer conjugate, even conditional on the latent variables, standard Metropolis algorithms can be implemented to produce Monte Carlo approximations of these posteriors.