**I** recently came across an ABC paper in PLoS ONE by Xavier Rubio-Campillo applying this simulation technique to the validation of some differential equation models linking force sizes and values for both sides. The dataset is made of battle casualties separated into four periods, from *pike and musket* to the *American Civil War*. The outcome is used to compute an ABC Bayes factor, but it seems this computation is highly dependent on the tolerance threshold, with highly variable numerical values. The most favoured model includes some fatigue effect accounting for the decreasing efficiency of armies over time. While the paper somehow reminded me of a most peculiar book, I have no idea of the depth of this analysis, namely of how relevant it is to model a battle through a two-dimensional system of differential equations, given the numerous factors involved in the matter…

Filed under: Books, Kids, pictures, Statistics Tagged: ABC, ABC model choice, Bayes factor, differential equation, elves, PLoS ONE, warhammer


“If no information is available, π(α|M) must not deliver information about α.”

**I**n a recent arXival apparently submitted to Bayesian Analysis, Giovanni Mana and Carlo Palmisano discuss the choice of priors in metrology. Which reminded me of a meeting I attended at the Bureau des Poids et Mesures in Sèvres where similar debates took place, albeit led by ferocious anti-Bayesians! Their reference prior appears to be the Jeffreys prior, because of its reparameterisation invariance.
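For the record, this invariance follows from the chain rule applied to the score function. Here is a short derivation in the scalar case, where φ = h(θ) is a one-to-one differentiable reparameterisation and where the notation (I, Ĩ, π_J) is mine rather than the paper's:

```latex
\pi_J(\theta) \propto \sqrt{I(\theta)},
\qquad
I(\theta) = \mathbb{E}_\theta\!\left[\left(\frac{\partial \log f(X\mid\theta)}{\partial\theta}\right)^{2}\right]

% chain rule on the score
\frac{\partial \log f}{\partial\phi} = \frac{\partial \log f}{\partial\theta}\,\frac{\mathrm{d}\theta}{\mathrm{d}\phi}
\;\Longrightarrow\;
\tilde I(\phi) = I(\theta)\left(\frac{\mathrm{d}\theta}{\mathrm{d}\phi}\right)^{2}

\sqrt{\tilde I(\phi)} = \sqrt{I(\theta)}\,\left|\frac{\mathrm{d}\theta}{\mathrm{d}\phi}\right|
```

The last line is exactly the change-of-variable formula for a density, so deriving the Jeffreys prior in φ or transforming the one obtained in θ gives the same answer (up to the arbitrary constant when the prior is improper).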

“The relevance of the Jeffreys rule in metrology and in expressing uncertainties in measurements resides in the metric invariance.”

This, along with a second order approximation to the Kullback-Leibler divergence, is indeed one reason for advocating the use of a Jeffreys prior. I at first found it surprising that the (usually improper) prior is used in a marginal likelihood, as it cannot be normalised. A source of much debate [and of our alternative proposal].

“To make a meaningful posterior distribution and uncertainty assessment, the prior density must be covariant; that is, the prior distributions of different parameterizations must be obtained by transformations of variables. Furthermore, it is necessary that the prior densities are proper.”

The above quote is quite interesting, both in that the notion of *covariant* is used rather than *invariant* or *equivariant*, and in that properness is indicated as a requirement. (Even more surprising is the noun associated with covariant, since it clashes with the usual notion of covariance!) They conclude that the marginal associated with an improper prior is null because the normalising constant of the prior is infinite.

“…the posterior probability of a selected model must not be null; therefore, improper priors are not allowed.”

Maybe not so surprisingly given this stance on improper priors, the authors cover a collection of “paradoxes” in their final and longest section, most of which make little sense to me. First, they point out that the reference priors of Berger, Bernardo and Sun (2015) are not invariant, but this should not come as a surprise given that they focus on parameters of interest versus nuisance parameters. The second issue pointed out by the authors is that under Jeffreys’ prior, the posterior distribution of a given normal mean for n observations is a *t* with n degrees of freedom while it is a *t* with n-1 degrees of freedom from a frequentist perspective. This is not such a paradox since the two distributions work in different spaces. Further, unless I am confused, this is one of the marginalisation paradoxes, whose more straightforward explanation is that marginalisation is not meaningful for improper priors. A third paradox relates to a contingency table with a large number of cells, in that the posterior mean of a cell probability goes to zero as the number of cells goes to infinity. (In this case, Jeffreys’ prior is proper.) Again not much of a bummer, there is simply not enough information in the data when faced with an infinite number of parameters. Paradox #4 is the Stein paradox, when estimating the squared norm of a normal mean. Jeffreys’ prior then leads to a constant bias that increases with the dimension of the vector. Definitely a bad point for Jeffreys’ prior, except that there is no Bayes estimator in such a case, the Bayes risk being infinite. Using a renormalised loss function solves the issue, rather than introducing as in the paper uniform priors on intervals, which require hyperpriors without being particularly compelling. The fifth paradox is the Neyman-Scott problem, with again the Jeffreys prior the culprit since the estimator of the variance is inconsistent, by a multiplicative factor of 2. Another stone in Jeffreys’ garden [of forking paths!].
The authors consider that the prior gives zero weight to any interval not containing zero, as if it were a proper probability distribution. And “solve” the problem by avoiding zero altogether, which of course requires specifying a lower bound on the variance. And then introducing another (improper) Jeffreys prior on that bound… The last and final paradox mentioned in this paper is one of the marginalisation paradoxes, with a bizarre explanation that since the mean and variance μ and σ are not independent a posteriori, “the information delivered by x̄ should not be neglected”.
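To make the factor of 2 in the Neyman-Scott problem concrete, here is a minimal simulation sketch. It is not taken from the paper: it uses the maximum likelihood estimator of σ² as a stand-in for the Jeffreys-based estimator (as far as I can tell both converge to σ²/2), and the function name, the seed and the N(0,10²) choice for the nuisance means are arbitrary assumptions of mine.

```python
import numpy as np

def neyman_scott_mle(n_pairs, sigma=1.0, seed=0):
    """MLE of sigma^2 from n_pairs pairs x_i1, x_i2 ~ N(mu_i, sigma^2)."""
    rng = np.random.default_rng(seed)
    # One nuisance mean per pair; their distribution is an arbitrary assumption.
    mu = rng.normal(0.0, 10.0, size=n_pairs)
    # Two observations per pair, sharing the pair-specific mean.
    x = rng.normal(mu[:, None], sigma, size=(n_pairs, 2))
    # Residuals around each pair mean, i.e. around the MLE of mu_i.
    resid = x - x.mean(axis=1, keepdims=True)
    # The MLE divides by the total number of observations, 2 * n_pairs.
    return float((resid ** 2).sum() / (2 * n_pairs))
```

With `sigma=1.0` and a large number of pairs, the returned value should settle near 0.5 rather than 1, however many pairs are added, since the number of nuisance means grows as fast as the number of observations.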

Filed under: Books, Statistics, University life Tagged: evidence, Harold Jeffreys, hierarchical Bayesian modelling, improper priors, inadmissibility, invariance, Jeffreys priors, marginalisation paradoxes, Neyman-Scott problem, noninformative priors, over-interpretation of improper priors, reference priors

“You want to make me play a part, eh, but that does not work with me. You are all playing parts: once the day is over, you go home, you take off your make-up, you take off your masks. Or rather you have two masks, one for the day and one for the evening, the true one and the false one, and the false one is the one for the day.”

Filed under: pictures, Travel Tagged: métro static, Paris, ramblings

A question that came to me from reading the introduction to the paper is why a method like Møller et al.’s (2006) auxiliary variable trick should be considered more “exact” than the pseudo-marginal approach of Andrieu and Roberts (2009), since the latter can equally be seen as an auxiliary variable approach. The answer was on the next page (!) as it is indeed a special case of Andrieu and Roberts (2009). Murray et al. (2006) also belongs to this group with a product-type importance sampling estimator, based on a sequence of tempered intermediaries… As noted by the authors, there is a whole spectrum of related methods in this area, some of which qualify as exact-approximate, inexact approximate and noisy versions.

Their main argument is to support importance sampling as the method of choice, including sequential Monte Carlo (SMC) for large dimensional parameters. The auxiliary variable of Møller et al. (2006) is then part of the importance scheme. In the first toy example, a Poisson is opposed to a Geometric distribution, as in our ABC model choice papers, for which a multiple auxiliary variable approach dominates both ABC and Simon Wood’s synthetic likelihood for a given computing cost. I did not spot which artificial choice was made for the Z(θ)’s in both models, since the constants are entirely known in those densities. A very interesting section of the paper is the one envisioning *biased* approximations to the intractable density, if only because the importance weights are most often biased due to the renormalisation (possibly by resampling), and because the variance derivations are then intractable as well. However, due to this intractability, the paper can only approach the impact of those approximations via empirical experiments. This leads however to the question of how to evaluate the validity of the approximation in settings where the truth and even its magnitude are unknown… Cross-validation and bootstrap type evaluations may prove too costly in realistic problems. Using biased solutions thus mostly remains an open problem in my opinion.
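As a reminder of where the bias from renormalisation comes from, here is a generic textbook sketch, not the scheme of the paper. The unnormalised target exp(-x²/2), the N(0, scale²) proposal and all names are assumptions of mine, picked so that the true constant, √(2π), is known.

```python
import numpy as np

def is_normalising_constant(n_draws=100_000, scale=2.0, seed=1):
    """Importance sampling for Z = integral of exp(-x^2/2), plus a self-normalised moment."""
    rng = np.random.default_rng(seed)
    # Draws from the N(0, scale^2) proposal.
    x = rng.normal(0.0, scale, size=n_draws)
    # Log of the unnormalised target and log of the normalised proposal density.
    log_unnorm = -0.5 * x ** 2
    log_prop = -0.5 * (x / scale) ** 2 - np.log(scale * np.sqrt(2.0 * np.pi))
    w = np.exp(log_unnorm - log_prop)
    # The plain average of the weights is unbiased for Z.
    z_hat = float(w.mean())
    # Dividing by the sum of the weights gives a ratio estimator of E[X^2]:
    # consistent, but biased for any finite number of draws.
    second_moment = float(np.sum(w * x ** 2) / np.sum(w))
    return z_hat, second_moment
```

The first output should be close to √(2π) ≈ 2.5066 and the second close to 1. In this toy case both can be checked against the truth, which is precisely what is missing in the intractable settings discussed above.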

The SMC part in the paper is equally interesting if only because it focuses on the data thinning idea studied by Chopin (2002) and many other papers in recent years. This made me wonder why an alternative relying on a sequence of approximations to the target with *tractable* normalising constants could not be considered. A whole sequence of auxiliary variable completions sounds highly demanding in terms of computing budget and also requires a corresponding sequence of calibrations. (Now, ABC fares no better since it requires heavy simulations and repeated calibrations, while further exhibiting a damning missing link with the target density.) Unfortunately, embarking upon a theoretical exploration of the properties of approximate SMC is quite difficult, as shown by the strong assumptions made in the paper to bound the total variation distance to the true target.

Filed under: Books, Kids, pictures, Statistics, Travel, University life Tagged: ABC, auxiliary variable, bias vs. variance, CRiSM, estimating constants, importance sampling, Monte Carlo Statistical Methods, normalising constant, pseudo-marginal MCMC, SMC, unbiased estimation, University of Warwick



I shall discuss the problem of lack of replicability of results in science, and point at selective inference as a statistical root cause. I shall then present a few strategies for addressing selective inference, and their application in genomics, brain research and earlier phases of clinical trials where both primary and secondary endpoints are being used.

**Details:** February 8, 2016, 16h, Université Pierre & Marie Curie, campus Jussieu, salle 15-16-101.

Filed under: pictures, Statistics, University life Tagged: Benjamini, false discovery rate, Jussieu, p-values, replication crisis, seminar, Université Pierre et Marie Curie

Somewhat tangentially, this reminds me of a paper I read recently where the Geometric Geo(p) distribution was represented as the sum of two independent variates, namely a Bernoulli B(p/(1+p)) variate and a Geometric 2G(p²) variate. A formula that can be iterated arbitrarily many times, meaning that a Geometric variate is an infinite sum of [powers of two] weighted Bernoulli variates. I like this representation very much (although it may well have been known for quite a while). However I fail to see how to take advantage of it for simulation purposes. Unless the number of terms in the sum can be determined first. And even then it would be less efficient than simulating a single Geometric…
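A quick numerical check of this representation, as a sketch under one assumption of mine about the parameterisation: Geo(p) is taken on {0,1,2,…} with P(X=k) = (1-p)pᵏ, which is the convention under which the probability generating functions match, and B is a Bernoulli (single-trial Binomial) variate. The function names and the truncation at `n_bits` terms are also mine.

```python
import numpy as np

def geometric_two_terms(p, size, rng):
    """X = B + 2 G with B ~ Bernoulli(p/(1+p)) and G ~ Geo(p^2) on {0,1,...}."""
    b = rng.random(size) < p / (1.0 + p)
    # NumPy's geometric counts trials up to the first success (support 1,2,...),
    # so subtract 1 and use success probability 1 - p^2.
    g = rng.geometric(1.0 - p ** 2, size) - 1
    return b.astype(np.int64) + 2 * g

def geometric_from_bits(p, size, rng, n_bits=20):
    """Iterated version: X = sum_k 2^k B_k with B_k ~ Bernoulli(q_k/(1+q_k)), q_k = p^(2^k)."""
    x = np.zeros(size, dtype=np.int64)
    for k in range(n_bits):
        q = p ** (2 ** k)  # underflows harmlessly to 0.0 for large k
        x += (rng.random(size) < q / (1.0 + q)).astype(np.int64) << k
    return x
```

Under this convention both samplers should reproduce the mean p/(1-p) and the mass 1-p at zero. The second one also illustrates the point about efficiency: it spends `n_bits` uniforms per draw, where a single Geometric simulation needs one.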

Filed under: Books, Kids, Statistics Tagged: arXiv, Bernoulli distribution, Geometric distribution, normal distribution
