Archive for Biometrika

marginal likelihood as exhaustive X validation

Posted in Statistics on October 9, 2020 by xi'an

In the June issue of Biometrika (for which I am deputy editor), Edwin Fong and Chris Holmes have a short paper (that I did not process!) on the validation of the marginal likelihood as the unique coherent updating rule. Marginal in the general sense of Bissiri et al. (2016). Coherent in the sense of being invariant to the order of input of exchangeable data, if in a somewhat self-defining version (Definition 1). As a consequence, the marginal likelihood arises as the unique prequential scoring rule under coherent belief updating in the Bayesian framework. (It is unique given the prior or its generalisation, obviously.)
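For concreteness, a reminder (in my notation, not the paper's) of why the log marginal likelihood is a prequential score: by the chain rule, for any ordering σ of the exchangeable sample,

\[
\log p(y_{1:n}) \;=\; \sum_{t=1}^{n} \log p\bigl(y_{\sigma(t)} \mid y_{\sigma(1)},\dots,y_{\sigma(t-1)}\bigr),
\]

so the marginal accumulates one-step-ahead log predictive densities, and coherence amounts to this sum being the same whatever the permutation σ.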

“…we see that 10% of terms contributing to the marginal likelihood come from out-of-sample predictions, using on average less than 5% of the available training data.”

The paper also contains the interesting remark that the log marginal likelihood is the average leave-p-out X-validation score, across all values of p. Which shows that, provided the marginal can be approximated, the X-validation assessment is feasible. Which leads to a highly relevant (imho) spotlight on how this expresses the (deadly) impact of the prior selection on the numerical value of the marginal likelihood. Leaving out some of the least informative terms in the X-validation leads to exactly the log geometric intrinsic Bayes factor of Berger & Pericchi (1996). A most interesting connection with the Bayes factor community, but one that depends on the choice of the dismissed fraction of p's.
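As a sanity check (a toy sketch of mine, not taken from the paper), one can verify on a conjugate Normal model that the sum of one-step-ahead log predictive densities equals the log marginal likelihood for any ordering of the data; averaging such sums over orderings and grouping terms by the size of the held-out set is what produces the leave-p-out cross-validation reading.

```python
import numpy as np
from scipy import stats

# Toy conjugate model (assumed for illustration): y_i ~ N(theta, 1), theta ~ N(0, 1).
rng = np.random.default_rng(0)
y = rng.normal(0.5, 1.0, size=10)
n = len(y)

# Closed-form log marginal likelihood: y ~ N_n(0, I + 11').
log_marginal = stats.multivariate_normal(
    mean=np.zeros(n), cov=np.eye(n) + np.ones((n, n))
).logpdf(y)

def log_predictive(y_new, y_obs):
    """Log posterior predictive density p(y_new | y_obs) under the conjugate model."""
    post_var = 1.0 / (1.0 + len(y_obs))        # posterior variance of theta
    post_mean = post_var * np.sum(y_obs)       # posterior mean of theta
    return stats.norm(post_mean, np.sqrt(post_var + 1.0)).logpdf(y_new)

# Prequential sum for a random ordering: identical to the log marginal likelihood.
perm = rng.permutation(n)
prequential = sum(log_predictive(y[perm[t]], y[perm[:t]]) for t in range(n))
print(round(log_marginal, 6), round(prequential, 6))   # the two values coincide
```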

in the name of eugenics [book review]

Posted in Statistics on August 30, 2020 by xi'an

In preparation for the JSM round table on eugenics and statistics, organised by the COPSS Award Committee, I read Daniel Kevles' 1985 book, In the Name of Eugenics: Genetics and the Uses of Human Heredity, as recommended by Stephen Stigler. While a large part of the book first appeared in The New Yorker, to which Kevles contributed on a regular basis, and while he abstains from advanced methodological descriptions, focussing more on the actors of this first attempt at human genetics and on the societal consequences of biased interpretations and mistaken theories, his book is a scholarly accomplishment, with a massive section of notes and numerous references. It is a comparative history of eugenics from the earliest days (Francis Galton, 1865) to the time of writing (1984), since "modern eugenics" survived the exposure of the Nazi crimes (including imposed sterilizations that are still enforced to this day). Comparative between the UK and the US, however, hardly considering other countries, except for a few connections with Germany and the Soviet Union, the latter solely in the perspective of Muller's sojourn there and of Haldane's uneasy "open-minded" approach to Lysenkoism. (Japan is also mentioned, in connection with Neel's study of the genetic impact of the atomic bombs.)

While discussing the broader picture, the book mostly concentrates on the scientific aspects, on how the misguided attempts to reduce intelligence to IQ tests or to a single gene, and to improve humanity (or some of its subgroups) by State-imposed policies perceived as crude genetic engineering, simultaneously led to modern genetics and to a refutation of eugenic perspectives by most if not all. There is very little about statistical methodology per se, besides stories on the creation of Biometrika and the Annals of Eugenics, but much more on the accumulation of data by eugenic societies and the exploitation of these data for ideological purposes. Galton and Pearson get the lion's share of the book, while Fisher does not get more coverage than Haldane or Penrose. Overall, I found the book immensely informative in exposing the diversity of scientific and pseudo-scientific viewpoints within eugenics and its evolution towards human genetics as a scientific endeavour.

adaptive ABC tolerance

Posted in Books, Statistics, University life on June 2, 2020 by xi'an

“There are three common approaches for selecting the tolerance sequence (…) [they] can lead to inefficient sampling”

Umberto Simola, Jessi Cisewski-Kehe, Michael Gutmann and Jukka Corander recently arXived a paper entitled Adaptive Approximate Bayesian Computation Tolerance Selection. I appreciate that they start from our ABC-PMC paper, i.e., Beaumont et al. (2009) [although the representation that the ABC tolerances are fixed in advance is somewhat incorrect, in that our codes used quantiles of the simulated distances to set the tolerances]. This quantile approach is also the one advocated for the initialisation step by the current paper, although it remains a wee bit vague. Subsequent steps are based on the proximity between the resulting approximations to the ABC posteriors, more exactly on a quantile derived from the maximum of the ratio between two successive estimated ABC posteriors, mimicking the Accept-Reject step, albeit always one step too late. The iteration stops when the ratio is almost one, possibly missing the target due to Monte Carlo variability. (Recall that the "optimal" tolerance is not zero for a finite sample size.)
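As an aside, here is a rough and entirely schematic sketch (my own toy code, not the paper's algorithm) of what setting tolerances as quantiles of the simulated distances looks like in an ABC-PMC-flavoured loop; the model, prior, jitter scale and 10% quantile are arbitrary placeholders.

```python
import numpy as np

rng = np.random.default_rng(1)
y_obs = 2.0                                    # observed (scalar) summary, toy value

def simulate(theta):                           # toy model: y | theta ~ N(theta, 1)
    return rng.normal(theta, 1.0)

n_particles = 1000
theta = rng.normal(0.0, np.sqrt(10.0), n_particles)   # toy prior: theta ~ N(0, 10)

for t in range(5):
    dist = np.abs(np.array([simulate(th) for th in theta]) - y_obs)
    eps = np.quantile(dist, 0.10)              # tolerance = 10% quantile of distances
    survivors = theta[dist <= eps]             # particles within the new tolerance
    print(f"iteration {t}: epsilon = {eps:.3f}, survivors = {survivors.size}")
    # crude stand-in for the weighted PMC move: resample survivors and jitter them
    theta = rng.choice(survivors, size=n_particles) + rng.normal(0.0, 0.5, n_particles)
```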

“…the decrease in the acceptance rate is mitigated by the improvement in the proposed particles.”

A problem is that the method depends on the form of the posterior approximation and requires non-parametric, hence imprecise, steps. Maybe variational encoders could help. The density-ratio approach of Sugiyama et al. (2012), of which I knew nothing, is interesting, the core idea being that the ratio of two densities is also the solution to minimising a distance between the numerator density and a variable function times the denominator density. However, since only the maximum of the ratio is needed, a more focused approach could be devised, rather than first approximating the ratio and then maximising the estimate. Maybe the solution of Goffinet et al. (1992) for estimating an accept-reject constant could work.
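To make the density-ratio idea slightly more concrete (my reading of the least-squares variant, not the paper's notation): for densities p and q with q positive on the support of p, the ratio r = p/q minimises in f the weighted squared distance

\[
J(f) \;=\; \frac{1}{2}\int \frac{\bigl(f(x)\,q(x) - p(x)\bigr)^{2}}{q(x)}\,\mathrm{d}x
\;=\; \frac{1}{2}\int f(x)^{2}\,q(x)\,\mathrm{d}x \;-\; \int f(x)\,p(x)\,\mathrm{d}x \;+\; \text{constant},
\]

where the two integrals are expectations under q and under p respectively, hence estimable from samples of each distribution without evaluating either density, which is what makes such a ratio estimate usable within ABC.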

A further comment is that the estimated density is not properly normalised, which weakens the Accept-Reject analogy since the optimum may well stand above one, and the procedure thus stop "too soon". (Incidentally, the paper contains the mixture example of Sisson et al. (2007), for which our own graphs were strongly criticised during our Biometrika submission!)

the exponential power of now

Posted in Books, Statistics, University life on March 22, 2020 by xi'an

The New York Times ran an interview on 13 March with Britta Jewell (MRC, Imperial College London) and Nick Jewell (London School of Hygiene and Tropical Medicine & UC Berkeley), both epidemiologists. (Nick is also an AE for Biometrika.) There they explain quite convincingly the devastating power of exponential growth and the resulting need for immediate reaction, an urgency that Western governments failed to heed, unsurprisingly including the US federal government. Maybe they should have been told afresh about the legend of paal paysam, where the king who lost to Krishna was asked to double rice grains on the successive squares of a chessboard. (Although this is presumably too foreign a thought experiment for The agent orange, who presumably prefers the unbelievable ideological rantings of John Ioannidis, who apparently does not mind sacrificing "people with limited life expectancies" for the sake of the economy.) Incidentally, I find the title "The exponential power of now" fabulous!
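For the record, the arithmetic behind the legend (a trivial illustration of mine): starting from a single grain and doubling across the 64 squares of the board gives 2⁶⁴−1 grains in total.

```python
# Arithmetic of the paal paysam legend: one grain on the first square, doubling thereafter.
grains_last_square = 2 ** 63          # grains on the 64th square alone
grains_total = 2 ** 64 - 1            # sum of 2^k for k = 0, ..., 63
print(f"{grains_last_square:.3e} grains on the last square, {grains_total:.3e} in total")
# prints roughly 9.223e+18 and 1.845e+19
```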

séminaire P de S

Posted in Books, pictures, Statistics, University life on February 18, 2020 by xi'an

As I was in Paris and free for the occasion (!), I attended the Paris Statistics seminar this afternoon, in the Latin Quarter. With a first talk by Kweku Abraham on Bayesian inverse problems that set a prior on the quantity of interest, γ, rather than on its transform G(γ), the quantity observed with noise. I am always perturbed by the juggling of different distances, like L² versus Kullback-Leibler, in non-parametric frameworks. Reminding me of probabilistic numerics, at least in the framework, since the crux of the talk was 100% about convergence. And a second talk by Lénaïc Chizat on convex neural networks corresponding to an infinite number of neurons, with surprising properties, including implicit bias. And a third talk by Anne Sabourin on PCA for extremes, which assumed very little on the model but more on the geometry of the distribution, like extremes being concentrated on a subspace. As I was rather tired from an intense week at Warwick, and after a weekend of reading grant applications and Biometrika submissions (!), my foggy brain kept switching to these proposals, trying to make connections with the talks, not completely inappropriately in two cases out of three. (I am afraid the same may happen tomorrow at our probability seminar on computer-based proofs!)