Nature [5 Jan issue]

Nature, in its 5 Jan issue, has an editorial by Daniël Lakens asking for statistical reviews prior to research being performed and data being collected. This sounds like a reasonable idea, provided reviewers with proper expertise and dedication can be found, an issue the editorial does not mention. Its main focus is on sample size, which sounds overly simplistic… It contains the following funny (?) jab:

“I do not propose that reviewers debate matters such as frequentist versus Bayesian philosophies of statistics.”

One could see a connexion with preregistered trials, with the sound argument that hypotheses should be clearly stated prior to getting data.

The issue also contains an open-access paper by WHO and U of Washington researchers (incl. Bayesian John Wakefield) on estimating the number of COVID-19 deaths from excess deaths, with the difficulty that data is missing for some countries. With a critical commentary from Enrique Acosta on not adjusting for avoided deaths. And apparently (and surprisingly) not accounting for age structure in each country, esp. since regression is involved. The modelling is done via a Poisson count model, analysed by Bayesian methods. As often, I wonder why France doesn’t feature in the picture, except for a mention that the ratio of excess deaths to COVID-19 deaths is less than one, and French Guiana is not on the maps… Unclear issues about highly reliable countries like Germany and Sweden. And splines… Instead of Gaussian processes. No attempt at capture-recapture?

And a somewhat puzzling paper [rewarded by the journal cover] on diminishing disruption of scientific papers over time. It is sort of obvious that, as the numbers explode, novelty and impact diminish. If only because an increasing number of papers never get cited. Based on a single CD index (with a typo in the formula!). Nothing about maths? As noted by the authors in their conclusion, the sheer number of disruptive papers has remained essentially constant…

master project?

A potential master project for my students next year inspired by an X validated question: given a Gaussian mixture density

f(x)\propto\sum_{i=1}^m \omega_i \sigma^{-1}\,\exp\{-(x-\mu_i)^2/2\sigma^2\}

with m known, the weights summing up to one, and the (prior) information that all means are within (-C,C), derive the parameters of this mixture from a sufficiently large number of evaluations of f. Pay attention to the numerical issues associated with the resolution.  In a second stage, envision this problem from an exponential spline fitting perspective and optimise the approach if feasible.
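To give a flavour of the exponential connection, here is a minimal sketch and not a solution to the project. Multiplying f(x) by exp{x²/2σ²} removes the common Gaussian envelope and leaves a sum of m exponentials in x, with rates μ_i/σ², which a Prony-type argument can resolve from equally spaced evaluations. The sketch below makes strong simplifying assumptions of my own: m=2, σ is treated as known, and the evaluations are noise-free. The function names (`mixture`, `prony_two_components`) and the grid parameters `x0` and `h` are hypothetical choices for illustration.

```python
import math

def mixture(x, weights, means, sigma):
    """Mixture with a common scale sigma, up to a multiplicative constant."""
    return sum(w / sigma * math.exp(-(x - mu) ** 2 / (2 * sigma ** 2))
               for w, mu in zip(weights, means))

def prony_two_components(f, sigma, x0, h):
    """Recover (weights, means) for m = 2 from four evaluations of f.

    ASSUMPTIONS (for illustration only): sigma is known, f is evaluated
    without noise, and the grid x0, x0+h, x0+2h, x0+3h is chosen by hand.
    """
    s2 = sigma ** 2
    xs = [x0 + k * h for k in range(4)]
    # Remove the Gaussian envelope: g_k = a_1 z_1^k + a_2 z_2^k,
    # with z_i = exp(mu_i * h / sigma^2).
    g = [f(x) * math.exp(x ** 2 / (2 * s2)) for x in xs]
    # Linear recurrence g_{k+2} = s * g_{k+1} - p * g_k, solved by Cramer's rule.
    det = g[0] * g[2] - g[1] ** 2
    s = (g[0] * g[3] - g[1] * g[2]) / det
    p = (g[1] * g[3] - g[2] ** 2) / det
    # z_1 and z_2 are the roots of z^2 - s z + p = 0.
    disc = math.sqrt(s * s - 4 * p)
    z = [(s - disc) / 2, (s + disc) / 2]
    means = [s2 * math.log(zi) / h for zi in z]
    # Amplitudes from g_0 = a_1 + a_2 and g_1 = a_1 z_1 + a_2 z_2.
    a2 = (g[1] - z[0] * g[0]) / (z[1] - z[0])
    a = [g[0] - a2, a2]
    # Undo the shifts to get quantities proportional to the weights.
    raw = [ai * math.exp(-mu * x0 / s2 + mu ** 2 / (2 * s2))
           for ai, mu in zip(a, means)]
    total = sum(raw)
    return [r / total for r in raw], means

true_w, true_mu, sigma = [0.3, 0.7], [-1.0, 2.0], 1.0
# The factor 5.0 mimics the unknown proportionality constant in f.
w_hat, mu_hat = prony_two_components(
    lambda x: 5.0 * mixture(x, true_w, true_mu, sigma), sigma, x0=-0.5, h=0.5)
print(w_hat, mu_hat)
```

The numerical issues show up quickly: the factor exp{x²/2σ²} overflows for evaluations far from the origin, and the determinant `det` collapses when the two means are close or when h is small, which is presumably where the project becomes interesting.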

parallel tempering on optimised paths

Saifuddin Syed, Vittorio Romaniello, Trevor Campbell, and Alexandre Bouchard-Côté, whom I met and had discussions with on my “last” trip to UBC, in December 2019, just arXived a paper on parallel tempering (PT), making the choice of tempering path an optimisation problem. They address the touchy issue of designing a sequence of tempered targets when the starting distribution π⁰, e.g., the prior, and the final distribution π¹, e.g., the posterior, are hugely different, e.g., almost singular.

“…theoretical analysis of reversible variants of PT has shown that adding too many intermediate chains can actually deteriorate performance (…) [while] on non reversible regime adding more chains is guaranteed to improve performances.”

The above applies to geometric combinations of π⁰ and π¹, a path which “suffers from an arbitrarily suboptimal global communication barrier”, according to the authors (although the counterexample is not completely convincing since π⁰ and π¹ share the same variance). They propose a more non-linear form of tempering, with constraints on the dependence of the powers on the temperature t∈(0,1). Defining the global communication barrier as an average over temperatures of the rejection rate, the path characteristics (e.g., the coefficients of a spline function) can then be optimised in terms of this objective. And the temperature schedule is derived from the fact that the non-asymptotic round trip rate is maximised when the rejection rates are all equal. (As a side item, the technique presented in the earlier tempering paper by Syed et al. was recently exploited for high-resolution imaging of a black hole from the M87 galaxy.)
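The equal-rejection principle can be illustrated with a minimal sketch, which is my own simplification and not the authors' implementation: given rejection rates estimated between adjacent chains on a current schedule, accumulate them into a cumulative barrier, interpolate it (linearly here, an assumption), and pick new temperatures at equally spaced levels of that cumulative barrier. The function name `equalise_schedule` and the numbers in the example are hypothetical.

```python
def equalise_schedule(schedule, rejection_rates):
    """Return a new schedule with (approximately) equal rejection rates.

    schedule: increasing temperatures t_0 = 0 < ... < t_N = 1.
    rejection_rates: r_i, estimated rejection rate of swaps between
        the chains at t_i and t_{i+1} (so len(schedule) - 1 values).
    ASSUMPTION: the cumulative barrier is interpolated linearly between
    the current temperatures, and at least one rate is positive.
    """
    n = len(rejection_rates)
    assert len(schedule) == n + 1
    # Cumulative rejection along the current schedule.
    cum = [0.0]
    for r in rejection_rates:
        cum.append(cum[-1] + r)
    total = cum[-1]
    new = [schedule[0]]
    j = 0
    for k in range(1, n):
        target = total * k / n  # equally spaced levels of the barrier
        while cum[j + 1] < target:
            j += 1
        # Linear interpolation inside the interval [t_j, t_{j+1}].
        frac = (target - cum[j]) / (cum[j + 1] - cum[j])
        new.append(schedule[j] + frac * (schedule[j + 1] - schedule[j]))
    new.append(schedule[-1])
    return new, total

old = [0.0, 0.25, 0.5, 0.75, 1.0]
rates = [0.5, 0.25, 0.125, 0.125]  # most rejections occur near t = 0
new, total = equalise_schedule(old, rates)
print(new)  # [0.0, 0.125, 0.25, 0.5, 1.0]
```

In this toy example the new schedule crowds temperatures near t=0, where most rejections occur, so that each adjacent pair carries the same share (0.25) of the cumulative rejection. In practice the rates are themselves Monte Carlo estimates and the update would be iterated.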

