## Archive for review

## open reviews

Posted in Statistics with tags brutalism, cross validated, ICLR, Peer Community, proceedings, refereeing, review on September 13, 2019 by xi'an

**W**hen looking at a question on X validated, on the expected Metropolis-Hastings ratio being one (not all the time!), I was somewhat bemused at the OP linking to an anonymised paper under review for ICLR, as I thought this was breaching standard confidentiality rules for reviews. Digging a wee bit deeper, I realised this was a paper from the previous ICLR conference, already published both on arXiv and in the 2018 conference proceedings, and that ICLR was actually resorting to an open review policy where both papers and reviews were available and, even better, where anyone could comment on the paper while it was under review. And after. Which I think is a great idea, the worst possible situation being a poor paper remaining undiscussed. While I am not a big fan of the brutalist approach of many machine-learning conferences, where the restrictive format of both submissions and reviews essentially prevents in-depth reviews, this feature should be added to statistics journal webpages (until PCIs become the norm).

## ABC intro for Astrophysics

Posted in Books, Kids, Mountains, R, Running, Statistics, University life with tags ABC, Approximate Bayesian computation, Autrans, Bayesian foundations, Bayesian methodology, Book, computational astrophysics, review, Statistics for Astrophysics, summer course, survey, Vercors on October 15, 2018 by xi'an

**T**oday I received in the mail a copy of the short book published by EDP Sciences after the courses we gave last year at the astrophysics summer school, in Autrans. Which contains a quick introduction to ABC extracted from my notes (which I still hope to turn into a book!). As well as a longer coverage of Bayesian foundations and computations by David Stenning and David van Dyk.

## accelerating MCMC

Posted in Statistics with tags acceleration of MCMC algorithms, coupling, Hamiltonian Monte Carlo, India, MCMC, Monte Carlo Statistical Methods, motorbike, NUTS, Rajasthan, review, survey, tempering, WIREs on May 29, 2017 by xi'an

**I** have recently [well, not so recently!] been asked to write a review paper on ways of accelerating MCMC algorithms for the [review] journal WIREs Computational Statistics and would welcome all suggestions towards the goal of accelerating MCMC algorithms. Besides [and including more on]

- coupling strategies using different kernels and switching between them;
- tempering strategies using flatter or lower dimensional targets as intermediary steps, e.g., à la Neal;
- sequential Monte Carlo with particle systems targeting again flatter or lower dimensional targets and adapting proposals to this effect;
- Hamiltonian MCMC, again with connections to Radford (and more generally ways of avoiding rejections);
- adaptive MCMC, obviously;
- Rao-Blackwellisation, just as obviously (in the sense that increasing the precision in the resulting estimates means fewer simulations).
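To make the last item concrete, here is a minimal sketch of Rao-Blackwellisation in a toy setting that is my own choice rather than anything taken from the planned review: a two-stage Gibbs sampler on a standard bivariate normal with correlation `rho`, where the plain average of the `x` draws is compared with the average of the conditional expectations `rho * y`. The function names, the default `rho`, the number of replications and the seed are all illustrative assumptions.

```python
import random
import statistics


def gibbs_bivariate_normal(n_iter, rho, rng):
    """Two-stage Gibbs sampler for a standard bivariate normal with correlation rho.

    Returns two estimates of E[X] = 0 built from the same chain:
    the plain average of the x draws, and the Rao-Blackwellised
    average of the conditional means E[X | y] = rho * y.
    """
    sd = (1.0 - rho ** 2) ** 0.5  # conditional standard deviation
    x = 0.0
    sum_x, sum_cond = 0.0, 0.0
    for _ in range(n_iter):
        y = rng.gauss(rho * x, sd)  # draw y given x
        x = rng.gauss(rho * y, sd)  # draw x given y
        sum_x += x
        sum_cond += rho * y         # E[x | y], using the y that x was drawn from
    return sum_x / n_iter, sum_cond / n_iter


def compare(n_rep=200, n_iter=500, rho=0.5, seed=1):
    """Empirical variances of both estimators over independent replications."""
    rng = random.Random(seed)
    plain, rao_blackwell = [], []
    for _ in range(n_rep):
        est_plain, est_rb = gibbs_bivariate_normal(n_iter, rho, rng)
        plain.append(est_plain)
        rao_blackwell.append(est_rb)
    return statistics.variance(plain), statistics.variance(rao_blackwell)


if __name__ == "__main__":
    var_plain, var_rb = compare()
    print("variance of plain average:        ", var_plain)
    print("variance of Rao-Blackwell average:", var_rb)
```

In this toy case the conditional expectation is available in closed form, which is what makes the improvement essentially free; for the same number of simulations the Rao-Blackwellised average should show the smaller variance, or equivalently the same precision should be reached with a shorter chain.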

## estimation versus testing [again!]

Posted in Books, Statistics, University life with tags Bayes factors, Bayesian inference, Harold Jeffreys, hypothesis testing, parameter estimation, point null hypotheses, psychology, refereeing, review, spike-and-slab prior, unification on March 30, 2017 by xi'an

**T**he following text is a review I wrote of the paper “Parameter estimation and Bayes factors”, written by J. Rouder, J. Haaf, and J. Vandekerckhove. (As the journal to which it is submitted gave me the option to sign my review.)

The opposition between estimation and testing as a matter of prior modelling rather than inferential goals is quite unusual in the Bayesian literature. In particular, if one follows Bayesian decision theory as in Berger (1985) there is no such opposition, but rather the use of different loss functions for different inference purposes, while the Bayesian model remains single and unitarian.

Following Jeffreys (1939), it sounds more congenial to the Bayesian spirit to return the posterior probability of an hypothesis *H⁰* as an answer to the question whether this hypothesis holds or does not hold. This however proves impossible when the “null” hypothesis *H⁰* has prior mass equal to zero (or is not measurable under the prior). In such a case the mathematical answer is a probability of zero, which may not satisfy the experimenter who asked the question. More fundamentally, the said prior proves inadequate to answer the question and hence to incorporate the information contained in this very question. This is how Jeffreys (1939) justifies the move from the original (and deficient) prior to one that puts some weight on the null (hypothesis) space. It is often argued that the move is unnatural and that the null space does not make sense, but this only applies when believing very strongly in the model itself. When considering the issue from a modelling perspective, accepting the null *H⁰* means using a new model to represent the model and hence testing becomes a model choice problem, namely whether or not one should use a complex or simplified model to represent the generation of the data. This is somehow the “unification” advanced in the current paper, albeit it does appear originally in Jeffreys (1939) [and then numerous others] rather than the relatively recent Mitchell & Beauchamp (1988). Who may have launched the spike & slab denomination.

I have trouble with the analogy drawn in the paper between the spike & slab estimate and the Stein effect. While the posterior mean derived from the spike & slab posterior is indeed a quantity drawn towards zero by the Dirac mass at zero, it is rarely the point in using a spike & slab prior, since this point estimate does not lead to a conclusion about the hypothesis: for one thing it is never exactly zero (if zero corresponds to the null). For another thing, the construction of the spike & slab prior is both artificial and dependent on the weights given to the spike and to the slab, respectively, to borrow expressions from the paper. This approach thus leads to model averaging rather than hypothesis testing or model choice and therefore fails to answer the (possibly absurd) question as to which model to choose. Or refuse to choose. But there are cases when a decision must be made, like continuing a clinical trial or putting a new product on the market. Or not.
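To spell out the shrinkage at stake, here is a short derivation under notation that is mine and not the paper's: a sampling density f(x|θ), a spike weight ρ on the null value θ = 0, a slab density g, and m₁ for the marginal likelihood under the slab.

```latex
\begin{align*}
\pi(\theta) &= \rho\,\delta_0(\theta) + (1-\rho)\,g(\theta), \\
m_1(x) &= \int f(x\mid\theta)\,g(\theta)\,\mathrm{d}\theta, \\
\mathbb{P}(\theta=0\mid x) &= \frac{\rho\,f(x\mid 0)}{\rho\,f(x\mid 0)+(1-\rho)\,m_1(x)}, \\
\mathbb{E}[\theta\mid x] &= \bigl\{1-\mathbb{P}(\theta=0\mid x)\bigr\}\,
\frac{\int \theta\,f(x\mid\theta)\,g(\theta)\,\mathrm{d}\theta}{m_1(x)}.
\end{align*}
```

The posterior mean is thus the slab posterior mean multiplied by a factor strictly between zero and one (when 0 < ρ < 1 and both f(x|0) and m₁(x) are positive), hence shrunk towards zero without ever reaching it unless the slab posterior mean is itself zero, and both this factor and the posterior probability of the null move with the choice of ρ and g.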

In conclusion, the paper surprisingly bypasses the decision-making aspect of testing and hence ends up with an inconclusive setting, staying midstream between Bayes factors and credible intervals. And failing to provide a tool for decision making. The paper also fails to acknowledge the strong dependence of the Bayes factor on the tail behaviour of the prior(s), which cannot be [completely] corrected by a finite sample, hence its relativity and the unreasonableness of a fixed scale like Jeffreys’ (1939).

## new reproducibility initiative in TOMACS

Posted in Books, Statistics, University life with tags academic journals, ACM, ACM Transactions on Modeling and Computer Simulation, computer, modelling, refereeing, replicating computational results procedure, reproducible research, review, software, TOMACS on April 12, 2016 by xi'an

*[A quite significant announcement last October from TOMACS that I had missed:]*

To improve the reproducibility of modeling and simulation research, TOMACS is pursuing two strategies.

*Number one:* authors are encouraged to include sufficient information about the core steps of the scientific process leading to the presented research results and to make as many of these steps as transparent as possible, e.g., data, model, experiment settings, incl. methods and configurations, and/or software. Associate editors and reviewers will be asked to assess the paper also with respect to this information. Thus, although not required, submitted manuscripts which provide clear information on how to generate reproducible results, whenever possible, will be considered favorably in the decision process by reviewers and the editors.

*Number two:* we will form a new replicating computational results activity in modeling and simulation as part of the peer reviewing process (adopting the procedure RCR of ACM TOMS). Authors who are interested in taking part in the RCR activity should announce this in the cover letter. The associate editor and editor in chief will assign a RCR reviewer for this submission. This reviewer will contact the authors and will work together with the authors to replicate the research results presented. Accepted papers that successfully undergo this procedure will be advertised at the TOMACS web page and will be marked with an ACM reproducibility brand. The RCR activity will take place in parallel to the usual reviewing process. The reviewer will write a short report which will be published alongside the original publication. TOMACS also plans to publish short reports about lessons learned from non-successful RCR activities.

*[And now the first paper reviewed according to this protocol has been accepted:]*

The paper Automatic Moment-Closure Approximation of Spatially Distributed Collective Adaptive Systems is the first paper that took part in the new replicating computational results (RCR) activity of TOMACS. The paper completed successfully the additional reviewing as documented in its RCR report. This reviewing is aimed at ensuring that computational results presented in the paper are replicable. Digital artifacts like software, mechanized proofs, data sets, test suites, or models, are evaluated referring to ease of use, consistency, completeness, and being well documented.

## on the Jeffreys-Lindley’s paradox (revision)

Posted in Statistics, University life with tags Aris Spanos, Bayesian foundations, Dennis Lindley, improper priors, Jeffreys-Lindley paradox, paradoxes, philosophy, Philosophy of Science, review, revision on September 17, 2013 by xi'an

**A**s mentioned here a few days ago, I have been revising my paper on the Jeffreys-Lindley paradox for Philosophy of Science. It came as a bit of a (very pleasant) surprise that this journal was ready to consider a revised version of the paper, given that I have no formal training in philosophy and that the (first version of the) paper was rather hurriedly made of a short text written for the 95th birthday of Dennis Lindley and of my blog post on Aris Spanos’ “*Who should be afraid of the Jeffreys-Lindley paradox?*”, recently published in Philosophy of Science. So I found both reviewers very supportive and I am grateful for their suggestions to improve both the scope and the presentation of the paper. It has been resubmitted and rearXived, and I am now waiting for the decision of the editorial team with *the* appropriate philosophical sense of detachment…