## Archive for Journal of the Royal Statistical Society

## unbiased MCMC discussed at the RSS tomorrow night

Posted in Books, Kids, pictures, Statistics, Travel, University life with tags AABI, coupling, discussion paper, Journal of the Royal Statistical Society, Markov chain Monte Carlo algorithm, MCMC, Read paper, Royal Statistical Society, Series B, unbiasedness, Université Paris Dauphine, Vancouver on December 10, 2019 by xi'an

**T**he paper ‘Unbiased Markov chain Monte Carlo methods with couplings’ by Pierre Jacob et al. will be discussed (or Read) at the Royal Statistical Society, 12 Errol Street, London, tomorrow night, Wed 11 December, at 5pm London time, with a pre-discussion session at 3pm, involving Chris Sherlock and Pierre Jacob, and chaired by Ioanna Manolopoulou. While I will alas miss this opportunity, due to my trip to Vancouver over the weekend, it is great that the young tradition of pre-discussion sessions has been rekindled, as it helps put the paper into perspective for a wider audience and thus makes the more formal Read Paper session more profitable. As we discussed the paper in Paris Dauphine with our graduate students a few weeks ago, we will for certain send one or several written discussions to Series B!

## a good start in Series B!

Posted in Books, pictures, Statistics, University life with tags ABC, approximate Bayesian inference, generative model, Journal of the Royal Statistical Society, Olympic National Park, peer review, Series B, sunrise, Wasserstein distance on January 5, 2019 by xi'an

**J**ust received the great news for the turn of the year that our paper on ABC using Wasserstein distance was accepted in Series B! Inference in generative models using the Wasserstein distance, written by Espen Bernton, Pierre Jacob, Mathieu Gerber, and myself, bypasses the (nasty) selection of summary statistics in ABC by considering the Wasserstein distance between observed and simulated samples. It focuses in particular on non-iid cases like time series in what I find fairly innovative ways. I am thus very glad the paper is going to appear in JRSS B, as it has methodological consequences that should appeal to the community at large.

## selected parameters from observations

Posted in Books, Statistics with tags censored data, FDR, joint dis, Journal of the Royal Statistical Society, random effects, ranking and selection, Stephen Senn, truncated normal on December 7, 2018 by xi'an

**I** recently read a fairly interesting paper by Daniel Yekutieli on a Bayesian perspective for parameters selected after viewing the data, published in Series B in 2012. (Disclaimer: I was not involved in processing this paper!)

The first example differentiates the Normal-Normal posterior on the mean, when θ is N(0,1) and x is N(θ,1), from the restricted posterior when θ is N(0,1) and x is N(θ,1) truncated to (0,∞), by restating the latter as the repeated generation from the joint until x>0. This does not sound particularly controversial, except for the notion of *selecting the parameter after viewing the data*. That the posterior support may depend on the data is not that surprising..!
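As a worked sketch of the two posteriors being contrasted, assuming the standard reading of the setting above, with φ and Φ denoting the standard Normal density and cdf (a notation introduced here for illustration, not necessarily the paper's):

```latex
% Unrestricted case: \theta \sim N(0,1), \; x \mid \theta \sim N(\theta,1)
\pi(\theta \mid x) \propto \varphi(\theta)\,\varphi(x-\theta)
\quad\Longrightarrow\quad
\theta \mid x \sim N\!\left(\tfrac{x}{2},\, \tfrac{1}{2}\right)

% Restricted case: x \mid \theta \sim N(\theta,1) truncated to (0,\infty),
% using P(x>0 \mid \theta) = \Phi(\theta)
f(x \mid \theta) = \frac{\varphi(x-\theta)}{\Phi(\theta)}\,\mathbb{1}\{x>0\}
\quad\Longrightarrow\quad
\pi(\theta \mid x) \propto \frac{\varphi(\theta)\,\varphi(x-\theta)}{\Phi(\theta)},
\qquad x>0
```

The only difference between the two is the factor 1/Φ(θ), which comes from the normalising constant of the truncated sampling distribution and penalises values of θ for which x>0 was likely anyway.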

“The observation that selection affects Bayesian inference carries the important implication that in Bayesian analysis of large data sets, for each potential parameter, it is necessary to explicitly specify a selection rule that determines when inference is provided for the parameter and provide inference that is based on the selection-adjusted posterior distribution of the parameter.” (p.31)

The more interesting distinction is between “fixed” and “random” parameters (Section 2.1), which separates cases where the data is from a truncated distribution (given the parameter) and cases where the joint distribution is truncated but misses the normalising constant (function of θ) for the truncated sampling distribution. The “mixed” case introduces a hyperparameter λ, and the normalising constant integrates out θ and depends on λ, which amounts to switching to another (marginal) prior on θ. This is quite interesting even though one can debate the very notions of “random” and “mixed” “parameters”, which are those where the posterior most often changes, as true parameters. Take for instance Stephen Senn’s example (p.6) of the mean associated with the largest observation in a Normal mean sample, with distinct means. When accounting for the distribution of the largest variate, this random variable is no longer a Normal variate with a single unknown mean but it instead depends on all the means of the sample. Speaking of the largest observation mean is therefore misleading in that it is neither the mean of the largest observation, nor a parameter *per se* since the index [of the largest observation] is a random variable induced by the observed sample.
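Senn's point can be checked by simulation. The following is a minimal sketch, not taken from either paper: the vector of means, the unit variance, the number of replications and the function name `selection_gap` are all hypothetical choices made for illustration. It repeatedly draws a Normal sample with distinct means, picks the largest observation, and averages the gap between that observation and the mean of the component that produced it.

```python
import random
import statistics


def selection_gap(means, n_rep=20000, seed=0):
    """Monte Carlo average of (largest observation - mean of its component).

    Each replication draws x_i ~ N(means[i], 1) independently, finds the
    index of the largest x_i, and records x_max minus the mean attached
    to that index.
    """
    rng = random.Random(seed)
    gaps = []
    for _ in range(n_rep):
        xs = [rng.gauss(mu, 1.0) for mu in means]
        # the index of the largest observation changes from one sample to the next
        i_max = max(range(len(xs)), key=xs.__getitem__)
        gaps.append(xs[i_max] - means[i_max])
    return statistics.fmean(gaps)


if __name__ == "__main__":
    hypothetical_means = [0.0, 0.5, 1.0, 1.5, 2.0]  # illustrative values only
    print("average gap with selection:", selection_gap(hypothetical_means))
    print("average gap without selection:", selection_gap([1.0]))
```

With a single component there is no selection and the average gap is close to zero. With several components the gap is positive, since the index of the largest observation is itself random and the largest variate depends on all the means of the sample.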

In conclusion, a very original article, if difficult to assess as it can be argued that selection models other than the “random” case result from an intentional modelling choice of the joint distribution.

## visual effects

Posted in Books, pictures, Statistics with tags Bayesian inference, Cardiff, concrete shoes, data visualisation, fudge, Journal of the Royal Statistical Society, leave-one-out calibration, noninformative priors, Royal Statistical Society, RSS, Series A, Statistical Modeling on November 2, 2018 by xi'an

**A**s advertised and re-discussed by Dan Simpson on the Statistical Modeling, &tc. blog he shares with Andrew and a few others, the paper Visualization in Bayesian workflow he wrote with Jonah Gabry, Aki Vehtari, Michael Betancourt and Andrew Gelman was one of three discussed at the RSS conference in Cardiff, last ~~week~~ month, as a Read Paper for Series A. I had stored the paper when it came out towards reading and discussing it, but as often this good intention led to no concrete ending. [Except *concrete* as in *concrete shoes*…] Hence a few notes rather than a discussion in Series ~~B~~ A.

Exploratory data analysis goes beyond just plotting the data, which should sound reasonable to all modeling readers.

Fake data [not fake news!] can be almost [more!] as valuable as real data for building your model: oh yes! This is the message I am always trying to convey to my first year students, when arguing about the connection between models and simulation, as well as a defense of ABC methods. And more globally of the very idea of statistical modelling. While indeed “Bayesian models with proper priors are generative models”, I am not a particular fan of using the prior predictive [or the evidence] to assess the prior as it may end up in a classification of more or less all but terrible priors, meaning that all give very little weight to neighbourhoods of high likelihood values. Still, in a discussion of a TAS paper by Seaman et al. on the role of prior, Kaniav Kamary and I produced prior assessments that were similar to the comparison illustrated in Figure 4. (And this makes me wonder which point we missed in this discussion, according to Dan.) Unhappy am I with the weakly informative prior illustration (and concept) as the amount of fudging and calibrating to move from the immensely vague choice of N(0,100) to the fairly tight choice of N(0,1) or N(1,1) is not provided. The paper reads as if these priors were the obvious and first choice of the authors. I completely agree with the warning that “the utility of the ~~the~~ prior predictive distribution to evaluate the model does not extend to utility in selecting between models”.

MCMC diagnostics, beyond trace plots, yes again, but this recommendation sounds a wee bit outdated. (As our 1998 reviewww!) Figure 5(b) links different parameters of the model with lines, which does not clearly relate to a better understanding of convergence. Figure 5(a) does not tell much either since the green (divergent) dots stand within the black dots, at least in the projected 2D plot (and how can one reach beyond 2D?) Feels like I need to rtfm..!

“Posterior predictive checks are vital for model evaluation”, and I indeed find Figure 6 much more to my liking and closer to my practice. There could have been a reference to Ratmann et al. for ABC where graphical measures of discrepancy were used in conjunction with ABC output as direct tools for model assessment and comparison. Essentially predicting a zero error with the ABC posterior predictive. And of course “posterior predictive checking makes use of the data twice, once for the fitting and once for the checking.” Which means one should either resort to leave-one-out (loo) solutions (as mentioned in the paper) or call for calibration of the double-use by re-simulating pseudo-datasets from the posterior predictive. I find the suggestion that “it is a good idea to choose statistics that are orthogonal to the model parameters” somewhat antiquated, in that this sounds like rephrasing the primeval call to ancillary statistics for model assessment (Kiefer, 1975), while being pretty hard to implement in modern complex models.

## free and graphic session at RSS 2018 in Cardiff

Posted in pictures, Statistics, Travel, University life with tags annual conference, Cardiff, data visualisation, graphics, Journal of the Royal Statistical Society, Royal Statistical Society, RSS, Wales on July 11, 2018 by xi'an

**R**eposting an email I received from the Royal Statistical Society, this is to announce a discussion session on three papers on Data visualization in Cardiff City Hall next September 5, as a free part of the RSS annual conference. (But the conference team must be told in advance.)

**Paper:** ‘Visualizing spatiotemporal models with virtual reality: from fully immersive environments to applications in stereoscopic view’

**Authors:** Stefano Castruccio (University of Notre Dame, USA) and Marc G. Genton and Ying Sun (King Abdullah University of Science and Technology, Thuwal)

**Paper:** ‘Visualization in Bayesian workflow’

**Authors:** Jonah Gabry (Columbia University, New York), Daniel Simpson (University of Toronto), Aki Vehtari (Aalto University, Espoo), Michael Betancourt (Columbia University, New York, and Symplectomorphic, New York) and Andrew Gelman (Columbia University, New York)

**Paper:** ‘Graphics for uncertainty’

**Authors:** Adrian W. Bowman (University of Glasgow)

*PDFs and supplementary files of these papers are available from StatsLife and the RSS website. As usual, contributions can be sent in writing, with a deadline of September 19.*

## the end of the Series B’log…

Posted in Books, Statistics, University life with tags blogging, discussion paper, Journal of the Royal Statistical Society, Series B, Series B'log on September 22, 2017 by xi'an

**T**oday is the last and final day of Series B’log as David Dunson, Piotr Fryzlewicz and myself have decided to stop the experiment, *faute de combattants*. (As we say in French.) The authors nicely contributed long abstracts of their papers, for which I am grateful, but with a single exception, no one came out with comments or criticisms, and the idea to turn some Series B papers into discussion papers does not seem to appeal, at least in this format. Maybe the concept will be rekindled in another form in the near future, but for now we let it lie. So be it!