In the paper, the authors consider a spike-and… forest prior(!), where the Bayesian CART selection of active covariates proceeds through a regression tree: selected covariates appear in the tree while the others do not. A sparsity prior is set on the tree partitions, and a new ABC approach selects the subset of active covariates. A specific feature is the splitting of the data, with one part used to learn about the regression function, and simulations from this function then compared with the remainder of the data. The paper further establishes that ABC Bayesian Forests are consistent for variable selection.

“…we observe a curious empirical connection between π(θ|x,ε), obtained with ABC Bayesian Forests and rescaled variable importances obtained with Random Forests.”

The difference with our ABC-RF model choice paper is that we select summary statistics [for classification] rather than covariates. For instance, in the current paper, the simulation of pseudo-data depends on the selected subset of covariates, which means simulating a model index and then generating the pseudo-data, acceptance being a function of the L² distance between the data and the pseudo-data. All ABC simulations are then used to find which variables are in more often than not, in order to derive the median probability model of Barbieri and Berger (2004). This does not work very well when implemented naïvely: because of the immense size of the model space, it is quite hard to find pseudo-data close to the actual data, resulting in either a very high tolerance or a very low acceptance rate. The authors get over this difficulty by a neat device that reminds me of fractional or intrinsic (pseudo-)Bayes factors, in that the dataset is split into two parts, one that learns about the posterior given the model index and another that simulates from this posterior for comparison with the left-over data, thereby bringing simulations closer to the data. I do not remember seeing this trick before in ABC settings, but it is very neat, assuming the small-data posterior can be simulated (which may be a fundamental reason why the trick has remained unused!). Note that the split varies at each iteration, which means the ordering of the observations has no impact.
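To make the data-splitting acceptance step concrete, here is a minimal sketch on a toy linear model. All names are illustrative and the least-squares fit plus noise merely stands in for an actual draw from the small-data posterior; this is not the paper's implementation, only the shape of the trick: split the data afresh at each iteration, fit on one half, simulate pseudo-data for the other half, and accept on the L² distance.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 100, 5
X = rng.standard_normal((n, p))
beta_true = np.array([2.0, 0.0, -1.5, 0.0, 0.0])  # two active covariates
y = X @ beta_true + rng.standard_normal(n)

def abc_split_step(X, y, subset, tol, rng):
    # random split, refreshed at each iteration, so ordering is irrelevant
    idx = rng.permutation(len(y))
    train, test = idx[: len(y) // 2], idx[len(y) // 2:]
    Xs = X[np.ix_(train, subset)]
    # stand-in for a posterior draw: least-squares fit plus jitter
    beta_hat, *_ = np.linalg.lstsq(Xs, y[train], rcond=None)
    beta_draw = beta_hat + 0.1 * rng.standard_normal(len(subset))
    # simulate pseudo-data on the held-out half and compare (L2 distance)
    pseudo = X[np.ix_(test, subset)] @ beta_draw + rng.standard_normal(len(test))
    dist = np.linalg.norm(pseudo - y[test]) / np.sqrt(len(test))
    return dist < tol

# record how often each covariate appears in accepted subsets
counts = np.zeros(p)
accepted = 0
for _ in range(2000):
    subset = np.flatnonzero(rng.uniform(size=p) < 0.5)
    if subset.size == 0:
        continue
    if abc_split_step(X, y, subset, tol=1.6, rng=rng):
        counts[subset] += 1
        accepted += 1

# median probability model: covariates in more than half the accepted draws
mpm = counts > accepted / 2
```

The per-iteration resampling of the split is the point emphasised in the post: no observation ordering is privileged, and the comparison always involves data the fitted function has not seen.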

*“We demonstrate HMC’s sensitivity to these parameters by sampling from a bivariate Gaussian with correlation coefficient 0.99. We consider three settings (ε,L) = {(0.16; 40); (0.16; 50); (0.15; 50)}”* Ziyu Wang, Shakir Mohamed, and Nando de Freitas, 2013

**I**n an experiment with my PhD student Changye Wu (who wrote all R codes used below), we looked back at a strange feature in a 2013 ICML paper by Wang, Mohamed, and de Freitas. Namely, a rather poor performance of a Hamiltonian Monte Carlo (leapfrog) algorithm on a two-dimensional strongly correlated Gaussian target, for very specific values of the parameters (ε,L) of the algorithm.

The Gaussian target associated with this sample stands right in the middle of the two clouds, as identified by Wang et al. And the leapfrog integration path for (ε,L)=(0.15,50)

keeps jumping between the two ridges (or tails), with no stop in the middle. Changing (ε,L) ever so slightly to (ε,L)=(0.16,40) does not modify the path very much

but the HMC output is quite different since the cloud then sits right on top of the target

with no clear explanation except for a sort of periodicity in the leapfrog sequence associated with the velocity generated at the start of the code. Looking at the Hamiltonian values for (ε,L)=(0.15,50)

and for (ε,L)=(0.16,40)

does not help, except to point at a sequence located far in the tails of this Hamiltonian, surprisingly varying when it is supposed to be constant. At first, we thought the large value of ε was to blame, but much smaller values still return poor convergence performance. As below for (ε,L)=(0.01,450)
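For readers who want to reproduce the phenomenon, here is a self-contained sketch of leapfrog HMC on the correlation-0.99 bivariate Gaussian. The experiments above were run in R; this Python version uses illustrative names and is not Changye Wu's code, but the integrator and the Metropolis correction on the Hamiltonian are the standard ones, so the (ε,L) sensitivity can be probed by changing the two arguments:

```python
import numpy as np

# Bivariate Gaussian target with correlation 0.99, as in Wang et al. (2013)
rho = 0.99
Sigma_inv = np.linalg.inv(np.array([[1.0, rho], [rho, 1.0]]))

def U(q):            # potential energy: negative log-density (up to a constant)
    return 0.5 * q @ Sigma_inv @ q

def grad_U(q):
    return Sigma_inv @ q

def leapfrog(q, p, eps, L):
    # standard leapfrog integration of Hamilton's equations
    p = p - 0.5 * eps * grad_U(q)        # initial half step for momentum
    for _ in range(L - 1):
        q = q + eps * p                  # full step for position
        p = p - eps * grad_U(q)          # full step for momentum
    q = q + eps * p
    p = p - 0.5 * eps * grad_U(q)        # final half step for momentum
    return q, -p                         # negate momentum for reversibility

def hmc(n_iter, eps, L, q0, rng):
    q, qs = q0, [q0]
    for _ in range(n_iter):
        p = rng.standard_normal(2)       # fresh velocity at each iteration
        q_new, p_new = leapfrog(q, p, eps, L)
        # Metropolis correction on H(q,p) = U(q) + |p|^2 / 2
        dH = (U(q) + 0.5 * p @ p) - (U(q_new) + 0.5 * p_new @ p_new)
        if np.log(rng.uniform()) < dH:
            q = q_new
        qs.append(q)
    return np.array(qs)

rng = np.random.default_rng(0)
sample = hmc(5000, eps=0.15, L=50, q0=np.zeros(2), rng=rng)
```

Rerunning with `eps=0.16, L=40` versus `eps=0.15, L=50` and plotting `sample` should expose the near-periodicity of the leapfrog path that the post describes.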

**A** very pleasant stroll through central Paris this afternoon, during “la” finale, when France was playing Croatia. Bars were all overflowing onto the pavements and sometimes the streets, each action was echoed throughout town, and we certainly did not miss any goal, even from the heart of the Luxembourg gardens! Which were deserted except for the occasional tourist, just as the main thoroughfares, except for police cars and emergency vehicles. Since the game ended, horns have been honking almost nonstop, even in the quietest suburbs.

The second article was more surprising as it defended the use of algorithms for more democracy. Nothing less. Written by Wendy Tam Cho, professor of political science, law, statistics, and mathematics at UIUC, it argued that the software that she develops to construct electoral maps produces fair maps. Which sounds over-rosy imho, as aiming to account for all social, ethnic, income, &tc., groups, i.e., most of the axes that define a human, is meaningless, if only because the structure of these groups is not frozen in time. To state that “computers are impervious to the lure of power” is borderline ridiculous, as computers and algorithms are [so far] driven by humans. This is not to say that gerrymandering should not be fought by technological means, especially and obviously by open-source algorithms, as existing proposals (discussed here) demonstrate, but to entertain the notion of a perfectly representative redistricting is not only illusory, but also far from democratic, as it shies away from the one person, one vote at the basis of democracy. And the paper leaves us in the dark as to who will decide on which group or which characteristic need be represented in the votes. Of course, this is the impression obtained by reading a one-page editorial in Nature [in an overcrowded and sweltering commuter train] rather than the relevant literature. Nonetheless, I remain puzzled at why this editorial was ever published. (Speaking of democracy, the issue also contains warning reports about Hungary’s ultra-right government taking over the Hungarian Academy of Sciences.)

**I**n conjunction with the official (if not state-) visit of Donald Trump to the UK, a list of demonstrations throughout the kingdom, on top of the national demonstration in London on July 13, Together Against Trump, jointly organised by the Stop Trump Coalition (STC) and Stand Up To Trump (SUTT). (Even the wrong address stoptrump.uk returns an appropriate heading!)

**Paper**: ‘**Visualizing spatiotemporal models with virtual reality: from fully immersive environments to applications in stereoscopic view**’

**Authors**: Stefano Castruccio (University of Notre Dame, USA) and Marc G. Genton and Ying Sun (King Abdullah University of Science and Technology, Thuwal)

**Paper**: ‘**Visualization in Bayesian workflow**’

**Authors**: Jonah Gabry (Columbia University, New York), Daniel Simpson (University of Toronto), Aki Vehtari (Aalto University, Espoo), Michael Betancourt (Columbia University, New York, and Symplectomorphic, New York) and Andrew Gelman (Columbia University, New York)

**Paper**: ‘**Graphics for uncertainty**’

**Authors**: Adrian W. Bowman (University of Glasgow)

*PDFs and supplementary files of these papers are available from StatsLife and the RSS website. As usual, contributions can be sent in writing, with a deadline of September 19.*