Archive for the pictures Category

unusual clouds [jatp]

Posted in pictures, Travel, Wines on July 19, 2018 by xi'an

ABC variable selection

Posted in Books, Mountains, pictures, Running, Statistics, Travel, University life on July 18, 2018 by xi'an

Prior to the ISBA 2018 meeting, Yi Liu, Veronika Ročková, and Yuexi Wang arXived a paper on relying on ABC for finding relevant variables, which is a very original approach in that ABC is not as much the object as it is a tool. And which Veronika covered during her Susie Bayarri lecture at ISBA 2018. In other words, it is not about selecting summary variables for running ABC but quite the opposite, selecting variables in a non-linear model through an ABC step. I was going to separate the two selections into algorithmic and statistical selections, but it is more like projections in the observation and covariate spaces. With ABC still providing an appealing approach to approximate the marginal likelihood. Now, one may wonder at the relevance of ABC for variable selection, aka model choice, given our warning call of a few years ago. But the current paper does not require low-dimension summary statistics, hence avoids the difficulty with the “other” Bayes factor.

In the paper, the authors consider a spike-and… forest prior!, where the Bayesian CART selection of active covariates proceeds through a regression tree, selected covariates appearing in the tree and others not appearing. With a sparsity prior on the tree partitions and this new ABC approach to select the subset of active covariates. A specific feature is in splitting the data, one part to learn about the regression function, simulating from this function and comparing with the remainder of the data. The paper further establishes that ABC Bayesian Forests are consistent for variable selection.

“…we observe a curious empirical connection between π(θ|x,ε), obtained with ABC Bayesian Forests and rescaled variable importances obtained with Random Forests.”

The difference with our ABC-RF model choice paper is that we select summary statistics [for classification] rather than covariates. For instance, in the current paper, simulation of pseudo-data will depend on the selected subset of covariates, meaning simulating a model index, and then generating the pseudo-data, acceptance being a function of the L² distance between data and pseudo-data. And then relying on all ABC simulations to find which variables are in more often than not to derive the median probability model of Barbieri and Berger (2004). Which does not work very well if implemented naïvely. Because of the immense size of the model space, it is quite hard to find pseudo-data close to actual data, resulting in either very high tolerance or very low acceptance. The authors get over this difficulty by a neat device that reminds me of fractional or intrinsic (pseudo-)Bayes factors in that the dataset is split into two parts, one that learns about the posterior given the model index and another one that simulates from this posterior to compare with the left-over data. Bringing simulations closer to the data. I do not remember seeing this trick before in ABC settings, but it is very neat, assuming the small data posterior can be simulated (which may be a fundamental reason for the trick to remain unused!). Note that the split varies at each iteration, which means there is no impact of ordering the observations.
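To make the data-splitting device more concrete, here is a toy sketch in Python. It is not the authors' algorithm: the Bayesian forest is swapped for a Gaussian linear model with known noise variance and a conjugate Gaussian prior, so that the "small data" posterior can be simulated exactly. The function names (`posterior_draw`, `abc_variable_selection`), the prior inclusion probability `q`, the prior scale `tau`, and the choice of keeping a fixed fraction of the smallest distances instead of a fixed tolerance are all assumptions made for illustration.

```python
import numpy as np


def posterior_draw(X, y, sigma, tau, rng):
    """Draw coefficients from the conjugate posterior of a Gaussian linear
    model with known noise sd `sigma` and a N(0, tau^2 I) prior (assumed)."""
    k = X.shape[1]
    if k == 0:  # empty model: no coefficient to draw
        return np.zeros(0)
    precision = X.T @ X / sigma**2 + np.eye(k) / tau**2
    mean = np.linalg.solve(precision, X.T @ y / sigma**2)
    # If precision = L L^T, then L^{-T} z has covariance precision^{-1}.
    L = np.linalg.cholesky(precision)
    return mean + np.linalg.solve(L.T, rng.standard_normal(k))


def abc_variable_selection(X, y, n_iter=4000, keep_frac=0.02, q=0.25,
                           sigma=1.0, tau=3.0, seed=0):
    """Toy ABC variable selection with a fresh random data split per iteration.
    Returns the inclusion frequency of each covariate among accepted draws."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    models = np.zeros((n_iter, p), dtype=bool)
    dists = np.empty(n_iter)
    for t in range(n_iter):
        gamma = rng.random(p) < q            # model index from a sparsity prior
        perm = rng.permutation(n)            # the split changes at each iteration
        train, test = perm[: n // 2], perm[n // 2:]
        # learn about the regression function on the first part of the data...
        beta = posterior_draw(X[train][:, gamma], y[train], sigma, tau, rng)
        # ...and simulate pseudo-data to compare with the left-over part
        pseudo = X[test][:, gamma] @ beta + sigma * rng.standard_normal(test.size)
        dists[t] = np.linalg.norm(y[test] - pseudo)   # L2 distance
        models[t] = gamma
    n_keep = max(1, int(keep_frac * n_iter))
    accepted = np.argsort(dists)[:n_keep]    # keep the closest simulations
    return models[accepted].mean(axis=0)


# Simulated example: only the first two of six covariates are active.
rng = np.random.default_rng(1)
n, p = 100, 6
X = rng.standard_normal((n, p))
y = 2.0 * X[:, 0] - 2.0 * X[:, 1] + rng.standard_normal(n)

inclusion = abc_variable_selection(X, y)
median_model = inclusion > 0.5               # median probability model
print(np.round(inclusion, 2), median_model)
```

In this toy setting, the covariates with inclusion frequency above one half among the accepted simulations make up the median probability model. Since each pseudo-sample is generated from a posterior already informed by half of the data, it lands much closer to the held-out half than a draw from the prior would.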

la finale

Posted in Kids, pictures, Travel on July 16, 2018 by xi'an

A very pleasant stroll through central Paris this afternoon, during “la” finale, when France was playing Croatia. Bars were all overflowing onto the pavements and sometimes the streets, each action was echoed throughout town, and we certainly did not miss any goal, even from the heart of the Luxembourg gardens! Which were deserted except for the occasional tourist, just like the main thoroughfares, except for police cars and emergency vehicles. Since the game ended, horns have been honking almost nonstop, even in the quietest suburbs.

graph of the day & AI4good versus AI4bad

Posted in Books, pictures, Statistics on July 15, 2018 by xi'an

Apart from the above graph from Nature, rendering in a most appalling and meaningless way the uncertainty about the number of active genes in the human genome, I read a couple of articles in this issue of Nature relating to the biases and dangers of societal algorithms. One of which sounded very close to the editorial in the New York Times on which Kristian Lum commented on this blog. With the attached snippet on what is fair and unfair (or not).

The second article was more surprising as it defended the use of algorithms for more democracy. Nothing less. Written by Wendy Tam Cho, professor of political science, law, statistics, and mathematics at UIUC, it argued that the software that she develops to construct electoral maps produces fair maps. Which sounds over-rosy imho, as aiming to account for all social, ethnic, income, &tc., groups, i.e., most of the axes that define a human, is meaningless, if only because the structure of these groups is not frozen in time. To state that “computers are impervious to the lure of power” is borderline ridiculous, as computers and algorithms are [so far] driven by humans. This is not to say that gerrymandering should not be fought by technological means, especially and obviously by open source algorithms, as existing proposals (discussed here) demonstrate, but to entertain the notion of a perfectly representative redistricting is not only illusory, but also far from democratic as it shies away from the one person one vote principle at the basis of democracy. And the paper leaves us in the dark as to who will decide which groups or which characteristics need be represented in the votes. Of course, this is the impression obtained by reading a one-page editorial in Nature [in an overcrowded and sweltering commuter train] rather than the relevant literature. Nonetheless, I remain puzzled at why this editorial was ever published. (Speaking of democracy, the issue also contains warning reports about Hungary’s ultra-right government taking over the Hungarian Academy of Sciences.)

LMS Invited Lecture Series / CRISM Summer School 2018

Posted in pictures, Statistics, Travel, University life on July 12, 2018 by xi'an

Trump is coming…

Posted in pictures, Travel on July 12, 2018 by xi'an

 

In conjunction with the official (if not state-) visit of Donald Trump to the UK, here is a list of demonstrations throughout the kingdom, on top of the national demonstration in London on July 13, Together Against Trump, jointly organised by the Stop Trump Coalition (STC) and Stand Up To Trump (SUTT). (Even the wrong address stoptrump.uk returns an appropriate heading!)

free and graphic session at RSS 2018 in Cardiff

Posted in pictures, Statistics, Travel, University life on July 11, 2018 by xi'an

Reposting an email I received from the Royal Statistical Society, this is to announce a discussion session on three papers on data visualization in Cardiff City Hall next September 5, as a free part of the RSS annual conference. (But the conference team must be told in advance.)

Paper:             ‘Visualizing spatiotemporal models with virtual reality: from fully immersive environments to applications in stereoscopic view’

Authors:         Stefano Castruccio (University of Notre Dame, USA) and Marc G. Genton and Ying Sun (King Abdullah University of Science and Technology, Thuwal)

Paper:             ‘Visualization in Bayesian workflow’

Authors:         Jonah Gabry (Columbia University, New York), Daniel Simpson (University of Toronto), Aki Vehtari (Aalto University, Espoo), Michael Betancourt (Columbia University, New York, and Symplectomorphic, New York) and Andrew Gelman (Columbia University, New York)

Paper:             ‘Graphics for uncertainty’

Authors:         Adrian W. Bowman (University of Glasgow)

PDFs and supplementary files of these papers are available from StatsLife and the RSS website. As usual, contributions can be sent in writing, with a deadline of September 19.