Archive for Susie Bayarri

Jeffreys priors for hypothesis testing [Bayesian reads #2]

Posted in Books, Statistics, University life on February 9, 2019 by xi'an

A second (re)visit to a reference paper I gave to my OxWaSP students for the last round of this CDT joint program. Indeed, this may be my first complete read of Susie Bayarri and Gonzalo Garcia-Donato’s 2008 Series B paper, inspired by Jeffreys’, Zellner’s and Siow’s proposals in the Normal case. (Disclaimer: I was not the JRSS B editor for this paper.) Which I first saw as a talk at the O’Bayes 2009 meeting in Philly.

The paper aims at constructing formal rules for objective proper priors in testing embedded hypotheses, in the spirit of Jeffreys’ Theory of Probability “hidden gem” (Chapter 3). The proposal is based on symmetrised versions of the Kullback-Leibler divergence κ between null and alternative, used in a transform like an inverse power of 1+κ, with a power large enough to make the prior proper. This transform is eventually multiplied by a reference measure (i.e., the arbitrary choice of a dominating measure). It can be generalised to any intrinsic loss (not to be confused with an intrinsic prior à la Berger and Pericchi!). The resulting prior is approximately Cauchy or Student’s t, by a Taylor expansion. To be compared with Jeffreys’ original prior, equal to the derivative of the atan transform of the root divergence (!). The calibration by an effective sample size is delicate, since the latter lacks a general definition.
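To make the construction concrete, here is a minimal sketch in the simplest case I can think of, a Normal mean θ with known σ tested against θ₀. The assumptions are mine rather than the paper's exact definitions: the divergence is the sum of both Kullback-Leibler divergences for a single observation (a unit effective sample size), that is κ = (θ−θ₀)²/σ², the reference measure is flat, and all function names are made up for the illustration. In this one-dimensional case the prior (1+κ)^(−q) is proper as soon as q > 1/2, and for q = 1 it is exactly the Cauchy density, which also coincides with the derivative of atan(√κ)/π.

```python
import math

def sym_kl_normal(theta, theta0, sigma):
    """Sum of both KL divergences between N(theta, sigma^2) and N(theta0, sigma^2),
    for a single observation (assumption: unit effective sample size)."""
    return (theta - theta0) ** 2 / sigma ** 2

def normalising_constant(theta0, sigma, q=1.0, n=2000):
    """Integral of (1 + kappa)^(-q) over the real line, by the midpoint rule
    after the change of variable theta = theta0 + sigma * tan(u)."""
    h = math.pi / n
    total = 0.0
    for i in range(n):
        u = -math.pi / 2 + (i + 0.5) * h
        theta = theta0 + sigma * math.tan(u)
        jacobian = sigma / math.cos(u) ** 2
        total += (1.0 + sym_kl_normal(theta, theta0, sigma)) ** (-q) * jacobian * h
    return total

def divergence_prior(theta, theta0, sigma, q=1.0):
    """Normalised prior proportional to (1 + kappa)^(-q), flat reference measure."""
    kappa = sym_kl_normal(theta, theta0, sigma)
    return (1.0 + kappa) ** (-q) / normalising_constant(theta0, sigma, q)

def jeffreys_atan_prior(theta, theta0, sigma, h=1e-5):
    """Central-difference derivative of atan(signed root divergence) / pi."""
    def g(t):
        return math.atan((t - theta0) / sigma) / math.pi
    return (g(theta + h) - g(theta - h)) / (2.0 * h)

def cauchy_pdf(theta, theta0, sigma):
    """Cauchy density with location theta0 and scale sigma."""
    return 1.0 / (math.pi * sigma * (1.0 + ((theta - theta0) / sigma) ** 2))
```

Evaluating `divergence_prior(t, 1.0, 2.0)`, `jeffreys_atan_prior(t, 1.0, 2.0)` and `cauchy_pdf(t, 1.0, 2.0)` at a few values of `t` should return the same number up to numerical error, which is the Normal-case agreement between the divergence prior and Jeffreys’ proposal.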

At the start the authors rightly insist on having the nuisance parameter ν differ for each model but… as we all often do, they relapse into having the “same ν” in both models for integrability reasons. Nuisance parameters make the definition of the divergence prior somewhat harder. Or somewhat arbitrary. Indeed, as in reference prior settings, the authors work first conditional on the nuisance then use a prior on ν that may be improper by the “same” argument. (Although conditioning is not the proper term if the marginal prior on ν is improper.)

The paper also contains an interesting case of the translated Exponential, where the prior is L¹ Student’s t with 2 degrees of freedom. And another one of mixture models albeit in the simple case of a location parameter on one component only.

ABC variable selection

Posted in Books, Mountains, pictures, Running, Statistics, Travel, University life on July 18, 2018 by xi'an

Prior to the ISBA 2018 meeting, Yi Liu, Veronika Ročková, and Yuexi Wang arXived a paper on relying on ABC for finding relevant variables, which is a very original approach in that ABC is not as much the object as it is a tool. And which Veronika considered during her Susie Bayarri lecture at ISBA 2018. In other words, it is not about selecting summary variables for running ABC but quite the opposite, selecting variables in a non-linear model through an ABC step. I was going to separate the two selections into algorithmic and statistical selections, but it is more like projections in the observation and covariate spaces. With ABC still providing an appealing approach to approximate the marginal likelihood. Now, one may wonder at the relevance of ABC for variable selection, aka model choice, given our warning call of a few years ago. But the current paper does not require low-dimension summary statistics, hence avoids the difficulty with the “other” Bayes factor.

In the paper, the authors consider a spike-and… forest prior!, where the Bayesian CART selection of active covariates proceeds through a regression tree, selected covariates appearing in the tree and others not appearing. With a sparsity prior on the tree partitions and this new ABC approach to select the subset of active covariates. A specific feature is in splitting the data, one part to learn about the regression function, simulating from this function and comparing with the remainder of the data. The paper further establishes that ABC Bayesian Forests are consistent for variable selection.

“…we observe a curious empirical connection between π(θ|x,ε), obtained with ABC Bayesian Forests and rescaled variable importances obtained with Random Forests.”

The difference with our ABC-RF model choice paper is that we select summary statistics [for classification] rather than covariates. For instance, in the current paper, simulation of pseudo-data will depend on the selected subset of covariates, meaning simulating a model index, and then generating the pseudo-data, acceptance being a function of the L² distance between data and pseudo-data. And then relying on all ABC simulations to find which variables are in more often than not to derive the median probability model of Barbieri and Berger (2004). Which does not work very well if implemented naïvely. Because of the immense size of the model space, it is quite hard to find pseudo-data close to actual data, resulting in either very high tolerance or very low acceptance. The authors get over this difficulty by a neat device that reminds me of fractional or intrinsic (pseudo-)Bayes factors in that the dataset is split into two parts, one that learns about the posterior given the model index and another one that simulates from this posterior to compare with the left-over data. Bringing simulations closer to the data. I do not remember seeing this trick before in ABC settings, but it is very neat, assuming the small data posterior can be simulated (which may be a fundamental reason for the trick to remain unused!). Note that the split varies at each iteration, which means there is no impact of ordering the observations.
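The steps above can be sketched in a few lines. To be clear about what is mine: this is a toy version under strong simplifying assumptions, not the authors' ABC Bayesian Forests. The regression tree is replaced by a plain least-squares fit, the draw from the half-data posterior by a plug-in estimate, the tolerance by keeping a fixed fraction of the closest simulations, and every name and default value (`abc_inclusion_frequencies`, `prior_incl`, `keep`, the toy data in `demo`) is hypothetical.

```python
import math
import random

def least_squares(rows, targets):
    """OLS coefficients via the normal equations, solved by Gaussian elimination."""
    k = len(rows[0])
    # Normal equations A beta = b, with a tiny ridge term to avoid singular systems.
    A = [[sum(r[i] * r[j] for r in rows) + (1e-8 if i == j else 0.0)
          for j in range(k)] for i in range(k)]
    b = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        b[c], b[pivot] = b[pivot], b[c]
        for r in range(c + 1, k):
            f = A[r][c] / A[c][c]
            for j in range(c, k):
                A[r][j] -= f * A[c][j]
            b[r] -= f * b[c]
    beta = [0.0] * k
    for c in reversed(range(k)):
        beta[c] = (b[c] - sum(A[c][j] * beta[j] for j in range(c + 1, k))) / A[c][c]
    return beta

def abc_inclusion_frequencies(X, y, n_iter=2000, keep=0.03, prior_incl=0.25, seed=0):
    """Frequency of each covariate among the accepted ABC simulations."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    draws = []
    for _ in range(n_iter):
        # 1. Simulate a model index: each covariate enters with probability prior_incl.
        subset = [j for j in range(p) if rng.random() < prior_incl]
        # 2. Fresh random split of the data at every iteration.
        idx = list(range(n))
        rng.shuffle(idx)
        train, test = idx[: n // 2], idx[n // 2:]
        design = {i: [1.0] + [X[i][j] for j in subset] for i in idx}
        # 3. Learn the regression function on the first half (plug-in stand-in
        #    for a posterior draw given the model index).
        beta = least_squares([design[i] for i in train], [y[i] for i in train])
        fitted = {i: sum(bc * v for bc, v in zip(beta, design[i])) for i in idx}
        sigma = math.sqrt(sum((y[i] - fitted[i]) ** 2 for i in train) / len(train))
        # 4. Simulate pseudo-data on the left-over half and take the L2 distance.
        dist = math.sqrt(sum((y[i] - fitted[i] - rng.gauss(0.0, sigma)) ** 2
                             for i in test))
        draws.append((dist, subset))
    # 5. Accept the closest fraction `keep` of the simulations.
    draws.sort(key=lambda d: d[0])
    accepted = draws[: max(1, int(keep * n_iter))]
    return [sum(j in s for _, s in accepted) / len(accepted) for j in range(p)]

def demo(seed=1):
    """Toy data: 5 covariates, only the first two are active."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(100)]
    y = [2.0 * x[0] - 2.0 * x[1] + rng.gauss(0.0, 1.0) for x in X]
    freq = abc_inclusion_frequencies(X, y)
    # Median probability model: covariates accepted more often than not.
    median_model = [j for j, f in enumerate(freq) if f > 0.5]
    return freq, median_model
```

Calling `demo()` returns the inclusion frequencies and the covariates exceeding one half, i.e., the median probability model. With a strong enough signal, as in this toy setting, the accepted simulations should concentrate on subsets containing both active covariates, while inactive ones stay near their prior inclusion rate.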

Altos de Losada [guest wine post by Susie]

Posted in pictures, Travel, University life, Wines on June 20, 2015 by xi'an

[Here is a wine criticism written by Susie Bayarri in 2013 about a 2008 bottle of Altos de Losada, a wine from Leon:]

The cork is fantastic. Very good presentation and labelling of the bottle. The wine color is like dark cherry, I would almost say of the color of blood. Very bright although unfiltered. The cover is definitely high. The tear is very nice (at least in my glass), slow, wide, through parallel streams… but it does not dye my glass at all.

The bouquet is its best feature… it is simply voluptuous… with ripe plums as well as vanilla, some mineral tone plus a smoky hint. I cannot quite detect which wood is used… I have always loved the bouquet of this wine…

In mouth, it remains a bit closed. Next time, I will make sure I decant it (or I will use that Venturi device) but it is nonetheless excellent… the wine is truly fruity, but complex as well (nothing like grape juice). The tannins are definitely present, but tamed and assimilated (I think they will continue to mellow) and it has just a hint of acidity… Despite its alcohol content, it remains light, neither overly sweet nor heavy. The after-taste offers a pleasant bitterness… It is just delicious, an awesome wine!

O’Bayes 2015 [day #2]

Posted in pictures, Running, Statistics, Travel, University life, Wines on June 4, 2015 by xi'an

This morning was the most special time of the conference in that we celebrated Susie Bayarri’s contributions and life together with members of her family. Jim gave a great introduction that went over Susie’s numerous papers and the impact they had in Statistics and outside Statistics. As well as her recognised (and unsurprising if you knew her) expertise in wine and food! The three talks that morning covered some of the domains within Susie’s fundamental contributions and were delivered by former students of hers: model assessment through various types of predictive p-values by Maria Eugenia Castellanos, Bayesian model selection by Anabel Forte, and computer models by Rui Paulo, all talks that translated quite accurately the extent of Susie’s contributions… In a very nice initiative, the organisers had also set a wine tasting break (at 10am!) around two vintages that Susie had reviewed in the past years [with reviews to show up soon in the Wines section of the ‘Og!]

The talks of the afternoon session were by Jean-Bernard (JB) Salomond about a new proposal to handle embedded hypotheses in a non-parametric framework and by James Scott about false discovery rates for neuroimaging. Despite the severe theoretical framework behind the proposal, JB managed a superb presentation that mostly focussed on the intuition for using the smoothed (or approximative) version of the null hypothesis. (A flavour of ABC, somehow?!) Also kudos to JB for perpetuating my tradition of starting sections with unrelated pictures. James’ topic was more practical Bayes or pragmatic Bayes than objective Bayes in that he analysed a large fMRI experiment on spatial working memory, introducing a spatial pattern that led to a complex penalised Lasso-like optimisation. The data was actually an fMRI of the brain of Russell Poldrack, one of James’ coauthors on that paper.

The (sole) poster session was in the evening, with a diverse range of exciting topics (including three where I was a co-author, by Clara Grazian, Kaniav Kamary, and Kerrie Mengersen), but it was alas too short or I was alas too slow to complete the tour before it ended! In retrospect we could have broken it into two sessions since Wednesday evening is a free evening.

O-Bayes15 [day #1]

Posted in Books, pictures, Running, Statistics, Travel, University life, Wines on June 3, 2015 by xi'an

So here we are back together to talk about objective Bayes methods, and in the City of Valencià as well! A move back to a city where the 1998 O’Bayes took place. In contrast with my introductory tutorial, the morning tutorials by Luis Pericchi and Judith Rousseau were fairly technical and advanced, Judith looking at the tools used in the frequentist (Bernstein-von Mises) analysis of priors, with forays into empirical Bayes, giving insights into a wide range of recent papers in the field. And Luis covering works on Bayesian robustness in the sense of resisting over-influential observations. Following works of his and of Tony O’Hagan and coauthors. Which means characterising tails of prior versus sampling distribution to allow for the posterior reverting to the prior in case of over-influential datapoints. Funnily enough, after a great opening by Carmen and Ed remembering Susie, Chris Holmes also covered Bayesian robust analysis. More in the sense of incompletely or mis-specified models. (On the side, rekindling one comment by Susie and the need to embed robust Bayesian analysis within decision theory.) Which was also much Chris’ point, in line with the recent Watson and Holmes’ paper. Dan Simpson in his usual kick-the-anthill-real-hard-and-set-fire-to-it discussion pointed out the possible discrepancy between objective and robust Bayesian analysis. (With lines like “modern statistics has proven disruptive to objective Bayes”.) Which is not that obvious because the robust approach simply reincorporates the decision theory within the objective framework. (Dan also concluded with the comic strip below, whose message can be interpreted in many ways…! Or not.)

The second talk of the afternoon was given by Veronika Ročková on a novel type of spike-and-slab prior to handle sparse regression, bringing an alternative to the standard Lasso. The prior is a mixture of two Laplace priors whose scales are constrained in connection with the actual number of non-zero coefficients. I had not heard of this approach before (although Veronika and Ed have an earlier paper on a spike-and-slab prior to handle multicollinearity that Veronika presented in Boston last year) and I was quite impressed by the combination of minimax properties and practical determination of the scales. As well as by the performances of this spike-and-slab Lasso. I am looking forward to the forthcoming paper!
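For readers who want to see the shape of such a prior, here is a minimal sketch of a two-component Laplace mixture on a single coefficient. It only illustrates the mixture itself: the mixing weight `theta` and the two scales `lam_spike` and `lam_slab` are fixed, made-up values, whereas the talk ties the scales to the number of non-zero coefficients, which this sketch does not attempt. All function names are mine.

```python
import math

def laplace_pdf(beta, lam):
    """Laplace (double-exponential) density with rate lam, centred at zero."""
    return 0.5 * lam * math.exp(-lam * abs(beta))

def spike_slab_laplace_prior(beta, theta=0.2, lam_spike=20.0, lam_slab=1.0):
    """Mixture of a diffuse Laplace slab (weight theta) and a concentrated
    Laplace spike (weight 1 - theta). Parameter values are illustrative only."""
    return (theta * laplace_pdf(beta, lam_slab)
            + (1.0 - theta) * laplace_pdf(beta, lam_spike))

def slab_weight(beta, theta=0.2, lam_spike=20.0, lam_slab=1.0):
    """Conditional probability that a coefficient equal to beta comes from the slab."""
    slab = theta * laplace_pdf(beta, lam_slab)
    spike = (1.0 - theta) * laplace_pdf(beta, lam_spike)
    return slab / (slab + spike)

def integrate(f, lo=-40.0, hi=40.0, n=80000):
    """Midpoint-rule integral, used here only to check that the prior is proper."""
    h = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * h) for i in range(n)) * h
```

With these values, `slab_weight(0.0)` is close to zero and `slab_weight(2.0)` close to one: coefficients near the origin are attributed to the spike and shrunk hard, while larger ones fall under the slab and are barely penalised. This is one way of reading the adaptive behaviour that distinguishes the approach from a single-scale Lasso penalty.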

The day ended most nicely in the botanical gardens of the University of Valencià, with an outdoor reception surrounded by palm trees and parakeet cries…