Archive for credible set

discussione a Padova

Posted in Statistics, University life on March 25, 2013 by xi'an

Here are the slides of my talk in Padova for the workshop Recent Advances in statistical inference: theory and case studies (very similar to the slides for the Varanasi and Gainesville meetings, obviously!, with Peter Müller commenting [at last!] that I had picked the wrong photos from Khajuraho!)

The worthy Padova addendum is that I had two discussants, Stefano Cabras from Universidad Carlos III in Madrid, whose slides are:

and Francesco Pauli, from Trieste, whose slides are:

These were kind and rich discussions with many interesting openings. Stefano's idea of estimating the pivotal function h opens new directions, obviously, as it indicates an additional degree of freedom in calibrating the method, esp. when considering the high variability of the empirical likelihood fit depending on the function h. For instance, one could start with a large collection of candidate functions and build a regression or a principal component reparameterisation from this collection… (Actually I did not get point #1 about ignoring f: the empirical likelihood by essence ignores anything outside the identifying equation, so as long as the equation is valid…) Point #2: opposing sample-free and simulation-free techniques is another interesting avenue, although I would not say ABC is "sample free". As to point #3, I will certainly take a look at Monahan and Boos (1992) to see if this can drive the choice of a specific type of pseudo-likelihood. I like the idea of checking the "coverage of posterior sets" and even more that "the likelihood must be the density of a statistic, not necessarily sufficient", as it obviously relates to our current ABC model comparison work… Esp. when the very same paper is mentioned by Francesco as well. Grazie, Stefano!

I also appreciate the survey made by Francesco of the consistency conditions, because I think this is an important issue that should be taken into consideration when designing ABC algorithms. (Just pointing out again that, in the theorem of Fearnhead and Prangle (2012) quoting Bernardo and Smith (1992), some conditions are missing for the mathematical consistency to apply.) I also like the agreement we seem to reach about ABC being evaluated per se rather than as a poor man's Bayesian method. Francesco's analysis of Monahan and Boos (1992) as validating or not the empirical likelihood points out a possible link with the recent coverage analysis of Prangle et al., discussed on the 'Og a few weeks ago. And an unsuspected link with Larry Wasserman! Grazie, Francesco!
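To make the sensitivity to the choice of h concrete, here is a minimal sketch of Owen's empirical log-likelihood ratio for a scalar estimating equation (my own illustration, not from the slides): for a symmetric location parameter, h(x,θ) = x − θ and the sign-based h(x,θ) = sign(x − θ) identify the same centre, yet they lead to different empirical likelihood surfaces.

```python
import numpy as np
from scipy.optimize import brentq

def el_logratio(h):
    """Owen's empirical log-likelihood ratio for one scalar
    estimating equation sum_i w_i h_i = 0; returns -inf when
    zero lies outside the convex hull of the h_i."""
    h = np.asarray(h, dtype=float)
    if h.min() >= 0 or h.max() <= 0:
        return -np.inf
    eps = 1e-10
    # the Lagrange multiplier must keep every 1 + lambda*h_i positive
    lam = brentq(lambda l: np.sum(h / (1.0 + l * h)),
                 -1.0 / h.max() + eps, -1.0 / h.min() - eps)
    return -np.sum(np.log1p(lam * h))

rng = np.random.default_rng(0)
x = rng.normal(1.0, 1.0, size=50)

theta = 0.8
r1 = el_logratio(x - theta)            # h(x, theta) = x - theta
r2 = el_logratio(np.sign(x - theta))   # sign-based h, same centre
print(r1, r2)                          # two different EL surfaces
```

Both ratios are maximal (equal to zero) at the value of θ solving the corresponding empirical equation, but away from it the two surfaces differ, which is precisely the extra degree of freedom in calibrating h.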

workshop a Padova (finale)

Posted in pictures, Running, Statistics, Travel, University life, Wines on March 24, 2013 by xi'an

The third day of this rich Padova workshop was actually a half-day which, thanks to a talk cancellation, I managed to attend completely before flying back to Paris. The first talk, by Matteo Bottai, was about the appeal of using quantile regression, as opposed to regular (or mean) regression. The talk was highly pedagogical and enthusiastic, hence enjoyable!, but I did not really buy the argument: if one starts modelling more than the conditional mean, the whole conditional distribution should be the target of the inference, rather than an arbitrary collection of quantiles, esp. if those are estimated marginally and not jointly. There could be realistic exceptions, for instance legitimate 95% bounds/quantiles in medical trials, but they are certainly rare (as exceptions should be!). This talk however led me to ponder a possible connection with the g-and-k quantile distributions (whose dedicated monograph I did not really appreciate!), even though I had no satisfactory answer by the end of the talk. The second talk, by Eva Cantoni, dealt with a fishery problem—an ecological model close to my interests—that had nice hierarchical features and [of course] a possible Bayesian analysis of the random effects. This was not the path followed, though, and the likelihood analysis had to rely on the bootstrap and other approximations. The motivation was provided by the very recent move of the hammerhead shark (among several species of shark) to the endangered species list, with data coming from catches reported by commercial fishing vessels. I have always wondered about the reliability of such data, unless there is a researcher on board the vessel. Indeed, while the commercial catches are presumably checked upon arrival to comply with the quotas (at least in European waters), unintentional catches are presumably thrown away on the spot (maybe not, given this is high-quality flesh) and not at a time when careful statistics can be recorded…
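On the marginal-versus-joint point: a minimal sketch (my own, not from the talk) of linear quantile regression by direct minimisation of the check (pinball) loss shows that each quantile is fitted entirely separately, with nothing in the criterion tying the fits together, which is what allows crossing quantile curves in less well-behaved settings.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)
n = 400
x = rng.uniform(0, 2, n)
y = 1.0 + 2.0 * x + rng.normal(0, 0.5, n)   # toy linear model

def pinball(u, q):
    """check (pinball) loss: q*u if u >= 0, (q-1)*u otherwise"""
    return np.maximum(q * u, (q - 1) * u)

def fit_quantile(q):
    """marginal linear fit of the q-th conditional quantile"""
    obj = lambda b: pinball(y - b[0] - b[1] * x, q).sum()
    return minimize(obj, x0=np.array([0.0, 1.0]),
                    method="Nelder-Mead").x

# each quantile is estimated independently of the others
betas = {q: fit_quantile(q) for q in (0.1, 0.5, 0.9)}
print(betas)
```

Here the three fits only differ through their intercepts (homoscedastic noise), but since each optimisation is run on its own, no constraint prevents the fitted quantile lines from crossing in general.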

Actually, the whole fishing concept eludes me, even though I can see the commercial side of it: this is the only large-scale remainder of the early hunter-gatherer society and there is no ethical reason it should persist (well, other than feeding coastal populations that rely solely on fish catches, and even then…). The last two centuries have provided many instances of species extinction resulting from unlimited commercial fishing, but fishing is still going on… End of the parenthesis.

The last talk was by Aad van der Vaart, on non-parametric credible sets, i.e. credible sets on curves. Most of the talk was dedicated to explaining why there is an issue with those credible sets, that is, why they can be incredibly slow in catching the true curve and in shedding the impact of the prior. This was most interesting, obviously, if ultimately not that surprising: the prior brings an amount of information that is infinitely larger than the one carried by a finite sample. The last part of the talk showed that the resolution of the difficulty lies in selecting priors that avoid over-smoothing (although this depends on an unknown smoothness quantity as well). I very much liked this soft entry to the problem, as it showed that all is not that rosy with the Bayesian non-parametric approach, whose focus on asymptotics or computation generally obscures this finite-sample issue.
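The over-smoothing phenomenon can be illustrated in a toy Gaussian sequence model (my own sketch, not from the talk, with hypothetical choices: true coefficients θ₀ᵢ = i⁻¹ and prior variances i^(−1−2α)): a prior smoother than the truth (large α) produces credible balls that essentially never catch the true curve, while an under-smoothing prior (small α) yields conservative coverage.

```python
import numpy as np

rng = np.random.default_rng(2)
n, N = 1000, 1000                 # noise level 1/n, number of coefficients
i = np.arange(1, N + 1)
theta0 = i ** (-1.0)              # true curve coefficients (smoothness ~1/2)

def coverage(alpha, reps=500):
    """frequentist coverage of the 95% L2 credible ball under a
    Gaussian prior with variances i**(-1-2*alpha)"""
    tau2 = i ** (-1.0 - 2.0 * alpha)
    lam = tau2 / (tau2 + 1.0 / n)          # posterior shrinkage factors
    v = lam / n                            # posterior variances
    # normal approximation to the 95% quantile of the posterior L2 spread
    r2 = v.sum() + 1.645 * np.sqrt(2.0 * (v ** 2).sum())
    hits = 0
    for _ in range(reps):
        xobs = theta0 + rng.normal(size=N) / np.sqrt(n)
        post_mean = lam * xobs
        hits += ((theta0 - post_mean) ** 2).sum() <= r2
    return hits / reps

print(coverage(0.1), coverage(1.0))   # under- vs over-smoothing prior
```

The mechanism is the one from the talk: when the prior over-smooths, the squared bias of the posterior mean dwarfs the posterior spread that calibrates the credible ball, so the ball is both small and centred in the wrong place.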

Overall, I very much enjoyed those three days in Padova, from the pleasant feeling of the old city and of the local food (best risottos in the past six months!, and a very decent Valpolicella as well) to the great company of old and new friends—making plans for a model choice brainstorming week in Paris in June—and to the new entries on Bayesian modelling, and in particular Bayesian model choice, I gathered from the talks. I am thus grateful to my friends Laura Ventura and Walter Racugno for their enormous investment in organising this workshop and in making it such a profitable and rich time. Grazie mille!

a paradox in decision-theoretic interval estimation (solved)

Posted in pictures, Statistics, Travel, University life on October 4, 2012 by xi'an

In 1993, we wrote a paper [with George Casella and Gene/Jiunn Hwang] on the paradoxical consequences of using the loss function

\text{length}(C) - k \mathbb{I}_C(\theta)

(published in Statistica Sinica, 3, 141-155) since it led to the following property: for the standard normal mean estimation problem, the regular confidence interval is dominated by the modified confidence interval equal to the empty set when the estimated standard deviation is too large… This was first pointed out by Jim Berger, and the most natural culprit is the artificial loss function, where the first part is unbounded while the second part is bounded by k. Recently, Paul Kabaila—whom I met both in Adelaide, where he quite appropriately commented about the abnormal talk at the conference!, and in Melbourne, where we met with his students after my seminar at the University of Melbourne—published a paper (first on arXiv, then in Statistics and Probability Letters) where he demonstrates that the mere modification of the above loss into

\dfrac{\text{length}(C)}{\sigma} - k \mathbb{I}_C(\theta)

solves the paradox! For Jeffreys' non-informative prior, the Bayes (optimal) estimate is the regular confidence interval. Besides doing the trick, this nice resolution explains the earlier paradox as being linked to a lack of invariance in the (earlier) loss function. This is somehow satisfying since Jeffreys' prior is also the invariant prior in this case.
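The original paradox is easy to reproduce numerically: whenever the realised interval is longer than k, its loss is at least length − k > 0 whatever θ is, while the empty set incurs loss 0, so the "report nothing when the interval is too long" procedure dominates the usual t-interval. A minimal Monte Carlo sketch (my own, with hypothetical values n = 5 and k = 4, σ estimated by the sample standard deviation):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, k = 5, 4.0                         # hypothetical sample size and k
t = stats.t.ppf(0.975, n - 1)         # two-sided 95% t quantile

def risks(theta, sigma=1.0, reps=100_000):
    """Monte Carlo risks of the usual t-interval and of the
    'empty set when too long' modification, under the loss
    length(C) - k * I_C(theta)."""
    xs = rng.normal(theta, sigma, size=(reps, n))
    m, s = xs.mean(axis=1), xs.std(axis=1, ddof=1)
    half = t * s / np.sqrt(n)
    length = 2.0 * half
    cover = np.abs(m - theta) <= half
    loss_reg = length - k * cover
    # whenever length > k the interval loses to the empty set,
    # whether or not it covers theta
    loss_mod = np.where(length > k, 0.0, loss_reg)
    return loss_reg.mean(), loss_mod.mean()

r_reg, r_mod = risks(0.0)
print(r_reg, r_mod)   # r_mod <= r_reg at every theta
```

No simulation is really needed since the dominance is pointwise on the event {length > k}; the code only makes the risk gap visible, and rescaling the length by σ as in Kabaila's loss removes the event altogether.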