another bad graph

Posted in Books, pictures, Statistics on October 27, 2021 by xi'an

Handbooks [not a book review]

Posted in Books, pictures, Statistics, University life on October 26, 2021 by xi'an

three decades back

Posted in Kids, Travel on October 25, 2021 by xi'an

Yesterday, I (we) found myself (ourselves) back in time, precisely 34 years ago, as we drove our daughter to Orly airport for her flight to Cayenne, French Guiana. And the start of her internship. Indeed, this is also the airport from which I left for Purdue University in 1987 and where my parents drove me then, more for sentimental reasons than out of necessity, as I had much less luggage than Rachel! This old airport has not changed that much, apart from the sharp increase in security restrictions. At the time, my parents were able to stay till the plane to Chicago left (two hours late) and watch it take off from the terraces of the airport. This time, we were unable to enter the airport beyond the parking lot and watched the plane take off (one hour late) on a flight tracker! But overall there was the same bittersweet feeling of seeing one's kid move (far) away for a major and exciting step in their professional life. (When I called my mom to watch for the plane flying west straight across Normandy, very close to our family roots, she reminded me of that and also of my grandparents watching my plane flying by… without a flight tracker! I actually remember spotting Mont Saint-Michel on that trip.) Fare well, Dr. R, and see you soon!

off to Luminy!!!

Posted in Mountains, pictures, Running, Statistics, Travel, University life on October 24, 2021 by xi'an

conditioning on insufficient statistics in Bayesian regression

Posted in Books, Statistics, University life on October 23, 2021 by xi'an

“…the prior distribution, the loss function, and the likelihood or sampling density (…) a healthy skepticism encourages us to question each of them”

A paper by John Lewis, Steven MacEachern, and Yoonkyung Lee has recently appeared in Bayesian Analysis. It starts with the great motivation of a misspecified model requiring the use of a (thus necessarily) insufficient statistic, and moves to their central concern of simulating the posterior based on that statistic.

Model misspecification remains understudied from a Bayesian perspective and this paper is thus most welcome in addressing the issue. However, when reading through, one of my criticisms is the definition of misspecification as equivalent to the presence of outliers in the sample. An outlier model is an easy case of misspecification, in the end, since the original model remains meaningful. (Why should there be "good" versus "bad" data?) Furthermore, adding a non-parametric component for the unspecified part of the data would sound like a "more Bayesian" alternative. Unrelated, I also idly wondered whether or not normalising flows could be used in this instance…

The problem of selecting the statistic T (Darjeeling, of course!) is not really discussed there, even though each choice of T leads to a different meaning of what misspecified means, and suggests a comparison with Bayesian empirical likelihood.

“Acceptance rates of this [ABC] algorithm can be intolerably low”

Erm, this is not really the issue with ABC, is it?! Especially when the tolerance is induced by the simulations themselves.
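To make the point concrete, here is a minimal sketch (a toy Gaussian example of mine, not from the paper) of vanilla rejection ABC where the tolerance is indeed induced by the simulations themselves, taken as an empirical quantile of the simulated distances rather than fixed in advance:

```python
# Toy sketch: rejection ABC with a simulation-induced tolerance.
# Model (hypothetical illustration): y_i ~ N(theta, 1), i = 1..50,
# with the sample mean as summary statistic.
import numpy as np

rng = np.random.default_rng(0)

y_obs = rng.normal(1.0, 1.0, size=50)          # pseudo-observed data
s_obs = y_obs.mean()                           # observed summary statistic

N = 100_000
theta = rng.normal(0.0, 10.0, size=N)          # draws from a vague prior
# simulate the summary directly (its sampling distribution is N(theta, 1/50))
s_sim = rng.normal(theta, 1.0 / np.sqrt(50))
dist = np.abs(s_sim - s_obs)

# tolerance induced by the simulations: keep the closest 0.1% of draws
eps = np.quantile(dist, 0.001)
post = theta[dist <= eps]

print(eps, post.mean(), post.std())
```

The accepted draws concentrate around the observed summary, and shrinking the quantile trades acceptance rate against approximation error, which is why a low acceptance rate per se is not the pathology.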

When I reached the MCMC (Gibbs?) part of the paper, I first wondered at its relevance for the misspecification issue before realising it had become the focus of the paper. Now, simulating the observations conditional on a value of the summary statistic T is a true challenge. I remember for instance George Casella mentioning it in association with a Student's t sample in the 1990's, and Kerrie and I making an unsuccessful attempt at it in the same period. Persi Diaconis has written several papers on the problem and I am thus surprised at the dearth of references here, like the rather recent Byrne and Girolami (2013), Florens and Simoni (2015), or Bornn et al. (2019). In the present case, the linear model assumed as the true model has the exceptional feature that it leads to a feasible transform of an unconstrained simulation into a simulation with fixed statistics, with no measure-theoretic worries, if not without considerable effort to establish that the operation is truly valid… And, while simulating (θ,y) makes perfect sense in an insufficient setting, the cost is then precisely the same as when running a vanilla ABC. Which brings us to the natural comparison with ABC. While taking ε=0 may sound optimal for being "exact", it is not from an ABC perspective, since the convergence rate of the (summary) statistic should be roughly the one of the tolerance (Li and Fearnhead, 2018, Frazier et al., 2018).
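The kind of feasible transform alluded to above can be sketched in the simplest location-scale case (a toy illustration of mine, not the paper's algorithm): an affine map sends an unconstrained Gaussian draw to a sample whose (here: mean and standard deviation) statistics take *exactly* the fixed values.

```python
# Toy sketch: affine transform of an unconstrained simulation into a
# simulation with fixed sample mean and standard deviation.
import numpy as np

rng = np.random.default_rng(1)

def condition_on_mean_sd(z, m0, s0):
    """Affine-map z so that the sample mean is exactly m0 and the sample sd exactly s0."""
    zc = z - z.mean()               # centre the draw
    return m0 + s0 * zc / zc.std()  # rescale to sd s0, relocate to m0

z = rng.normal(size=30)             # unconstrained simulation
y = condition_on_mean_sd(z, 2.0, 0.5)

print(y.mean(), y.std())            # 2.0 and 0.5 up to floating point
```

In the normal linear model the analogous operation fixes the least-squares estimate and the residual sum of squares via projection and rescaling; the difficulty stressed in the post is establishing that such a deterministic map does produce the intended conditional distribution.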

“[The Borel Paradox] shows that the concept of a conditional probability with regard to an isolated given hypothesis whose probability equals 0 is inadmissible.” A. Колмого́ров (1933)

As a side note for measure-theoretic purists, the conditional distribution of y given T(y)=T⁰ is not uniquely defined, since the conditioning event has probability zero (i.e., the conditioning set is of measure zero): see the Borel-Kolmogorov paradox. The computations in the paper are undoubtedly correct, but they correspond to one arbitrary choice of a transform (or of a conditioning σ-algebra).
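The paradox shows up numerically in a toy bivariate example (mine, for illustration): with X and Y independent standard normals, conditioning on the same null event {Y = 0} through shrinking slices |Y| &lt; ε versus |Y/X| &lt; ε yields different conditional laws for X, with second moments 1 and 2 respectively.

```python
# Toy Monte Carlo illustration of the Borel-Kolmogorov paradox:
# the "conditional given Y = 0" depends on how the null event is approached.
import numpy as np

rng = np.random.default_rng(2)
n, eps = 2_000_000, 1e-2
x = rng.normal(size=n)
y = rng.normal(size=n)

cond1 = x[np.abs(y) < eps]       # slices of {Y = 0} via the Y coordinate
cond2 = x[np.abs(y / x) < eps]   # slices of the same event via the ratio Y/X

print((cond1**2).mean())         # close to 1: X | Y≈0 stays standard normal
print((cond2**2).mean())         # close to 2: density proportional to |x|φ(x)
```

The limiting conditional through the ratio picks up a size-biasing factor |x|, which is exactly the arbitrariness of the conditioning σ-algebra at play.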
