Archive for Gainesville

transport Monte Carlo

Posted in Books, pictures, Statistics, Travel on August 31, 2020 by xi'an

Read this recent arXival by Leo Duan (from UF in Gainesville) on transport approaches to approximate Bayesian computation, in connection with normalising flows. The author points out a “lack of flexibility in a large class of normalizing flows”  to bring forward his own proposal.

“…we assume the reference (a multivariate uniform distribution) can be written as a mixture of many one-to-one transforms from the posterior”

The transportation problem is turned into defining a joint distribution on (β,θ) such that θ is marginally distributed from the posterior and β is one of an infinite collection of transforms of θ. Which sounds quite different from normalizing flows, to be sure. Reversing the order, if one manages to simulate β from its marginal, the resulting θ is one of the transforms. Chosen to be a location-scale modification of β, s⊗β+m. The weights when going from θ to β are logistic transforms with Dirichlet distributed scales. All with parameters to be optimised by minimising the Kullback-Leibler divergence between the reference measure on β and its inverse mixture approximation, and resorting to gradient descent. (This may sound a wee bit overwhelming as an approximation strategy and I actually had to make a large cup of strong matcha to get over it, but this may be due to the heat wave occurring at the same time!) Drawing θ from this approximation is straightforward by construction and an MCMC correction can even be added, resulting in an independent Metropolis-Hastings version since the acceptance ratio remains computable. Although this may defeat the whole purpose of the exercise by stalling the chain if the approximation is poor (hence suggesting that this last step be used instead as a control).
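To make the independent Metropolis-Hastings correction concrete, here is a minimal sketch under loud assumptions: the target is a toy standard Normal and the approximation q is a single Gaussian stand-in, not the mixture of location-scale transforms of the paper. The names `log_target`, `log_q`, `Q_MEAN` and `Q_SD` are hypothetical. The only point is that the acceptance ratio involves nothing but the two densities at the current and proposed values.

```python
import math
import random

# Hypothetical fitted approximation q: a Gaussian stand-in (assumption),
# NOT the inverse mixture approximation of the paper.
Q_MEAN, Q_SD = 0.5, 1.5


def log_target(theta):
    # Unnormalised log-density of a toy "posterior": standard Normal (assumption).
    return -0.5 * theta * theta


def log_q(theta):
    # Log-density of the approximation, up to an additive constant.
    z = (theta - Q_MEAN) / Q_SD
    return -0.5 * z * z - math.log(Q_SD)


def independent_mh(n_iter, rng):
    """Independent Metropolis-Hastings: proposals are drawn from q,
    regardless of the current state of the chain."""
    theta = rng.gauss(Q_MEAN, Q_SD)
    chain, accepted = [], 0
    for _ in range(n_iter):
        prop = rng.gauss(Q_MEAN, Q_SD)
        # Log ratio of importance weights target/q, proposed versus current.
        log_ratio = (log_target(prop) - log_q(prop)) - (
            log_target(theta) - log_q(theta)
        )
        if rng.random() < math.exp(min(0.0, log_ratio)):
            theta = prop
            accepted += 1
        chain.append(theta)
    return chain, accepted / n_iter


if __name__ == "__main__":
    chain, rate = independent_mh(20000, random.Random(1))
    print("acceptance rate:", round(rate, 3))
```

A low acceptance rate is then a symptom of a poor approximation, which is why the correction step can double as a control on the fit.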

The paper also contains a theoretical section that studies the approximation error, going to zero as the number of terms in the mixture, K, goes to infinity. Including a Monte Carlo error in log(n)/n (and incidentally quoting a result from my former HoD at Paris 6, Paul Deheuvels). Numerical experiments show domination or equivalence with some other solutions, e.g. being much faster than HMC, the remaining $1000 question being of course the on-line evaluation of the quality of the approximation.

stratified ABC [One World ABC webinar]

Posted in Books, Statistics, University life on May 15, 2020 by xi'an

The third episode of the One World ABC seminar (Season 1!) was kindly delivered by Umberto Picchini on Stratified sampling and bootstrapping for ABC, which I already, if briefly, discussed after BayesComp 2020. Which sounds like a million years ago… His introduction on the importance of estimating the likelihood using a kernel, while 600% justified wrt his talk, made the One World ABC seminar sound almost like groundhog day! The central argument is in the computational gain brought by simulating a single θ-dependent [expensive] dataset followed by [cheaper] bootstrap replicates. Which turns de facto into bootstrapping the summary statistics.
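The "one expensive dataset, many cheap bootstrap replicates" idea can be sketched as follows. This is only an illustration under assumptions and not the speaker's algorithm: the simulator is a toy Normal(θ,1) sample, the summary is the sample mean, the kernel is Gaussian with bandwidth `eps`, and the stratification step is left out entirely. All function names are hypothetical.

```python
import math
import random


def simulate(theta, n, rng):
    # Stand-in for an expensive simulator (assumption): n draws from N(theta, 1).
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def summary(data):
    # Toy summary statistic (assumption): the sample mean.
    return sum(data) / len(data)


def abc_kernel_likelihood(theta, s_obs, n, n_boot, eps, rng):
    """Kernel estimate of the ABC likelihood at theta, based on ONE simulated
    dataset and n_boot bootstrap resamples of it."""
    data = simulate(theta, n, rng)  # the single expensive simulation
    total = 0.0
    for _ in range(n_boot):
        resample = rng.choices(data, k=n)  # cheap resampling with replacement
        d = (summary(resample) - s_obs) / eps
        total += math.exp(-0.5 * d * d)  # unnormalised Gaussian kernel
    return total / n_boot


if __name__ == "__main__":
    rng = random.Random(0)
    for theta in (0.0, 1.0, 3.0):
        print(theta, abc_kernel_likelihood(theta, 0.0, 100, 200, 0.3, rng))
```

Each call costs one run of the simulator, the averaging over the kernel being done on the resampled summaries only.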

If I understand correctly, the post-stratification approach of Art Owen (2013?, I cannot find the reference) corrects a misrepresentation of mine. Indeed, defining a partition with unknown probability weights seemed to me to annihilate the appeal of stratification, because the Bernoulli variance of the estimated probabilities brought back the same variability as the mother estimator. But with bootstrap, this requires only two simulations, one for the weights and one for the target. And further allows for a larger ABC tolerance in fine. Free lunch?!

The speaker in two weeks (21 May or Ascension Thursday!) is my friend and co-author Gael Martin from Monash University, who will speak on Focused Bayesian prediction, at quite a late time down under..!

my first parkrun [19:56,3/87,78.8%]

Posted in Kids, pictures, Running, Travel on January 19, 2020 by xi'an

This morning, I had my first parkrun race in Gainesville, before heading back to Paris. (Thanks to Florence Forbes who pointed out this initiative to me.) Which reminded me of the race I ran in Helsinki a few years ago. Without the “self-transcendence” topping…! While the route was very urban, it was a fun opportunity to run a race with a few other runners. My time of 19:56 is not my best by far but, excuses, excuses, I was not feeling too well and the temperature was quite high (21°) and I finished in the first three runners, just seconds behind two young fellows who looked like they were still in high school. (I am now holding the record of that race for my age group as well!) Anyway, this is a great way to join races when travelling and not worry about registration, certificates, &tc.

Parkrun also provides an age-grade adjusted ranking (78.8%), which is interesting but statistically puzzling as this is the ratio of the fastest time (ever?) in the age x gender category over one’s time. Given that fastest times are extreme, this depends on one individual and hence has a high variability. Especially in higher (meaning older!) veteran categories. A quantile in the empirical distribution would sound better. I came across this somewhat statistical analysis of the grade,
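To make the ratio-versus-quantile point concrete, a minimal sketch (an illustration under assumptions, not the analysis referred to above). It assumes the grade is the reference time divided by one's own time, so that it stays below 100%, and backs out the reference time implied by 19:56 and 78.8%. The list of category times passed to `share_slower` is made up for the example.

```python
def to_seconds(mmss):
    # Convert a "mm:ss" string into seconds.
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def age_grade(reference_s, time_s):
    # Assumed definition: fastest (reference) time over one's own time.
    return reference_s / time_s


def implied_reference(mmss, grade):
    # Reference time (in seconds) that would produce this grade for this time.
    return grade * to_seconds(mmss)


def share_slower(time_s, category_times_s):
    # Quantile-type alternative: proportion of the category that ran slower.
    slower = sum(1 for t in category_times_s if t > time_s)
    return slower / len(category_times_s)


if __name__ == "__main__":
    my_time = to_seconds("19:56")
    ref = implied_reference("19:56", 0.788)
    print("implied reference (s):", ref)
    # Sensitivity to a single extreme performance: a reference 30 s faster.
    print("grade with faster reference:", age_grade(ref - 30, my_time))
    # Hypothetical category times in seconds (made-up numbers).
    print("share slower:", share_slower(my_time, [1100, 1196, 1300, 1400, 1500]))
```

The grade moves with any change in the single reference time, while the empirical proportion only moves when the bulk of the category does.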

stranded

Posted in pictures, Travel on January 12, 2020 by xi'an

off to BayesComp 20, Gainesville

Posted in pictures, Statistics, Travel, University life on January 7, 2020 by xi'an