Archive for Italia
Chris Drovandi (QUT) sent me his impressions of BISP8, which just took place in Milano, Italia (BISP stands for Bayesian inference in stochastic processes):
Here is a review of some of the talks at BISP8. For the other talks I do not have sufficient background to do them the justice they deserve. It was a very enjoyable small workshop with many talks in my areas of interest.
In the first session Vanja Dukic presented Bayesian inference for SEIR epidemic DE models and state space models of Google Flu Trends data. In the case of the state space models a particle learning algorithm was developed. The author considered both fixed and random effects for the data in each US state. In the second session, Murali Haran presented a likelihood-free approach for inferring the parameters of a spatio-temporal epidemic model. The speaker used a Gaussian process emulator of the model based on model simulations from a regular grid of parameter values. The emulator approach is suggested to be less intensive than ABC in terms of the number of model simulations, but is only suitable for low dimensional inference problems (even lower dimensional than those ABC can handle).
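For readers unfamiliar with the ABC baseline the emulator is compared with, here is a minimal rejection-ABC sketch on a toy problem. Everything in it is an assumption made for illustration (a normal model with unknown mean, a uniform prior, the sample mean as summary statistic, the name `abc_rejection`); it is not the epidemic model or the emulator from the talk. It does show why the number of model simulations is the cost that matters: every prior draw requires one full simulation, and most of them are rejected.

```python
import random


def abc_rejection(observed_mean, n_obs, n_draws, tolerance, rng):
    """Toy rejection ABC for the mean of a N(theta, 1) sample.

    Hypothetical setup: uniform prior on [-5, 5], summary statistic = sample mean.
    """
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(-5.0, 5.0)                       # draw from the prior
        simulated = [rng.gauss(theta, 1.0) for _ in range(n_obs)]  # one model simulation
        simulated_mean = sum(simulated) / n_obs
        # keep theta only if the simulated summary is close to the observed one
        if abs(simulated_mean - observed_mean) < tolerance:
            accepted.append(theta)
    return accepted


rng = random.Random(0)
posterior_draws = abc_rejection(
    observed_mean=1.5, n_obs=20, n_draws=5000, tolerance=0.1, rng=rng
)
acceptance_rate = len(posterior_draws) / 5000
abc_posterior_mean = sum(posterior_draws) / len(posterior_draws)
```

With these settings only a few percent of the simulations are kept, which is the inefficiency that an emulator built from a fixed grid of runs tries to avoid.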
In the first session of day 2 Ana Palacios combined the Gompertz model with Markov processes to create flexible and realistic stochastic growth models. The resulting model has a difficult likelihood, and inference was performed both by completing the likelihood, which creates simple Gibbs moves, and by ABC.
There were 3 talks in a row on inference for SDEs. The first, by Simon Särkkä, avoids evaluating an intractable transition density by proposing from another diffusion model and computing importance weights using the Girsanov theorem. Next, Samuel Kou used a population MCMC type approach where each chain had a different Euler discretisation. This helps improve mixing for the chain with the finest grid. Moves between chains are complicated by the different dimension of each chain. The author used a filling approach to overcome this. A very interesting aspect of the talk was using information from all chains to extrapolate various posterior quantiles to delta_t = 0 (no discretisation, implying the correct posterior). I assume the extrapolation may not work as well for the extreme quantiles. The third talk, by Andrew Golightly, proposed an auxiliary approach to improve PMCMC for these models. This talk was the most technical (for me) so I need more time to digest it. Following my talk (based on some work here and some current work) was an applied talk using SMC² methodology.
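The extrapolation idea in Samuel Kou's talk can be pictured with a very small sketch. It assumes, purely for illustration, that the discretisation error in a posterior quantile is roughly linear in delta_t, so that a least-squares line through the per-chain estimates can be read off at delta_t = 0. The function name and the numbers are made up; the scheme actually used in the talk may well differ.

```python
def extrapolate_to_zero(step_sizes, estimates):
    """Fit a least-squares line to (step size, estimate) pairs and return its intercept.

    The intercept is the extrapolated value at step size 0, under the
    (assumed) model estimate = limit + slope * step_size.
    """
    n = len(step_sizes)
    mean_h = sum(step_sizes) / n
    mean_q = sum(estimates) / n
    sxx = sum((h - mean_h) ** 2 for h in step_sizes)
    sxy = sum((h - mean_h) * (q - mean_q) for h, q in zip(step_sizes, estimates))
    slope = sxy / sxx
    return mean_q - slope * mean_h


# Hypothetical posterior medians from three chains with Euler steps 0.5, 0.25, 0.125,
# generated here from an exactly linear bias so the answer is known to be 2.0.
step_sizes = [0.5, 0.25, 0.125]
median_estimates = [2.0 + 3.0 * h for h in step_sizes]
extrapolated_median = extrapolate_to_zero(step_sizes, median_estimates)
```

In practice the per-chain quantiles are themselves Monte Carlo estimates, and tail quantiles are noisier than central ones, which fits the caveat above about extreme quantiles.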
On the final day Alexandros Beskos investigated the use of SMC for Bayesian inference for a high dimensional (static) parameter. SMC is advocated here due to the ease of adaptation relative to MCMC when there is no structure in the model. The base of the approach I believe was that of Chopin (2002).
Here are the slides of my talk in Padova for the workshop Recent Advances in statistical inference: theory and case studies (very similar to the slides for the Varanasi and Gainesville meetings, obviously!, with Peter Müller commenting [at last!] that I had picked the wrong photos from Khajuraho!)
The worthy Padova addendum is that I had two discussants, Stefano Cabras from Universidad Carlos III in Madrid, whose slides are:
and Francesco Pauli, from Trieste, whose slides are:
These were kind and rich discussions with many interesting openings: Stefano’s idea of estimating the pivotal function h is opening new directions, obviously, as it indicates an additional degree of freedom in calibrating the method. Esp. when considering the high variability of the empirical likelihood fit depending on the function h. For instance, one could start with a large collection of candidate functions and build a regression or a principal component reparameterisation from this collection… (Actually I did not get point #1 about ignoring f: the empirical likelihood is by essence ignoring anything outside the identifying equation, so as long as the equation is valid..) Point #2: Opposing sample free and simulation free techniques is another interesting avenue, although I would not say ABC is “sample free”. As to point #3, I will certainly take a look at Monahan and Boos (1992) to see if this can drive the choice of a specific type of pseudo-likelihoods. I like the idea of checking the “coverage of posterior sets” and even more “the likelihood must be the density of a statistic, not necessarily sufficient” as it obviously relates to our current ABC model comparison work… Esp. when the very same paper is mentioned by Francesco as well. Grazie, Stefano! I also appreciate the survey made by Francesco on the consistency conditions, because I think this is an important issue that should be taken into consideration when designing ABC algorithms. (Just pointing out again that, in the theorem of Fearnhead and Prangle (2012) quoting Bernardo and Smith (1992), some conditions are missing for the mathematical consistency to apply.) I also like the agreement we seem to reach about ABC being evaluated per se rather than as a poor man’s Bayesian method. Francesco’s analysis of Monahan and Boos (1992) as validating or not empirical likelihood points out a possible link with the recent coverage analysis of Prangle et al., discussed on the ‘Og a few weeks ago.
And an unsuspected link with Larry Wasserman! Grazie, Francesco!
The third day of this rich Padova workshop was actually a half-day which, thanks to a talk cancellation, I managed to attend completely before flying back to Paris. The first talk by Matteo Botai was about the appeal of using quantile regression, as opposed to regular (or mean) regression. The talk was highly pedagogical and enthusiastic, hence enjoyable!, but I did not really buy the argument: if one starts modelling more than the conditional mean, the whole conditional distribution should be the target of the inference, rather than an arbitrary collection of quantiles, esp. if those are estimated marginally and not jointly. There could be realistic exceptions, for instance legit 95% bounds/quantiles in medical trials, but they are certainly most rare (as exceptions should be!). This talk however led me to ponder a possible connection with the g-and-k quantile distributions (whose dedicated monograph I did not really appreciate!) even though I had no satisfactory answer by the end of the talk. The second talk by Eva Cantoni dealt with a fishery problem, an ecological model close to my interests, that had nice hierarchical features and [of course] a possible Bayesian analysis of the random effects. This was not the path followed though and the likelihood analysis had to rely on bootstrap and other approximations. The motivation was provided by the very recent move of the hammerhead shark (among several species of shark) to the endangered species list and the data came from catches reported by commercial fishing vessels. I have always wondered about the reliability of such data, unless there is a researcher on-board the vessel. Indeed, while the commercial catches are presumably checked upon arrival to comply with the quotas (at least in European waters), unintentional catches are presumably thrown away on the spot (maybe not since this is high quality flesh) and not at a time when careful statistics can be saved…
Actually, the whole fishing concept eludes me, even though I can see the commercial side of it: this is the only large-scale remainder of the early hunter-gatherer society and there is no ethical reason it should persist (well, other than feeding coastal populations that rely solely on fish catches, and even then…). The last two centuries have provided many instances of species extinction resulting from unlimited commercial fishing, but fishing is still going on… End of the parenthesis.
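Coming back to the quantile regression talk for a moment: the objection about quantiles being estimated marginally is easier to see with the loss function written out. The sketch below uses the standard check (or pinball) loss and the well-known fact that, for a constant fit, a minimiser is a sample quantile; the function names and the toy data are illustrative only and are not taken from the talk. Each level tau defines its own separate minimisation, so nothing ties the fits at different levels together.

```python
def check_loss(tau, observations, candidate):
    """Total check (pinball) loss of a constant fit `candidate` at quantile level tau."""
    total = 0.0
    for y in observations:
        residual = y - candidate
        if residual >= 0:
            total += tau * residual            # observation above the fit
        else:
            total += (tau - 1.0) * residual    # observation below the fit (positive term)
    return total


def best_constant_quantile(observations, tau):
    """Minimise the check loss over the observed values (a minimiser lies among them)."""
    return min(observations, key=lambda c: check_loss(tau, observations, c))


toy_data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
fitted_median = best_constant_quantile(toy_data, 0.5)
fitted_lower_quartile = best_constant_quantile(toy_data, 0.25)
```

Replacing the constant by a linear function of covariates gives quantile regression proper, again with one independent optimisation per level tau.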
The last talk was by Aad van der Vaart, on non-parametric credible sets, i.e. credible sets on curves. Most of the talk was dedicated to the explanation of why there was an issue with those credible sets, that is, why they could be incredibly slow in catching the true curve and in shedding away the impact of the prior. This was most interesting, obviously, if ultimately not that surprising: the prior brings an amount of information that is infinitely larger than the one carried by a finite sample. The last part of the talk showed that the resolution of the difficulty was in selecting priors that avoid over-smoothing (although this depends on an unknown smoothness quantity as well). I liked very much this soft entry to the problem as it showed that all is not that rosy with the Bayesian non-parametric approach, whose foci on asymptotics or computation generally occult this finite sample issue.
Overall, I enjoyed very very much those three days in Padova, from the pleasant feeling of the old city and of the local food (best risottos in the past six months!, and a very decent Valpolicella as well) to the great company of old and new friends—making plans for a model choice brainstorming week in Paris in June—and to the new entries on Bayesian modelling and in particular Bayesian model choice I gathered from the talks. I am thus grateful to my friends Laura Ventura and Walter Racugno for their enormous investment in organising this workshop and in making it such a profitable and rich time. Grazie mille!