This “week in Warwick” was not chosen at random as I was aware there is a workshop on non-reversible MCMC going on. (Even though CRiSM sponsored so many workshops in September that almost any week would have worked for the above sentence!) It has always been kind of a mystery to me that non-reversibility could make a massive difference in practice, even though I am quite aware that it does. And I can grasp some of the theoretical arguments why it does. So it was quite rewarding to sit in this Warwick amphitheatre and learn about overdamped Langevin algorithms and other non-reversible diffusions, to see results where convergence times moved from n to √n, and to grasp some of the appeal of lifting albeit in finite state spaces. Plus, the cartoon presentation of Hamiltonian Monte Carlo by Michael Betancourt was a great moment, not only because of the satellite bursting into flames on the screen but also because it gave a very welcome intuition about why reversibility was inefficient and HMC appealing. So I am grateful to my two colleagues, Joris Bierkens and Gareth Roberts, for organising this exciting workshop, with a most profitable scheduling favouring long and few talks. My next visit to Warwick will also coincide with a workshop on intractable likelihood, next November. This time part of the new Alan Turing Institute programme.
Archive for workshop
While I discussed on the ‘Og in the past the difference I saw between estimating an unknown parameter from a distribution and evaluating a normalising constant, evaluating such constants and hence handling [properly] doubly intractable models is obviously of the utmost importance! For this reason, Nial Friel, Helen Ogden and myself have put together a CRiSM workshop on the topic (with the tongue-in-cheek title of Estimating constants!), to be held at the University of Warwick next April 20-22.
The CRiSM workshop will focus on computational methods for approximating challenging normalising constants found in Monte Carlo, likelihood and Bayesian models. Such methods may be used in a wide range of problems: to compute intractable likelihoods, to find the evidence in Bayesian model selection, and to compute the partition function in Physics. The meeting will bring together different communities working on these related problems, some of which have developed original if little advertised solutions. It will also highlight the novel challenges associated with large data and highly complex models. Besides a dozen invited talks, the schedule will highlight two afternoon poster sessions with speed (2-5mn) oral presentations called ‘Elevator’ talks.
While 2016 is going to be quite busy with all kinds of meetings (MCMSkv, ISBA 2016, the CIRM Statistics month, AISTATS 2016, …), this should be an exciting two-day workshop, given the on-going activity in this area, and I thus suggest interested readers to mark the dates in their diary. I will obviously keep you posted about registration and accommodation when those entries are available.
The workshop at the BIPM on measurement uncertainty was certainly most exciting, first by its location in the Parc de Saint Cloud in classical buildings overlooking the Seine river in a most bucolic manner…and second by its mostly Bayesian flavour. The recommendations that the workshop addressed are about revisions in the current GUM, which stands for the Guide to the Expression of Uncertainty in Measurement. The discussion centred on using a more Bayesian approach than in the earlier version, with the organisers of the workshop and leaders of the revision apparently most in favour of that move. “Knowledge-based pdfs” came into the discussion as an attractive notion since it rings a Bayesian bell, especially when associated with probability as a degree of belief and incorporating the notion of an a priori probability distribution. And propagation of errors. Or even more when mentioning the removal of frequentist validations. What I gathered from the talks is the perspective drifting away from central limit approximations to more realistic representations, calling for Monte Carlo computations. There is also a lot I did not get about conventions, codes and standards. Including a short debate about the different meanings on Monte Carlo, from simulation technique to calculation method (as for confidence intervals). And another discussion about replacing the old formula for estimating sd from the Normal to the Student’s t case. A change that remains highly debatable since the Student’s t assumption is as shaky as the Normal one. What became clear [to me] during the meeting is that a rather heated debate is currently taking place about the need for a revision, with some members of the six (?) organisations involved arguing against Bayesian or linearisation tools.
This became even clearer during our frequentist versus Bayesian session with a first talk so outrageously anti-Bayesian it was hilarious! Among other things, the notion that “fixing” the data was against the principles of physics (the speaker was a physicist), that the only randomness in a Bayesian coin tossing was coming from the prior, that the likelihood function was a subjective construct, that the definition of the posterior density was a generalisation of Bayes’ theorem [generalisation found in… Bayes’ 1763 paper then!], that objective Bayes methods were inconsistent [because Jeffreys’ prior produces an inadmissible estimator of μ²!], that the move to Bayesian principles in GUM would cost the New Zealand economy 5 billion dollars [hopefully a frequentist estimate!], &tc., &tc. The second pro-frequentist speaker was by comparison much much more reasonable, although he insisted on showing Bayesian credible intervals do not achieve a nominal frequentist coverage, using a sort of fiducial argument distinguishing x=X+ε from X=x+ε that I missed… A lack of achievement that is fine by my standards. Indeed, a frequentist confidence interval provides a coverage guarantee either for a fixed parameter (in which case the Bayesian approach achieves better coverage by constant updating) or a varying parameter (in which case the frequency of proper inclusion is of no real interest!). The first Bayesian speaker was Tony O’Hagan, who summarily shred the first talk to shreds. And also criticised GUM2 for using reference priors and maxent priors. I am afraid my talk was a bit too exploratory for the audience (since I got absolutely no question!) In retrospect, I should have given an into to reference priors.
An interesting specificity of a workshop on metrology and measurement is that they are hard stickers to schedule, starting and finishing right on time. When a talk finished early, we waited until the intended time to the next talk. Not even allowing for extra discussion. When the only overtime and Belgian speaker ran close to 10 minutes late, I was afraid he would (deservedly) get lynched! He escaped unscathed, but may (and should) not get invited again..!
I attended an highly unusual workshop while in Warwick last week. Unusual for me, obviously. It was about probabilistic numerics, i.e., the use of probabilistic or stochastic arguments in the numerical resolution of (possibly) deterministic problems. The notion in this approach is fairly Bayesian in that it makes use to prior information or belief about the quantity of interest, e.g., a function, to construct an usually Gaussian process prior and derive both an estimator that is identical to a numerical method (e.g., Runge-Kutta or trapezoidal integration) and uncertainty or variability around this estimator. While I did not grasp much more than the classy introduction talk by Philipp Hennig, this concept sounds fairly interesting, if only because of the Bayesian connection, and I wonder if we will soon see a probability numerics section at ISBA! More seriously, placing priors on functions or functionals is a highly formal perspective (as in Bayesian non-parametrics) and it makes me wonder how much of the data (evaluation of a function at a given set of points) and how much of the prior is reflected in the output [variability]. (Obviously, one could also ask a similar question for statistical analyses!) For instance, issues of singularity arise among those stochastic process priors.
Another question that stemmed from this talk is whether or not more efficient numerical methods can derived that way, in addition to recovering the most classical ones. Somewhat, somehow, given the idealised nature of the prior, it feels like priors could be more easily compared or ranked than in classical statistical problems. Since the aim is to figure out the value of an integral or the solution to an ODE. (Or maybe not, since again almost the same could be said about estimating a normal mean.)
Yet another workshop around! Still at Warwick, organised by Simon Barthelmé, Nicolas Chopin and Adam Johansen on the theme of statistical aspects of neuroscience. Being nearby I attended a few lectures today but most talks are more topical than my current interest in the matter, plus workshop fatigue starts to appear!, and hence I will keep a low attendance for the rest of the week to take advantage of my visit here to make some progress in my research and in the preparation of the teaching semester. (Maybe paradoxically I attended a non-neuroscience talk by listening to Richard Wilkinson’s coverage of ABC methods, with an interesting stress on meta-models and the link with computer experiments. Given that we are currently re-revising our paper with Matt Moore and Kerrie Mengersen (and now Chris Drovandi), I find interesting to see a sort of convergence in our community towards a re-re-interpretation of ABC as producing an approximation of the distribution of the summary statistic itself, rather than of the original data, using auxiliary or indirect or pseudo-models like Gaussian processes. (Making the link with Mark Girolami’s talk this morning.)
Great poster session yesterday night and at lunch today. Saw an ABC poster (by Dennis Prangle, following our random forest paper) and several MCMC posters (by Marco Banterle, who actually won one of the speed-meeting mini-project awards!, Michael Betancourt, Anne-Marie Lyne, Murray Pollock), and then a rather different poster on Mondrian forests, that generalise random forests to sequential data (by Balaji Lakshminarayanan). The talks all had interesting aspects or glimpses about big data and some of the unnecessary hype about it (them?!), along with exposing the nefarious views of Amazon to become the Earth only seller!, but I particularly enjoyed the astronomy afternoon and even more particularly Steve Roberts sweep through astronomy machine-learning. Steve characterised variational Bayes as picking your choice of sufficient statistics, which made me wonder why there were no stronger connections between variational Bayes and ABC. He also quoted the book The Fourth Paradigm: Data-Intensive Scientific Discovery by Tony Hey as putting forward interesting notions. (A book review for the next vacations?!) And also mentioned zooniverse, a citizens science website I was not aware of. With a Bayesian analysis of the learning curve of those annotating citizens (in the case of supernovae classification). Big deal, indeed!!!