## a war[like] week

Posted in Books, Kids, pictures, Running, Statistics, Travel, University life, Wines on April 29, 2015 by xi'an

This week in Warwick was one of the busiest ever, as I had to juggle two workshops (including one in Oxford), a departmental meeting, two paper revisions, two pre-vivas, and a seminar in Leeds. Not to mention a broken toe (!), a flat tire (!!), and a dinner at The X. Hardly any time for writing blog entries…! Fortunately, I managed to squeeze in time for working with Kerrie Mengersen, who was visiting Warwick this fortnight, finding new directions for the (A)BCel approach we developed a few years ago with Pierre Pudlo. The workshop in Oxford was quite informal, with talks from PhD students [which I fear I cannot discuss here as the papers are not online yet], one talk by François Caron about estimating sparse networks with not exactly exchangeable priors and completely random measures, and one talk by Kerrie Mengersen on a new and in-progress approach to handling Big Data that I found quite convincing (and again cannot discuss here). The probabilistic numerics workshop was discussed in yesterday's post and I managed to discuss it a wee bit further with the organisers at The X restaurant in Kenilworth. (As a superfluous aside, and after a second sampling this year, I concluded that the Michelin star is somewhat undeserved, in that the dishes at The X are not particularly imaginative or tasty, the excellent sourdough bread being the best part of the meal!) I was expecting the train ride to Leeds to be highly bucolic, going through the sunny countryside of South Yorkshire with newborn lambs running in bright green fields surrounded by old stone walls…, but it instead went through endless villages with their rows of brick houses. Not that I have anything against brick houses, mind! Only, I had not realised how densely populated this part of England was, presumably going back all the way to the Industrial Revolution and the Manchester-Leeds-Birmingham triangle.

My seminar in Leeds was as exciting as the one in Amsterdam last week, and with a large audience, so I got many interesting questions, from the issue of turning the output (i.e., the posterior on α) into a decision rule, to making a decision in the event of a non-conclusive posterior, to links with earlier frequentist resolutions, to whether or not we were able to solve the Lindley-Jeffreys paradox (we are not!, which makes a lot of sense), to the possibility of running a subjective or a sequential version. After the seminar I enjoyed a perfect Indian dinner at Aagrah, apparently a Yorkshire institution, with the right balance between too hot and too mild, i.e., enough spices to break a good sweat but not so many as to lose all sense of taste!

## Oxford snapshot

Posted in Kids, pictures, Travel, University life on April 28, 2015 by xi'an

## Alan Turing Institute

Posted in Books, pictures, Running, Statistics, University life on February 10, 2015 by xi'an

The University of Warwick is one of the five UK universities (Cambridge, Edinburgh, Oxford, Warwick and UCL) to be part of the new Alan Turing Institute. To quote from the University press release, “The Institute will build on the UK’s existing academic strengths and help position the country as a world leader in the analysis and application of big data and algorithm research. Its headquarters will be based at the British Library at the centre of London’s Knowledge Quarter.” The Institute will gather researchers from mathematics, statistics, computer science, and connected fields towards collegial and focussed research, which means in particular that it will hire a fairly large number of researchers in stats and machine-learning in the coming months. The Department of Statistics at Warwick was strongly involved in answering the call for the Institute, and my friend and colleague Mark Girolami will be the University’s leading figure at the Institute, alas meaning that we will meet even less frequently! Note that the call for the Chair of the Alan Turing Institute is now open, with a deadline of March 15. [As a personal aside, I find that the Business Secretary’s statement that “Alan Turing’s genius played a pivotal role in cracking the codes that helped us win the Second World War. It is therefore only right that our country’s top universities are chosen to lead this new institute named in his honour.” does not absolve the legal system that drove Turing to suicide….]

Posted in Statistics, Travel, University life on January 28, 2015 by xi'an

On Wednesday afternoon, Richard Everitt and Dennis Prangle organised an RSS workshop in Reading on Bayesian Computation, and invited me to give a talk there, along with John Hemmings, Christophe Andrieu, Marcelo Pereyra, and themselves. Given the proximity between Oxford and Reading, this felt like a neighbourly visit, especially when I realised I could take my bike on the train! John Hemmings gave a presentation on synthetic models for climate change and their evaluation, which could have some connection with Tony O’Hagan’s recent talk in Warwick; Dennis told us about “the lazier ABC” version in connection with his “lazy ABC” paper; [from my very personal view] Marcelo expanded on the Moreau-Yosida expansion he had presented in Bristol about six months ago, with the notion that using a Gaussian tail regularisation of a super-Gaussian target in a Langevin algorithm could produce better convergence guarantees than the competition, including Hamiltonian Monte Carlo; Luke Kelly spoke about an extension of phylogenetic trees using a notion of lateral transfer; and Richard introduced a notion of biased approximation to Metropolis-Hastings acceptance ratios, a notion that I found quite attractive if not completely formalised, as there should be a Monte Carlo equivalent to the improvement brought by biased Bayes estimators over unbiased classical counterparts. (Repeating a remark made by Persi Diaconis more than 20 years ago.) Christophe Andrieu also exposed some recent developments of his on exact approximations à la Andrieu and Roberts (2009).

Since those developments are not yet finalised into an archived document, I will not delve into the details, but I found the results quite impressive and worth exploring, so I am looking forward to the incoming publication. One aspect of the talk which I can comment on is related to the exchange algorithm of Murray et al. (2006). Let me recall that this algorithm handles doubly intractable problems (i.e., likelihoods with intractable normalising constants, like the Ising model) by introducing auxiliary variables with the same distribution as the data given the new value of the parameter, and by computing an augmented acceptance ratio whose expectation is the targeted acceptance ratio and which conveniently removes the unknown normalising constants. This auxiliary scheme produces a random acceptance ratio and hence differs from the exact-approximation MCMC approach, which targets the intractable likelihood directly. It somewhat replaces the unknown constant with the density taken at a plausible realisation, hence providing a proper scale. At least for the new value. I wonder if a comparison has been conducted between both versions, the naïve intuition being that the ratio of estimates should be more variable than the estimate of the ratio. More generally, it seemed to me [during the introductory part of Christophe’s talk] that those different methods always face a harmonic mean danger when being phrased as expectations of ratios, since those ratios are not necessarily square integrable. And not necessarily bounded. Hence my rather gratuitous suggestion of using other tools than the expectation, like maybe a median, thus circling back to the biased estimators of Richard. (And later cycling back, unscathed, to Reading station!)
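To make the exchange algorithm concrete, here is a minimal sketch on a deliberately artificial toy model where the normalising constant is actually known but treated as unknown, so that the output can be checked against the true posterior. Everything below (the Gaussian toy likelihood, the flat prior, the random-walk proposal and its scale) is my own illustrative choice, not anything from Christophe’s talk; the only genuine ingredient is the augmented acceptance ratio in which the two unknown constants cancel.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "doubly intractable" model (illustration only): unnormalised
# likelihood f(x|theta) = exp(theta*x - x^2/2), i.e. a N(theta,1) density
# whose constant Z(theta) = sqrt(2*pi) * exp(theta^2/2) we pretend not to know.
def log_f(x, theta):           # unnormalised log-likelihood
    return theta * x - 0.5 * x**2

def sample_f(theta):           # exact sampler from f(.|theta), as the algorithm requires
    return rng.normal(theta, 1.0)

def exchange(x_obs, n_iter=20000, sigma=1.0):
    """Exchange algorithm of Murray et al. (2006), flat prior,
    symmetric random-walk proposal (so q cancels in the ratio)."""
    theta = 0.0
    chain = np.empty(n_iter)
    for t in range(n_iter):
        prop = theta + sigma * rng.normal()
        y = sample_f(prop)     # auxiliary data simulated at the proposed value
        # augmented ratio: Z(theta) and Z(prop) cancel, only unnormalised f remains
        log_a = (log_f(x_obs, prop) + log_f(y, theta)
                 - log_f(x_obs, theta) - log_f(y, prop))
        if np.log(rng.uniform()) < log_a:
            theta = prop
        chain[t] = theta
    return chain

chain = exchange(x_obs=1.5)
# with a flat prior the posterior is N(1.5, 1), so the chain mean should sit near 1.5
print(chain[5000:].mean())
```

The auxiliary draw makes the acceptance ratio random, which is exactly the point raised above: the ratio-of-estimates construction, compared with estimating the ratio itself as in pseudo-marginal methods.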

On top of the six talks in the afternoon, there was a small poster session during the tea break, where I met Garth Holloway, working in agricultural economics, who happened to be an (unsuspected) fan of mine!, to the point of entitling his poster “Robert’s paradox”!!! The problem covered by this undeserved denomination connects to the bias in Chib’s approximation of the evidence in mixture estimation, a phenomenon that I related to the exchangeability of the component parameters in an earlier paper or set of slides. So “my” paradox is essentially label (un)switching and its consequences. For which I cannot claim any fame! Still, I am looking forward to the completed version of this poster to discuss Garth’s solution; in the meantime, we had a beer together after the talks, drinking to the health of our mutual friend John Deely.

## a week in Oxford

Posted in Books, Kids, pictures, Statistics, Travel, University life on January 26, 2015 by xi'an

I spent [most of] the past week in Oxford in connection with our joint OxWaSP PhD programme, which is supported by the EPSRC and constitutes a joint Centre for Doctoral Training in statistical science focussing on data-intensive environments and large-scale models. The first cohort of a dozen PhD students started their training last Fall, with the first year spent in Oxford before splitting between Oxford and Warwick to write their theses. Courses are taught over a two-week block, with a two-day introduction to the theme (Bayesian Statistics in my case), followed by reading, meetings, daily research talks, mini-projects, and a final day in Warwick including presentations of the mini-projects and a concluding seminar (involving Jonty Rougier and Robin Ryder, next Friday). This approach by bursts of training periods is quite ambitious in that it requires a lot from the students, both through the lectures and in personal investment, and reminds me somewhat of a similar approach at École Polytechnique where courses are given over fairly short periods. But it is also profitable for highly motivated and selected students in that total immersion into one topic and a large amount of collective work bring them up to speed with a reasonable basis and the option to write their thesis on that topic. Hopefully, I will see some of those students next year in Warwick working on some Bayesian analysis problem!

On a personal basis, I also enjoyed my time in Oxford very much, first for meeting with old friends, albeit too briefly, and second for cycling, as the owner of the great Airbnb place I rented kindly let me use her bike, which allowed me to go around quite freely! Even on a train trip to Reading. As it was a road racing bike, it took me a trip or two to get used to it, especially on the first day when the roads were somewhat icy, but I enjoyed its lightness, relative to my lost mountain bike, to the point of considering switching to a road bike for my next bike… I also had some apprehension about riding at night, which I avoid while in Paris, but got over it, until the very last night when I had a very close brush with a car entering from a side road, which either had not seen me or thought I would let it pass. Gave me the opportunity of shouting Oï!

## last Big MC [seminar] before summer [June 19, 3pm]

Posted in pictures, Statistics, University life on June 17, 2014 by xi'an

Last session of our Big’MC seminar at Institut Henri Poincaré this year, on Thursday, June 19, with

Chris Holmes (Oxford) at 3pm on

Robust statistical decisions via re-weighted Monte Carlo samples

and Pierre Pudlo (iC3M, Université de Montpellier 2) at 4:15pm on [our joint work]

ABC and machine learning

## lazy ABC

Posted in Books, Statistics, University life on June 9, 2014 by xi'an

“A more automated approach would be useful for lazy versions of ABC SMC algorithms.”

Dennis Prangle just arXived the work on lazy ABC he had presented in Oxford at the i-like workshop a few weeks ago. The idea behind the paper is to cut down massively on the generation of pseudo-samples that are “too far” from the observed sample. This is formalised through a stopping rule that sets the estimated likelihood to zero with probability 1-α(θ,x) and otherwise divides the original ABC estimate by α(θ,x), which makes the modification unbiased when compared with basic ABC. The gain in efficiency appears when α(θ,x) can be computed much faster than producing the entire pseudo-sample and its distance to the observed sample. When considering an approximation to the asymptotic variance of this modification, Dennis derives an optimal (in the sense of the effective sample size), if formal, version of the acceptance probability α(θ,x), conditional on the choice of a “decision statistic” φ(θ,x). And of an importance function g(θ). (I do not get his Remark 1 about the case when π(θ)/g(θ) only depends on φ(θ,x), since the latter also depends on x. Unless one considers a multivariate φ which contains π(θ)/g(θ) itself as a component.) This approach requires estimating

$\mathbb{P}(d(S(Y),S(y^o))<\epsilon|\varphi)$

as a function of φ: I would have thought (non-parametric) logistic regression a good candidate for this estimation, but Dennis is rather critical of this solution.

I added the quote above as I find it somewhat ironic: at this stage, to enjoy laziness, the algorithm first has to go through a massive calibration stage, from the selection of the subsample [to be simulated before computing the acceptance probability α(θ,x)], to the construction of the (somewhat mysterious) decision statistic φ(θ,x), to the estimation of the terms composing the optimal α(θ,x). The most natural choice of φ(θ,x) seems to involve subsampling, still with a wide range of possibilities and ensuing efficiencies. (The choice found in the application is somewhat anticlimactic in this respect.) In most ABC applications, I would suggest using a quick & dirty approximation of the distribution of the summary statistic.
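The unbiasedness mechanism itself is simple enough to sketch numerically. The following toy example is entirely my own construction (Gaussian model, prior used as importance function, a hand-tuned continuation probability based on the first pseudo-observation, none of it Dennis’s optimal calibration): stop early with probability 1-α, and reweight survivors by 1/α so that the lazy estimator matches basic ABC in expectation.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy setting (illustration only): data y ~ N(theta, 1) with n = 100,
# prior theta ~ N(0, 10), summary = sample mean, observed summary s_obs.
n, s_obs, eps = 100, 0.8, 0.05

def lazy_abc(n_prop=50000, alpha_floor=0.05):
    """Lazy-ABC sketch: cheap partial simulation -> continuation
    probability alpha -> importance weight 1/alpha restores unbiasedness."""
    thetas, weights = [], []
    for _ in range(n_prop):
        theta = rng.normal(0.0, np.sqrt(10.0))    # prior draw (g = prior here)
        first = rng.normal(theta, 1.0)            # cheap partial simulation
        # hand-tuned continuation probability (phi = first pseudo-observation);
        # any alpha in (0,1] keeps the estimator unbiased
        alpha = max(alpha_floor, min(1.0, float(np.exp(-0.5 * (first - s_obs)**2))))
        if rng.uniform() > alpha:
            continue                              # stop early: contributes weight 0
        rest = rng.normal(theta, 1.0, n - 1)      # only now finish the simulation
        s = (first + rest.sum()) / n
        if abs(s - s_obs) < eps:
            thetas.append(theta)
            weights.append(1.0 / alpha)           # compensate the early stopping
    return np.array(thetas), np.array(weights)

th, w = lazy_abc()
print(np.sum(w * th) / np.sum(w))   # weighted posterior-mean estimate, near 0.8
```

The saving comes from the `continue` branch: most prior draws only cost one pseudo-observation instead of a hundred, while the 1/α weights keep the weighted output comparable to plain ABC.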

A slight point of perplexity about this “lazy” proposal is the static role of ε, which is impractical because it is not set in stone… As discussed several times here, the tolerance is a function of many factors, including all the calibration parameters of the lazy ABC, rather than an absolute quantity. The paper is rather terse on this issue (see Section 4.2.2). It seems to me that playing with a large collection of tolerances may be too costly in this setting.