Archive for Bristol

read paper [in Bristol]

Posted in Books, pictures, Statistics, Travel, University life on January 29, 2016 by xi'an

[Picture: Clifton & Durdham Downs, Bristol, Sept. 25, 2012]

I went to give a seminar in Bristol last Friday and I chose to present the testing with mixture paper. As we are busy working on the revision, I was eagerly looking for comments and criticisms that could strengthen this new version. As it happened, the (Bristol) Bayesian Cake (Reading) Club had chosen our paper for discussion, two weeks in a row!, hence the title!, and I got invited to join the group the morning prior to the seminar! This was, of course, most enjoyable and relaxed, including a home-made cake!, but also quite helpful in assessing our arguments in the paper. One point of contention or at least of discussion was the common parametrisation between the components of the mixture. Although all parametrisations are equivalent from a single component point of view, I can [almost] see why using a mixture with the same parameter value on all components may impose some unsuspected constraint on that parameter. Even when the parameter is the same moment for both components. This still sounds like a minor counterpoint in that the weight should converge to either zero or one and hence eventually favour the posterior on the parameter corresponding to the “true” model.
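To make the claim about the weight concentrating near zero or one more concrete, here is a minimal sketch, not the implementation used in the paper. It assumes two fully specified components, N(0,1) and N(1,1), with no shared parameter to estimate, a Beta(0.5, 0.5) prior on the weight α, and a plain grid approximation instead of MCMC. The function names `normal_pdf` and `weight_posterior` are made up for this illustration.

```python
import numpy as np

def normal_pdf(x, mean, sd=1.0):
    """Density of a N(mean, sd^2) distribution."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def weight_posterior(data, f1, f2, grid_size=999, a0=0.5):
    """Grid approximation to the posterior of alpha in alpha*f1 + (1-alpha)*f2,
    under a Beta(a0, a0) prior, for two fully specified densities (assumption)."""
    alpha = np.linspace(0.0, 1.0, grid_size + 2)[1:-1]  # interior points only
    d1, d2 = f1(data), f2(data)
    # Mixture log-likelihood evaluated at every alpha on the grid.
    mix = alpha[:, None] * d1[None, :] + (1.0 - alpha[:, None]) * d2[None, :]
    loglik = np.log(mix).sum(axis=1)
    logprior = (a0 - 1.0) * (np.log(alpha) + np.log1p(-alpha))
    logpost = loglik + logprior
    w = np.exp(logpost - logpost.max())  # subtract the max for numerical stability
    return alpha, w / w.sum()

rng = np.random.default_rng(0)
for n in (10, 100, 1000):
    x = rng.normal(0.0, 1.0, size=n)  # data simulated from the first component
    alpha, post = weight_posterior(
        x, lambda t: normal_pdf(t, 0.0), lambda t: normal_pdf(t, 1.0)
    )
    print(n, float((alpha * post).sum()))  # posterior mean of alpha
```

With data simulated from the first component, the posterior mean of α should move towards one as n grows. Simulating the data from a third distribution instead would be one cheap way of looking at the misspecified case.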

Another point that was raised during the discussion is the behaviour of the method under misspecification or for an M-open framework: when neither model is correct, does the weight still converge to the boundary associated with the closest model (as I believe) or does a convexity argument produce a non-zero weight as its limit (as hinted by one example in the paper)? I had thought very little about this and hence had just as little to argue, although this does not sound to me like the primary reason for conducting tests. Especially in a Bayesian framework. If one is uncertain about both models to be compared, one should have an alternative at the ready! Or use a non-parametric version, which is a direction we need to explore more deeply before deciding it is coherent and convergent!

A third point of discussion was my argument that mixtures allow us to rely on the same parameter and hence the same prior, whether proper or not, while Bayes factors are less clearly open to this interpretation. This was not uniformly accepted!

Thinking afresh about this approach also led me to broaden my perspective on the use of the posterior distribution of the weight(s) α: while previously I had taken those weights mostly as a proxy for the posterior probabilities, to be calibrated by pseudo-data experiments, as for instance in Figure 9, I now perceive them primarily as the portion of the data in agreement with the corresponding model [or hypothesis] and more importantly as a solution for staying away from a Neyman-Pearson-like decision. Or error evaluation. Usually, when asked about the interpretation of the output, my answer is to compare the behaviour of the posterior on the weight(s) with a posterior associated with a sample from each model. Which does sound somewhat similar to posterior predictives if the samples are simulated from the associated predictives. But the issue was not raised during the visit to Bristol, which possibly reflects on how unfrequentist the audience was [the Statistics group is], as it apparently accepted with no further ado the use of a posterior distribution as a soft assessment of the comparative fits of the different models. If not necessarily agreeing on the need for conducting hypothesis testing (especially in the case of the Pima Indian dataset!).

Sherlock [#3]

Posted in Books on March 14, 2015 by xi'an

After watching the first two seasons of the BBC TV Series Sherlock while at the hospital, I found myself looking forward to further adventures of Holmes and Watson and eventually “bought” the third season. And watched it over the past weekends. I liked it very much as this new season moved from the sheer depiction of Sherlock’s amazing powers to a quite ironic and self-parodic story, well in tune with a third season where the audience is now utterly familiar with the main characters. They all put on weight (mostly figuratively!), from Sherlock’s acknowledgement of his psychological shortcomings, to Mrs. Hudson’s revealing her drug trafficking past and expressing her dislike of Mycroft, to John Watson’s engagement and acceptance of Sherlock’s idiosyncrasies, making him the central character of the series in a sort of fatherly figure. Some new characters are also terrific, including Mary Morstan and the new archvillain, C.A. Magnussen. Paradoxically, this makes the detective part of the stories secondary, which is all for the best as, in my opinion, the plots are rather weak and the resolutions hardly rely on high intellectual powers, albeit always surprising. More sleuthing in the new season would be most welcome! As an aside, the wedding place sounded somewhat familiar to me, until I realised it was Goldney Hall, where the recent workshops I attended in Bristol took place.

high-dimensional stochastic simulation and optimisation in image processing [day #3]

Posted in pictures, Statistics, Travel, University life on August 31, 2014 by xi'an

Last and maybe most exciting day of the “High-dimensional Stochastic Simulation and Optimisation in Image Processing” workshop in Bristol as it was exclusively about simulation (MCMC) methods. Except my own talk on ABC. And Peter Green’s on consistency of Bayesian inference in non-regular models. The talks today were indeed about using convex optimisation devices to speed up MCMC algorithms with tools that were entirely new to me, like the Moreau transform discussed by Marcelo Pereyra. Or using auxiliary variables à la RJMCMC to bypass expensive Cholesky decompositions. Or optimisation steps from one dual space to the original space for the same reason. Or using pseudo-gradients on partly differentiable functions in the talk by Sylvain Le Corff on a paper commented earlier in the ‘Og. I particularly liked the notion of Moreau regularisation that leads to more efficient Langevin algorithms when the target is not regular enough. Actually, the discretised diffusion itself may be geometrically ergodic without the corrective step of the Metropolis-Hastings acceptance. This obviously begs the question of an extension to Hamiltonian Monte Carlo. And to multimodal targets, possibly requiring as many normalisation factors as there are modes. So, in fine, a highly informative workshop, with the perfect size and the perfect crowd (which happened to be predominantly French, albeit from a community I did not have the opportunity to practice previously). Massive kudos to Marcelo for putting this workshop together, esp. on a week where major happy family events should have kept him at home!
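As a rough illustration of Moreau regularisation within a Langevin scheme, here is a minimal sketch under assumptions of my own choosing, not the algorithm presented at the workshop. The target is the Laplace density proportional to exp(-|x|), whose potential |x| is not differentiable at zero. It is replaced with its Moreau envelope, whose gradient is (x - prox(x)) / λ, and the discretised diffusion is run without a Metropolis-Hastings correction. The step size, λ, and all function names are hypothetical choices.

```python
import numpy as np

def prox_abs(x, lam):
    """Proximal operator of g(x) = |x| with parameter lam (soft-thresholding)."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def grad_moreau_abs(x, lam):
    """Gradient of the Moreau envelope of |x|, i.e. (x - prox(x)) / lam.
    It equals x / lam near zero and sign(x) once |x| exceeds lam."""
    return (x - prox_abs(x, lam)) / lam

def smoothed_langevin(n_iter, step, lam, x0=0.0, seed=0):
    """Unadjusted Langevin chain targeting the Moreau-smoothed version of exp(-|x|)."""
    rng = np.random.default_rng(seed)
    chain = np.empty(n_iter)
    cur = x0
    for k in range(n_iter):
        noise = rng.standard_normal()
        # Euler discretisation of the Langevin diffusion, no accept/reject step.
        cur = cur - step * grad_moreau_abs(cur, lam) + np.sqrt(2.0 * step) * noise
        chain[k] = cur
    return chain

chain = smoothed_langevin(n_iter=50_000, step=0.05, lam=0.1)
print(chain.mean(), chain.var())
```

Since there is no corrective step, the chain targets an approximation of the Laplace distribution, with a bias driven by both the step size and λ.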

As the workshop ended in mid-afternoon, I had plenty of time for a long run with Florence Forbes down to the Avon river and back up among the deer of Ashton Court, avoiding most of the rain, all of the mountain bikes on a bike trail that sounded like trail running practice, and building enough of an appetite for the South Indian cooking of the nearby Thali Café. Brilliant!

high-dimensional stochastic simulation and optimisation in image processing [day #2]

Posted in pictures, Statistics, Travel, Uncategorized, University life, Wines on August 30, 2014 by xi'an

After a nice morning run down Leigh Woods and on the muddy banks of the Avon river, I attended a morning session on hyperspectral image non-linear modelling. A topic about which I knew nothing beforehand. Hyperspectral images are 3-D images made of several wavelengths to improve their classification as a mixture of several elements. The non-linearity is due to the multiple reflections from the ground as well as imperfections in the data collection. I found this new setting of clear interest, from using mixtures to exploring Gaussian processes and Hamiltonian Monte Carlo techniques on constrained spaces… Not to mention the “debate” about using Bayesian inference versus optimisation. It was overall a day of discovery as I am unaware of the image processing community (being the outlier in this workshop!) and of their techniques. The problems mostly qualify as partly linear high-dimensional inverse problems, with rather standard if sometimes hybrid MCMC solutions. (The day ended even more nicely with another long run in the fields of Ashton Court and a conference dinner by the river…)


high-dimensional stochastic simulation and optimisation in image processing [day #1]

Posted in pictures, Statistics, Travel, Uncategorized, University life, Wines on August 29, 2014 by xi'an

Even though I flew through Birmingham (and had to endure the fundamental randomness of trains in Britain), I managed to reach the “High-dimensional Stochastic Simulation and Optimisation in Image Processing” conference location (in Goldney Hall Orangery) in due time to attend the (second) talk by Christophe Andrieu. He started with an explanation of the notion of controlled Markov chain, which reminded me of our early and famous-if-unpublished paper on controlled MCMC. (The label “controlled” was inspired by Peter Green who pointed out to us the different meanings of controlled in French [meaning checked or monitored] and in English. We use it here in the English sense, obviously.) The main focus of the talk was on the stability of controlled Markov chains. With of course connections with our controlled MCMC of old, for instance the case of the coerced acceptance probability. Which happened to be not that stable! With the central tool being Lyapounov functions. (Making me wonder whether or not it would make sense to envision the meta-problem of adaptively estimating the adequate Lyapounov function from the MCMC outcome.)

As I had difficulties following the details of the convex optimisation talks in the afternoon, I eloped to work on my own and returned to the posters & wine session, where the small number of posters allowed for the proper amount of interaction with the speakers! Talking about the relevance of variational Bayes approximations and of possible tools to assess it, about the use of new metrics for MALA and of possible extensions to Hamiltonian Monte Carlo, about Bayesian modelling of fMRI and of possible applications of ABC in this framework. (No memorable wine to make the ‘Og!) Then a quick if reasonably hot curry and it was already bed-time after a rather long and well-filled day!

capture-recapture homeless deaths

Posted in Statistics, Travel, University life on August 28, 2014 by xi'an

[Picture: Paris and la Seine, from Pont du Garigliano, Oct. 20, 2011]

In the newspaper I grabbed in the corridor to my plane today (flying to Bristol to attend the SuSTaIn image processing workshop on “High-dimensional Stochastic Simulation and Optimisation in Image Processing” where I was kindly invited and most readily accepted the invitation), I found a two-page entry on estimating the number of homeless deaths using capture-recapture. Besides the sheer concern about the very high mortality rate among homeless persons (expected lifetime, 48 years; around 7000 deaths in France between 2008 and 2010) and the dreadful realisation that there is an increasing number of kids dying in the streets, I was obviously interested in this use of capture-recapture methods as I had briefly interacted with researchers from INED working on estimating the number of (living) homeless persons about 15 years ago. Glancing at the original paper once I had landed, I found alas no methodological innovation in the approach, which was based on the simplest maximum likelihood estimate. I wonder whether or not more advanced models and [Bayesian] methods of inference could [or should] be used on such data. Like introducing covariates in the process. For instance, when conditioning the probability of (cross-)detection on the cause of death.
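For readers unfamiliar with capture-recapture, the simplest two-list setting can be sketched as follows. This is only an illustration, assuming two independent lists of deaths. The post does not say which estimator the original paper used, so the Lincoln-Petersen form below (the maximum likelihood estimate, up to rounding, in the two-list hypergeometric model) and Chapman's bias-corrected variant are assumptions, and the counts are made up.

```python
def lincoln_petersen(n1, n2, m):
    """Two-list population size estimate.
    n1, n2: numbers of cases on each list; m: number of cases found on both."""
    if m <= 0:
        raise ValueError("the estimate is undefined without overlap between the lists")
    return n1 * n2 / m

def chapman(n1, n2, m):
    """Chapman's bias-corrected variant, defined even when m == 0."""
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1

# Hypothetical counts, NOT taken from the study mentioned in the post.
print(lincoln_petersen(400, 300, 60), chapman(400, 300, 60))
```

Introducing covariates such as the cause of death would amount to letting the detection probabilities behind `n1`, `n2` and `m` vary across strata, which this two-number summary cannot express.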

running MCMC for too long, and even longer…

Posted in Books, pictures, Running, Statistics, University life on October 23, 2013 by xi'an

[Picture: Clifton observatory, Clifton, Sept. 24, 2012]

Following my earlier post about the young astronomer who feared he was running his MCMC for too long, here is an update from his visit to my office this morning. This proved quite an instructive visit for both of us. (Disclaimer: the picture of an observatory seen from across Brunel’s suspension bridge in Bristol is, as earlier, completely unrelated to the young astronomer!)

First, the reason why he thought MCMC was running too long was that the acceptance rate was plummeting down to zero, whatever the random walk scale. The reason for this behaviour is that he was actually running a standard simulated annealing algorithm, hence observing the stabilisation of the Markov chain in one of the (global) modes of the target function. In that sense, he was right that the MCMC was run for “too long”, as there was nothing to expect once the mode had been reached and the temperature turned down to zero. So the algorithm was working correctly.
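The vanishing acceptance rate is easy to reproduce on a toy problem. Below is a minimal sketch, not the astronomer's code, assuming a one-dimensional target with a single mode at zero, a geometric cooling schedule, and a Gaussian random walk proposal. The function name and all tuning values are hypothetical.

```python
import numpy as np

def annealed_random_walk(log_target, x0, n_iter, scale, t0=1.0, cooling=0.999, seed=0):
    """Random walk Metropolis on exp(log_target / temp), with temp decreasing geometrically."""
    rng = np.random.default_rng(seed)
    x, lx = x0, log_target(x0)
    temp = t0
    accepted = np.zeros(n_iter, dtype=bool)
    for k in range(n_iter):
        prop = x + scale * rng.standard_normal()
        lp = log_target(prop)
        # Metropolis rule on the tempered target: uphill moves are always accepted,
        # downhill moves become ever less likely as temp shrinks.
        if np.log(rng.uniform()) < (lp - lx) / temp:
            x, lx = prop, lp
            accepted[k] = True
        temp *= cooling
    return x, accepted

log_target = lambda x: -0.5 * x ** 2  # toy target with a single mode at 0
x_final, acc = annealed_random_walk(log_target, x0=3.0, n_iter=20_000, scale=1.0)
print(acc[:2000].mean(), acc[-2000:].mean(), x_final)
```

Comparing the acceptance rate over the first and last iterations shows the chain freezing near the mode, which is the intended behaviour of an annealing scheme rather than a sign of a failing sampler.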

Second, the astronomy problem he considers had a rather complex likelihood, for which he substituted a distance between the (discretised) observed data and (discretised) simulated data, simulated conditional on the current parameter value. Now…does this ring a bell? If not, here is a three-letter clue: ABC… Indeed, the trick he had found to get around this likelihood calculation issue was to re-invent a version of ABC-MCMC! Except that the distance was re-introduced into a regular MCMC scheme as a substitute to the log-likelihood. And compared with the distance at the previous MCMC iteration. This is quite clever, even though this substitution suffers from a normalisation issue (that I already mentioned in the post about Holmes’ and Walker’s idea to turn loss functions into pseudo likelihoods). Regular ABC does not encounter this difficulty, obviously. I am still bemused by this reinvention of ABC from scratch!
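For contrast with the distance-as-log-likelihood substitute, here is a minimal sketch of an ABC-MCMC step with a hard tolerance, on a toy problem of my own choosing rather than the astronomy model. It assumes normal observations with unknown mean, the sample mean as summary statistic, a N(0, 10²) prior, and a symmetric random walk proposal. The tolerance `eps` and the other tuning values are arbitrary.

```python
import numpy as np

def abc_mcmc(y_obs, n_iter, eps, scale, prior_sd=10.0, seed=0):
    """ABC-MCMC for the mean of a N(theta, 1) sample, using the sample mean as summary."""
    rng = np.random.default_rng(seed)
    n, s_obs = len(y_obs), np.mean(y_obs)
    log_prior = lambda t: -0.5 * (t / prior_sd) ** 2
    theta = s_obs  # start where the distance is small (a convenience for this toy case)
    chain = np.empty(n_iter)
    for k in range(n_iter):
        prop = theta + scale * rng.standard_normal()
        z = rng.normal(prop, 1.0, size=n)  # pseudo-data simulated given the proposal
        # The distance is compared with the fixed tolerance eps,
        # not with the distance obtained at the previous iteration.
        close = abs(np.mean(z) - s_obs) <= eps
        # Symmetric proposal, so the remaining ratio only involves the prior.
        if close and np.log(rng.uniform()) < log_prior(prop) - log_prior(theta):
            theta = prop
        chain[k] = theta
    return chain

rng = np.random.default_rng(1)
y_obs = rng.normal(2.0, 1.0, size=50)  # toy observations
chain = abc_mcmc(y_obs, n_iter=20_000, eps=0.1, scale=0.5)
print(chain.mean(), chain.std())
```

Because the distance only enters through the indicator of falling within `eps`, no normalising constant is needed, unlike when the distance is plugged in as a pseudo log-likelihood.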

So we are now at a stage where my young friend will experiment with (hopefully) correct ABC steps, trying to derive the tolerance value from warmup simulations and use some of the accelerating tricks suggested by Umberto Picchini and Julie Forman to avoid simulating the characteristics of millions of stars for nothing. And we agreed to meet soon for an update. Indeed, a fairly profitable morning for both of us!

