Archive for London

mixtures are slices of an orange

Posted in Kids, R, Statistics on January 11, 2016 by xi'an

After presenting this work in both London and Lenzerheide, Kaniav Kamary, Kate Lee and I arXived and submitted our paper on a new parametrisation of location-scale mixtures. Although it took a long while to finalise the paper, given that we came up with the original and central idea about a year ago, I remain quite excited by this new representation of mixtures, because the use of a global location-scale (hyper-)parameter doubling as the mean and standard deviation of the mixture itself implies that all the other parameters of this mixture model [besides the weights] belong to the intersection of a unit hypersphere with a hyperplane. [Hence the title above, which I regretted not using for the poster at MCMskv!]

This realisation that using a (meaningful) hyperparameter (μ,σ) leads to a compact parameter space for the component parameters is important for inference in such mixture models, in that the hyperparameter (μ,σ) is easily estimated from the entire sample, while the other parameters can be studied using a non-informative prior like the Uniform prior on the ensuing compact space. This non-informative prior for mixtures is something I have been seeking for many years, hence my on-going excitement! In the mid-1990s, Kerrie Mengersen and I looked at a Russian doll type parametrisation that used the “first” component as defining the location-scale reference for the entire mixture, and expressed each new component as a local perturbation of the previous one. While this is a similar idea to the current one, it falls short of leading to a natural non-informative prior, forcing us to devise a proper prior on the variance that was a mixture of a Uniform U(0,1) and of an inverse Uniform 1/U(0,1), because of the lack of compactness of the parameter space. Here, fixing both mean and variance (or even just the variance) binds the mixture parameter to an ellipse conditional on the weights, a space that can be turned into the unit sphere via a natural reparameterisation. Furthermore, the intersection with the hyperplane leads to a closed-form spherical reparameterisation. Yay!
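
To make the geometry concrete, here is a minimal sketch in Python. It assumes one specific way of writing the constraint, namely component means μ+σγᵢ/√pᵢ and standard deviations σηᵢ/√pᵢ, under which the mixture has mean μ and standard deviation σ exactly when Σ√pᵢγᵢ=0 (the hyperplane) and Σ(γᵢ²+ηᵢ²)=1 (the unit hypersphere). The symbols γ, η and all function names are illustrative choices, not taken from the post or from the Ultimixt package.

```python
import math
import random


def sphere_point(weights, rng):
    """Draw (gamma, eta) on the unit sphere intersected with the hyperplane.

    Constraints (assumed form, see lead-in):
        sum_i sqrt(p_i) * gamma_i = 0
        sum_i (gamma_i**2 + eta_i**2) = 1,  eta_i >= 0
    """
    k = len(weights)
    root_p = [math.sqrt(p) for p in weights]  # unit vector when weights sum to 1
    gamma = [rng.gauss(0.0, 1.0) for _ in range(k)]
    eta = [abs(rng.gauss(0.0, 1.0)) for _ in range(k)]
    # Project gamma onto the hyperplane orthogonal to sqrt(p).
    dot = sum(g * r for g, r in zip(gamma, root_p))
    gamma = [g - dot * r for g, r in zip(gamma, root_p)]
    # Rescale jointly; this keeps the (homogeneous) hyperplane constraint.
    norm = math.sqrt(sum(g * g for g in gamma) + sum(e * e for e in eta))
    return [g / norm for g in gamma], [e / norm for e in eta]


def to_components(mu, sigma, weights, gamma, eta):
    """Map (mu, sigma, gamma, eta) to the usual component means and sds."""
    means = [mu + sigma * g / math.sqrt(p) for g, p in zip(gamma, weights)]
    sds = [sigma * e / math.sqrt(p) for e, p in zip(eta, weights)]
    return means, sds


def mixture_moments(weights, means, sds):
    """Mean and standard deviation of the mixture itself."""
    m = sum(p * a for p, a in zip(weights, means))
    second = sum(p * (s * s + a * a) for p, a, s in zip(weights, means, sds))
    return m, math.sqrt(second - m * m)
```

Feeding the output of `to_components` into `mixture_moments` should return the chosen (μ,σ) up to rounding, whatever point of the constrained sphere is drawn.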

While I do not wish to get into the debate about the [non-]existence of “non-informative” priors at this stage, I think being able to use the invariant reference prior π(μ,σ)=1/σ is quite neat here because the inference on the mixture parameters should be location and scale equivariant. The choice of the prior on the remaining parameters is of lesser importance, the Uniform over the compact space being one example, although we did not study this impact in depth, being satisfied with the outputs produced from the default (Uniform) choice.

From a computational perspective, the new parametrisation can be easily turned into the old parametrisation, hence leads to a closed-form likelihood. This implies a Metropolis-within-Gibbs strategy can be easily implemented, as we did in the derived Ultimixt R package. (I was not involved in its programming, solely suggesting the name Ultimixt from ultimate mixture parametrisation, a former title that we eventually dropped for the paper.)
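
As a rough illustration of why the closed-form likelihood makes such a sampler easy to write, here is a minimal Python sketch for a Gaussian mixture. It is not the Ultimixt code: it assumes the components are written as N(μ+σγᵢ/√pᵢ, (σηᵢ/√pᵢ)²), uses the prior 1/σ on (μ,σ) with a Uniform prior on the remaining parameters, and only shows the random-walk Metropolis update of μ and σ, with the weights and the (γ,η) parameters held fixed. All function names and the proposal scale are hypothetical.

```python
import math
import random


def log_mixture_likelihood(data, mu, sigma, weights, gamma, eta):
    """Gaussian mixture log-likelihood, after mapping back to means and sds."""
    means = [mu + sigma * g / math.sqrt(p) for g, p in zip(gamma, weights)]
    sds = [sigma * e / math.sqrt(p) for e, p in zip(eta, weights)]
    total = 0.0
    for x in data:
        dens = sum(
            p * math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))
            for p, m, s in zip(weights, means, sds)
        )
        if dens <= 0.0:  # numerical underflow
            return -math.inf
        total += math.log(dens)
    return total


def log_target(data, mu, sigma, weights, gamma, eta):
    """Log-posterior in (mu, sigma) up to a constant, with prior 1/sigma."""
    if sigma <= 0.0:
        return -math.inf
    return log_mixture_likelihood(data, mu, sigma, weights, gamma, eta) - math.log(sigma)


def update_mu_sigma(data, mu, sigma, weights, gamma, eta, scale, rng):
    """One Metropolis-within-Gibbs sweep over mu, then sigma."""
    current = log_target(data, mu, sigma, weights, gamma, eta)
    # Random walk on mu (symmetric proposal).
    prop = mu + scale * rng.gauss(0.0, 1.0)
    cand = log_target(data, prop, sigma, weights, gamma, eta)
    if -rng.expovariate(1.0) < cand - current:  # -Exp(1) is distributed as log(U)
        mu, current = prop, cand
    # Random walk on log(sigma); the proposal ratio is prop / sigma.
    prop = sigma * math.exp(scale * rng.gauss(0.0, 1.0))
    cand = log_target(data, mu, prop, weights, gamma, eta)
    if -rng.expovariate(1.0) < cand - current + math.log(prop) - math.log(sigma):
        sigma = prop
    return mu, sigma
```

A full sampler would add similar moves for the weights and for the constrained (γ,η) parameters, cycling over all blocks in turn.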

Discussing the paper at MCMskv was very helpful in that I got very positive feedback about the approach and better arguments to justify the approach and its appeal. And got to think about several extensions outside location-scale families, if not in higher dimensions, which remain a practical challenge (in the sense of designing a parametrisation of the covariance matrices in terms of the global covariance matrix).

animal picture of the year

Posted in Kids, pictures on December 31, 2015 by xi'an

delayed & robbed in London [CFE-CMStatistics 2015]

Posted in Kids, pictures, Statistics, Travel, University life, Wines on December 26, 2015 by xi'an

[Picture: London by Delta, Dec. 14, 2011]

Last Sunday, I gave a talk on delayed acceptance at the 9th International Conference on Computational and Financial Econometrics (CFE 2015), joint with CMStatistics 2015, in London. This was a worthwhile session, with other talks by Matias Quiroz, on subsampling strategies for large data, David Frazier, on our joint paper about the consistency of ABC algorithms, and James Ridgway, not on Pima Indians! And with a good-sized audience, especially when considering the number of parallel sessions (36!). Earlier that day, I also attended an equally interesting session on the calibration of misspecified Bayesian models, including talks by Peter Green [with a potential answer to the difficulty of parameters on the boundaries by adding orthogonal priors on those boundaries] and Julien Stoehr, calibrating composite likelihoods on Gaussian random fields. In the evening, I went to a pub I had last visited when my late friend Costas Goutis was still at UCL and later enjoyed a fiery hot rogan josh.

While I could have attended two more sessions the next morning, I took advantage of the nice café in the Gower Street Waterstones to work a few hours with co-authors (and drink a few litres of tea from real teapots). Despite this quite nice overall experience, the 36 parallel sessions and the 1600-plus attendees at the conference still make me wonder at the appeal of such a large conference and at the pertinence of giving a talk in parallel with so many other talks. And on about all aspects of statistics and econometrics. One JSM (or one NIPS) is more than enough! And given that many people only came to deliver their talk, there is very little networking between research teams or mentoring of younger colleagues, as far as I can tell. And no connection with a statistical society (it would be so nice if the RSS annual conference could only attract 1600 people!). Only a “CMStatistics working group” of which I discovered I was listed as a member [and asked for removal, so far with no answer]. Whose goals and actions are unclear, except to support Elsevier journals with special issues apparently constructed on the same pattern as this conference was organised, i.e., by asking people to take care [for free!] of gathering authors on a theme of their choice. And behind this “working group” an equally nebulous structure called ERCIM.

While the “robbed” in the title could be interpreted as wondering at the reason for paying such high registration fees (£250 for very early birds), I actually got robbed of my bicycle while away at the conference. Second bike stolen within a calendar year, quite an achievement! This was an old 1990 mountain bike I had bought in Cornell and carried back to France, in such a poor state that I could not imagine anyone stealing it. Wrong prior, obviously.

secondhand religion

Posted in Books, pictures, Travel on December 19, 2015 by xi'an

Optimization Monte Carlo: Efficient and embarrassingly parallel likelihood-free inference

Posted in Books, Statistics, Travel on December 16, 2015 by xi'an

Ted Meeds and Max Welling have not so recently written about an embarrassingly parallel approach to ABC that they call optimisation Monte Carlo. [Danke Ingmar for pointing out the reference to me.] They start from a rather innocuous rephrasing of the ABC posterior, writing the pseudo-observations as deterministic transforms of the parameter and of a vector of uniforms. Innocuous provided this does not involve an infinite number of uniforms, obviously. Then they suddenly switch to the perspective that, for a given uniform vector u, one should seek the parameter value θ that agrees with the observation y. A sort of Monte Carlo inverse regression: if

y=f(θ,u),

then invert this equation in θ. This is quite clever! Maybe closer to fiducial than true Bayesian statistics, since the prior does not occur directly [only as a weight p(θ)], but if this is manageable [and it all depends on the way f(θ,u) is constructed], this should perform better than ABC! After thinking about it a wee bit more in London, though, I realised this was close to impossible in the realistic examples I could think of. But I still like the idea and want to see if anything at all can be made of this…
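
For what it is worth, the inversion idea can be sketched on a toy case where it is trivially manageable. The Python sketch below is an assumption-laden illustration, not an example from the paper: it takes a single observation y from N(θ,1), writes the simulator as f(θ,u)=θ+Φ⁻¹(u), solves y=f(θ,u) in θ by bisection for each uniform u, and weights each solution by the prior p(θ). Since ∂f/∂θ=1 here, no Jacobian correction shows up. All function names are hypothetical.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def simulate(theta, u):
    """Deterministic simulator y = f(theta, u): a N(theta, 1) draw by inverse cdf."""
    return theta + STD_NORMAL.inv_cdf(u)


def invert(y_obs, u, lo=-50.0, hi=50.0, iters=60):
    """Solve f(theta, u) = y_obs in theta by bisection (f increases with theta)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if simulate(mid, u) < y_obs:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def inverted_sample(y_obs, prior, n, rng):
    """One theta per uniform u, weighted by the prior density p(theta)."""
    thetas, weights = [], []
    for _ in range(n):
        # Keep u strictly inside (0, 1) so that inv_cdf is defined.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        theta = invert(y_obs, u)
        thetas.append(theta)
        weights.append(prior.pdf(theta))  # |df/dtheta| = 1 in this toy model
    return thetas, weights


def weighted_mean(values, weights):
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)
```

With a N(0,4) prior, i.e. `NormalDist(0.0, 2.0)`, the exact posterior mean is 4y/5, so `weighted_mean(*inverted_sample(y, prior, n, rng))` can be checked against it. Each u is handled independently of the others, which is where the embarrassingly parallel label comes from.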

“However, it is hard to detect if our optimization succeeded and we may therefore sometimes reject samples that should not have been rejected. Thus, one should be careful not to create a bias against samples u for which the optimization is difficult. This situation is similar to a sampler that will not mix to remote local optima in the posterior distribution.”

Now, the paper does not go that way but keeps the ε-ball approach as in regular ABC, to derive an approximation of the posterior density. For a while I was missing the difference between the centre of the ball and the inverse of the above equation, bottom of page 3. But then I realised the former was an approximation to the latter. When the authors discuss their approximation in terms of the error ε, I remain unconvinced by the transfer of the tolerance to the optimisation error, as those are completely different notions. This also applies to the use of a Jacobian in the weight, which seems out of place since this Jacobian appears in a term associated with (or replacing) the likelihood, f(θ,u), which is then multiplied by the prior p(θ). (Assuming a Jacobian exists, which is unclear when considering most simulation patterns use hard bounds and indicators.) When looking at the toy examples, it however makes sense to have a Jacobian since the selected θ’s are transforms of the u’s. And the p(θ)’s are simply importance weights correcting for the wrong target. Overall, the appeal of the method proposed in the paper remains unclear to me. Most likely because I did not spend enough time over it.

delayed in London [CFE 2015]

Posted in pictures, Statistics, Travel, University life on December 13, 2015 by xi'an

[Picture: London by Delta, Dec. 14, 2011]

Today I am giving a talk at the 9th International Conference on Computational and Financial Econometrics (CFE 2015), in London. The number of parallel sessions there is astounding, which makes me [now] wonder at the appeal of such a large conference and the pertinence of giving a talk in parallel with so many other talks that I end up talking at the same time as Pierre Pudlo, who is presenting our ABC with random forest paper (in the twin CMStatistics 2015!). While I may sound overly pessimistic, or just peeved from missing the second day of workshops at NIPS!, there is no reason to doubt the quality of the talks, given the list of authors (and friends) there. So I am looking forward to seeing what I can get from this multipurpose econometrics and statistics conference.

Je reviendrai à Montréal [D-2]

Posted in pictures, Statistics, Travel, University life on December 9, 2015 by xi'an

I have spent the day and more completing and compiling slides for my contrapuntal perspective on probabilistic numerics, back in Montréal, for the NIPS 2015 workshop of December 11 on this theme. As I presume the kind invitation by the organisers was connected with my somewhat critical posts on the topic, I mostly […] The day after, I am flying back to London for the CFE (Computational and Financial Econometrics) workshop, somewhat reluctantly as there will be another NIPS workshop that day on scalable Monte Carlo.

I want to see again the long desert
Of streets that never end
That run to the far end of winter
With no trace of a footstep