## NIPS 2014

Posted in Kids, pictures, Statistics, Travel, University life with tags , , , , , , , , , , , on December 15, 2014 by xi'an

Second and last day of the NIPS workshops! The collection of topics was quite broad and would have made my choosing an ordeal, except that I was invited to give a talk at the probabilistic programming workshop, solving my dilemma… The first talk by Kathleen Fisher was quite enjoyable in that it gave a conceptual discussion of the motivations for probabilistic languages, drawing an analogy with the early days of computer programming that saw a separation between higher level computer languages and machine programming, with a compiler interface. And calling for a similar separation between the models faced by statistical inference and machine-learning and the corresponding code, if I understood her correctly. This was connected with Frank Wood’s talk of the previous day where he illustrated the concept through a generation of computer codes to approximately generate from standard distributions like Normal or Poisson. Approximately as in ABC, which is why the organisers invited me to talk in this session. However, I was a wee bit lost in the following talks and presumably lost part of my audience during my talk, as I realised later to my dismay when someone told me he had not perceived the distinction between the trees in the random forest procedure and the phylogenetic trees in the population genetic application. Still, while it had for me a sort of Twilight Zone feeling of having stepped in another dimension, attending this workshop was an worthwhile experiment as an eye-opener into a highly different albeit connected field, where code and simulator may take the place of a likelihood function… To the point of defining Hamiltonian Monte Carlo directly on the former, as Vikash Mansinghka showed me at the break.

I completed the day with the final talks in the variational inference workshop, if only to get back on firmer ground! Apart from attending my third talk by Vikash in the conference (but on a completely different topic on variational approximations for discrete particle-ar distributions), a talk by Tim Salimans linked MCMC and variational approximations, using MCMC and HMC to derive variational bounds. (He did not expand on the opposite use of variational approximations to build better proposals.) Overall, I found these two days and my first NIPS conference quite exciting, if somewhat overpowering, with a different atmosphere and a different pace compared with (small or large) statistical meetings. (And a staggering gender imbalance!)

## recycling accept-reject rejections (#2)

Posted in R, Statistics, University life with tags , , , , , , on July 2, 2014 by xi'an

Following yesterday’s post on Rao’s, Liu’s, and Dunson’s paper on a new approach to intractable normalising constants, and taking advantage of being in Warwick, I tested the method on a toy model, namely the posterior associated with n Student’s t observations with unknown location parameter μ and a flat prior,

$x_1,\ldots,x_n \sim p(x|\mu) \propto \left[ 1+(x-\mu)^2/\nu \right]^{-(\nu+1)/2}$

which is “naturally” bounded by a Cauchy density with scale √ν. The constant M is then easily derived and running the new algorithm follows from a normal random walk proposal targeting the augmented likelihood (R code below).

As shown by the above graph, the completion-by-rejection scheme produces a similar outcome (tomato) as the one based on the sole observations (steelblue). With a similar acceptance rate. However, the computing time is much much degraded:

> system.time(g8())
user  system elapsed
53.751   0.056  54.103
> system.time(g9())
user  system elapsed
1.156   0.000   1.161


when compared with the no-completion version. Here is the entire R code that produced both MCMC samples: Continue reading

## R/Rmetrics in Paris [alas!]

Posted in Mountains, pictures, R, Statistics, Travel, University life with tags , , , , , , , , , , , , , , , , , , , on June 30, 2014 by xi'an

Today I gave a talk on Bayesian model choice in a fabulous 13th Century former monastery in the Latin Quarter of Paris… It is the Collège des Bernardins, close to Jussieu and Collège de France, unbelievably hidden to the point I was not aware of its existence despite having studied and worked in Jussieu since 1982… I mixed my earlier San Antonio survey on importance sampling approximations to Bayes factors with an entry to our most recent work on ABC with random forests. This was the first talk of the 8th R/Rmetrics workshop taking place in Paris this year. (Rmetrics is aiming at aggregating R packages with econometrics and finance applications.) And I had a full hour and a half to deliver my lecture to the workshop audience. Nice place, nice people, new faces and topics (and even andouille de Vire for lunch!): why should I complain with an alas in the title?!What happened is that the R/Rmetrics meetings have been till this year organised in Meielisalp, Switzerland. Which stands on top of Thuner See and… just next to the most famous peaks of the Bernese Alps! And that I had been invited last year but could not make it… Meaning I lost a genuine opportunity to climb one of my five dream routes, the Mittelegi ridge of the Eiger. As the future R/Rmetrics meetings will not take place there.

A lunch discussion at the workshop led me to experiment the compiler library in R, library that I was unaware of. The impact on the running time is obvious: recycling the fowler function from the last Le Monde puzzle,

> bowler=cmpfun(fowler)
> N=20;n=10;system.time(fowler(pred=N))
user  system elapsed
52.647   0.076  56.332
> N=20;n=10;system.time(bowler(pred=N))
user  system elapsed
51.631   0.004  51.768
> N=20;n=15;system.time(bowler(pred=N))
user  system elapsed
51.924   0.024  52.429
> N=20;n=15;system.time(fowler(pred=N))
user  system elapsed
52.919   0.200  61.960


shows a ten- to twenty-fold gain in system time, if not in elapsed time (re-alas!).