## future of computational statistics

Posted in Books, pictures, R, Statistics, University life on September 29, 2014 by xi'an

I am currently preparing a survey paper on the present state of computational statistics, reflecting on the massive evolution of the field since my early Monte Carlo simulations on an Apple //e, which would take a few days to return a curve of approximate expected squared error losses… It seems to me that MCMC is attracting more attention nowadays than in the past decade, both because of methodological advances linked with better theoretical tools, as for instance in the handling of stochastic processes, and because of new forays in accelerated computing via parallel and cloud computing. The breadth and quality of talks at MCMski IV is testimony to this. A second trend that is not unrelated to the first one is the development of new techniques and the rehabilitation of older ones to handle complex models by approximations, witness ABC, Expectation-Propagation, variational Bayes, etc. With a corollary being a healthy questioning of the models themselves. As illustrated for instance in Chris Holmes’ talk last week. While those simplifications are inevitable when faced with hardly imaginable levels of complexity, I still remain confident about the “inevitability” of turning statistics into an “optimize+penalize” tunnel vision… A third characteristic is the emergence of new languages and meta-languages intended to handle complexity both of problems and of solutions towards a wider audience of users. STAN obviously comes to mind. And JAGS. But it may be that another scale of language is now required…

If you have any suggestions of novel directions in computational statistics, or instead of dead ends, I would be most interested in hearing them! So please do comment or send emails to my gmail address bayesianstatistics

## BAYSM’14 recollection

Posted in Books, Kids, pictures, Statistics, Travel, University life, Wines on September 23, 2014 by xi'an

When I got invited to BAYSM’14 last December, I was quite excited to be part of the event. (And to have the opportunity to be in Austria, in Wien and on the new WU campus!) And most definitely and a posteriori I have not been disappointed given the high expectations I had for that meeting…! The organisation was seamless, even by Austrian [high] standards, the program diverse and innovative, if somewhat brutal for older Bayesians, and the organising committee (Angela Bitto, Gregor Kastner, and Alexandra Posekany) deserves an ISBA recognition award [yet to be created!] for their hard work and dedication. Thanks also to Sylvia Frühwirth-Schnatter for hosting the meeting in her university. They set the standard very high for the next BAYSM organising team. (To be held in Firenze/Florence, on June 19-21, 2016, just prior to the ISBA World meeting not taking place in Banff. A great idea to associate with a major meeting, in order to save on travel costs. Maybe the following BAYSM will take place in Edinburgh! Young, local, and interested Bayesians just have to contact the board of BAYS with proposals.)

So, very exciting and diverse. A lot of talks in applied domains, esp. economics and finance, in connection with the themes of the host institution, WU. Among the talks most related to my areas of interest, I was pleased to see Matthew Simpson working on interweaving MCMC with Vivek Roy and Jarad Niemi, Madhura Killedar constructing her own kind of experimental ABC on galaxy clusters, Kathrin Plankensteiner using Gaussian processes on accelerated test data, Julyan Arbel explaining modelling by completely random measures for hazard mixtures [and showing his filiation with me by (a) adapting my pun title to his talk, (b) adding an unrelated mountain picture to the title page, (c) including a picture of a famous probabilist, Paul Lévy, in his introduction of Lévy processes and (d) using xkcd strips], Ewan Cameron considering future ABC for malaria modelling, Konstantinos Perrakis working on generic importance functions in data augmentation settings, Markus Hainy presenting his likelihood-free design (that I commented on a while ago), Kees Mulder explaining how to work with the circular von Mises distribution. Not to mention the numerous posters I enjoyed over the first evening. And my student Clara Grazian, who talked about our joint and current work on Jeffreys priors for mixtures of distributions. Whose talk led me to think of several extensions…

Besides my trek through past and current works of mine dealing with mixtures, the plenary sessions for mature Bayesians were given by Mike West and Chris Holmes, who gave very different talks but with the similar message that data was catching up with modelling, and with a vengeance, and that we [or rather young Bayesians] needed to deal with this difficulty. And use approximate or proxy models. Somewhat in connection with my last part on an alternative to Bayes factors, Mike also mentioned a modification of the factor in order to attenuate the absorbing impact of long time series. And Chris re-set Bayesian analysis within decision theory, constructing approximate models by incorporating the loss function as a substitute for the likelihood.

Once again, a terrific meeting in a fantastic place with a highly unusual warm spell. Plus enough time to run around Vienna and its castles and churches. And enjoy local wines (great conference evening at a Heuriger, where we did indeed experience Gemütlichkeit.) And museums. Wunderbar!

## Bangalore workshop [ಬೆಂಗಳೂರು ಕಾರ್ಯಾಗಾರ] and new book

Posted in Books, pictures, R, Statistics, Travel, University life on August 13, 2014 by xi'an

On the last day of the IFCAM workshop in Bangalore, Marc Lavielle from INRIA presented a talk on mixed effects where he illustrated his original computer language Monolix. And mentioned that his CRC Press book on Mixed Effects Models for the Population Approach was out! (Appropriately listed as out on a 14th of July on amazon!) He actually demonstrated the abilities of Monolix live and on diabetes data provided by an earlier speaker from Kolkata, which was a perfect way to start a collaboration! Nice cover (which is all I saw from the book at this stage!) that maybe will induce candidates to write a review for CHANCE. Estimation of those mixed effect models relies on stochastic EM algorithms developed by Marc Lavielle and Éric Moulines in the 90’s, as well as MCMC methods.

Posted in Statistics, University life on July 14, 2014 by xi'an

Today, I took part in the thesis defence of Amandine Schreck at Telecom-ParisTech. I had commented a while ago on the Langevin algorithm for discontinuous targets she developed with co-authors from that school towards variable selection. The thesis also contains material on the equi-energy sampler that is worth mentioning. The algorithm relates to the Wang-Landau algorithm last discussed here for the seminars of Pierre and Luke in Paris, last month. The algorithm aims at facilitating the moves around the target density by favouring moves from one energy level to the next. As explained to me by Pierre once again after his seminar, the division of the space according to the target values is a way to avoid creating artificial partitions over the sampling space. A sort of Lebesgue version of Monte Carlo integration. The energy bands

$\{\theta;\ \underline{\pi}\le \pi(\theta) \le \overline{\pi}\}$

require the choice of a sequence of bounds on the density, values that are hardly available prior to the simulation of the target. The paper corresponding to this part of the thesis (and published in our special issue of TOMACS last year) thus considers the extension when the bounds are defined on the go, in an adaptive way. This could be achieved based on earlier simulations, using some quantiles of the observed values of the target, but this is a costly solution which requires keeping an ordered sample of the density values. (Is it that costly?!) Thus the authors prefer to determine the energy levels in a cheaper adaptive manner. Namely, through a Robbins-Monro/stochastic approximation type update of the bounds,

$\xi_{n+1,\alpha}=\xi_{n,\alpha}+\gamma_n\left(\alpha-\mathbb{I}\{\pi(\theta_n)\le \xi_{n,\alpha}\}\right)\,.$
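To make the update above concrete, here is a minimal sketch (in Python rather than R) of a Robbins-Monro recursion tracking an α-quantile from a stream of values. It is not the thesis' algorithm: the function name `robbins_monro_quantile`, the step size γ_n = c/n^κ, and the Exp(1) draws standing in for the values π(θ_n) are all assumptions made for illustration.

```python
import math
import random

def robbins_monro_quantile(values, alpha, xi0=0.0, c=1.0, kappa=0.7):
    """Track the alpha-quantile of a stream of values.

    Applies xi_{n+1} = xi_n + gamma_n * (alpha - 1{value_n <= xi_n}),
    with the (assumed) step size gamma_n = c / n**kappa.
    """
    xi = xi0
    for n, value in enumerate(values, start=1):
        gamma = c / n ** kappa
        indicator = 1.0 if value <= xi else 0.0
        xi += gamma * (alpha - indicator)
    return xi

# Hypothetical stand-in for pi(theta_n): i.i.d. Exp(1) values,
# whose alpha-quantile is known to be -log(1 - alpha).
rng = random.Random(42)
stream = [rng.expovariate(1.0) for _ in range(50_000)]
alpha = 0.5
print(robbins_monro_quantile(stream, alpha), -math.log(1.0 - alpha))
```

The recursion only stores the current bound, which is the point of the cheaper adaptive scheme: no ordered sample of past density values is kept. In an actual sampler the values would come from a Markov chain rather than from independent draws.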

My questions related to this part of the thesis were about the actual gain if any in computing time versus efficiency, the limitations in terms of curse of dimension and storage, the connections with the Wang-Landau algorithm and pseudo-marginal approximations, and the (degree of) likelihood of a universal and automatised adaptive equi-energy sampler.

## recycling accept-reject rejections (#2)

Posted in R, Statistics, University life on July 2, 2014 by xi'an

Following yesterday’s post on Rao’s, Lin’s, and Dunson’s paper on a new approach to intractable normalising constants, and taking advantage of being in Warwick, I tested the method on a toy model, namely the posterior associated with n Student’s t observations with unknown location parameter μ and a flat prior,

$x_1,\ldots,x_n \sim p(x|\mu) \propto \left[ 1+(x-\mu)^2/\nu \right]^{-(\nu+1)/2}$

which is “naturally” bounded by a Cauchy density with scale √ν. The constant M is then easily derived and running the new algorithm follows from a normal random walk proposal targeting the augmented likelihood (R code below).
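As an illustration of the bound (in Python, and not the code of the post), the sketch below simulates Student's t variates by accept-reject from Cauchy proposals with scale √ν and keeps the rejected proposals, which are the extra variables the completion scheme relies on. It assumes ν ≥ 1 and takes M = π√ν, the constant obtained when bounding the unnormalised t kernel by M times the normalised Cauchy density; the function names are hypothetical.

```python
import math
import random

def t_by_accept_reject(n, mu, nu, rng):
    """Draw n Student's t(nu) variates located at mu, keeping the rejections."""
    if nu < 1:
        raise ValueError("the Cauchy bound used here needs nu >= 1")
    scale = math.sqrt(nu)
    accepted, rejected = [], []
    while len(accepted) < n:
        # Cauchy(mu, sqrt(nu)) proposal, by inversion of the cdf.
        y = mu + scale * math.tan(math.pi * (rng.random() - 0.5))
        # f(y) / (M g(y)) for the t kernel f, Cauchy density g, M = pi * sqrt(nu).
        ratio = (1.0 + (y - mu) ** 2 / nu) ** (-(nu - 1.0) / 2.0)
        if rng.random() <= ratio:
            accepted.append(y)
        else:
            rejected.append(y)
    return accepted, rejected

def acceptance_probability(nu):
    """Z / M, where Z is the normalising constant of the t kernel."""
    log_z = 0.5 * math.log(nu * math.pi) + math.lgamma(nu / 2) - math.lgamma((nu + 1) / 2)
    return math.exp(log_z) / (math.pi * math.sqrt(nu))

rng = random.Random(1)
acc, rej = t_by_accept_reject(10_000, mu=0.0, nu=3.0, rng=rng)
print(len(acc) / (len(acc) + len(rej)), acceptance_probability(3.0))
```

The number of rejected proposals per observation is random and grows with ν, which gives some intuition for why carrying them along in the augmented likelihood is costly.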

As shown by the above graph, the completion-by-rejection scheme produces a similar outcome (tomato) as the one based on the sole observations (steelblue). With a similar acceptance rate. However, the computing time is much much degraded:

> system.time(g8())
user  system elapsed
53.751   0.056  54.103
> system.time(g9())
user  system elapsed
1.156   0.000   1.161


when compared with the no-completion version. The entire R code that produced both MCMC samples is given in the full version of this post.

## computational methods for statistical mechanics [day #4]

Posted in Mountains, pictures, Running, Statistics, Travel, University life on June 7, 2014 by xi'an

My last day at this ICMS workshop on molecular simulation started [with a double loop of Arthur's Seat thankfully avoiding the heavy rains of the previous night and then] with Chris Chipot‘s magistral entry to molecular simulation for proteins, with impressive slides and simulation movies, even though I could not follow the details to really understand the simulation challenges therein, just catching a few connections with earlier talks. A typical example of a cross-disciplinary gap, where the other discipline always seems to be stressing the “wrong” aspects. Although this is perfectly unrealistic, it would help immensely to prepare talks in pairs for such interdisciplinary workshops! Then Gersende Fort presented results about convergence and efficiency for the Wang-Landau algorithm. The idea is to find the optimal rate for updating the weights of the elements of the partition towards reaching the flat histogram in minimal time. Showing massive gains on toy examples. The next talk went back to molecular biology with Jérôme Hénin‘s presentation on improved adaptive biased sampling. With an exciting notion of orthogonality aiming at finding the slowest directions in the target and putting the computational effort there. He also discussed the tension between long single simulations and short repeated ones, echoing a long-running debate in the MCMC community. (He also had a slide with a picture of my first 1983 Apple IIe computer!) Then Antonietta Mira gave a broad perspective on delayed rejection and zero variance estimates. With impressive variance reductions (although some physicists then asked for reduction of order 10¹⁰!). Johannes Zimmer gave a beautiful maths talk on the connection between particle and diffusion limits (PDEs) and Wasserstein geometry and large deviations. (I did not get most of the talk, but it was nonetheless beautiful!) Bert Kappen concluded the day (and the workshop for me) by a nice introduction to control theory. Making connection between optimal control and optimal importance sampling. Which made me idly think of the following problem: what if control cannot be completely… controlled and hence involves a stochastic part? Presumably of little interest as the control would then be on the parameters of the distribution of the control.

“The alanine dipeptide is the fruit fly of molecular simulation.”

The example of this alanine dipeptide molecule was so recurrent during the talks that it justified the above quote by Michael Allen. Not that I am any more proficient about the point of studying this protein or of using it as a benchmark. Or in identifying the specifics of the challenges of molecular dynamics simulation. Not a criticism of the ICMS workshop obviously, but rather of my congenital difficulty with continuous time processes!!! So I do not return from Edinburgh with a new collaborative research project in molecular dynamics (if with more traditional prospects), albeit with the perception that a minimal effort could bring me to breach the vocabulary barrier. And maybe consider ABC ventures in those (new) domains. (Although I fear my talk on ABC did not impact most of the audience!)

## computational methods for statistical mechanics [day #3]

Posted in Mountains, pictures, Running, Statistics, Travel, University life on June 6, 2014 by xi'an

The third day [morn] at our ICMS workshop was dedicated to path sampling. And rare events. Much more into [my taste] Monte Carlo territory. The first talk by Rosalind Allen looked at reweighting trajectories that are not in an equilibrium or are missing the Boltzmann [normalizing] constant. Although the derivation against a calibration parameter looked like the primary goal rather than the tool for constant estimation. Again papers in J. Chem. Phys.! And a potential link with ABC raised by Antonietta Mira… Then Jonathan Weare discussed stratification. With a nice trick of expressing the normalising constants of the different terms in the partition as solution(s) of a Markov system

$v\mathbf{M}=v$

Because the stochastic matrix M is easier (?) to approximate. Valleau’s and Torrie’s umbrella sampling was a constant reference in this morning of talks. Arnaud Guyader’s talk was in the continuation of Toni Lelièvre’s introduction, which helped a lot in my better understanding of the concepts. Rephrasing things in more statistical terms. Like the distinction between equilibrium and paths. Or bias being importance sampling. Frédéric Cérou actually gave a sort of second part to Arnaud’s talk, using importance splitting algorithms. Presenting an algorithm for simulating rare events that sounded like an opposite nested sampling, where the goal is to get down the target, rather than up. Pushing particles away from a current level of the target function with probability ½. Michela Ottobre completed the series with an entry into diffusion limits in the Roberts-Gelman-Gilks spirit when the Markov chain is not yet stationary. In the transient phase thus.
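For readers unfamiliar with the Markov system vM = v mentioned above, here is a small Python sketch of how such a fixed point can be computed once a row-stochastic matrix M is available, using plain power iteration. This is a generic illustration, not the method presented in the talk; the function name `stationary_vector` and the 2×2 toy matrix are assumptions.

```python
def stationary_vector(M, tol=1e-12, max_iter=10_000):
    """Left fixed point v of a row-stochastic matrix M (v M = v), by power iteration."""
    k = len(M)
    v = [1.0 / k] * k  # start from the uniform distribution
    for _ in range(max_iter):
        # One step of v <- v M, followed by renormalisation to sum to one.
        new = [sum(v[i] * M[i][j] for i in range(k)) for j in range(k)]
        total = sum(new)
        new = [x / total for x in new]
        if max(abs(a - b) for a, b in zip(new, v)) < tol:
            return new
        v = new
    raise RuntimeError("power iteration did not converge")

# Hypothetical toy matrix: each row sums to one.
M = [[0.9, 0.1],
     [0.5, 0.5]]
print(stationary_vector(M))
```

In the stratification setting the entries of v would play the role of the weights (normalising constants) of the strata, with M itself estimated from simulations within each stratum, so v inherits the approximation error of M.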