Archive for MCMC algorithms

Advances in scalable Bayesian computation [day #4]

Posted in Books, Mountains, pictures, R, Statistics, University life with tags , , , , , , , , , , , , , , , , , on March 7, 2014 by xi'an

polyptych painting within the TransCanada Pipeline Pavilion, Banff Centre, Banff, March 21, 2012Final day of our workshop Advances in Scalable Bayesian Computation already, since tomorrow morning is an open research time ½ day! Another “perfect day in paradise”, with the Banff Centre campus covered by a fine snow blanket, still falling…, and making work in an office of BIRS a dream-like moment.

Still looking for a daily theme, parallelisation could be the right candidate, even though other talks this week went into parallelisation issues, incl. Steve’s talk yesterday. Indeed, Anthony Lee gave a talk this morning on interactive sequential Monte Carlo, where he motivated the setting by a formal parallel structure. Then, Darren Wilkinson surveyed the parallelisation issues in Monte Carlo, MCMC, SMC and ABC settings, before arguing in favour of a functional language called Scala. (Neat entries to those topics can be found on Darren’s blog.) And in the afternoon session, Sylvia Frühwirth-Schnatter exposed her approach to the (embarrassingly) parallel problem, in the spirit of Steve’s , David Dunson’s and Scott’s (a paper posted on the day I arrived in Chamonix and hence I missed!). There was plenty to learn from that talk (do not miss the Yin-Yang moment at 25 mn!), but it also helped me to break a difficulty I had with the consensus Bayes representation for two weeks (more on that later!). And, even though Marc Suchard mostly talked about flu and trees in a very pleasant and broad talk, he also had a slide on parallelisation to fit the theme! Although unrelated with parallelism,  Nicolas Chopin’s talk was on sequential quasi-Monte Carlo algorithms: while I had heard previous versions of this talk in Chamonix and BigMC, I found it full of exciting stuff. And it clearly got the room truly puzzled by this possibility, in a positive way! Similarly, Alex Lenkoski spoke about extreme rain events in Norway with no trace of parallelism, but the general idea behind the examples was to question the notion of the calibrated Bayesian (with possible connections with the cut models).

This has been a wonderful week and I am sure the participants got as much as I did from the talks and the informal exchanges. Thanks to BIRS for the sponsorship and the superb organisation of the week (and to the Banff Centre for providing such a paradisical environment). I feel very privileged to have benefited from this support, even though I deadly hope to be back in Banff within a few years.

ABC for bivariate betas

Posted in Statistics, University life with tags , , , , , , , on February 19, 2014 by xi'an

Crakel and Flegal just arXived a short paper running ABC for doing inference on the parameters of two families of bivariate betas. And I could not but read it thru. And wonder why ABC was that necessary to handle the model. The said bivariate betas are defined from




U_i\sim \text{Ga}(\delta_i,1)


X_1=V_1/(1+V_1)\,,\ X_2=V_2/(1+V_2)

This makes each term in the pair Beta and the two components dependent. This construct was proposed by Arnold and Ng (2011). (The five-parameter version cancels the gammas for i=3,4,5.)

Since the pdf of the joint distribution is not available in closed form, Crakel and Flegal zoom on ABC-MCMC as the method of choice and discuss simulation experiments. (The choice of the tolerance ε as an absolute rather than relative value, ε=0.2,0.0.6,0.8, puzzles me, esp. since the distance between the summary statistics is not scaled.) I however wonder why other approaches are impossible. (Or why it is necessary to use this distribution to model correlated betas. Unless I am confused copulas were invented to this effect.) First, this is a latent variable model, so latent variables could be introduced inside an MCMC scheme. A wee bit costly but feasible. Second, several moments of those distributions are known so a empirical likelihood approach could be considered.

MCKSki 5, where willst thou be?!

Posted in Mountains, Statistics, Travel, University life with tags , , , , , , , , , on February 10, 2014 by xi'an

ridge7[Here is a call from the BayesComp Board for proposals for earlier poll on the ‘Og helped shape the proposal, with the year, 2016 vs. 2017, remaining open. I just added town to resort below as it did not sound from the poll people were terribly interested in resorts.]

The Bayesian Computation Section of ISBA is soliciting proposals to host its flagship conference:

Bayesian Computing at MCMSki

The expectation is that the meeting will be held in January 2016, but the committee will consider proposals for other times through January 2017.

This meeting will be the next incarnation of the popular MCMSki series that addresses recent advances in the theory and application of Bayesian computational methods such as MCMC, all in the context of a world-class ski resort/town. While past meetings have taken place in the Alps and the Rocky Mountains, we encourage applications from any venue that could support MCMSki. A three-day meeting is planned, perhaps with an additional day or two of satellite meetings and/or short courses.

One page proposals should address feasibility of hosting the meeting including

1. Proposed dates.
2. Transportation for international participants (both the proximity of international airports and transportation to/from the venue).
3. The conference facilities.
4. The availability and cost of hotels, including low cost options.
5. The proposed local organizing committee and their collective experience organizing international meetings.
6. Expected or promised contributions from the host organization, host country, or industrial partners towards the cost of running the meetings.

Proposals should be submitted to David van Dyk (dvandyk, BayesComp Program Chair) at no later than May 31, 2014.

The Board of Bayesian Computing Section will evaluate the proposals, choose a venue, and appoint the Program Committee for Bayesian Computing at MCMSki.

simulating determinantal processes

Posted in Statistics, Travel with tags , , , , , , , , , , on December 6, 2013 by xi'an

In the plane to Atlanta, I happened to read a paper called Efficient simulation of the Ginibre point process by Laurent Decreusefond, Ian Flint, and Anaïs Vergne (from Telecom Paristech). “Happened to” as it was a conjunction of getting tipped by my new Dauphine colleague (and fellow blogger!) Djalil Chaffaï about the paper, having downloaded it prior to departure, and being stuck in a plane (after watching the only Chinese [somewhat] fantasy movie onboard, Saving General Yang).

This is mostly a mathematics paper. While indeed a large chunk of it is concerned with the rigorous definition of this point process in an abstract space, the last part is about simulating such processes. They are called determinantal (and not detrimental as I was tempted to interpret on my first read!) because the density of an n-set (x1x2,…,xn) is given by a kind of generalised Vandermonde determinant

p(x_1,\ldots,x_n) = \dfrac{1}{n!} \text{det} \left( T(x_i,x_j) \right)

where T is defined in terms of an orthonormal family,

T(x,y) = \sum_{i=1}^n \psi_i(x) \overline{\psi_i(y)}.

(The number n of points can be simulated via an a.s. finite Bernoulli process.) Because of this representation, the sequence of conditional densities for the xi‘s (i.e. x1, x2 given x1, etc.) can be found in closed form. In the special case of the Ginibre process, the ψi‘s are of the form

\psi_i(z) =z^m \exp\{-|z|^2/2\}/\sqrt{\pi m!}

and the process cannot be simulated for it has infinite mass, hence an a.s. infinite number of points. Somehow surprisingly (as I thought this was the point of the paper), the authors then switch to a truncated version of the process that always has a fixed number N of points. And whose density has the closed form

p(x_1,\ldots,x_n) = \dfrac{1}{\pi^N} \prod_i \frac{1}{i!} \exp\{-|z_i|^2/2\}\prod_{i<j} |z_i-z_j|^2

It has an interestingly repulsive quality in that points cannot get close to one another. (It reminded me of the pinball sampler proposed by Kerrie Mengersen and myself at one of the Valencia meetings and not pursued since.) The conclusion (of this section) is anticlimactic, though,  in that it is known that this density also corresponds to the distribution of the eigenvalues of an Hermitian matrix with standardized complex Gaussian entries. The authors mentions that the fact that the support is the whole complex space Cn is a difficulty, although I do not see why.

The following sections of the paper move to the Ginibre process restricted to a compact and then to the truncated Ginibre process restricted to a compact, for which the authors develop corresponding simulation algorithms. There is however a drag in that the sequence of conditionals, while available in closed-form, cannot be simulated efficiently but rely on a uniform accept-reject instead. While I am certainly missing most of the points in the paper, I wonder if a Gibbs sampler would not be an interesting alternative given that the full (last) conditional is a Gaussian density…

Posterior expectation of regularly paved random histograms

Posted in Books, Statistics, University life with tags , , , , , , , , , on October 7, 2013 by xi'an

Today, Raazesh Sainudiin from the University of Canterbury, in Christchurch, New Zealand, gave a seminar at CREST in our BIP (Bayesians in Paris) seminar series. Here is his abstract:

We present a novel method for averaging a sequence of histogram states visited by a Metropolis-Hastings Markov chain whose stationary distribution is the posterior distribution over a dense space of tree-based histograms. The computational efficiency of our posterior mean histogram estimate relies on a statistical data-structure that is sufficient for non-parametric density estimation of massive, multi-dimensional metric data. This data-structure is formalized as statistical regular paving (SRP). A regular paving (RP) is a binary tree obtained by selectively bisecting boxes along their first widest side. SRP augments RP by mutably caching the recursively computable sufficient statistics of the data. The base Markov chain used to propose moves for the Metropolis-Hastings chain is a random walk that data-adaptively prunes and grows the SRP histogram tree. We use a prior distribution based on Catalan numbers and detect convergence heuristically. The L1-consistency of the the initializing strategy over SRP histograms using a data-driven randomized priority queue based on a generalized statistically equivalent blocks principle is proved by bounding the Vapnik-Chervonenkis shatter coefficients of the class of SRP histogram partitions. The performance of our posterior mean SRP histogram is empirically assessed for large sample sizes simulated from several multivariate distributions that belong to the space of SRP histograms.

The paper actually appeared in the special issue of TOMACS Arnaud Doucet and I edited last year. It is coauthored by Dominic Lee, Jennifer Harlow and Gloria Teng. Unfortunately, Raazesh could not connect to our video-projector. Or fortunately as he gave a blackboard talk that turned to be fairly intuitive and interactive.


Get every new post delivered to your Inbox.

Join 549 other followers