**A** revealing question on X validated about a simulation concept that students (and others) have trouble getting to grips with. Namely, using auxiliary variates to simulate from a marginal distribution, since these auxiliary variables are later discarded and hence appear to them (the students) to be of no use at all. Even after being exposed to the accept-reject algorithm. Or to multiple importance sampling, in the sense that a realisation of a random variable can be associated with a whole series of densities in an importance weight, all of them being valid (but some more equal than others!).
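As a minimal sketch of why a discarded auxiliary variable is still useful, consider a textbook case that is not taken from the question itself: a Student t variate obtained by first drawing a Gamma precision and then a Normal variate given that precision. The function name `draw_student_t` and the choice of ν = 10 are assumptions made for illustration only.

```python
import math
import random

def draw_student_t(n, nu=10.0, seed=0):
    """Draw n values from a Student t with nu degrees of freedom through
    an auxiliary precision that is simulated, used once, then dropped."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        # auxiliary variable: precision tau ~ Gamma(shape nu/2, rate nu/2);
        # gammavariate takes (shape, scale), hence scale = 2/nu
        tau = rng.gammavariate(nu / 2.0, 2.0 / nu)
        # conditional draw: x | tau ~ N(0, 1/tau)
        x = rng.gauss(0.0, 1.0 / math.sqrt(tau))
        draws.append(x)  # tau is discarded: only x is kept
    return draws

sample = draw_student_t(100_000)
mean = sum(sample) / len(sample)
var = sum((x - mean) ** 2 for x in sample) / len(sample)
# the marginal variance of a t with nu = 10 is nu / (nu - 2) = 1.25
print(mean, var)
```

The precision never appears in the output, yet without it the t variate could not have been produced from Gamma and Normal generators alone.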

## Archive for auxiliary variables

## Why do we draw parameters to draw from a marginal distribution that does not contain the parameters?

Posted in Statistics with tags accept-reject algorithm, Animal Farm, auxiliary variables, cross validated, importance sampling, marginalisation, multiple importance methods, probability basics on November 3, 2019 by xi'an

## adaptive exchange

Posted in Books, Statistics, University life with tags adaptive MCMC methods, auxiliary variables, bias, doubly intractable problems, evolutionary Monte Carlo, JASA, Markov chain Monte Carlo algorithm, Monte Carlo Statistical Methods, normalising constant, perfect sampling, simulated annealing on October 27, 2016 by xi'an

**I**n the March 2016 issue of JASA that currently sits on my desk, there is a paper by Liang, Jin, Song and Liu on the adaptive exchange algorithm, which aims at handling posteriors for sampling distributions with intractable normalising constants. The concept behind the algorithm is the exchange principle initiated by Jesper Møller and co-authors in 2006, where an auxiliary pseudo-observation is simulated for the missing constants to vanish in a Metropolis-Hastings ratio. (The name *exchange* was introduced in a subsequent paper by Iain Murray, Zoubin Ghahramani and David MacKay, also in 2006.)

The crux of the method is to run an iteration as [where y denotes the observation]

- Propose a new value θ’ of the parameter from a proposal q(θ’|θ);
- Generate a pseudo-observation z~ƒ(z|θ’);
- Accept with probability min{1, [π(θ’) q(θ|θ’) ƒ(y|θ’) ƒ(z|θ)] / [π(θ) q(θ’|θ) ƒ(y|θ) ƒ(z|θ’)]} [π denoting the prior on θ]

which has the appeal of cancelling all normalising constants. And the drawback of requiring an *exact* simulation from the very distribution with the missing constant, ƒ(.|θ). Which means that in practice a *finite* number of MCMC steps will be used and will *bias* the outcome. The algorithm is unusual in that it replaces the exact proposal q(θ’|θ) with an unbiased random version q(θ’|θ)ƒ(z|θ’), z being just an augmentation of the proposal. (The current JASA paper by Liang et al. seems to confuse *augment* and *argument*, see p.378.)
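The three steps above can be sketched on a toy model, under assumptions that are mine rather than the paper's: observations are exponential with rate θ, the code only ever evaluates the unnormalised density exp(−θy) (pretending the constant is unavailable), the prior is Gamma(a, b), and the proposal is a log-normal random walk. The function name `exchange_sampler` and all tuning values are hypothetical. Exact simulation of the pseudo-observations is trivial here, which is precisely what fails in realistic doubly intractable problems.

```python
import math
import random

def exchange_sampler(y, n_iter=20000, a=1.0, b=1.0, step=0.3, seed=1):
    """Toy exchange algorithm: exponential data with rate theta, where
    only the unnormalised density exp(-theta * y) is ever evaluated."""
    rng = random.Random(seed)
    n, s_y = len(y), sum(y)

    def log_prior(t):  # Gamma(a, b) prior on theta, up to a constant
        return (a - 1.0) * math.log(t) - b * t

    theta, chain = 1.0, []
    for _ in range(n_iter):
        # 1. propose theta' from a log-normal random walk q(theta'|theta)
        prop = theta * math.exp(step * rng.gauss(0.0, 1.0))
        # 2. exact pseudo-observations z ~ f(.|theta'), same size as y
        s_z = sum(rng.expovariate(prop) for _ in range(n))
        # 3. log acceptance ratio, built from unnormalised densities only
        log_r = (log_prior(prop) - log_prior(theta)
                 + math.log(prop) - math.log(theta)  # q(theta|theta') / q(theta'|theta)
                 - (prop - theta) * s_y              # g(y|theta') / g(y|theta)
                 - (theta - prop) * s_z)             # g(z|theta) / g(z|theta')
        if rng.random() < math.exp(min(0.0, log_r)):
            theta = prop
        chain.append(theta)
    return chain

y = [0.3, 1.1, 0.2, 0.7, 0.5, 0.9, 0.1, 0.4, 1.6, 0.6]
chain = exchange_sampler(y)[2000:]  # drop a burn-in period
estimate = sum(chain) / len(chain)
exact = (1.0 + len(y)) / (1.0 + sum(y))  # conjugate Gamma posterior mean
print(estimate, exact)
```

Because the toy model is conjugate, the chain average can be checked against the exact Gamma posterior mean, a luxury that disappears as soon as the constant is genuinely intractable.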

To avoid the difficulty in simulating from ƒ(.|θ), the authors draw pseudo-observations from sampling distributions with a *finite* number m of parameter values under the [unrealistic] assumption (A⁰) that this collection of values provides an almost complete cover of the posterior support. One of the tricks rests on an auxiliary [time-heterogeneous] chain of pseudo-observations generated by single Metropolis steps from one of these m fixed targets. These pseudo-observations are then used in the main (or *target*) chain to define the above exchange probability. The auxiliary chain is Markov but time-heterogeneous since the probabilities of accepting a move are evolving with time according to a simulated annealing schedule. Which produces a convergent estimate of the m normalising constants. The main chain is not Markov in that it depends on the whole history of the auxiliary chain [see Step 5, p.380]. Even jointly the collection of both chains is not Markov. The paper prefers to consider the process as an adaptive Markov chain. I did not check the rather intricate details, so cannot judge the validity of the overall algorithm; I simply note that one condition (A², p.383) is incredibly strong in that it assumes the Markov transition kernel to be Doeblin uniformly on any compact set of the calibration parameters.

However, the major difficulty with this approach seems to be in its delicate calibration. From providing a reference set of m parameter values scanning the posterior support to picking transition kernels on both the parameter and the sample spaces, to properly cooling the annealing schedule [always a fun part!], there seems to be [from my armchair expert’s perspective, of course!] a wide range of opportunities for missing the target or running into zero acceptance problems.

Both examples analysed in the paper, the auto-logistic and the auto-normal models, are actually of limited complexity in that they depend on a few parameters, 2 and 4 resp., and enjoy sufficient statistics, of dimensions 2 and 4 as well. Hence simulating (pseudo-)realisations of those sufficient statistics should be less challenging than the original approach replicating an entire vector of thousands of dimensions.

## common derivation for Metropolis–Hastings and other MCMC algorithms

Posted in Books, pictures, Statistics, Travel, University life with tags auxiliary variables, directional sampling, Gibbs sampling, Hamiltonian Monte Carlo, Metropolis-Hastings algorithms, Metropolis-within-Gibbs algorithm, NUTS, pseudo-marginal MCMC, recursive proposals, RJMCMC, slice sampling, Sydney, UNSW on July 25, 2016 by xi'an

**K**hoa Tran and Robert Kohn from UNSW just arXived a paper on a comprehensive derivation of a large range of MCMC algorithms, beyond Metropolis-Hastings. The idea is to decompose the MCMC move into

- a random completion of the current value θ into V;
- a deterministic move T from (θ,V) to (ξ,W), where only ξ matters.

If this sounds like a new version of Peter Green’s completion at the core of his 1995 RJMCMC algorithm, it is because it is indeed essentially the same notion. The resort to this completion allows for a standard form of the Metropolis-Hastings algorithm, which leads to the correct stationary distribution if T is self-inverse. This representation covers Metropolis-Hastings algorithms, Gibbs sampling, Metropolis-within-Gibbs and auxiliary variables methods, slice sampling, recursive proposals, directional sampling, Langevin and Hamiltonian Monte Carlo, NUTS sampling, pseudo-marginal Metropolis-Hastings algorithms, and pseudo-marginal Hamiltonian Monte Carlo, as discussed by the authors. Given this representation of the Markov chain through a random transform, I wonder if Peter Glynn’s trick mentioned in the previous post on retrospective Monte Carlo applies in this generic setting (as it could considerably improve convergence…)
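To make the decomposition concrete, here is a minimal sketch of the simplest instance I can think of, random-walk Metropolis, written as a random completion followed by a self-inverse map. It is not the authors' general formulation: the names `swap` and `completion_mh` are hypothetical, the completion is a Gaussian centred at θ, and T is the plain swap (θ,V) → (V,θ), which is volume preserving so no Jacobian term appears.

```python
import math
import random

def swap(theta, v):
    """Deterministic move T(theta, v) = (v, theta); applying it twice
    returns the starting pair, so T is self-inverse."""
    return v, theta

def completion_mh(log_target, n_iter=20000, scale=2.4, seed=2):
    """Random-walk Metropolis written as completion + involution."""
    rng = random.Random(seed)

    def log_q(v, theta):  # log-density of the completion V given theta, up to a constant
        return -0.5 * ((v - theta) / scale) ** 2

    theta, chain = 0.0, []
    for _ in range(n_iter):
        v = theta + scale * rng.gauss(0.0, 1.0)  # random completion of theta into V
        xi, w = swap(theta, v)                   # deterministic move to (xi, W)
        # ratio of the joint density at (xi, W) over the joint density at (theta, V)
        log_r = (log_target(xi) + log_q(w, xi)) - (log_target(theta) + log_q(v, theta))
        if rng.random() < math.exp(min(0.0, log_r)):
            theta = xi                           # only xi matters, W is dropped
        chain.append(theta)
    return chain

# usage: standard normal target, so mean near 0 and variance near 1
chain = completion_mh(lambda x: -0.5 * x * x)
mean = sum(chain) / len(chain)
var = sum((x - mean) ** 2 for x in chain) / len(chain)
print(mean, var)
```

With the swap as T the two completion densities cancel for a symmetric proposal, and the ratio collapses to the familiar Metropolis one; other choices of completion and of T would have to carry their own Jacobian.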

## recent advances in Monte Carlo Methods

Posted in R, Statistics, Travel, University life with tags ABC, auxiliary variables, England, Imperial College London, London, MCMC, Monte Carlo Statistical Methods, particle methods, Read paper, simulation, target environment, warwick university on February 8, 2012 by xi'an

**N**ext Thursday (*Feb. 16*), at the RSS, there will be a special half-day meeting (*afternoon, starting at 13:30*) on Recent Advances in Monte Carlo Methods organised by the General Application Section. The speakers are

- Richard Everitt, University of Oxford, *Missing data, and what to do about it*
- Anthony Lee, Warwick University, *Auxiliary variables and many-core computation*
- Nicolas Kantas, Imperial College London, *Particle methods for computing optimal control inputs*
- Nick Whiteley, Bristol University, *Stability properties of some particle filters*
- Simon Maskell, QinetiQ & Imperial College London, *Using a Probabilistic Hypothesis Density filter to confirm tracks in a multi-target environment*

*(Note this is not a Read Paper meeting, so there is no paper nor discussion!)*

## ABC and Monte Carlo seminar in CREST

Posted in Statistics, University life with tags ABC, auxiliary variables, CREST, ENSAE, expectation-propagation, MC² algorithm, pMCMC, SMC² on January 13, 2012 by xi'an

**O**n Monday (Jan. 16, 3pm, CREST–ENSAE, Room S08), Nicolas Chopin will present a talk on:

Dealing with intractability: recent advances in Bayesian Monte-Carlo methods for intractable likelihoods

(joint works with P. Jacob, O. Papaspiliopoulos and S. Barthelmé)

This talk will start with a review of recent advancements in Monte Carlo methodology for intractable problems; that is, problems involving intractable quantities, typically intractable likelihoods. I will discuss in turn ABC type methods (a.k.a. likelihood-free), auxiliary variable methods for dealing with intractable normalising constants (e.g. the exchange algorithm), and MC² type of algorithms, a recent extension of which is the PMCMC algorithm (Andrieu et al., 2010). Then, I will present two recent pieces of work in this direction. First, and more briefly, I’ll present the ABC-EP algorithm (Chopin and Barthelmé, 2011). I’ll also discuss some possible future research in ABC theory. Second, I’ll discuss the SMC² algorithm (Chopin, Jacob and Papaspiliopoulos, 2011), a new type of MC² algorithm that makes it possible to perform sequential analysis for virtually any state-space model, including models with an intractable Markov transition.

## advanced Markov chain Monte Carlo methods

Posted in Books, Statistics, University life with tags adaptive MCMC methods, auxiliary variables, book review, controlled MCMC, Markov chain Monte Carlo, Monte Carlo Statistical Methods, particle filters, population Monte Carlo, regeneration, sequential Monte Carlo, simulated annealing, simulation, Université Paris Dauphine, Wang-Landau algorithm on December 5, 2011 by xi'an

**T**his book, *Advanced Markov Chain Monte Carlo Methods: Learning from Past Samples*, by Faming Liang, Chuanhai Liu, and Raymond Carroll, appeared last year and has been sitting on my desk all this time, patiently (?) waiting for a review. When I received it, I took a brief look at it (further than the cool cover!) and then decided I needed more than that to write a useful review! Here are my impressions of *Advanced Markov Chain Monte Carlo Methods* after a deeper read. (I have not read any other review in the main statistical journals so far.)

**T**he title, *Advanced Markov Chain Monte Carlo Methods*, is a clear warning about the level of the book: “advanced”, it certainly is!!! By page 85, the general description of MCMC simulation methods is completed, including perfect sampling and reversible jump MCMC, and the authors engage in a detailed description of highly specialised topics of their choice: Auxiliary variables (Chap. 4), Population-based MCMC (Chap. 5), Dynamic weighting (Chap. 6), Stochastic approximation Monte Carlo (Chap. 7), and MCMC with adaptive proposals (Chap. 8). The book is clearly inspired by the numerous papers the authors have written in those areas, especially Faming Liang. (The uneven distribution of the number of citations per year with peaks in 2000 and 2009 reflects this strong connection.) While the book attempts to broaden the spectrum by including introductory sections and discussing other papers, this centred focus nonetheless reduces its potential readership to graduate students and researchers who could directly work on the original papers. I would thus hesitate to teach my graduate students from this book, given that they only attend a single course on Monte Carlo methods.

## Xiao-Li Meng’s inception [in Paris]

Posted in Statistics, University life with tags auxiliary variables, Data augmentation, Institut Henri Poincaré, Paris, seminar, Xiao-Li Meng on July 27, 2011 by xi'an

**X**iao-Li Meng will give a talk in Paris next September 1st, so I advertise it now, before my Parisian readers leave the city for their August retreat. Here is the abstract, explaining the above title:

Statistical Inception for the MCMC Dream: The kick is in the residual (augmentation)!

Xiao-Li Meng

Department of Statistics, Harvard University

The development of MCMC algorithms via data augmentation (DA) or equivalently auxiliary variables has some resemblance to the theme plot of the recent Hollywood hit Inception. We MCMC designers all share essentially the same “3S” dream, that is, to create algorithms that are simple, stable, and speedy. Within that grand dream, however, we have created a rather complex web of tools, with some of them producing very similar algorithms but for unclear reasons, or others that were thought to be of different origins but actually are layered when viewed from a suitable distance. These include conditional augmentation, marginal augmentation, PX-DA, partially non-centering parameterization, sandwiched algorithms, interweaving strategies, ASIS, etc. It turns out that there is a simple statistical insight that can unify essentially all these methods conceptually, and it also provides practical guidelines for their DA constructions. It is the simple concept of regression residuals, which are constructed to be orthogonal to the regression functions. All these methods in one form or another effectively build a residual augmentation. Given a DA distribution f(T, A), where T is our targeted variable (i.e., f(T) is our targeted distribution) and A is the augmented variable, there are two broad classes of residuals depending on whether we regress T on A or A on T. In this talk we will demonstrate how methods like conditional augmentation and partially non-centering parameterization build their residual augmentations by regressing A on T, whereas methods such as marginal augmentation and ASIS effectively use residual augmentations from regressing T on A. For either class, the attempted orthogonality helps to reduce the dependence among MCMC draws, and when the orthogonality leads to true independence as occurring in some special cases, we reach the dream of producing i.i.d. draws. (The talk is based on an upcoming discussion article, especially its rejoinder, Yu and Meng (2011, JCGS).)

**T**he talk will take place at Institut Henri Poincaré, Thursday Sept. 1, at 15:00, as part of the Big’MC seminars.