Archive for Laplace distribution

Hamiltonian MC on discrete spaces

Posted in Statistics, Travel, University life with tags , , , , , , , , on July 3, 2017 by xi'an

Following a lively discussion with Akihiko Nishimura during a BNP11 poster session last Tuesday, I took the opportunity of the flight to Montréal to read through the arXived paper (written jointly with David Dunson and Jianfeng Liu). The issue is thus one of handling discrete valued parameters in Hamiltonian Monte Carlo. The basic “trick” in handling this complexity goes by turning the discrete support via the inclusion of an auxiliary continuous variable whose discretisation is the discrete parameter, hence resembling to some extent the slice sampler. This removes the discreteness blockage but creates another difficulty, namely handling a discontinuous target density. (I idly wonder why the trick cannot be iterated to second or higher order so that to achieve the right amount of smoothness. Of course, the maths behind would be less cool!) The extension of the Hamiltonian to this setting by a  convolution is a trick I had not seen since the derivation of the Central Limit Theorem during Neveu’s course at Polytechnique.  What I find most exciting in the resolution is the move from a Gaussian momentum to a Laplace momentum, for the reason that I always wondered at alternatives [without trying anything myself!]. The Laplace version is indeed most appropriate here in that it avoids a computation of all discontinuity points and associated values along a trajectory. Since the moves are done component-wise, the method has a Metropolis-within-Gibbs flavour, which actually happens to be a special case. What is also striking is that the approach is both rejection-free and exact, provided ergodicity occurs, which is the case when the stepsize is random.

In addition to this resolution of the discrete parameter problem, the paper presents the further appeal of (re-)running an analysis of the Jolly-Seber capture-recapture model. Where the discrete parameter is the latent number of live animals [or whatever] in the system at any observed time. (Which we cover in Bayesian essentials with R as a neat entry to both dynamic and latent variable models.) I would have liked to see a comparison with the completion approach of Jérôme Dupuis (1995, Biometrika), since I figure the Metropolis version implemented here differs from Jérôme’s. The second example is built on Bissiri et al. (2016) surrogate likelihood (discussed earlier here) and Chopin and Ridgway (2017) catalogue of solutions for not analysing the Pima Indian dataset. (Replaced by another dataset here.)

multiplying a Gaussian matrix and a Gaussian vector

Posted in Books with tags , , , , , on March 2, 2017 by xi'an

This arXived note by Pierre-Alexandre Mattei was actually inspired by one of my blog entries, itself written from a resolution of a question on X validated. The original result about the Laplace distribution actually dates at least to 1932 and a paper by Wishart and Bartlett!I am not sure the construct has clear statistical implications, but it is nonetheless a good calculus exercise.

The note produces an extension to the multivariate case. Where the Laplace distribution is harder to define, in that multiple constructions are possible. The current paper opts for a definition based on the characteristic function. Which leads to a rather unsavoury density with Bessel functions. It however satisfies the constructive definition of being a multivariate Normal multiplied by a χ variate plus a constant vector multiplied by the same squared χ variate. It can also be derived as the distribution of


when W is a (p,q) matrix with iid Gaussian columns and y is a Gaussian vector with independent components. And μ is a vector of the proper dimension. When μ=0 the marginals remain Laplace.

Gauss to Laplace transmutation interpreted

Posted in Books, Kids, Statistics, University life with tags , , , , , , on November 9, 2015 by xi'an

Following my earlier post [induced by browsing X validated], on the strange property that the product of a Normal variate by an Exponential variate is a Laplace variate, I got contacted by Peng Ding from UC Berkeley, who showed me how to derive the result by a mere algebraic transform, related with the decomposition

(X+Y)(X-Y)=X²-Y² ~ 2XY

when X,Y are iid Normal N(0,1). Peng Ding and Joseph Blitzstein have now arXived a note detailing this derivation, along with another derivation using the moment generating function. As a coincidence, I also came across another interesting representation on X validated, namely that, when X and Y are Normal N(0,1) variates with correlation ρ,

XY ~ R(cos(πU)+ρ)

with R Exponential and U Uniform (0,1). As shown by the OP of that question, it is a direct consequence of the decomposition of (X+Y)(X-Y) and of the polar or Box-Muller representation. This does not lead to a standard distribution of course, but remains a nice representation of the product of two Normals.

Gauss to Laplace transmutation!

Posted in Books, Kids, Statistics, University life with tags , , , , on October 14, 2015 by xi'an

When browsing X validated the other day [translate by procrastinating!], I came upon the strange property that the marginal distribution of a zero mean normal variate with exponential variance is a Laplace distribution. I first thought there was a mistake since we usually take an inverse Gamma on the variance parameter, not a Gamma. But then the marginal is a t distribution. The result is curious and can be expressed in a variety of ways:

– the product of a χ21 and of a χ2 is a χ22;
– the determinant of a 2×2 normal matrix is a Laplace variate;
– a difference of exponentials is Laplace…

The OP was asking for a direct proof of the result and I eventually sorted it out by a series of changes of variables, although there exists a much more elegant and general proof by Mike West, then at the University of Warwick, based on characteristic functions (or Fourier transforms). It reminded me that continuous, unimodal [at zero] and symmetric densities were necessary scale mixtures [a wee misnomer] of Gaussians. Mike proves in this paper that exponential power densities [including both the Normal and the Laplace cases] correspond to the variances having an inverse positive stable distribution with half the power. And this is a straightforward consequence of the exponential power density being proportional to the Fourier transform of a stable distribution and of a Fubini inversion. (Incidentally, the processing times of Biometrika were not that impressive at the time, with a 2-page paper submitted in Dec. 1984 published in Sept. 1987!)

This is a very nice and general derivation, but I still miss the intuition as to why it happens that way. But then, I know nothing, and even less about products of random variates!

Bayesian computational tools

Posted in Statistics with tags , , , , , , on April 10, 2013 by xi'an

I just arXived a survey entitled Bayesian computational tools in connection with a chapter the editors of the Annual Review of Statistics and Its Application asked me to write. (A puzzling title, I would have used Applications, not Application. Puzzling journal too: endowed with a prestigious editorial board, I wonder at the long-term perspectives of the review, once “all” topics have been addressed. At least, the “non-profit” aspect is respected: $100 for personal subscriptions and $250 for libraries, plus a one-year complimentary online access to volume 1.) Nothing terribly novel in my review, which illustrates some computational tool in some Bayesian settings, missing five or six pages to cover particle filters and sequential Monte Carlo. I however had fun with a double-exponential (or Laplace) example. This distribution indeed allows for a closed-form posterior distribution on the location parameter under a normal prior, which can be expressed as a mixture of truncated normal distributions. A mixture of (n+1) normal distributions for a sample of size n. We actually noticed this fact (which may already be well-known) when looking at our leading example in the consistent ABC choice paper, but it vanished from the appendix in the later versions. As detailed in the previous post, I also fought programming issues induced by this mixture, due to round-up errors in the most extreme components, until all approaches provided similar answers.