an introduction to MCMC sampling

Posted in Books, Kids, Statistics on August 9, 2022 by xi'an

Following a rather clueless question on X validated, I had a quick read of A simple introduction to Markov Chain Monte–Carlo sampling, by van Ravenzwaaij, Cassey, and Brown, published in 2018 in Psychonomic Bulletin & Review, which I had never opened to this day. The setting is very basic and the authors are at pains to make their explanations as simple as possible, but I find the effort somewhat backfires under the excess of details and the characteristic avoidance of mathematical symbols and formulae. For instance, in the Normal mean example that is used as introductory illustration and that confused the question originator (QO), there is no explanation for the posterior being a N(100,15) distribution, 100 being the sample average, the notation N(μ|x,σ) is used for the posterior density, and then the Metropolis comparison brings an added layer of confusion:

“Since the target distribution is normal with mean 100 (the value of the single observation) and standard deviation 15,  this means comparing N(100|108, 15) against N(100|110, 15).”

as it most unfortunately exchanges the positions of μ and x (which is equal to 100). There is no fundamental error there, due to the symmetry of the Normal density, but this switch from posterior to likelihood certainly contributes to the confusion of the QO. Similarly for the Metropolis step description:

“If the new proposal has a lower posterior value than the most recent sample, then randomly choose to accept or reject the new proposal, with a probability equal to the height of both posterior values.”

And the shortcomings of MCMC may prove equally difficult to digest, as in
“The method will “work” (i.e., the sampling distribution will truly be the target distribution) as long as certain conditions are met.
Firstly, the likelihood values calculated (…) to accept or reject the new proposal must accurately reflect the density of the proposal in the target distribution. When MCMC is applied to Bayesian inference, this means that the values calculated must be posterior likelihoods, or at least be proportional to the posterior likelihood (i.e., the ratio of the likelihoods calculated relative to one another must be correct).”

which leaves me uncertain as to what the authors mean by the alternative situation, i.e., by the proposed value not reflecting the proposal density. Again, the reluctance to use (more) formulae hurts the intended pedagogical explanations.
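For comparison, the Metropolis step that the two quoted passages try to put into words fits in a few lines. The following is only a minimal Python sketch, not the code of the paper: it assumes a flat prior on μ (so that the posterior is Normal with mean x=100 and standard deviation 15), a random-walk proposal whose scale `proposal_sd` is an arbitrary choice, and hypothetical function names. The posterior is deliberately left unnormalised, since only the ratio of the two posterior values enters the acceptance step.

```python
import math
import random

X_OBS = 100.0   # the single observation of the example
SIGMA = 15.0    # known standard deviation

def unnorm_posterior(mu):
    """Posterior of mu up to a constant: exp(-(x - mu)^2 / (2 sigma^2))."""
    return math.exp(-0.5 * ((X_OBS - mu) / SIGMA) ** 2)

def metropolis(n_iter, start=110.0, proposal_sd=15.0, seed=1):
    """Random-walk Metropolis chain on mu (proposal_sd is an arbitrary choice)."""
    rng = random.Random(seed)
    mu = start
    chain = []
    for _ in range(n_iter):
        proposal = mu + rng.gauss(0.0, proposal_sd)
        # Only the ratio of posterior values matters: constants cancel.
        ratio = unnorm_posterior(proposal) / unnorm_posterior(mu)
        if rng.random() < ratio:  # accept with probability min(1, ratio)
            mu = proposal
        chain.append(mu)
    return chain

# A move from 110 to 108 goes uphill (ratio above 1): always accepted.
ratio_up = unnorm_posterior(108.0) / unnorm_posterior(110.0)
# The reverse move, from 108 to 110, is accepted with probability ratio_down.
ratio_down = unnorm_posterior(110.0) / unnorm_posterior(108.0)
```

Written this way, the acceptance probability of a downhill move is the ratio of the two posterior values, and multiplying `unnorm_posterior` by any positive constant leaves the chain unchanged.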

transformation MCMC

Posted in Books, pictures, Statistics, Travel, University life on January 3, 2022 by xi'an

For reasons too long to describe here, I recently came across a 2013 paper by Dutta and Bhattacharya (from ISI Kolkata) entitled MCMC based on deterministic transforms, which sounded a bit dubious until I realised the deterministic label applies to the choice of the transformation and not to the Metropolis-Hastings proposal… The core of the proposed method is to make a proposal that simultaneously considers a move and its inverse, namely from x to either x’=T(x,ε) or x”=T⁻¹(x,ε), where ε is an independent random noise, possibly degenerated to a manifold of lesser dimension. Due to the symmetry, the acceptance probability is then a ratio of the target, multiplied by the x-Jacobian of T (as in reversible jump). I tried the method on a mixture of Gamma distributions target (in red) with an Exponential scale change and the resulting sample indeed fitted said target.
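A one-dimensional version of such a move can be sketched as follows. This is a hedged Python illustration rather than the experiment reported above: the mixture weights and shapes are hypothetical, the transform is assumed to be T(x,ε)=x·exp(ε) with ε drawn from an Exponential distribution, and the forward and backward moves are each chosen with probability 1/2, so that the acceptance ratio reduces to the target ratio times the Jacobian ∂x’/∂x.

```python
import math
import random

def gamma_pdf(x, shape):
    """Gamma(shape, rate=1) density at x > 0."""
    return math.exp((shape - 1.0) * math.log(x) - x - math.lgamma(shape))

def target(x):
    """Hypothetical target: 0.3 Gamma(2, 1) + 0.7 Gamma(10, 1)."""
    return 0.3 * gamma_pdf(x, 2.0) + 0.7 * gamma_pdf(x, 10.0)

def tmcmc_scale(n_iter, start=1.0, rate=1.0, seed=1):
    """Transformation-based moves with an (assumed) exponential scale change."""
    rng = random.Random(seed)
    x = start
    chain = []
    for _ in range(n_iter):
        eps = rng.expovariate(rate)      # positive noise
        if rng.random() < 0.5:
            jac = math.exp(eps)          # forward move: x' = x * exp(eps)
        else:
            jac = math.exp(-eps)         # backward move: x'' = x * exp(-eps)
        proposal = x * jac
        # Forward/backward probabilities (1/2 each) cancel; the x-Jacobian stays.
        ratio = target(proposal) / target(x) * jac
        if rng.random() < ratio:
            x = proposal
        chain.append(x)
    return chain
```

With this choice of T, the forward move only reaches values above x and the backward move only values below x, which keeps the derivation of the ratio short; other transforms would require their own Jacobian.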

The authors even make an argument in favour of a unidimensional noise, although this amounts to running an implicit Gibbs sampler. The argument is based on a reduced simulation cost for ε, albeit the full dimensional transform x’=T(x,ε) still needs to be computed. And as noted in the paper this also requires checking for irreducibility. The claim for higher efficiency found therein is thus mostly unsubstantiated…

“The detailed balance requirement also demands that, given x, the regions covered by the forward and the backward transformations are disjoint.”

The above statement is also surprising in that the generic detailed balance condition does not impose such a restriction.

Blackwell-Rosenbluth Awards 2021

Posted in Statistics, University life on November 1, 2021 by xi'an

Congratulations to the winners of the newly created award! This j-ISBA award is intended for junior researchers in different areas of Bayesian statistics, and is named after David Blackwell and Arianna Rosenbluth. They will present their work at the newly created JB³ seminars on 10 and 12 November, both at 1pm UTC. (The awards are broken into two time zones, corresponding to the Americas and the rest of the World.)

UTC+0 to UTC+13

Marta Catalano, Warwick University
Samuel Livingstone, University College London
Dootika Vats, Indian Institute of Technology Kanpur

UTC-12 to UTC-1

Trevor Campbell, University of British Columbia
Daniel Kowal, Rice University
Yixin Wang, University of Michigan

scale matters [maths as well]

Posted in pictures, R, Statistics on June 2, 2021 by xi'an

A question from X validated on why an independent Metropolis sampler of a three-component Normal mixture based on a single Normal proposal was failing to recover said mixture…

When looking at the OP’s R code, I did not notice anything amiss at first glance (I was about to drive back from Annecy, hence did not look too closely) and reran the attached code with a larger variance in the proposal, which returned the above picture for the MCMC sample, close enough (?) to the target. Later, from home, I checked the code further and noticed that the Metropolis ratio was only using the ratio of the targets. Dividing by the ratio of the proposals made a significant (?) difference to the representation of the target.

More interestingly, the OP was fundamentally confused between independent and random-walk Rosenbluth algorithms, from using the wrong ratio to aiming at the wrong scale factor and average acceptance ratio, and furthermore challenged by the very notion of Hessian matrix, which is often suggested as a default scale.
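The difference between the two ratios can be sketched as below. This is not the OP’s R code but a minimal Python illustration under stated assumptions: the mixture weights, means and unit variances are hypothetical, as are the proposal mean and scale and the `use_proposal_ratio` flag. With an independent proposal q, the acceptance ratio is π(y)q(x)/π(x)q(y); dropping the proposal terms leaves a chain whose stationary density is proportional to π·q rather than π, which is why a proposal with a large enough variance (an almost flat q over the bulk of π) can still look close to the target.

```python
import math
import random

def norm_pdf(x, mean, sd):
    """Normal density with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def target(x):
    """Hypothetical three-component Normal mixture (unit variances)."""
    return (0.2 * norm_pdf(x, -3.0, 1.0)
            + 0.5 * norm_pdf(x, 0.0, 1.0)
            + 0.3 * norm_pdf(x, 4.0, 1.0))

def independent_mh(n_iter, prop_mean=0.0, prop_sd=5.0,
                   use_proposal_ratio=True, seed=0):
    """Independent Metropolis-Hastings with a single Normal proposal."""
    rng = random.Random(seed)
    x = prop_mean
    chain = []
    for _ in range(n_iter):
        y = rng.gauss(prop_mean, prop_sd)   # proposal does not depend on x
        ratio = target(y) / target(x)
        if use_proposal_ratio:
            # Correct ratio: pi(y) q(x) / (pi(x) q(y))
            ratio *= norm_pdf(x, prop_mean, prop_sd) / norm_pdf(y, prop_mean, prop_sd)
        if rng.random() < ratio:
            x = y
        chain.append(x)
    return chain

correct = independent_mh(40000)
# Target-only ratio: the chain then targets pi * q instead of pi.
biased = independent_mh(40000, use_proposal_ratio=False)
```

Comparing histograms of `correct` and `biased` against the mixture density, for a few values of `prop_sd`, shows how the missing term matters more as the proposal gets narrower.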

Metropolis-Hastings via Classification [One World ABC seminar]

Posted in Statistics, University life on May 27, 2021 by xi'an

Today, Veronika Rockova is giving a webinar on her paper with Tetsuya Kaji, Metropolis-Hastings via classification, at the One World ABC seminar, at 11.30am UK time. (It was also presented at the Oxford Stats seminar last Feb.) Please register if not already a member of the 1W ABC mailing list.