Archive for Gibbs sampling

insufficient Gibbs sampling bridges as well!

Posted in Books, Kids, pictures, R, Statistics, University life on March 3, 2024 by xi'an

Antoine Luciano, Robin Ryder and I posted a revised version of our insufficient Gibbs sampler on arXiv last week (along with three other revisions or new deposits of mine!), following comments and suggestions from referees. Thanks to this revision, we realised that the evidence based on an (insufficient) statistic could also be approximated by a Monte Carlo estimate attached to the completed sample simulated by the insufficient sampler. Better still, a bridge sampling estimator can be used under the same conditions as when the full data is available! In this new version, we thus revisited toy examples first explored in some of my ABC papers on testing (with insufficient statistics), as illustrated by both graphs on this post.
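For readers unfamiliar with the device, here is a minimal R sketch of the generic iterative bridge sampling estimator of Meng & Wong (1996) that such evidence approximations build upon; the toy densities and all names below are mine, not taken from the paper.

```r
## Minimal sketch of the iterative (optimal) bridge sampling estimator of
## Meng & Wong (1996): estimating the ratio of normalising constants of two
## unnormalised densities from samples of each. Toy Gaussian check, not the
## paper's actual estimator.
set.seed(42)

q1 <- function(x) exp(-x^2 / 2)        # unnormalised N(0,1), Z1 = sqrt(2*pi)
q2 <- function(x) exp(-(x - 1)^2 / 8)  # unnormalised N(1,4), Z2 = sqrt(8*pi)

n1 <- n2 <- 1e4
x1 <- rnorm(n1, 0, 1)                  # draws from the first density
x2 <- rnorm(n2, 1, 2)                  # draws from the second density

s1 <- n1 / (n1 + n2); s2 <- n2 / (n1 + n2)
r  <- 1                                # initial guess for Z1/Z2
for (t in 1:50) {                      # fixed-point iteration of Meng & Wong
  num <- mean(q1(x2) / (s1 * q1(x2) + s2 * r * q2(x2)))
  den <- mean(q2(x1) / (s1 * q1(x1) + s2 * r * q2(x1)))
  r   <- num / den
}
c(bridge = r, truth = sqrt(2 * pi) / sqrt(8 * pi))  # should roughly agree
```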

mostly MC[bruary]

Posted in Books, Kids, Statistics, University life on February 18, 2024 by xi'an

bias reduction in self-normalised importance sampling

Posted in Books, Statistics on December 22, 2023 by xi'an

Gabriel Cardoso and coauthors (among whom Éric Moulines and Achille Thin, with whom I collaborated on the inFINE/NEO algorithm) have arXived a nice entry on a cheap way to reduce bias in the famously biased self-normalised importance sampling estimator, which is the standard solution when the target density is not normalised. They reconsider a 2004 technical report by Tjelmeland (which I remember reading at the time) that constructs a sampling-resampling algorithm as a Markov chain: at each step, the next state is chosen between the current value and a pool of M values proposed from the importance function, with probabilities proportional to the importance weights, which, thanks to Tjelmeland's reformulation with two copies of the current state, constitutes a Gibbs sampler with the correct target. As in Tjelmeland (2004), they propose to recycle all proposed values into the integral estimate, which then turns out to be unbiased under stationarity, rather unexpectedly. The paper then proceeds to analyse the convergence towards this expectation, linearly in the size of the pool and exponentially in the number of Markov iterations.
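To make the mechanism concrete, here is a minimal R sketch of the multiple-proposal scheme as I read it, on a toy Normal target; the target, proposal, and tuning choices are mine, and the paper's precise recycled estimator (and its unbiasedness argument) differs in its details.

```r
## Sketch of the Tjelmeland (2004) multiple-proposal scheme: at each step,
## draw M values from the importance function, move to a member of the pool
## {current state, M proposals} with probabilities proportional to the
## importance weights, and recycle the whole pool into a self-normalised
## estimate. Toy target N(0,1), estimating E[X^2] = 1.
set.seed(1)
target   <- function(x) exp(-x^2 / 2)            # unnormalised N(0,1)
proposal <- list(r = function(M) rnorm(M, 0, 3), # importance function N(0,9)
                 d = function(x) dnorm(x, 0, 3))

M <- 10; niter <- 1e4
x <- 0; est <- numeric(niter)
for (t in 1:niter) {
  pool <- c(x, proposal$r(M))                    # current state + M proposals
  w    <- target(pool) / proposal$d(pool)        # importance weights
  est[t] <- sum(w * pool^2) / sum(w)             # recycled estimate of E[X^2]
  x <- pool[sample.int(M + 1, 1, prob = w)]      # valid kernel targeting pi
}
mean(est)                                        # should be close to 1
```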


capture & maybe recapture

Posted in Books, pictures, Running, Statistics, University life on September 5, 2023 by xi'an

I read population size estimation with capture-recapture in presence of individual misidentification and low recapture, arXived by Rémy Fraysse and coauthors, on my flight back from Saigon. The setup is that of a capture-recapture experiment where misidentification (a recaptured individual being labelled as new) may occur due to visual identification errors, as, e.g., in whale studies. To handle the issue, Yoshizaki et al. (2011) proposed adding one layer to the temporal Darroch model M[t] via a probability α of creating a "ghost" (by failing to recognise a formerly observed individual). When representing the experiment as a partly observed Markov process (as in Dupuis, 1995), this addition brings in another, completely latent, process for misidentification. Completely in the sense that a misidentification is never observed (while a proper identification is). This creates an identifiability issue between failing to capture and misidentifying, since an apparently new individual may either be captured for the first time or be a ghost created on that round.
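As a toy illustration of the ghost mechanism (with made-up parameter values and my own notation, and assuming, as in Yoshizaki et al., 2011, that a ghost is never re-sighted), the following R lines simulate capture histories under M[t] with misidentification:

```r
## Toy generative sketch of the ghost mechanism added to model M[t]: at each
## capture of a previously seen individual, with probability alpha the animal
## is not recognised and a one-off "ghost" record is created instead. Ghosts
## are indistinguishable from genuine single-capture individuals, hence the
## identifiability issue. All names (p, alpha, ...) are mine.
set.seed(7)
N <- 100; Tocc <- 5
p     <- rep(0.3, Tocc)   # time-varying capture probabilities (M[t])
alpha <- 0.2              # misidentification probability

histories <- list()       # observed capture histories, indexed by recorded ID
seen <- integer(N)        # recorded ID of each true individual (0 = none yet)
for (t in 1:Tocc) {
  for (i in 1:N) {
    if (runif(1) < p[t]) {                      # individual i is captured
      if (seen[i] == 0) {                       # genuine first capture
        seen[i] <- length(histories) + 1
        histories[[seen[i]]] <- integer(Tocc)
        histories[[seen[i]]][t] <- 1
      } else if (runif(1) < alpha) {            # recapture not recognised:
        ghost <- integer(Tocc); ghost[t] <- 1   # a one-off ghost is created
        histories[[length(histories) + 1]] <- ghost
      } else {
        histories[[seen[i]]][t] <- 1            # correct re-identification
      }
    }
  }
}
length(histories)  # exceeds N on average: ghosts inflate apparent abundance
```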

Processing this model (i.e., deriving a posterior simulation algorithm) can formally be done by a Gibbs completion (as in Dupuis, 1995), but this may prove a non-irreducible scheme (Schofield & Bonner, 2015), a problem solved by considering Metropolis-Hastings steps instead. The current paper extends the above to the multiple-state Arnason-Schwarz model, with no theoretical convergence issue besides the cost of running a large-sized completion. It is mostly a simulation experiment comparing different priors on the misidentification rate, some highly informative, others not, with a frequentist assessment of coverage. Given the identifiability issue mentioned above, this is not particularly helpful, since it simply exhibits the fact that, with the right priors, the parameter values are estimated without bias, while a low recapture rate combined with high misidentification makes estimation more difficult.

insufficient Gibbs sampling

Posted in Books, Kids, Statistics, University life on July 29, 2023 by xi'an

We have just arXived our paper on insufficient Gibbs sampling, with Antoine Luciano and Robin Ryder, from Université Paris Dauphine. This is Antoine's first paper and part of his PhD. (In particular, he wrote the entire code.) The idea stemmed from a discussion on ABC benchmarks, like the one where the pair (median, MAD) is the only available observation. With no joint density available, the setting seems to prohibit calling for an MCMC sampler. However, simulating the complete data set conditional on these statistics proves feasible, with a bit of bookkeeping, and delivers obviously much better results [demonstrated above for a Cauchy example] than ABC, at a very similar cost (if not accounting for the ability of ABC to be parallelised). The idea can obviously be extended to other settings, as long as completion remains achievable, as in the stripped-down sketch at the end of this post. (And a big thanks to our friend Ed George, who suggested the title while at CIRM. I had suggested "Gibbs for boars" as a poster title, in connection with the historical time-line of

Gibbs for Kids (Casella and George) — Gibbs for Pigs (Gianola) — Gibbs for Robust Pigs = Gibbs for Boars

and the abundance of boars on the Luminy campus, but this did not sound convincing enough for Antoine.)
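For illustration, here is a stripped-down R sketch of the completion idea, conditioning on the sample median alone; the paper handles the pair (median, MAD) jointly, which calls for more delicate joint moves, and Antoine's actual code differs. With n odd and one completed point pinned at the observed median, each remaining point has an explicit truncated full conditional, so the completion step is an exact Gibbs draw.

```r
## Stripped-down sketch of insufficient Gibbs, conditioning on the sample
## median alone (not the (median, MAD) pair of the paper). Model:
## x_i ~ Cauchy(theta, 1), flat prior on theta, n odd. One completed point is
## pinned at the observed median m and the side pattern is held fixed, a
## harmless restriction under exchangeability of the data.
set.seed(3)
n <- 21; m <- 1.5                       # observed median (toy value)
theta <- 0                              # initial location parameter
x <- c(m, m - abs(rcauchy((n - 1) / 2)), m + abs(rcauchy((n - 1) / 2)))

niter <- 5e3; thetas <- numeric(niter)
for (it in 1:niter) {
  ## 1. completion step: given theta and median(x) = m, each non-median point
  ##    is Cauchy(theta, 1) truncated to its side of m (exact draw by inversion)
  Fm <- pcauchy(m, theta)
  lo <- x < m; hi <- x > m
  x[lo] <- qcauchy(runif(sum(lo), 0, Fm), theta)
  x[hi] <- qcauchy(runif(sum(hi), Fm, 1), theta)
  ## 2. parameter step: random-walk Metropolis on theta, flat prior
  prop <- theta + rnorm(1, 0, 0.5)
  logr <- sum(dcauchy(x, prop, log = TRUE)) - sum(dcauchy(x, theta, log = TRUE))
  if (log(runif(1)) < logr) theta <- prop
  thetas[it] <- theta
}
mean(thetas)  # posterior mean of theta given median(x) = m (and n)
```

Adding the MAD to the conditioning set couples the points across sides of the median, which is where the bookkeeping of the paper comes in.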