Archive for importance sampling

important and published [Markov chains]

Posted in Books, Statistics, University life on February 26, 2024 by xi'an

Our importance Markov chain paper (with Charly Andral (PhD, Paris Dauphine), Randal Douc, and Hugo Marival (PhD, Telecom SudParis)) has been published (on-line) by Stochastic Processes and their Applications (SPA). Incidentally, my first publication in this journal. To paraphrase its abstract, it sort of bridges the (unsuspected?) gap between rejection sampling and importance sampling, moving from one to the other through a tuning parameter. Based on a modified sample of an instrumental Markov chain targeting an instrumental distribution (typically via an MCMC kernel), rather than the target of interest, the importance Markov chain produces an extended Markov chain whose (first) marginal distribution converges to said target distribution. For instance, when targeting a multimodal distribution, the instrumental distribution can be chosen as a tempered version of the target, which frees the algorithm to explore its multiple modal regions more efficiently. We also derive a Law of Large Numbers and a Central Limit Theorem, as well as prove geometric ergodicity of this extended kernel under mild assumptions on the instrumental kernel. Computationally, the algorithm is easy to implement and preexisting libraries can be used to sample from the instrumental distribution.
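For readers curious about the mechanics, here is a minimal and purely schematic Python sketch of the replication idea behind the method: an instrumental random-walk Metropolis chain targets a tempered version of a bimodal target, and each state gets copied a random number of times with mean proportional to its importance weight. The tempering level, the value of κ, and the randomised-rounding replication rule below are illustrative placeholders, not the tuned versions studied in the paper.

```python
import numpy as np

rng = np.random.default_rng(2)

# schematic replication step: instrumental chain on a tempered target rho,
# each state copied with mean kappa * w(y), w = pi/rho (up to a constant)
log_pi = lambda x: np.logaddexp(-0.5 * (x + 4) ** 2, -0.5 * (x - 4) ** 2)  # bimodal target
beta = 0.2                                                                  # tempering level (illustrative)
log_rho = lambda x: beta * log_pi(x)                                        # instrumental target

def importance_markov_chain(n_iter=20_000, kappa=1.0, step=3.0):
    y, chain = 0.0, []
    for _ in range(n_iter):
        # instrumental chain: random-walk Metropolis on the tempered target
        prop = y + step * rng.normal()
        if np.log(rng.uniform()) < log_rho(prop) - log_rho(y):
            y = prop
        # replicate y with mean kappa * w(y), via randomised rounding
        mean_copies = kappa * np.exp(log_pi(y) - log_rho(y))
        n_copies = int(mean_copies) + (rng.uniform() < mean_copies % 1)
        chain.extend([y] * n_copies)
    return np.array(chain)

sample = importance_markov_chain()   # marginal of the extended chain approximates pi
```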

repulsive sampling

Posted in Books, Statistics, University life on January 31, 2024 by xi'an

After a long absence from the monthly Séminaire Parisien de Statistique, I attended one today at IHP, including a talk by Diala Hawat on repelled point processes for numerical integration (Hawat et al.). The goal is to get (and prove) a universal variance improvement for numerical integration by applying a form of determinantal process to initial simulations, e.g., iid (Poisson process) sampling (without accounting for the O(N²) cost of moving these points). The repelled points are obtained by a single (why single?) move based on a force function (as shown in the slide below), inspired by a Coulomb potential (in the sense that said move appears as one gradient step along the potential). Which reminded me of the pinball sampler, even though the inverse norm was just there to create infinite repulsion near each point. A surprising feature of this repelling step is that it even modifies a (QMC) Sobol process, again with an (empirical) improvement in the variance. I wonder if one could construct an MCMC algorithm that would target a joint distribution, maybe via a copula representation, maybe via an equivalent version of HMC.
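To fix ideas, here is a rough Python sketch of the kind of repulsion step described above (the exact force field, step size, and variance guarantees of Hawat et al. are not reproduced): each point makes a single gradient-type move along a Coulomb-like force created by the other points before the repelled configuration is used in a plain Monte Carlo estimate.

```python
import numpy as np

rng = np.random.default_rng(3)

def repel_once(pts, eps=1e-3):
    """One Coulomb-like repulsion step on points in [0,1]^2 (schematic)."""
    diff = pts[:, None, :] - pts[None, :, :]              # pairwise differences
    dist = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(dist, np.inf)                        # no self-interaction
    force = (diff / dist[..., None] ** 2).sum(axis=1)     # Coulomb-type force in dimension 2
    return np.clip(pts + eps * force, 0.0, 1.0)           # crude way of staying in the unit square

f = lambda p: np.cos(2 * np.pi * p[:, 0]) * p[:, 1]       # toy integrand on [0,1]^2

pts = rng.uniform(size=(512, 2))                          # iid starting configuration
print(f(pts).mean(), f(repel_once(pts)).mean())           # crude vs repelled estimates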


As an aside, the Bakhvalov results on the existence of a worst-case integrand for any deterministic or random sequence (see top slide) made me wonder what the shape of this worst-case function is, especially for a QMC sequence (e.g., Sobol'). And whether or not such functions are of any relevance as a counterpoint to the optimal importance functions.

combining normalizing flows and QMC

Posted in Books, Kids, Statistics on January 23, 2024 by xi'an

My PhD student Charly Andral [presented at the Mostly Monte Carlo seminar and] arXived a new preprint yesterday, on training a normalizing flow network as an importance sampler (as in Gabrié et al.) or an independent Metropolis proposal, and exploiting its invertibility to call quasi-Monte Carlo low-discrepancy sequences to boost its efficiency. (Training the flow is not covered by the paper.) This extends the recent study of He et al. (presented at MCM 2023 in Paris) to the normalizing flow setting. In the current experiments, the randomized QMC samples are computed using the SciPy package (Roy et al., 2023), where the Sobol' sequence is based on Joe and Kuo (2008), with the scrambling of Matoušek (1998), and where the Halton sequence is based on Owen (2017). (No pure QMC was harmed in the process!) The flows are constructed using the FlowMC package. As expected, the QMC version brings a significant improvement in the quality of the Monte Carlo approximations, for equivalent computing times, with however a rapid decrease in efficiency as the dimension of the targeted distribution increases. On the other hand, the architecture of the flow matters little. And the type of RQMC sequence makes a difference, the advantage apparently going to a scrambled Sobol' sequence.
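The following Python fragment illustrates the pipeline on a toy example, with scipy.stats.qmc providing the scrambled Sobol' points and a simple affine map standing in (purely hypothetically) for the trained normalizing flow; the actual flows in the paper come from FlowMC and the targets are far less trivial.

```python
import numpy as np
from scipy.stats import qmc, norm

rng = np.random.default_rng(4)

d = 2
T = lambda z: 2.0 * z + 1.0                                    # placeholder for the (invertible) flow
log_q = lambda x: norm.logpdf((x - 1.0) / 2.0).sum(1) - d * np.log(2.0)  # density of T(Z), Z ~ N(0, I)
log_pi = lambda x: norm.logpdf(x, loc=0.5, scale=1.5).sum(1)   # toy target

sobol = qmc.Sobol(d=d, scramble=True, seed=rng)
u = sobol.random(2 ** 12)                                      # scrambled Sobol' points in (0,1)^d
x = T(norm.ppf(u))                                             # push through Gaussian base then flow
w = np.exp(log_pi(x) - log_q(x))                               # importance weights
print((w * x[:, 0]).sum() / w.sum())                           # self-normalised estimate of E[X_1]
```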

simulating Gumbel’s bivariate exponential distribution

Posted in Books, Kids, R, Statistics on January 14, 2024 by xi'an

A challenge interesting enough for a sunny New Year morn, found on X validated, namely the simulation of a bivariate exponential distribution proposed by Gumbel in 1960, with density over the positive quadrant of ℝ²

f(x_1,x_2) = [(\lambda_2+rx_1)(\lambda_1+rx_2)-r]\,\exp[-(\lambda_1x_1+\lambda_2x_2+rx_1x_2)]

Although there exists a direct approach based on the fact that the marginals are Exponential distributions and the conditionals are signed mixtures of Gamma distributions, an accept-reject algorithm is also available for the pair, with a dominating density corresponding to a genuine mixture of four Gammas, obtained by omitting the x₁x₂ product in the exponential and the negative r in the first term. The efficiency of this accept-reject algorithm is high for small r. However, and in more direct connection with the original question, using this approach to integrate the function equal to the product of the pair, as considered in Gumbel's original paper, is much less efficient than seeking a quasi-optimal importance function, since this importance function is yet another mixture of four Gammas and produces a much reduced variance at a cheaper cost!
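For the record, here is a short numpy sketch of this accept-reject sampler, assuming 0 < r ≤ λ₁λ₂: the dominating density is the mixture of four Gamma products obtained by the omissions described above, and the overall acceptance probability works out as (1+r/λ₁λ₂)⁻², hence close to one for small r.

```python
import numpy as np

rng = np.random.default_rng(0)

def rgumbel_biexp(n, lam1, lam2, r):
    """Accept-reject sampler for Gumbel's bivariate exponential density
    [(lam2 + r*x1)(lam1 + r*x2) - r] exp(-(lam1*x1 + lam2*x2 + r*x1*x2)),
    dominated by the four-Gamma mixture obtained by dropping the r*x1*x2 term
    in the exponential and the -r term in the bracket."""
    out = np.empty((0, 2))
    # mixture weights of the four Gamma x Gamma components
    w = np.array([1.0, r / (lam1 * lam2), r / (lam1 * lam2), (r / (lam1 * lam2)) ** 2])
    w /= w.sum()
    # Gamma shape parameters of each component for (x1, x2)
    shapes = np.array([[1, 1], [1, 2], [2, 1], [2, 2]])
    while out.shape[0] < n:
        m = n - out.shape[0]
        k = rng.choice(4, size=m, p=w)
        x1 = rng.gamma(shape=shapes[k, 0], scale=1.0 / lam1)
        x2 = rng.gamma(shape=shapes[k, 1], scale=1.0 / lam2)
        # acceptance probability = target / (unnormalised dominating density)
        bracket = (lam2 + r * x1) * (lam1 + r * x2)
        acc = (bracket - r) / bracket * np.exp(-r * x1 * x2)
        keep = rng.uniform(size=m) < acc
        out = np.vstack([out, np.column_stack([x1[keep], x2[keep]])])
    return out[:n]

sample = rgumbel_biexp(10_000, lam1=1.0, lam2=1.0, r=0.5)   # acceptance rate (1 + r)^(-2) here
```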

bias reduction in self-normalised importance sampling

Posted in Books, Statistics on December 22, 2023 by xi'an

Gabriel Cardoso and coauthors (among whom Éric Moulines and Achille Thin, with whom I collaborated on the inFINE/NEO algorithm) have arXived a nice entry on a cheap way to reduce the bias of the famously biased self-normalised importance sampling estimator, a standard solution when the target density is not normalised. They reconsider a 2004 technical report by Tjelmeland (which I remember reading at the time), which constructs a sampling-resampling algorithm by creating a Markov chain and choosing between the current value and a pool of M values proposed from the importance function, according to the importance weights. Thanks to Tjelmeland's reformulation with two copies of the current state, this constitutes a Gibbs sampler with the correct target. As in Tjelmeland (2004), they propose to recycle all proposed values into the integral estimate, which then turns out, rather unexpectedly, to be unbiased under stationarity. The paper then proceeds to analyse convergence towards this expectation, linearly in the size of the pool and exponentially in the number of Markov iterations.
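Here is a schematic Python rendering of the recycling idea, in the spirit of Tjelmeland's scheme but without the two-copy refinement and with a toy target and importance function of my own choosing: at each iteration, a pool made of M fresh proposals plus the current state is weighted by importance weights, the whole pool contributes to the estimate, and the next state of the chain is resampled from the pool.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

pi_u = lambda x: np.exp(-0.5 * (x - 3.0) ** 2)   # unnormalised N(3,1) target
q = stats.norm(loc=0.0, scale=3.0)               # importance/proposal distribution
h = lambda x: x                                  # estimating E_pi[h(X)] = 3

def recycled_estimate(n_iter=5_000, M=10):
    x, est = 0.0, []
    for _ in range(n_iter):
        pool = np.append(q.rvs(size=M, random_state=rng), x)   # M proposals plus current state
        w = pi_u(pool) / q.pdf(pool)                            # importance weights
        p = w / w.sum()
        est.append(np.dot(p, h(pool)))                          # recycle the entire pool
        x = rng.choice(pool, p=p)                               # resample the next state
    return np.mean(est)

print(recycled_estimate())   # should be close to 3
```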