## continuous herded Gibbs sampling

Posted in Books, pictures, Statistics on June 28, 2021 by xi'an

Read a short paper by Laura Wolf and Marcus Baum on Gibbs herding, where herding is a technique of “deterministic sampling”, for instance selecting points over the support of the distribution by matching exact and empirical (or “empirical”!) moments. Which reminds me of the principal points devised by my late friend Bernhard Flury. With an unclear argument as to why it could take over random sampling:

“random numbers are often generated by pseudo-random number generators, hence are not truly random”

Especially since the aim is to “draw samples from continuous multivariate probability densities.” The construction of such a sample proceeds sequentially, adding a new (T+1)-th point to the existing sample of y’s by maximising in x the discrepancy

$(T+1)\mathbb E^Y[k(x,Y)]-\sum_{t=1}^T k(x,y_t)$

where k(·,·) is a kernel, e.g. a Gaussian density. Hence a complexity that grows as O(T). The current paper suggests using Gibbs “sampling” to update one component of x at a time. Using the conditional version of the above discrepancy. Making the complexity grow as O(dT) in d dimensions.
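To make the greedy step concrete, here is a minimal sketch of plain (non-Gibbs) herding in one dimension. It is not the authors' code: the N(0,1) target, the bandwidth `h`, and the grid search standing in for a proper optimiser are all assumptions of mine, chosen so that the expectation of the Gaussian kernel is available in closed form.

```python
import math


def expected_kernel(x, h):
    """E[k(x, Y)] for Y ~ N(0, 1) and k(x, y) = exp(-(x - y)^2 / (2 h^2))."""
    s2 = 1.0 + h * h
    return h / math.sqrt(s2) * math.exp(-x * x / (2.0 * s2))


def kernel(x, y, h):
    """Unnormalised Gaussian kernel with bandwidth h."""
    return math.exp(-((x - y) ** 2) / (2.0 * h * h))


def herd(n_points, h=1.0, lo=-5.0, hi=5.0, n_grid=1001):
    """Greedy herding for an assumed N(0, 1) target.

    Each new point maximises (T+1) E[k(x, Y)] - sum_t k(x, y_t) over a
    fixed grid of candidate values (an assumption replacing a real optimiser).
    """
    grid = [lo + (hi - lo) * i / (n_grid - 1) for i in range(n_grid)]
    mean_k = [expected_kernel(x, h) for x in grid]
    sum_k = [0.0] * n_grid  # running sum over t of k(x, y_t), per candidate x
    sample = []
    for T in range(n_points):
        scores = [(T + 1) * mean_k[i] - sum_k[i] for i in range(n_grid)]
        best = max(range(n_grid), key=scores.__getitem__)
        y_new = grid[best]
        sample.append(y_new)
        # Updating the running sums keeps each step linear in the grid size;
        # recomputing the sum from scratch would cost O(T) per candidate.
        for i in range(n_grid):
            sum_k[i] += kernel(grid[i], y_new, h)
    return sample
```

Calling `herd(50)` returns a deterministic sequence whose first point is the mode of the expected kernel, namely zero; nothing in it is random, which is exactly why the usual unbiasedness arguments do not apply.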

I remain puzzled by the whole thing as these samples cannot be used as regular random or quasi-random samples. And in particular do not produce unbiased estimators of anything. Obviously. The production of such samples being furthermore computationally costly, it is also unclear to me that they could even be used for quick & dirty approximations of a target sample.

## QMC at CIRM

Posted in Mountains, pictures, Statistics, Travel, University life on October 21, 2020 by xi'an

## dropping a point

Posted in Statistics, University life on September 8, 2020 by xi'an

“A discussion about whether to drop the initial point came up in the plenary tutorial of Fred Hickernell at MCQMC 2020 about QMCPy software for QMC. The issue has been discussed by the pytorch community, and the scipy community, which are both incorporating QMC methods.”

Art Owen recently arXived a paper entitled On dropping the first Sobol’ point in which he examines the impact of a common practice consisting in skipping the first point of a Sobol’ sequence when using quasi-Monte Carlo. By analogy with the burn-in practice for MCMC that aims at eliminating the bias due to the choice of the starting value. Art’s paper shows that by skipping just this one point the rate of convergence of some QMC estimates may drop by a factor, bringing the rate back to Monte Carlo values! As this applies to randomised scrambled Sobol’ sequences, this is quite amazing. The explanation centers on the suppression leaving one region of the hypercube unexplored, with an O(n⁻¹) error ensuing.

The above picture from the paper makes the case in a most obvious way: the mean squared error is not decreasing at the same rate for the no-drop and one-drop versions, since they are -3/2 and -1, respectively. The paper further “recommends against using round number sample sizes and thinning QMC points.” Conclusion: QMC is not MC!
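For readers who want to see the effect on their own machine, here is a rough sketch of the comparison, not a reproduction of Art's experiments. It assumes SciPy's `scipy.stats.qmc.Sobol` generator (version 1.7 or later), and the integrand exp(x₁+x₂+x₃) on the unit cube, the dimension and the number of replications are arbitrary choices of mine.

```python
import numpy as np
from scipy.stats import qmc


def compare_drop(m=10, d=3, reps=30, seed=0):
    """Mean squared errors of the no-drop and one-drop estimates.

    The integrand exp(sum of coordinates) is an assumed toy example with
    known integral (e - 1)^d over the unit cube.
    """
    n = 2 ** m
    exact = (np.e - 1.0) ** d
    rng = np.random.default_rng(seed)
    err_keep, err_drop = [], []
    for _ in range(reps):
        # Fresh scrambling for each replication, drawn from the shared rng.
        sobol = qmc.Sobol(d=d, scramble=True, seed=rng)
        # Generate 2^(m+1) points and keep the first n + 1 of them.
        pts = sobol.random_base2(m + 1)[: n + 1]
        f = np.exp(pts.sum(axis=1))
        err_keep.append(f[:n].mean() - exact)  # points 0, ..., n-1
        err_drop.append(f[1:].mean() - exact)  # points 1, ..., n
    return np.mean(np.square(err_keep)), np.mean(np.square(err_drop))
```

With `mse_keep, mse_drop = compare_drop()`, the second value should come out markedly larger than the first, and repeating the call over a range of `m` gives a crude view of the two decay rates.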

## ABC by QMC

Posted in Books, Kids, Statistics, University life on November 5, 2018 by xi'an

A paper by Alexander Buchholz (CREST) and Nicolas Chopin (CREST) on quasi-Monte Carlo methods for ABC is going to appear in the Journal of Computational and Graphical Statistics. I had missed the opportunity when it was posted on arXiv and only became aware of the paper’s contents when I reviewed Alexander’s thesis for the doctoral school. The fact that the parameters are simulated (in ABC) from a prior that is quite generally a standard distribution while the pseudo-observations are simulated from a complex distribution (associated with the intractability of the likelihood function) means that the use of quasi-Monte Carlo sequences is in general only possible for the first part.

The ABC context studied there is close to the original version of the ABC rejection scheme [as opposed to SMC and importance versions], the main difference standing with the use of M pseudo-observations instead of one (of the same size as the initial data). This repeated version has been discussed and abandoned in a strict Monte Carlo framework in favor of M=1 as it increases the overall variance, but the paper uses this version to show that the multiplication of pseudo-observations in a quasi-Monte Carlo framework does not increase the variance of the estimator. (Since the variance apparently remains constant when taking into account the generation time of the pseudo-data, we can however dispute the interest of this multiplication, except to produce a constant variance estimator, for some targets, or to be used for convergence assessment.) The article also covers the bias correction solution of Lee and Łatuszyński (2014).

Due to the simultaneous presence of pseudo-random and quasi-random sequences in the approximations, the authors use the notion of mixed sequences, for which they extend a one-dimensional central limit theorem. The paper’s focus on the estimation of Z(ε), the normalization constant of the ABC density, i.e. the predictive probability of accepting a simulation, which can be estimated at a speed of O(N⁻¹) where N is the number of QMC simulations, is a wee bit puzzling as I cannot figure out the relevance of this constant (function of ε), especially since the result does not seem to generalize directly to other ABC estimators.
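As an illustration of what such a mixed sequence looks like in practice, here is a toy sketch of an estimate of Z(ε), with the parameters driven by scrambled Sobol’ points and the pseudo-data by an ordinary pseudo-random generator. The Normal prior, Normal model, sample-mean summary and all numerical settings are assumptions of mine, not taken from the paper, and the code relies on SciPy's `qmc.Sobol` and `norm`.

```python
import numpy as np
from scipy.stats import norm, qmc


def abc_acceptance_rate(s_obs=0.5, eps=0.2, k=10, log2_n=10, n_rep=5, seed=1):
    """Estimate Z(eps) = P(|s(pseudo-data) - s_obs| < eps) with a mixed sequence.

    Assumed toy model: theta ~ N(0, 1), data of size k from N(theta, 1),
    summary statistic s = sample mean.
    """
    rng = np.random.default_rng(seed)
    # Quasi-random part: prior draws via the N(0, 1) quantile function.
    u = qmc.Sobol(d=1, scramble=True, seed=rng).random_base2(log2_n)[:, 0]
    theta = norm.ppf(u)
    # Pseudo-random part: n_rep pseudo-datasets of size k for each theta.
    data = rng.normal(loc=theta[:, None, None], scale=1.0,
                      size=(theta.size, n_rep, k))
    s = data.mean(axis=2)
    estimate = np.mean(np.abs(s - s_obs) < eps)
    # In this toy model the summary is marginally N(0, 1 + 1/k).
    sd = np.sqrt(1.0 + 1.0 / k)
    exact = norm.cdf((s_obs + eps) / sd) - norm.cdf((s_obs - eps) / sd)
    return estimate, exact
```

The closed-form value is only available because the toy model is conjugate; in a genuine ABC problem only the estimate would be at hand, and `n_rep` plays the role of M above.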

The second half of the paper considers a sequential version of ABC, as in ABC-SMC and ABC-PMC, where the proposal distribution is based on a Normal mixture with a small number of components, estimated from the (particle) sample of the previous iteration. Even though efficient techniques for estimating this mixture are available, this innovative step requires a calculation time that should be taken into account in the comparisons. The construction of a decreasing sequence of tolerances ε also seems pushed beyond and below what a sequential approach like that of Del Moral, Doucet and Jasra (2012) would produce, seemingly with the justification that lower tolerances should always be preferred. This is not necessarily the case, as recent articles by Li and Fearnhead (2018a, 2018b) and ours have shown (Frazier et al., 2018). Overall, since ABC methods are large consumers of simulation, it is interesting to see how the contribution of QMC sequences results in the reduction of variance and to hope to see appropriate packages added for standard distributions. However, since the most consuming part of the algorithm is due to the simulation of the pseudo-data, in most cases, it would seem that the most relevant focus should be on QMC add-ons on this part, which may be feasible for models with a huge number of standard auxiliary variables as for instance in population evolution.

## winning entry at MCqMC’16

Posted in Books, Kids, pictures, Statistics, Travel, University life on August 29, 2016 by xi'an

The nice logo of MCqMC 2016 was a collection of eight series of QMC dots on the unit (?) cube. The organisers set a competition to identify the principles behind those quasi-random sets and as I had no idea for most of them I entered very random sets unconnected with algorithmia, for which I got an honourable mention and a CD prize (if not the conference staff tee-shirt I was coveting!) Art Owen sent me back my entry, posted below and hopefully (or not!) readable.