Archive for Purdue University

the buzz about nuzz

Posted in Books, Mountains, pictures, Statistics on April 6, 2020 by xi'an

“…expensive in these terms, as for each root, Λ(x(s),v) (at the cost of one epoch) has to be evaluated for each root finding iteration, for each node of the numerical integral”

When using the ZigZag sampler, the main (?) difficulty is in producing the velocity switches, as these occur at the event times of an inhomogeneous Poisson process. When the rate of this process cannot be integrated out in an analytical manner, the only generic approach I know of is Poisson thinning, obtained by finding an integrable upper bound on this rate, generating from this dominating process, and subsampling. Finding the bound is however far from straightforward and may anyway result in an inefficient sampler. This new paper by Simon Cotter, Thomas House and Filippo Pagani makes several proposals to simplify this simulation, Nuzz standing for numerical ZigZag. Even better (!), their approach is based on what they call the Sellke construction, with Tom Sellke being a probabilist and statistician at Purdue University (trivia: whom I met when spending a postdoctoral year there in 1987-1988) who also wrote a fundamental paper on the opposition between Bayes factors and p-values with Jim Berger.
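
For readers unfamiliar with the thinning step, here is a minimal Python sketch of Poisson thinning for the next velocity switch, under made-up assumptions: a one-dimensional standard Normal target, unit speed, and a crude constant bound on the switching rate over a finite horizon. It illustrates the generic approach described above, not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(0)

def next_switch_by_thinning(rate, rate_bound, t_max):
    """First event time of an inhomogeneous Poisson process with intensity
    rate(t) <= rate_bound on [0, t_max], via thinning of a homogeneous
    Poisson process of intensity rate_bound."""
    t = 0.0
    while True:
        t += rng.exponential(1.0 / rate_bound)    # candidate event from the dominating process
        if t >= t_max:
            return t_max                          # no switch before the horizon
        if rng.uniform() < rate(t) / rate_bound:  # thinning: keep with probability rate(t)/bound
            return t

# toy 1-d ZigZag rate for a N(0,1) target along x(t) = x0 + v*t: lambda(t) = max(0, v*x(t)),
# bounded on [0, t_max] by |x0| + |v|*t_max
x0, v, t_max = 1.5, 1.0, 5.0
rate = lambda t: max(0.0, v * (x0 + v * t))
print(next_switch_by_thinning(rate, rate_bound=abs(x0) + abs(v) * t_max, t_max=t_max))
```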

“We chose as a measure of algorithm performance the largest Kolmogorov-Smirnov (KS) distance between the MCMC sample and true distribution amongst all the marginal distributions.”
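
As an aside, this performance measure is straightforward to compute when the true marginals are available in closed form; a minimal sketch, assuming a hypothetical (n, d) array of MCMC draws and known marginal cdfs (here standard Normal for every coordinate):

```python
import numpy as np
from scipy.stats import kstest, norm

def max_marginal_ks(sample, marginal_cdfs):
    """Largest Kolmogorov-Smirnov distance between the empirical marginals of an
    (n, d) sample and the corresponding true marginal cdfs."""
    return max(kstest(sample[:, j], cdf).statistic for j, cdf in enumerate(marginal_cdfs))

# sanity check on an exact sample from a 3-d standard Normal target
rng = np.random.default_rng(1)
sample = rng.standard_normal((10_000, 3))
print(max_marginal_ks(sample, [norm.cdf] * 3))   # small, as it should be
```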

The practical trick is rather straightforward in that it sums up as the exponentiation of the inverse cdf method, completed with a numerical resolution of the inversion, based on the QAGS (Quadrature Adaptive Gauss-Kronrod Singularities) integration routine. In order to save time, Kingman’s superposition trick is exploited so that only one inversion is required, rather than d, the dimension of the variable of interest. This nuzzled version of ZigZag can furthermore be interpreted as a PDMP per se, except that it retains a numerical error, whose impact on convergence is analysed in the paper, in terms of Wasserstein distance between the invariant measures. The paper concludes with a numerical comparison between Nuzz and random walk Metropolis-Hastings, HMC, and manifold MALA, using the number of evaluations of the likelihood as a measure of time requirement. Tuning for Nuzz is described, but not for the competition. Rather dramatically, the Nuzz algorithm performs worse than this competition when counting one epoch for each likelihood computation and better when counting one epoch for each integral inversion, which amounts to assuming a perfect inversion, unsurprisingly. As a final remark, all models are more or less Normal, with very smooth level sets, maybe not an ideal range of test cases.
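
To make the inversion idea concrete, here is a minimal sketch of the Sellke-type construction as I understand it: draw E~Exp(1) and solve Λ(τ)=∫₀^τ λ(s)ds=E numerically, with scipy's quad (which wraps QUADPACK's adaptive Gauss-Kronrod routines, standing in for QAGS) and a bracketing root-finder. The toy rate and the finite horizon are my own choices, not the paper's numerics.

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import brentq

rng = np.random.default_rng(2)

def next_switch_by_inversion(rate, t_max=50.0):
    """First event of an inhomogeneous Poisson process with intensity rate(t),
    obtained by solving Lambda(tau) = int_0^tau rate(s) ds = E, E ~ Exp(1),
    with numerical quadrature and root-finding."""
    e = rng.exponential(1.0)
    big_lambda = lambda tau: quad(rate, 0.0, tau)[0]   # numerically integrated rate
    if big_lambda(t_max) < e:
        return t_max                                   # no event before the horizon
    return brentq(lambda tau: big_lambda(tau) - e, 0.0, t_max)

# same toy 1-d rate as above: lambda(t) = max(0, v*(x0 + v*t)) for a N(0,1) target
x0, v = 1.5, 1.0
print(next_switch_by_inversion(lambda t: max(0.0, v * (x0 + v * t))))
```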


Rao-Blackwellisation, a review in the making

Posted in Statistics on March 17, 2020 by xi'an

Recently, I have been contacted by a mainstream statistics journal to write a review of Rao-Blackwellisation techniques in computational statistics, in connection with an issue celebrating C.R. Rao’s 100th birthday. As many, many techniques can be interpreted as weak forms of Rao-Blackwellisation, e.g. all auxiliary variable approaches, I am clearly facing an embarrassment of riches and would thus welcome suggestions from Og’s readers on the major advances in Monte Carlo methods that can be connected with the Rao-Blackwell-Kolmogorov theorem. (On the personal and anecdotal side, I only met C.R. Rao once, in 1988, when he came for a seminar at Purdue University where I was spending the year.)
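
For readers unfamiliar with the term, the gist of Rao-Blackwellisation in Monte Carlo is to replace a simulated quantity with its conditional expectation whenever the latter is available in closed form, which can only reduce the variance. A textbook toy illustration in Python (a Gamma-Poisson pair of my own choosing, not one of the methods the review will cover):

```python
import numpy as np

rng = np.random.default_rng(3)
n, a, b = 10_000, 3.0, 2.0             # hypothetical Gamma(shape=a, rate=b) prior on the Poisson mean

lam = rng.gamma(a, 1.0 / b, size=n)    # lambda ~ Gamma(a, b)
x = rng.poisson(lam)                   # X | lambda ~ Poisson(lambda)

crude = x.mean()                       # plain Monte Carlo estimate of E[X]
rao_blackwell = lam.mean()             # averaging E[X | lambda] = lambda instead

print(crude, rao_blackwell, a / b)     # both estimate E[X] = a/b, the second with smaller variance
```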

patterned random matrices [not a book review]

Posted in Books, pictures, Statistics, University life on October 24, 2018 by xi'an

a jump back in time

Posted in Books, Kids, Statistics, Travel, University life on October 1, 2018 by xi'an

As the Department of Statistics in Warwick is slowly emptying its shelves and offices for the big migration to the new building that is almost completed, books and documents are abandoned in the corridors and the work spaces. On this occasion, I thus happened to spot a vintage edition of the Valencia 3 proceedings. I had missed this meeting, and hence the volume, for, during the last year of my PhD, I was drafted into the French Navy and as a result prohibited from travelling abroad. (Although on reflection I could have safely done it with no one in the military the wiser!) Reading through the papers thirty years later is a weird experience, as I do not remember most of them, the exception being the mixture modelling paper by José Bernardo and Javier Girón, which I studied a few years later when writing the mixture estimation and simulation paper with Jean Diebolt. And then again in our much more recent non-informative paper with Clara Grazian. And Prem Goel’s survey of Bayesian software, that is, the 1987 state of the art, covering an amazing list of eighteen programs, including versions by Zellner, Tierney, Schervish, Smith [but no MCMC], Jaynes, Goldstein, Geweke, van Dijk, and Bauwens, which apparently did not survive the ages till now. Most were in Fortran, but S was also mentioned. And another version of the Tierney, Kass and Kadane paper on Laplace approximations. And the reference paper of Dennis Lindley [who was already retired from UCL at that time!] on the Hardy-Weinberg equilibrium. And another paper by Don Rubin on using SIR (Rubin, 1983) for simulating from posterior distributions with missing data, ten years before the particle filter paper, and apparently missing the possibility of weights with infinite variance.

There already were some illustrations of Bayesian analysis in action, including one by Jay Kadane reproduced in his book. And several papers by Jim Berger, Tony O’Hagan, Luis Pericchi and others on imprecise Bayesian modelling, which was in tune with the era, with the imprecise probability book by Peter Walley about to appear. And a paper by Shaw on numerical integration that mentioned quasi-random methods, applied to a 12 component Normal mixture. Overall, much less theoretical content than I would have expected. And nothing about shrinkage estimators, although a fraction of the speakers had worked on this topic most recently.

At a less fundamental level, this was a time when LaTeX was becoming a standard, as shown by a few papers in the volume (and as I was to find when visiting Purdue the year after), even though most were still typed on a typewriter, including a manuscript addition by Dennis Lindley. And Warwick appeared as a Bayesian hotspot, with at least five papers written by people there permanently or on a long term visit. (In case a local is interested in it, I have kept the volume, to be found in my new office!)

divide & reconquer

Posted in Books, Statistics, University life on February 5, 2018 by xi'an

Qi Liu, Anindya Bhadra, and William Cleveland from Purdue have arXived a paper entitled Divide and Recombine for Large and Complex Data: Model Likelihood Functions using MCMC, which is a variation on the earlier divide & … papers attempting to handle large datasets. The beginning is quite similar to these earlier papers in that the likelihood is split into sub-likelihoods, approximated from MCMC samples and recombined into an approximate full likelihood. As in, for instance, Scott et al., one approximation used for the subsamples is to replace the likelihood with a Normal approximation, or a skew-Normal generalisation, which remains a limited choice for heavy-tailed likelihoods, producing a Normal or skew-Normal approximation for the whole [data] likelihood, respectively. If I understand correctly, these approximations are missing a normalising constant to bring them to scale with the true likelihood, which I do not completely understand as the likelihood only needs to be defined up to a [multiplicative] constant for most purposes, including Bayesian ones. The method of estimation of this constant proposed therein is called the contour probability algorithm and it consists in using a highest density region to compare a likelihood and its approximation, returning a form of qq-plot. (Nothing to do with our adaptation of Gelfand and Dey (1994) based on HPDs, with Darren Wraith. Nor with nested sampling.) This is rather exploratory, while hardly addressing the issue of the precision of such approximations and the resolution of conflicting proposals. Or the comparison with all these other recent proposals for splitting likelihoods into manageable bits (proposals that are mentioned in the final section, including our recentering scheme with my student Changye Wu).
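
For concreteness, here is a minimal sketch of the generic divide-and-recombine idea with Normal approximations to the sub-posteriors, closer in spirit to Scott et al.'s consensus recombination than to the specific recombination of this paper; the toy Normal-mean model, the flat prior, and the exact per-shard draws standing in for per-shard MCMC output are all my own simplifications.

```python
import numpy as np

rng = np.random.default_rng(4)

# toy data: N(theta, sigma^2) with known sigma, flat prior on theta, split into K shards
theta_true, sigma, n, K = 2.0, 1.0, 10_000, 10
data = rng.normal(theta_true, sigma, size=n)
shards = np.array_split(data, K)

# per-shard step: a Normal approximation fitted from draws of each flat-prior
# sub-posterior (exact draws here, standing in for per-shard MCMC output)
sub_means, sub_precisions = [], []
for shard in shards:
    draws = rng.normal(shard.mean(), sigma / np.sqrt(len(shard)), size=5_000)
    sub_means.append(draws.mean())
    sub_precisions.append(1.0 / draws.var())

# recombination: multiply the K Gaussian approximations, i.e. a precision-weighted average
sub_means, sub_precisions = np.array(sub_means), np.array(sub_precisions)
post_precision = sub_precisions.sum()
post_mean = (sub_precisions * sub_means).sum() / post_precision

print(post_mean, 1.0 / np.sqrt(post_precision))   # close to data.mean() and sigma/sqrt(n)
```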