Archive for jump process

non-reversibility in discrete spaces

Posted in Books, Statistics, University life on January 3, 2020 by xi'an

Following a recent JASA paper by Giacomo Zanella (which I have not yet read but which is discussed on this blog), Sam Power and Jacob Goldman have recently arXived a paper on Accelerated sampling on discrete spaces with non-reversible Markov processes, where they use continuous-time, non-reversible algorithms à la PDMP, even though differential equations do not exist on discrete spaces. More specifically, they devise discrete versions of the coordinate sampler and of the Zig-Zag sampler, using Markov jump processes instead of differential equations, with detailed balance imposed on the jump rates rather than on the Markov kernel. The use of jump processes originates at least from Peskun (1973) and is connected with MCMC algorithms in Matthew Stephens' 1999 PhD thesis. A neat feature of discrete settings is that the jump process can be implemented with no discretisation! However, as we noticed when working on birth-and-death processes with Olivier Cappé and Tobias Rydén, there is a potential for disastrous implementations if an infinite sequence of instantaneous moves (out of zero probability states) is proposed.
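For intuition, here is a minimal sketch (of my own, on a hypothetical toy target over {0,…,9}, not the authors' algorithm) of the generic discretisation-free mechanism: detailed balance is imposed on the jump rates, each state is held for an exponential time, and occupation times replace sample counts.

```python
import math
import random

# Hypothetical toy target on {0,...,9} (unnormalised weights)
target = [math.exp(-0.5 * (k - 4.5) ** 2) for k in range(10)]

def jump_process_sample(n_jumps, seed=0):
    """Continuous-time jump sampler on a discrete space: detailed balance
    holds on the jump rates, pi(x) q(x,y) = pi(y) q(y,x), here with
    Metropolis-type rates q(x,y) = min(1, pi(y)/pi(x)) towards nearest
    neighbours; holding times are exponential, so no discretisation."""
    rng = random.Random(seed)
    x = 0
    occupation = [0.0] * len(target)  # time spent in each state
    for _ in range(n_jumps):
        nbrs = [y for y in (x - 1, x + 1) if 0 <= y < len(target)]
        rates = [min(1.0, target[y] / target[x]) for y in nbrs]
        total = sum(rates)
        occupation[x] += rng.expovariate(total)  # exponential holding time
        u, acc = rng.random() * total, 0.0
        for y, r in zip(nbrs, rates):  # jump proportionally to the rates
            acc += r
            if u <= acc:
                x = y
                break
    return occupation
```

The normalised occupation times then approximate the target, and one sees where the danger mentioned above lies: a state with near-zero probability has enormous exit rates, hence near-instantaneous holding times.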

The authors make the further assumption that the discrete space is endowed with a graphical structure, with a group G acting upon this graph and an involution keeping the target (or a completion of the original target) invariant. In this framework, reversibility amounts to repeatedly using (group) generators þ of low order (as in Bayesian variable selection, binary spin systems, where þ.þ=id, and other permutation problems), since they bring the chain back to its starting point. Their first sampler is called a Tabu sampler for avoiding such behaviour, forcing the next step to use other generators þ in the generator set Þ thanks to a binary auxiliary variable that partitions Þ into forward vs backward moves. For high order generators, the discrete coordinate and Zig-Zag samplers instead repeatedly use the same generator (although it is unclear to me why this is beneficial, given that neither graph nor generator is necessarily linked with the target), with the coordinate sampler being again much cheaper since it only looks at one direction in the generator group.
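To make the low-order point concrete, here is a two-line illustration (mine, not from the paper) of why order-two generators such as bit flips induce backtracking: applying the same generator twice returns the chain to its starting point, so a reversible chain reusing it wastes its move.

```python
def flip(x, i):
    """Bit-flip generator on {0,1}^d: an involution, flip(flip(x,i),i) = x."""
    y = list(x)
    y[i] ^= 1
    return tuple(y)
```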

The paper contains a range of comparisons with (only) Zanella's sampler, some presenting heavy gains in terms of ESS, including one on hundreds of sensors in a football stadium. As I am not particularly familiar with these examples, except for the Bayesian variable selection one, I found it rather hard to determine whether or not the compared samplers were indeed exploring the entirety of the (highly complex and high-dimensional) target. The collection of examples is however quite rich and supports the use of such non-reversible schemes. It may also be that the discrete nature of the target could facilitate the theoretical study of their convergence properties.

my [homonym] talk this afternoon at CREST [Paris-Saclay]

Posted in pictures, Statistics, University life on March 4, 2019 by xi'an

Christian ROBERT (Université Lyon 1): « How large is the jump discontinuity in the diffusion coefficient of an Itô diffusion? »

Time: 3:30 pm – 4:30 pm
Date: 4th of March 2019
Place: Room 3105

Abstract: We consider high frequency observations from a one-dimensional diffusion process Y. We assume that the diffusion coefficient σ is continuously differentiable, but with a jump discontinuity at some level y. Such a diffusion has already been considered as a local volatility model for the underlying price of an asset, but raises several issues for pricing European options or for hedging such derivatives. We introduce kernel sign-constrained estimators of the left and right limits of σ at y, but up to constant factors. We present and discuss the asymptotic properties of these kernel estimators. We then propose a method to evaluate these constant factors by looking for bandwidths for which the kernel estimators are stable by iteration. We finally provide an estimator of the jump discontinuity size and discuss its convergence rate.
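Out of curiosity, here is a naive sketch (my own, with a uniform kernel, and not the sign-constrained estimators of the talk) of the two-sided idea: simulate an Euler scheme for a toy diffusion whose coefficient jumps at a level, then average squared increments starting within a bandwidth of that level, separately on each side.

```python
import math
import random

def simulate(n=200000, dt=1e-4, y0=0.0, lvl=0.0, sl=1.0, sr=2.0, seed=1):
    """Euler scheme for dY = sigma(Y) dW, with a hypothetical sigma
    jumping from sl (below lvl) to sr (at and above lvl)."""
    rng = random.Random(seed)
    path, y = [y0], y0
    for _ in range(n):
        s = sl if y < lvl else sr
        y += s * math.sqrt(dt) * rng.gauss(0, 1)
        path.append(y)
    return path, dt

def side_vol(path, dt, lvl, h, side):
    """Naive one-sided kernel estimate of sigma at lvl: average
    (dY)^2/dt over increments starting within bandwidth h of lvl,
    on the requested side (uniform kernel)."""
    num = cnt = 0
    for a, b in zip(path, path[1:]):
        d = a - lvl
        if (side == "left" and -h < d < 0) or (side == "right" and 0 <= d < h):
            num += (b - a) ** 2 / dt
            cnt += 1
    return math.sqrt(num / cnt) if cnt else float("nan")
```

With high frequency data the left and right averages separate cleanly around the level, which is the starting point that the talk's constant-factor and bandwidth-iteration refinements then address.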

JSM 2015 [day #2]

Posted in Books, R, Statistics, Travel, University life on August 11, 2015 by xi'an

Today, at JSM 2015, in Seattle, I attended several Bayesian sessions, having sadly missed the Dennis Lindley memorial session yesterday, as it clashed with my own session. In the morning sessions on Bayesian model choice, David Rossell (Warwick) defended non-local priors à la Johnson (& Rossell) as having better frequentist properties. Although I appreciate the concept of eliminating a neighbourhood of the null in the alternative prior, even from a Bayesian viewpoint since it forces us to declare explicitly when the null is no longer acceptable, I find the asymptotic motivation for the prior less commendable and open to arbitrary choices that may lead to huge variations in the numerical value of the Bayes factor. Another talk by Jin Wang merged spike and slab with EM with bootstrap with random forests in variable selection, but I could not fathom what the intended properties of the method were… besides returning another type of MAP.

The second Bayesian session of the morn was mostly centred on sparsity and penalisation, with Carlos Carvalho and Rob McCulloch discussing a two step method that goes through a standard posterior construction on the saturated model, before using a utility function to select the pertinent variables. Separation of utility from prior was a novel concept for me, if not for Jay Kadane who objected to Rob a few years ago that he put in the prior what should be in the utility… New for me because I always considered the product prior × utility as the main brick in building the Bayesian edifice… Following Herman Rubin's motto! Veronika Rocková linked with this post-LASSO perspective by studying spike & slab priors based on Laplace priors. While Veronika's goal was to achieve sparsity and consistency, this modelling made me wonder at the potential equivalent in our mixtures for testing approach. I concluded that having a mixture of two priors could be translated in a mixture over the sample with two different parameters, each with a different prior. A different topic, namely multiple testing, was treated by Jim Berger, who showed convincingly in my opinion that a Bayesian approach provides a significant advantage.

In the afternoon finalists of the ISBA Savage Award presented their PhD work, both in the theory and methods section and in the application section. Besides Veronika Rocková's work on a Bayesian approach to factor analysis, with a remarkable resolution via a non-parametric Indian buffet prior and a variable selection interpretation that avoids MCMC difficulties, Vinayak Rao wrote his thesis on MCMC methods for jump processes with a finite number of observations, using a highly convincing completion scheme that created independence between blocks and which reminded me of the Papaspiliopoulos et al. (2005) trick for continuous time processes. I do wonder at the potential impact of this method for processing the coalescent trees in population genetics. Two talks dealt with inference on graphical models, by Masanao Yajima and Christine Peterson, inferring the structure of a sparse graph by Bayesian methods, with applications in protein networks, and with again a spike & slab prior in Christine's work. The last talk by Sayantan Banerjee was connected to most others in this Savage session in that it also dealt with sparsity, when estimating a large covariance matrix. (It is always interesting to try to spot tendencies in awards and conferences. Following the Bayesian non-parametric era, are we now entering the Bayesian sparsity era? We will see if this is the case at ISBA 2016!) And the winner is..?! We will know tomorrow night! In the meanwhile, congrats to my friends Sudipto Banerjee, Igor Prünster, Sylvia Richardson, and Judith Rousseau who were named IMS Fellows tonight.

A Vanilla Rao-Blackwellisation (comments)

Posted in Statistics on August 26, 2009 by xi'an

One of the authors of "On convergence of importance sampling and other properly weighted samples to the target distribution" by S. Malefaki and G. Iliopoulos sent me their paper (now published in JSPI, 2008, pp. 1210-1225) to point out the connection with our Vanilla Rao-Blackwellisation paper. There is indeed a link in that those authors also exploit the sequence of accepted values in an MCMC sequence to build up geometric weights based on the distribution of those accepted rvs. The paper also relates more strongly to the series of papers published by Jun Liu and coauthors in JASA in the early 2000s about random importance weights, and even more to the birth-and-death jump processes introduced by Brian Ripley in his 1987 simulation book, studied in Geyer and Møller (1994), Grenander and Miller (1994) and Phillips and Smith (1996), which led to the birth-and-death MCMC approach of Matthew Stephens in his thesis and 2000 Annals paper. As later analysed in our 2003 Series B paper, this jump process approach is theoretically valid but may lead to difficulties at the implementation stage. The first one is that each proposed value is accepted, albeit briefly, and thus that, with proposals that have a null recurrent or transient behaviour, it may take "forever" to go to infinity and back. The second one is that the perspective offered by this representation (which in the case of the standard Metropolis algorithm does not involve any modification) gives a vision of Metropolis algorithms as a rough version of an importance sampling algorithm. While this somehow is also the case for our Vanilla paper, the whole point of using a Metropolis or a Gibbs algorithm is exactly to avoid picking an importance sampling distribution in complex settings, since such distributions are almost necessarily inefficient, and instead to exploit some features of the target to build the proposals.
(This is obviously a matter of perspective on the presentation of the analysis in the above paper, nothing being wrong with its mathematics.)
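To illustrate the representation (a sketch of my own, with hypothetical function names, not taken from the above paper): a standard random-walk Metropolis chain can be stored as its accepted values together with their occupation counts, the geometric weights in question, and ergodic averages computed from this weighted sample coincide with the usual ones.

```python
import math
import random

def metropolis_with_weights(logp, x0, n_iter, scale=1.0, seed=0):
    """Random-walk Metropolis recording only the accepted values along
    with their occupation counts: each accepted value is held for a
    geometric number of iterations, so the output is a properly
    weighted sample for the target."""
    rng = random.Random(seed)
    x, w = x0, 1
    samples, weights = [], []
    for _ in range(n_iter):
        y = x + scale * rng.gauss(0, 1)
        if math.log(rng.random()) < logp(y) - logp(x):
            samples.append(x)   # bank the current value and its count
            weights.append(w)
            x, w = y, 1
        else:
            w += 1              # rejection: held for one more iteration
    samples.append(x)
    weights.append(w)
    return samples, weights
```

The occupation-count-weighted average over the accepted values equals the ordinary ergodic average over the full chain, which is exactly the sense in which Metropolis output can be read as (a rough form of) importance sampling.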