**W**ith Grégoire Clarté, Robin Ryder and Julien Stoehr, all from Paris-Dauphine, we have just arXived a paper on the specifics of ABC-Gibbs, which is a version of ABC where the generic ABC accept-reject step is replaced by a sequence of n conditional ABC accept-reject steps, each aiming at an ABC version of a conditional distribution extracted from the joint and intractable target. Hence an ABC version of the standard Gibbs sampler. What makes it so special is that each conditional can (and should) be conditioning on a different statistic in order to decrease the dimension of this statistic, ideally down to the dimension of the corresponding component of the parameter. This successfully bypasses the curse of dimensionality but immediately meets with two difficulties. The first one is that the resulting sequence of conditionals is not coherent, since it is not a Gibbs sampler on the ABC target. The conditionals are thus incompatible and therefore convergence of the associated Markov chain becomes an issue. We produce sufficient conditions for the Gibbs sampler to converge to a stationary distribution using incompatible conditionals. The second problem is then that, provided it exists, the limiting and also intractable distribution does not enjoy a Bayesian interpretation, hence may fail to be justified from an inferential viewpoint. We however succeed in producing a version of ABC-Gibbs in a hierarchical model where the limiting distribution can be explicited and even better can be weighted towards recovering the original target. (At least with limiting zero tolerance.)

## Archive for stationarity

## ABC with Gibbs steps

Posted in Statistics with tags ABC, ABC-Gibbs, Approximate Bayesian computation, Bayesian inference, bois de Boulogne, compatible conditional distributions, contraction, convergence, ergodicity, France, Gibbs sampler, hierarchical Bayesian modelling, incompatible conditionals, La Défense, Paris, stationarity, tolerance, Université Paris Dauphine on June 3, 2019 by xi'an## Gibbs for incompatible kids

Posted in Books, Statistics, University life with tags Bayesian GANs, convergence of Gibbs samplers, GANs, Gibbs for Kids, Gibbs sampling, irreducibility, JCGS, Markov chains, MCMC algorithms, Monte Carlo Statistical Methods, stationarity on September 27, 2018 by xi'an**I**n continuation of my earlier post on Bayesian GANs, which resort to strongly incompatible conditionals, I read a 2015 paper of Chen and Ip that I had missed. (Published in the Journal of Statistical Computation and Simulation which I first confused with JCGS and which I do not know at all. Actually, when looking at its editorial board, I recognised only one name.) But the study therein is quite disappointing and not helping as it considers Markov chains on finite state spaces, meaning that the transition distributions are matrices, meaning also that convergence is ensured if these matrices have no null probability term. And while the paper is motivated by realistic situations where incompatible conditionals can reasonably appear, the paper only produces illustrations on two and three states Markov chains. Not that helpful, in the end… The game is still afoot!

## flea circus

Posted in Books, Kids, pictures, R, Statistics with tags coupling, cross validated, fleas, Leonhard Euler, Markov chains, Markov random field, Monte Carlo integration, occupancy, Poisson approximation, R, random walk, stationarity on December 8, 2016 by xi'an**A**n old riddle found on X validated asking for Monte Carlo resolution but originally given on Project Euler:

A 30×30 grid of squares contains 30² fleas, initially one flea per square. When a bell is rung, each flea jumps to an adjacent square at random. What is the expected number of unoccupied squares after 50 bell rings, up to six decimal places?

The debate on X validated is whether or not a Monte Carlo resolution is feasible. Up to six decimals, certainly not. But with some lower precision, certainly. Here is a rather basic R code where the 50 steps are operated on the 900 squares, rather than the 900 fleas. This saves some time by avoiding empty squares.

xprmt=function(n=10,T=50){ mean=0 for (t in 1:n){ board=rep(1,900) for (v in 1:T){ beard=rep(0,900) if (board[1]>0){ poz=c(0,1,0,30) ne=rmultinom(1,board[1],prob=(poz!=0)) beard[1+poz]=beard[1+poz]+ne} # for (i in (2:899)[board[-1][-899]>0]){ u=(i-1)%%30+1;v=(i-1)%/%30+1 poz=c(-(u>1),(u<30),-30*(v>1),30*(v<30)) ne=rmultinom(1,board[i],prob=(poz!=0)) beard[i+poz]=beard[i+poz]+ne} # if (board[900]>0){ poz=c(-1,0,-30,0) ne=rmultinom(1,board[900],prob=(poz!=0)) beard[900+poz]=beard[900+poz]+ne} board=beard} mean=mean+sum(board==0)} return(mean/n)}

The function returns an empirical average over n replications. With a presumably awkward approach to the borderline squares, since it involves adding zeros to keep the structure the same… Nonetheless, it produces an approximation that is rather close to the approximate expected value, in about 3mn on my laptop.

> exprmt(n=1e3) [1] 331.082 > 900/exp(1) [1] 331.0915

Further gains follow from considering only half of the squares, as there are two independent processes acting in parallel. I looked at an alternative and much faster approach using the stationary distribution, with the stationary being the Multinomial (450,(2/1740,3/1740…,4/1740,…,2/1740)) with probabilities proportional to 2 in the corner, 3 on the sides, and 4 in the inside. (The process, strictly speaking, has *no stationary distribution*, since it is periodic. But one can consider instead the subprocess indexed by even times.) This seems to be the case, though, when looking at the occupancy frequencies, after defining the stationary as:

inva=function(B=30){ return(c(2,rep(3,B-2),2,rep(c(3, rep(4,B-2),3),B-2),2,rep(3,B-2),2))}

namely

> mn=0;n=1e8 #14 clock hours! > proz=rep(c(rep(c(0,1),15),rep(c(1,0),15)),15)*inva(30) > for (t in 1:n) + mn=mn+table(rmultinom(1,450,prob=rep(1,450)))[1:4] > mn=mn/n > mn[1]=mn[1]-450 > mn 0 1 2 3 166.11 164.92 82.56 27.71 > exprmt(n=1e6) #55 clock hours!! [1] 165.36 165.69 82.92 27.57

my original confusion being that the Poisson approximation had not yet taken over… (Of course, computing the first frequency for the stationary distribution does not require any simulation, since it is the sum of the complement probabilities to the power 450, i.e., 166.1069.)

## solution manual for Bayesian Essentials with R

Posted in Books, Kids, Statistics, University life with tags Bayesian Core, Bayesian Essentials with R, lag polynomial, Lemer-Schur algorithm, solution manual, Springer-Verlag, stationarity on March 18, 2015 by xi'anThe solution manual to our *Bayesian Essentials with R* has just been arXived. If I link this completion with the publication date of the book itself, it sure took an unreasonable time to come out and sadly with no obvious reason or even less justification for the delay… Given the large overlap with the solution manual of the previous edition, *Bayesian Core*, this version should have been completed much much earlier but, paradoxically if in-line with the lengthy completion of the book istelf, this previous manual is one of the causes for the delay, as we thought the overlap allowed for self-study readers to check some of the exercises. Prodded by Hannah Bracken from Springer-Verlag, and unable to hire an assistant towards this task, I eventually decided to spend the few days required to clean up this solution manual, with the unintentional help from my sorry excuse for an Internet provider who accidentally cutting my home connection for a whole week so far…!

In the course of writing solutions, I stumbled upon one inexplicably worded exercise about the Lemer-Schur algorithm for testing stationarity, exercise that I had to rewrite from scratch. Apologies to any reader of *Bayesian Essentials with R* getting stuck on that exercise!!!

## non-stationary AR(10)

Posted in Books, R, Statistics, University life with tags AR(p) model, Bayesian Core, polynomials, R, stationarity, time series on January 19, 2012 by xi'an**I**n the revision of * Bayesian Core* on which Jean-Michel Marin and I worked together most of last week, having missed our CIRM break last summer (!), we have now included an illustration of what happens to an AR(p) time series when the customary stationarity+causality condition on the roots of the associated polynomial is not satisfied. More specifically, we generated several time-series with the same underlying white noise and random coefficients that have a fair chance of providing non-stationary series and then plotted the 260 next steps of the series by the R code

p=10 T=260 dat=seri=rnorm(T) #white noise par(mfrow=c(2,2),mar=c(2,2,1,1)) for (i in 1:4){ coef=runif(p,min=-.5,max=.5) for (t in ((p+1):T)) seri[t]=sum(coef*seri[(t-p):(t-1)])+dat[t] plot(seri,ty="l",lwd=2,ylab="") }

leading to outputs like the following one

## Time series

Posted in Books, R, Statistics with tags ARMA models, Bayesian statistics, graduate course, stationarity, Statistics Forum, textbook, time series on March 29, 2011 by xi'an*(This post got published on The Statistics Forum yesterday.)*

**T**he short book review section of the International Statistical Review sent me Raquel Prado’s and Mike West’s book, ** Time Series (Modeling, Computation, and Inference)** to review. The current post is not about this specific book, but rather on why I am unsatisfied with the textbooks in this area (and correlatively why I am always reluctant to teach a graduate course on the topic). Again, I stress that

*the following is not specifically about the book by Raquel Prado and Mike West!*

**W**ith the noticeable exception of Brockwell and Davis’ ** Time Series: Theory and Methods**, most time-series books seem to suffer (in my opinion) from the same difficulty, which sums up as being unable to provide the reader with a coherent and logical description of/introduction to the field. (This echoes a complaint made by Håvard Rue a few weeks ago in Zurich.) Instead, time-series books appear to haphazardly pile up notions and techniques, theory and methods, without paying much attention to the coherency of the presentation. That’s how I was introduced to the field (even though it was by a fantastic teacher!) and the feeling has not left me since then. It may be due to the fact that the field stemmed partly from signal processing in engineering and partly from econometrics, but such presentations never achieve a Unitarian front on how to handle time-series. In particular, the opposition between the time domain and the frequency domain always escapes me. This is presumably due to my inability to see the relevance of the spectral approach, as harmonic regression simply appears (to me) as a special type of non-linear regression with sinusoidal regressors and with a well-defined likelihood that does not require Fourier frequencies nor periodogram (nor either spectral density estimation). Even within the time domain, I find the handling of stationarity by time-series book to be mostly cavalier. Why stationarity is important is never addressed, which leads to the reader being left with the hard choice between imposing stationarity and not imposing stationarity. (My original feeling was to let the issue being decided by the data, but this is not possible!) Similarly, causality is often invoked as a reason to set constraints on MA coefficients, even though this resorts to a non-mathematical justification, namely preventing dependence on the future. I thus wonder if being an Unitarian (i.e. following a single logical process for analysing time-series data) is at all possible in the time-series world! E.g., in

*, we processed AR, MA, ARMA models in a single perspective, conditioning on the initial values of the series and imposing all the usual constraints on the roots of the lag polynomials but this choice was far from perfectly justified…*

**Bayesian Core**