Archive for stationarity

flea circus

Posted in Books, Kids, pictures, R, Statistics with tags , , , , , , , , , , , on December 8, 2016 by xi'an

gribAn old riddle found on X validated asking for Monte Carlo resolution  but originally given on Project Euler:

A 30×30 grid of squares contains 30² fleas, initially one flea per square. When a bell is rung, each flea jumps to an adjacent square at random. What is the expected number of unoccupied squares after 50 bell rings, up to six decimal places?

The debate on X validated is whether or not a Monte Carlo resolution is feasible. Up to six decimals, certainly not. But with some lower precision, certainly. Here is a rather basic R code where the 50 steps are operated on the 900 squares, rather than the 900 fleas. This saves some time by avoiding empty squares.


 for (t in 1:n){

   for (v in 1:T){

    if (board[1]>0){
    for (i in (2:899)[board[-1][-899]>0]){
    if (board[900]>0){

The function returns an empirical average over n replications. With a presumably awkward approach to the borderline squares, since it involves adding zeros to keep the structure the same… Nonetheless, it produces an approximation that is rather close to the approximate expected value, in about 3mn on my laptop.

> exprmt(n=1e3)
[1] 331.082
> 900/exp(1)
[1] 331.0915

Further gains follow from considering only half of the squares, as there are two independent processes acting in parallel. I looked at an alternative and much faster approach using the stationary distribution, with the stationary being the Multinomial (450,(2/1740,3/1740…,4/1740,…,2/1740)) with probabilities proportional to 2 in the corner, 3 on the sides, and 4 in the inside. (The process, strictly speaking, has no stationary distribution, since it is periodic. But one can consider instead the subprocess indexed by even times.) This seems to be the case, though, when looking at the occupancy frequencies, after defining the stationary as:



> mn=0;n=1e8 #14 clock hours!
> proz=rep(c(rep(c(0,1),15),rep(c(1,0),15)),15)*inva(30)
> for (t in 1:n)
+ mn=mn+table(rmultinom(1,450,prob=rep(1,450)))[1:4]
> mn=mn/n
> mn[1]=mn[1]-450
> mn
     0      1      2     3
166.11 164.92  82.56 27.71
> exprmt(n=1e6) #55 clock hours!!
[1] 165.36 165.69 82.92 27.57

my original confusion being that the Poisson approximation had not yet taken over… (Of course, computing the first frequency for the stationary distribution does not require any simulation, since it is the sum of the complement probabilities to the power 450, i.e., 166.1069.)

solution manual for Bayesian Essentials with R

Posted in Books, Kids, Statistics, University life with tags , , , , , , on March 18, 2015 by xi'an

The solution manual to our Bayesian Essentials with R has just been arXived. If I link this completion with the publication date of the book itself, it sure took an unreasonable time to come out and sadly with no obvious reason or even less justification for the delay… Given the large overlap with the solution manual of the previous edition, Bayesian Core, this version should have been completed much much earlier but, paradoxically if in-line with the lengthy completion of the book istelf, this previous manual is one of the causes for the delay, as we thought the overlap allowed for self-study readers to check some of the exercises. Prodded by Hannah Bracken from Springer-Verlag, and unable to hire an assistant towards this task, I eventually decided to spend the few days required to clean up this solution manual, with the unintentional help from my sorry excuse for an Internet provider who accidentally cutting my home connection for a whole week so far…!

solmanIn the course of writing solutions, I stumbled upon one inexplicably worded exercise about the Lemer-Schur algorithm for testing stationarity, exercise that I had to rewrite from scratch. Apologies to any reader of Bayesian Essentials with R getting stuck on that exercise!!!

non-stationary AR(10)

Posted in Books, R, Statistics, University life with tags , , , , , on January 19, 2012 by xi'an

In the revision of Bayesian Core on which Jean-Michel Marin and I worked together most of last week, having missed our CIRM break last summer (!), we have now included an illustration of what happens to an AR(p) time series when the customary stationarity+causality condition on the roots of the associated polynomial is not satisfied.  More specifically, we generated several time-series with the same underlying white noise and random coefficients that have a fair chance of providing non-stationary series and then plotted the 260 next steps of the series by the R code

dat=seri=rnorm(T) #white noise

for (i in 1:4){
  for (t in ((p+1):T))

leading to outputs like the following one

Time series

Posted in Books, R, Statistics with tags , , , , , , on March 29, 2011 by xi'an

(This post got published on The Statistics Forum yesterday.)

The short book review section of the International Statistical Review sent me Raquel Prado’s and Mike West’s book, Time Series (Modeling, Computation, and Inference) to review. The current post is not about this specific book, but rather on why I am unsatisfied with the textbooks in this area (and correlatively why I am always reluctant to teach a graduate course on the topic). Again, I stress that the following is not specifically about the book by Raquel Prado and Mike West!

With the noticeable exception of Brockwell and Davis’ Time Series: Theory and Methods, most time-series books seem to suffer (in my opinion) from the same difficulty, which sums up as being unable to provide the reader with a coherent and logical description of/introduction to the field. (This echoes a complaint made by Håvard Rue a few weeks ago in Zurich.) Instead, time-series books appear to haphazardly pile up notions and techniques, theory and methods, without paying much attention to the coherency of the presentation. That’s how I was introduced to the field (even though it was by a fantastic teacher!) and the feeling has not left me since then. It may be due to the fact that the field stemmed partly from signal processing in engineering and partly from econometrics, but such presentations never achieve a Unitarian front on how to handle time-series. In particular, the opposition between the time domain and the frequency domain always escapes me. This is presumably due to my inability to see the relevance of the spectral approach, as harmonic regression simply appears (to me) as a special type of non-linear regression with sinusoidal regressors and with a well-defined likelihood that does not require Fourier frequencies nor periodogram (nor either spectral density estimation). Even within the time domain, I find the handling of stationarity  by time-series book to be mostly cavalier. Why stationarity is important is never addressed, which leads to the reader being left with the hard choice between imposing stationarity and not imposing stationarity. (My original feeling was to let the issue being decided by the data, but this is not possible!) Similarly, causality is often invoked as a reason to set constraints on MA coefficients, even though this resorts to a non-mathematical justification, namely preventing dependence on the future. I thus wonder if being an Unitarian (i.e. following a single logical process for analysing time-series data) is at all possible in the time-series world! E.g., in Bayesian Core, we processed AR, MA, ARMA models in a single perspective, conditioning on the initial values of the series and imposing all the usual constraints on the roots of the lag polynomials but this choice was far from perfectly justified…