## Archive for Markov chain

## [non] Markov chains

Posted in pictures, Running, Travel with tags Amsterdam, canals, chain, Holland, Markov chain, sunrise on April 24, 2015 by xi'an## Hamiltonian ABC

Posted in Books, pictures, Statistics, University life with tags ABC, Amsterdam, Hamiltonian Monte Carlo, Markov chain, Monte Carlo Statistical Methods, pseudo-random generator, random seed, synthetic likelihood on March 13, 2015 by xi'an**O**n Monday, Ed Meeds, Robert Leenders, and Max Welling (from Amsterdam) arXived a paper entitled Hamiltonian ABC. Before looking at the paper in any detail, I got puzzled by this association of antagonistic terms, since ABC is intended for complex and mostly intractable likelihoods, while Hamiltonian Monte Carlo requires a lot from the target, in order to compute gradients and Hessians… *[Warning: some graphs on pages 13-14 may be harmful to your printer!]*

Somewhat obviously (ex-post!), the paper suggests to use Hamiltonian dynamics on ABC approximations of the likelihood. They compare a Gaussian kernel version

with the synthetic Gaussian likelihood version of Wood (2010)

where both mean and variance are estimated from the simulated data. If ε is taken as an external quantity and driven to zero, the second approach is much more stable. But… ε is never driven to zero in ABC, or fixed at ε=0.37: It is instead considered as a kernel bandwidth and hence estimated from the simulated data. Hence ε is commensurable with σ(θ). And this makes me wonder at the relevance of the conclusion that synthetic is better than kernel for Hamiltonian ABC. More globally, I wonder at the relevance of better simulating from a still approximate target when the true goal is to better approximate the genuine posterior.

Some of the paper covers separate issues like handling gradient by finite differences à la Spall *[if you can afford it!]* and incorporating the random generator as part of the Markov chain. And using S *common* random numbers in computing the gradients for all values of θ. (Although I am not certain all random generators can be represented as a deterministic transform of a parameter θ and of a fixed number of random uniforms. But the authors may consider a random number of random uniforms when they represent their random generators as deterministic transform of a parameter θ and of the random seed. I am also uncertain about the distinction between common, sticky, and persistent random numbers!)

## aperiodic Gibbs sampler

Posted in Books, Kids, pictures, Statistics, Travel, University life with tags aperiodicity, convergence, cross validated, Gibbs sampler, Markov chain, MCMC algorithms, Monte Carlo Statistical Methods, skeleton chain on February 11, 2015 by xi'an**A** question on Cross Validated led me to realise I had never truly considered the issue of periodic Gibbs samplers! In MCMC, non-aperiodic chains are a minor nuisance in that the skeleton trick of randomly subsampling the Markov chain leads to a aperiodic Markov chain. (The picture relates to the skeleton!) Intuitively, while the systematic Gibbs sampler has a tendency to non-reversibility, it seems difficult to imagine a sequence of full conditionals that would force the chain away from the current value..!In the discrete case, given that the current state of the Markov chain has positive probability for the target distribution, the conditional probabilities are all positive as well and hence the Markov chain can stay at its current value after one Gibbs cycle, with positive probabilities, which means strong aperiodicity. In the continuous case, a similar argument applies by considering a neighbourhood of the current value. (Incidentally, the same person asked a question about the absolute continuity of the Gibbs kernel. Being confused by our chapter on the topic!!!)

## an ISBA tee-shirt?!

Posted in Kids, pictures, University life with tags BAYSM 2014, ISBA, Markov chain, mug, tee-shirt, werewolf on September 19, 2014 by xi'an **S**onia Petrone announced today at BAYSM’14 that a competition was open for the design of an official ISBA tee-shirt! The deadline is October 15 and the designs are to be sent to Clara Grazian, currently at CEREMADE, Université Dauphine [that should be enough to guess her email!]. I will most certainly submit my mug design. And maybe find enough free time to design a fake eleven Paris with moustache tee-shirt. With Bayes’ [presumed] portrait of course…

## chain event graphs [RSS Midlands seminar]

Posted in pictures, Statistics, University life with tags Bayes nets, chain event graphs, DAG, directed acyclic graphs, graphs, lumpable Markov chain, Markov chain, Midlands, RSS, University of Warwick, variable length Markov chain on October 16, 2013 by xi'an**L**ast evening, I attended the RSS Midlands seminar here in Warwick. The theme was chain event graphs (CEG), As I knew nothing about them, it was worth my time listening to both speakers and discussing with Jim Smith afterwards. CEGs are extensions of Bayes nets with originally many more nodes since they start with the probability tree involving all modalities of all variables. Intensive Bayesian model comparison is then used to reduce the number of nodes by merging modalities having the same children or removing variables with no impact on the variable of interest. So this is not exactly a new Bayes net based on modality dummies as nodes (my original question). This is quite interesting, esp. in the first talk illustration of using missing value indicators as a supplementary variable (to determine whether or not data is missing at random). I also wonder how much of a connection there is with variable length Markov chains (either as a model or as a way to prune the tree). A last vague idea is a potential connection with *lumpable* Markov chains, a concept I learned from Kemeny & Snell (1960): a finite Markov chain is lumpable if by merging two or more of its states it remains a Markov chain. I do not know if this has ever been studied from a statistical point of view, i.e. testing for lumpability, but this sounds related to the idea of merging modalities of some variables in the probability tree…

## A repulsive random walk

Posted in R, Statistics with tags Markov chain, R-bloggers, random walk, recurrence, transience on May 28, 2010 by xi'an**M**att Asher posted an R experiment on R-bloggers yesterday simulating the random walk

which has the property of avoiding zero by quickly switching to a large value as soon as is small. He was then wondering about the “convergence” of the random walk given that it moves very little once is large enough. The values he found for various horizons *t* seemed to indicate a stable regime.

**I** reran the same experiment as Matt in a Monte Carlo perspective, using the R program

resu=matrix(0,ncol=100,nrow=25) sampl=rnorm(100) for (i in 1:25){ for (t in 2^(i-1):2^i) sampl=sampl+rnorm(100)/sampl resu[i,]=sampl } boxplot(as.data.frame(t(abs(resu))),name=as.character(1:25),col="wheat3")

**T**he outcome of this R code plotted above shows that the range and the average of the 100 replications is increasing with *t*. This behaviour indicates a transient behaviour of the Markov chain, which almost surely goes to infinity and never comes back (because at infinity the variance is zero). Another indication for transience is shown by the fact that comes back to the interval *(-1,1)* with probability , a probability which goes to zero with . As suggested to me by Randal Douc, this transience can be established rigorously by considering

which is thus bounded from below by a null recurrent process, which almost surely goes to infinity. Therefore the above Markov chain cannot have a stationary distribution or even a stationary measure: it almost surely goes to (plus or minus) infinity.

## Computational Statistics

Posted in Books, R, Statistics with tags book review, bootstrap, BUGS, C, computational statistics, computer error, Fortran, graphical models, jackknife, James Gentle, Markov chain, MCMC, nonparametric probability density estimation, programming, R, random generator, simulation, Springer-Verlag, statistical computing on May 10, 2010 by xi'an

Do not resort to Monte Carlo methods unnecessarily.

**W**hen I received this 2009 Springer-Verlag book, ** Computational Statistics**, by James Gentle a while ago, I briefly took a look at the table of contents and decided to have a better look later… Now that I have gone through the whole book, I can write a short review on its scope and contents (to be submitted). Despite its title, the book aims at covering

*both*computational statistics and statistical computing. (With 752 pages at his disposal, Gentle can afford to do

*both*indeed!)

**T**he book ** Computational Statistics** is separated into four parts:

- Part I:
**Mathematical and statistical preliminaries**. - Part II:
**Statistical Computing**(Computer storage and arithmetic.- Algorithms and programming.- Approximation of functions and numerical quadrature.- Numerical linear algebra.- Solution of nonlinear equations and optimization.- Generation of random numbers.) - Part III:
**Methods of Computational Statistics**(Graphical methods in computational statistics.- Tools for identification of structure in data.- Estimation of functions.- Monte Carlo methods for statistical inference.- Data randomization, partitioning, and augmentation.- Bootstrap methods.) - Part IV:
**Exploring Data Density and Relationship**(Estimation of probability density functions using parametric models.- Nonparametric estimation of probability density functions.- Statistical learning and data mining.- Statistical models of dependencies.)

Computational inference, together with exact inference and asymptotic inference, is an important component of statistical methods.

**T**he first part of ** Computational Statistics **is indeed a preliminary containing essentials of math and probability and statistics. A reader unfamiliar with too many topics within this chapter should first consider improving his or her background in the corresponding area! This is a rather large chapter, with 82 pages, and it should not be extremely useful to readers, except to signal deficiencies in their background, as noted above. Given this purpose, I am not certain the selected exercises of this chapter are necessary (especially when considering that some involve tools introduced much later in the book).

The form of a mathematical expression and the way the expression should be evaluated in actual practice may be quite different .

**T**he second part of ** Computational Statistics **is truly about computing, meaning the theory of computation, i.e. of using computers for numerical approximation, with discussions about the representation of numbers in computers, approximation errors, and of course random number generators. While I judge George Fishman’s

*to provide a deeper and more complete coverage of those topics, I appreciate the need for reminding our students of those hardware subtleties as they often seem unaware of them, despite their advanced computer skills. This second part is thus a crash course of 250 pages on numerical methods (like function approximations by basis functions and …) and on random generators, i.e. cover the same ground as Gentle’s earlier books,*

**Monte Carlo***and*

**Random Number Generation and Monte Carlo Methods****, while the more recent**

*Numerical Linear Algebra for Applications in Statistics***looks very much like a shorter entry on the same topics as those of Parts III IV of**

*Elements of Computational Statistics***. This part could certainly sustain a whole semester undergraduate course while only advanced graduate students could be expected to gain from a self-study of those topics. It is nonetheless the most coherent and attractive part of the book. It constitutes a must-read for all students and researchers engaging into any kind of serious programming. Obviously, some notions are introduced a bit too superficially, given the scope of this section (as for instance Monte Carlo methods, in particular MCMC techniques that are introduced in less than six pages), but I came to realise this is the point of the book, which provides an entry into “all” necessary topics, along with links to the relevant literature (if missing**

*Computational Statistics**!). I however deplore that the important issue of Monte Carlo experiments, whose construction is often a hardship for students, is postponed till the 100 page long appendix. (I suspect that*

**Monte Carlo Statistical Methods***students do not read appendices*is another of those folk theorems!)

Monte Carlo methods differ from other methods of numerical analysis in yielding anestimaterather than anapproximation.

**T**he third and fourth parts of the book cover methods of computational statistics, including Monte Carlo methods, randomization and cross validation, the bootstrap, probability density estimation, and statistical learning. Unfortunately, I find the level of Part III to be quite uneven, where all chapters are short and rather superficial because they try to be all-encompassing. (For instance, Chapter 8 includes two pages on the RGB colour coding.) Part IV does a better job of presenting machine learning techniques, if not with the thoroughness of Hastie et al.’s ** The Elements of Statistical Learning: Data Mining, Inference, and Prediction**… It seems to me that the relevant sections of Part III would have fitted better where they belong in Part IV. For instance, Chapter 10 on estimation of functions only covers the evaluation of estimators of functions, postponing the construction of those estimators till Chapter 15. Jackknife is introduced on its own in Chapter 12 (not that I find this introduction ultimately necessary) without the bootstrap covered in eight pages in Chapter 13 (bootstrap for non-iid data is dismissed rather quickly, given the current research in the area). The first chapter of Part IV covers some (non-Bayesian) estimation approaches for parametric families, but I find this chapter somehow superfluous as it does not belong to the computational statistic domain (except as an

*approximation*method, as stressed in Section 14.4). While Chapter 16 is a valuable entry on clustering and data-analysis tools like PCA, the final section on high dimensions feels out of context (and mentioning the curse of dimensionality only that close to the end of the book does not seem appropriate). Chapter 17 on dependent data is missing the rich literature on graphical models and their use in the determination of dependence structures.

Programming is the best way to learn programming (read that again) .

**I**n conclusion, ** Computational Statistics** is a very diverse book that can be used at several levels as textbook, as well as a reference for researchers (even if as an entry towards further and deeper references). The book is well-written, in a lively and personal style. (I however object to the reduction of the notion of

*Markov chains*to discrete state-spaces!) There is no requirement for a specific programming language, although R is introduced in a somewhat dismissive way (R

*most serious flaw is usually lack of robustness*since

*some [packages] are not of high-quality*) and some exercises start with

*Design and write either a C or a Fortran subroutine*.

**BUGS**is not mentioned at all. The appendices of

**also contain the solutions to some exercises, even though the level of detail is highly variable, from one word (“1″) to one page (see, e.g., Exercise 11.4). The 20 page list of references is preceded by a few pages on available journals and webpages, which could get obsolete rather quickly. Despite the reservations raised above about some parts of**

*Computational Statistics***that would benefit from a deeper coverage, I think this book is a reference book that should appear in the shortlist of any computational statistics/statistical computing graduate course as well as on the shelves of any researcher supporting his or her statistical practice with a significant dose of computing backup.**

*Computational Statistics*