## rethinking the ESS published!

Posted in Statistics on May 3, 2022 by xi'an

Our paper Rethinking the Effective Sample Size, with Victor Elvira (the driving force behind the paper!) and Luca Martino, has now been published in the International Statistical Review! As discussed earlier on this blog, we wanted to re-evaluate the pros and cons of the effective sample size (ESS) as a tool for assessing the quality [or lack thereof] of a Monte Carlo approximation. It is particularly exploited in the specific context of importance sampling. The approximation stemming from a 1992 construction by Augustine Kong has been widely used in the last 25 years, in part due to its simplicity as a practical rule of thumb. However, we show in this paper that the assumptions made in the derivation of this approximation make it difficult to consider it as a reasonable approximation of the ESS. Note that this re-evaluation does not cover the use of the ESS for Markov chain Monte Carlo algorithms, although there would also be much to tell about it..!
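For readers unfamiliar with the rule of thumb under discussion, here is a minimal R sketch of the usual approximation, namely the inverse of the sum of the squared normalised importance weights. The target (standard Normal), the proposal (Student t with 3 degrees of freedom) and the sample size are arbitrary choices made for illustration only; they are not taken from the paper.

```r
set.seed(42)
N <- 1e4
x <- rt(N, df = 3)                      # draws from the (assumed) t proposal
# log importance weights: (assumed) N(0,1) target over t(3) proposal
logw <- dnorm(x, log = TRUE) - dt(x, df = 3, log = TRUE)
w <- exp(logw - max(logw))              # stabilise before exponentiating
wbar <- w / sum(w)                      # normalised weights
ess <- 1 / sum(wbar^2)                  # rule-of-thumb ESS approximation
ess
```

By construction this quantity always lies between 1 and N, whatever the integrand of interest, which is one reason to be wary of reading it as a genuine effective sample size.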

## null recurrent = zero utility?

Posted in Books, R, Statistics on April 28, 2022 by xi'an

The stability result that the ratio

$\dfrac{\sum^T_{t=1} f(\theta^{(t)})}{\sum^T_{t=1} g(\theta^{(t)})}\qquad(1)$

converges, which holds for a Harris π-null-recurrent Markov chain for all functions f,g in L¹(π) [Meyn & Tweedie, 1993, Theorem 17.3.2], is rather fascinating. However, it is unclear whether it can be useful in simulation environments, as for the integral priors we have been studying over the years with Juan Antonio Cano and Diego Salmeron Martinez. Above, the result of an experiment where I simulated a Markov chain as a Normal random walk in dimension one, hence a Harris π-null-recurrent Markov chain for the Lebesgue measure λ, and monitored the stabilisation of the ratio (1), when using two densities for f and g, to its expected value (1, shown by a red horizontal line). There is quite some variability in the outcome (repeated 100 times), but the most intriguing feature is the quick stabilisation of most cumulated averages to values different from 1. Even longer runs display this feature,

which I would blame on the excursions of the random walk far away from the central regions for both f and g, that is, on long sequences where zeroes keep being added to both numerator and denominator in (1). As far as integral approximation is concerned, this is not very helpful!
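The experiment described above can be sketched along the following lines. The post does not say which two densities were used for f and g, so the N(0,1) and N(0,4) densities below are assumptions, as are the chain length and the number of replications actually run.

```r
set.seed(1)
T <- 1e5                                # length of one run (assumed)
theta <- cumsum(rnorm(T))               # Normal random walk in dimension one
f <- dnorm(theta)                       # assumed density for f: N(0,1)
g <- dnorm(theta, sd = 2)               # assumed density for g: N(0,4)
ratio <- cumsum(f) / cumsum(g)          # running version of the ratio (1)
# long excursions away from zero add (numerical) zeroes to both sums
mean(f < 1e-10)
ratio[c(1e2, 1e3, 1e4, 1e5)]
```

Since both f and g integrate to one against the Lebesgue measure, the limiting value of the ratio is 1; wrapping the above in replicate() and plotting the curves gives a picture of how slowly, if at all, individual runs get there.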

## extinction minus one

Posted in Books, Kids, pictures, R, Statistics, University life on March 14, 2022 by xi'an

The riddle from The Riddler of 19 Feb. is about the Bernoulli Galton-Watson process, where each individual in the population has one or zero descendant with equal probabilities: Starting with a large population of size N, what is the probability that the size of the population on the brink of extinction is equal to one? While it is easy to show that the probability the n-th generation is extinct is

$\mathbb{P}(S_n=0) = \left(1 - \frac{1}{2^{n}}\right)^N$

I could not find a way to express the probability to hit one and resorted to brute force simulation, easily coded

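    T <- 1e8; N <- 1e4; F <- 0   # replications, initial size, counter (T and F mask TRUE/FALSE)
    for (t in 1:T) {
      Z <- N
      while (Z > 1) Z <- rbinom(1, Z, .5)   # each individual has 0 or 1 descendant
      F <- F + Z   # Z is 1 if the last non-empty generation was of size one, 0 otherwise
    }
    F / T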


which produces an approximate probability of 0.7213 or 0.714. The impact of N is quickly vanishing, as expected when the probability to reach 1 in one generation is negligible…

However, when returning to Dauphine after a two-week absence, I presented the problem to my probabilist neighbour François Simenhaus, who immediately pointed out that this probability was more simply seen as the probability that the maximum of N independent geometric rv’s was achieved by a single one among the N. Searching later for a reference for that probability, I came across the 1990 paper of Bruss and O’Cinneide, which shows that the probability of uniqueness of the maximum does not converge as N goes to infinity, but rather fluctuates around 0.72135 with logarithmic periodicity. It is only when N=2^n that the sequence converges to 0.721521… This probability can actually be written in closed form as

$N\sum_{i=1}^\infty 2^{-i-1}(1-2^{-i})^{N-1}$

(which is obvious in retrospect, albeit containing a typo in the original paper, which is missing a ½ factor in equation (17)), while its asymptotic behaviour is far from obvious, as noted by the authors.
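As a quick check of the closed form above, here is a short R sketch that evaluates the (truncated) series and compares it with a direct simulation of the unique-maximum event. The function name unique_max_prob, the truncation point imax and the simulation sizes are mine and only for illustration; rgeom(N, 0.5) is used for the lifetimes, as it returns the number of generations survived with probability 2^(-i-1) for i = 0, 1, …

```r
# closed-form probability that the maximum of N geometric lifetimes is unique
unique_max_prob <- function(N, imax = 200) {
  i <- 1:imax                           # truncated series, later terms negligible
  N * sum(2^(-i - 1) * (1 - 2^(-i))^(N - 1))
}
sapply(c(1e2, 1e3, 1e4, 2^10, 2^14), unique_max_prob)

# Monte Carlo check: is the largest lifetime achieved by a single individual?
set.seed(2022)
N <- 1e3
hits <- replicate(1e4, {
  g <- rgeom(N, 0.5)                    # lifetimes of the N initial lineages
  sum(g == max(g)) == 1
})
mean(hits)
```

The simulated frequency should agree with unique_max_prob(N) up to Monte Carlo error, at a far smaller cost than simulating the whole population generation by generation.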

On the historical side, and in accordance with Stigler’s law, the Galton-Watson process should have been called the Bienaymé process! (Bienaymé was a student of Laplace, who successively lost positions for his political ideas, before eventually joining the Académie des Sciences, and later founding the Société Mathématique de France.)

## learning base R [book review]

Posted in Books, Kids, Statistics, University life on February 26, 2022 by xi'an

This second edition of an introductory R book was sent to me by the author for a potential CHANCE book review.  As there are many (many) books in the same spirit, the main question behind my reading it (in one go) was on the novelty it brings. The topics Learning Base R covers are

• arithmetics with R
• data structures
• built-in and user-written R functions
• R utilities
• more data structures
• comparison and coercion
• lists and data frames
• resident R datasets
• R interface
• probability calculations in R
• R graphics
• R programming
• simulations
• statistical inference in R
• linear algebra
• use of R packages

within as many short chapters. The style is rather standard, that is, short paragraphs with mostly raw reproductions of line commands and their outcome. Sometimes a whole page of code examples (if with comments). All in all I feel there are rather too few tables when compared with examples, at least for my own taste. The exercises are mostly short and, while they vary in depth, they show that the book is rather intended for students with some mathematical background (e.g., with a chapter on complex numbers and another one on linear algebra that do not seem immediately relevant for most intended readers). Or more than that, when considering one (of several) exercises (19.30) on the Black-Scholes process that mentions Brownian motion. Possibly less appealing for would-be statisticians.

I also wonder at the pedagogical choice of not including and involving more clearly graphical interfaces like RStudio, as students are usually not big fans of “old-style” [their wording not mine!] line command languages. For instance, the chapter on packages would have benefited from this perspective. Nothing on Rmarkdown either. Apparently nothing on handling big data, more advanced database manipulation, the related realistic dangers of memory freeze and compulsory reboot, the intricacies of managing different directories and earlier sessions, little on the urgency of avoiding loops (p.233) by vectorial programming, paradoxically an if function being introduced after ifelse, and again not that much on statistics (with density only occurring in exercises). The chapter on customising R graphics may possibly scare the intended reader when considering the all-in-one example of p.193! As we advance through the book, the more advanced examples often are fairly standard programming ones (found in other language manuals) like creating Fibonacci numbers, implementing Eratosthenes' sieve, playing the Tower of Hanoi game… (At least they remind me of examples read in the language manuals I read as a student.) The simulation chapter could have gone into the one (Chap. 19) on probability calculations, rather than superfluously redefining standard distributions. (Except when defining a random number as a uniformly random number (p.162).) This chapter also spends an unusual amount of space on linear congruential pseudo-random generators, while failing to point out the trivia that the randu dataset mentioned twice earlier is actually an outcome from the infamous RANDU Fortran generator. The following section in that chapter is written in such a way that it may give the wrong impression that one can find the analytic solution from repeated Monte Carlo experiments and hence the error. Which is rarely the case, even in finite environments with rational expectations, as one usually does not know which unit fraction the expectation should be a multiple of. (Remember the Squid Games paradox!) And no mention is made of the prescription of always returning an error estimate along with the numerical approximation. The statistics chapter is obviously more developed, with descriptive statistics, ecdf, but no bootstrap, a t.test curiously applied to the Michelson measurements of the speed of light (how could it be zero?!), ANOVA, regression handled via lm and glm, time series analysis by ARIMA models, which I hope will not be the sole exposure of readers to these concepts.
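To illustrate the prescription of returning an error estimate along with the approximation, here is a minimal R sketch. The quantity being approximated, the probability that a squared standard Normal variate exceeds one, is a toy target of my own choosing and not an example from the book.

```r
set.seed(101)
n <- 1e5
x <- rnorm(n)
h <- x^2 > 1                            # toy integrand (assumed): indicator of X^2 > 1
est <- mean(h)                          # Monte Carlo approximation
se <- sd(h) / sqrt(n)                   # its estimated standard error
c(estimate = est, std.error = se,
  lower = est - 1.96 * se, upper = est + 1.96 * se)
```

Here the exact value 2*pnorm(-1) is available for comparison, which is precisely what is missing in realistic settings, and why the interval should be reported rather than an attempted guess at the analytic solution.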

In conclusion, there is nothing critically wrong with this manual introducing R to newcomers and I would not mind having my undergraduate students reading it (rather than our shorter and home-made handout, polished along the years) before my first mathematical statistics lab. However I do not find it massively innovative in its presentation or choice of concept, even though the most advanced examples are not necessarily standard, and may not appeal to all categories of students.

[Disclaimer about potential self-plagiarism: this post or an edited version will eventually appear in my Book Review section in CHANCE.]

## 1 / duh?!

Posted in Books, R, Statistics, University life on September 28, 2021 by xi'an

An interesting case on X validated of someone puzzled by the simulation (and variance) of the random variable 1/X when being able to simulate X. And being surprised at the variance of the ratio being way larger than the variances of both numerator and denominator.
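The phenomenon is easy to reproduce in R. The sketch below takes X uniform on (0,1), which is my own choice and not necessarily the distribution in the original question; in that case 1/X has no finite mean, let alone a finite variance, so the empirical variance is both huge and unstable across runs.

```r
set.seed(7)
n <- 1e5
x <- runif(n)                           # assumed distribution for X: U(0,1)
y <- 1 / x                              # simulating 1/X is immediate
c(var_x = var(x), var_inv = var(y))     # the second entry dwarfs the first
# the empirical variance of 1/X is driven by the few smallest values of x
max(y)^2 / n
```

Rerunning with other seeds changes var(y) by orders of magnitude, while var(x) stays close to 1/12.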