Archive for Images des Mathématiques

precision in MCMC

Posted in Books, R, Statistics, University life on January 14, 2016 by xi'an

[Figures: presisio21, presisio22]

While browsing Images des Mathématiques, I came across this article [in French] that studies the impact of round-off errors in number representations on a dynamical system, and I checked how much this was the case for MCMC algorithms like the slice sampler (recycling some R code from Monte Carlo Statistical Methods), by simply adding a few signif(…,dig=n) calls in the original R code and letting the precision n vary.
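For readers who want to replay this kind of experiment, here is a minimal sketch of a slice sampler with rounding inserted at each step. It is not the original code from the book: the target (a normal with mean -3 restricted to [0,1]), the function name rounded_slice, and the starting value are all assumptions made for illustration.

```r
# Slice sampler for a N(mu, 1) density truncated to [0, 1], where every
# simulated quantity is rounded to `digits` significant digits.
# mu = -3, the [0, 1] support and the name rounded_slice are illustrative
# assumptions, not the code used for the pictures in the post.
rounded_slice <- function(niter, digits, mu = -3, x0 = 0.5) {
  x <- numeric(niter)
  cur <- x0
  for (t in seq_len(niter)) {
    # vertical move: u | x ~ U(0, f(x)) with f(x) = exp(-(x - mu)^2 / 2)
    u <- signif(runif(1) * exp(-(cur - mu)^2 / 2), digits = digits)
    # horizontal move: x | u ~ U(0, upper), the slice {x in [0, 1] : f(x) >= u};
    # max(0, .) guards against a rounded u slightly above f(0)
    upper <- max(0, min(1, mu + sqrt(-2 * log(u))))
    cur <- signif(runif(1) * upper, digits = digits)
    x[t] <- cur
  }
  x
}

# compare a coarse and a fine precision on the same number of iterations
set.seed(1)
coarse <- rounded_slice(2000, digits = 2)
fine   <- rounded_slice(2000, digits = 9)
c(mean(coarse), mean(fine))
```

Plotting hist(coarse) next to hist(fine) is one way to see whether the rounding leaves a visible trace on the sample.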

[Figures: presisio31, presisio32]

“…if one simulates trajectories over very long time intervals, too long relative to the chosen numerical precision, then very often the results of the simulations will be completely different from what actually happens…” Pierre-Antoine Guihéneuf

Rather unsurprisingly (!), using a small enough precision (like two digits on the first row) has a visible impact on the simulation of a truncated normal. Moving to three digits seems to be sufficient in this example… One thing this tiny experiment reminds me of is the lumpability property of Kemeny and Snell, a restriction on Markov chains for aggregated (or discretised) versions to be ergodic or even Markov. Also, in 2000, Laird Breyer, Gareth Roberts and Jeff Rosenthal wrote a Statistics and Probability Letters paper on the impact of round-off errors on geometric ergodicity. However, I presume [maybe foolishly!] that the result stated in the original paper, namely that there exist infinitely many precision levels (numbers of digits) for which the dynamical system degenerates into a small region of the space, does not hold for MCMC. Maybe foolishly so, because the above statement means that running a dynamical system for “too” long given the chosen precision kills the intended stationary properties of the system, which I interpret as getting non-ergodic behaviour when exceeding the period of the uniform generator. More or less.

[Figures: presisio91, presisio92]

Statistique dans Le Monde

Posted in University life on November 5, 2012 by xi'an

Again, some relevant entries in the weekend edition of Le Monde: a paper on Nate Silver and his FiveThirtyEight blog, with a short description of his statistical approach, namely to pool all existing polls in a sort of meta-analysis, without going as far as mentioning LOESS or nearest neighbour regression techniques. [Even less Bayesian!] For this, the FAQ of FiveThirtyEight is much more explicit:

Firstly, we assign each poll a weighting based on that pollster’s historical track record, the poll’s sample size, and the recentness of the poll. More reliable polls are weighted more heavily in our averages.

Secondly, we include a regression estimate based on the demographics in each state among our ‘polls’, which helps to account for outlier polls and to keep the polling in its proper context.

Thirdly, we use an inferential process to compute a rolling trendline that allows us to adjust results in states that have not been polled recently and make them ‘current’.

Fourthly, we simulate the election 10,000 times for each site update in order to provide a probabilistic assessment of electoral outcomes based on a historical analysis of polling data since 1952. The simulation further accounts for the fact that similar states are likely to move together, e.g. future polling movement in states like Michigan and Ohio, or North and South Carolina, is likely to be in the same direction
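To make the first and last steps of this recipe concrete, here is a toy R sketch of a weighted poll average followed by a simulation with a shared shock across states. Everything in it is an assumption for illustration: the poll figures are made up, and the weighting formula, the standard deviations, and the function name simulate_wins are hypothetical rather than taken from FiveThirtyEight.

```r
# Toy illustration only: the polls, the weighting formula and the
# standard deviations are all made up, not FiveThirtyEight's model.
polls <- data.frame(
  state    = c("A", "A", "B", "B"),
  margin   = c(2.0, -1.0, 4.0, 3.0),   # candidate lead, in points
  n        = c(800, 1200, 600, 1000),  # sample sizes
  days_old = c(2, 10, 1, 7)            # age of each poll
)

# hypothetical weight: larger and more recent polls count more
polls$w <- sqrt(polls$n) * exp(-polls$days_old / 14)
avg <- sapply(split(polls, polls$state),
              function(d) weighted.mean(d$margin, d$w))

simulate_wins <- function(avg, nsim = 10000, sd_national = 2, sd_state = 3) {
  # a shared national shock makes the states move together
  national <- rnorm(nsim, 0, sd_national)
  sims <- sapply(avg, function(m) m + national + rnorm(nsim, 0, sd_state))
  colMeans(sims > 0)  # share of simulated elections won in each state
}

set.seed(538)
simulate_wins(avg)
```

The shared national term is the simplest way to induce the kind of correlation between similar states that the FAQ alludes to; the regression and trendline steps are left out of this sketch.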

The second paper is a tribune written by Marc Lavielle, senior researcher at INRIA Saclay, on the (French) debate surrounding the recent publication of a study by Séralini et al. on the toxicity of the genetically modified NK603 (Monsanto) corn. Part of the controversy stems from the fact that this paper was distributed to the media prior to its publication with a confidentiality contract that prevented the media from consulting other experts (but not from publishing nonsensical definitive headlines). Another part of the controversy comes from the publication by six of the French Académies (namely, Science, Agriculture, Medicine, Pharmacy, Technologies, and Veterinary) of a statement concluding that the Food and Chemical Toxicology paper by Séralini et al. lacks reliability. This was followed by another tribune, written by Paul Deheuvels, professor of statistics at Université Pierre et Marie Curie and member of the Académie des Sciences, in which he disagrees with the opinion expressed in this statement and legitimately complains about not being consulted despite being the sole statistician member of the Academy of Sciences. (This debate was also reported in the recent October recap of CNRS Images des Mathématiques.)