**A**s pointed out by Peter Coles on his blog, In the Dark, Hyungsuk Tak, Sujit Ghosh, and Justin Ellis just arXived a review of the unsafe use of improper priors in astronomy papers, 24 out of 75 having failed to establish that the corresponding posteriors are well-defined. And they exhibit such an instance (of impropriety) in a MNRAS paper by Pihajoki (2017), which is a complexification of Gelfand et al. (1990), also used by Jim Hobert in his thesis. (Even though the formal argument used to show the impropriety of the posterior in Pihajoki’s paper does not sound right since it considers divergence at a single value of a parameter β.) Besides repeating this warning about an issue that was rather quickly identified in the infancy of MCMC, if not in the very first publications on the Gibbs sampler, the paper seems to argue against using improper priors due to this potential danger, stating that instead proper priors that include all likely values and beyond are to be preferred. Which reminds me of the BUGS feature of using a N(0,10⁹) prior instead of the flat prior, missing the fact that “very large” variances do impact the resulting inference (if only for the issue of model comparison, remember Lindley-Jeffreys!). And are informative in that sense. However, it is obviously a good idea to advise checking for propriety (!) and using such alternatives may come as a safety button, providing a comparison benchmark to spot possible divergences in the resulting inference.
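To make the Lindley-Jeffreys point concrete, here is a hedged toy illustration in Python (all numbers invented): for a point null H0: μ=0 against H1: μ~N(0,τ²) with a Normal sample mean, the Bayes factor in favour of the null grows without bound as the "vague" prior variance τ² grows, whatever the data.

```python
from math import exp, pi, sqrt

def norm_pdf(x, var):
    """Density of a N(0, var) distribution evaluated at x."""
    return exp(-x * x / (2 * var)) / sqrt(2 * pi * var)

def bayes_factor_01(xbar, s2, tau2):
    """B01 for H0: mu = 0 against H1: mu ~ N(0, tau2), when the sample
    mean xbar is N(mu, s2): under H1, marginally, xbar ~ N(0, s2 + tau2)."""
    return norm_pdf(xbar, s2) / norm_pdf(xbar, s2 + tau2)

# Invented data: mean of n = 25 unit-variance observations, two s.e.'s from 0.
n, xbar = 25, 0.4
for tau2 in (1.0, 1e3, 1e9):
    print(f"tau^2 = {tau2:9.0e}   B01 = {bayes_factor_01(xbar, 1.0 / n, tau2):.3g}")
```

The ratio inflates like √(τ²/s²) as τ² grows, so a N(0,10⁹) prior is anything but noninformative for model comparison, even though the data sit two standard errors away from the null.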

## Archive for the pictures Category

## improperties on an astronomical scale

Posted in Books, pictures, Statistics with tags astronomy, astrostatistics, Bayesian inference, BUGS, improper posteriors, impropriety, noninformative priors, vague priors on December 15, 2017 by xi'an

## U of T sunset [jatp]

Posted in pictures, Running, Travel, University life with tags Austin, jatp, sunset, Texas, The University of Texas at Austin, USA on December 14, 2017 by xi'an

## red Capitol [jatp]

Posted in pictures, Running, Travel with tags Austin, jatp, road running, sunrise, Texas, Texas Capitol on December 12, 2017 by xi'an

## AlphaGo [100 to] zero

Posted in Books, pictures, Statistics, Travel with tags DeepMind, Go, Nature, University of Warwick on December 12, 2017 by xi'an

**W**hile in Warwick last week, I read a few times through the Nature article on AlphaGo Zero, the new DeepMind program that learned to play Go by itself, through self-learning, within a few clock days, and achieved massive superiority (100 to 0) over the earlier version of the program, which (who?!) was based on a massive database of human games. (A Nature paper I also read while in Warwick!) From my remote perspective, the neural network associated with AlphaGo Zero seems more straightforward than the double network of the earlier version. It is solely based on the board state and returns a probability vector *p* for all possible moves, as well as the probability *v* of winning from the current position. There are still intermediary probabilities *π* produced by a Monte Carlo tree search, which drive the computation of a final board, the (reinforced) learning aiming at bringing *π* and *p* as close as possible, via a loss function like

(z−v)² − ⟨*π*, log *p*⟩ + c|*θ*|²

where z is the game winner and *θ* is the vector of parameters of the neural network. (Details obviously missing above!) The achievements of this new version are even more impressive than those of the earlier one (which managed to systematically beat top Go players) in that blind exploration of game moves repeated over some five million games produced a much better AI player. With a strategy at times remaining a mystery to Go players.
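As a hedged illustration only (board dimensions, the regularisation constant, and all random values below are invented), the loss can be evaluated on toy vectors:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins, all invented: 362 legal moves on a 19x19 board (+ pass),
# MCTS visit probabilities pi, network move probabilities p, value v.
n_moves = 362
pi = rng.dirichlet(np.ones(n_moves))   # search-derived target distribution
p = rng.dirichlet(np.ones(n_moves))    # network's predicted move distribution
v, z = 0.3, 1.0                        # predicted value, actual game outcome
theta = rng.normal(size=1000)          # flattened network parameters
c = 1e-4                               # L2 regularisation weight

# Loss from the post: (z - v)^2 - <pi, log p> + c * ||theta||^2
loss = (z - v) ** 2 - pi @ np.log(p) + c * theta @ theta
print(f"toy loss: {loss:.3f}")
```

The middle term is the cross-entropy between the search probabilities *π* and the network probabilities *p*, minimised when the two distributions agree, which is the sense in which learning brings them "as close as possible".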

Incidentally, a two-page paper appeared on arXiv today with the title *Demystifying AlphaGo Zero*, by Don, Wu, and Zhou. Which sets AlphaGo Zero as a special generative adversarial network. And invokes the Wasserstein distance as solving the convergence of the network. To conclude that "it's not [sic] surprising that AlphaGo Zero show [sic] a good convergence property"… A most perplexing inclusion in arXiv, I would say.

## sunrise over Colorado [jatp]

Posted in pictures, Running, Travel, University life with tags Austin, Colorado, Colorado river, jatp, road running, sunrise, Texas on December 11, 2017 by xi'an

## delayed acceptance ABC-SMC

Posted in pictures, Statistics, Travel with tags ABC-MCMC, ABC-SMC, Biometrika, delayed acceptance, lazy ABC, sequential Monte Carlo, SMC-ABC, stratified sampling on December 11, 2017 by xi'an

**L**ast summer, during my vacation on Skye, Richard Everitt and Paulina Rowińska arXived a paper on delayed acceptance associated with ABC, an arXival I missed at the time! The goal is to decrease the number of simulations from the likelihood. As in our own delayed acceptance paper (without ABC), a cheap alternative generator is used to first reject the least likely parameter values, before possibly continuing with the full generator, as in lazy ABC. The first step of this ABC algorithm requires a cheap generator plus a primary tolerance ε¹ to compare the generation with the data, or part of it. This may be followed by a second generation with a second tolerance level ε². The paper more specifically applies this to the ABC-SMC scheme introduced in Sisson, Fan and Tanaka (2007) and reassessed in our subsequent 2009 Biometrika paper with Mark Beaumont, Jean-Marie Cornuet and Jean-Michel Marin, as well as in the ABC-SMC paper by Pierre Del Moral and Arnaud Doucet.
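A minimal sketch of the two-stage idea, not of the paper's actual algorithm: the toy model, tolerances, and simulators below are all invented, and only the cheap-screen-then-full-generator structure is taken from the description above.

```python
import random

def da_abc_sample(prior, cheap_sim, full_sim, data, eps1, eps2, dist):
    """One delayed-acceptance ABC draw: a cheap surrogate simulator
    screens candidate parameters (tolerance eps1) before the expensive
    simulator is ever run (tolerance eps2)."""
    while True:
        theta = prior()
        # Stage 1: early rejection based on the cheap generator.
        if dist(cheap_sim(theta), data) > eps1:
            continue
        # Stage 2: full, expensive generator with the tighter tolerance.
        if dist(full_sim(theta), data) <= eps2:
            return theta

# Hypothetical toy model: the datum is a sample mean of N(theta, 1) draws;
# the cheap simulator uses 5 draws, the full one 100.
random.seed(42)
data = 0.5
mean_sim = lambda theta, n: sum(random.gauss(theta, 1) for _ in range(n)) / n
draw = da_abc_sample(
    prior=lambda: random.uniform(-5, 5),
    cheap_sim=lambda th: mean_sim(th, 5),
    full_sim=lambda th: mean_sim(th, 100),
    data=data, eps1=1.0, eps2=0.2,
    dist=lambda a, b: abs(a - b),
)
print(f"accepted theta: {draw:.3f}")
```

The saving comes from stage 1 discarding most prior draws at the cost of 5 simulations rather than 100; the acceptance decision itself is still made by the full generator.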

When looking at the version of the algorithm [Algorithm 2] based on two basic acceptance ABC steps, there are two features I find intriguing: (i) the primary step uses a cheap generator to reject poor values of the parameter early, followed by a second step involving a more expensive and exact generator, but I see no impact of the choice of this cheap generator on the acceptance probability; (ii) this is an SMC algorithm with resampling imposed at each iteration, but there is no visible step for creating new weights after the resampling step. In the current presentation, it sounds like the weights do not change from the initial step, except for those turning to zero and the ensuing renormalisation. Which makes the (unspecified) stratification of little interest, if any. I must therefore be missing a point in the implementation!

One puzzling sentence in the appendix is that the resampling algorithm used in the SMC step “ensures that every particle that is alive before resampling is represented in the resampled particles”, which reminds me of an argument [possibly a different one] made already in Sisson, Fan and Tanaka (2007) and that we could not validate in our subsequent paper. For resampling to be correct, a form of multinomial sampling must be implemented, even via variance reduction schemes like stratified or systematic sampling.
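For the record, here is a minimal Python sketch (with invented weights) of systematic resampling, one of the variance-reduction schemes just mentioned. Note that even this valid scheme can assign zero copies to a low-weight particle, so "every particle alive before resampling is represented" is not a property of correct resampling.

```python
import random
from itertools import accumulate

def systematic_resample(weights, rng=random):
    """Systematic resampling: a single uniform offset u and N evenly spaced
    pointers u + k/N through the cumulative weights; unbiased (index i is
    drawn N*w_i times on average, its count always floor(N*w_i) or
    ceil(N*w_i)) with lower variance than plain multinomial resampling."""
    n = len(weights)
    total = sum(weights)
    cum = list(accumulate(w / total for w in weights))
    u = rng.random() / n
    indices, j = [], 0
    for k in range(n):
        target = u + k / n
        while j < n - 1 and cum[j] < target:
            j += 1
        indices.append(j)
    return indices

random.seed(1)
# Invented weights: a particle with weight 0.05 may well receive zero copies.
print(systematic_resample([0.05, 0.05, 0.6, 0.3]))
```

Since each count is forced to the floor or ceiling of N times the normalised weight, a particle with weight below 1/N survives only with probability proportional to that weight, confirming the point above that correct (multinomial-type) resampling cannot guarantee the survival of every live particle.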