## an inverse permutation test

Posted in Books, Kids, R, Statistics on September 23, 2016 by xi'an

A straightforward but probabilistic riddle this week in the Riddler, which is to find the expected order of integer i when the sequence {1,2,…,n} is partitioned at random into two sets, A and B, each of which is then sorted before both sets are merged. For instance, if {1,2,3,4} is divided into A={1,4} and B={2,3}, the order of 2 in {1,4,2,3} is 3. An R rendering of the experiment is

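    simorder=function(i,n){ # header lost in the post; the name simorder is an assumption
    m=rbinom(1,n,.5) # size of A
    if (m*(n-m)>0){
    fist=sort(sample(1:n,m)) # sorted set A
    return(order(c(fist,sort((1:n)[-fist])))[i]) # position of i in c(A,B)
    }else{
    return(i)}} # A or B empty: the sequence is unchanged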

It is rather easy to find that the probability that the order of i takes the value j is

${i-1 \choose j-1}(1/2)^i$

if j<i (i is in A) and

${n-i \choose n-j}(1/2)^{n-i+1}$

if j>i (i is in B), the case i=j being the addition of both cases. The mean can however be found (almost) immediately by considering that, when i is in A, its average value is (i+1)/2 and, when it is in B, its average value is (n+i)/2 [by symmetry]. Hence a global mean of (2i+n+1)/4….
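A quick Monte Carlo check of both the distribution and the mean, as a sketch that draws the partition in a different but equivalent way: each integer joins A independently with probability 1/2. The function name position_of and the values n=10, i=3 are arbitrary choices of mine.

```r
# Position of integer i once 1:n is split at random into A and B,
# each set being sorted and the result being the concatenation c(A,B)
position_of <- function(i, n) {
  inA <- runif(n) < 0.5                  # each integer joins A with probability 1/2
  merged <- c(which(inA), which(!inA))   # which() returns indices in increasing order
  match(i, merged)                       # position of i in the merged sequence
}

set.seed(101)
n <- 10; i <- 3
sims <- replicate(1e5, position_of(i, n))

# simulated mean against (2i+n+1)/4
c(simulated = mean(sims), theory = (2 * i + n + 1) / 4)
# simulated frequency of j=2 against choose(i-1,j-1)/2^i
c(simulated = mean(sims == 2), theory = choose(i - 1, 1) / 2^i)
```

Both pairs of numbers should agree up to Monte Carlo error.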

## astroABC: ABC SMC sampler for cosmological parameter estimation

Posted in Books, R, Statistics, University life on September 6, 2016 by xi'an

“…the chosen statistic needs to be a so-called sufficient statistic in that any information about the parameter of interest which is contained in the data, is also contained in the summary statistic.”

Elise Jennings and Maeve Madigan arXived a paper on a new Python code they developed for implementing ABC-SMC, aimed at astronomy or rather cosmology applications. They stress the parallelisation abilities of their approach, which lead to “crucial speed enhancement” against the available competitors, abcpmc and cosmoabc. The version of ABC implemented there is “our” ABC PMC where particle clouds are shifted according to mixtures of random walks, based on each and every point of the current cloud, with a scale equal to twice the estimated posterior variance. (The paper curiously refers to non-astronomy papers through their arXiv version, even when they have been published. Like our 2008 Biometrika paper.) A large part of the paper is dedicated to computing aspects that escape me, like the constant reference to MPIs. The algorithm is partly automated, except for the choice of the summary statistics and of the distance. The tolerance is chosen as a (large) quantile of the previous set of simulated distances. Getting comments from the designers of abcpmc and cosmoabc would be great.
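To make the description concrete, here is a minimal R sketch of an ABC PMC loop of this kind on a toy Normal mean problem. It is not the astroABC code or its API: the function abc_pmc, the toy model, the Normal prior, the sample mean as summary statistic, the absolute difference as distance, and the 75% quantile are all assumptions made for illustration.

```r
# Toy ABC PMC: y_1..y_nobs ~ N(theta,1), prior theta ~ N(0,prior_sd^2),
# summary = sample mean, distance = absolute difference (all illustrative choices)
abc_pmc <- function(sobs, N = 300, Tmax = 8, q = 0.75, nobs = 20, prior_sd = 10) {
  simsum <- function(theta) mean(rnorm(nobs, mean = theta))
  # first cloud: plain prior simulation, no rejection
  theta <- rnorm(N, sd = prior_sd)
  dists <- abs(vapply(theta, simsum, numeric(1)) - sobs)
  w <- rep(1 / N, N)
  eps <- c(Inf, numeric(Tmax - 1))
  for (t in 2:Tmax) {
    # tolerance = quantile of the previous set of simulated distances
    eps[t] <- unname(quantile(dists, q))
    # random walk scale: twice the weighted variance of the current cloud
    mu <- sum(w * theta)
    tau <- sqrt(2 * sum(w * (theta - mu)^2))
    newtheta <- newdist <- numeric(N)
    for (k in 1:N) {
      repeat {
        # pick a particle from the current cloud and perturb it
        prop <- rnorm(1, mean = sample(theta, 1, prob = w), sd = tau)
        d <- abs(simsum(prop) - sobs)
        if (d <= eps[t]) break
      }
      newtheta[k] <- prop
      newdist[k] <- d
    }
    # importance weights: prior over the mixture of random walk kernels
    mix <- vapply(newtheta,
                  function(th) sum(w * dnorm(th, mean = theta, sd = tau)),
                  numeric(1))
    neww <- dnorm(newtheta, sd = prior_sd) / mix
    theta <- newtheta
    dists <- newdist
    w <- neww / sum(neww)
  }
  list(theta = theta, w = w, eps = eps)
}

set.seed(2016)
yobs <- rnorm(20, mean = 2)   # pseudo-observed data
sobs <- mean(yobs)
fit <- abc_pmc(sobs)
fit$eps                       # decreasing sequence of tolerances
sum(fit$w * fit$theta)        # weighted ABC estimate of theta
```

The inner loop is the part that parallelises most naturally, since each particle of the new cloud is produced independently of the others.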

“It is clear that the simple Gaussian Likelihood assumption in this case, which neglects the effects of systematics yields biased cosmological constraints.”

The last part of the paper compares ABC and MCMC on a simulated supernova dataset. Which is a somewhat dubious comparison since the model used for producing the data and running ABC is not the same as the Gaussian version used with MCMC. Unsurprisingly, MCMC then misses the true value of the cosmological parameters and most likely and more importantly the true posterior HPD region. While ABC SMC (or PMC) proceeds to a concentration around the genuine parameter values. (There is no additional demonstration of how accelerated the approach is.)

## Matlab goes deep [learning]

Posted in Books, pictures, R, Statistics, University life on September 5, 2016 by xi'an

A most interesting link I got when reading Le Monde, about MatLab proposing deep learning tools…

## conditional sampling

Posted in R, Statistics on September 5, 2016 by xi'an

An interesting question about stratified sampling came up on X validated last week, namely how to optimise a Monte Carlo estimate based on two subsequent simulations, one, X, from a marginal and one or several Y from the corresponding conditional given X, when the costs of producing those two simulations significantly differ. When looking at the variance of the Monte Carlo estimate, this variance can be minimised in the number of simulations from the marginal under a computing budget. However, when the costs of producing x, y given x, or (x,y) are about the same, it does not pay to replicate simulations from y given x or x given y, because this increases the randomness of the estimator and induces positive correlation between some terms in the sum. Assuming that the conditional variances are always computable, we could envision an extension to the question where for each new value of a simulated x (or y), a variable number of conditionally simulated y (or x) could be produced. Or even when those conditional variances are unknown but can be replaced with empirical versions.
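The reasoning can be spelled out under a notation of my own (not from the original question): n values x_i simulated from the marginal, m values y_ij simulated from the conditional for each x_i, all conditionally independent, and h the function being integrated. The variance decomposition then gives

$\text{var}\Big(\frac{1}{nm}\sum_{i=1}^n\sum_{j=1}^m h(x_i,y_{ij})\Big)=\frac{\text{var}(\mathbb{E}[h(X,Y)|X])}{n}+\frac{\mathbb{E}[\text{var}(h(X,Y)|X)]}{nm}$

When every term costs about the same, the total T=nm is what is fixed and the variance writes as

$\frac{m\,\text{var}(\mathbb{E}[h(X,Y)|X])+\mathbb{E}[\text{var}(h(X,Y)|X)]}{T}$

which increases with m, hence m=1, i.e., iid pairs. If instead each x costs c_x and each y costs c_y, with a budget n(c_x+m c_y) held fixed, the standard two-stage sampling computation leads to an optimal number of replications

$m^\star=\sqrt{\frac{c_x\,\mathbb{E}[\text{var}(h(X,Y)|X)]}{c_y\,\text{var}(\mathbb{E}[h(X,Y)|X])}}$

to be rounded to an integer no smaller than one.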

The illustration comes from a bivariate normal model with correlation -0.9 [as set in the code below], for a rather arbitrary function, but the pattern remains the same, namely that iid simulations of the pair (X,Y) invariably lead to a smaller variance of the estimator compared with simulations with a 1:10 (10 Y’s for one X) or 10:1 ratio between x’s and y’s. Depending on the function and the relative variances, the 1:10 and 10:1 schemes may have a similar variability.

```
zigma=c(9,1,-.9*sqrt(1*9)) # var(X), var(Y), cov(X,Y)

# simulate Y given X=x
geney=function(x,n=1){
return(rnorm(n,mean=zigma[3]*x/zigma[1],sd=sqrt(zigma[2]-
zigma[3]^2/zigma[1])))}
# simulate X given Y=y
genex=function(y,n=1){
return(rnorm(n,mean=zigma[3]*y/zigma[2],sd=sqrt(zigma[1]-
zigma[3]*zigma[3]/zigma[2])))}
targ=function(x,y){ log(x^2*y^4)+x^2*cos(x^2)/y*sin(y^2)}

T=1e4;N=1e3
vales=matrix(0,N,3) # one column per sampling scheme
for (i in 1:N){
# iid pairs (X,Y)
xx=rnorm(T,sd=sqrt(zigma[1]))
vales[i,1]=mean(targ(xx,geney(xx,n=T)))
# 10 Y's for each X
xx=rep(rnorm(T/10,sd=sqrt(zigma[1])),10)
vales[i,2]=mean(targ(xx,geney(xx,n=T)))
# 10 X's for each Y
yy=rep(rnorm(T/10,sd=sqrt(zigma[2])),10)
vales[i,3]=mean(targ(genex(yy,n=T),yy))}
```

## Florid’AISTATS

Posted in pictures, R, Statistics, Travel, University life on August 31, 2016 by xi'an

The next AISTATS conference is taking place in Florida, Fort Lauderdale, on April 20-22. (The website keeps the same address one conference after another, which means all my links to the AISTATS 2016 conference in Cadiz are no longer valid. And that the above sunset from Florida is named… cadiz.jpg!) The deadline for paper submission is October 13 and there are two novel features:

1. Fast-track for Electronic Journal of Statistics: Authors of a small number of accepted papers will be invited to submit an extended version for fast-track publication in a special issue of the Electronic Journal of Statistics (EJS) after the AISTATS decisions are out. Details on how to prepare such extended journal paper submission will be announced after the AISTATS decisions.
2. Review-sharing with NIPS: Papers previously submitted to NIPS 2016 are required to declare their previous NIPS paper ID, and optionally supply a one-page letter of revision (similar to a revision letter to journal editors; anonymized) in supplemental materials. AISTATS reviewers will have access to the previous anonymous NIPS reviews. Other than this, all submissions will be treated equally.

I find both initiatives worth applauding and replicating in other machine-learning conferences. Particularly with regard to the recent debate we had at Annals of Statistics.

## Bayesian Essentials with R [book review]

Posted in Books, R, Statistics, University life on July 28, 2016 by xi'an

[A review of Bayesian Essentials that appeared in Technometrics two weeks ago, with the first author being rechristened Jean-Michael!]

“Overall this book is a very helpful and useful introduction to Bayesian methods of data analysis. I found the use of R, the code in the book, and the companion R package, bayess, to be helpful to those who want to begin using  Bayesian methods in data analysis. One topic that I would like to see added is the use of Bayesian methods in change point problems, a topic that we found useful in a recent article and which could be added to the time series chapter. Overall this is a solid book and well worth considering by its intended audience.”
David E. BOOTH
Kent State University

## the curious incident of the inverse of the mean

Posted in R, Statistics, University life on July 15, 2016 by xi'an

As I figured out while working with astronomer colleagues last week, a strange if understandable difficulty proceeds from the simplest and most studied statistical model, namely the Normal model

x~N(θ,1)

Indeed, if one reparametrises this model as x~N(υ⁻¹,1) with υ>0, a single observation x brings very little information about υ! (This is not a toy problem as it corresponds to estimating distances from observations of parallaxes.) If x gets large, υ is very likely to be small, but if x is small or negative, υ is certainly large, with no power to discriminate between highly different values. For instance, Fisher’s information for this model and parametrisation is υ⁻⁴ [the unit information on θ=υ⁻¹ multiplied by (dθ/dυ)²] and thus collapses to zero as υ grows.

While one can always hope for Bayesian miracles, they do not automatically occur. For instance, working with a Gamma prior Ga(3,10³) on υ [as informed by a large astronomy dataset] leads to a posterior expectation hardly impacted by the value of the observation x:

And using an alternative estimate like the harmonic posterior mean that is associated with the relative squared error loss does not see much more impact from the observation:
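Both estimates can be approximated by importance sampling from the prior, as in the rough sketch below. The post does not say whether 10³ in Ga(3,10³) is a rate or a scale; the sketch takes it as a scale (an assumption that matters for the outcome), and the function name post_estimates, the number of draws and the values of x are purely illustrative. The estimate attached to the relative squared error loss is computed as E[υ⁻¹|x]/E[υ⁻²|x], which minimises the posterior expectation of (υ-δ)²/υ².

```r
# Importance sampling from the Gamma prior, weighted by the N(1/upsilon,1) likelihood
# Assumption: the second parameter of Ga(3,10^3) is a scale, not a rate
post_estimates <- function(x, shape = 3, scale = 1e3, M = 1e5, seed = 42) {
  set.seed(seed)                          # same prior draws for every x
  ups <- rgamma(M, shape = shape, scale = scale)
  logw <- dnorm(x, mean = 1 / ups, sd = 1, log = TRUE)
  w <- exp(logw - max(logw))              # stabilised importance weights
  w <- w / sum(w)
  c(post_mean = sum(w * ups),             # posterior expectation of upsilon
    rel_loss_est = sum(w / ups) / sum(w / ups^2))
}

xs <- c(-2, 0, 1, 3)
ests <- sapply(xs, post_estimates)        # one column per value of x
colnames(ests) <- paste0("x=", xs)
ests
```

Under this scale assumption, the columns of ests should barely differ from one value of x to the next; a rate parametrisation can be tried by passing scale=1e-3.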

There is simply not enough information contained in one datapoint (or even several datapoints for that matter) to infer about υ.