## on approximations of Φ and Φ⁻¹

Posted in Books, Kids, R, Statistics with tags , , , , , , , , , on June 3, 2021 by xi'an

As I was working on a research project with graduate students, I became interested in fast and not necessarily very accurate approximations to the normal cdf Φ and its inverse. Reading through this 2010 paper of Richards et al., using for instance Polya’s

$F_0(x) =\frac{1}{2}(1+\sqrt{1-\exp(-2x^2/\pi)})$

(with another version replacing 2/π with the squared root of π/8) and

$F_2(x)=1/1+\exp(-1.5976x(1+0.04417x^2))$

not to mention a rational faction. All of which are more efficient (in R), if barely, than the resident pnorm() function.

      test replications elapsed relative user.self
3 logistic       100000   0.410    1.000     0.410
2    polya       100000   0.411    1.002     0.411
1 resident       100000   0.455    1.110     0.455


For the inverse cdf, the approximations there are involving numerical inversion except for

$F_0^{-1}(p) =(-\pi/2 \log[1-(2p-1)^2])^{\frac{1}{2}}$

which proves slightly faster than qnorm()

       test replications elapsed relative user.self
2 inv-polya       100000   0.401    1.000     0.401
1  resident       100000   0.450    1.000     0.450


## warped Cauchys

Posted in Books, Kids, R, Statistics with tags , , , on May 4, 2021 by xi'an

A somewhat surprising request on X validated about the inverse cdf representation of a wrapped Cauchy distribution. I had not come across this distribution, but its density being

$f_{WC}(\theta;\gamma)=\sum_{n=-\infty}^\infty \frac{\gamma}{\pi(\gamma^2+(\theta+2\pi n)^2)}\mathbb I_{ -\pi<\theta<\pi}$

means that it is the superposition of shifted Cauchys on the unit circle (with nice complex representations). As such, it is easily simulated by re-shifting a Cauchy back to (-π,π), i.e. using the inverse transform

$\theta = [\gamma\tan(\pi U-\pi/2)+\pi]\ \text{mod}\,(2\pi) - \pi$

## nested sampling: any prior anytime?!

Posted in Books, pictures, Statistics, Travel with tags , , , , , , , , , , , , on March 26, 2021 by xi'an

A recent arXival by Justin Alsing and Will Handley on “nested sampling with any prior you like” caught my attention. If only because I was under the impression that some priors would not agree with nested sampling. Especially those putting positive weight on some fixed levels of the likelihood function, as well as improper priors.

“…nested sampling has largely only been practical for a somewhat restrictive class of priors, which have a readily available representation as a transform from the unit hyper-cube.”

Reading from the paper, it seems that the whole point is to demonstrate that “any proper prior may be transformed onto the unit hypercube via a bijective transformation.” Which seems rather straightforward if the transform is not otherwise constrained: use a logit transform in every direction. The paper gets instead into the rather fashionable direction of normalising flows as density representations. (Which suddenly reminded me of the PhD dissertation of Rob Cornish at Oxford, which I examined last year. Even though nested was not used there in the same understanding.) The purpose appearing later (in the paper) or in fine to express a random variable simulated from the prior as the (generative) transform of a Uniform variate, f(U). Resuscitating the simulation from an arbitrary distribution from first principles.

“One particularly common scenario where this arises is when one wants to use the (sampled) posterior from one experiment as the prior for another”

But I remained uncertain at the requirement for this representation in implementing nested sampling as I do not see how it helps in bypassing the hurdles of simulating from the prior constrained by increasing levels of the likelihood function. It would be helpful to construct normalising flows adapted to the truncated priors but I did not see anything related to this version in the paper.

The cosmological application therein deals with the incorporation of recent measurements in the study of the ΛCDM cosmological model, that is, more recent that the CMB Planck dataset we played with 15 years ago. (Time flies, even if an expanding Universe!) Namely, the Baryon Oscillation Spectroscopic Survey and the SH0ES collaboration.

## why does rbinom(1,1) differ from sample(0:1,1) with the same seed?

Posted in Statistics with tags , , , , , , , , , on February 17, 2021 by xi'an
> set.seed(1)
> rbinom(10,1,0.5)
[1] 0 0 1 1 0 1 1 1 1 0
> set.seed(1)
> sample(c(0,1), 10, replace = TRUE)
[1] 0 1 0 1 0 0 1 0 1 1

This rather legitimate question was posted on X validated last week, the answer being that the C codes behind both functions do not use pseudo-random generators in the same manner. For instance, rbinom does get involved beyond a mean value of 30 (and otherwise resorts to the inverse cdf approach). And following worries about sample biases, sample was updated in 2019 (and also seems to resort to the inverse cdf when the mean is less than 200). However, when running the above code on my machine, still using the 2018 R version 3.4.4!, I recover the same outcome:

> set.seed(1)
> rbinom(10,1,0.5)
[1] 0 0 1 1 0 1 1 1 1 0

> set.seed(1)
> sample(c(0,1), 10, replace = TRUE)
[1] 0 0 1 1 0 1 1 1 1 0> set.seed(1)
> qbinom(runif(10),1,0.5)
[1] 0 0 1 1 0 1 1 1 1 0
> set.seed(1)
> 1*(runif(10)>.5)
[1] 0 0 1 1 0 1 1 1 1 0


## how can a posterior be uniform?

Posted in Books, Statistics with tags , , , , , , on September 1, 2020 by xi'an

How can we have a posterior distribution that is a uniform distribution?

With the underlying message that a uniform distribution does not depend on the data, since it is uniform! While it is always possible to pick the parameterisation a posteriori so that the posterior is uniform, by simply using the inverse cdf transform, or to pick the prior a posteriori so that the prior cancels the likelihood function, there exist more authentic discrete examples of a data realisation leading to a uniform distribution, as eg in the Multinomial model. I deem the confusion to stem from the impression either that uniform means non-informative (what we could dub Laplace’s daemon!) or that it could remain uniform for all realisations of the sampled rv.