## wrapped Normal distribution

Posted in Books, R, Statistics with tags , , , , , on April 14, 2020 by xi'an

One version of the wrapped Normal distribution on (0,1) is expressed as a sum of Normal distributions with means shifted by all relative integers

$\psi(x;\mu,\sigma)=\sum_{k\in\mathbb Z}\varphi(x;\mu+k,\sigma)\mathbb I_{(0,1)}(x)$

which, while a parameterised density, has imho no particular statistical appeal over the use of other series. It was nonetheless the centre of a series of questions on X validated in the past weeks. Curiously used as the basis of a random walk type move over the unit cube along with a uniform component. Simulating from this distribution is easily done when seeing it as an infinite mixture of truncated Normal distributions, since the weights are easily computed

$\sum_{k\in\mathbb Z}\overbrace{[\Phi_\sigma(1-\mu-k)-\Phi_\sigma(-\mu-k)]}^{p_k(\mu,\sigma)}\times$

$\dfrac{\varphi_\sigma(x-\mu-k)\mathbb I_{(0,1)}(y)}{\Phi_\sigma(1-\mu-k)-\Phi_\sigma(-\mu-k)}$

Hence coding simulations as

wrap<-function(x, mu, sig){
ter = trunc(5*sig + 1)
return(sum(dnorm(x + (-ter):ter, mu, sig)))}
siw = function(N=1e4,beta=.5,mu,sig){
unz = (runif(N)<beta)
ter = trunc(5*sig + 1)
qrbz = diff(prbz<-pnorm(-mu + (-ter):ter, sd=sig))
ndx = sample((-ter+1):ter,N,rep=TRUE,pr=qrbz)+ter
z = sig*qnorm(prbz[ndx]+runif(N)*qrbz[ndx])-ndx+mu+ter+1
return(c(runif(sum(unz)),z[!unz]))}


and checking that the harmonic mean estimator was functioning for this density, predictably since it is lower bounded on (0,1). The prolix originator of the question was also wondering at the mean of the wrapped Normal distribution, which I derived as (predictably)

$\mu+\sum_{k\in\mathbb Z} kp_k(x,\mu,\sigma)$

but could not simplify any further except for x=0,½,1, when it is ½. A simulated evaluation of the mean as a function of μ shows a vaguely sinusoidal pattern, also predictably periodic and unsurprisingly antisymmetric, and apparently independent of the scale parameter σ…

## truncated Normal moments

Posted in Books, Kids, Statistics with tags , , , , , on May 24, 2019 by xi'an

An interesting if presumably hopeless question spotted on X validated: a lower-truncated Normal distribution is parameterised by its location, scale, and truncation values, μ, σ, and α. There exist formulas to derive the mean and variance of the resulting distribution,  that is, when α=0,

$\Bbb{E}_{\mu,\sigma}[X]= \mu + \frac{\varphi(\mu/\sigma)}{1-\Phi(-\mu/\sigma)}\sigma$

and

$\text{var}_{\mu,\sigma}(X)=\sigma^2\left[1-\frac{\mu\varphi(\mu/\sigma)/\sigma}{1-\Phi(-\mu/\sigma)} -\left(\frac{\varphi(\mu/\sigma)}{1-\Phi(-\mu/\sigma)}\right)^2\right]$

but there is no easy way to choose (μ, σ) from these two quantities. Beyond numerical resolution of both equations. One of the issues is that ( μ, σ) is not a location-scale parameter for the truncated Normal distribution when α is fixed.

## selected parameters from observations

Posted in Books, Statistics with tags , , , , , , , on December 7, 2018 by xi'an

I recently read a fairly interesting paper by Daniel Yekutieli on a Bayesian perspective for parameters selected after viewing the data, published in Series B in 2012. (Disclaimer: I was not involved in processing this paper!)

The first example is to differentiate the Normal-Normal mean posterior when θ is N(0,1) and x is N(θ,1) from the restricted posterior when θ is N(0,1) and x is N(θ,1) truncated to (0,∞). By restating the later as the repeated generation from the joint until x>0. This does not sound particularly controversial, except for the notion of selecting the parameter after viewing the data. That the posterior support may depend on the data is not that surprising..!

“The observation that selection affects Bayesian inference carries the important implication that in Bayesian analysis of large data sets, for each potential parameter, it is necessary to explicitly specify a selection rule that determines when inference  is provided for the parameter and provide inference that is based on the selection-adjusted posterior distribution of the parameter.” (p.31)

The more interesting distinction is between “fixed” and “random” parameters (Section 2.1), which separate cases where the data is from a truncated distribution (given the parameter) and cases where the joint distribution is truncated but misses the normalising constant (function of θ) for the truncated sampling distribution. The “mixed” case introduces an hyperparameter λ and the normalising constant integrates out θ and depends on λ. Which amounts to switching to another (marginal) prior on θ. This is quite interesting even though one can debate of the very notions of “random” and “mixed” “parameters”, which are those where the posterior most often changes, as true parameters. Take for instance Stephen Senn’s example (p.6) of the mean associated with the largest observation in a Normal mean sample, with distinct means. When accounting for the distribution of the largest variate, this random variable is no longer a Normal variate with a single unknown mean but it instead depends on all the means of the sample. Speaking of the largest observation mean is therefore misleading in that it is neither the mean of the largest observation, nor a parameter per se since the index [of the largest observation] is a random variable induced by the observed sample.

In conclusion, a very original article, if difficult to assess as it can be argued that selection models other than the “random” case result from an intentional modelling choice of the joint distribution.

## R package truncnorm

Posted in Statistics with tags , , , , , on November 8, 2017 by xi'an

This week in Warwick, thanks to a (rather incomprehensible) X validated question, I came across the CRAN R package truncnorm, which provides the “density, distribution function, quantile function, random generation and expected value function for the truncated normal distribution”. The short description of the sampler states that the method follows the accept-reject solution of John Geweke (1991), which I reproduced [independently!] a few years later. I may have missed the right code, but checking on the Github depository associated with this package, I did not find in the C code a trace of our optimal solution via a translated exponential proposal, since the exponential proosal, when used, relies on a scale equal to the left truncation point, a in the above picture. Obviously, this does not make a major difference in the execution time (and the algorithm is still correct!).

## truncated normal algorithms

Posted in Books, pictures, R, Statistics, University life with tags , , , , on January 4, 2017 by xi'an

Nicolas Chopin (CREST) just posted an entry on Statisfaction about the comparison of truncated Normal algorithms run by Alan Rogers, from the University of Utah. Nicolas wrote a paper in Statistics and Computing about a simulation method, which proposes a Ziggurat type of algorithm for this purpose, and which I do not remember reading, thanks to my diminishing memory buffer!  As shown in the picture below, when truncating to the half-line (a,∞), this method improves upon my accept-reject approach except in the far tails.

On the top graph, made by Alan Rogers, my uniform proposal (r) seems to be doing better for a Normal truncated to (a,b) when b<0, or when a gets large and close to b. Nicolas’ ziggurat (c) works better than the Gaussian accept-reject method (c) on the positive part. (I wonder what the exponential proposal (e) stands for, in terms of scale parameter.)