## the random variable that was always less than its mean…

Although this is far from a paradox once one realises why the phenomenon occurs, it took me a few lines to understand why the empirical average of a log-normal sample appears to be a biased estimator of its mean, and why, conversely, the biased plug-in estimator does not appear to present a bias. To illustrate this “paradox”, consider the picture below, which compares both estimators of the mean of a log-normal LN(0,σ²) distribution as σ² increases: blue stands for the empirical mean, while gold corresponds to the plug-in estimator exp(σ²/2), with σ² estimated from the log-sample, as in a normal sample. (The sample is of size 10⁶.) The gold sequence remains around one, while the blue one drifts away towards zero…
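The comparison can be reproduced with a short simulation. This is only a sketch in Python, not the code behind the original picture: the function name `compare_estimators`, the seed, and the choice of the usual (n-1) sample variance of the log-sample as the estimate of σ² are all assumptions made for illustration.

```python
import math
import random


def compare_estimators(sigma, n=10**6, seed=0):
    """Return (empirical mean, plug-in estimate, true mean) for LN(0, sigma^2).

    Hypothetical helper, written for illustration only.
    """
    rng = random.Random(seed)
    # Log-sample: sigma * xi with xi standard Normal, so exp(.) is LN(0, sigma^2).
    logs = [sigma * rng.gauss(0.0, 1.0) for _ in range(n)]

    # Empirical mean of the log-normal sample itself.
    empirical = math.fsum(math.exp(v) for v in logs) / n

    # Assumption: sigma^2 is estimated by the usual sample variance of the logs.
    m = math.fsum(logs) / n
    s2 = math.fsum((v - m) ** 2 for v in logs) / (n - 1)
    plug_in = math.exp(s2 / 2.0)

    true_mean = math.exp(sigma ** 2 / 2.0)
    return empirical, plug_in, true_mean


if __name__ == "__main__":
    # Ratios to the true mean: one reading of the picture is that both
    # sequences are displayed on this scale.
    for sigma in (1.0, 2.0, 4.0, 6.0, 8.0):
        emp, plug, true = compare_estimators(sigma, n=10**5)
        print(f"sigma={sigma:3.0f}  empirical/true={emp / true:.3g}  "
              f"plug-in/true={plug / true:.3g}")
```

Dividing both estimators by the true mean exp(σ²/2) puts them on a common scale, on which a well-behaved estimator should stay near one.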

The question came up on X validated and my first reaction was to doubt an implementation whose outcome was so counter-intuitive. But then I thought further about the representation of a log-normal variate as X=exp(σξ), where ξ is a standard Normal variate, so that E[X]=exp(σ²/2). When σ grows large enough, it is nearly impossible for σξ to be larger than σ²/2, that is, for X to exceed its mean. More precisely,

P(X>E[X])=P(σξ>σ²/2)=1-Φ(σ/2)

which can be arbitrarily small.
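To get a sense of how small this probability becomes, here is a minimal Python sketch that evaluates 1-Φ(σ/2) through the complementary error function. The helper name `prob_above_mean` and the grid of σ values are illustrative choices, not part of the original post.

```python
import math


def prob_above_mean(sigma):
    """P(X > E[X]) = 1 - Phi(sigma / 2) for X ~ LN(0, sigma^2)."""
    # 1 - Phi(z) = erfc(z / sqrt(2)) / 2, which stays accurate in the far tail.
    return 0.5 * math.erfc(sigma / (2.0 * math.sqrt(2.0)))


if __name__ == "__main__":
    n = 10**6  # sample size quoted in the post
    for sigma in (1, 2, 4, 6, 8, 10):
        p = prob_above_mean(sigma)
        # n * p is the expected number of draws that exceed the mean.
        print(f"sigma={sigma:2d}  P(X>E[X])={p:.3e}  expected count={n * p:.3g}")
```

Once the expected count of draws above the mean falls below one, a sample of size 10⁶ will typically contain no observation larger than E[X], so its empirical average sits below the mean even though the estimator is unbiased.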

### 3 Responses to “the random variable that was always less than its mean…”

1. […] If the distribution of a random variable is asymmetric, the mean can give a misleading impression of the values the variable takes. Given a probability as close to one as desired, in this post we will see how to give a simple example of a random variable that, with that probability, takes values smaller than its expected value; that is, if the probability is chosen close to one, the variable is almost always below its mean. We will then see that this example gives rise to an unbiased estimator that behaves in an unexpected way. I found the example on this blog. […]

2. […] article was first published on R – Xi’an’s Og , and kindly contributed […]

3. […] article was first published on R – Xi'an's Og, and kindly contributed to […]