Archive for empirical cdf

variance of an exponential order statistic

Posted in Books, Kids, pictures, R, Statistics, University life on November 10, 2016 by xi'an

This afternoon, one of my Monte Carlo students at ENSAE came to me with an exercise from Monte Carlo Statistical Methods that I did not remember having written. And I thus “charged” George Casella with authorship for that exercise!

Exercise 3.3 starts with the usual question (a) about the (Binomial) precision of a tail probability estimator, which is easy to answer by iterating simulation batches. Expressed via the empirical cdf, it is concerned with the vertical variability of this empirical cdf. The second part (b) is more unusual in that it again starts with the evaluation of a tail probability, but then switches to finding the .995 quantile by simulation, with an estimate precise to three digits. Which amounts to assessing the horizontal variability of this empirical cdf.

As we discussed this question, my first suggestion was to aim at a value of N, the number of Monte Carlo simulations, such that the .995 x N-th spacing had a length of less than one thousandth of the .995 x N-th order statistic. In the case of the Exponential distribution suggested in the exercise, generating order statistics is straightforward since, as described in Devroye (Section V.3.3), the i-th spacing is an Exponential variate with rate N-i+1. This is so fast that Devroye suggests simulating Uniform order statistics by inverting Exponential order statistics (p.220)!
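Here is a quick R rendering of this spacing representation (a sketch of mine, not code from the book): since the i-th spacing of an Exp(1) N-sample is an Exponential variate with rate N-i+1, cumulated sums of such variates return the whole vector of order statistics in one go.

set.seed(1)
N <- 1e4
sp <- rexp(N, rate = N:1)  # i-th spacing simulated with rate N-i+1
x <- cumsum(sp)            # x[k] is the k-th order statistic X_(k)
# sanity check against the brute-force version, sorting an iid sample:
qqplot(x, sort(rexp(N)))   # both follow the law of Exp(1) order statistics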

However, while still discussing the problem with my student, I came to a better expression of the question: to figure out the variance of the .995 x N-th order statistic in the Exponential case. Working with the density of this order statistic, however, led nowhere useful. A bit later, after googling the problem, I came upon this Stack Exchange solution that made use of the spacing result mentioned above, namely that the expectation and variance of the k-th order statistic are

\mathbb{E}[X_{(k)}]=\sum\limits_{i=N-k+1}^N\frac1i,\qquad \mbox{Var}(X_{(k)})=\sum\limits_{i=N-k+1}^N\frac1{i^2}

which leads to the proper condition on N when imposing the variability constraint.
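For the record, a small R sketch of this condition, with the threshold of one thousandth being my reading of the three-digit requirement rather than a value set by the exercise:

relprec <- function(N) {
  k <- ceiling(.995 * N)
  i <- (N - k + 1):N
  sqrt(sum(1 / i^2)) / sum(1 / i)  # sd of X_(k) relative to its expectation
}
sapply(10^(3:7), relprec)  # decreases with N; pick the first below 1e-3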

the random variable that was always less than its mean…

Posted in Books, Kids, R, Statistics on May 30, 2016 by xi'an

Although this is far from a paradox once one realises why the phenomenon occurs, it took me a few lines to understand why the empirical average of a log-normal sample is apparently a biased estimator of its mean. And why, conversely, the biased plug-in estimator does not appear to present a bias. To illustrate this "paradox", consider the picture below, which compares both estimators of the mean of a log-normal LN(0,σ²) distribution as σ² increases: blue stands for the empirical mean, while gold corresponds to the plug-in estimator exp(σ²/2) when σ² is estimated from the log-sample, as in a normal sample. (The sample is of size 10⁶.) The gold sequence remains around one, while the blue one drifts away towards zero…

The question came on X validated and my first reaction was to doubt an implementation whose outcome was so counter-intuitive. But then I thought further about the representation of a log-normal variate as exp(σξ), where ξ is a standard Normal variate. When σ grows large enough, it is near impossible for σξ to be larger than σ²/2. More precisely,

\mathbb{P}(X>\mathbb{E}[X])=\mathbb{P}(\sigma\xi>\sigma^2/2)=1-\Phi(\sigma/2)

which can be arbitrarily small.
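A minimal R reproduction of the phenomenon, with settings of my own rather than those behind the picture:

set.seed(1)
n <- 1e6
for (s in c(1, 3, 5, 10)) {
  x <- s * rnorm(n)                  # log-sample from N(0,s²)
  cat(s, mean(exp(x)) / exp(s^2/2),  # empirical mean over true mean exp(s²/2)
      exp(var(x)/2) / exp(s^2/2),    # plug-in estimate over true mean
      "\n")
}
# the plug-in ratio stays near one while the empirical ratio collapses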

measuring honesty, with p=.006…

Posted in Books, Kids, pictures, Statistics on April 19, 2016 by xi'an


Simon Gächter and Jonathan Schulz recently published a paper in Nature attempting to link intrinsic (individual) honesty with a measure of corruption in the subjects' home countries, based on more than 2,500 subjects in 23 countries. [I am now reading Nature on a regular basis, thanks to a coffee-room subscription at our lab!] Now, I may sound naïvely surprised at the methodological contents of the paper and at its publication in Nature, but I never read psychology papers, only Andrew's rants at 'em!!!

“The results are consistent with theories of the cultural co-evolution of institutions and values, and show that weak institutions and cultural legacies that generate rule violations not only have direct adverse economic consequences, but might also impair individual intrinsic honesty that is crucial for the smooth functioning of society.”

The experiment behind this article and its rather deep claims is however quite modest: the authors asked people to throw a die twice without monitoring and rewarded them according to the reported result of the first throw. Being dishonest here means reporting a false result towards a larger monetary gain. This sounds rather artificial and difficult to relate to dishonest behaviours in realistic situations, as I do not see much appeal in cheating for 50 cents or so. Especially since the experiment accounted for differences in wealth backgrounds by adapting the stakes to the hourly wage in each country ("from $0.70 in Vietnam to $4.20 in the Netherlands"). Furthermore, the subjects of this experiment were undergraduate students in economics departments: depending on the country, this may create a huge bias in terms of social background, as I do not think access to universities is the same in Germany and in Guatemala, say.

“Our expanded scope of societies therefore provides important support and qualifications for the generalizability of these theories—people benchmark their justifiable dishonesty with the extent of dishonesty they see in their societal environment.”

The statistical analysis behind this "important support" is not earth-shattering either. The main argument is based on the empirical cdfs of the gain repartitions per country (in the above graph), with tests that the overall empirical cdf for low-corruption countries is higher than the corresponding one for high-corruption countries. The comparison of the cumulated or pooled cdfs across countries from each group is disputable, in that there is no reason the different countries should share the same "honesty" cdf. The groups themselves are built on a rough measure of "prevalence of rule violations". It is also rather surprising that, for both groups, the percentage of zero-gain answers is "significantly" larger than the expected value of 2.8% [that is, 1/36, both throws showing a six] under the assumption of "justified dishonesty". In any case, there is no compelling argument as to why students not reporting the value of the first die would naturally opt for the maximum of the two dice. Hence a certain bemusement at this paper appearing in Nature and even deserving an introductory coverage in the first part of the journal…
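As an aside, the 2.8% figure is easily recovered: assuming "justified dishonesty" means reporting the more profitable of the two throws, a zero-gain report (a six) only occurs when both throws show a six, an event of probability 1/36 ≈ .028. A quick R check of mine:

d1 <- sample(6, 1e6, replace = TRUE)  # first throw
d2 <- sample(6, 1e6, replace = TRUE)  # second throw
mean(d1 == 6 & d2 == 6)               # close to 1/36, i.e., 2.8%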

the problem of assessing statistical methods

Posted in Books, pictures, Statistics, University life on November 4, 2015 by xi'an

A new arXival today by Abigail Arnold and Jason Loeppky discusses how simulation studies are and should be conducted when assessing statistical methods.

“Obviously there is no one model that will universally outperform the rest. Recognizing the “No Free Lunch” theorem, the logical question to ask is whether one model will perform best over a given class of problems. Again, we feel that the answer to this question is of course no. But we do feel that there are certain methods that will have a better chance than other methods.”

I find the assumptions or prerequisites of the paper arguable [in the sense of 2. open to disagreement; not obviously correct], not even mentioning the switch from models to methods in the above, in that I will not be convinced that one method outperforms another by simply looking at a series of simulation experiments. (Which is why I find some machine learning papers unconvincing, when they introduce a new methodology and run it through a couple of benchmarks.) This also reminds me of Samaniego's Comparison of the Bayesian and frequentist approaches, which requires a secondary prior to run the comparison. (And hence is inconclusive.)

“The papers above typically show the results as a series of side-by-side boxplots (…) for each method, with one plot for each test function and sample size. Conclusions are then drawn from looking at a handful of boxplots which often look very cluttered and usually do not provide clear evidence as to the best method(s). Alternatively, the results will be summarized in a table of average performance (…) These tables are usually overwhelming to look at and interpretations are incredibly inefficient.”

Agreed, boxplots are terrible (my friend Jean-Michel is forever arguing against them!). Tables are worse. But why don't we question RMSE as well? It is most often a very reductive way of comparing methods. I also agree with the point that the design of simulation studies is almost always overlooked, inducing a false sense of precision while failing to cover a wide enough range of cases. However, and once more, I question the prerequisites for comparing methods through simulations for the purpose of ranking those methods. (Which is not the perspective adopted by James and Nicolas when criticising the use of the Pima Indian dataset.)

“The ECDF allows for quick assessments of methods over a large array of problems to get an overall view while of course not precluding comparisons on individual functions (…) We hope that readers of this paper agree with our opinions and strongly encourage everyone to rely on the ECDF, at least as a starting point, to display relevant statistical information from simulations.”

Drawing a comparison with the benchmarking of optimisation methods, the authors suggest ranking statistical methods via the empirical cdf of their performances or accuracies across (benchmark) problems, arguing that "significant benefit is gained by [this] collapsing". I am quite sceptical [as often] of the argument: first because using an (e)cdf means the comparison is unidimensional; second because I see no reason why two cdfs should be easily comparable; third because the collapsing over several problems only operates when the errors for those different problems do not overlap.
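For illustration, here is a minimal R sketch of the kind of display the authors advocate, on made-up performance figures of mine rather than their benchmarks:

set.seed(1)
P <- 50                    # number of benchmark problems
errA <- rexp(P, rate = 2)  # stand-in errors of method A across problems
errB <- rexp(P, rate = 1)  # stand-in errors of method B across problems
plot(ecdf(errA), main = "ecdf of errors across problems", xlab = "error")
lines(ecdf(errB), col = "gold")
# a curve uniformly above the other dominates; when the two error clouds
# overlap, as they do here, the collapsed display remains inconclusive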

Statistics slides (3)

Posted in Books, Kids, Statistics, University life on October 9, 2014 by xi'an

La Défense from Paris-Dauphine, Nov. 15, 2012

Here is the third set of slides for my third-year statistics course. Nothing out of the ordinary, but the opportunity to link statistics and simulation for students not yet exposed to Monte Carlo methods. (No ABC yet but, who knows, I may use ABC as an entry to Bayesian statistics, following Don Rubin's example! Surprising typo on the Project Euclid page for this 1984 paper, by the way…) On Monday, I had the pleasant surprise to see Shravan Vasishth in the audience, as he is visiting Université Denis Diderot (Paris 7) this month.