“Many researchers might have observed that the magnitude of a correlation is pretty unstable in small samples.”

**O**n the statsblog aggregator, I spotted an entry that eventually led me to this post about the best paper award for the evolution of correlation, a paper published in the *Journal of Research in Personality*. A journal not particularly well-known for its statistical methodology input. The main message of the paper is that, while the empirical correlation is highly varying for small n’s, an interval (or *corridor of stability*!) can be constructed so that a Z-transform of the correlation does not vary away from the true value by more than a chosen quantity like 0.1. And the *point of stability* is then defined as the sample size after which the trajectory of the estimate does not leave the corridor… Both corridor and point depending on the true and unknown value of the correlation parameter by the by. Which implies resorting to bootstrap to assess the distribution of this point of stability. And deduce quantiles that can be used for… For what exactly?! Setting the necessary sample size? But this requires a preliminary run to assess the possible value of the true correlation ρ. The paper concludes that “for typical research scenarios reasonable trade-offs between accuracy and confidence start to be achieved when n approaches 250”. This figure was achieved by a bootstrap study on a bivariate Gaussian population with 10⁶ datapoints, yes indeed 10⁶!, and bootstrap samples of maximal size 10³. All in all, while I am at a loss as to why the *Journal of Research in Personality* would promote the estimation of a correlation coefficient with 250 datapoints, there is nothing fundamentally wrong with the paper (!), except for this recommendation of the 250 datapoint, as the figure stems from a specific setting with particular calibrations and cannot be expected to apply in every and all cases.

Actually, the graph in the paper was the thing that first attracted my attention because it looks very much like the bootstrap examples I show my third year students to demonstrate the appeal of bootstrap. Which is not particularly useful in the current case. A quick simulation on 100 samples of size 300 showed [above] that Monte Carlo simulations produce a tighter confidence band than the one created by bootstrap, in the Gaussian case. Continue reading