
Archive for central limit theorem
errno EFBIG
Posted in Books, Kids, Linux, Statistics, University life with tags asymptotic variance, central limit theorem, CLT, common mistakes, cross validated, errno, teaching on October 12, 2020 by xi'an

simulating a sum of Uniforms
Posted in Statistics with tags Bates distribution, central limit theorem, characteristic function, Irwin-Hall distribution, Luc Devroye, overflow, R, Sherlock Holmes, sum of uniforms, underflow, unifed on May 1, 2020 by xi'an

When considering the distribution of the sum (or average) of N Uniform variates, called Irwin-Hall for the sum and Bates for the average, simulating the N uniforms and adding them incurs a cost linear in N. The density of the resulting variate is well-known,

f(x) = [1/(N-1)!] Σ_{k=0}^{⌊x⌋} (-1)^k C(N,k) (x-k)^{N-1},   0 ≤ x ≤ N,

but evaluating it similarly involves a sum with of the order of N terms. Furthermore, controlling the terms in this alternating sum may prove delicate, as shown by the R function unifed::dirwin.hall(), whose code
for (k in 0:floor(x)) ret1 <- ret1 + (-1)^k * choose(n, k) * (x - k)^(n - 1)
quickly becomes unreliable (although I managed an easy fix by using logs and a reference value for the magnitude of the terms in the summation). There is however a quick solution provided by [of course!] Devroye (NURVG, Section XIV.3, p.708), using the fact that the characteristic function of the Irwin-Hall distribution [for Uniforms over (-1,1)] is quite straightforward, namely (sin(t)/t)^N, which means the density can be bounded from above and leads to an algorithm (NURVG, Section XIV.3, p.714) with complexity at most N to the power 5/8, even if this is not clearly spelled out in the book. Obviously, it can be objected that for N large enough, like N=20, the difference between the true distribution and the CLT approximation is quite negligible (reminding me of my early simulating days when generating a Normal variate was done by averaging a dozen uniforms and properly rescaling!). But this is not an exact approach, and the exact correction proves too costly, as shown by Section XIV.4 on the simulation of sums in NURVG. So… the game is afoot!
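For the record, a minimal R sketch of such a log-scale evaluation (not the unifed::dirwin.hall() code itself) computes the log-magnitude of each term, rescales by the largest one, and only then exponentiates and sums with signs:

## log-scale evaluation of the Irwin-Hall density (sum of N iid Uniform(0,1) variates);
## a minimal sketch, not the unifed implementation
dirwin_hall_log <- function(x, N) {
  if (x <= 0 || x >= N) return(0)
  k <- 0:floor(x)
  lmag <- lchoose(N, k) + (N - 1) * log(x - k)  # log-magnitude of each term
  ref <- max(lmag)                              # reference value against overflow
  s <- sum((-1)^k * exp(lmag - ref))            # signed sum of rescaled terms
  exp(ref - lgamma(N)) * s                      # restore scale, divide by (N-1)!
}

## quick check against a histogram of simulated sums
N <- 20
sims <- replicate(1e5, sum(runif(N)))
hist(sims, breaks = 100, freq = FALSE)
curve(Vectorize(dirwin_hall_log)(x, N), add = TRUE, col = "red")

The rescaling keeps the individual terms representable, although cancellation in the alternating sum may still degrade precision for (much) larger N near the centre of the range.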
essentials of probability theory for statisticians
Posted in Books, Kids, pictures, Statistics, Travel, University life with tags asymptotics, book review, central limit theorem, CHANCE, Cydonia, face on Mars, Glivenko-Cantelli Theorem, Henri Lebesgue, Lebesgue integration, measure theory, pareidolia, probability theory, quincunx on April 25, 2020 by xi'an

On yet another confined sunny lazy Sunday morning, I read through Proschan and Shaw's Essentials of Probability Theory for Statisticians, a CRC Press book that was sent to me quite a while ago for review. The book was indeed published in 2016. Before moving to serious things, let me dispose of the customary issue with the cover: I have trouble getting the point of the “face on Mars” being adopted as the cover of a book on probability theory (rather than a book on, say, pareidolia). There is a brief paragraph on post-facto probability calculations, stating how meaningless the question of the probability of this shade appearing by “chance” on a Viking Orbiter picture is, but this is so marginal that I would have preferred any other figure from the book!
The book plans to cover the probability essentials for dealing with graduate-level statistics, in particular convergence, conditioning, and the paradoxes that follow from using non-rigorous approaches to probability. This range completely fits my own prerequisites for statistics students in my classes and of course involves recourse to (Lebesgue) measure theory. It is also a goal I find both commendable and comforting, as my past experience with exchange students left me with the feeling that rigorous probability theory has mostly been scrapped from graduate programs. While the book is not extremely formal, it provides a proper motivation for the essential need of measure theory to handle the complexities of statistical analysis, and in particular of asymptotics. It thus relies as much as possible on examples that stem from or relate to statistics, even though most may appear standard to senior readers, for instance the consistency of the sample median or a weak version of the Glivenko-Cantelli theorem.

The final chapter is dedicated to applications (in the probabilist's sense!) that emerged from statistical problems. I felt this final chapter was somewhat stretched compared with what it could have been, as for instance with the multiple motivations of the conditional expectation, but this simply makes for more material. If I had to teach this material to students, I would certainly rely on the book, in particular because of the repeated appearances of the quincunx for motivating non-Normal limits! (A typo near Fatou's lemma missed the dominating measure. And I did not notice the Riemann notation dx being extended to the measure in a formal manner.)
[Disclaimer about potential self-plagiarism: this post or an edited version will eventually appear in my Books Review section in CHANCE.]
asymptotics of synthetic likelihood [a reply from the authors]
Posted in Books, Statistics, University life with tags ABC, approximate Bayesian inference, Bayesian inference, Bayesian synthetic likelihood, central limit theorem, effective sample size, frequentist confidence, local regression, misspecification, pseudo-marginal MCMC, response, tolerance, uncertainty quantification on March 19, 2019 by xi'an

[Here is a reply from David, Chris, and Robert on my earlier comments, highlighting some points I had missed or misunderstood.]
Dear Christian
Thanks for your interest in our synthetic likelihood paper and the thoughtful comments you wrote about it on your blog. We’d like to respond to the comments to avoid some misconceptions.
Your first claim is that we don't account for the differing number of simulation draws required for each parameter proposal in ABC and synthetic likelihood. This doesn't seem correct; see the discussion below Lemma 4 at the bottom of page 12. The comparison between methods is on the basis of effective sample size per model simulation.
As you say, in the comparison of ABC and synthetic likelihood, we consider the ABC tolerance ε and the number of simulations per likelihood estimate M in synthetic likelihood as functions of n. Then, for tuning parameter choices that result in the same uncertainty quantification asymptotically (the same, asymptotically, as the true posterior given the summary statistic), we can look at the effective sample size per model simulation. Your objection here seems to be that, even though uncertainty quantification is similar for large n, for finite n it may differ. This is true, but similar arguments can be directed at almost any asymptotic analysis, so this doesn't seem a serious objection to us, at least. We don't find it surprising that the strong synthetic likelihood assumptions, when accurate, give you something extra in terms of computational efficiency.
We think mixing up the synthetic likelihood/ABC comparison with the comparison between correctly specified and misspecified covariance in Bayesian synthetic likelihood is a bit unfortunate, since these situations are quite different. The first involves correct uncertainty quantification asymptotically for both methods. Only a very committed reader who looked at our paper in detail would understand what you say here. The question we are asking with the misspecified covariance is the following. If the usual Bayesian synthetic likelihood analysis is too much for our computational budget, can something still be done to quantify uncertainty? We think the answer is yes, and with the misspecified covariance we can reduce the computational requirements by an order of magnitude, but with an appropriate cost statistically speaking. The analyses with misspecified covariance give valid frequentist confidence regions asymptotically, so this may still be useful if it is all that can be done. The examples as you say show something of the nature of the trade-off involved.
We aren’t quite sure what you mean when you are puzzled about why we can avoid having M to be O(√n). Note that because of the way the summary statistics satisfy a central limit theorem, elements of the covariance matrix of S are already O(1/n), and so, for example, in estimating μ(θ) as an average of M simulations for S, the elements of the covariance matrix of the estimator of μ(θ) are O(1/(Mn)). Similar remarks apply to estimation of Σ(θ). I’m not sure whether that gets to the heart of what you are asking here or not.
In our email discussion you mention the fact that if M increases with n, then the computational burden of a single likelihood approximation, and hence of generating a single parameter sample, also increases with n. This is true, but unavoidable if you want exact uncertainty quantification asymptotically, and M can be allowed to increase with n at any rate. With a fixed M there will be some approximation error, which is often small in practice. The situation with vanilla ABC methods will be even worse, in terms of the number of proposals required to generate a single accepted sample, in the case where exact uncertainty quantification is desired asymptotically. As shown in Li and Fearnhead (2018), if regression adjustment is used with ABC and you can find a good proposal in their sense, one can avoid this. For vanilla ABC, if the focus is on point estimation and exact uncertainty quantification is not required, the situation is better. Of course, as you show in your nice recent ABC paper for misspecified models, written jointly with David Frazier and Judith Rousseau, the choice of whether to use regression adjustment can be subtle in the case of misspecification.
In our previous paper Price, Drovandi, Lee and Nott (2018) (which you also reviewed on this blog) we observed that if the summary statistics are exactly normal, then you can sample from the summary statistic posterior exactly with finite M in the synthetic likelihood by using pseudo-marginal ideas together with an unbiased estimate of a normal density due to Ghurye and Olkin (1962). When S satisfies a central limit theorem so that S is increasingly close to normal as n gets large, we conjecture that it is possible to get exact uncertainty quantification asymptotically with fixed M if we use the Ghurye and Olkin estimator, but we have no proof of that yet (if it is true at all).
Thanks again for being interested enough in the paper to comment, much appreciated.
David, Chris, Robert.