A nice question was posted on X validated as to figure out a way to simulate a Bernoulli B(q) variate when using only a Bernoulli B(p) generator. With the additional question of handling the special case q=a/b, a rational probability. This is not exactly a Bernoulli factory problem in that q does not write as f(p), but still a neat challenge. My solution would have been similar to the one posted by William Huber, namely to simulate a sequence of B(p) or B(1-p) towards zooming on q until the simulation of the underlying uniforms U allows us to conclude at the position of U wrt q. For instance, if p>q and X~B(p) is equal to zero, the underlying uniform is more than p, hence more than q, leading to returning zero for the B(q) generation. Else, a second B(p) or B(1-p) generation means breaking the interval (0,p) into two parts, one of which allows for stopping the generation, and so on. The solution posted by William Huber contains an R code that could be easily improved by choosing for each interval between p and (1-p) towards the maximal probability of stopping. I still wonder at the ultimate optimal solution that would minimise the (average or median) number of calls to the Bernoulli(p) generator.
Archive for Monte Carlo
I am currently preparing a survey paper on the present state of computational statistics, reflecting on the massive evolution of the field since my early Monte Carlo simulations on an Apple //e, which would take a few days to return a curve of approximate expected squared error losses… It seems to me that MCMC is attracting more attention nowadays than in the past decade, both because of methodological advances linked with better theoretical tools, as for instance in the handling of stochastic processes, and because of new forays in accelerated computing via parallel and cloud computing, The breadth and quality of talks at MCMski IV is testimony to this. A second trend that is not unrelated to the first one is the development of new and the rehabilitation of older techniques to handle complex models by approximations, witness ABC, Expectation-Propagation, variational Bayes, &tc. With a corollary being an healthy questioning of the models themselves. As illustrated for instance in Chris Holmes’ talk last week. While those simplifications are inevitable when faced with hardly imaginable levels of complexity, I still remain confident about the “inevitability” of turning statistics into an “optimize+penalize” tunnel vision… A third characteristic is the emergence of new languages and meta-languages intended to handle complexity both of problems and of solutions towards a wider audience of users. STAN obviously comes to mind. And JAGS. But it may be that another scale of language is now required…
If you have any suggestion of novel directions in computational statistics or instead of dead ends, I would be most interested in hearing them! So please do comment or send emails to my gmail address bayesianstatistics…
Another exciting day at Adap’skiii!!!
Yves Atchadé presented a very recent work on the fundamental issue of estimating the asymptotic variance estimation for adaptive MCMC algorithms, with an intriguing experimental observation that a non-converging bandwidth with rate 1/n was providing better coverage than the converging rate. (I always found the issue of estimating the asymptotic variance both a tough problem and an important item in convergence assessment.) Galin Jones showed new regeneration results for componentwise MCMC samplers, with applications to quantile estimation. The iid structure produced by the regeneration mechanism allows rather naturally to introduce an adaptive improvement in those algorithms, if regeneration occurs often enough. (From the days of my Stat’Sci’ paper on convergence assessment, I love regeneration techniques for both theoretical and methodological reasons, even though they are often difficult to efficiently implement in practice.) Matti Vihola summarised several of his recent papers on the stability and convergence of adaptive MCMC algorithms, pursuing the Finnish tradition of leadership in adaptive algorithms! One point I found particularly interesting was the possibility of separating ergodicity from the Law of Large Numbers, thus reducing the constraints imposed by the containment condition. In the afternoon, Dawn Woodard discussed the convergence rate of the Gibbs sampler used for genomic motif discovery by Liu, Lawrence and Neuwald (1995). Scott Schmidler concluded the workshop by a far-ranging talk distinguishing between exploration and exploitation in adaptive MCMC algorithms, ie mixing vs burning, with illustrations using the Wang-Landau algorithm.
Thus, as in the previous editions of Adap’ski, we have had a uniformly high quality of talks about the current research in the area of adaptive algorithms (and a wee further). This shows the field is very well active and expanding, aiming at reaching a wider audience by providing verifiable convergence conditions and semi-automated softwares (like Jeff Rosenthal’s amcmc R code we used in Introducing Monte Carlo Methods with R). Looking forward Adap’ski 4 (Adap’skiV?!), hopefully in Europe and why not in Chamonix?! Which could then lead us to call the next meeting Adap’skiX…
I reran the program checking the distribution of the digits over 9 “diagonals” (obtained by acceptable permutations of rows and column) and this test again results in mostly small p-values. Over a million iterations, and the nine (dependent) diagonals, four p-values were below 0.01, three were below 0.1, and two were above (0.21 and 0.42). So I conclude in a discrepancy between my (full) sudoku generator and the hypothesised distribution of the (number of different) digits over the diagonal. Assuming my generator is a faithful reproduction of the one used in the paper by Newton and DeSalvo, this discrepancy suggests that their distribution over the sudoku grids do not agree with this diagonal distribution, either because it is actually different from uniform or, more likely, because the uniform distribution I use over the (groups of three over the) diagonal is not compatible with a uniform distribution over all sudokus…
Pierre Lecuyer is the new editor of the ACM Transactions on Modeling and Computer Simulation (TOMACS) and he has asked me to become an Area Editor for the new area of simulation in Statistics. I am quite excited by this new Æditor’s hat, since this is a cross-disciplinary journal:
The ACM Transactions on Modeling and Computer Simulation (TOMACS) provides a single archival source for the publication of high-quality research and developmental results in computer simulation. The subjects of emphasis are discrete event simulation, combined discrete and continuous simulation, as well as Monte Carlo methods. Papers in continuous simulation will also receive serious consideration if their contributions to modeling and simulation in general are substantial.
The use of simulation techniques is pervasive, extending to virtually all the sciences. TOMACS serves to enhance the understanding, improve the practice, and increase the utilization of computer simulation. Submissions should contribute to the realization of these objectives, and papers treating applications should stress their contributions vis-a-vis these objectives.
As an indication of this cross-disciplinarity, I note that most Area Editors and Associate Editors are unknown to me (except for Luc Devroye, of course!). In addition, I savour the irony of being associated with a journal of the Association for Computer Machinery (ACM), given my complete lack of practical skills! So, if you have relevant papers to submit in the field, please consider the ACM Transactions on Modeling and Computer Simulation (TOMACS) as a possible outlet.
As the discrepancy [from 1] in the sum of the nine probabilities seemed too blatant to be attributed to numerical error given the problem scale, I went and checked my R code for the probabilities and found a choose(9,3) instead of a choose(6,3) in the last line… The fit between the true distribution and the observed frequencies is now much better
Chi-squared test for given probabilities
X-squared = 16.378, df = 6, p-value = 0.01186
since a p-value of 1% is a bit in the far tail of the distribution.
A longer run of the R code of yesterday with a million sudokus produced the following qqplot.
It does look ok but no perfect. Actually, it looks very much like the graph of yesterday, although based on a 100-fold increase in the number of simulations. Now, if I test the adequation with a basic chi-square test (!), the result is highly negative:
> chisq.test(obs,p=pdiag/sum(pdiag)) #numerical error in pdiag
Chi-squared test for given probabilities
X-squared = 6978.503, df = 6, p-value < 2.2e-16
(there are seven entries for both obs and pdiag, hence the six degrees of freedom). So this casts a doubt upon the uniformity of the random generator suggested in the paper by Newton and DeSalvo or rather on my programming abilities, see next post!