**A** question that came out on X validated is asking for help in figuring out the UMVUE (uniformly minimal variance unbiased estimator) of (1-θ)^{½} when observing iid Bernoulli B(θ). As it happens, there is no unbiased estimator of this quantity and hence not UMVUE. But there exists a Bernoulli factory producing a coin with probability (1-θ)^{½} from draws of a coin with probability θ, hence a mean to produce unbiased estimators of this quantity. Although of course UMVUE does not make sense in this sequential framework. While Nacu & Peres (2005) were uncertain there was a Bernoulli factory for θ^{½}, witness their Question #1, Mendo (2018) and Thomas and Blanchet (2018) showed that there does exist a Bernoulli factory solution for θ^{a}, 0≤a≤1, with constructive arguments that only require the series expansion of θ^{½}. In my answer to that question, using a straightforward R code, I tested the proposed algorithm, which indeed produces an unbiased estimate of θ^{½}… (Most surprisingly, the question got closed as a “self-study” question, which sounds absurd since it could not occur as an exercise or an exam question, unless the instructor is particularly clueless.)

## Archive for industrial ruins

## another Bernoulli factory

Posted in Books, Kids, pictures, R, Statistics with tags Bernoulli factory, Cockatoo Island, cross validated, existence of unbiased estimators, industrial ruins, Sydney Harbour, UMVUE on May 18, 2020 by xi'an## CRAN does not validate R packages!

Posted in pictures, R, University life with tags bayess, Cockatoo Island, CRAN, industrial ruins, mcsm, R, R package, S, sample, Sydney Harbour, Ultimixt on July 10, 2019 by xi'an**A** friend called me the other day for advice on how to submit an R package to CRAN along with a proof his method was mathematically sound. I replied with some items of advice taken from my (limited) experience with submitting packages. And with the remark that CRAN would not validate the mathematical contents of the associated package manual. Nor even the validity of the R code towards delivering the right outcome as stated in the manual. This shocked him quite seriously as he thought having a package accepted by CRAN was a stamp of validation of both the method and the R code. It would be nice of course but would require so much manpower that it seems unrealistic. Some middle ground is to aim at a journal or a peer community validation where both code and methods are vetted. Which happens for instance with the Journal of Computational and Graphical Statistics. Or the Journal of Statistical Software (which should revise its instructions to authors that states “The majority of software published in JSS is written in S, MATLAB, SAS/IML, C++, or Java”. S, really?!)

As for the validity of the latest release of R (currently R-3.6.1 which came out on 2019-07-05, named Action of the Toes!), I figure the bazillion R programs currently running should be able to detect any defect pretty fast, although awareness of the incredible failure of sample() reported in an earlier post took a while to appear.

## robust Bayesian synthetic likelihood

Posted in Statistics with tags ABC, Australia, Bayesian synthetic likelihood, Brisbane, industrial ruins, MCMC, Melbourne, Metropolis-Hastings algorithm, misspecified model, Monash University, pseudo-likelihood, QUT, summary statistics, Sydney Harbour on May 16, 2019 by xi'an**D**avid Frazier (Monash University) and Chris Drovandi (QUT) have recently come up with a robustness study of Bayesian synthetic likelihood that somehow mirrors our own work with David. In a sense, Bayesian synthetic likelihood is definitely misspecified from the start in assuming a Normal distribution on the summary statistics. When the data generating process is misspecified, even were the Normal distribution the “true” model or an appropriately converging pseudo-likelihood, the simulation based evaluation of the first two moments of the Normal is biased. Of course, for a choice of a summary statistic with limited information, the model can still be *weakly compatible* with the data in that there exists a pseudo-true value of the parameter θ⁰ for which the synthetic mean μ(θ⁰) is the mean of the statistics. (Sorry if this explanation of mine sounds unclear!) Or rather the Monte Carlo estimate of μ(θ⁰) coincidences with that mean.The same Normal toy example as in our paper leads to very poor performances in the MCMC exploration of the (unsympathetic) synthetic target. The robustification of the approach as proposed in the paper is to bring in an extra parameter to correct for the bias in the mean, using an additional Laplace prior on the bias to aim at sparsity. Or the same for the variance matrix towards inflating it. This over-parameterisation of the model obviously avoids the MCMC to get stuck (when implementing a random walk Metropolis with the target as a scale).