## CRAN does not validate R packages!

Posted in pictures, R, University life with tags , , , , , , , , , , on July 10, 2019 by xi'an

A friend called me the other day for advice on how to submit an R package to CRAN along with a proof his method was mathematically sound. I replied with some items of advice taken from my (limited) experience with submitting packages. And with the remark that CRAN would not validate the mathematical contents of the associated package manual. Nor even the validity of the R code towards delivering the right outcome as stated in the manual. This shocked him quite seriously as he thought having a package accepted by CRAN was a stamp of validation of both the method and the R code. It would be nice of course but would require so much manpower that it seems unrealistic. Some middle ground is to aim at a journal or a peer community validation where both code and methods are vetted. Which happens for instance with the Journal of Computational and Graphical Statistics. Or the Journal of Statistical Software (which should revise its instructions to authors that states “The majority of software published in JSS is written in S, MATLAB, SAS/IML, C++, or Java”. S, really?!)

As for the validity of the latest release of R (currently R-3.6.1 which came out on 2019-07-05, named Action of the Toes!), I figure the bazillion R programs currently running should be able to detect any defect pretty fast, although awareness of the incredible failure of sample() reported in an earlier post took a while to appear.

## truncated Normal moments

Posted in Books, Kids, Statistics with tags , , , , , on May 24, 2019 by xi'an

An interesting if presumably hopeless question spotted on X validated: a lower-truncated Normal distribution is parameterised by its location, scale, and truncation values, μ, σ, and α. There exist formulas to derive the mean and variance of the resulting distribution,  that is, when α=0,

$\Bbb{E}_{\mu,\sigma}[X]= \mu + \frac{\varphi(\mu/\sigma)}{1-\Phi(-\mu/\sigma)}\sigma$

and

$\text{var}_{\mu,\sigma}(X)=\sigma^2\left[1-\frac{\mu\varphi(\mu/\sigma)/\sigma}{1-\Phi(-\mu/\sigma)} -\left(\frac{\varphi(\mu/\sigma)}{1-\Phi(-\mu/\sigma)}\right)^2\right]$

but there is no easy way to choose (μ, σ) from these two quantities. Beyond numerical resolution of both equations. One of the issues is that ( μ, σ) is not a location-scale parameter for the truncated Normal distribution when α is fixed.

## robust Bayesian synthetic likelihood

Posted in Statistics with tags , , , , , , , , , , , , , on May 16, 2019 by xi'an

David Frazier (Monash University) and Chris Drovandi (QUT) have recently come up with a robustness study of Bayesian synthetic likelihood that somehow mirrors our own work with David. In a sense, Bayesian synthetic likelihood is definitely misspecified from the start in assuming a Normal distribution on the summary statistics. When the data generating process is misspecified, even were the Normal distribution the “true” model or an appropriately converging pseudo-likelihood, the simulation based evaluation of the first two moments of the Normal is biased. Of course, for a choice of a summary statistic with limited information, the model can still be weakly compatible with the data in that there exists a pseudo-true value of the parameter θ⁰ for which the synthetic mean μ(θ⁰) is the mean of the statistics. (Sorry if this explanation of mine sounds unclear!) Or rather the Monte Carlo estimate of μ(θ⁰) coincidences with that mean.The same Normal toy example as in our paper leads to very poor performances in the MCMC exploration of the (unsympathetic) synthetic target. The robustification of the approach as proposed in the paper is to bring in an extra parameter to correct for the bias in the mean, using an additional Laplace prior on the bias to aim at sparsity. Or the same for the variance matrix towards inflating it. This over-parameterisation of the model obviously avoids the MCMC to get stuck (when implementing a random walk Metropolis with the target as a scale).

## ABC by subset simulation

Posted in Books, Statistics, Travel with tags , , , , , , , , , on August 25, 2016 by xi'an

Last week, Vakilzadeh, Beck and Abrahamsson arXived a paper entitled “Using Approximate Bayesian Computation by Subset Simulation for Efficient Posterior Assessment of Dynamic State-Space Model Classes”. It follows an earlier paper by Beck and co-authors on ABC by subset simulation, paper that I did not read. The model of interest is a hidden Markov model with continuous components and covariates (input), e.g. a stochastic volatility model. There is however a catch in the definition of the model, namely that the observable part of the HMM includes an extra measurement error term linked with the tolerance level of the ABC algorithm. Error term that is dependent across time, the vector of errors being within a ball of radius ε. This reminds me of noisy ABC, obviously (and as acknowledged by the authors), but also of some ABC developments of Ajay Jasra and co-authors. Indeed, as in those papers, Vakilzadeh et al. use the raw data sequence to compute their tolerance neighbourhoods, which obviously bypasses the selection of a summary statistic [vector] but also may drown signal under noise for long enough series.

“In this study, we show that formulating a dynamical system as a general hierarchical state-space model enables us to independently estimate the model evidence for each model class.”

Subset simulation is a nested technique that produces a sequence of nested balls (and related tolerances) such that the conditional probability to be in the next ball given the previous one remains large enough. Requiring a new round of simulation each time. This is somewhat reminding me of nested sampling, even though the two methods differ. For subset simulation, estimating the level probabilities means that there also exists a converging (and even unbiased!) estimator for the evidence associated with different tolerance levels. Which is not a particularly natural object unless one wants to turn it into a tolerance selection principle, which would be quite a novel perspective. But not one adopted in the paper, seemingly. Given that the application section truly compares models I must have missed something there. (Blame the long flight from San Francisco to Sydney!) Interestingly, the different models as in Table 4 relate to different tolerance levels, which may be an hindrance for the overall validation of the method.

I find the subsequent part on getting rid of uncertain prediction error model parameters of lesser [personal] interest as it essentially replaces the marginal posterior on the parameters of interest by a BIC approximation, with the unsurprising conclusion that “the prior distribution of the nuisance parameter cancels out”.

## exact ABC

Posted in Books, pictures, Statistics, University life with tags , , , , , on January 21, 2016 by xi'an

Minh-Ngoc Tran and Robert Kohn have devised an “exact” ABC algorithm. They claim therein to remove the error due to the non-zero tolerance by using an unbiased estimator of the likelihood. Most interestingly, they start from the debiasing technique of Rhee and Glynn [also at the basis of the Russian roulette]. Which sums up as using a telescoping formula on a sequence of converging biased estimates. And cutting the infinite sum with a stopping rule.

“Our article proposes an ABC algorithm to estimate [the observed likelihood] that completely removes the error due to [the ABC] approximation…”

The sequence of biased but converging approximations is associated with a sequence of decreasing tolerances. The corresponding sequence of weights that determines the truncation in the series is connected to the decrease in the bias in an implicit manner for all realistic settings. Although Theorem 1 produces conditions on the ABC kernel and the sequence of tolerances and pseudo-sample sizes that guarantee unbiasedness and finite variance of the likelihood estimate. For a geometric stopping rule with rejection probability p, both tolerance and pseudo-sample size decrease as a power of p. As a side product the method also returns an unbiased estimate of the evidence. The overall difficulty I have with the approach is the dependence on the stopping rule and its calibration, and the resulting impact on the computing time of the likelihood estimate. When this estimate is used in a pseudo-marginal scheme à la Andrieu and Roberts (2009), I fear this requires new pseudo-samples at each iteration of the Metropolis-Hastings algorithm, which then becomes prohibitively expensive. Later today, Mark Girolami pointed out to me that Anne-Marie Lyne [one of the authors of the Russian roulette paper] also considered this exact approach in her thesis and concluded at an infinite computing time.

## ABC in Sydney, July 3-4, 2014!!!

Posted in pictures, Statistics, Travel, University life, Wines with tags , , , , , , , , , , , , , , on February 12, 2014 by xi'an

After ABC in Paris in 2009, ABC in London in 2011, and ABC in Roma last year, things are accelerating since there will be—as I just learned—  an ABC in Sydney next July (not June as I originally typed, thanks Robin!). The workshop on the current developments of ABC methodology thus leaves Europe to go down-under and to take advantage of the IMS Meeting in Sydney on July 7-10, 2014. Hopefully, “ABC in…” will continue its tour of European capitals in 2015! To keep up with an unbroken sequence of free workshops, Scott Sisson has managed to find support so that attendance is free of charge (free as in “no registration fee at all”!) but you do need to register as space is limited. While I would love to visit UNSW and Sydney once again and attend the workshop, I will not, getting ready for Cancún and our ABC short course there.

## art brut

Posted in pictures, Travel with tags , , , , , , on September 2, 2012 by xi'an