**O**ur paper, weakly informative reparameterisations of location-scale mixtures, with Kaniav Kamary and Kate Lee, got accepted by JCGS! Great news, which comes in perfect timing for Kaniav as she is currently applying for positions. The paper proposes a unidimensional mixture Bayesian modelling based on the first and second moment constraints, since these turn the remainder of the parameter space into a compact. While we had already developed an associated R package, Ultimixt, the current editorial policy of JCGS imposes the R code used to produce all results to be attached to the submission and it took us a few more weeks than it should have to produce a directly executable code, due to internal library incompatibilities. (For this entry, I was looking for a link to our special JCGS issue with my picture of Edinburgh but realised I did not have this picture.)

## Archive for R package

## weakly informative reparameterisations

Posted in Books, pictures, R, Statistics, University life with tags Bayesian modelling, Edinburgh, Gaussian mixture, JCGS, location-scale parameterisation, moments, non-informative priors, publication, R package, Ultimixt on February 14, 2018 by xi'an## bridgesampling [R package]

Posted in pictures, R, Statistics, University life with tags Amsterdam, bridge, bridge sampling, bridgesampling, JAGS, R, R package, STAN, University of Amsterdam, warped bridge sampling on November 9, 2017 by xi'an**Q**uentin F. Gronau, Henrik Singmann and Eric-Jan Wagenmakers have arXived a detailed documentation about their * bridgesampling* R package. (No wonder that researchers from Amsterdam favour bridge sampling!)

*[The package relates to a [52 pages] tutorial on bridge sampling by Gronau et al. that I will hopefully comment soon.]*The bridge sampling methodology for marginal likelihood approximation requires

*two*Monte Carlo samples for a ratio of

*two*integrals. A nice twist in this approach is to use a dummy integral that is already available, with respect to a probability density that is an approximation to the exact posterior. This means avoiding the difficulties with bridge sampling of bridging two different parameter spaces, in possibly different dimensions, with potentially very little overlap between the posterior distributions. The substitute probability density is chosen as Normal or warped Normal, rather than a t which would provide more stability in my opinion. The

*package also provides an error evaluation for the approximation, although based on spectral estimates derived from the*

**bridgesampling****package. The remainder of the document exhibits how the package can be used in conjunction with either JAGS or Stan. And concludes with the following words of caution:**

*coda*

“It should also be kept in mind that there may be cases in which the bridge sampling procedure may not be the ideal choice for conducting Bayesian model comparisons. For instance, when the models are nested it might be faster and easier to use the Savage-Dickey density ratio (Dickey and Lientz 1970; Wagenmakers et al. 2010). Another example is when the comparison of interest concerns a very large model space, and a separate bridge sampling based computation of marginal likelihoods may take too much time. In this scenario, Reversible Jump MCMC (Green 1995) may be more appropriate.”

## computational methods for numerical analysis with R [book review]

Posted in Books, Kids, pictures, R, Statistics, University life with tags book review, CRC Press, differential equation, Euler discretisation, integrate, integration, introductory textbooks, Monte Carlo integration, numerical analysis, optimisation, partial differential equations, R, R function, R package, Runge-Kutta, simulated annealing on October 31, 2017 by xi'an**T**his is a book by James P. Howard, II, I received from CRC Press for review in CHANCE. (As usual, the customary warning applies: *most of this blog post will appear later in my book review column in CHANCE.*) It consists in a traditional introduction to numerical analysis with backup from R codes and packages. The early chapters are setting the scenery, from basics on R to notions of numerical errors, before moving to linear algebra, interpolation, optimisation, integration, differentiation, and ODEs. The book comes with a package cmna that reproduces algorithms and testing. While I do not find much originality in the book, given its adherence to simple resolutions of the above topics, I could nonetheless use it for an elementary course in our first year classes. With maybe the exception of the linear algebra chapter that I did not find very helpful.

“…you can have a solution fast, cheap, or correct, provided you only pick two.” (p.27)

The (minor) issue I have with the book and that a potential mathematically keen student could face as well is that there is little in the way of justifying a particular approach to a given numerical problem (as opposed to others) and in characterising the limitations and failures of the presented methods (although this happens from time to time as e.g. for gradient descent, p.191). [Seeping in my Gallic “mal-être”, I am prone to over-criticise methods during classing, to the (increased) despair of my students!, but I also feel that avoiding over-rosy presentations is a good way to avoid later disappointments or even disasters.] In the case of this book, finding [more] ways of detecting would-be disasters would have been nice.

An uninteresting and highly idiosyncratic side comment is that the author preferred the French style for long division to the American one, reminding me of my first exposure to the latter, a few months ago! Another comment from a statistician is that mentioning time series inter- or extra-polation without a statistical model sounds close to anathema! And makes extrapolation a weapon without a cause.

“…we know, a priori, exactly how long the [simulated annealing] process will take since it is a function of the temperature and the cooling rate.” (p.199)

Unsurprisingly, the section on Monte Carlo integration is disappointing for a statistician/probabilistic numericist like me, as it fails to give a complete enough picture of the methodology. All simulations seem to proceed there from a large enough hypercube. And recommending the “fantastic” (p.171) R function *integrate* as a default is scary, given the ability of the selected integration bounds to misled its users. Similarly, I feel that the simulated annealing section is not providing enough of a cautionary tale about the highly sensitive impact of cooling rates and absolute temperatures. It is only through the raw output of the algorithm applied to the travelling salesman problem that the novice reader can perceive the impact of some of these factors. (The acceptance bound on the jump (6.9) is incidentally wrongly called a *probability* on p.199, since it can take values larger than one.)

*[Disclaimer about potential self-plagiarism: this post or an edited version will eventually appear in my Books Review section in CHANCE.]*

## testing R code [book review]

Posted in R, Statistics, Travel with tags CRC Press, debugging, R, R package, RStudio, testing, Testing R Code on March 1, 2017 by xi'an**W**hen I saw this title among the CRC Press novelties, I immediately ordered it as I though it fairly exciting. Now that I have gone through the book, the excitement has died. Maybe faster than need be as I read it while being stuck in a soulless Schipol airport and missing the only ice-climbing opportunity of the year!

Testing R Code was written by Richard Cotton and is quite short: once you take out the appendices and the answers to the exercises, it is about 130 pages long, with a significant proportion of code and output. And it is about some functions developed by Hadley Wickham from RStudio, for testing the coherence of R code in terms of inputs more than outputs. The functions are assertive and testthat. Intended for run-time versus development-time testing. Meaning that the output versus the input are what the author of the code intends them to be. The other chapters contain advices and heuristics about writing maintainable testable code, and incorporating a testing feature in an R package.

While I am definitely a poorly qualified reader for this type of R books, my disappointment stems from my expectation of a book about debugging R code, which is possibly due to a misunderstanding of the term testing. This is an unrealistic expectation, for sure, as testing for a code to produce what it is supposed to do requires some advanced knowledge of what the output should be, at least in some representative situations. Which means using interface like RStudio is capital in spotting unsavoury behaviours of some variables, if not foolproof in any case.

## Le Monde puzzle [#967]

Posted in Books, Kids, pictures, Statistics, Travel, University life with tags arithmetics, Le Monde, mathematical puzzle, prime number, primes, R, R package, sudoku on September 30, 2016 by xi'an**A** Sudoku-like Le Monde mathematical puzzle for a come-back (now that it competes with The Riddler!):

Does there exist a 3×3 grid with different and positive integer entries such that the sum of rows, columns, and both diagonals is a prime number? If there exist such grids, find the grid with the minimal sum?

**I **first downloaded the R package ** primes**. Then I checked if by any chance a small bound on the entries was sufficient:

cale<-function(seqe){ ros=apply(seqe,1,sum) cole=apply(seqe,2,sum) dyag=sum(diag(seqe)) dayg=sum(diag(seqe[3:1,1:3])) return(min(is_prime(c(ros,cole,dyag,dayg)))>0)}

Running the blind experiment

for (t in 1:1e6){ n=sample(9:1e2,1) if (cale(matrix(sample(n,9),3))) print(n)}

I got 10 as the minimal value of n. Trying with n=9 did not give any positive case. Running another blind experiment checking for the minimal sum led to the result

> A [,1] [,2] [,3] [1,] 8 3 6 [2,] 1 5 7 [3,] 2 11 4

with sum 47.