## Le Monde puzzle [#1087]

Posted in Books, Kids, R, Statistics with tags , , , , , on February 25, 2019 by xi'an

A board-like Le Monde mathematical puzzle in the digit category:

Given a (k,m) binary matrix, what is the maximum number S of entries with only one neighbour equal to one? Solve for k=m=2,…,13, and k=6,m=8.

For instance, for k=m=2, the matrix

$\begin{matrix} 0 &0\\ 1 &1\\ \end{matrix}$

is producing the maximal number 4. I first attempted a brute force random filling of these matrices with only a few steps of explorations and got the numbers 4,8,16,34,44,57,… for the first cases. Since I was convinced that the square k² of a number k previously exhibited to reach its maximum as S=k² was again perfect in this regard, I then tried another approach based on Gibbs sampling and annealing (what else?):

gibzit=function(k,m,A=1e2,N=1e2){
temp=1 #temperature
board=sole=matrix(sample(c(0,1),(k+2)*(m+2),rep=TRUE),k+2,m+2)
board[1,]=board[k+2,]=board[,1]=board[,m+2]=0 #boundaries
maxol=counter(board,k,m) #how many one-neighbours?
for (t in 1:A){#annealing
for (r in 1:N){#basic gibbs steps
for (i in 2:(k+1))
for (j in 2:(m+1)){
prop=board
prop[i,j]=1-board[i,j]
u=runif(1)
if (log(u/(1-u))<temp*(counter(prop,k,m)-val)){
board=prop;val=counter(prop,k,m)
if (val>maxol){
maxol=val;sole=board}}
}}
temp=temp*2}
return(sole[-c(1,k+2),-c(1,m+2)])}


which leads systematically to the optimal solution, namely a perfect square k² when k is even and a perfect but one k²-1 when k is odd. When k=6, m=8, all entries can afford one neighbour exactly,

> gibzbbgiz(6,8)
[1] 48
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,]    1    0    0    1    1    0    0    1
[2,]    1    0    0    0    0    0    0    1
[3,]    0    0    1    0    0    1    0    0
[4,]    0    0    1    0    0    1    0    0
[5,]    1    0    0    0    0    0    0    1
[6,]    1    0    0    1    1    0    0    1


but this does not seem feasible when k=6, m=7, which only achieves 40 entries with one single neighbour.

## call for sessions and labs at Bay2sC0mp²⁰

Posted in pictures, R, Statistics, Travel, University life with tags , , , , , , , , , , , , , , , , , on February 22, 2019 by xi'an

A call to all potential participants to the incoming BayesComp 2020 conference at the University of Florida in Gainesville, Florida, 7-10 January 2020, to submit proposals [to me] for contributed sessions on everything computational or training labs [to David Rossell] on a specific language or software. The deadline is April 1 and the sessions will be selected by the scientific committee, other proposals being offered the possibility to present the associated research during a poster session [which always is a lively component of the conference]. (Conversely, we reserve the possibility of a “last call” session made from particularly exciting posters on new topics.) Plenary speakers for this conference are

and the first invited sessions are already posted on the webpage of the conference. We dearly hope to attract a wide area of research interests into a as diverse as possible program, so please accept this invitation!!!

## Le Monde puzzle [#1085]

Posted in Books, Kids, R with tags , , , , , on February 18, 2019 by xi'an

A new Le Monde mathematical puzzle in the digit category:

Given 13 arbitrary relative integers chosen by Bo, Abigail can select any subset of them to be drifted by plus or minus one by Bo, repeatedly until Abigail reaches the largest possible number N of multiples of 5. What is the minimal possible value of N under the assumption that Bo tries to minimise it?

I got stuck on that one, as building a recursive functiion led me nowhere: the potential for infinite loop (add one, subtract one, add one, …) rather than memory issues forced me into a finite horizon for the R function, which then did not return anything substantial in a manageable time. Over the week and the swimming sessions, I thought of simplifying the steps, like (a) work modulo 5, (b) bias moves towards 1 or 4, away from 2 and 3, by keeping only one entry in 2 and 3, and all but one at 1 and 4, but could only produce five 0’s upon a sequence of attempts… With the intuition that only 3 entries should remain in the end, which was comforted by Le Monde solution the week after.

## Le Monde puzzle [#1083]

Posted in Books, Kids, R, Travel with tags , , , , , , on February 7, 2019 by xi'an

A Le Monde mathematical puzzle that seems hard to solve without the backup of a computer (and just simple enough to code on a flight to Montpellier):

Given the number N=2,019, find a decomposition of N as a sum of non-trivial powers of integers such that (a) the number of integers in the sum is maximal or (b) all powers are equal to 4.  Is it possible to write N as a sum of two powers?

It is straightforward to identify all possible terms in these sums by listing all powers of integers less than N

pool=(1:trunc(sqrt(2019)))^2
for (pow in 3:11)
pool=unique(c(pool,(2:trunc(2019^(1/pow)))^pow))


which leads to 57 distinct powers. Sampling at random from this collection at random produces a sum of 21 perfect powers:

 1+4+8+9+16+25+27+32+36+49+64+81+100+121+125+128+144+169+196+243+441

But looking at the 22 smallest numbers in the pool of powers leads to 2019, which is a sure answer. Restricting the terms to powers of 4 leads to the sequence

1⁴+2⁴+3⁴+5⁴+6⁴ = 2019

And starting from the pools of all possible powers in a decomposition of 2019 as the sum of two powers shows this is impossible.

## Computational Bayesian Statistics [book review]

Posted in Books, Statistics with tags , , , , , , , , , , , , , , , , , , , , , , , , , , , , on February 1, 2019 by xi'an

This Cambridge University Press book by M. Antónia Amaral Turkman, Carlos Daniel Paulino, and Peter Müller is an enlarged translation of a set of lecture notes in Portuguese. (Warning: I have known Peter Müller from his PhD years in Purdue University and cannot pretend to perfect objectivity. For one thing, Peter once brought me frozen-solid beer: revenge can also be served cold!) Which reminds me of my 1994 French edition of Méthodes de Monte Carlo par chaînes de Markov, considerably upgraded into Monte Carlo Statistical Methods (1998) thanks to the input of George Casella. (Re-warning: As an author of books on the same topic(s), I can even less pretend to objectivity.)

“The “great idea” behind the development of computational Bayesian statistics is the recognition that Bayesian inference can be implemented by way of simulation from the posterior distribution.”

The book is written from a strong, almost militant, subjective Bayesian perspective (as, e.g., when half-Bayesians are mentioned!). Subjective (and militant) as in Dennis Lindley‘s writings, eminently quoted therein. As well as in Tony O’Hagan‘s. Arguing that the sole notion of a Bayesian estimator is the entire posterior distribution. Unless one brings in a loss function. The book also discusses the Bayes factor in a critical manner, which is fine from my perspective.  (Although the ban on improper priors makes its appearance in a very indirect way at the end of the last exercise of the first chapter.)

Somewhat at odds with the subjectivist stance of the previous chapter, the chapter on prior construction only considers non-informative and conjugate priors. Which, while understandable in an introductory book, is a wee bit disappointing. (When mentioning Jeffreys’ prior in multidimensional settings, the authors allude to using univariate Jeffreys’ rules for the marginal prior distributions, which is not a well-defined concept or else Bernardo’s and Berger’s reference priors would not have been considered.) The chapter also mentions the likelihood principle at the end of the last exercise, without a mention of the debate about its derivation by Birnbaum. Or Deborah Mayo’s recent reassessment of the strong likelihood principle. The following chapter is a sequence of illustrations in classical exponential family models, classical in that it is found in many Bayesian textbooks. (Except for the Poison model found in Exercise 3.3!)

Nothing to complain (!) about the introduction of Monte Carlo methods in the next chapter, especially about the notion of inference by Monte Carlo methods. And the illustration by Bayesian design. The chapter also introduces Rao-Blackwellisation [prior to introducing Gibbs sampling!]. And the simplest form of bridge sampling. (Resuscitating the weighted bootstrap of Gelfand and Smith (1990) may not be particularly urgent for an introduction to the topic.) There is furthermore a section on sequential Monte Carlo, including the Kalman filter and particle filters, in the spirit of Pitt and Shephard (1999). This chapter is thus rather ambitious in the amount of material covered with a mere 25 pages. Consensus Monte Carlo is even mentioned in the exercise section.

“This and other aspects that could be criticized should not prevent one from using this [Bayes factor] method in some contexts, with due caution.”

Chapter 5 turns back to inference with model assessment. Using Bayesian p-values for model assessment. (With an harmonic mean spotted in Example 5.1!, with no warning about the risks, except later in 5.3.2.) And model comparison. Presenting the whole collection of xIC information criteria. from AIC to WAIC, including a criticism of DIC. The chapter feels somewhat inconclusive but methinks this is the right feeling on the current state of the methodology for running inference about the model itself.

“Hint: There is a very easy answer.”

Chapter 6 is also a mostly standard introduction to Metropolis-Hastings algorithms and the Gibbs sampler. (The argument given later of a Metropolis-Hastings algorithm with acceptance probability one does not work.) The Gibbs section also mentions demarginalization as a [latent or auxiliary variable] way to simulate from complex distributions [as we do], but without defining the notion. It also references the precursor paper of Tanner and Wong (1987). The chapter further covers slice sampling and Hamiltonian Monte Carlo, the later with sufficient details to lead to reproducible implementations. Followed by another standard section on convergence assessment, returning to the 1990’s feud of single versus multiple chain(s). The exercise section gets much larger than in earlier chapters with several pages dedicated to most problems. Including one on ABC, maybe not very helpful in this context!

“…dimension padding (…) is essentially all that is to be said about the reversible jump. The rest are details.”

The next chapter is (somewhat logically) the follow-up for trans-dimensional problems and marginal likelihood approximations. Including Chib’s (1995) method [with no warning about potential biases], the spike & slab approach of George and McCulloch (1993) that I remember reading in a café at the University of Wyoming!, the somewhat antiquated MC³ of Madigan and York (1995). And then the much more recent array of Bayesian lasso techniques. The trans-dimensional issues are covered by the pseudo-priors of Carlin and Chib (1995) and the reversible jump MCMC approach of Green (1995), the later being much more widely employed in the literature, albeit difficult to tune [and even to comprehensively describe, as shown by the algorithmic representation in the book] and only recommended for a large number of models under comparison. Once again the exercise section is most detailed, with recent entries like the EM-like variable selection algorithm of Ročková and George (2014).

The book also includes a chapter on analytical approximations, which is also the case in ours [with George Casella] despite my reluctance to bring them next to exact (simulation) methods. The central object is the INLA methodology of Rue et al. (2009) [absent from our book for obvious calendar reasons, although Laplace and saddlepoint approximations are found there as well]. With a reasonable amount of details, although stopping short of implementable reproducibility. Variational Bayes also makes an appearance, mostly following the very recent Blei et al. (2017).

The gem and originality of the book are primarily to be found in the final and ninth chapter where four software are described, all with interfaces to R: OpenBUGS, JAGS, BayesX, and Stan, plus R-INLA which is processed in the second half of the chapter (because this is not a simulation method). As in the remainder of the book, the illustrations are related to medical applications. Worth mentioning is the reminder that BUGS came in parallel with Gelfand and Smith (1990) Gibbs sampler rather than as a consequence. Even though the formalisation of the Markov chain Monte Carlo principle by the later helped in boosting the power of this software. (I also appreciated the mention made of Sylvia Richardson’s role in this story.) Since every software is illustrated in depth with relevant code and output, and even with the shortest possible description of its principle and modus vivendi, the chapter is 60 pages long [and missing a comparative conclusion]. Given my total ignorance of the very existence of the BayesX software, I am wondering at the relevance of its inclusion in this description rather than, say, other general R packages developed by authors of books such as Peter Rossi. The chapter also includes a description of CODA, with an R version developed by Martin Plummer [now a Warwick colleague].

In conclusion, this is a high-quality and all-inclusive introduction to Bayesian statistics and its computational aspects. By comparison, I find it much more ambitious and informative than Albert’s. If somehow less pedagogical than the thicker book of Richard McElreath. (The repeated references to Paulino et al.  (2018) in the text do not strike me as particularly useful given that this other book is written in Portuguese. Unless an English translation is in preparation.)

Disclaimer: this book was sent to me by CUP for endorsement and here is what I wrote in reply for a back-cover entry:

An introduction to computational Bayesian statistics cooked to perfection, with the right mix of ingredients, from the spirited defense of the Bayesian approach, to the description of the tools of the Bayesian trade, to a definitely broad and very much up-to-date presentation of Monte Carlo and Laplace approximation methods, to an helpful description of the most common software. And spiced up with critical perspectives on some common practices and an healthy focus on model assessment and model selection. Highly recommended on the menu of Bayesian textbooks!

And this review is likely to appear in CHANCE, in my book reviews column.

## missing digit in a 114 digit number [a Riddler’s riddle]

Posted in R, Running, Statistics with tags , , , , , , , on January 31, 2019 by xi'an

A puzzling riddle from The Riddler (as Le Monde had a painful geometry riddle this week): this number with 114 digits

530,131,801,762,787,739,802,889,792,754,109,70?,139,358,547,710,066,257,652,050,346,294,484,433,323,974,747,960,297,803,292,989,236,183,040,000,000,000

is missing one digit and is a product of some of the integers between 2 and 99. By comparison, 76! and 77! have 112 and 114 digits, respectively. While 99! has 156 digits. Using WolframAlpha on-line prime factor decomposition code, I found that only 6 is a possible solution, as any other integer between 0 and 9 included a large prime number in its prime decomposition:

However, I thought anew about it when swimming the next early morning [my current substitute to morning runs] and reasoned that it was not necessary to call a formal calculator as it is reasonably easy to check that this humongous number has to be divisible by 9=3×3 (for else there are not enough terms left to reach 114 digits, checked by lfactorial()… More precisely, 3³³x33! has 53 digits and 99!/3³³x33! 104 digits, less than 114), which means the sum of all digits is divisible by 9, which leads to 6 as the unique solution.

## more concentration, everywhere

Posted in R, Statistics with tags , , , , , , , , , , on January 25, 2019 by xi'an

Although it may sound like an excessive notion of optimality, one can hope at obtaining an estimator δ of a unidimensional parameter θ that is always closer to θ that any other parameter. In distribution if not almost surely, meaning the cdf of (δ-θ) is steeper than for other estimators enjoying the same cdf at zero (for instance ½ to make them all median-unbiased). When I saw this question on X validated, I thought of the Cauchy location example, where there is no uniformly optimal estimator, albeit a large collection of unbiased ones. But a simulation experiment shows that the MLE does better than the competition. At least than three (above) four of them (since I tried the Pitman estimator via Christian Henning’s smoothmest R package). The differences to the MLE empirical cd make it clearer below (with tomato for a score correction, gold for the Pitman estimator, sienna for the 38% trimmed mean, and blue for the median):I wonder at a general theory along these lines. There is a vague similarity with Pitman nearness or closeness but without the paradoxes induced by this criterion. More in the spirit of stochastic dominance, which may be achievable for location invariant and mean unbiased estimators…