## Le Monde puzzle [#1094]

Posted in Books, Kids, R with tags , , , , , , on April 23, 2019 by xi'an

A rather blah number Le Monde mathematical puzzle:

Find all integer multiples of 11111 with exactly one occurrence of each decimal digit..

Which I solved by brute force, by looking at the possible range of multiples (and  borrowing stringr:str_count from Robin!)

> combien=0
> for (i in 90001:900008){
j=i*11111
combien=combien+(min(stringr::str_count(j,paste(0:9)))==1)}
> combien
[1] 3456


And a bonus one:

Find all integers y that can write both as x³ and (10z)³+a with 1≤a≤999.

which does not offer much in terms of solutions since x³-v³=(x-v)(x²+xv+v²)=a shows that x² is less than 2a/3, meaning x is at most 25. Among such numbers only x=11,12 lead to a solution as x³=1331,1728.

## survivalists [a Riddler’s riddle]

Posted in Books, Kids, R, Statistics with tags , , , , , , on April 22, 2019 by xi'an

A neat question from The Riddler on a multi-probability survival rate:

Nine processes are running in a loop with fixed survivals rates .99,….,.91. What is the probability that the first process is the last one to die? Same question with probabilities .91,…,.99 and the probability that the last process is the last one to die.

The first question means that the realisation of a Geometric G(.99) has to be strictly larger than the largest of eight Geometric G(.98),…,G(.91). Given that the cdf of a Geometric G(a) is [when counting the number of attempts till failure, included, i.e. the Geometric with support the positive integers]

$F(x)=\Bbb P(X\le x)=1-a^{x}$

the probability that this happens has the nice (?!) representation

$\sum_{x=2}^\infty a_1^{x-1}(1-a_1)\prod_{j\ge 2}(1-a_j^{x-1})=(1-a_1)G(a_1,\ldots,a_9)$

which leads to an easy resolution by recursion since

$G(a_1,\ldots,a_9)=G(a_1,\ldots,a_8)-G(a_1a_9,\ldots,a_8)$

and $G(a)=a/(1-a)$

and a value of 0.5207 returned by R (Monte Carlo evaluation of 0.5207 based on 10⁷ replications). The second question is quite similar, with solution

$\sum_{x=2}^\infty a_1^{x-1}(1-a_1)\prod_{j\ge 1}(1-a_j^{x})=a^{-1}(1-a_1)G(a_1,\ldots,a_9)$

and value 0.52596 (Monte Carlo evaluation of 0.52581 based on 10⁷ replications).

## haunting of tramcar 105 [book review]

Posted in Statistics with tags , , , , , , , , , , , on April 20, 2019 by xi'an

A mix of steampunk and urban magic in a enlightened 1912 Cairo sounded like a good prolegomena and I bought P. Djèli Clark’s The haunting of tram car 015 on this basis. As it happens, this is actually a novella of 123 pages building on the same universe as a previous work of the author, A dead djinn in Cairo, which however is even shorter and only available as a Kindle book… I really enjoyed the short read and its description of an alternate Cairo that is competing with Paris and London, thanks to the advantage brought by the supernatural powers of djinns. (And apparently also gaining the independence Egypt could not secure under the British protectorate.) The English suffragettes have also their counterparts in Egypt and the country is about to decide about women right to vote. The story itself is nice if not stratospheric, with mostly well-drawn characters and good dialogues. (The core of the plot relies on smuggling sweets from Armenia, though, a rather weak link.) As in an earlier order, the book itself was not properly printed, with a vertical white band of erased characters on most odd pages, presumably another illustration of the shortcomings of the  print-on-demand principle. (Which means that I sent the book back to Amazon rather than leaving it in the common room.)

## holistic framework for ABC

Posted in Books, Statistics, University life with tags , , , , , , , on April 19, 2019 by xi'an

An AISTATS 2019 paper was recently arXived by Kelvin Hsu and Fabio Ramos. Proposing an ABC method

“…consisting of (1) a consistent surrogate likelihood model that modularizes queries from simulation calls, (2) a Bayesian learning objective for hyperparameters that improves inference accuracy, and (3) a posterior surrogate density and a super-sampling inference algorithm using its closed-form posterior mean embedding.”

While this sales line sounds rather obscure to me, the authors further defend their approach against ABC-MCMC or synthetic likelihood by the points

“that (1) only one new simulation is required at each new parameter θ and (2) likelihood queries do not need to be at parameters where simulations are available.”

using a RKHS approach to approximate the likelihood or the distribution of the summary (statistic) given the parameter (value) θ. Based on the choice of a certain positive definite kernel. (As usual, I do not understand why RKHS would do better than another non-parametric approach, especially since the approach approximates the full likelihood, but I am not a non-parametrician…)

“The main advantage of using an approximate surrogate likelihood surrogate model is that it readily provides a marginal surrogate likelihood quantity that lends itself to a hyper-parameter learning algorithm”

The tolerance ε (and other cyberparameters) are estimated by maximising the approximated marginal likelihood, which happens to be available in the convenient case the prior is an anisotropic Gaussian distribution. For the simulated data in the reference table? But then missing the need for localising the simulations near the posterior? Inference is then conducting by simulating from this approximation. With the common (to RKHS) drawback that the approximation is “bounded and normalized but potentially non-positive”.

## Le Monde puzzle [#1092]

Posted in Statistics with tags , , , , , , , on April 18, 2019 by xi'an

A Latin square Le Monde mathematical puzzle that I found rather dreary:

A hidden 3×3 board contains all numbers from 1 to 9. Anselm wants to guess the board and makes two proposals. Berenicke tells him how many entries are in the right rows and colums for each proposal, along with the information that no entry is at the right location. Anselm deduces the right board.

Which I solved by brute force and not even simulated annealing, first defining a target

ordoku1=ordoku2=matrix(1,9,2)
ordoku1[,1]=c(1,1,1,2,2,2,3,3,3)
ordoku1[,2]=rep(1:3,3)
ordoku2[,1]=c(3,2,3,1,2,3,2,1,1)
ordoku2[,2]=c(2,2,3,2,3,1,1,3,1)
fitz=function(ordo){
(sum(ordo[c(1,4,7),2]==1)==1)+(sum(ordo[c(2,5,8),2]==2)==1)+
(sum(ordo[c(3,6,9),2]==3)==0)+(sum(ordo[c(1,2,3),1]==1)==1)+
(sum(ordo[c(4,5,6),1]==2)==1)+(sum(ordo[c(7,8,9),1]==3)==2)+
(sum(ordo[c(6,7,9),2]==1)==2)+(sum(ordo[c(1,2,4),2]==2)==1)+
(sum(ordo[c(3,5,8),2]==3)==2)+(sum(ordo[c(4,8,9),1]==1)==1)+
(sum(ordo[c(7,2,5),1]==2)==1)+(sum(ordo[c(1,3,6),1]==3)==0)+
(!(0%in%apply((ordo-ordoku1)^2,1,sum)))+(!(0%in%apply((ordo-ordoku2)^2,1,sum)))
}


on a 9×9 board entry reproducing all items of information given by Berenicke. If all constraints are met, the function returns 14. And then searched for a solution at random:

temp=1
randw=function(ordo){
for (t in 1:1e6){
chlg=sample(1:9,2)
temp=ordo[chlg[1],]
ordo[chlg[1],]=ordo[chlg[2],]
ordo[chlg[2],]=temp
if (fitz(ordo)==14){
print(ordo);break()}}}


which produces the correct board

4 3 5
6 7 1
9 2 8


## bandits for doubly intractable posteriors

Posted in Statistics with tags , , , , , , , , on April 17, 2019 by xi'an

Last Friday, Guanyang Wang arXived a paper on the use of multi-armed bandits (hence the reference to the three bandits) to handle intractable normalising constants. The bandit compares or mixes Møller et al. (2006) auxiliary variable solution with Murray et al. (2006) exchange algorithm. Which are both special cases of pseudo-marginal MCMC algorithms. In both cases, the auxiliary variables produce an unbiased estimator of the ratio of the constants. Rather than the ratio of two unbiased estimators as in the more standard pseudo-marginal MCMC. The current paper tries to compare the two approaches based on the variance of the ratio estimate, but cannot derive a general ordering. The multi-armed bandit algorithm exploits both estimators of the acceptance ratio to pick the one that is almost the largest, almost because there is a correction for validating the step by detailed balance. The bandit acceptance probability is the maximum [over the methods] of the minimum [over the time directions] of the original acceptance ratio. While this appears to be valid, note that the resulting algorithm implies four times as many auxiliary variates as the original ones, which makes me wonder at the gain when compared with a parallel implementation of these methods, coupled at random times. (The fundamental difficulty of simulating from likelihoods with an unknown normalising constant remains, see p.4.)

## back to Venice

Posted in Statistics with tags , , , , , , , on April 16, 2019 by xi'an