## Advances in Scalable Bayesian Computation [group photo]

Posted in Kids, Mountains, pictures, Statistics, Travel, University life on March 8, 2014 by xi'an

## Le Monde puzzle [#855]

Posted in Books, Kids, Statistics on March 7, 2014 by xi'an

A Le Monde mathematical puzzle that reminds me of an earlier one:

Given ten tokens with different unknown weights, and a scale that can rank three tokens at a time, starting by ranking three tokens, what is the minimum number of weighings necessary to rank all ten of them if (a) one token at a time is added, (b) one or two tokens are added at a time? If no restriction is imposed on how the tokens are introduced, is there a more efficient strategy?

It indeed relates to earlier posts on sorting and ternary sorting. Checking further on StackOverflow I found this reply to a question about ternary sorting:

Average number of comparisons:

```
in ternary search = ((1/3)*1+(2/3)*2)*ln(n)/ln(3) ~ 1.517*ln(n)
in binary search  =                 1*ln(n)/ln(2) ~ 1.443*ln(n)
```

Worst number of comparisons:

```
in ternary search = 2*ln(n)/ln(3) ~ 1.820*ln(n)
in binary search  = 1*ln(n)/ln(2) ~ 1.443*ln(n)
```
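These counts can be checked empirically. The sketch below is in Python rather than the R used later in this post, and it is only my own reading of the StackOverflow answer: it searches for an integer x in the implicit sorted array 0..n-1, counting one comparison per pivot, with n a power of two so that binary search always makes exactly 20 comparisons for n = 2²⁰; the ternary average indeed comes out larger, in line with the 1.517 versus 1.443 constants.

```python
def binary_comps(n, x):
    # comparisons made by binary search locating x in range(n)
    lo, hi, comps = 0, n, 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        comps += 1                 # one comparison against the midpoint
        if x < mid:
            hi = mid
        else:
            lo = mid
    return comps

def ternary_comps(n, x):
    # comparisons made by ternary search: one against the first pivot,
    # plus a second one (two thirds of the time) against the second pivot
    lo, hi, comps = 0, n, 0
    while hi - lo > 1:
        w = hi - lo
        m1, m2 = lo + w // 3, lo + 2 * w // 3
        comps += 1
        if x < m1:
            hi = m1
        else:
            comps += 1
            if x < m2:
                lo, hi = m1, m2
            else:
                lo, hi = m2, hi
    return comps

n = 2 ** 20
xs = range(0, n, 997)              # a deterministic sample of targets
avg_b = sum(binary_comps(n, x) for x in xs) / len(xs)
avg_t = sum(ternary_comps(n, x) for x in xs) / len(xs)
print(round(avg_b, 2), round(avg_t, 2))
```

Since n is a power of two, every binary search takes exactly 20 comparisons, while the ternary average lands above it, as the formulas predict.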

albeit with no reference. So this somewhat answers part (c) of the question, if only asymptotically, since it does not work for n=10! (Even for n=100, it is way too large.) Looking for a solution to part (a), I looked at the performance of a dyadic sorting algorithm that recursively partitions the already sorted part of the sample into three parts to locate the proper position of each new token. This led me to the following code

```
rang=function(x,ranj,taken){
  # number of ternary weighings needed to locate the rank of token x
  # within the sorted segment ranj of taken, weighing x against two
  # pivots that trisect the segment
  mag=max(ranj)-min(ranj)+1
  i1=ranj[1]-1+trunc(mag/3)
  i2=ranj[1]-1+trunc(2*mag/3)
  locrk=rank(c(taken[c(i1,i2)],x))

  if (locrk[3]==1) return(ifelse(i1>3,
     1+rang(x,ranj=min(ranj):(i1-1),taken),
     1+(i1>1)))
  if (locrk[3]==2) return(ifelse(i2-i1>4,
     1+rang(x,ranj=(i1+1):(i2-1),taken),
     1+(i2-i1>2)))
  if (locrk[3]==3) return(ifelse(mag-i2>2,
     1+rang(x,ranj=(i2+1):max(ranj),taken),
     1+(mag-i2>1)))
}

ternone=function(toke){
  # rank a full vector of tokens, inserting one token at a time
  toke[1:3]=sort(toke[1:3])
  counter=1
  for (ul in (4:length(toke))){
    counter=counter+rang(toke[ul],1:(ul-1),toke)
    toke[1:ul]=sort(toke[1:ul])
  }
  return(counter)
}

ternone(sample(1:10))
```

which produces a minimum of eight (of course!) and a maximum of 20 uses of the scale (based on 10⁶ random permutations of (1,…,10)). Unsurprisingly, the solution proposed in Le Monde the week after does better, obtaining 16 as the worst-case figure. Repeating the experiment with n=100 values, the maximum was 303 (with a mean of 270 and a minimum of 240 ternary comparisons).
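As a sanity check on that figure of 16, one can compute an information-theoretic lower bound: the first weighing fully ranks three tokens, and inserting a new token among m already-ranked ones must distinguish m+1 positions using weighings with three outcomes, hence needs at least ⌈log₃(m+1)⌉ of them. The Python sketch below is my own back-of-the-envelope computation, not part of Le Monde's solution; it returns exactly 16 for n=10, suggesting that solution is optimal for scheme (a).

```python
def weighings(positions):
    # smallest k with 3**k >= positions (integer-safe, no floating logs)
    k = 0
    while 3 ** k < positions:
        k += 1
    return k

def one_at_a_time_bound(n):
    # 1 weighing ranks the first three tokens, then one insertion into
    # m sorted tokens for each m = 3, ..., n-1, each insertion needing
    # at least ceil(log3(m+1)) three-outcome weighings
    return 1 + sum(weighings(m + 1) for m in range(3, n))

print(one_at_a_time_bound(10))   # → 16
```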

Moving to an insertion of two tokens at a time, I tested a scheme where the two new tokens are first weighed against the current median, then against the median of the relevant half, and so on, until they end up on opposite sides of one of these medians. Here is the corresponding R code

```
ring=function(x,y,ranj,taken){
  # insert the pair (x,y) by weighing both against the current median,
  # splitting into two single-token insertions once they separate
  mag=max(ranj)-min(ranj)+1
  i1=ranj[1]-1+trunc(mag/2)
  locrk=rank(c(taken[i1],x,y))

  if (locrk[1]==3) return(ifelse(i1>2,
     1+ring(x,y,ranj=min(ranj):(i1-1),taken),
     1+(i1==2)))
  if (locrk[1]==2) return(ifelse(mag>4,
     1+rang(min(x,y),ranj=min(ranj):(i1-1),taken)
      +rang(max(x,y),(i1+1):max(ranj),taken),
     1))
  if (locrk[1]==1) return(ifelse(mag-i1>2,
     1+ring(x,y,ranj=(i1+1):max(ranj),taken),
     1+(mag-i1>1)))
}

terntwo=function(toke){
  # rank the tokens, adding them two at a time after the initial four
  toke[1:3]=sort(toke[1:3])
  counter=1+rang(toke[4],1:3,toke)
  toke[1:4]=sort(toke[1:4])

  for (ul in -1+2*(3:(length(toke)/2))){
    counter=counter+ring(toke[ul],toke[ul+1],1:(ul-1),toke)
    toke[1:(ul+1)]=sort(toke[1:(ul+1)])
  }
  return(counter)
}
```

leading to a value of 13.
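For comparison, a crude lower bound can be computed for scheme (b) as well: a weighing fully ranks three tokens, so it has at most 3! = 6 distinguishable outcomes, and inserting a pair of new tokens among m ranked ones must distinguish (m+1)(m+2) ordered placements. The sketch below is again my own aside, not part of the puzzle's solution; it gives 10 for n=10, so the value of 13 above leaves some slack against this (optimistic) bound.

```python
def weighings6(outcomes):
    # smallest k with 6**k >= outcomes: one weighing ranks 3 tokens,
    # hence has at most 3! = 6 distinguishable outcomes
    k = 0
    while 6 ** k < outcomes:
        k += 1
    return k

def two_at_a_time_bound(n):
    # rank 3 tokens (3! = 6 orders), insert a 4th (4 positions), then
    # insert pairs: placing 2 new tokens among m ranked ones can end up
    # in (m+1)*(m+2) ordered ways, for m = 4, 6, ..., n-2
    total = weighings6(6) + weighings6(4)
    for m in range(4, n - 1, 2):
        total += weighings6((m + 1) * (m + 2))
    return total

print(two_at_a_time_bound(10))   # → 10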

This Feb. 19 issue of the Le Monde Science & Médecine leaflet also contains a two-page report on growing work-related psychological issues in research labs, like stress, burn-out and depression, mostly connected with the increase in administrative duties and grant writing, duties most of us were never trained for. There is also a short column by Étienne Ghys on the short-lived claim by Moukhtarbaï Otelbaev to have solved the Navier-Stokes problem, until Terry Tao wrote a paper explaining why the proof could not work.

## Bayesian programming [book review]

Posted in Books, Kids, pictures, Statistics, University life on March 3, 2014 by xi'an

“We now think the Bayesian Programming methodology and tools are reaching maturity. The goal of this book is to present them so that anyone is able to use them. We will, of course, continue to improve tools and develop new models. However, pursuing the idea that probability is an alternative to Boolean logic, we now have a new important research objective, which is to design specific hardware, inspired from biology, to build a Bayesian computer.” (p.xviii)

On the plane to and from Montpellier, I took an extended look at Bayesian Programming, a CRC Press book recently written by Pierre Bessière, Emmanuel Mazer, Juan-Manuel Ahuactzin, and Kamel Mekhnacha. (Very nice picture of a fishing net on the cover, by the way!) Despite my initial excitement at seeing a book whose final goal is to achieve a Bayesian computer, as demonstrated by the above quote, I soon found the book too arid to read, due to its highly formalised presentation… The contents clearly indicate that the approach is useful, as they illustrate the use of Bayesian programming in different decision-making settings, including a collection of Python codes. So the book answers the what, but it somehow misses the how, in that the construction of the priors and the derivation of the posteriors are not explained in a way one could replicate.

“A modeling methodology is not sufficient to run Bayesian programs. We also require an efficient Bayesian inference engine to automate the probabilistic calculus. This assumes we have a collection of inference algorithms adapted and tuned to more or less specific models and a software architecture to combine them in a coherent and unique tool.” (p.9)

For instance, all models therein are described via a curly-brace formalism, which quickly turns into an unpalatable object, as in this example taken from the online PhD thesis of Gabriel Synnaeve (where he applied Bayesian programming principles to the real-time strategy game StarCraft and developed an AI (or bot) called BroodwarBotQ), a thesis that I found most interesting!

“Consequently, we have 21 × 16 = 336 bell-shaped distributions and we have 2 × 21 × 16 = 672 free parameters: 336 means and 336 standard deviations.” (p.51)

Now, getting back to the topic of the book, I can see connections with statistical problems and models, and not only via the application of Bayes’ theorem, when the purpose (or Question) is to take a decision, for instance in a robotic action. I still remain puzzled by the purpose of the book, since it starts with very low expectations of the reader, but then hurries past notions like Kalman filters and Metropolis-Hastings algorithms in a few paragraphs. I do not get some of the details, like the notion of a discretised Gaussian distribution. (I eventually found the place where the 672 prior parameters are “learned”, in a phase called “identification”.)

“Thanks to conditional independence the curse of dimensionality has been broken! What has been shown to be true here for the required memory space is also true for the complexity of inferences. Conditional independence is the principal tool to keep the calculation tractable. Tractability of Bayesian inference computation is of course a major concern as it has been proved NP-hard (Cooper, 1990).”(p.74)

The final chapters (Chap. 14 on “Bayesian inference algorithms revisited”, Chap. 15 on “Bayesian learning revisited” and Chap. 16 on “Frequently asked questions and frequently argued matters” [!]) are definitely those I found easiest to read and relate to, with mentions made of conjugate priors and of the EM algorithm as a (Bayes) classifier. The final chapter mentions BUGS, Hugin and… Stan! Plus a sequence of 23 PhD theses defended on Bayesian programming for robotics over the past 20 years. It also explains the authors’ views on the difference between Bayesian programming and Bayesian networks (“any Bayesian network can be represented in the Bayesian programming formalism, but the opposite is not true”, p.316), between Bayesian programming and probabilistic programming (“we do not search to extend classical languages but rather to replace them by a new programming approach based on probability”, p.319), and between Bayesian programming and Bayesian modelling (“Bayesian programming goes one step further”, p.317), with a further (self-)justification of why the book sticks to discrete variables, and more philosophical sections referring to Jaynes and the principle of maximum entropy.

“The “objectivity” of the subjectivist approach then lies in the fact that two different subjects with same preliminary knowledge and same observations will inevitably reach the same conclusions.”(p.327)

Bayesian Programming thus provides a good snapshot of (or window on) what one can achieve with Bayesian techniques in decision-making under uncertainty. It shows a long-term reflection on those notions by Pierre Bessière, his colleagues and students. The topic is most likely too remote from my own interests for the above review to be complete. Therefore, if anyone is interested in reviewing this book any further for CHANCE, before I send the above to the journal, please contact me. (Usual provisions apply.)

## is atheism irrational?

Posted in Books, Kids on March 2, 2014 by xi'an

“If a belief is as likely to be false as to be true, we’d have to say the probability that any particular belief is true is about 50 percent. Now suppose we had a total of 100 independent beliefs (of course, we have many more). Remember that the probability that all of a group of beliefs are true is the multiplication of all their individual probabilities. Even if we set a fairly low bar for reliability — say, that at least two-thirds (67 percent) of our beliefs are true — our overall reliability, given materialism and evolution, is exceedingly low: something like .0004. So if you accept both materialism and evolution, you have good reason to believe that your belief-producing faculties are not reliable.”

On the (New York Times) philosophy blog The Stone, I spotted this entry and first wondered if I had misread the title, as atheism sounds (to me) like a most rational position. I then read the piece and found it mostly off the mark, even though a few points rang true(r). First, theism is never properly defined. (Even though the author Alvin Plantinga seems to stick to monotheist religions.) This is a not-so-subtle trick, as it makes atheism appear as the extreme position, since it rejects any form of theism! Then, the interviewee mostly uses a sequence of sophisms to argue that atheists are irrational, see e.g. the even-star-ism and a-moonism and a-teapotism entries. Further, some of his points very strongly resemble intelligent-design arguments, e.g. the “fine-tuning” line that the universe is too perfectly suited to human life to be due to randomness, even though Plantinga also resorts to evolution when needed, as in the above quote. (The interviewer is not doing a great job either, referring to evil, or to the need (or lack thereof) for God versus science to explain the world, rather than resorting to rational arguments, and without mentioning the fundamental point in favour of atheism: that positing a sentient being driving the whole universe while remaining hidden to us humans requires an infinitely stronger step than arguing that this is incompatible with the laws of Physics and with the corpus of experience accumulated since the dawn of humanity.) Plantinga’s whole strategy is actually to turn atheism into another kind of belief, “that materialism and evolution are true”, and then to rank it equal with the theisms. A very poor philosophical performance, as also (and better) pointed out in this other post. (And as my daughter remarked, fresh from writing a philosophy essay, Plantinga is missing the best argument of all, namely Pascal’s wager, an early instance of decision theory applied to religion.)

## Carlin and Chib (1995) for fixed dimension problems

Posted in Books, Kids, Statistics, University life on February 25, 2014 by xi'an

Yesterday, I was part of a (public) thesis committee at the Université Pierre et Marie Curie, in downtown Paris. After a bit of a search for the defence room (as the campus is still undergoing a massive asbestos clean-up, 20 years after it started…!), I listened to Florian Maire delivering his talk on an array of work in computational statistics, ranging from the theoretical (Peskun ordering) to the methodological (Monte Carlo online EM) to the applied (unsupervised learning of class shapes via deformable templates). The implementation of the online EM algorithm involved the use of pseudo-priors à la Carlin and Chib (1995), even though the setting was a fixed-dimension one, in order to fight the difficulty of exploring the space of templates with a regular Gibbs sampler. (As usual, the design of the pseudo-priors was crucial to the success of the method.) The thesis also included a recent work with Randal Douc and Jimmy Olsson on ranking inhomogeneous Markov kernels of the type

$P \circ Q \circ P \circ Q \circ \cdots$

against alternatives with components (P’,Q’). The authors were able to characterise minimal conditions for a Peskun-ordering domination on the components to transfer to the combination. Quite an interesting piece of work for a PhD thesis!