## vertical likelihood Monte Carlo integration

Posted in Books, pictures, Running, Statistics, Travel, University life on April 17, 2015 by xi'an

A few months ago, Nick Polson and James Scott arXived a paper on one of my favourite problems, namely the approximation of normalising constants (and it went way under my radar, as I only became aware of it quite recently!, then it remained in my travel bag for an extra few weeks…). The method for approximating the constant Z draws from an analogy with the energy level sampling methods found in physics, like the Wang-Landau algorithm. The authors rely on a one-dimensional slice sampling representation of the posterior distribution and [main innovation in the paper] add a weight function on the auxiliary uniform. The choice of the weight function links the approach with the dreaded harmonic estimator (!), but also with power-posterior and bridge sampling. The paper recommends a specific weighting function, based on a “score-function heuristic” I do not get. Further, the optimal weight depends on intractable cumulative functions as in nested sampling. It would be fantastic if one could draw directly from the prior distribution of the likelihood function—rather than draw an x [from the prior or from something better, as suggested in our 2009 Biometrika paper] and transform it into L(x)—but as in all existing alternatives this alas is not the case. (Which is why I find the recommendations in the paper for practical implementation rather impractical, since, were the prior cdf of L(X) available, direct simulation of L(X) would be feasible. Maybe not the optimal choice though.)

“What is the distribution of the likelihood ordinates calculated via nested sampling? The answer is surprising: it is essentially the same as the distribution of likelihood ordinates by recommended weight function from Section 4.”

The approach is thus very much related to nested sampling, at least in spirit. As the authors later demonstrate, nested sampling is another case of weighting. Both versions require simulations under truncated likelihood values, albeit with a possibility of going down [in likelihood values] with the current version. Actually, more weighting could prove [more] efficient as both the original nested and vertical sampling simulate from the prior under the likelihood constraint. Getting away from the prior should help. (I am quite curious to see how the method is received and applied.)
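To make the slice representation behind the method concrete, here is a minimal R sketch on a toy problem chosen purely for illustration (it is not taken from the paper): a N(0,1) prior and a single observation y=1 with a N(x,1) likelihood, so that Z is known in closed form. It checks numerically that $Z=\int L(x)\pi(x)\,\text{d}x=\int_0^\infty \mathbb{P}_\pi(L(X)>u)\,\text{d}u$, with the survival function of L(X) estimated from prior draws. It does not implement the weighted version or the recommended weight function, and the names `z_prior` and `z_vertical` are made up for the occasion.

```r
set.seed(42)
y <- 1                                   # single observation (assumed toy value)
n <- 1e5
x <- rnorm(n)                            # draws from the N(0,1) prior
lik <- dnorm(y, mean = x, sd = 1)        # likelihood L(x) at each prior draw

z_true <- dnorm(y, mean = 0, sd = sqrt(2))   # exact marginal likelihood
z_prior <- mean(lik)                         # plain prior Monte Carlo estimate

# vertical representation: integrate the empirical survival function of L(X)
u <- seq(0, dnorm(0), length.out = 2001)     # L(x) is bounded by dnorm(0)
surv <- 1 - ecdf(lik)(u)                     # estimate of P(L(X) > u) on the grid
z_vertical <- sum(diff(u) * (head(surv, -1) + tail(surv, -1)) / 2)  # trapezoid rule

print(c(exact = z_true, prior = z_prior, vertical = z_vertical))
```

With an empirical survival function the two estimates agree up to the grid error, which is the point: any gain has to come from how the likelihood ordinates are simulated and weighted, not from the identity itself.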

## ah ces enseignants..!

Posted in Kids, pictures, Travel on April 16, 2015 by xi'an

## abc [with brains]

Posted in Statistics on April 16, 2015 by xi'an

## reis naar Amsterdam

Posted in Books, Kids, pictures, Running, Statistics, Travel, University life, Wines on April 16, 2015 by xi'an

On Monday, I went to Amsterdam to give a seminar at the University of Amsterdam, in the department of psychology. And to visit Eric-Jan Wagenmakers and his group there. And I had a fantastic time! I talked about our mixture proposal for Bayesian testing and model choice without getting hostile or adverse reactions from the audience, quite the opposite as we later discussed this new notion for several hours in the café across the street. I also had the opportunity to meet with Peter Grünwald [who authored a book on the minimum description length principle], who pointed out a minor inconsistency of the common parameter approach, namely that the Jeffreys prior on the first model did not have to coincide with the Jeffreys prior on the second model. (The Jeffreys prior for the mixture being unavailable.) He also wondered about a more conservative property of the approach, compared with the Bayes factor, in the sense that the non-null parameter could get closer to the null-parameter while still being identifiable.

Among the many persons I met in the department, Maarten Marsman talked to me about his thesis research, Plausible values in statistical inference, which involved handling the Ising model [a non-sparse Ising model with O(p²) parameters] by an auxiliary representation due to Marc Kac and getting rid of the normalising (partition) constant by the way. (Warning, some approximations involved!) And who showed me a simple probit example of the Gibbs sampler getting stuck as the sample size n grows. Simply because the uniform conditional distribution on the parameter concentrates faster (in 1/n) than the posterior (in 1/√n). This does not come as a complete surprise as data augmentation operates in an n-dimensional space. Hence it requires more time to get around. As a side remark [still worth printing!], Maarten dedicated his thesis as “To my favourite random variables, Siem en Fem, and to my normalizing constant, Esther”, from which I hope you can spot the influence of at least two of my book dedications! As I left Amsterdam on Tuesday, I had time for an enjoyable dinner with E-J’s group, an equally enjoyable early morning run [with perfect skies for sunrise pictures!], and more discussions in the department. Including a presentation of the new (delicious?!) Bayesian software developed there, JASP, which aims at non-specialists [i.e., researchers unable to code in R, BUGS, or, God forbid!, STAN]. And about the consequences of mixture testing in some psychological experiments. Once again, a fantastic time discussing Bayesian statistics and their applications, with a group of dedicated and enthusiastic Bayesians!
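The slowing down of the probit Gibbs sampler mentioned above can be mimicked with a minimal R sketch. The model below is an assumption and may well differ from Maarten's actual example: latent z_i ~ N(0,1), observed y_i = 1 when z_i ≤ θ, and a flat prior on θ, so that θ given the latent variables is uniform on an interval of width O(1/n) while the posterior has spread O(1/√n). The function names `gibbs_threshold` and `lag1` are hypothetical.

```r
set.seed(2015)

# Assumed toy model: z_i ~ N(0,1), y_i = 1 if z_i <= theta, flat prior on theta,
# so that P(y_i = 1) = pnorm(theta) and theta | z, y is uniform on an interval
gibbs_threshold <- function(y, n_iter = 2000, theta0 = 0) {
  stopifnot(any(y == 1), any(y == 0))    # both outcomes needed for a proper posterior
  n <- length(y)
  theta <- numeric(n_iter)
  cur <- theta0
  for (t in seq_len(n_iter)) {
    # latent step: N(0,1) truncated to (-Inf, cur] when y = 1, to (cur, Inf) when y = 0
    p <- pnorm(cur)
    v <- runif(n)
    z <- ifelse(y == 1, qnorm(v * p), qnorm(p + v * (1 - p)))
    # parameter step: uniform between the largest z with y = 1 and the smallest z with y = 0
    cur <- runif(1, max(z[y == 1]), min(z[y == 0]))
    theta[t] <- cur
  }
  theta
}

# lag-one autocorrelation of a chain, as a crude measure of stickiness
lag1 <- function(chain) acf(chain, lag.max = 1, plot = FALSE)$acf[2]

y_small <- as.integer(rnorm(20) <= 0)    # data simulated with theta = 0
y_large <- as.integer(rnorm(2000) <= 0)
rho_small <- lag1(gibbs_threshold(y_small))
rho_large <- lag1(gibbs_threshold(y_large))
print(c(n20 = rho_small, n2000 = rho_large))
```

The lag-one autocorrelation should get closer to one as n increases, since each move of θ is confined between two neighbouring latent variables.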

## run in the parc [#3]

Posted in pictures, Running, Travel on April 15, 2015 by xi'an

## Bernoulli, Montmort and Waldegrave

Posted in Books, Kids, R, Statistics on April 15, 2015 by xi'an

In the last issue of Statistical Science, David Bellhouse [author of De Moivre’s biography] and Nicolas Fillion published an account of a discussion between Pierre Rémond de Montmort, Nicolaus Bernoulli—“the” Bernoulli associated with the St. Petersburg paradox—, and Francis Waldegrave, about the card game of Le Her (or Hère, for wretch). Here is the abridged description from the paper:

“Le Her is a game (…) played with a standard deck of fifty-two playing cards. The simplest situation is when two players [Pierre and Paul] play the game, and the solution is not simply determined  even in that situation (…) Pierre deals a card from the deck to Paul and then one to himself. Paul has the option of switching his card for Pierre’s card. Pierre can only refuse the switch if he holds a king (the highest valued card). After Paul makes his decision to hold or switch, Pierre now has the option to hold whatever card he now has or to switch it with a card drawn from the deck. However, if he draws a king, he must retain his original card. The player with the highest card wins the pot, with ties going to the dealer Pierre (…) What are the chances of each player (…) ?” (p.2)

As the paper focuses on the various and conflicting resolutions by those 18th Century probabilists, reaching the solution [for Paul to win]

$\dfrac{2828ac+2834bc+2838ad+2828bd}{13\cdot 17\cdot 25 \cdot(a+b)(c+d)}$

“where a is Paul’s probability of switching with seven, b is Paul’s probability of holding the seven, c is Pierre’s probability of switching with an eight, and d is Pierre’s probability of holding on to an eight”

[which sounds amazing for the time, circa 1713!], where I do not see how a+b or c+d are different from 1, I ran a small piece of R code to check the probability that Paul wins if he switches when there are more larger than smaller values in the remaining cards and Pierre adopts the same strategy if Paul did not switch:

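cards=rep(1:13,4)
win=0
T=10^6
for (t in 1:T){
  deal=sample(cards,2)
  #Paul holds deal[1], Pierre (the dealer) holds deal[2]
  switch=0
  #position k<=13 of cards holds value k, so this drops one copy of Paul's card
  rest=cards[-deal[1]]
  #Paul switches if Pierre has no king and more remaining cards are above than below his own
  if ((deal[2]<13)&(sum(rest<=deal[1])<sum(rest>=deal[1]))){
    switch=deal[2];deal[2]=deal[1];deal[1]=switch}
  #Pierre's turn
  if (switch>0){
    rest=cards[-deal]
    if (deal[2]<deal[1]){ #sure loss worse than random one
      draw=sample(rest,1)
      if (draw<13) deal[2]=draw} #a drawn king means keeping the original card
  }else{
    rest=cards[-deal[2]]
    if (sum(rest<=deal[2])<sum(rest>=deal[2])){
      draw=sample(rest,1)
      if (draw<13) deal[2]=draw}}
  win=win+(deal[2]>=deal[1]) #ties go to the dealer Pierre
}
1-win/T #frequency of Paul's wins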


Returning a winning probability of 0.5128 [at the first try] for Paul. However, this is not the optimal strategy for either Paul or Pierre, since randomisation for card values of 7 and 8 pushes Paul’s odds slightly higher!
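As a check on the randomisation remark, here is a minimal R sketch that treats the formula quoted above as a 2×2 zero-sum game. It assumes a+b=1 and c+d=1, so that Paul's winning probability is the bilinear numerator divided by 13·17·25, and it uses the textbook closed form for a 2×2 game without a saddle point. The variable names are made up, and this is an arithmetic illustration rather than the derivation found in the paper.

```r
# Numerators of Paul's winning probability over the common denominator 13*17*25;
# rows: Paul switches / holds a seven, columns: Pierre switches / holds an eight
payoff <- matrix(c(2828, 2838,
                   2834, 2828), nrow = 2, byrow = TRUE) / (13 * 17 * 25)

# pure strategies only: Paul's guaranteed level versus Pierre's cap
maximin <- max(apply(payoff, 1, min))
minimax <- min(apply(payoff, 2, max))

# closed form for a 2x2 zero-sum game without saddle point (maximin < minimax)
den <- payoff[1, 1] - payoff[1, 2] - payoff[2, 1] + payoff[2, 2]
paul_switch <- (payoff[2, 2] - payoff[2, 1]) / den     # a, Paul switching with a seven
pierre_switch <- (payoff[2, 2] - payoff[1, 2]) / den   # c, Pierre switching with an eight
value <- (payoff[1, 1] * payoff[2, 2] - payoff[1, 2] * payoff[2, 1]) / den

print(c(maximin = maximin, minimax = minimax,
        a = paul_switch, c = pierre_switch, value = value))
```

Under these assumptions, Paul switches a seven with probability 3/8, Pierre switches an eight with probability 5/8, and Paul's winning probability is 2831.75/5525, about 0.5125, against a guaranteed 2828/5525, about 0.5119, without randomisation.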

## thumbleweed [no] news

Posted in Kids, Mountains, Running, Travel on April 14, 2015 by xi'an

Just realised today is the second year since my climbing accident and the loss of my right thumb. Even less to say than last anniversary: while it seems almost impossible not to think about it, the handicap is quite minimal. (Actually, the only time I truly forgot about it was when I was ice-climbing in Scotland this January, the difficulty of the [first] climb meaning I had to concentrate on more immediate issues!) Teaching on the blackboard is fine when I use a chalk holder, I just bought a new bike with the easiest change of gears, and except for lacing my running shoes every morning, most chores do not take longer and, as Andrew pointed out in his March+April madness tournament, I can now get away with some missing-body-part jokes!