## Le Monde puzzle [#1029]

Posted in Books, Kids, R on November 22, 2017 by xi'an

A convoluted counting Le Monde mathematical puzzle:

A film theatre has a waiting room and several projection rooms, with four films on display. A first set of 600 spectators enters the waiting room and votes for their favourite film. The most popular film is projected to the spectators who voted for it and the remaining spectators stay in the waiting room. They are joined by a new set of 600 spectators, who then also vote for their favourite film. The selected film (which may be the same as the first one) is then shown to those who voted for it and the remaining spectators stay in the waiting room. This pattern is repeated for a total of 10 votes, after which the remaining spectators leave. What are the maximal possible numbers of waiting spectators and of spectators in a projection room?

A first attempt by random sampling does not produce extreme enough events to reach those maxima:

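V=1e4 #number of simulated evenings (arbitrary choice)
wm=rm=600 #waiting and watching
for (v in 1:V){
film=rep(0,4) #votes on each film
for (t in 1:9){
film=film+rmultinom(1,600,rep(1,4)) #600 newcomers vote uniformly at random
rm=max(rm,max(film)) #largest audience so far
film[order(film)[4]]=0 #the most popular film is shown
wm=max(wm,sum(film)+600)} #leftovers plus the next batch
rm=max(rm,max(film)+600)}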

where the last line adds the last batch of arriving spectators to the largest group of waiting ones. This code only returns 1605 for the maximal number of waiting spectators. And 1155 for the maximal number in a projection room.  Compared with the even separation of the first 600 into four groups of 150… I thus looked for an alternative deterministic allocation:

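wm=rm=0
film=rep(0,4)
for (t in 1:9){
size=sum(film)+600 #leftovers plus 600 newcomers
film=c(rep(ceiling(size/4),3),size-3*ceiling(size/4)) #even split over the four films
film[order(film)[4]]=0 #the most popular film is shown
rm=max(rm,max(film)+600) #all newcomers back the largest waiting group
wm=max(wm,sum(film)+600)} #waiting room once the next batch arrives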


which tries to preserve as many waiting spectators as possible for the last round (and always considers the scenario of all newcomers backing the largest waiting group for the next film). The outcome of this sequence moves up to 1155 for the largest projection audience and 2264 for the largest waiting group. I however wonder if splitting into two groups in the final round(s) could further increase the size of the last projection. And indeed halving the last batch into two groups leads to 1709 spectators in the final projection. With uncertainties about the validity of the split when earlier spectators keep their vote fixed! (I did not think long enough about this puzzle to turn it into a more mathematical problem…)
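The even-split schedule described above can be traced round by round. The following is a minimal sketch rather than the code used for the post: the function name `even_split_trace` and its arguments are made up for the illustration, and it assumes (as the deterministic code does) that everyone present in the waiting room is re-split evenly at each vote.

```r
# Round-by-round trace of the even-split schedule (illustrative sketch).
# Assumption: at each vote, leftovers and newcomers are all re-split evenly.
even_split_trace <- function(rounds = 9, batch = 600, films = 4) {
  out <- matrix(0, nrow = rounds, ncol = 3,
                dimnames = list(NULL, c("voters", "shown", "left")))
  left <- 0
  for (t in seq_len(rounds)) {
    voters <- left + batch            # everyone voting at round t
    shown <- ceiling(voters / films)  # size of the winning group
    left <- voters - shown            # spectators staying in the waiting room
    out[t, ] <- c(voters, shown, left)
  }
  out
}

tr <- even_split_trace()
tr
# largest waiting room once the next batch has arrived
max(tr[, "left"]) + 600
# in the post's split three groups share the top size, so the runner-up
# has the same size as the group shown and can be joined by a full batch
max(tr[, "shown"]) + 600
```

Under this schedule the leftover count follows left ← (left + 600) − ⌈(left + 600)/4⌉, whose fixed point is 1800, so the waiting room keeps growing over the nine rounds while staying below 2400.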

While in Warwick, I reconsidered the problem from a dynamic programming perspective, always keeping the notion that it was optimal to allocate the votes evenly between some of the films (from 1 to 4). Using the recursive R code

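optiz=function(votz,t){
#votz: current vote counts per film, t: index of the current vote
if (t==9){ return(sort(votz)[3]+600) #largest group is shown, newcomers join the runner-up
}else{
#first option: largest group leaves, all 600 newcomers back the runner-up
goal=optiz(sort(votz)+c(0,0,600,-max(votz)),t+1)
goal=rep(goal,4)
for (i in 2:4){ #other options: even split of leftovers and newcomers over i films
film=sort(votz);film[4]=0;film=sort(film)
size=sum(film[(4-i+1):4])+600
film[(4-i+1):4]=ceiling(size/i)
while (sum(film[(4-i+1):4])>size) film[4]=film[4]-1 #remove the rounding excess
goal[i]=optiz(sort(film),t+1)}
return(max(goal))}}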


led to a maximal audience size of 1619. [Which is also the answer provided by Le Monde]

## four positions at Warwick Statistics, apply!

Posted in Statistics on November 21, 2017 by xi'an

Enthusiastic and excellent academics are sought to be part of our Department of Statistics at Warwick, one of the world’s most prominent and most research active departments of Statistics. We are advertising four posts in total, which reflects the strong commitment of the University of Warwick to invest in Statistics. We intend to fill the following positions:

• Assistant or Associate Professor of Statistics (two positions)

• Reader of Statistics

• Full Professor of Statistics.

All posts are permanent, with posts at the Assistant level subject to probation.

You will have expertise in statistics (to be interpreted in the widest sense and to include both applied and methodological statistics, probability, probabilistic operational research and mathematical finance together with interdisciplinary topics involving one or more of these areas) and you will help shape research and teaching leadership in this fast-developing discipline. Applicants for senior positions should have an excellent publication record and proven ability to secure research funding. Applicants for more junior positions should show exceptional promise to become leading academics.

While the posts are open to applicants with expertise in any field of statistics (widely interpreted as above), the Department is particularly interested in strengthening its existing group in Data Science. The Department is heavily involved in the Warwick Data Science Institute and the Alan Turing Institute, the national institute for data science, headquartered in London. If interested, a successful candidate can apply to spend part of their time at the Alan Turing Institute as a Turing Fellow.

Closing date: 3 January 2018 for the Assistant/Associate level posts and 10 January 2018 for the Full Professor position.

Informal enquiries can be addressed to Professors Mark Steel, Gareth Roberts, and David Firth or to any other senior member of the Warwick Statistics Department. Applicants at Assistant/Associate levels should ask their referees to send letters of recommendation by the closing date to the Departmental Administrator, Mrs Paula Matthews.

## the Hyvärinen score is back

Posted in pictures, Statistics, Travel on November 21, 2017 by xi'an

Stéphane Shao, Pierre Jacob and co-authors from Harvard have just posted on arXiv a new paper on Bayesian model comparison using the Hyvärinen score

$\mathcal{H}(y, p) = 2\Delta_y \log p(y) + ||\nabla_y \log p(y)||^2$

which thus uses the Laplacian as a natural and normalisation-free penalisation for the score test. (Score that I first met in Padova, a few weeks before moving from X to IX.) Which brings a decision-theoretic alternative to the Bayes factor and which delivers a coherent answer when using improper priors. Thus a very appealing proposal in my (biased) opinion! The paper is mostly computational in that it proposes SMC and SMC² solutions to handle the estimation of the Hyvärinen score for models with tractable likelihoods and tractable completed likelihoods, respectively. (Reminding me that Pierre worked on SMC² algorithms quite early during his Ph.D. thesis.)
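As a quick sanity check on the normalisation-free property, here is a minimal sketch for a one-dimensional Gaussian, where the score has the closed form −2/σ² + (y−μ)²/σ⁴. The helper `hyvarinen_1d` and the step `h` are made up for this illustration; the central finite differences are only a numerical convenience and not a method taken from the paper.

```r
# Hyvarinen score in one dimension,
#   H(y, p) = 2 * d^2/dy^2 log p(y) + (d/dy log p(y))^2,
# with derivatives approximated by central finite differences
# (hypothetical helper; the step h is an arbitrary choice)
hyvarinen_1d <- function(logp, y, h = 1e-3) {
  grad <- (logp(y + h) - logp(y - h)) / (2 * h)
  lap  <- (logp(y + h) - 2 * logp(y) + logp(y - h)) / h^2
  2 * lap + grad^2
}

mu <- 1; sigma <- 2; y <- 0.3

# closed form for a N(mu, sigma^2) density
exact <- -2 / sigma^2 + (y - mu)^2 / sigma^4

# same score from the normalised and from the unnormalised log-density
normalised   <- hyvarinen_1d(function(x) dnorm(x, mu, sigma, log = TRUE), y)
unnormalised <- hyvarinen_1d(function(x) -(x - mu)^2 / (2 * sigma^2), y)

c(exact = exact, normalised = normalised, unnormalised = unnormalised)
```

Since only derivatives of log p in y enter the score, dropping the normalising constant leaves the three values equal up to numerical error.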

A most interesting remark in the paper is to recall that the Hyvärinen score associated with a generic model on a series must be the prequential (predictive) version

$\mathcal{H}_T (M) = \sum_{t=1}^T \mathcal{H}(y_t; p_M(dy_t|y_{1:(t-1)}))$

rather than the version on the joint marginal density of the whole series. (Followed by a remark within the remark that the logarithm scoring rule does not make this distinction. And I had to write down the cascading representation

$\log p(y_{1:T})=\sum_{t=1}^T \log p(y_t|y_{1:(t-1)})$

to convince myself that this unnatural decomposition, where the posterior on θ varies with each term, is true!) For consistency reasons.
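The cascading representation can also be checked numerically on a toy conjugate model. The sketch below is an illustration under assumed settings, not something taken from the paper: observations are N(θ,1) given θ, with a N(0,τ²) prior on θ, so that both the joint marginal and the one-step-ahead predictives are Gaussian and available in closed form.

```r
set.seed(1)
n <- 5          # length of the series (illustrative)
tau2 <- 4       # prior variance of theta (illustrative)
y <- rnorm(n, mean = 1)

# joint marginal: y ~ N(0, I + tau2 * 1 1')
Sigma <- diag(n) + tau2 * matrix(1, n, n)
log_joint <- -0.5 * (n * log(2 * pi) +
                     as.numeric(determinant(Sigma)$modulus) +
                     drop(t(y) %*% solve(Sigma, y)))

# one-step-ahead predictives, the posterior on theta changing with t
log_pred <- numeric(n)
for (t in seq_len(n)) {
  prec <- 1 / tau2 + (t - 1)           # posterior precision given y[1:(t-1)]
  m <- sum(y[seq_len(t - 1)]) / prec   # posterior mean (0 when t = 1)
  log_pred[t] <- dnorm(y[t], mean = m, sd = sqrt(1 + 1 / prec), log = TRUE)
}

c(joint = log_joint, cascade = sum(log_pred))
```

Both entries agree up to rounding, even though each term in the sum integrates θ against a different posterior.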

This prequential decomposition is however a plus in terms of computation when resorting to sequential Monte Carlo. Since each time step produces an evaluation of the associated marginal. In the case of state space models, another decomposition of the authors, based on measurement densities and partial conditional expectations of the latent states allows for another (SMC²) approximation. The paper also establishes that for non-nested models, the Hyvärinen score as a model selection tool asymptotically selects the closest model to the data generating process. For the divergence induced by the score. Even for state-space models, under some technical assumptions.  From this asymptotic perspective, the paper exhibits an example where the Bayes factor and the Hyvärinen factor disagree, even asymptotically in the number of observations, about which mis-specified model to select. And last but not least the authors propose and assess a discrete alternative relying on finite differences instead of derivatives. Which remains a proper scoring rule.

I am quite excited by this work (call me biased!) and I hope it can prompt follow-up work as a viable alternative to Bayes factors, if only for being more robust to the [unspecified] impact of the prior tails. As in the above picture where some realisations of the SMC² output and of the sequential decision process see the wrong model being almost acceptable for quite a long while…

## Domaine de Montcalmès

Posted in Statistics on November 20, 2017 by xi'an

## awkward citation style

Posted in Statistics on November 18, 2017 by xi'an

When submitting a paper to WIREs, I was asked to use the APA style for citations. This is rather unpleasant as it requires all kinds of fixes and even then returns an unseemly outcome, sometimes citing authors with their first name and at one point ignoring the parentheses for \citep citations… Maybe all those annoying bugs are on purpose, as APA stands for the American Psychological Association, presumably eager to experiment on new subjects!

## long journey to reproducible results [or not]

Posted in Books, pictures, Statistics, University life on November 17, 2017 by xi'an

A rather fascinating article in Nature of last August [hidden under a pile of newspapers at home!]. By Gordon J. Lithgow, Monica Driscoll and Patrick Phillips. About their endeavours to account for divergent outcomes in the replications [or lack thereof] of an earlier experiment on anti-aging drugs tested on roundworms. Rather than dismissing the failures or blaming the other teams, the above researchers engaged for four years (!) in the titanic and grubby task of understanding the reason(s) for such discrepancies.

Finding that once most causes for discrepancies (like gentle versus rough lab technicians!) were eliminated, there were still two “types” of worms, those short-lived and those long-lived, for reasons yet unclear. “We need to repeat more experiments than we realized” is a welcome conclusion to this dedicated endeavour, worth repeating in different circles. And apparently missing in the NYT coverage by Susan Dominus of the story of Amy Cuddy, a psychologist at the origin of the “power pose” theory that was later disputed for lack of reproducibility. An article whose main ideological theme is that Cuddy got singled out in the replication crisis because she is a woman and because her “power pose” theory is geared towards empowering women and minorities. Rather than because she keeps delivering the same message, mostly outside academia, despite the lack of evidence and statistical backup. (Dominus’ criticisms of psychologists with “an unusual interest in statistics” and of Andrew’s repeated comments on the methodological flaws of the 2010 paper that started it all are thus particularly unfair. A Slate article published after the NYT coverage presents an alternative analysis of this affair. Andrew also posted on Dominus’ paper, with a subsequent humongous trail of comments!)

## Darmois, Koopman, and Pitman

Posted in Books, Statistics on November 15, 2017 by xi'an

When [X’ed] seeking a simple proof of the Pitman-Koopman-Darmois lemma [that exponential families are the only types of distributions with constant support allowing for a fixed dimension sufficient statistic], I came across a 1962 Stanford technical report by Don Fraser containing a short proof of the result. Proof that I do not fully understand as it relies on the notion that the likelihood function itself is a minimal sufficient statistic.