## another viral math puzzle

Posted in Books, Kids, R, University life with tags , , , , , , , on May 25, 2015 by xi'an

After the Singapore Maths Olympiad birthday problem that went viral, here is a Vietnamese primary school puzzle that made the frontline in The Guardian. The question is: Fill the empty slots with all integers from 1 to 9 for the equality to hold. In other words, find a,b,c,d,e,f,g,h,i such that

a+13xb:c+d+12xef-11+gxh:i-10=66.

With presumably the operation ordering corresponding to

a+(13xb:c)+d+(12xe)f-11+(gxh:i)-10=66

although this is not specified in the question. Which amounts to

a+(13xb:c)+d+(12xe)f+(gxh:i)=87

and implies that c divides b and i divides gxh. Rather than pursing this analytical quest further, I resorted to R coding, checking by brute force whether or not a given sequence was working.

baoloc=function(ord=sample(1:9)){
if (ord[1]+(13*ord[2]/ord[3])+ord[4]+
12*ord[5]-ord[6]-11+(ord[7]*ord[8]/
ord[9])-10==66) return(ord)}


I then applied this function to all permutations of {1,…,9} [with the help of the perm(combinat) R function] and found the 128 distinct solutions. Including some for which b:c is not an integer. (Not of this obviously gives a hint as to how a 8-year old could solve the puzzle.)

As pointed out in a comment below, using the test == on scalars is a bad idea—once realising some fractions may be other than integers—and I should thus replace the equality with an alternative that bypasses divisions,

baoloc=function(ord=sample(1:9)){
return(((ord[1]+ord[4]+12*ord[5]-ord[6]-87)*
ord[3]*ord[9]+13*ord[2]*ord[9]+
ord[3]*ord[7]*ord[8]==0)*ord)}


leading to the overall R code

sol=NULL
perms=as.matrix(data.frame(permutations(9)),ncol=9,byrow=TRUE)
for (t in 1:factorial(9)){
a=baoloc(perms[t,])
if (a[1]>0) sol=rbind(sol,a)}
sol=sol[do.call(order, as.data.frame(sol)),]


and returning the 136 different solutions…

## 39% anglo-irish!

Posted in Kids, Statistics, Travel with tags , , , , , , , , on May 24, 2015 by xi'an

As I have always been curious about my ancestry, I made a DNA test on 23andMe. While the company no longer provides statistics about potential medical conditions because of a lawsuit, it does return an ancestry analysis of sorts. In my case, my major ancestry composition is Anglo-Irish!  (with 39% of my DNA) and northern European (with 32%), while only 19% is Franco-German… In retrospect, not so much of a surprise—not because of my well-known Anglophilia but—given that my (known, i.e., at least for the direct ancestral branches) family roots are in Normandy—whose duke invaded Britain in 1056—and Brittany—which was invaded by British Celts fleeing Anglo-Saxons in the 400’s.  What’s maybe more surprising to me is that the database contained 23 people identified as 4th degree cousins and a total of 652 relatives… While the potential number of my potential 4th degree cousins stands in the 10,000’s, and hence there may indeed be a few ending up as 23andMe—mostly American—customers, I am indeed surprised that a .37% coincidence in our genes qualifies for being 4th degree cousins! But given that I only share 3.1% with my great⁴-grandfather, it actually make sense that I share about .1% to .4% with such remote cousins. However I wonder at the precision of such an allocation: could those cousins be even more remotely related? Not related at all? [Warning: All the links to 23andMe in this post are part of their referral program.]

## postdoc in the Alps

Posted in Kids, Mountains, Statistics, Travel, University life with tags , , , , , , , , , on May 22, 2015 by xi'an

Post-doctoral Position in Spatial/Computational Statistics (Grenoble, France)

A post-doctoral position is available in Grenoble, France, to work on computational methods for spatial point process models. The candidate will work with Simon Barthelmé (GIPSA-lab, CNRS) and Jean-François Coeurjolly (Univ. Grenoble Alpes, Laboratory Jean Kuntzmann) on extending point process methodology to deal with large datasets involving multiple sources of variation. We will focus on eye movement data, a new and exciting application area for spatial statistics. The work will take place in the context of an interdisciplinary project on eye movement modelling involving psychologists, statisticians and applied mathematicians from three different institutes in Grenoble.

The ideal candidate has a background in spatial or computational statistics or machine learning. Knowledge of R (and in particular the package spatstat) and previous experience with point process models is a definite plus.

The duration of the contract is 12+6 months, starting 01.10.2015 at the earliest. Salary is according to standard CNRS scale (roughly EUR 2k/month).

Grenoble is the largest city in the French Alps, with a very strong science and technology cluster. It is a pleasant place to live, in an exceptional mountain environment.

## Bruce Lindsay (March 7, 1947 — May 5, 2015)

Posted in Books, Running, Statistics, Travel, University life with tags , , , , , , , , , , , on May 22, 2015 by xi'an

When early registering for Seattle (JSM 2015) today, I discovered on the ASA webpage the very sad news that Bruce Lindsay had passed away on May 5.  While Bruce was not a very close friend, we had met and interacted enough times for me to feel quite strongly about his most untimely death. Bruce was indeed “Mister mixtures” in many ways and I have always admired the unusual and innovative ways he had found for analysing mixtures. Including algebraic ones through the rank of associated matrices. Which is why I first met him—besides a few words at the 1989 Gertrude Cox (first) scholarship race in Washington DC—at the workshop I organised with Gilles Celeux and Mike West in Aussois, French Alps, in 1995. After this meeting, we met twice in Edinburgh at ICMS workshops on mixtures, organised with Mike Titterington. I remember sitting next to Bruce at one workshop dinner (at Blonde) and him talking about his childhood in Oregon and his father being a journalist and how this induced him to become an academic. He also contributed a chapter on estimating the number of components [of a mixture] to the Wiley book we edited out of this workshop. Obviously, his work extended beyond mixtures to a general neo-Fisherian theory of likelihood inference. (Bruce was certainly not a Bayesian!) Last time, I met him, it was in Italia, at a likelihood workshop in Venezia, October 2012, mixing Bayesian nonparametrics, intractable likelihoods, and pseudo-likelihoods. He gave a survey talk about composite likelihood, telling me about his extended stay in Italy (Padua?) around that time… So, Bruce, I hope you are now running great marathons in a place so full of mixtures that you can always keep ahead of the pack! Fare well!

## non-reversible MCMC

Posted in Books, Statistics, University life with tags , , , , , , on May 21, 2015 by xi'an

While visiting Dauphine, Natesh Pillai and Aaron Smith pointed out this interesting paper of Joris Bierkens (Warwick) that had escaped my arXiv watch/monitoring. The paper is about turning Metropolis-Hastings algorithms into non-reversible versions, towards improving mixing.

In a discrete setting, a way to produce a non-reversible move is to mix the proposal kernel Q with its time-reversed version Q’ and use an acceptance probability of the form

$\epsilon\pi(y)Q(y,x)+(1-\epsilon)\pi(x)Q(x,y) \big/ \pi(x)Q(x,y)$

where ε is any weight. This construction is generalised in the paper to any vorticity (skew-symmetric with zero sum rows) matrix Γ, with the acceptance probability

$\epsilon\Gamma(x,y)+\pi(y)Q(y,x)\big/\pi(x)Q(x,y)$

where ε is small enough to ensure all numerator values are non-negative. This is a rather annoying assumption in that, except for the special case derived from the time-reversed kernel, it has to be checked over all pairs (x,y). (I first thought it also implied the normalising constant of π but everything can be set in terms of the unormalised version of π, Γ or ε included.) The paper establishes that the new acceptance probability preserves π as its stationary distribution. An alternative construction is to make the proposal change from Q in H such that H(x,y)=Q(x,y)+εΓ(x,y)/π(x). Which seems more pertinent as not changing the proposal cannot improve that much the mixing behaviour of the chain. Still, the move to the non-reversible versions has the noticeable plus of decreasing the asymptotic variance of the Monte Carlo estimate for any integrable function. Any. (Those results are found in the physics literature of the 2000’s.)

The extension to the continuous case is a wee bit more delicate. One needs to find an anti-symmetric vortex function g with zero integral [equivalent to the row sums being zero] such that g(x,y)+π(y)q(y,x)>0 and with same support as π(x)q(x,y) so that the acceptance probability of g(x,y)+π(y)q(y,x)/π(x)q(x,y) leads to π being the stationary distribution. Once again g(x,y)=ε(π(y)q(y,x)-π(x)q(x,y)) is a natural candidate but it is unclear to me why it should work. As the paper only contains one illustration for the discretised Ornstein-Uhlenbeck model, with the above choice of g for a small enough ε (a point I fail to understand since any ε<1 should provide a positive g(x,y)+π(y)q(y,x)), it is also unclear to me that this modification (i) is widely applicable and (ii) is relevant for genuine MCMC settings.

## speed seminar-ing

Posted in Books, pictures, Statistics, Travel, University life, Wines with tags , , , , , , , , , , on May 20, 2015 by xi'an

Yesterday, I  made a quick afternoon trip to Montpellier as replacement of a seminar speaker who had cancelled at the last minute. Most obviously, I gave a talk about our “testing as mixture” proposal. And as previously, the talk generated a fair amount of discussion and feedback from the audience. Providing me with additional aspects to include in a revision of the paper. Whether or not the current submission is rejected, new points made and received during those seminars will have to get in a revised version as they definitely add to the appeal to the perspective. In that seminar, most of the discussion concentrated on the connection with decisions based on such a tool as the posterior distribution of the mixture weight(s). My argument for sticking with the posterior rather than providing a hard decision rule was that the message is indeed in arguing hard rules that end up mimicking the p- or b-values. And the catastrophic consequences of fishing for significance and the like. Producing instead a validation by simulating under each model pseudo-samples shows what to expect for each model under comparison. The argument did not really convince Jean-Michel Marin, I am afraid! Another point he raised was that we could instead use a distribution on α with support {0,1}, to avoid the encompassing model he felt was too far from the original models. However, this leads back to the Bayes factor as the weights in 0 and 1 are the marginal likelihoods, nothing more. However, this perspective on the classical approach has at least the appeal of completely validating the use of improper priors on common (nuisance or not) parameters. Pierre Pudlo also wondered why we could not conduct an analysis on the mixture of the likelihoods. Instead of the likelihood of the mixture. My first answer was that there was not enough information in the data for estimating the weight(s). A few more seconds of reflection led me to the further argument that the posterior on α with support (0,1) would then be a mixture of Be(2,1) and Be(1,2) with weights the marginal likelihoods, again (under a uniform prior on α). So indeed not much to gain. A last point we discussed was the case of the evolution trees we analyse with population geneticists from the neighbourhood (and with ABC). Jean-Michel’s argument was that the scenari under comparison were not compatible with a mixture, the models being exclusive. My reply involved an admixture model that contained all scenarios as special cases. After a longer pondering, I think his objection was more about the non iid nature of the data. But the admixture construction remains valid. And makes a very strong case in favour of our approach, I believe.

After the seminar, Christian Lavergne and Jean-Michel had organised a doubly exceptional wine-and-cheese party: first because it is not usually the case there is such a post-seminar party and second because they had chosen a terrific series of wines from the Mas Bruguière (Pic Saint-Loup) vineyards. Ending up with a great 2007 L’Arbouse. Perfect ending for an exciting day. (I am not even mentioning a special Livarot from close to my home-town!)

## Cauchy Distribution: Evil or Angel?

Posted in Books, pictures, Running, Statistics, Travel, University life, Wines with tags , , , , , , , , , , , , on May 19, 2015 by xi'an

Natesh Pillai and Xiao-Li Meng just arXived a short paper that solves the Cauchy conjecture of Drton and Xiao [I mentioned last year at JSM], namely that, when considering two normal vectors with generic variance matrix S, a weighted average of the ratios X/Y remains Cauchy(0,1), just as in the iid S=I case. Even when the weights are random. The fascinating side of this now resolved (!) conjecture is that the correlation between the terms does not seem to matter. Pushing the correlation to one [assuming it is meaningful, which is a suspension of belief!, since there is no standard correlation for Cauchy variates] leads to a paradox: all terms are equal and yet… it works: we recover a single term, which again is Cauchy(0,1). All that remains thus to prove is that it stays Cauchy(0,1) between those two extremes, a weird kind of intermediary values theorem!

Actually, Natesh and XL further prove an inverse χ² theorem: the inverse of the normal vector, renormalised into a quadratic form is an inverse χ² no matter what its covariance matrix. The proof of this amazing theorem relies on a spherical representation of the bivariate Gaussian (also underlying the Box-Müller algorithm). The angles are then jointly distributed as

$\exp\{-\sum_{i,j}\alpha_{ij}\cos(\theta_i-\theta_j)\}$

and from there follows the argument that conditional on the differences between the θ’s, all ratios are Cauchy distributed. Hence the conclusion!

A question that stems from reading this version of the paper is whether this property extends to other formats of non-independent Cauchy variates. Somewhat connected to my recent post about generating correlated variates from arbitrary distributions: using the inverse cdf transform of a Gaussian copula shows this is possibly the case: the following code is meaningless in that the empirical correlation has no connection with a “true” correlation, but nonetheless the experiment seems of interest…

> ro=.999999;x=matrix(rnorm(2e4),ncol=2);y=ro*x+sqrt(1-ro^2)*matrix(rnorm(2e4),ncol=2)
> cor(x[,1]/x[,2],y[,1]/y[,2])
[1] -0.1351967
> ro=.99999999;x=matrix(rnorm(2e4),ncol=2);y=ro*x+sqrt(1-ro^2)*matrix(rnorm(2e4),ncol=2)
> cor(x[,1]/x[,2],y[,1]/y[,2])
[1] 0.8622714
> ro=1-1e-5;x=matrix(rnorm(2e4),ncol=2);y=ro*x+sqrt(1-ro^2)*matrix(rnorm(2e4),ncol=2)
> z=qcauchy(pnorm(as.vector(x)));w=qcauchy(pnorm(as.vector(y)))
> cor(x=z,y=w)
[1] 0.9999732
> ks.test((z+w)/2,"pcauchy")

One-sample Kolmogorov-Smirnov test

data:  (z + w)/2
D = 0.0068, p-value = 0.3203
alternative hypothesis: two-sided
> ro=1-1e-3;x=matrix(rnorm(2e4),ncol=2);y=ro*x+sqrt(1-ro^2)*matrix(rnorm(2e4),ncol=2)
> z=qcauchy(pnorm(as.vector(x)));w=qcauchy(pnorm(as.vector(y)))
> cor(x=z,y=w)
[1] 0.9920858
> ks.test((z+w)/2,"pcauchy")

One-sample Kolmogorov-Smirnov test

data:  (z + w)/2
D = 0.0036, p-value = 0.9574
alternative hypothesis: two-sided