Archive for the University life Category

the Flatland paradox [#2]

Posted in Books, Kids, R, Statistics, University life with tags , , , , , , , , , , , on May 27, 2015 by xi'an

flatlandAnother trip in the métro today (to work with Pierre Jacob and Lawrence Murray in a Paris Anticafé!, as the University was closed) led me to infer—warning!, this is not the exact distribution!—the distribution of x, namely

f(x|N) = \frac{4^p}{4^{\ell+2p}} {\ell+p \choose p}\,\mathbb{I}_{N=\ell+2p}

since a path x of length l(x) will corresponds to N draws if N-l(x) is an even integer 2p and p undistinguishable annihilations in 4 possible directions have to be distributed over l(x)+1 possible locations, with Feller’s number of distinguishable distributions as a result. With a prior π(N)=1/N on N, hence on p, the posterior on p is given by

\pi(p|x) \propto 4^{-p} {\ell+p \choose p} \frac{1}{\ell+2p}

Now, given N and  x, the probability of no annihilation on the last round is 1 when l(x)=N and in general

\frac{4^p}{4^{\ell+2p}}{\ell-1+p \choose p}\big/\frac{4^p}{4^{\ell+2p}}{\ell+p \choose p}=\frac{\ell}{\ell+p}=\frac{2\ell}{N+\ell}

which can be integrated against the posterior. The numerical expectation is represented for a range of values of l(x) in the above graph. Interestingly, the posterior probability is constant for l(x) large  and equal to 0.8125 under a flat prior over N.

flatelGetting back to Pierre Druilhet’s approach, he sets a flat prior on the length of the path θ and from there derives that the probability of annihilation is about 3/4. However, “the uniform prior on the paths of lengths lower or equal to M” used for this derivation which gives a probability of length l proportional to 3l is quite different from the distribution of l(θ) given a number of draws N. Which as shown above looks much more like a Binomial B(N,1/2).

flatpostHowever, being not quite certain about the reasoning involving Fieller’s trick, I ran an ABC experiment under a flat prior restricted to (l(x),4l(x)) and got the above, where the histogram is for a posterior sample associated with l(x)=195 and the gold curve is the potential posterior. Since ABC is exact in this case (i.e., I only picked N’s for which l(x)=195), ABC is not to blame for the discrepancy! I asked about the distribution on Stack Exchange maths forum (and a few colleagues here as well) but got no reply so far… Here is the R code that goes with the ABC implementation:

#observation:
elo=195
#ABC version
T=1e6
el=rep(NA,T)
N=sample(elo:(4*elo),T,rep=TRUE)
for (t in 1:T){
#generate a path
  paz=sample(c(-(1:2),1:2),N[t],rep=TRUE)
#eliminate U-turns
  uturn=paz[-N[t]]==-paz[-1]
  while (sum(uturn>0)){
    uturn[-1]=uturn[-1]*(1-
              uturn[-(length(paz)-1)])
    uturn=c((1:(length(paz)-1))[uturn==1],
            (2:length(paz))[uturn==1])
    paz=paz[-uturn]
    uturn=paz[-length(paz)]==-paz[-1]
    }
  el[t]=length(paz)}
#subsample to get exact posterior
poster=N[abs(el-elo)==0]

non-reversible MCMC [comments]

Posted in Books, Mountains, pictures, Statistics, University life with tags , , , , , on May 26, 2015 by xi'an

[Here are comments made by Matt Graham that I thought would be more readable in a post format. The beautiful picture of the Alps above is his as well. I do not truly understand what Matt’s point is, as I did not cover continuous time processes in my discussion…]

In terms of interpretation of the diffusion with non-reversible drift component, I think this can be generalised from the Gaussian invariant density case by

dx = [ – (∂E/∂x) dt + √2 dw ] + S’ (∂E/∂x) dt

where ∂E/∂x is the usual gradient of the negative log (unnormalised) density / energy and S=-S’ is a skew symmetric matrix. In this form it seems the dynamic can be interpreted as the composition of an energy and volume conserving (non-canonical) Hamiltonian dynamic

dx/dt = S’ ∂E/∂x

and a (non-preconditioned) Langevin diffusion

dx = – (∂E/∂x) dt + √2 dw

As an alternative to discretising the combined dynamic, it might be interesting to compare to sequential alternation between ‘Hamiltonian’ steps either using a simple Euler discretisation

x’ = x + h S’ ∂E/∂x

or a symplectic method like implicit midpoint to maintain reversibility / volume preservation and then a standard MALA step

x’ = x – h (∂E/∂x) + √2 h w, w ~ N(0, I)

plus MH accept. If only one final MH accept step is done this overall dynamic will be reversible, however if a an intermediary MH accept was done after each Hamiltonian step (flipping the sign / transposing S on a rejection to maintain reversibility), the composed dynamic would in general be non-longer reversible and it would be interesting to compare performance in this case to that using a non-reversible MH acceptance on the combined dynamic (this alternative sidestepping the issues with finding an appropriate scale ε to maintain the non-negativity condition on the sum of the vorticity density and joint density on a proposed and current state).

With regards to your point on the positivity of g(x,y)+π(y)q(y,x), I’m not sure if I have fully understood your notation correctly or not, but I think you may have misread the definition of g(x,y) for the discretised Ornstein-Uhlenbeck case (apologies if instead the misinterpretation is mine!). The vorticity density is defined as the skew symmetric component of the density f of F(dx, dy) = µ(dx) Q(x, dy) with respect to the Lebesgue measure, where µ(dx) is the true invariant distribution of the Euler-Maruyama discretised diffusion based proposal density Q(x, dy) rather than g(x, y) being defined in terms of the skew-symmetric component of π(dx) Q(x, dy) which in general would lead to a vorticity density which does not meet the zero integral requirement as the target density π is not invariant in general with respect to Q.

another viral math puzzle

Posted in Books, Kids, R, University life with tags , , , , , , , on May 25, 2015 by xi'an

After the Singapore Maths Olympiad birthday problem that went viral, here is a Vietnamese primary school puzzle that made the frontline in The Guardian. The question is: Fill the empty slots with all integers from 1 to 9 for the equality to hold. In other words, find a,b,c,d,e,f,g,h,i such that

a+13xb:c+d+12xef-11+gxh:i-10=66.

With presumably the operation ordering corresponding to

a+(13xb:c)+d+(12xe)f-11+(gxh:i)-10=66

although this is not specified in the question. Which amounts to

a+(13xb:c)+d+(12xe)f+(gxh:i)=87

and implies that c divides b and i divides gxh. Rather than pursing this analytical quest further, I resorted to R coding, checking by brute force whether or not a given sequence was working.

baoloc=function(ord=sample(1:9)){
if (ord[1]+(13*ord[2]/ord[3])+ord[4]+
12*ord[5]-ord[6]-11+(ord[7]*ord[8]/
ord[9])-10==66) return(ord)}

I then applied this function to all permutations of {1,…,9} [with the help of the perm(combinat) R function] and found the 128 distinct solutions. Including some for which b:c is not an integer. (Not of this obviously gives a hint as to how a 8-year old could solve the puzzle.)

As pointed out in a comment below, using the test == on scalars is a bad idea—once realising some fractions may be other than integers—and I should thus replace the equality with an alternative that bypasses divisions,

baoloc=function(ord=sample(1:9)){
return(((ord[1]+ord[4]+12*ord[5]-ord[6]-87)*
ord[3]*ord[9]+13*ord[2]*ord[9]+
ord[3]*ord[7]*ord[8]==0)*ord)}

leading to the overall R code

sol=NULL
perms=as.matrix(data.frame(permutations(9)),ncol=9,byrow=TRUE)
for (t in 1:factorial(9)){
  a=baoloc(perms[t,])
  if (a[1]>0) sol=rbind(sol,a)}
sol=sol[do.call(order, as.data.frame(sol)),]

and returning the 136 different solutions…

an even more senseless taxi-ride

Posted in Books, Kids, pictures, Travel, University life with tags , , , , , , , , , on May 24, 2015 by xi'an

I was (exceptionally) working in (and for) my garden when my daughter shouted down from her window that John Nash had just died. I thus completed my tree trimming and went to check about this sad item of news. What I read made the news even sadder as he and his wife had died in a taxi crash in New Jersey, apparently for not wearing seat-belts, a strategy you would think far from minimax… Since Nash was in Norway a few days earlier to receive the 2015 Abel Prize, it may even be that the couple was on its way home back from the airport.  A senseless death for a Beautiful Mind.

postdoc in the Alps

Posted in Kids, Mountains, Statistics, Travel, University life with tags , , , , , , , , , on May 22, 2015 by xi'an

Post-doctoral Position in Spatial/Computational Statistics (Grenoble, France)

A post-doctoral position is available in Grenoble, France, to work on computational methods for spatial point process models. The candidate will work with Simon Barthelmé (GIPSA-lab, CNRS) and Jean-François Coeurjolly (Univ. Grenoble Alpes, Laboratory Jean Kuntzmann) on extending point process methodology to deal with large datasets involving multiple sources of variation. We will focus on eye movement data, a new and exciting application area for spatial statistics. The work will take place in the context of an interdisciplinary project on eye movement modelling involving psychologists, statisticians and applied mathematicians from three different institutes in Grenoble.

The ideal candidate has a background in spatial or computational statistics or machine learning. Knowledge of R (and in particular the package spatstat) and previous experience with point process models is a definite plus.

The duration of the contract is 12+6 months, starting 01.10.2015 at the earliest. Salary is according to standard CNRS scale (roughly EUR 2k/month).

Grenoble is the largest city in the French Alps, with a very strong science and technology cluster. It is a pleasant place to live, in an exceptional mountain environment.

Bruce Lindsay (March 7, 1947 — May 5, 2015)

Posted in Books, Running, Statistics, Travel, University life with tags , , , , , , , , , , , on May 22, 2015 by xi'an

When early registering for Seattle (JSM 2015) today, I discovered on the ASA webpage the very sad news that Bruce Lindsay had passed away on May 5.  While Bruce was not a very close friend, we had met and interacted enough times for me to feel quite strongly about his most untimely death. Bruce was indeed “Mister mixtures” in many ways and I have always admired the unusual and innovative ways he had found for analysing mixtures. Including algebraic ones through the rank of associated matrices. Which is why I first met him—besides a few words at the 1989 Gertrude Cox (first) scholarship race in Washington DC—at the workshop I organised with Gilles Celeux and Mike West in Aussois, French Alps, in 1995. After this meeting, we met twice in Edinburgh at ICMS workshops on mixtures, organised with Mike Titterington. I remember sitting next to Bruce at one workshop dinner (at Blonde) and him talking about his childhood in Oregon and his father being a journalist and how this induced him to become an academic. He also contributed a chapter on estimating the number of components [of a mixture] to the Wiley book we edited out of this workshop. Obviously, his work extended beyond mixtures to a general neo-Fisherian theory of likelihood inference. (Bruce was certainly not a Bayesian!) Last time, I met him, it was in Italia, at a likelihood workshop in Venezia, October 2012, mixing Bayesian nonparametrics, intractable likelihoods, and pseudo-likelihoods. He gave a survey talk about composite likelihood, telling me about his extended stay in Italy (Padua?) around that time… So, Bruce, I hope you are now running great marathons in a place so full of mixtures that you can always keep ahead of the pack! Fare well!

 

non-reversible MCMC

Posted in Books, Statistics, University life with tags , , , , , , on May 21, 2015 by xi'an

While visiting Dauphine, Natesh Pillai and Aaron Smith pointed out this interesting paper of Joris Bierkens (Warwick) that had escaped my arXiv watch/monitoring. The paper is about turning Metropolis-Hastings algorithms into non-reversible versions, towards improving mixing.

In a discrete setting, a way to produce a non-reversible move is to mix the proposal kernel Q with its time-reversed version Q’ and use an acceptance probability of the form

\epsilon\pi(y)Q(y,x)+(1-\epsilon)\pi(x)Q(x,y) \big/ \pi(x)Q(x,y)

where ε is any weight. This construction is generalised in the paper to any vorticity (skew-symmetric with zero sum rows) matrix Γ, with the acceptance probability

\epsilon\Gamma(x,y)+\pi(y)Q(y,x)\big/\pi(x)Q(x,y)

where ε is small enough to ensure all numerator values are non-negative. This is a rather annoying assumption in that, except for the special case derived from the time-reversed kernel, it has to be checked over all pairs (x,y). (I first thought it also implied the normalising constant of π but everything can be set in terms of the unormalised version of π, Γ or ε included.) The paper establishes that the new acceptance probability preserves π as its stationary distribution. An alternative construction is to make the proposal change from Q in H such that H(x,y)=Q(x,y)+εΓ(x,y)/π(x). Which seems more pertinent as not changing the proposal cannot improve that much the mixing behaviour of the chain. Still, the move to the non-reversible versions has the noticeable plus of decreasing the asymptotic variance of the Monte Carlo estimate for any integrable function. Any. (Those results are found in the physics literature of the 2000’s.)

The extension to the continuous case is a wee bit more delicate. One needs to find an anti-symmetric vortex function g with zero integral [equivalent to the row sums being zero] such that g(x,y)+π(y)q(y,x)>0 and with same support as π(x)q(x,y) so that the acceptance probability of g(x,y)+π(y)q(y,x)/π(x)q(x,y) leads to π being the stationary distribution. Once again g(x,y)=ε(π(y)q(y,x)-π(x)q(x,y)) is a natural candidate but it is unclear to me why it should work. As the paper only contains one illustration for the discretised Ornstein-Uhlenbeck model, with the above choice of g for a small enough ε (a point I fail to understand since any ε<1 should provide a positive g(x,y)+π(y)q(y,x)), it is also unclear to me that this modification (i) is widely applicable and (ii) is relevant for genuine MCMC settings.

Follow

Get every new post delivered to your Inbox.

Join 852 other followers