Archive for R

Le Monde puzzle [#947]

Posted in Books, Kids, pictures, R, Statistics on February 2, 2016 by xi'an

Another board game in Le Monde mathematical puzzle:

Given an 8×8 chequerboard, consider placing 2×2 tiles over it until (a) the entire surface is covered and (b) removing any single 2×2 tile exposes some of the original chequerboard. What is the maximal number of 2×2 tiles one can set according to this scheme? And for a 10×10 chequerboard?

This puzzle reminded me of Wilfrid Kendall’s introduction to perfect sampling, with leaves falling from above, seen through a ceiling window, until the sky was no longer visible. The representation was used by Wilfrid to explain that coupling from the past did not need to go all the way back to infinity:

Defining a random coverage of the chequerboard by those 2×2 tiles amounts to selecting a random permutation þ of 1:(n-1)² [the possible top-left corners of the tiles] and finding the initial subvector of þ producing a full coverage:

 grid=matrix(0,n,n)
 path=sample(1:(n-1)^2) #random permutation of the tile positions
 path=path+((path-1)%/%(n-1)) #shift to top-left indices, accounting for the border
 i=1
 neigh=c(0,1,n,n+1) #offsets of the four cells of a 2x2 tile
 while (min(grid)==0){ #until all entries are covered
   grid[path[i]+neigh]=grid[path[i]+neigh]+1
   i=i+1
 }
 i=i-1 #number of tiles actually laid

Then removing superfluous tiles:

off=rep(TRUE,i) #flags for the tiles that are kept
for (k in sample(1:i)){
 krid=grid
 krid[path[k]+neigh]=krid[path[k]+neigh]-1
 if (min(krid)>0){ #tile k is superfluous
   off[k]=FALSE; grid=krid}
 }

And checking the remaining ones are truly essential:

mingrid=0
for (k in (1:i)[off]){
 krid=grid
 krid[path[k]+neigh]=krid[path[k]+neigh]-1
 mingrid=max(mingrid,min(krid))
 }
sky=(mingrid>0) #TRUE when some kept tile is removable, i.e., the grid is rejected

leads to the maximum number of tiles being [at least] M=16, 25, 37, 54 for n=6, 8, 10, 12…
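Repeating the whole construction over many random permutations gives a lower bound on that maximal number. Here is a minimal wrapper of my own around the three steps above [the name maxtile and the number of replications N are my choices, not part of the original solution]:

maxtile=function(n,N=1e3){
 neigh=c(0,1,n,n+1)
 best=0
 for (t in 1:N){
   #step 1: random coverage
   grid=matrix(0,n,n)
   path=sample(1:(n-1)^2)
   path=path+((path-1)%/%(n-1))
   i=1
   while (min(grid)==0){
     grid[path[i]+neigh]=grid[path[i]+neigh]+1
     i=i+1}
   i=i-1
   #step 2: greedy removal of superfluous tiles
   off=rep(TRUE,i)
   for (k in sample(1:i)){
     krid=grid
     krid[path[k]+neigh]=krid[path[k]+neigh]-1
     if (min(krid)>0){off[k]=FALSE;grid=krid}}
   #step 3: record the largest count of remaining tiles
   best=max(best,sum(off))}
 return(best)}

For instance, maxtile(8,1e4) should approach the value M=25 reported above; the essentiality check can be skipped inside the wrapper, since a tile kept by the greedy pass cannot become removable after further removals.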

R typos

Posted in Books, Kids, R, Statistics, Travel, University life on January 27, 2016 by xi'an

At MCMskv, Alexander Ly (from Amsterdam) pointed out to me some R programming mistakes I made in the introduction to Metropolis-Hastings algorithms I wrote a few months ago for the Wiley on-line encyclopedia! While the outcome (Monte Carlo posterior) of the corrected version is only moderately changed, this is nonetheless embarrassing! The example (if not the R code) was a mixture of Poisson and Geometric distributions borrowed from our testing as mixture paper. Among other things, I used a flat prior on the mixture weights instead of a Beta(1/2,1/2) prior, and a simple log-normal random walk on the mean parameter instead of the more elaborate second-order expansion discussed in the text. And I also inverted the probabilities of success and failure for the Geometric density. The new version is now available on arXiv, and hopefully soon on the Wiley site, but one (the?) fact worth mentioning here is that the (right) corrections in the R code first led to overflows, because I was using the Beta random walk Be(εp,ε(1-p)), whose major drawback I discussed here a few months ago: nearly-zero or nearly-one values of the weight parameter produced infinite values of the density… Adding 1 (or 1/2) to each parameter of the Beta proposal solved the problem, and led to a posterior on the weight still concentrating on the correct corner of the unit interval. In any case, a big thank you to Alexander for testing the R code and spotting the several mistakes!
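To illustrate the issue [with toy values of my own, not the encyclopedia code]: when the current weight p sits near zero, the proposal Be(εp,ε(1-p)) numerically degenerates to the boundary of the unit interval, where the Beta density is unbounded, while shifting both parameters by one keeps the draw inside (0,1):

eps=1e2
p=1e-6 #current weight stuck near zero
prop=rbeta(1,eps*p,eps*(1-p)) #typically underflows to exactly 0
dbeta(prop,eps*p,eps*(1-p),log=TRUE) #infinite log-density, breaking the MH ratio
prop=rbeta(1,1+eps*p,1+eps*(1-p)) #shifted proposal stays inside (0,1)
dbeta(prop,1+eps*p,1+eps*(1-p),log=TRUE) #finite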

Le Monde puzzle [#945]

Posted in Books, Kids, pictures, Statistics, Travel, University life on January 25, 2016 by xi'an

A rather different Le Monde mathematical puzzle:

A two-person game is played on an n×n grid initially filled with zeros. Each player picks an entry, which is then increased by one, along with all adjacent entries. The game stops when all entries are equal. For n=3, 4, 5, what are the possible grids with identical values all over?

If I define an R neighbourhood function

 nighbo=function(i,n){ #neighbourhood of entry i in an n by n grid
   neigh=i
   if (i%%n!=1) neigh=c(i-1,neigh) #entry above, unless on the first row
   if (i%%n>0) neigh=c(i+1,neigh) #entry below, unless on the last row
   if (i+n<=n*n) neigh=c(i+n,neigh) #entry to the right, unless on the last column
   if (i-n>0) neigh=c(i-n,neigh) #entry to the left, unless on the first column
   return(neigh)}

and try a brute force filling of the grid

grid=matrix(0,n,n)
while ((min(grid)==0)||(length(unique(grid))>1)){
  ent=sample(1:(n*n),1,prob=1/as.vector(grid+1)^10)
  grid[nighbo(ent,n)]=grid[nighbo(ent,n)]+1}

the loop never stops. When thinking of the case n=3 [while running in the early morning], I wondered whether reaching an equal value on all entries was at all possible. Indeed, it is impossible to update one of the four corners without updating at least one of its neighbours, while the converse is false. Experimenting further with simulated annealing to optimise the probabilities of picking different entries in the table when n=4,5 seems to indicate this is also the case for larger values of n, in that all attempts lead to larger values for the neighbours of the four corners [a quick linear-algebra check of the n=3 case follows the code below]:

outer=c(1,n,n*n,n*n-n+1) #the four corners
border=sort(unique(c(2:(n-1),(n*n-n+2):(n*n-1),1+n*(1:(n-2)),n+n*(1:(n-2))))) #edge entries
inner=(1:(n*n))[-c(outer,border)] #interior entries
#
target=function(weit){ #average fill-up under class sampling weights weit
 grid=matrix(0,n,n)
 for (t in 1:1e4){
   cas=sample(1:3,1,prob=weit) #pick the corner, border, or interior class
   if (cas==1) ent=sample(outer,1,prob=max(grid[outer])-grid[outer]+1)
   if (cas==2) ent=sample(border,1,prob=max(grid[border])-grid[border]+1)
   if (cas==3) ent=sample(inner,1,prob=max(grid[inner])-grid[inner]+1)
   ent=nighbo(ent,n)
   grid[ent]=grid[ent]+1}
 ave=c(mean(grid[outer]),mean(grid[border]),mean(grid[inner]))
 return(list(dive=max(diff(sort(ave))),ave=ave,ava=mean(grid)))
 }
#
weit=rep(1,3)
cur=target(weit)
T=100
while (cur$dive>0){
 ind=sample(1:3,1,prob=1/cur$ave)
 peit=weit
 peit[ind]=weit[ind]-(cur$ave[ind]-cur$ava)*runif(1)/(cur$ava)
 while (peit[ind]<0) #resample until the perturbed weight is positive
   peit[ind]=weit[ind]-(cur$ave[ind]-cur$ava)*runif(1)/(cur$ava)
 prop=target(peit)
 if (log(runif(1))*1e4/T<cur$dive-prop$dive){ #always accept improvements
   weit=peit;cur=prop}
 T=T*1.00001 #increase the inverse temperature
 print(cur$ave)}
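Coming back to the n=3 case, here is a quick linear-algebra check of my own [not part of the original puzzle or its solution]: each pick adds a fixed 0-1 vector to the grid, so reaching a constant grid c·1 requires a nonnegative solution x to the linear system Ax=c·1, with column i of A indicating the entries incremented by picking entry i:

n=3
A=matrix(0,n*n,n*n)
for (i in 1:(n*n)) A[nighbo(i,n),i]=1 #column i = entries incremented by pick i
solve(A,rep(1,n*n)) #unique solution, whose central entry is -1/7

Since A is invertible for n=3 and the unique solution carries a negative central coefficient, no nonnegative combination of picks can ever produce a constant grid, confirming the impossibility in this case.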

done! [#2]

Posted in Kids, Statistics, University life on January 21, 2016 by xi'an

Phew! I just finished my enormous pile of homework for the computational statistics course… This massive pile is due to an unexpected number of students registering for the Data Science Master at ENSAE and Paris-Dauphine. As I was not aware of this surge, I kept to my practice of asking students to hand back solved exercises from Monte Carlo Statistical Methods at the beginning of each class. And could not change the rules of the game once the course had started! Next year, I’ll make sure to get some backup for grading those exercises. Or go for group projects instead…

Le Monde puzzle [#944]

Posted in Books, Kids, pictures, Statistics, Travel, University life on January 20, 2016 by xi'an

A completely dull Le Monde mathematical puzzle:

Find all integer pairs (a,b) less than 10⁶ such that a-b=2015 and ab is a perfect square.

If I write the only condition as the function

perfect=function(c){ #is c a perfect square?
  return(c==trunc(sqrt(c))^2)}

and brute-force check all possible solutions

for (A in as.numeric(2015:1e6)) #doubles, as A*(A-2015) overflows R integers
 if (perfect(A*(A-2015))) print(A)

which produced

[1] 2015
[1] 2304
[1] 2420
[1] 2511
[1] 3627
[1] 4212
[1] 7056
[1] 7595
[1] 16640
[1] 33759
[1] 41616
[1] 79092
[1] 204020

as the 13 possible answers. If one checks between 10⁶ and 5×10⁶, the only remaining solution is 1,016,064, which is the highest possible one according to Le Monde.
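A vectorised version of the search [my own variant] covers the extended range in a single call, with the same conversion to doubles to avoid integer overflow:

A=as.numeric(2015:5e6)
A[perfect(A*(A-2015))] #returns the 13 values above plus 1016064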

mixtures are slices of an orange

Posted in Kids, R, Statistics on January 11, 2016 by xi'an

After presenting this work in both London and Lenzerheide, Kaniav Kamary, Kate Lee and I arXived and submitted our paper on a new parametrisation of location-scale mixtures. Although it took a long while to finalise the paper, given that we came up with the original and central idea about a year ago, I remain quite excited by this new representation of mixtures, because the use of a global location-scale (hyper-)parameter doubling as the mean and standard deviation of the mixture itself implies that all the other parameters of this mixture model [besides the weights] belong to the intersection of a unit hypersphere with a hyperplane. [Hence the title above, which I regretted not using for the poster at MCMskv!]

This realisation that using a (meaningful) hyperparameter (μ,σ) leads to a compact parameter space for the component parameters is important for inference in such mixture models, in that the hyperparameter (μ,σ) is easily estimated from the entire sample, while the other parameters can be studied using a non-informative prior like the Uniform prior on the ensuing compact space. This non-informative prior for mixtures is something I have been seeking for many years, hence my on-going excitement! In the mid-1990s, Kerrie Mengersen and I looked at a Russian-doll type parametrisation that used the “first” component as defining the location-scale reference for the entire mixture, expressing each new component as a local perturbation of the previous one. While this is a similar idea to the current one, it falls short of leading to a natural non-informative prior, forcing us to devise a proper prior on the variance that was a mixture of a Uniform U(0,1) and of an inverse Uniform 1/U(0,1), because of the lack of compactness of the parameter space. Here, fixing both mean and variance (or even just the variance) binds the mixture parameters to an ellipse conditional on the weights, a space that can be turned into the unit sphere via a natural reparameterisation. Furthermore, the intersection with the hyperplane leads to a closed-form spherical reparameterisation. Yay!
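As a toy illustration of this compactness [my own sketch, not the code behind the paper], imposing a zero global mean and a unit global variance on a k-component mixture with fixed weights w means Σwᵢμᵢ=0 and Σwᵢ(μᵢ²+σᵢ²)=1, that is, the rescaled vector (√wᵢμᵢ,√wᵢσᵢ) lies on the intersection of the unit sphere with a hyperplane, from which one can simulate by projection:

k=3
w=c(.5,.3,.2) #arbitrary fixed weights
x=rnorm(2*k) #random direction in the rescaled (mean,sd) space
x[1:k]=x[1:k]-sqrt(w)*sum(sqrt(w)*x[1:k]) #project onto the zero-mean hyperplane
x=x/sqrt(sum(x^2)) #project onto the unit sphere
mu=x[1:k]/sqrt(w)
sig=abs(x[(k+1):(2*k)])/sqrt(w)
c(sum(w*mu),sum(w*(mu^2+sig^2))) #returns 0 and 1 (up to rounding)

The actual global mean and variance then reappear through an exact location-scale transform of this standardised mixture.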

While I do not wish to get into the debate about the [non-]existence of “non-informative” priors at this stage, I think being able to use the invariant reference prior π(μ,σ)=1/σ is quite neat here, because the inference on the mixture parameters should be location and scale equivariant. The choice of the prior on the remaining parameters is of lesser importance, the Uniform over the compact space being one example, although we did not study this impact in depth, being satisfied with the outputs produced from the default (Uniform) choice.

From a computational perspective, the new parametrisation can be easily turned into the old parametrisation, hence leads to a closed-form likelihood. This implies a Metropolis-within-Gibbs strategy can be easily implemented, as we did in the derived Ultimixt R package. (I was not involved in the programming, solely suggesting the name Ultimixt, from “ultimate mixture parametrisation”, a former title that we eventually dropped for the paper.)

Discussing the paper at MCMskv was very helpful, in that I got very positive feedback about the approach and superior arguments to justify it and its appeal, and was led to think about several extensions outside location-scale families, if not in higher dimensions, which remain a practical challenge (in the sense of designing a parametrisation of the covariance matrices in terms of the global covariance matrix).

MCMskv #2 [ridge with a view]

Posted in Mountains, pictures, R, Statistics, Travel, University life on January 7, 2016 by xi'an

Tuesday at MCMskv was a rather tense day for me, from having to plan the whole day “away from home” [8km away], to the mundane worry of renting ski equipment and getting to the ski runs over the noon break, to giving a poster on our new mixture paper with Kaniav Kamary and Kate Lee, as Kaniav could not get a visa in time. It actually worked out quite nicely, with almost Swiss efficiency. After Michael Jordan’s talk, I attended a Bayesian molecular biology session with an impressive talk by Jukka Corander on evolutionary genomics with novel ABC aspects. And then a Hamiltonian Monte Carlo session with two deep talks by Sam Livingstone and Elena Akhmatskaya on the convergence of HMC, followed by an amazing entry into Bayesian cosmology by Jens Jasche (with the slight drawback that the MCMC simulations took about a calendar year, handling over 10⁷ parameters). I finished the day with more “classical” MCMC convergence results and techniques, with talks about forgetting times, stopping times (an undervalued alternative to convergence controls), and CLTs, including a multivariate ESS by James Flegal. (This choice of sessions was uniformly frustrating, as I was equally interested in “the other” session. The drawback of running parallel sessions, obviously.)

The poster session was busy and animated, but alas I could not get an idea of the other posters as I was presenting mine. This was quite exciting, as I discussed a new parametrisation for location-scale mixture models that allows for a rather straightforward “non-informative” or reference prior. (The paper with Kaniav Kamary and Kate Lee should be arXived overnight!) The recently deposited CRAN package Ultimixt, by Kaniav and Kate, contains Metropolis-Hastings functions related to this new approach. The result is quite exciting, especially because I have been looking for it for decades, and I will discuss it pretty soon in another post. I also had great exchanges with the conference participants, which led me to consider the reparametrisation at a larger scale and to simplify the presentation of the approach, turning the global mean and variance into hyperparameters.

The day was also most auspicious for a ski break, as it was very mild and sunny, while the snow conditions were (somewhat) better than the ones we had in the French Alps two weeks ago. (Too bad that the Tweedie ski race had to be cancelled for lack of snow on the reserved run! The Blossom ski reward will again have to be randomly allocated!) Just not exciting enough to consider another afternoon out, given the tension in getting there and back. (And especially when considering that it took me the entire break time to arXive our mixture paper…)
