Archive for the Statistics Category

accelerating MCMC

Posted in Statistics with tags , , , , , , , , , , , , on May 29, 2017 by xi'an

I have recently [well, not so recently!] been asked to write a review paper on ways of accelerating MCMC algorithms for the [review] journal WIREs Computational Statistics and would welcome all suggestions towards the goal of accelerating MCMC algorithms. Besides [and including more on]

  • coupling strategies using different kernels and switching between them;
  • tempering strategies using flatter or lower dimensional targets as intermediary steps, e.g., à la Neal;
  • sequential Monte Carlo with particle systems targeting again flatter or lower dimensional targets and adapting proposals to this effect;
  • Hamiltonian MCMC, again with connections to Radford (and more generally ways of avoiding rejections);
  • adaptive MCMC, obviously;
  • Rao-Blackwellisation, just as obviously (in the sense that increasing the precision in the resulting estimates means less simulations).

plenary talks at JSM 2017 in Baltimore

Posted in Statistics with tags , , , , , , , , , , on May 25, 2017 by xi'an

ARS: when to update?

Posted in Books, Kids, Statistics, University life with tags , , , , , on May 25, 2017 by xi'an

An email I got today from Heng Zhou wondered about the validity of the above form of the ARS algorithm. As printed in our book Monte Carlo Statistical Methods. The worry is that in the original version of the algorithm the envelope of the log-concave target f(.) is only updated for rejected values. My reply to the question is that there is no difference in the versions towards returning a value simulated from f, since changing the envelope between simulations does not modify the accept-reject nature of the algorithm. There is no issue of dependence between the simulations of this adaptive accept-reject method, all simulations remain independent. The question is rather one about efficiency, namely does it pay to update the envelope(s) when accepting a new value and I think it does because the costly part is the computation of f(x), rather than the call to the piecewise-exponential envelope. Correct me if I am wrong!

the long way to a small angry planet [book review]

Posted in Statistics with tags , , , , , , , , , , on May 21, 2017 by xi'an

When leaving London last week, I went through the (very nice) bookstore in St Pancras International and saw this book by Becky Chambers. And bought it as I had read nice criticisms and liked both the title and the cover. I have been reading it at every free minute since then and eventually finished it last night. It is a very enjoyable novel, very homey despite it taking place mostly in interstellar space, as it goes through the personal stories of the members of a tunneller crew (tunnels meaning shortcuts between distant points in space, the astrophysics being a bit vague on how those are possible!). It is far from a masterpiece but the succession of scenes and characters is enjoyable enough to be enjoyable, with a final twist of a larger magnitude. Nothing profoundly innovative like Ancillary Justice [except for the openness about interspecies sex, this could have been written in the 50’s] or era-defining like Ender’s Game, or The Road, but a pleasant read by all means!

continental divide

Posted in Books, Kids, pictures, R with tags , , , , on May 19, 2017 by xi'an

While the Riddler puzzle this week was anticlimactic,  as it meant filling all digits in the above division towards a null remainder, it came as an interesting illustration of how different division is taught in the US versus France: when I saw the picture above, I had to go and check an American primary school on-line introduction to division, since the way I was taught in France is something like that

with the solution being that 12128316 = 124 x 97809… Solved by a dumb R exploration of all constraints:

for (y in 111:143)
for (z4 in 8:9)
for (oz in 0:999){
  z=oz+7e3+z4*1e4
  x=y*z
  digx=digits(x)
  digz=digits(z)
  if ((digz[2]==0)&(x>=1e7)&(x<1e8)){ 
   r1=trunc(x/1e4)-digz[5]*y 
   if ((digz[5]*y>=1e3)&(digz[4]*y<1e4) &(r1>9)&(r1<100)){ 
    r2=10*r1+digx[4]-7*y 
    if ((7*y>=1e2)&(7*y<1e3)&(r2>=1e2)&(r2<1e3)){     
     r3=10*r2+digx[3]-digz[3]*y 
     if ((digz[3]*y>=1e2)&(digz[3]*y<1e3)&(r3>9)&(r3<1e2)){
       r4=10*r3+digx[2]
       if (r4<y) solz=rbind(solz,c(y,z,x))
  }}}}

Looking for a computer-free resolution, the constraints on z exhibited by the picture are that (a) the second digit is 0 and the fourth digit is 7.  Moreover, the first and fifth digits are larger than 7 since y times these digits is a four-digit number. Better, since the second subtraction from a three-digit number by 7y returns a three-digit number and the third subtraction from a four-digit number by ny returns a two-digit number, n is larger than 7 but less than the first and fifth digits. Ergo, z is necessarily 97809! Furthermore, 8y<10³ and 9y≥10³, which means 111<y<125. Plus the constraint that 1000-8y≤99 implies y≥112. Nothing gained there! This leaves 12 values of y to study, unless there is another restriction I missed…

peer community in evolutionary biology

Posted in Statistics with tags , , , , , , , on May 18, 2017 by xi'an

My friends (and co-authors) from Montpellier pointed out the existence of PCI Evolutionary Biology, which is a preprint and postprint validation forum [so far only] in the field of Evolutionary Biology. Authors of a preprint or of a published paper request a recommendation from the forum. If someone from the board finds the paper of interest, this person initiates a quick refereeing process with one or two referees and returns a review to the authors, with possible requests for modification, and if the lead reviewer is happy with the new version, the link to the paper and the reviews are published on PCI Evol Biol, which thus gives a stamp of validation to the contents in the paper. The paper can then be submitted for publication in any journal, as can be seen from the papers in the list.

This sounds like a great initiative and since PCI is calling for little brothers and sisters to PCI Evol Biol, I think we should try to build its equivalent in Statistics or maybe just Computational Statistics.

ABCπ

Posted in Books, pictures, Statistics, Travel, University life with tags , , , , on May 17, 2017 by xi'an

Ritabrata Dutta, Marcel Schöengens, Jukka-Pekka Onnela, and Antonietta Mira recently put a new ABC software on-line, called ABCpy for ABC with Python. The software aims at  an automated parallelisation of ABC runs, requiring only code to generate from the (generative) model and the choice of summary statistics and of associated distance. Alternatively an approximate likelihood (as in synthetic likelihood) can be used. The tolerance ε is chosen as a percentile of the prior predictive distribution on the distance. The versions of ABC found in ABCpy are

  1. Population Monte Carlo for ABC (PMCABC);
  2. sequential Monte Carlo ABC (ABC-SMC);
  3. replenishment Sequential Monte Carlo ABC (RSMC-ABC);
  4. adaptive Population Monte Carlo ABC (APMCABC);
  5. ABC with subset simulation (ABCsubsim); and
  6. simulated annealing ABC (SABC)

Anto mentioned ABCpy to me while in Harvard last week and I have not tested the program (my only brush with Python being the occasional call to latex2wp for SeriesB’log). And obviously, writing a blog about Monte (Carlo and) Python makes a link to the Monty Pythons irresistible: