an extension of nested sampling

Posted in Books, Statistics, University life on December 16, 2014 by xi'an

I was reading [in the Paris métro] Hastings-Metropolis algorithm on Markov chains for small-probability estimation, arXived a few weeks ago by François Bachoc, Lionel Lenôtre, and Achref Bachouch, when I came upon their first algorithm, which reminded me very much of nested sampling. The following scheme was proposed by Guyader et al. in 2011:

To approximate a tail probability P(H(X)>h),

  • start from an iid sample of size N from the reference distribution;
  • at each iteration m, select the point x with the smallest H(x)=ξ and replace it with a new point y simulated under the constraint H(y)≥ξ;
  • stop when all points in the sample are such that H(x)>h;
  • take

\left(1-\dfrac{1}{N}\right)^{m-1}

as the unbiased estimator of P(H(X)>h).

Hence, except for the stopping rule, this is the same implementation as nested sampling. Furthermore, Guyader et al. (2011) also take advantage of the nested sampling fact that, if direct simulation under the constraint H(y)≥ξ is infeasible, simulating via a single step of a Metropolis-Hastings algorithm is as valid as direct simulation. (I could not access the paper, but the reference list of Guyader et al. (2011) includes both original papers by John Skilling, so the connection must be made in the paper.) What I find most interesting in this algorithm is that it achieves unbiasedness (even in the MCMC case!).
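For concreteness, here is a minimal Python sketch of the scheme described above. The function name, the helper `constrained_mh_step` (one Metropolis-Hastings move targeting the reference distribution restricted to {H(y)≥ξ}), the restart from a randomly chosen surviving particle, and the `n_mh` parameter are all illustrative assumptions of mine rather than the exact implementation of Guyader et al. (2011):

```python
import numpy as np

def last_particle_tail_prob(x0, H, h, constrained_mh_step, n_mh=1, rng=None):
    """Sketch of the last-particle estimator of P(H(X) > h).

    x0                  : (N, d) array of iid draws from the reference distribution
    H                   : callable mapping one point to a scalar score
    h                   : threshold defining the tail event
    constrained_mh_step : hypothetical helper, one MH move under H(y) >= xi
    """
    rng = np.random.default_rng(rng)
    x = np.array(x0, dtype=float)
    N = x.shape[0]
    scores = np.array([H(pt) for pt in x])
    n_replaced = 0
    while scores.min() <= h:                    # stop once every point has H(x) > h
        i = int(scores.argmin())                # point with the smallest score, xi = H(x_i)
        xi = scores[i]
        # restart from another (surviving) particle and move it under the constraint
        j = rng.choice([k for k in range(N) if k != i])
        y = x[j].copy()
        for _ in range(n_mh):
            y = constrained_mh_step(y, xi)
        x[i], scores[i] = y, H(y)
        n_replaced += 1
    # if the loop exits at iteration m, then m - 1 = n_replaced replacements have
    # taken place, so this returns the (1 - 1/N)^{m-1} estimator quoted above
    return (1.0 - 1.0 / N) ** n_replaced
```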

NIPS 2014

Posted in Kids, pictures, Statistics, Travel, University life on December 15, 2014 by xi'an

Second and last day of the NIPS workshops! The collection of topics was quite broad and would have made my choosing an ordeal, except that I was invited to give a talk at the probabilistic programming workshop, solving my dilemma… The first talk by Kathleen Fisher was quite enjoyable in that it gave a conceptual discussion of the motivations for probabilistic languages, drawing an analogy with the early days of computer programming, which saw a separation between higher-level computer languages and machine programming, with a compiler interface. She then called for a similar separation between the models faced by statistical inference and machine learning and the corresponding code, if I understood her correctly. This was connected with Frank Wood’s talk of the previous day, where he illustrated the concept through the generation of computer code to approximately generate from standard distributions like the Normal or the Poisson. Approximately as in ABC, which is why the organisers invited me to talk in this session. However, I was a wee bit lost in the following talks and presumably lost part of my audience during my talk, as I realised later to my dismay when someone told me he had not perceived the distinction between the trees in the random forest procedure and the phylogenetic trees in the population genetic application. Still, while it had for me a sort of Twilight Zone feeling of having stepped into another dimension, attending this workshop was a worthwhile experiment, an eye-opener into a highly different albeit connected field, where code and simulator may take the place of a likelihood function… To the point of defining Hamiltonian Monte Carlo directly on the former, as Vikash Mansinghka showed me at the break.

I completed the day with the final talks in the variational inference workshop, if only to get back on firmer ground! Apart from attending my third talk by Vikash at the conference (this one on a completely different topic, variational approximations for discrete particle-ar distributions), a talk by Tim Salimans linked MCMC and variational approximations, using MCMC and HMC to derive variational bounds. (He did not expand on the opposite use of variational approximations to build better proposals.) Overall, I found these two days and my first NIPS conference quite exciting, if somewhat overpowering, with a different atmosphere and a different pace compared with (small or large) statistical meetings. (And a staggering gender imbalance!)

further up North

Posted in pictures, Travel, University life on December 14, 2014 by xi'an


up North

Posted in pictures, Travel, University life on December 14, 2014 by xi'an


ABC à Montréal

Posted in Kids, pictures, Running, Statistics, Travel, University life on December 13, 2014 by xi'an

So today was the NIPS 2014 workshop, “ABC in Montréal“, which started with a fantastic talk by Juliane Liepe on some exciting applications of ABC to the migration of immune cells, with the analysis of movies involving those cells acting to heal a damaged fly wing and a cut fish tail. Quite amazing videos, really. (With the great entry line of ‘We have all cut a finger at some point in our lives’!) The statistical model behind those movies was a random walk on a grid, with different drift and bias features that served as model characteristics. Frank Wood managed to deliver his talk despite a severe case of food poisoning, with a great illustration of probabilistic programming that made me understand (at last!) the very idea of probabilistic programming. And Vikash Mansinghka presented some applications in image analysis. Those two talks led me to realise why probabilistic programming was so close to ABC, with a programming touch! Hence why I was invited to talk today! Then Dennis Prangle exposed his latest version of lazy ABC, which I have already discussed on the ‘Og, somewhat connected with our delayed acceptance algorithm, to the point that maybe something common can stem from the two notions. Michael Blum ended the day with provocative answers to the provocative question of Ted Meeds as to whether or not machine learning needed ABC (Ans. No!) and whether or not machine learning could help ABC (Ans. ???). With a happy mix-up between mechanistic and phenomenological models that helped generate discussion from the floor.

The posters were also of much interest, with calibration as a distance measure by Michael Guttman, in continuation of the poster he gave at MCMski, and Aaron Smith presenting his work with Luke Bornn, Natesh Pillai and Dawn Woodard on why a single pseudo-sample is enough for ABC efficiency. This gave me the opportunity to discuss with him the apparent contradiction with the result of Krys Łatuszyński and Anthony Lee that geometric convergence of ABC-MCMC is only attained with a random number of pseudo-samples… And to wonder whether there is a geometric versus binomial dilemma in this setting, namely whether or not simulating pseudo-samples until one is accepted would be more efficient than just running one and discarding it in case it is too far. So, although the audience was not that large (when compared with the other “ABC in…” workshops and when considering the 2500+ attendees at NIPS over the week!), it was a great day where I learned a lot, did not doze during talks (!), [and even had an epiphany of sorts on the treadmill when I realised I just had to take longer steps to reach 16km/h without hyperventilating!] So thanks to my fellow organisers, Neil D Lawrence, Ted Meeds, Max Welling, and Richard Wilkinson for setting the program of that day! And, by the way, where’s the next “ABC in…”?! (Finland, maybe?)

broken homes [book review]

Posted in Books, pictures, Travel on December 13, 2014 by xi'an

Even though this is the fourth volume in the Peter Grant series, I did read it first [due to my leaving volume one in my office in Coventry and coming across this one in an airport bookstore in Düsseldorf], an experiment I do not advise anyone to repeat as it kills some of the magic in Rivers of London [renamed Midnight Riots on the US market, for an incomprehensible reason!, with the series still being called Rivers of London, but at least they left the genuine and perfect covers…, not like some of the other foreign editions!] and makes reading Broken homes an exercise in guessing. [Note for ‘Og’s readers suffering from Peter Grant fatigue: the next instalment, taking the seemingly compulsory trip Outside!—witness the Bartholomew series—, is waiting for me in Warwick, so I will not read it before the end of January!]

“I nodded sagely. `You’re right,’ I said. `We need a control.’
`Seriously?’she asked.
`Otherwise, how do you know the variable you’ve changed is the one having the effect?’ I said.”

Now, despite this inauspicious entry, I did enjoy Broken homes as much [almost!] as the other volumes in the series. It mostly takes place in a less familiar [for a French tourist like me] part of London, but remains nonetheless true to its spirit of depicting London as a living organism! Most of the characters return from the earlier novels, but the core of the story is an infamous housing estate built by a mad architect in Elephant and Castle, not that far from Waterloo [Station], yet sounding almost like a suburb in Aaronovitch’s depiction! Actually, the author has added a Google map of the novel’s locations on his blog; I wish I had had it at the time [kind of difficult to get in a plane!].

“Search as I might, nobody else was offering free [wifi] connections to the good people of Elephant and Castle.”

The plot itself is centred on this estate [not really a spoiler, is it?] and the end is outstanding in that it is nothing like one would expect. With or without reading the other volumes. I still had trouble understanding the grand scheme of the main villain, while I have now entirely forgotten about the reasons for the crime scene at the very beginning of Broken homes. Rereading the pages where the driver, Robert Weil, appears did not help. What was his part in the story?! Despite this [maybe entirely personal] gap, the story holds well together, somewhat cemented by the characters populating the estate, who are endowed with enough depth to make them truly part of the story, even when they last only a few pages [spoiler!]. And as usual style and grammar and humour are at their best!

optimal mixture weights in multiple importance sampling

Posted in Statistics, University life on December 12, 2014 by xi'an

Multiple importance sampling is back!!! I am always interested in this improvement upon regular importance sampling, even or especially after publishing a recent paper about our AMIS (adaptive multiple importance sampling) algorithm, so I was quite eager to see what was in Hera He and Art Owen's newly arXived paper. The paper is definitely exciting and set me off on a new series of importance sampling improvements and experiments…

Some of the most interesting developments in the paper are that (i) when using a collection of importance functions qi with the same target p, every ratio qi/p is a control variate function with expectation 1 [assuming each of the qi‘s has a support smaller than the support of p]; (ii) the weights of a mixture of the qi‘s can be chosen in an optimal way towards minimising the variance for a certain integrand; (iii) multiple importance sampling quite naturally incorporates stratified sampling, i.e. the qi‘s may have disjoint supports; (iv) control variates contribute little, especially when compared with the optimisation over the weights [which does not surprise me that much, given that the control variates have little correlation with the integrands]; and (v) Veach’s (1997) seminal PhD thesis remains a driving force behind those results [and in getting Eric Veach an Academy Oscar in 2014!].
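To make points (i)–(iii) concrete, here is a rough Python sketch of mixture importance sampling with the component densities, scaled by the mixture, used as control variates (their expectation under the mixture equals 1, a standard variant of the control variates mentioned in (i)). The function names, the crude least-squares fit of the control-variate coefficients, and the assumption that p and the qi's are normalised are all mine; this is not He and Owen's exact estimator:

```python
import numpy as np

def mis_with_control_variates(f, logp, logqs, alphas, samplers, n=10_000, rng=None):
    """Sketch: estimate E_p[f(X)] by sampling from the mixture
    q_alpha = sum_j alphas[j] * q_j, using the ratios q_j/q_alpha as
    control variates (expectation 1 under q_alpha).

    Assumes p and the q_j are normalised; logp and logqs[j] return log
    densities on an array of points, samplers[j](m) draws m points from q_j.
    """
    rng = np.random.default_rng(rng)
    alphas = np.asarray(alphas, dtype=float)
    J = len(alphas)
    counts = rng.multinomial(n, alphas)                    # stratify by mixture component
    xs = np.concatenate([samplers[j](counts[j]) for j in range(J)])
    logq_mat = np.stack([logqs[j](xs) for j in range(J)])  # (J, n) component log densities
    logq_alpha = np.logaddexp.reduce(np.log(alphas)[:, None] + logq_mat, axis=0)
    base = f(xs) * np.exp(logp(xs) - logq_alpha)           # plain mixture-IS terms
    h = np.exp(logq_mat - logq_alpha) - 1.0                # control variates, mean 0 under q_alpha
    # crude least-squares fit of the control-variate coefficients on the same sample
    beta, *_ = np.linalg.lstsq(h.T, base - base.mean(), rcond=None)
    return float(np.mean(base - beta @ h))
```

Point (ii) would then amount to optimising the alphas for a given integrand, which this sketch deliberately leaves out.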

One extension that I would find of the utmost interest deals with unscaled densities, both for p and the qi‘s. In that case, the weights do not even sum up to a known value and I wonder how much more difficult it is to analyse this realistic case. And unscaled densities led me to imagine using geometric mixtures instead. Or even harmonic mixtures! (Maybe not.)
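For the unscaled-target half of that question, the usual fallback is a self-normalised version of the estimator sketched above. The sketch below is again my own illustration, not taken from the paper: it only handles an unnormalised p and still assumes the qi's themselves are normalised, which is precisely the easy part of the problem.

```python
import numpy as np

def snis_mixture(f, logp_unnorm, logqs, alphas, samplers, n=10_000, rng=None):
    """Self-normalised mixture IS estimate of E_p[f] when p is known only up
    to a constant; the components q_j are still assumed normalised."""
    rng = np.random.default_rng(rng)
    alphas = np.asarray(alphas, dtype=float)
    counts = rng.multinomial(n, alphas)
    xs = np.concatenate([samplers[j](counts[j]) for j in range(len(alphas))])
    logq_mat = np.stack([lq(xs) for lq in logqs])
    logq_alpha = np.logaddexp.reduce(np.log(alphas)[:, None] + logq_mat, axis=0)
    w = np.exp(logp_unnorm(xs) - logq_alpha)      # unnormalised importance weights
    return float(np.sum(w * f(xs)) / np.sum(w))   # consistent, but no longer unbiased
```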

Another one is more in tune with our adaptive multiple mixture paper. The paper works with regret, but one could also work with remorse! Besides the pun, this means that one could adapt the weights along iterations and even possibly design new importance functions from past outcomes, i.e., be adaptive once again. He and Owen suggest mixing their approach with our adaptive sequential Monte Carlo model.
