Archive for poster
So today was the NIPS 2014 workshop, “ABC in Montréal“, which started with a fantastic talk by Juliane Liepe on some exciting applications of ABC to the migration of immune cells, with the analysis of movies involving those cells acting to heal a damaged fly wing and a cut fish tail. Quite amazing videos, really. (With the great entry line of ‘We have all cut a finger at some point in our lives’!) The statistical model behind those movies was a random walk on a grid, with different drift and bias features that served as model characteristics. Frank Wood managed to deliver his talk despite a severe case of food poisoning, with a great illustration of probabilistic programming that made me understand (at last!) the very idea of probabilistic programming. And Vikash Mansinghka presented some applications in image analysis. Those two talks led me to realise why probabilistic programming was so close to ABC, with a programming touch! Hence why I was invited to talk today! Then Dennis Prangle exposed his latest version of lazy ABC, that I have already commented on the ‘Og, somewhat connected with our delayed acceptance algorithm, to the point that maybe something common can stem out of the two notions. Michael Blum ended the day with provocative answers to the provocative question of Ted Meeds as to whether or not machine learning needed ABC (Ans. No!) and whether or not machine learning could help ABC (Ans. ???). With an happily mix-up between mechanistic and phenomenological models that helped generating discussion from the floor.
The posters were also of much interest, with calibration as a distance measure by Michael Guttman, in continuation of the poster he gave at MCMski, Aaron Smith presenting his work with Luke Bornn, Natesh Pillai and Dawn Woodard, on why a single pseudo-sample is enough for ABC efficiency. This gave me the opportunity to discuss with him the apparent contradiction with the result of Kryz Łatunsziński and Anthony Lee about the geometric convergence of ABC-MCMC only attained with a random number of pseudo-samples… And to wonder if there is a geometric versus binomial dilemma in this setting, Namely, whether or not simulating pseudo-samples until one is accepted would be more efficient than just running one and discarding it in case it is too far. So, although the audience was not that large (when compared with the other “ABC in…” and when considering the 2500+ attendees at NIPS over the week!), it was a great day where I learned a lot, did not have a doze during talks (!), [and even had an epiphany of sorts at the treadmill when I realised I just had to take longer steps to reach 16km/h without hyperventilating!] So thanks to my fellow organisers, Neil D Lawrence, Ted Meeds, Max Welling, and Richard Wilkinson for setting the program of that day! And, by the way, where’s the next “ABC in…”?! (Finland, maybe?)
The first full day of talks at ISBA 2014, Cancún, was full of goodies, from the three early talks on specifically developed software, including one by Daniel Lee on STAN that completed the one given by Bob Carpenter a few weeks ago in Paris (which gives me the opportunity to advertise STAN tee-shirts!). To the poster session (which just started a wee bit late for my conference sleep pattern!). Sylvia Richardson gave an impressive lecture full of information on Bayesian genomics. I also enjoyed very much two sessions with young Bayesian statisticians, one on Bayesian econometrics and the other one more diverse and sponsored by ISBA. Overall, and this also applies to the programme of the following days, I found that the proportion of non-parametric talks was quite high this year, possibly signalling a switch in the community and the interest of Bayesians. And conversely very few talks on computing related issues. (With most scheduled after my early departure…)
In the first of those sessions, Brendan Kline talked about partially identified parameters, a topic quite close to my interests, although I did not buy the overall modelling adopted in the analysis. For instance, Brendan Kline presented the example of a parameter θ that is the expectation of a random variable Y which is indirectly observed through x <Y< x̅ . While he maintained that inference should be restricted to an interval around θ and that using a prior on θ was doomed to fail (and against econometrics culture), I would have prefered to see this example as a missing data one, with both x and x̅ containing information about θ. And somewhat object to the argument against the prior as it would equally apply to any prior modelling. Although unrelated in the themes, Angela Bitto presented a work on the impact of different prior modellings on the estimation of time-varying parameters in time-series models. À la Harrison and West 1994 Discriminating between good and poor shrinkage in a way I could not spot. Unless it was based on the data fit (horror!). And a third talk of interest by Andriy Norets that (very loosely) related to Angela’s talk by presenting a framework to modify credible sets towards frequentist properties: one example was the credible interval on a positive normal mean that led to a frequency-valid confidence interval with a modified prior. This reminded me very much of the shrinkage confidence intervals of the James-Stein era.
I am now back home after five exciting and exhausting days in Park City, Utah! As reported earlier, the Adap’skiii meeting went on quite well, with high quality talks relating to edge research. I am thus completely committed to organise the next meeting in three years or so, whether or not MCMSki 4 ever takes place. I also found the ski resort where the meeting took place quite interesting, with plenty of mostly empty ski runs and top quality lodging [with the luxury of a fireplace]. (The downside was a type of runs I was not used to in the Alps, but this showed how far I had to improve in my skiing. And another major downside were the grossly overpriced commodities, because down-town Park City was too far to accommodate my jet-lagged schedule. Despite this lack of complete information, I am slightly bemused at Park City making into the top ten places to go in 2011 according to the New York TImes…) While the picture below, taken from my hotel room/flat, was selected as Shot of the Day by The Canyons, the above panorama picture was provided to me by Luke Bornn, who also gave a fairly interesting talk during the Young Investigators session.
Since Bayes factor approximation is one of my areas of interest, I was intrigued by Xiao-Li Meng’s comments during my poster in Benidorm that I was using the “wrong” bridge sampling estimator when trying to bridge two models of different dimensions, based on the completion (for and missing from the first model)
When revising the normal chapter of Bayesian Core, here in CiRM, I thus went back to Xiao-Li’s papers on the topic to try to fathom what the “true” bridge sampling was in that case. In Meng and Schilling (2002, JASA), I found the following indication, “when estimating the ratio of normalizing constants with different dimensions, a good strategy is to bridge each density with a good approximation of itself and then apply bridge sampling to estimate each normalizing constant separately. This is typically more effective than to artificially bridge the two original densities by augmenting the dimension of the lower one”. I was unsure of the technique this (somehow vague) indication pointed at until I understood that it meant introducing one artificial posterior distribution for each of the parameter spaces and processing each marginal likelihood as an integral ratio in itself. For instance, if is an arbitrary normalised density on , and is an arbitrary function, we have the bridge sampling identity on :
Therefore, the optimal choice of leads to the approximation
when and . More exactly, this approximation is replaced with an iterative version since it depends on the unknown . The choice of the density is obviously fundamental and it should be close to the true posterior to guarantee good convergence approximation. Using a normal approximation to the posterior distribution of or a non-parametric approximation based on a sample from , or yet again an average of MCMC proposals are reasonable choices.
The boxplot above compares this solution of Meng and Schilling (2002, JASA), called double (because two pseudo-posteriors and have to be introduced), with Chen, Shao and Ibragim (2001) solution based on a single completion (using a normal centred at the estimate of the missing parameter, and with variance the estimate from the simulation), when testing whether or not the mean of a normal model with unknown variance is zero. The variabilities are quite comparable in this admittedly overly simple case. Overall, the performances of both extensions are obviously highly dependent on the choice of the completion factors, and on the one hand and on the other hand, . The performances of the first solution, which bridges both models via , are bound to deteriorate as the dimension gap between those models increases. The impact of the dimension of the models is less keenly felt for the other solution, as the approximation remains local.
Just to signal readers that the program of the meeting(s) is now available. It is fairly impressive in its coverage of the ongoing research in Bayesian statistics and related fields, plus it has the very nice feature of completely avoiding parallel sessions, a reason why few contributed talks were accepted. And the less appealing feature of having poster sessions, a highlight of the Valencia meetings, starting at 10pm. Right after the [early Spanish] dinner at 9pm. (As in the earlier meeting in Teneriffe, I will have to find climbing partners for the 1pm-5pm break, even though this is not the best time for climbing…) José Bernardo also indicated that the early registration hotel prices were still in order.