## Archive for missing data

## more typos in Monte Carlo statistical methods

Posted in Books, Statistics, University life with tags capture-recapture, EM algorithm, frequentist inference, integer set, Jensen's inequality, missing data, Monte Carlo Statistical Methods, optimisation, typos, UNC on October 28, 2011 by xi'an

## Typo in Example 5.18

Posted in Books, R, Statistics, University life with tags EM algorithm, missing data, Monte Carlo Statistical Methods, typos on October 3, 2010 by xi'an

**E**dward Kao is engaged in a detailed parallel reading of *Monte Carlo Statistical Methods* and of *Introducing Monte Carlo Methods with R*. He has pointed out several typos in Example 5.18 of *Monte Carlo Statistical Methods*, which studies a missing-data phone plan model and its EM resolution. First, the customers in area i should be double-indexed, i.e. [equation], which implies in turn that [equation]. Then the summary **T** should be defined as [equation] and as [equation], given that the first m customers have the fifth plan missing.
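The original equation images did not survive in this archive, but the flavour of an EM step for missing categorical data can be sketched generically. This is an illustration, not the book's exact Example 5.18 model: it assumes the plan choices of some customers are missing completely at random, and `em_multinomial` and the counts below are invented for the sketch. The E-step allocates the missing customers to plans in proportion to the current probabilities; the M-step re-normalises the completed counts:

```python
import numpy as np

def em_multinomial(counts_obs, n_missing, n_iter=200):
    """EM for multinomial plan probabilities when n_missing customers'
    plan choices are unobserved (missing completely at random)."""
    counts_obs = np.asarray(counts_obs, dtype=float)
    K = len(counts_obs)
    p = np.full(K, 1.0 / K)  # start from uniform probabilities
    for _ in range(n_iter):
        # E-step: expected completed counts, allocating the missing
        # customers to plans in proportion to the current p
        expected = counts_obs + n_missing * p
        # M-step: maximise the completed-data likelihood, i.e.
        # re-normalise the expected counts
        p = expected / expected.sum()
    return p

p_hat = em_multinomial([10, 20, 30, 40], n_missing=50)
```

Under this missing-completely-at-random assumption the MLE is simply the observed proportions, so the iteration converges to (0.1, 0.2, 0.3, 0.4) here; the point of the sketch is only the mechanics of the E- and M-steps.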

## JSM 2010 [day 1]

Posted in R, Statistics, University life with tags ABC, auxiliary variable, Bayesian non-parametrics, cloud computing, GPU, JSM 2010, missing data, mixtures, multithreading, parallelisation, Vancouver on August 2, 2010 by xi'an

**T**he first day at **JSM** is always a bit sluggish, as people slowly drip in and get their bearings. As in Washington D.C. last year, the meeting takes place in a huge conference centre, so there is no feeling of overcrowding [so far]. It may also be that the peripheral and foreign location of the meeting put some regular attendees off (not to mention the high cost of living!).

**N**onetheless, the Sunday afternoon sessions started with a highly interesting *How Fast Can We Compute? How Fast Will We Compute?* session organised by Mike West and featuring Steve Scott, Marc Suchard and Quanli Wang. The topic was parallel processing, either via multiple processors or via GPUs, the latter relating to the exciting talk Chris Holmes gave at the Valencia meeting. Steve showed us some code to explain how feasible the jump to parallel programming is (a point also demonstrated by Julien Cornebise and Pierre Jacob after they returned from Valencia), while stressing that a lot of the processing in MCMC runs is open to parallelisation. For instance, data augmentation schemes can allocate the missing data in parallel in most problems, and likewise for likelihood computations over independent data. Marc Suchard focussed on GPUs and phylogenetic trees, both of high interest to me!, and he stressed the huge gains (of the order of hundred-fold decreases in computing time) made possible by exploiting laptop [MacBook] GPUs. (If I got his example correctly, he seemed to be doing an exact computation of the phylogeny likelihood, not an ABC approximation… which is quite interesting, if potentially killing one of my main areas of research!) Quanli Wang linked both previous talks with the example of mixtures with a huge number of components. Plenty of food for thought.
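The remark that likelihoods of independent data parallelise can be sketched in a few lines. This is a minimal illustration, not Steve Scott's actual code: the i.i.d. normal model, `chunk_loglik`, and the thread pool are all assumptions, and real speed-ups would come from process pools or GPUs rather than Python threads. Since the observations are independent, the full log-likelihood is just the sum of per-chunk log-likelihoods, so the chunks can be farmed out to workers:

```python
import math
from concurrent.futures import ThreadPoolExecutor

def chunk_loglik(data, mu=0.0, sigma=1.0):
    # log-likelihood of one chunk of i.i.d. normal observations
    return sum(-0.5 * math.log(2 * math.pi * sigma ** 2)
               - (x - mu) ** 2 / (2 * sigma ** 2) for x in data)

def parallel_loglik(data, n_workers=4):
    # split the data into chunks; by independence, the total
    # log-likelihood is the sum of the per-chunk log-likelihoods
    chunks = [data[i::n_workers] for i in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as ex:
        return sum(ex.map(chunk_loglik, chunks))
```

Because the decomposition is exact, `parallel_loglik(data)` agrees with `chunk_loglik(data)` up to floating-point summation order, whatever the number of workers.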

**I** completed the afternoon with the **Student Paper Competition: Bayesian Nonparametric and Semiparametric Methods** session, which was discouragingly short of participants, with two of the five speakers missing and fewer than twenty people in the room. (I did not get who was ranking the competition papers. Not the participants, apparently!)