Archive for Monte Carlo simulation and resampling methods for social science

parallel adaptive importance sampling

Posted in Statistics on August 30, 2016 by xi'an

Following Paul Russell’s talk at MCqMC 2016, I took a look at his recently arXived paper, in the plane to Sydney. The pseudo-code representation of the method is identical to our population Monte Carlo algorithm, as is the suggestion to approximate the posterior by a mixture, but one novel aspect is to use Reich’s ensemble transportation at the resampling stage, in order to maximise the correlation between the original and the resampled versions of the particle systems. (As in our later versions of PMC, the authors also use as importance denominator the entire mixture rather than conditioning on the selected last-step particle.)
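For readers unfamiliar with the mechanism, here is a minimal R sketch of one such adaptive importance sampling round on a toy bivariate Gaussian target, with the importance denominator taken as the whole mixture; this is my own illustration rather than the authors’ algorithm, it keeps the kernel scale fixed instead of adapting it, and it assumes the mvtnorm package for the Gaussian densities.

set.seed(1)
library(mvtnorm)                    # assumed available for dmvnorm/rmvnorm
d <- 2; N <- 1000; sig <- 0.5       # toy dimension, particle number, kernel scale
ltarget <- function(x) dmvnorm(x, sigma = diag(d), log = TRUE)   # N(0, I) target
mu <- rmvnorm(N, sigma = 4 * diag(d))                            # initial particle cloud
for (t in 1:5) {
  comp <- sample.int(N, N, replace = TRUE)                       # pick a mixture component per particle
  x <- mu[comp, ] + sig * matrix(rnorm(N * d), N)                # propose from that component
  lmix <- apply(x, 1, function(z)                                # denominator: the whole mixture,
    log(mean(dmvnorm(mu, mean = z, sigma = sig^2 * diag(d)))))   # not the selected component alone
  lw <- ltarget(x) - lmix                                        # mixture importance weights
  w <- exp(lw - max(lw)); w <- w / sum(w)
  mu <- x[sample.int(N, N, replace = TRUE, prob = w), ]          # (multinomial) resampling
}
colMeans(mu)                        # close to the target mean (0, 0)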

“The output of the resampling algorithm gives us a set of evenly weighted samples that we believe represents the target distribution well”

I disagree with this statement: resampling does not improve the quality of the posterior approximation, since it introduces extra variability. If the original sample is lacking in its adequacy to the target, so is the resampled one. Worse, by producing a sample with equal weights, this step may give a false impression of adequate representation…
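A toy R experiment makes the point (my own example, unrelated to the paper): estimating E[X] under a N(0,1) target from a N(0,4) proposal, the resampled version of the self-normalised importance sampling estimate exhibits a larger variance than the weighted one.

set.seed(2)
est <- replicate(1e4, {
  x <- rnorm(1e3, sd = 2)                         # proposal sample, N(0, 2²)
  w <- dnorm(x) / dnorm(x, sd = 2)                # importance weights for a N(0,1) target
  w <- w / sum(w)
  c(weighted  = sum(w * x),                       # self-normalised IS estimate of E[X]
    resampled = mean(sample(x, 1e3, replace = TRUE, prob = w)))
})
apply(est, 1, var)                                # the resampled estimator shows the larger variance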

Another unclear point in the paper relates to tuning the parameters of the mixture importance sampler. The paper discusses tuning these parameters during a burn-in stage, "due to the constraints on adaptive MCMC algorithms", a concern that is only pertinent for MCMC algorithms, since an importance sampler can be constantly modified while remaining valid. This was a major point for advocating PMC. I am thus unsure what the authors mean by a burn-in period in such a context. Actually, I am also unsure as to how they use the effective sample size to select the new value of the importance parameter, e.g., the variance β in a random walk mixture: the effective sample size involves this variance implicitly through the realised sample, hence changing β means changing the realised sample… This seems too costly to contemplate, so I wonder at the way Figure 4.2 is produced.
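For concreteness, here is how I would compute an effective sample size for a candidate β in a toy setting (a hypothetical reconstruction, since the paper’s exact procedure eludes me, with the target reduced to a N(0,1) and the mixture to a single zero-centred component), which makes clear that each β comes with its own realised sample:

set.seed(3)
ess_for_beta <- function(beta, N = 1e3) {
  x  <- rnorm(N, sd = beta)                                      # sample from the β-scaled proposal
  lw <- dnorm(x, log = TRUE) - dnorm(x, sd = beta, log = TRUE)   # weights wrt a N(0,1) target
  w  <- exp(lw - max(lw))
  sum(w)^2 / sum(w^2)                                            # ESS = (Σ w)² / Σ w²
}
sapply(c(0.5, 1, 2, 5), ess_for_beta)              # each β is judged on a different realised sample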

“A popular approach for adaptive MCMC algorithms is to view the scaling parameter as a random variable which we can sample during the course of the MCMC iterations.”

While this is indeed an attractive notion [that I played with in the early days of adaptive MCMC, with the short-lived notion of cyber-parameters], I do not think it is of much help in optimising an MCMC algorithm, since the scaling parameter needs to be optimised, resulting in a time-inhomogeneous target. A more appropriate tool is thus stochastic optimisation à la Robbins-Monro, as exemplified in Andrieu and Moulines (2006). The paper however remains unclear as to how the scales are updated (see e.g. Section 4.2).
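As an illustration of that alternative, here is a bare-bones Robbins-Monro recursion on the log-scale of a random-walk Metropolis sampler, aiming at a 0.44 acceptance rate on a toy N(0,1) example (again my own sketch, not the update used in the paper):

set.seed(4)
ltarget <- function(x) dnorm(x, log = TRUE)        # toy N(0,1) target
x <- 0; lsig <- 0; acc_target <- 0.44              # 0.44 = usual univariate target rate
for (t in 1:1e4) {
  prop <- x + exp(lsig) * rnorm(1)                 # random-walk proposal with current scale
  acc  <- log(runif(1)) < ltarget(prop) - ltarget(x)
  if (acc) x <- prop
  lsig <- lsig + (acc - acc_target) / t^0.6        # Robbins-Monro step, step size t^(-0.6)
}
exp(lsig)                                          # adapted scale, around 2.4 for this target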

“Ideally, we would like to use a resampling algorithm which is not prohibitively costly for moderately or large sized ensembles, which preserves the mean of the samples, and which makes it much harder for the new samples to forget a significant region in the density.”

The paper also misses the developments of the early 2000’s on more sophisticated resampling steps, especially Paul Fearnhead’s contributions (see also Nicolas Chopin’s thesis). There exist valid resampling methods that require a single Uniform (0,1) draw, rather than m of them. The proposed method has a flavour similar to systematic resampling, but I wonder at the validity of returning values that are averages of earlier simulations, since this modifies their distribution into one with slimmer tails. (And it is parameterisation dependent.) Producing xᵢ with probability pᵢ is not the same as returning the average of the pᵢxᵢ’s.
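As a reminder, systematic resampling itself only calls for a single uniform and returns actual particles rather than averages, as in this short R function (a standard implementation, not the scheme proposed in the paper):

systematic_resample <- function(w, m = length(w)) {
  u <- (runif(1) + 0:(m - 1)) / m                  # m stratified points from one single uniform
  findInterval(u, cumsum(w / sum(w))) + 1          # indices of the surviving particles
}
set.seed(5)
x <- rnorm(10); w <- runif(10)
x[systematic_resample(w)]                          # returned values are original particles, not averages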

Monte Carlo simulation and resampling methods for social science [book review]

Posted in Books, Kids, R, Statistics, University life on October 6, 2014 by xi'an

Monte Carlo simulation and resampling methods for social science is a short paperback written by Thomas Carsey and Jeffrey Harden on the use of Monte Carlo simulation to evaluate the adequacy of a model and the impact of the assumptions behind this model. I picked it up in the library the other day and browsed through the chapters during one of my métro rides. Definitely not an in-depth reading, so be warned before reading the [telegraphic] review!

Overall, I think the book is doing a good job of advocating the use of simulation to evaluate the pros and cons of a given model (rephrased as a data generating process) when faced with data. And doing it in R. After some rudiments in probability theory and in R programming, it briefly explains the use of the resident random generators, if not how to handle new distributions, and then spends a large part of the book on simulation around generalised and regular linear models. For instance, in the linear model, the authors test the impact of heteroscedasticity, multicollinearity, measurement error, omitted variable(s), serial correlation, clustered data, and heavy-tailed errors. While this is a perfect way of exploring those semi-hidden hypotheses behind the linear model, I wonder at the impact of this exploration on students. On the one hand, they will perceive the importance of those assumptions and hopefully remember them. On the other hand, and this is a very recurrent criticism of mine, this implies a lot of maturity from the students, i.e., they have to distinguish between the data, the model [maybe] behind the data, the finite if large number of hypotheses one can test, and the interpretation of the outcome of a simulation test… Given that they were introduced to basic probability just a few chapters before, this expectation [of the students] may prove unrealistic. (And a similar criticism applies to the following chapters, from GLM to jackknife and bootstrap.)
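To give the flavour of the exercise, here is a compressed R version of one such experiment (my own code, in the spirit of the book rather than lifted from it): simulating a linear data generating process with heteroscedastic errors and checking by Monte Carlo the actual coverage of the nominal 95% confidence interval on the slope.

set.seed(6)
cover <- replicate(1e3, {
  x <- runif(100)
  y <- 1 + 2 * x + rnorm(100, sd = exp(2 * x))     # linear DGP with heteroscedastic noise
  ci <- confint(lm(y ~ x))["x", ]                  # nominal 95% interval for the slope
  ci[1] < 2 && 2 < ci[2]
})
mean(cover)                                        # coverage typically falls below the nominal .95 here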

At the end of the book, the authors ask the question of how a reader could use the information in this book in one’s own work, drafting a generic protocol for this reader, who is supposed to consider “alterations to the data generating process” (p.272) and to “identify a possible problem or assumption violation” (p.271). This requires a readership “who has some training in quantitative methods” (p.1). And then some more. But I definitely sympathise with the goal of confronting models and theory with the harsh reality of simulation output!