## Monte Carlo workshop (Tage 1 & 2)

Posted in Statistics, Travel, University life on February 21, 2013 by xi'an

Gathering with simulators from other fields (mostly [quantum] physicists) offers both the appeal of seeing different perspectives on simulation and the difficulty of having to filter alien vocabulary and presentation styles (generally assuming too much background from the audience). For instance, the first talk on Tuesday, by Gergely Barnaföldi about using GPUs for simulation, was quite accessible, showing the poor performance of the (CPU-based) Mersenne twister when using Dieharder as the evaluator, in comparison with GPU-based solutions. This provided an interesting counterpoint to the (later) seminar by Frederik James on random generators. (Of course, I did have some preliminary background on the topic.)

At the opposite end, the second talk, by Stefan Schäfer, involved hybrid Monte Carlo methods, but it took a lot of effort (for me) to translate them back into my understanding of the notion, gathered from the earlier Read Paper of Girolami and Calderhead, with the heat-bath and leapfrog algorithms. One extreme talk in this regard was William Lester's talk on Wednesday morning on quantum Monte Carlo and its applications in computational chemistry, where I could not get past the formulas! Too bad, because it sounded quite innovative, with notions like variational Monte Carlo and diffusion Monte Carlo… Nice movies, though. On the other hand, the final talk of the morning, by Gabor Molnar-Saska on option pricing, was highly pedagogical, defining everything and using simple examples as illustrations. (It certainly did not cure my misgivings about modelling the evolution of stock prices via pre-defined diffusions like Black-and-Scholes', but the introduction was welcome, given the heterogeneity of the audience.) Both talks on transportation problems were also more accessible (maybe because they involved no physics!)
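For readers who, like me, need to translate back: the leapfrog scheme at the heart of hybrid Monte Carlo alternates half-steps on the momentum with full steps on the position. A minimal sketch in Python, on a toy standard normal target of my own choosing (not taken from any of the talks):

```python
import numpy as np

def leapfrog(theta, p, grad_log_target, eps, n_steps):
    """Leapfrog integrator: half momentum step, alternating full
    position/momentum steps, closing half momentum step."""
    p = p + 0.5 * eps * grad_log_target(theta)
    for step in range(n_steps):
        theta = theta + eps * p                    # full position step
        if step < n_steps - 1:
            p = p + eps * grad_log_target(theta)   # full momentum step
    p = p + 0.5 * eps * grad_log_target(theta)     # closing half step
    return theta, p

# toy target: standard normal, so grad log pi(x) = -x
grad = lambda x: -x
theta, p = leapfrog(np.array([1.0]), np.array([0.5]), grad, eps=0.1, n_steps=20)
# the Hamiltonian 0.5*(theta^2 + p^2) is (nearly) conserved along the path
```

In an actual hybrid Monte Carlo sampler, the endpoint (theta, p) would then be accepted or rejected by a Metropolis step on the Hamiltonian.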

The speakers in the afternoon sessions of Wednesday also made a huge effort to bring the whole audience up to date on their topics, like protein folding and high-energy particle physics (although everyone knows about the Higgs boson nowadays!), as well as ensemble Kalman filters (twice). In particular, Andrew Stuart did a great job with his simulation movies. Even the final talk, about path-sampling for quantum simulation, was mostly understandable, at least in the problem it addresses. Sadly, at this stage, I still cannot put a meaning on "quantum Monte Carlo"… (Incidentally, I do not think my own talk reached much of the audience, missing the convincing examples I did not have time to present:)

## nach Hamburg

Posted in Statistics, Travel, University life on February 19, 2013 by xi'an

Today, I will visit Germany again, travelling to the city of Hamburg, hopefully with less snow in the airport, as I am attending an intriguing workshop at the interface between statistics, physics, and other sciences (the full title is "Monte Carlo Methods in Natural Sciences, in Engineering and in Economics") at DESY (Deutsches Elektronen-Synchrotron). The program of the workshop is a bit tight, but definitely interesting, with mostly speakers unknown to me. (Due to an even tighter schedule, I will unfortunately miss the guided tour of the facility!)

## Reuven Rubinstein (1938-2012)

Posted in Statistics on December 10, 2012 by xi'an

I just learned last night that Professor Reuven Rubinstein passed away. While I was not a close collaborator of his, I met Reuven Rubinstein a few times at conferences and during a short visit to Paris, and each time learned from the encounter. I also appreciated his contributions to the field of simulation, esp. his cross-entropy method, which culminated in the book The Cross-Entropy Method with Dirk Kroese. Reuven was involved in many aspects of simulation throughout his prolific career; he will be especially remembered for his 1981 book Simulation and the Monte Carlo Method, arguably the very first book on simulation as a Monte Carlo method. This book had a recent second edition, co-authored with Dirk Kroese as well. It is thus quite a sad day to witness this immense contributor to the field leave us. (Here is a link to his webpage at Technion, including pictures of a trip to the Gulag camp where he spent most of his childhood.) I presume there will be testimonies about his influence at the WSC 2012 conference here in Berlin.

## estimating a constant (not really)

Posted in Books, Statistics, University life on October 12, 2012 by xi'an

Larry Wasserman wrote a blog entry on the normalizing constant paradox, where he repeats that he does not understand my earlier point… Let me try to recap this point here, along with the various comments I made on StackExchange (while keeping in mind all of this is for intellectual fun!)

The entry is somewhat paradoxical in that Larry acknowledges (in that post) that the analysis in his book, All of Statistics, is wrong. The fact that "g(x)/c is a valid density only for one value of c" (and hence cannot lead to a notion of likelihood on c) is the very reason why I stated that there can be no statistical inference nor prior distribution about c: a sample from f does not bring statistical information about c, and there can be no statistical estimate of c based on this sample. (In case you did not notice, I insist upon statistical!)

To me this problem is completely different from a statistical problem, at least in the modern sense: if I need to approximate the constant c—as I do in fact when computing Bayes factors—I can produce an arbitrarily long sample from a certain importance distribution and derive a converging (and sometimes unbiased) approximation of c. Once again, this is Monte Carlo integration, a numerical technique based on the Law of Large Numbers and the stabilisation of frequencies. (Call it a frequentist method if you wish. I completely agree that MCMC methods are inherently frequentist in that sense, and see no problem with this, because they are not statistical methods. Of course, this may be the core of the disagreement with Larry and others, that they call the Law of Large Numbers statistics, and I do not. This lack of separation between the two notions also shows up in a recent general-public talk on Poincaré's mistakes by Cédric Villani! All this may just mean I am irremediably Bayesian, seeing anything motivated by frequencies as non-statistical!)

But that process does not mean that c can take a range of values that would index a family of densities compatible with a given sample. In this Monte Carlo integration approach, the distribution of the sample is completely under control (modulo the errors induced by pseudo-random generation). This approach is therefore outside the realm of Bayesian analysis "that puts distributions on fixed but unknown constants", because those unknown constants parameterise the distribution of an observed sample. Ergo, c is not a parameter of the sample, and the sample Larry argues about ("we have data sampled from a distribution") contains no information whatsoever about c that is not already in the function g. (It is not "data" in this respect, but a stochastic sequence that can be used for approximation purposes.) Which gets me back to my first argument, namely that c is known (and at the same time difficult or impossible to compute)!
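To make the Monte Carlo integration point concrete, here is a minimal sketch (my own toy illustration, not Larry's setting), approximating the constant of an unnormalised Gaussian density g by importance sampling, where the true value c = √(2π) is known:

```python
import numpy as np

rng = np.random.default_rng(42)

# unnormalised density g, with known constant c = sqrt(2*pi)
g = lambda x: np.exp(-0.5 * x**2)

# importance distribution: standard Cauchy, whose tails dominate g's
x = rng.standard_cauchy(100_000)
q = 1.0 / (np.pi * (1.0 + x**2))   # Cauchy density at the draws
c_hat = np.mean(g(x) / q)          # converges to sqrt(2*pi) by the LLN
```

The estimate c_hat converges as the sample size grows, but nothing in this construction involves a likelihood or a prior on c: the distribution of the draws is entirely chosen by the programmer.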

Let me also answer here the comments "why is this any different from estimating the speed of light c?" and "why can't you do this with the 100th digit of π?" from the earlier post and from StackExchange. Estimating the speed of light means, for me (who repeatedly flunked physics exams after leaving high school!), that we have a physical experiment that measures the speed of light (as in the original one by Rœmer at the Observatoire de Paris, which I visited earlier last week) and that the statistical analysis infers about c by using those measurements and accounting for the imprecision of the measuring instruments (as we do when analysing astronomical data). If, now, there exists a physical formula of the kind

$c=\int_\Xi \psi(\xi) \varphi(\xi) \text{d}\xi$

where φ is a probability density, I can imagine stochastic approximations of c based on this formula, but I no longer consider it a statistical problem. The case is even clearer for the 100th digit of π: it is also a fixed number, one that I can approximate by a stochastic experiment but to which I cannot attach a statistical tag. (It is 9, by the way.) Throwing darts at random as I did during my Oz tour is not a statistical procedure, but simple Monte Carlo à la Buffon…
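For the record, the dart-throwing (hit-or-miss) estimate of π fits in a few lines of Python; at its O(1/√n) rate, it will of course never get anywhere near the 100th digit:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
# throw darts uniformly at the square [-1,1]^2 and count hits in the unit disk
x = rng.uniform(-1.0, 1.0, n)
y = rng.uniform(-1.0, 1.0, n)
pi_hat = 4.0 * np.mean(x**2 + y**2 <= 1.0)   # Monte Carlo estimate of pi
```

Here again the randomness is manufactured by the experimenter, so the darts are a numerical device, not data carrying statistical information about π.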

Overall, I still do not see this as a paradox for our field (and certainly not as a critique of Bayesian analysis), because there is no reason a statistical technique should be able to address any and every numerical problem. (Once again, Persi Diaconis would almost certainly differ, as he defended a Bayesian perspective on numerical analysis in the early days of MCMC…) There may be a "Bayesian" solution to this particular problem (and that would be nice) and there may be none (and that would be OK too!), but I am not even convinced I would call this solution "Bayesian"! (Again, let us remember this is mostly for intellectual fun!)

## Monte Carlo workshop in Hamburg

Posted in Statistics, Travel, University life on August 17, 2012 by xi'an

I have just received an invitation to take part in this workshop in Hamburg next year. I will most certainly go, esp. considering that it takes place on the site of the German synchrotron!