## making a random walk geometrically ergodic

Posted in R, Statistics with tags , , , , , , , , , on March 2, 2013 by xi'an

While a random walk Metropolis-Hastings algorithm cannot be uniformly ergodic in a general setting (Mengersen and Tweedie, AoS, 1996), because it needs more energy to leave far away starting points, it can be geometrically ergodic depending on the target (and the proposal). In a recent Annals of Statistics paper, Leif Johnson and Charlie Geyer designed a trick to turn a random walk Metropolis-Hastings algorithm into a geometrically ergodic random walk Metropolis-Hastings algorithm by virtue of an isotropic transform (under the provision that the original target density has a moment generating function). This theoretical result is complemented by an R package called mcmc. (I have not tested it so far, having read the paper in the métro.) The examples included in the paper are however fairly academic and I wonder how the method performs in practice, on truly complex models, in particular because the change of variables relies on (a) an origin and (b) changing the curvature of space uniformly in all dimensions. Nonetheless, the idea is attractive and reminds me of a project of ours with Randal Douc,  started thanks to the ‘Og and still under completion.

## packed off!!!

Posted in Books, pictures, R, Statistics with tags , , , , , , , , , , on February 9, 2013 by xi'an

Deliverance!!! We have at last completed our book! Bayesian Essentials with R is off my desk! In a final nitty-gritty day of compiling and recompiling the R package bayess and the LaTeX file, we have reached versions that were in par with our expectations. The package has been submitted to CRAN (it has gone back and forth a few times, with requests to lower the computing time in the examples: each example should take less than 10s, then 5s…), then accepted by CRAN, incl. a Windows version, and the book has be sent to Springer-Verlag. This truly is a deliverance for me as this book project has been on my work horizon almost constantly for more than the past two years, led to exciting times in Luminy, Carnon and Berlin, has taken an heavy toll on my collaborations and research activities, and was slowly turning into a unsavoury chore! I am thus delighted Jean-Michel and I managed to close the door before any disastrous consequence on either the book or our friendship could develop. Bayesian Essentials with R is certainly an improvement compared with Bayesian Core, primarily by providing a direct access to the R code. We dearly hope it will attract a wider readership by reducing the mathematical requirements (even though some parts are still too involved for most undergraduates) and we will keep testing it with our own students in Montpellier and Paris over the coming months. In the meanwhile, I just enjoy this feeling of renewed freedom!!!

## structure and uncertainty, Bristol, Sept. 26

Posted in Books, pictures, R, Running, Statistics, Travel, University life, Wines with tags , , , , , , , , , , , , , , on September 27, 2012 by xi'an

Another day full of interesting and challenging—in the sense they generated new questions for me—talks at the SuSTain workshop. After another (dry and fast) run around the Downs; Leo Held started the talks with one of my favourite topics, namely the theory of g-priors in generalized linear models. He did bring a new perspective on the subject, introducing the notion of a testing Bayes factor based on the residual statistic produced by a classical (maximum likelihood) analysis, connected with earlier works of Vale Johnson. While I did not truly get the motivation for switching from the original data to this less informative quantity, I find this perspective opening new questions for dealing with settings where the true data is replaced with one or several classical statistics. With possible strong connections to ABC, of course. Incidentally, Leo managed to produce a napkin with Peter Green’s intro to MCMC dating back from their first meeting in 1994: a feat I certainly could not reproduce (as I also met both Peter and Leo for the first time in 1994, at CIRM)… Then Richard Everit presented his recent JCGS paper on Bayesian inference on latent Markov random fields, centred on the issue that simulating the latent MRF involves an MCMC step that is not exact (as in our earlier ABC paper for Ising models with Aude Grelaud). I already discussed this paper in an earlier blog and the only additional question that comes to my mind is whether or not a comparison with the auxiliary variable approach of Møller et al. (2006) would make sense.

In the intermission, I had a great conversation with Oliver Ratman on his talk of yesterday on the surprising feature that some models produce as “data” some sample from a pseudo-posterior.. Opening once again new vistas! The following talks were more on the mathematical side, with James Cussens focussing on the use of integer programming for Bayesian variable selections, then Éric Moulines presenting a recent work with a PhD student of his on PAC-Bayesian bounds and the superiority of combining experts. Including a CRAN package. Éric concluded his talk with the funny occurence of Peter’s photograph on Éric’s Microsoft Research Profile own page, due to Éric posting our joint photograph at the top of Pic du Midi d’Ossau in 2005… (He concluded with a picture of the mountain that was the exact symmetry of mine yesterday!)

The afternoon was equally superb with Gareth Roberts covering fifteen years of scaling MCMC algorithms, from the mythical 0.234 figure to the optimal temperature decrease in simulated annealing, John Kent playing the outlier with an EM algorithm—however including a formal prior distribution and raising the challenge as to why Bayesians never had to constrain the posterior expectation, which prompted me to infer that (a) the prior distribution should include all constraints and (b) the posterior expectation was not the “right” tool in non-convex parameters spaces—. Natalia Bochkina presented a recent work, joint with Peter Green, on connecting image analysis with Bayesian asymptotics, reminding me of my early attempts at reading Ibragimov and Has’minskii in the 1990’s. Then a second work with Vladimir Spoikoini on Bayesian asymptotics with misspecified models, introducing a new notion of effective dimension. The last talk of the day was by Nils Hjort about his coming book on “Credibility, confidence and likelihood“—not yet advertised by CUP—which sounds like an attempt at resuscitating Fisher by deriving distributions in the parameter space from frequentist confidence intervals. I already discussed this notion in an earlier blog, so I am fairly skeptical about it, but the talk was representative of Nils’ highly entertaining and though-provoking style! Esp. as he sprinkled the talk with examples where MLE (and some default Bayes estimators) did not work. And reanalysed one of Chris Sims‘ example presented during his Nobel Prize talk…

## the Wang-Landau algorithm reaches the flat histogram in finite time

Posted in R, Statistics, University life with tags , , , , on October 20, 2011 by xi'an

Pierre Jacob and Robin Ryder (from Paris-Dauphine, CREST, and Statisfaction) have just arXived (and submitted to the Annals of Applied Probability) a neat result on the Wang-Landau algorithm. (This algorithm, which modifies the target in a sort of reweighted partioned sampling to achieve faster convergence, has always been perplexing to me.)  They show that some variations of the Wang-Landau algorithm meet the flat histogram criterion in finite time, and, just as importantly that other variations do not reach this criterion. The proof uses elegant Markov chain arguments and I hope the paper makes it through, as there are very few theoretical results on this algorithm. (Pierre also wrote recently a paper with Luke Bornn, Arnaud Doucet, and Pierre Del Moral, on An Adaptive Interacting Wang-Landau Algorithm for Automatic Density Exploration last week, with an associated R package. Not yet on CRAN.)

## “simply start over and build something better”

Posted in R, Statistics, University life with tags , , , on September 13, 2010 by xi'an

The post on the shortcomings of R has attracted a huge number of readers and Ross Ihaka has now posted a detailed comment that is fairly pessimistic… Given the radical directions drafted in this comment from the father of R (along with Robert Gentleman), I once again re-post it as a main entry to advertise more broadly its contents. (Obviously, the whole debate is now far beyond my reach! Please comment on the most current post, i.e. this one.)

Since (something like) my name has been taken in vain here, let me chip in.

I’ve been worried for some time that R isn’t going to provide the base that we’re going to need for statistical computation in the future. (It may well be that the future is already upon us.) There are certainly efficiency problems (speed and memory use), but there are more fundamental issues too. Some of these were inherited from Sand some are peculiar to R.

One of the worst problems is scoping. Consider the following little gem.

f =function() {
if (runif(1) > .5)
x = 10
x
}

The x being returned by this function is randomly local or global. There are other examples where variables alternate between local and non-local throughout the body of a function. No sensible language would allow this. It’s ugly and it makes optimisation really difficult. This isn’t the only problem, even weirder things happen  because of interactions between scoping and lazy evaluation.

In light of this, I’ve come to the conclusion that rather than “fixing” R, it would be much more productive to simply start over and build something better. I think the best you could hope for by fixing the efficiency problems in R would be to boost performance by a small multiple, or perhaps as much as an order of magnitude. This probably isn’t enough to justify the effort (Luke Tierney has been working on R compilation for over a decade now).

To try to get an idea of how much speedup is possible, a number of us have been carrying out some experiments to see how much better we could do with something new. Based on prototyping we’ve been doing at Auckland, it looks like it should be straightforward to get two orders of magnitude speedup over R, at least for those computations which are currently bottle-necked. There are a couple of ways to make this happen.

First, scalar computations in R are very slow. This in part because the R interpreter is very slow, but also because there are a no scalar types. By introducing scalars and using compilation it looks like its possible to get a speedup by a factor of several hundred for scalar computations. This is important because it means that many ghastly uses of array operations and the apply functions could be replaced by simple loops. The cost of these improvements is that scope declarations become mandatory and (optional) type declarations are necessary to help the compiler.

As a side-effect of compilation and the use of type-hinting it should be possible to eliminate dispatch overhead for certain (sealed) classes (scalars and arrays in particular). This won’t bring huge benefits across the board, but it will mean that you won’t have to do foreign language calls to get efficiency.

A second big problem is that computations on aggregates (data frames in particular) run at glacial rates. This is entirely down to unnecessary copying because of the call-by-value semantics. Preserving call-by-value semantics while eliminating the extra copying is hard. The best we can probably do is to take a conservative approach. R already tries to avoid copying where it can, but fails in an epic fashion. The alternative is to abandon call-by-value and move to reference semantics. Again, prototyping indicates that several hundredfold speedup is possible (for data frames in particular).

The changes in semantics mentioned above mean that the new language will not be R. However, it won’t be all that far from R and it should be easy to port R code to the new system, perhaps using some form of automatic translation.

If we’re smart about building the new system, it should be possible to make use of multi-cores and parallelism. Adding this to the mix might just make it possible to get a three order-of-magnitude performance boost with just a fraction of the memory that R uses. I think it’s something really worth putting some effort into.

I also think one other change is necessary. The license will need to a better job of protecting work donated to the commons than GPL2 seems to have done. I’m not willing to have any more of my work purloined by the likes of Revolution Analytics, so I’ll be looking for better protection from the license (and being a lot more careful about who I work with).

## How to use mcsm

Posted in Books, R, Statistics with tags , , , , , on February 28, 2010 by xi'an

Within the past two days, I received this email

Dear Prof.Robert
I have just bought your recent book on Introducing Monte Carlo Methods with R.  Although I have checked your web page for the R programs (bits of the code in the book, codes for generating the figures and tec – not the package available on cran)  used in the book, I have not found them.
I wonder whether you could make them available.
Thank you very much for your time and patience.
Yours Sincerely

and that one

Dear Prof. Robert,
I bought “Introducing Monte Carlo Methods with R” from Amazon booksore. I am a teacher at […] University, and I choose this book as a textbook in my class.
I can not find the R package “mcsm” according to your book (page 5). Where can I download the R package “mcsm”?
I highly appreciate your help.
Best regards,

so I fear that readers may miss the piece of information provided in the book. As indicated on pages 36-37 of Introducing Monte Carlo Methods with R, mcsm is a registred R package, readers can therefore download it manually from CRAN,  but they should first try using install.packages in R as this is both easier and safer. (They should check on the main R project webpage for more help in installing packages.)

Another useful information for readers is that the code used on the examples of Introducing Monte Carlo Methods with R is available from mcsm through the demo command/code. Typing demo(Chapter.3) starts the production of the examples of Chapter 3:

> demo(Chapter.3)

demo(Chapter.3)
————————
Type  <Return>   to start :
> # Section 3.1, Introduction
>
> ch=function(la){ integrate(function(x){x^(la-1)*exp(-x)},0,Inf)\$val}
> plot(lgamma(seq(.01,10,le=100)),log(apply(as.matrix(
+  seq(.01,10,le=100)),1,ch)),xlab=”log(integrate(f))”,
+  ylab=expression(log(Gamma(lambda))),pch=19,cex=.6)
> S=readline(prompt=”Type  <Return>   to continue : “)
Type  <Return>   to continue :

and obviously the same for all other chapters. This also means the code is available in the corresponding file, something like

/usr/lib/R/site-library/mcsm/demo/Chapter.3.R

depending on your system.

## The R Companion to MCSM (9)

Posted in Books, Linux, Statistics with tags , , , , on April 29, 2009 by xi'an

The mcsm package has been validated by CRAN since I got the following emails

`the right version on CRAN now ...`

and even (recovered from my spam box because of the W word!)

`this notification has been generated automatically.`
`Your package mcsm_1.0.tar.gz has been built for Windows and`
`will be published within 24 hours in the corresponding CRAN directory`
`(CRAN/bin/windows/contrib/2.9/).`
`R version 2.9.0 Patched (2009-04-27 r48414)`

(something I obviously could not test!). So, with the package validated and made publicly available, we sent the Enter Monte Carlo Statistical Methods yesterday to John Kimmel from Springer Verlag New York for assessment and (eventually) publication. This is certainly the fastest book writing I have done so far since the previous books took between one year (The Bayesian Choice, Monte Carlo Statistical Methods) and two (Bayesian Core). Of course, using the background provided by Monte Carlo Statistical Methods helped a lot, including using some of the R programs we add already written for this book. The final sequence of chapters is

1. Introduction to R programming
2. Random variable generation
3. Monte Carlo methods
4. Controlling and accelerating convergence
5. Monte Carlo optimization
6. Metropolis-Hastings algorithms
7. Gibbs samplers
8. Convergence monitoring for MCMC algorithms

since we eventually decided against including a solution chapter for space (the book is long enough with 270 pages without adding the 50 pages of condensed solutions) and marketing reasons (we can upgrade the solutions in continuous time as well as make [some of] them only available to instructors), even though this may attract criticisms. (The mcsm package can [and need] also be upgraded at will, given the rapidity with which the CRAN maintainers include new packages.) Now, my forecast for the publication of the book is March 2010 at best, given that we need to go through a review process at first and that the production usually takes about six months (or more as in the case of the paperback The Bayesian Choice whose very first printing was lost by the delivery carrier except for a very few collectors!).