scale acceleration

Posted in pictures, R, Statistics, Travel, University life on April 24, 2015 by xi'an

Kate Lee pointed me to a rather surprising inefficiency in Matlab, exploited in Sylvia Frühwirth-Schnatter’s bayesf package: running a gamma simulation by rgamma(n,a,b) takes longer, and sometimes much longer, than rgamma(n,a,1)/b, the latter taking advantage of the scale nature of b. I wanted to check on my own whether or not R faced the same difficulty, so I ran an experiment [while stuck in a Thalys train at Brussels, between Amsterdam and Paris…], using different values for a [click on the graph] and a range of values of b. I found no visible difference between both implementations, at least when using system.time for checking.

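a=seq(.1,4,le=25) # grid of values for the third argument of rgamma
for (t in 1:25) a[t]=system.time(
rgamma(10^7,.3,a[t]))[3] # elapsed time overwrites the grid value
a=a/system.time(rgamma(10^7,.3,1))[3] # ratio to the timing with third argument 1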


Once arrived home, I wondered about the relevance of the above comparison, since rgamma(10^7,.3,1) forces R to use 1 as a scale, which may differ from using rgamma(10^7,.3), where 1 is known to be the scale [does this sentence make sense?!]. So I reran an even bigger experiment as

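a=seq(.1,4,le=25) # grid of values for the third argument of rgamma
for (t in 1:25) a[t]=system.time(
rgamma(10^8,.3,a[t]))[3] # elapsed time overwrites the grid value
a=a/system.time(rgamma(10^8,.3))[3] # same sample size, default third argument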


and got the graph below. Which is much more interesting because it shows that some values of a are leading to a loss of efficiency of 50%. Indeed. (The most extreme cases correspond to a=0.3, 1.1, 5.8. No clear pattern emerging.)

Update

As pointed out by Martyn Plummer in his comment, the C function behind the R rgamma function and Gamma generator does take into account the scale nature of the second parameter, so the above time differences are not due to this function but rather to whatever my computer was running at the same time…! Apologies to anyone I scared with this void warning!
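The point made in this comment can be checked empirically. The following is a minimal sketch, not the original experiment: the sample size, the rate value 2.7, the seed and the helper med_time are all illustrative assumptions. If the third argument only enters as a final rescaling of a standard Gamma draw, then resetting the seed before each call should return the same sample, up to rounding, whether the rate is passed to rgamma or applied by hand. Each timing is also repeated a few times, with the median kept, which should be less sensitive to whatever else the machine is running.

```r
# Sketch: does the rate only act as a rescaling of a standard Gamma draw?
# n, shape, rate and the seed are illustrative, not those of the post.
n <- 1e6
shape <- 0.3
rate <- 2.7

set.seed(101)
x_rate <- rgamma(n, shape, rate)   # rate handed over to rgamma
set.seed(101)
x_div <- rgamma(n, shape) / rate   # standard Gamma, rescaled by hand

# Same seed, same random stream: the samples should agree up to rounding
same_draws <- isTRUE(all.equal(x_rate, x_div))

# Hypothetical helper: median elapsed time over a few repetitions
med_time <- function(f, reps = 5)
  median(replicate(reps, system.time(f())[["elapsed"]]))

t_rate <- med_time(function() rgamma(n, shape, rate))
t_div  <- med_time(function() rgamma(n, shape) / rate)
c(rate_arg = t_rate, by_hand = t_div)
```

If same_draws comes out TRUE, both calls are running the same generator, and any remaining gap between the two timings is more plausibly noise than an algorithmic difference.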

capacity exceeded…

Posted in Books, University life on April 23, 2015 by xi'an

A silly LaTeX error took me a few minutes too many to solve: I defined

\renewcommand\theta{\boldsymbol{\theta}}


which got me the error message

TeX capacity exceeded, sorry [grouping levels=255].


that I understood as a recursive definition. So I instead pre-defined the new θ as

\newcommand\btheta{\boldsymbol{\theta}}
\renewcommand\theta\btheta


which did not work either… After google-ing the issue, I found this on-line LaTeX Wikibook that provided me with the solution:

\let\oldtheta\theta
\renewcommand\theta{\boldsymbol{\oldtheta}}


which worked. Of course, a global change of \theta into \btheta would have been much much faster to execute….

ISBA 2016 [logo]

Posted in pictures, Statistics, Travel, University life, Wines on April 22, 2015 by xi'an

Things are starting to fall into place for the next ISBA 2016 World meeting, at the Forte Village Resort Convention Center, Sardinia, Italy, on June 13-17, 2016. And not only the logo, inspired by the nuraghe below. I am sure the program will be terrific and make this new occurrence of a “Valencia meeting” worth attending. Just like the previous occurrences, e.g. Cancún last summer and Kyoto in 2012.

However, and not for the first time, I wonder at the sustainability of such meetings when faced with ever increasing (or, more accurately, sky-rocketing!) registration fees… We have now reached €500 per participant for the sole (early reg.) fees, excluding lodging, food or transportation. If we bet on 500 participants, this means simply renting the convention centre would cost €250,000 for the four or five days of the meeting. This sounds enormous, even accounting for the processing costs of the congress organiser. (By comparison, renting the convention centre for MCMSki in Chamonix for three days was less than €20,000.) Given the likely high costs of staying at the resort, it is very unlikely I will be able to support my PhD students. As I know very well the difficulty of finding dedicated volunteers willing to offer a large fraction of their time towards the success of behemoth meetings, this comment is by no means aimed at my friends from Cagliari who kindly accepted to organise this meeting, but rather at the general state of academic meetings, whose costs put them out of reach for a large part of the scientific community.

Thus, this makes me wonder anew whether we should move to a novel conference model, given that the fantastic growth of the Bayesian community makes the ideal of gathering together in a single beach hotel for a week of discussions, talks, posters, and more discussions unattainable. If truly physical meetings are to perdure (and this notion is as debatable as the one about the survival of paper versions of the journals), a new approach would be to find a few universities or sponsors able to provide one or several amphitheatres around the World and to connect all those places by teleconference. Reducing the audience size at each location would greatly ease the pressure to find a few huge and pricey convention centres, while dispersing the units all around would diminish travel costs as well. There could be more parallel sessions and ways could be found to share virtual poster sessions, e.g. by having avatars presenting someone else’s poster. Time could be reserved for local discussions of presented papers, to be summarised later to the other locations. And so on… Obviously, something would be lost of the old camaraderie, sharing research questions and side stories, as well as gossips and wine, with friends from all over the World. And discovering new parts of the World. But the cost of meetings is already preventing some of those friends from showing up. I thus think it is time we reinvent the Valencia meetings into the next generation. And move to the Valenci-e-meetings.

simulating correlated Binomials [another Bernoulli factory]

Posted in Books, Kids, pictures, R, Running, Statistics, University life on April 21, 2015 by xi'an

This early morning, just before going out for my daily run around The Parc, I checked X validated for new questions and came upon that one. Namely, how to simulate X a Bin(8,2/3) variate and Y a Bin(18,2/3) variate such that cor(X,Y)=0.5. (No reason or motivation provided for this constraint.) And I thought of the following (presumably well-known) resolution, namely to break the two binomials as sums of 8 and 18 Bernoulli variates, respectively, and to use some of those Bernoulli variates as being common to both sums. For this specific set of values (8,18,0.5), since 8×18=12², the solution is 0.5×12=6 common variates. (The probability of success does not matter.) While running, I first thought this was a very artificial problem because of this occurrence of 8×18 being a perfect square, 12², and cor(X,Y)×12 an integer. A wee bit later I realised that all positive values of cor(X,Y) up to the maximum achievable could be obtained by randomisation, i.e., by deciding the identity of a Bernoulli variate in X with a Bernoulli variate in Y with a certain probability ϖ. For negative correlations, one can use the (U,1-U) trick, namely to write both Bernoulli variates as

$X_1=\mathbb{I}(U\le p)\quad Y_1=\mathbb{I}(U\ge 1-p)$

in order to minimise the probability they coincide.
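As a hedged illustration of the (U,1-U) trick, the sketch below builds k antithetic Bernoulli pairs from shared uniforms and completes Y with independent Bernoulli variates. The number of pairs k, the sample size and the seed are illustrative assumptions, and the reference value cor_theory follows from my own derivation: each pair has covariance max(0,2p-1)-p², while the product of the standard deviations of X and Y is 12p(1-p). With p=2/3 and k=8 pairs this gives -1/3.

```r
# Sketch of the (U, 1-U) trick; k, n and the seed are illustrative choices
set.seed(42)
n <- 1e5      # number of simulated (X, Y) pairs
p <- 2/3      # common success probability
k <- 8        # antithetic Bernoulli pairs, at most 8 here

U <- matrix(runif(n * k), n, k)                   # shared uniforms
x <- rowSums(U <= p)                              # X ~ Bin(8, p) when k = 8
y <- rowSums(U >= 1 - p) + rbinom(n, 18 - k, p)   # Y ~ Bin(18, p)

cor_xy <- cor(x, y)
# Derived reference: k * cov(X_1, Y_1) / (sd(X) * sd(Y))
cor_theory <- k * (max(0, 2 * p - 1) - p^2) / (12 * p * (1 - p))
c(simulated = cor_xy, derived = cor_theory)
```

Under these assumptions, the most negative correlation this particular construction reaches for Bin(8,2/3) and Bin(18,2/3) is about -1/3, since at most 8 pairs can be made antithetic.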

I also checked this result with an R simulation

> z=rbinom(10^8,6,.66)
> y=z+rbinom(10^8,12,.66)
> x=z+rbinom(10^8,2,.66)
> cor(x,y)
[1] 0.5000539


Searching on Google immediately gave me a link to Stack Overflow with an earlier solution based on the same idea. And a smarter R code.
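The randomisation idea mentioned earlier, for correlations that do not correspond to an integer number of common variates, can be sketched as follows. This is only one possible reading of it, with illustrative names and values: each of the 8 Bernoulli variates in X is shared with Y with probability w, so the number K of shared variates is Bin(8,w) and the resulting correlation is 8w/12. A target rho is therefore matched by w=12·rho/8, which requires rho≤2/3.

```r
# Sketch of randomised sharing; rho, n and the seed are illustrative choices
set.seed(7)
n <- 1e5
p <- 2/3
rho <- 0.4                       # target correlation, must be <= sqrt(8/18)
w <- rho * sqrt(8 * 18) / 8      # sharing probability, here 0.6

K <- rbinom(n, 8, w)             # number of shared Bernoullis, per draw
z <- rbinom(n, K, p)             # sum of the shared Bernoullis
x <- z + rbinom(n, 8 - K, p)     # X ~ Bin(8, p)
y <- z + rbinom(n, 18 - K, p)    # Y ~ Bin(18, p)

cor_xy <- cor(x, y)
c(target = rho, simulated = cor_xy)
```

Drawing K first keeps everything vectorised, since rbinom accepts a vector of sizes (a size of 0 simply returns 0).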

Bayesian propaganda?

Posted in Books, Kids, pictures, Statistics, University life on April 20, 2015 by xi'an

“The question is about frequentist approach. Bayesian is admissable [sic] only by wrong definition as it starts with the assumption that the prior is the correct pre-information. James-Stein beats OLS without assumptions. If there is an admissable [sic] frequentist estimator then it will correspond to a true objective prior.”

Amsterdamse huizen

Posted in pictures, Running, Travel, University life on April 19, 2015 by xi'an

vertical likelihood Monte Carlo integration

Posted in Books, pictures, Running, Statistics, Travel, University life on April 17, 2015 by xi'an

A few months ago, Nick Polson and James Scott arXived a paper on one of my favourite problems, namely the approximation of normalising constants (and it went way under my radar, as I only became aware of it quite recently!, then it remained in my travel bag for an extra few weeks…). The method for approximating the constant Z draws from an analogy with the energy level sampling methods found in physics, like the Wang-Landau algorithm. The authors rely on a one-dimensional slice sampling representation of the posterior distribution and [main innovation in the paper] add a weight function on the auxiliary uniform. The choice of the weight function links the approach with the dreaded harmonic estimator (!), but also with power-posterior and bridge sampling. The paper recommends a specific weighting function, based on a “score-function heuristic” I do not get. Further, the optimal weight depends on intractable cumulative functions as in nested sampling. It would be fantastic if one could draw directly from the prior distribution of the likelihood function—rather than draw an x [from the prior or from something better, as suggested in our 2009 Biometrika paper] and transform it into L(x)—but as in all existing alternatives this alas is not the case. (Which is why I find the recommendations in the paper for practical implementation rather impractical, since, were the prior cdf of L(X) available, direct simulation of L(X) would be feasible. Maybe not the optimal choice though.)
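To make the one-dimensional representation concrete, here is a toy check of the unweighted identity underlying it, namely $Z=\int L(x)\pi(x)\,\text{d}x=\int_0^\infty \mathbb{P}_\pi(L(X)>s)\,\text{d}s$. The model is an assumption chosen for illustration only and is not taken from the paper: a N(0,1) prior with likelihood L(x)=exp(-x²/2), for which Z=1/√2 and the prior survival function of L(X) is available in closed form. The weight function of the paper is not implemented here.

```r
# Toy check of Z = E_prior[L(X)] = integral over s of P_prior(L(X) > s)
# Assumed toy model (not from the paper): X ~ N(0,1), L(x) = exp(-x^2/2)
set.seed(2015)
n <- 1e5
x <- rnorm(n)                    # draws from the prior
z_mc <- mean(exp(-x^2 / 2))      # plain Monte Carlo average of L(X)

# Prior survival function of L(X): L(X) > s iff |X| < sqrt(-2 log s)
surv <- function(s) 2 * pnorm(sqrt(-2 * log(s))) - 1
z_vert <- integrate(surv, lower = 0, upper = 1)$value   # "vertical" integral

z_true <- 1 / sqrt(2)            # closed form for this toy model
c(monte_carlo = z_mc, vertical = z_vert, exact = z_true)
```

The vertical integral is exact here only because the prior cdf of L(X) is known, which is exactly the ingredient missing in realistic problems.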

“What is the distribution of the likelihood ordinates calculated via nested sampling? The answer is surprising: it is essentially the same as the distribution of likelihood ordinates by recommended weight function from Section 4.”

The approach is thus very much related to nested sampling, at least in spirit. As the authors later demonstrate, nested sampling is another case of weighting. Both versions require simulations under truncated likelihood values. Albeit with a possibility of going down [in likelihood values] with the current version. Actually, more weighting could prove [more] efficient as both the original nested and vertical sampling simulate from the prior under the likelihood constraint. Getting away from the prior should help. (I am quite curious to see how the method is received and applied.)