#isbatoo in The Guardian [without the B word]

December 24, 2017

A week after Kristian Lum posted her testimony of harassment at the Benidorm ISBA conference, The Guardian ran a cover story about this with interviews of both Kristian and Katherine Heller, mentioning further incidents with Steve Scott.

“The allegations against Scott, who declined to comment, has shone a harsh light on harassment in the male-dominated field of statistics, data science and machine learning. Some said misconduct was common – especially at conferences that blend professional work with socializing – and that serial harassers rarely face consequences.”

While the article expands on the existing potential for harassment at conferences (with the above quote reminding me of the call for “More Bayes, less booze” mentioned in an earlier post), its tenor is more about AI and technology companies than about statistics conferences, and even less about Bayesian conferences. (Just as a reminder, ISBA is taking the situation very seriously and has established a Task Force for a safe ISBA, which can be contacted at

partly virtual meetings

February 2012

[Image: flight to Montpellier, Feb. 2012]

A few weeks ago, I read in the NYT an article about the American Academy of Religion cancelling its 2021 annual meeting as a sabbatical year, for environmental reasons.

“We could choose to not meet at a huge annual meeting in which we take over a city. Every year, each participant going to the meeting uses a quantum of carbon that is more than considerable. Air travel, staying in hotels, all of this creates a way of living on the Earth that is carbon intensive. It could be otherwise.”

While I am not in the least interested in the conference, in the topics covered by this society, or even in the benevolent religious activities suggested as a substitute, the notion of cancelling the behemoths that are our national and international academic meetings holds some appeal. I have posted several times on the topic, especially about JSM, and I have no clear and definitive answer to the question. Still, there is a lack of efficiency, on top of the environmental impact, that we could and should try to address.

[Image: Benidorm, June 5, 2010]

As I was thinking of those issues in the past week, I made another of my numerous “carbon footprints” by attending NIPS across the Atlantic for two workshops that ran in parallel with about twenty others, and hence could have taken place in twenty different places. Albeit without the same exciting feeling of constant intellectual simmering. And without the same mix of highly interactive scholars from all over the planet. (Although the ABC in Montréal workshop seemed predominantly European!)

Since workshops are in my opinion the most profitable type of meeting, I would like to experiment with a large meeting made of those (focussed and intense) workshops, organised in such a way that academics would benefit without travelling long distances across the World. One idea would be to have local nodes where a large enough group of researchers could gather to attend video-conferences given from any of the other nodes and to interact locally through discussions and poster presentations. This should even increase the feedback on selected papers, as small groups engage more readily in discussing and criticising papers than a huge conference room does. If we could build a World-wide web (!) of such nodes, we could then dream of a non-stop conference, with no central node, no gigantic conference centre, no terrifying beach resort…

Multidimension bridge sampling (CoRe in CiRM [5])

July 14, 2010

Since Bayes factor approximation is one of my areas of interest, I was intrigued by Xiao-Li Meng’s comments during my poster in Benidorm that I was using the “wrong” bridge sampling estimator when trying to bridge two models of different dimensions, based on the completion (for \theta_2=(\mu,\sigma^2), with \theta_1=\sigma^2 and \mu missing from the first model)

B^\pi_{12}(x)= \dfrac{\displaystyle{\int\pi_1^*(\mu|\sigma^2)\,{\tilde\pi}_1(\sigma^2|x)\,\alpha(\theta_2)\,{\pi}_2(\theta_2|x)\,\text{d}\theta_2}}{\displaystyle{\int{\tilde\pi}_2(\theta_2|x)\,\alpha(\theta_2)\,\pi_1(\sigma^2|x)\,\pi_1^*(\mu|\sigma^2)\,\text{d}\sigma^2\,\text{d}\mu}}\,.

When revising the normal chapter of Bayesian Core, here in CiRM, I thus went back to Xiao-Li’s papers on the topic to try to fathom what the “true” bridge sampling was in that case. In Meng and Schilling (2002, JASA), I found the following indication, “when estimating the ratio of normalizing constants with different dimensions, a good strategy is to bridge each density with a good approximation of itself and then apply bridge sampling to estimate each normalizing constant separately. This is typically more effective than to artificially bridge the two original densities by augmenting the dimension of the lower one”. I was unsure of the technique this (somewhat vague) indication pointed at until I understood that it meant introducing one artificial posterior distribution for each of the parameter spaces and processing each marginal likelihood as an integral ratio in itself. For instance, if \eta_1(\theta_1) is an arbitrary normalised density on \theta_1, and \alpha is an arbitrary function, we have the bridge sampling identity on m_1(x):

\int\tilde{\pi}_1(\theta_1|x) \,\text{d}\theta_1 = \dfrac{\displaystyle{\int \tilde{\pi}_1(\theta_1|x) \alpha(\theta_1) {\eta}_1(\theta_1)\,\text{d}\theta_1}}{\displaystyle{\int\eta_1(\theta_1) \alpha(\theta_1) \pi_1(\theta_1|x) \,\text{d}\theta_1}}

Therefore, the optimal choice of \alpha leads to the approximation

\widehat m_1(x) = \dfrac{\displaystyle{\sum_{i=1}^N {\tilde\pi}_1(\theta^\eta_{1i}|x)\big/\left\{{\tilde\pi}_1(\theta^\eta_{1i}|x) + m_1(x)\,\eta_1(\theta^\eta_{1i})\right\}}}{\displaystyle{\sum_{i=1}^{N} \eta_1(\theta_{1i}) \big/ \left\{{\tilde\pi}_1(\theta_{1i}|x) + m_1(x)\,\eta_1(\theta_{1i})\right\}}}

when \theta_{1i}\sim\pi_1(\theta_1|x) and \theta^\eta_{1i}\sim\eta_1(\theta_1), both samples being of size N. More exactly, this approximation is replaced with an iterative version, since its right-hand side depends on the unknown m_1(x). The choice of the density \eta_1 is obviously fundamental, and it should be close to the true posterior \pi_1(\theta_1|x) to guarantee good convergence of the approximation. Reasonable choices include a normal approximation to the posterior distribution of \theta_1, a non-parametric approximation based on a sample from \pi_1(\theta_1|x), or an average of MCMC proposals.
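To make the iterative version concrete, here is a minimal Python sketch under assumptions of my own choosing, not the code behind the experiment reported in this post. The toy model (x_i ~ N(θ,1) with a N(0,1) prior on θ) is picked only because its marginal likelihood is available in closed form, the pseudo-posterior is a normal fitted to the posterior sample with inflated variance, and the two samples have equal sizes. The weights use the standard optimal bridge function for equal sample sizes, α ∝ 1/{π̃_1 + m_1 η_1}, and the function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy conjugate model (an assumption, chosen so that m(x) is known exactly):
# x_i ~ N(theta, 1), theta ~ N(0, 1).
n = 20
x = rng.normal(0.5, 1.0, size=n)


def log_norm_pdf(z, mean, var):
    """Log-density of N(mean, var) at z (vectorised)."""
    return -0.5 * (np.log(2 * np.pi * var) + (z - mean) ** 2 / var)


def log_post_unnorm(theta):
    """log of prior(theta) * likelihood(x | theta) for a vector of thetas."""
    theta = np.asarray(theta)
    log_lik = log_norm_pdf(x[None, :], theta[:, None], 1.0).sum(axis=1)
    return log_norm_pdf(theta, 0.0, 1.0) + log_lik


def bridge_log_marginal(log_q, log_eta, post_sample, eta_sample, n_iter=100):
    """Iterative bridge sampling estimate of log m(x), equal sample sizes.

    log_q is the log unnormalised posterior, log_eta the log of the
    normalised pseudo-posterior. With l = q / eta, the fixed-point update is
        m <- mean_eta[ l / (l + m) ] / mean_post[ 1 / (l + m) ].
    """
    log_l_post = log_q(post_sample) - log_eta(post_sample)
    log_l_eta = log_q(eta_sample) - log_eta(eta_sample)
    # Rescale by exp(c) to avoid underflow; r stands for m * exp(-c).
    c = np.median(log_l_post)
    l_post = np.exp(log_l_post - c)
    l_eta = np.exp(log_l_eta - c)
    r = 1.0
    for _ in range(n_iter):
        num = np.mean(l_eta / (l_eta + r))
        den = np.mean(1.0 / (l_post + r))
        r = num / den
    return np.log(r) + c


N = 20_000
# Exact posterior draws (available here only because the model is conjugate).
post_mean, post_var = x.sum() / (n + 1), 1.0 / (n + 1)
post_sample = rng.normal(post_mean, np.sqrt(post_var), size=N)

# Pseudo-posterior eta: normal fit to the posterior sample, variance doubled.
eta_mean, eta_var = post_sample.mean(), 2.0 * post_sample.var()
eta_sample = rng.normal(eta_mean, np.sqrt(eta_var), size=N)


def log_eta(theta):
    return log_norm_pdf(theta, eta_mean, eta_var)


log_m_hat = bridge_log_marginal(log_post_unnorm, log_eta, post_sample, eta_sample)

# Closed form: x ~ N(0, I + 11'), hence the exact log marginal likelihood.
log_m_exact = (-0.5 * n * np.log(2 * np.pi) - 0.5 * np.log(n + 1)
               - 0.5 * (np.sum(x ** 2) - x.sum() ** 2 / (n + 1)))
print(log_m_hat, log_m_exact)
```

In a realistic setting the posterior sample would come from an MCMC run rather than from exact draws. Applying the same function to each model in turn, and taking the difference of the two log estimates, gives the log Bayes factor of the “double” approach.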

The boxplot above compares this solution of Meng and Schilling (2002, JASA), called double (because two pseudo-posteriors \eta_1(\theta_1) and \eta_2(\theta_2) have to be introduced), with the solution of Chen, Shao and Ibrahim (2001), based on a single completion \pi_1^* (using a normal centred at the estimate of the missing parameter, and with variance the estimate from the simulation), when testing whether or not the mean of a normal model with unknown variance is zero. The variabilities are quite comparable in this admittedly overly simple case. Overall, the performances of both extensions are obviously highly dependent on the choice of the completion factors, \eta_1 and \eta_2 on the one hand and \pi_1^* on the other hand. The performances of the single-completion solution, which bridges both models via \pi_1^*, are bound to deteriorate as the dimension gap between those models increases. The impact of the dimension of the models is less keenly felt for the double solution, as the approximation remains local.

Comments for València 9

June 23, 2010

Following discussions at CREST, we have contributed comments on the following papers:

Bernardo, José M. (Universitat de València, Spain)
Integrated objective Bayesian estimation and hypothesis testing. [discussion]

Consonni, Guido (Università di Pavia, Italy)
On moment priors for Bayesian model choice with applications to directed acyclic graphs. [discussion]

Frühwirth-Schnatter, Sylvia (Johannes Kepler Universität Linz, Austria)
Bayesian variable selection for random intercept modeling of Gaussian and non-Gaussian data. [discussion]

Huber, Mark (Claremont McKenna College, USA)
Using TPA for Bayesian inference. [discussion]

Lopes, Hedibert (University of Chicago, USA)
Particle learning for sequential Bayesian computation. [discussion]

Polson, Nicholas (University of Chicago, USA)
Shrink globally, act locally: Sparse Bayesian regularization and prediction. [discussion]

Wilkinson, Darren (University of Newcastle, UK)
Parameter inference for stochastic kinetic models of bacterial gene regulation: a Bayesian approach to systems biology. [discussion]

(with a possible incoming update on Mark Huber’s comments if we manage to get the simulations running in due time).

The Millennium Trilogy (tome 2)

June 20, 2010

Salander was at a loss. She actually was not interested in the answer. It was the process of solution that was the point. So she took a piece of paper and began scribbling figures when she read Fermat’s theorem. But she failed to find a proof for it.

Fulfilling a prediction made in an earlier post, I have read through the second Millennium Trilogy volume, Stieg Larsson’s The Girl Who Played with Fire, due to a chance encounter in the convenience shop of the hotel in Benidorm. My overall impression is better than after reading The Girl with the Dragon Tattoo, maybe because there are fewer scenes of raw cruelty, maybe because the hunt-within-the-hunt plot is more compelling, maybe because the action mostly takes place in the present.

By the time Andrew Wiles solved the puzzle in the 1990s, he had been at it for ten years using the world’s most advanced computer programme.

The book feels much more fast-paced than the previous one: it only covers a few calendar days, during which the police are searching for the “asocial” Lisbeth Salander, who is searching for a Russian sex-trafficker, who is himself searching for Salander! The very first bit, taking place in the West Indies, is completely unnecessary and does not even play a role in the rest of the novel (except to let us know that Salander was away, can face a tropical storm, seduce a teenager, and kill an abusive husband!). This volume tells us a lot about Salander’s childhood and the reasons why she and her mother ended up in psychiatric institutions. I also like how the book depicts the way the gutter press presents the worst possible picture of Salander from the very few tidbits leaked by the chief investigator (“lesbian Satanist psychopath”).

And all of a sudden she understood. The answer was so disarmingly simple. A game with numbers that lined up and then fell into place in a simple formula that was most similar to a rebus. She gazed straight ahead as she checked the equation.

Now, the inconsistencies and implausibilities I deplored in the first volume are to be found here as well. First and foremost, Salander is again acting as a super-woman in this novel, mastering parallel financial networks and computer hacking, fashionable clothing and German and Norwegian accents, home modelling (in case you cannot access an Ikea catalogue, the book provides the whole series of references, maybe a Swedish habit of replacing e.g. bookcase by Billy, etc…) and chess playing, fighting techniques (against two Hell’s Angels, no less!) and, best of all!, number theory. I do not understand the motivations of the author for including this mathematical connection (unless maybe he thinks autists all make good mathematicians [when the opposite is closer to the truth!]) but he presumably read some piece on Andrew Wiles’ resolution of Fermat’s Theorem and decided that Salander might as well have a go at it! Hence a sequence of (rather dumb) mathematical quotes about equations and a few idiotic sentences like the ones above. It sounds like the author (or at least Salander) believes that Fermat had a complete proof of his theorem…and of course that Salander, unlike the four centuries or so of mathematicians who vainly tried before her, can recover this proof! I have no competence in hacking, but the tricks used by Salander to penetrate the whole police force computer network sound rather primitive and unlikely to work, even when obtaining the password from a police officer. Similarly, the fact that private detectives get incorporated within the police team with no suspicion or limitation, and that the first leak ends up with one officer being incriminated instead of a private detective, does not sound plausible. The greater picture, namely that all characters are connected, is a weakness of many detective stories, but the book seems to be recycling about every useful character from the previous volume!
Lastly, the relation between Blomkvist and Salander is not well done: it is very predictable, with Salander over-reacting to Blomkvist’s long-term relation with Erika Berger and Blomkvist being completely unaware of this…

New arXiv papers

June 16, 2010

Some recent arXiv papers I will not have time to comment on:

València 9 papers on line

June 11, 2010

Just received this email from José Bernardo:

The pdf files of the Valencia 9 invited papers are now available online at the conference webpage, as a link placed by the author name in the V9 invited program list. These are the last version sent to me by the author, and will be substituted by more current ones as they become available.

I remind you that you are encouraged to submit written contributions to the discussion of any of these 24 papers even if you could not attend the meeting. Your discussions should be directly emailed by June 28th to the author(s) of the invited papers, with a copy to me. I will also need the LaTeX source and the eps files of any figures used. Contributions should not exceed six typeset pages (including figures) for invited discussions, and three pages for contributed discussions.

This means anyone can send discussions on the papers presented at the meeting, to be published soon in the Valencia 9 proceedings by Oxford University Press. We are just out of a post-conference meeting with our students and colleagues here at CREST, where we discussed the invited papers by Ickstadt, Nicholls (actually, sadly not open to written discussions!), Meek, and Wilkinson. (On Monday, we plan to cover Dunson, Frühwirth-Schnatter, Lopes, Polson, and Vannucci.)