Archive for Cardiff

visual effects

Posted in Books, pictures, Statistics with tags , , , , , , , , , , , on November 2, 2018 by xi'an

As advertised and re-discussed by Dan Simpson on the Statistical Modeling, &tc. blog he shares with Andrew and a few others, the paper Visualization in Bayesian workflow he wrote with Jonah Gabry, Aki Vehtari, Michael Betancourt and Andrew Gelman was one of three discussed at the RSS conference in Cardiff, last week month, as a Read Paper for Series A. I had stored the paper when it came out towards reading and discussing it, but as often this good intention led to no concrete ending. [Except concrete as in concrete shoes…] Hence a few notes rather than a discussion in Series B A.

Exploratory data analysis goes beyond just plotting the data, which should sound reasonable to all modeling readers.

Fake data [not fake news!] can be almost [more!] as valuable as real data for building your model, oh yes!, this is the message I am always trying to convey to my first year students, when arguing about the connection between models and simulation, as well as a defense of ABC methods. And more globally of the very idea of statistical modelling. While indeed “Bayesian models with proper priors are generative models”, I am not particularly fan of using the prior predictive [or the evidence] to assess the prior as it may end up in a classification of more or less all but terrible priors, meaning that all give very little weight to neighbourhoods of high likelihood values. Still, in a discussion of a TAS paper by Seaman et al. on the role of prior, Kaniav Kamary and I produced prior assessments that were similar to the comparison illustrated in Figure 4. (And this makes me wondering which point we missed in this discussion, according to Dan.)  Unhappy am I with the weakly informative prior illustration (and concept) as the amount of fudging and calibrating to move from the immensely vague choice of N(0,100) to the fairly tight choice of N(0,1) or N(1,1) is not provided. The paper reads like these priors were the obvious and first choice of the authors. I completely agree with the warning that “the utility of the the prior predictive distribution to evaluate the model does not extend to utility in selecting between models”.

MCMC diagnostics, beyond trace plots, yes again, but this recommendation sounds a wee bit outdated. (As our 1998 reviewww!) Figure 5(b) links different parameters of the model with lines, which does not clearly relate to a better understanding of convergence. Figure 5(a) does not tell much either since the green (divergent) dots stand within the black dots, at least in the projected 2D plot (and how can one reach beyond 2D?) Feels like I need to rtfm..!

“Posterior predictive checks are vital for model evaluation”, to wit that I find Figure 6 much more to my liking and closer to my practice. There could have been a reference to Ratmann et al. for ABC where graphical measures of discrepancy were used in conjunction with ABC output as direct tools for model assessment and comparison. Essentially predicting a zero error with the ABC posterior predictive. And of course “posterior predictive checking makes use of the data twice, once for the fitting and once for the checking.” Which means one should either resort to loo solutions (as mentioned in the paper) or call for calibration of the double-use by re-simulating pseudo-datasets from the posterior predictive. I find the suggestion that “it is a good idea to choose statistics that are orthogonal to the model parameters” somewhat antiquated, in that this sounds like rephrasing the primeval call to ancillary statistics for model assessment (Kiefer, 1975), while pretty hard to implement in modern complex models.

free and graphic session at RSS 2018 in Cardiff

Posted in pictures, Statistics, Travel, University life with tags , , , , , , , on July 11, 2018 by xi'an

Reposting an email I received from the Royal Statistical Society, this is to announce a discussion session on three papers on Data visualization in Cardiff City Hall next September 5, as a free part of the RSS annual conference. (But the conference team must be told in advance.)

Paper:             ‘Visualizing spatiotemporal models with virtual reality: from fully immersive environments to applications in stereoscopic view

Authors:         Stefano Castruccio (University of Notre Dame, USA) and Marc G. Genton and Ying Sun (King Abdullah University of Science and Technology, Thuwal)

 Paper:             Visualization in Bayesian workflow’

Authors:            Jonah Gabry (Columbia University, New York), Daniel Simpson (University of Toronto), Aki Vehtari (Aalto University, Espoo), Michael Betancourt (Columbia University, New York, and Symplectomorphic, New York) and Andrew Gelman (Columbia University, New York)

Paper:             ‘Graphics for uncertainty’

Authors:         Adrian W. Bowman (University of Glasgow)

PDFs and supplementary files of these papers from StatsLife and the RSS website. As usual, contributions can be sent in writing, with a deadline of September 19.

March(es) for Science

Posted in Statistics with tags , , , , , , , on April 22, 2017 by xi'an

Today there are around 500 marches for Science organised around the World (incl. on in Kangerlussuaq, Qeqqata, Greenland!). Primarily to protest the unprecedented attacks of trumpism on science, scientific values, and scientists, and not only through budget cuts, agency closures, public data erasures, but also denegation of scientific expertise and data to advance financial and partisan interests against climate, water preservation, minorities rights, women equality, and international relations. Being now at a remote retreat in Northern Wales, I will walk virtually at the Cardiff March for Science.

L’Aquila: earthquake, verdict, and statistics

Posted in Statistics, University life with tags , , , , , , , , , on October 25, 2012 by xi'an

Yesterday I read this blog entry by Peter Coles, a Professor of Theoretical Astrophysics at Cardiff and soon in Brighton, about L’Aquila earthquake verdict, condemning six Italian scientists to severe jail sentences. While most of the blogs around reacted against this verdict as an anti-scientific decision and as a 21st Century remake of Giordano Bruno‘s murder by the Roman Inquisition, Peter Coles argues in the opposite that the scientists were not scientific enough in that instance. And should have used statistics and probabilistic reasoning. While I did not look into the details of the L’Aquila earthquake judgement and thus have no idea whether or not the scientists were guilty in not signalling the potential for disaster, were an earthquake to occur, I cannot but repost one of Coles’ most relevant paragraphs:

I thought I’d take this opportunity to repeat the reasons I think statistics and statistical reasoning are so important. Of course they are important in science. In fact, I think they lie at the very core of the scientific method, although I am still surprised how few practising scientists are comfortable even with statistical language. A more important problem is the popular impression that science is about facts and absolute truths. It isn’t. It’s a process. In order to advance, it has to question itself.

Statistical reasoning also applies outside science to many facets of everyday life, including business, commerce, transport, the media, and politics. It is a feature of everyday life that science and technology are deeply embedded in every aspect of what we do each day. Science has given us greater levels of comfort, better health care, and a plethora of labour-saving devices. It has also given us unprecedented ability to destroy the environment and each other, whether through accident or design. Probability even plays a role in personal relationships, though mostly at a subconscious level.

A bit further down, Peter Coles also bemoans the shortcuts and oversimplification of scientific journalism, which reminded me of the time Jean-Michel Marin had to deal with radio journalists about an “impossible” lottery coincidence:

Years ago I used to listen to radio interviews with scientists on the Today programme on BBC Radio 4. I even did such an interview once. It is a deeply frustrating experience. The scientist usually starts by explaining what the discovery is about in the way a scientist should, with careful statements of what is assumed, how the data is interpreted, and what other possible interpretations might be and the likely sources of error. The interviewer then loses patience and asks for a yes or no answer. The scientist tries to continue, but is badgered. Either the interview ends as a row, or the scientist ends up stating a grossly oversimplified version of the story.