Archive for Edinburgh

limited shelf validity

Posted in Books, pictures, Statistics, University life on December 11, 2019 by xi'an

A great article from Steve Stigler in the new, multi-scaled, and so exciting Harvard Data Science Review, magisterially operated by Xiao-Li Meng, on the limitations of old datasets. Illustrated by three famous datasets used by three equally famous statisticians, Quetelet, Bortkiewicz, and Gosset. None of whom were fundamentally interested in the data for their own sake. First, Quetelet’s data was (wrongly) reconstructed, and Quetelet missed the opportunity to beat Galton at discovering correlation. Second, Bortkiewicz went looking (or even cherry-picking!) for rare events in yearly tables of mortality minutely divided between causes such as military horse kicks. The third dataset is not Guinness’s, but a test between two sleeping pills, operated rather crudely over inmates from a psychiatric institution in Kalamazoo, with further mishandling by Gosset himself. Manipulations that turn the data into dead data, as Steve puts it. (And illustrates with the above skull collection picture. As well as warning against attempts at resuscitating dead data into what could be called “zombie data”.)

“Successful resurrection is only slightly more common than in Christian theology.”

His global perspective on dead data is that they should stop being used, rather than having their (shelf) life extended by turning into benchmarks recycled over and over as a proof of concept. If only (my two cents) because it leads to calibrating (and choosing) methods that do well on these benchmarks. Another example that could have been added to the skulls above is the Galaxy Velocity Dataset that makes frequent appearances in works estimating Gaussian mixtures. Which Radford Neal signaled at the 2001 ICMS workshop on mixture estimation as an inappropriate use of the dataset, since astrophysical arguments weighed against mixture modelling.

“…the role of context in shaping data selection and form—context in temporal, political, and social as well as scientific terms—has been shown to be a powerful and interesting phenomenon.”

The potential for “dead-er” data (my neologism!) increases with time, in that the careful sleuth work Steve (and others) conducted about these historical datasets is absolutely impossible with the current massive data sets. Massive and proprietary. And presumably discarded once the associated neural net is designed and sold. Leaving the burden of unmasking the potential (or highly probable?) biases to others. Most interestingly, this echoes a “comment” in Nature of 17 October by Sabina Leonelli on the transformation of data from a national treasure to a commodity whose “ownership can confer and signal power”. But her call for openness and governance of research data seems as illusory as other attempts to sever the GAFAs from their extra-territorial privileges…

Bayes plaque

Posted in Books, pictures, Statistics, Travel, University life on November 22, 2019 by xi'an

at the centre of Bayes

Posted in Mountains, pictures, Statistics, Travel, University life on October 14, 2019 by xi'an

HW AMS & EPSRC MAC-MIGS CDT seminar

Posted in Statistics on October 10, 2019 by xi'an

Some explanation for all these acronyms! I am giving an Actuarial Mathematics & Statistics (AMS) seminar at Heriot-Watt (HW) University, in Edinburgh, tomorrow. But in the (new) Bayes Centre, at the University of Edinburgh, rather than on the campus of Heriot-Watt, as this is also the launch day of the Centre for Doctoral Training (CDT) on Mathematical Modelling, Analysis, & Computation (MAC), shared between Heriot-Watt and the University of Edinburgh, funded by the EPSRC, and located in the Maxwell Institute Graduate School (MIGS) in its Bayes Centre. My talk will be on ABC convergence and misspecification.

in a house of lies [book review]

Posted in Books, Travel on August 7, 2019 by xi'an

While I found Rankin’s latest Rebus novels a wee bit disappointing, this latest installment in the stories of the Edinburghian ex-detective is a true pleasure! Maybe because it takes the pretext of a “cold case” suddenly resurfacing to bring back to life characters met in earlier novels of the series. And the borderline practice of DI Rebus himself. Which should matter less at a stage when Rebus has been retired for 10 years (I could not believe it had been that long! But I feel like I followed Rebus for most of his career…) The plot is quite strong, with none of the last-minute revelations found in some earlier volumes, and with a secondary plot that is much more modern and poignant. I also suspect some of the new characters will reappear in the next books, as well as the consequences of a looming Brexit [pushed by a loony PM] on the Scottish underworld… (No, I do not mean Tories!)