Archive for CHANCE

chance call for book reviewers

Posted in Statistics on May 14, 2019 by xi'an

Since I have been unable to find local reviewers for my CHANCE review column of the above recent CRC Press books, namely

I am calling for volunteers among ‘Og’s readers. Please contact me if interested.

Statistics and Health Care Fraud & Measuring Crime [ASA book reviews]

Posted in Books, Statistics on May 7, 2019 by xi'an

From the recently started ASA books series on statistical reasoning in science and society (of which I already reviewed a sequel to The Lady Tasting Tea), a short book, Statistics and Health Care Fraud, which I read at the doctor's while waiting for my appointment, with no chance of cheating! While it made me realise that there is a significant amount of health care fraud in the US, which I had never thought of before (!), with possibly specific statistical features to the problem, besides the use of extreme value theory, I did not find much insight there on the techniques used to detect these frauds, besides the accumulation of Florida and Texas examples. As such this is a very light introduction to the topic, whose intended audience remains unclear to me. It stops short of making a case for statistics and modelling against more machine-learning options. And it does not seem to mention false positives… That is, the inevitable occurrence of some doctors or hospitals being above the median costs! (A point I remember David Spiegelhalter making a long while ago, during a memorable French statistical meeting in Pau.) The book also illustrates the use of a free auditing software called Rat-stats for multistage sampling, which apparently does not go beyond selecting claims at random according to their amount. Without learning from past data. (I also wonder if the criminals can reduce the chances of being caught by using this software.)
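To make the false-positive point concrete, here is a minimal Python sketch, not taken from the book: it simulates a purely hypothetical population of 1,000 honest providers (the lognormal cost distribution and its parameters are my assumptions) and flags every provider billing above the median. By construction, half of these honest providers end up flagged.

```python
import random
import statistics


def flag_above_median(costs):
    """Return the indices of providers whose cost is strictly above the median."""
    med = statistics.median(costs)
    return [i for i, c in enumerate(costs) if c > med]


random.seed(0)  # fixed seed so the run is reproducible

# Hypothetical population: 1000 honest providers with lognormal yearly billing.
# Both the distribution and its parameters are illustrative assumptions.
honest_costs = [random.lognormvariate(10, 0.5) for _ in range(1000)]

flagged = flag_above_median(honest_costs)
false_positive_rate = len(flagged) / len(honest_costs)

print(f"{len(flagged)} honest providers flagged "
      f"({false_positive_rate:.0%} false positives)")
```

Any rule of the form "above some quantile of the costs" will flag a fixed share of providers whether or not fraud exists, which is why a detection method needs more than a ranking by amount.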

A second book on the “same” topic!, Measuring Crime, I read, not waiting at the police station, but while flying to Venezia. As indicated by the title, this is about measuring crime, with a lot of emphasis on surveys and censuses and the potential measurement errors at different levels of surveying or censusing… Again very little on statistical methodology, apart from questioning the data, the mode of surveying, crossing different sources, and establishing the impact of the way questions are stated, but also little on bias and the impact of policing and prevention AIs, as discussed in Weapons of Math Destruction and in some of Kristian Lum’s papers. Except for the almost obligatory reference to Minority Report. The book also concludes with a history chapter centred on Edith Abbott setting the bases for serious crime data collection in the 1920s.

[And the usual disclaimer applies, namely that this bicephalic review is likely to appear later in CHANCE, in my book reviews column.]

Bayes for good

Posted in Books, Mountains, pictures, Running, Statistics, Travel, University life on November 27, 2018 by xi'an

A very special weekend workshop on Bayesian techniques used for social good in many different senses (and talks) that we organised with Kerrie Mengersen and Pierre Pudlo at CiRM, Luminy, Marseilles. It started with Rebecca (Beka) Steorts (Duke) explaining [by video from Duke] how the Syrian war deaths were processed to eliminate duplicates, to be continued on Monday at the “Big” conference. Then came Alex Volfovsky (Duke) on a Twitter experiment on whether being exposed to adverse opinions is depolarising (not!) or further polarising (yes), turning into network causal analysis. And then Kerrie Mengersen (QUT) on the use of Bayesian networks in ecology, through observational studies she conducted. And the role of neutral statisticians in case of adversarial experts!

Next day, the first talk was by David Corliss (Peace-Work), who writes the Stats for Good column in CHANCE and here gave a recruiting spiel for volunteering in good initiatives. Quoting Florence Nightingale as the “first” volunteer. And presenting a broad collection of projects in support of his recommendations for “doing good”. We then heard [by video] Julien Cornebise from Element AI in London telling of his move out of DeepMind towards investing in social-impact projects through this new startup. Including working with Amnesty International on Darfur village destructions, building evidence from satellite imaging. And crowdsourcing. With an incoming report on the year's activities (still under embargo). A most exciting and enthusiastic talk!

Is that a big number? [book review]

Posted in Books, Kids, pictures, Statistics on July 31, 2018 by xi'an

A book I received prior to its publication a few days ago from Oxford University Press (OUP), as a book editor for CHANCE (usual provisions apply: the contents of this post will be more or less reproduced in my column in CHANCE when it appears). A copy that I found in my mailbox in Warwick last week and read over the (very hot) weekend.

The overall aim of this book by Andrew Elliott is to encourage numeracy (or fight innumeracy) by making sense of absolute quantities by putting them in perspective, teaching about log scales, visualisation, and divide-and-conquer techniques. And providing a massive list of examples and comparisons, sometimes for page after page… The book is associated with a fairly rich website, itself linked with the many blogs of the author and a myriad of other links and items of information (among which I learned of the recent and absurd launch of Elon Musk’s Tesla car in space! A première in garbage dumping…). From what I can gather from these sites, some (most?) of the material in the book seems to have emerged from the various blog entries.

“Length of River Thames (386 km) is 2 x length of the Suez Canal (193.3 km)”

Maybe I was too exhausted by heat and a very busy week in Warwick for our computational statistics week, the 2018 football World Cup having nothing to do with this, but I could not keep reading the chapters of the book in a continuous manner, suffering from massive information overload! Being given thousands of entries kills [for me] the appeal of putting weight or sense to large and very large and humongous quantities. And the final vignette in each chapter, pairing numbers like the one above or the one below

“Time since earliest writing (5200 y) is 25 x time since birth of Darwin (208 y)”

only evokes the remote memory of some kids' magazine I read from time to time as a kid with this type of entry (I cannot remember the name of the magazine!). Or maybe it was a magazine I would browse while waiting at the hairdresser’s (which brings back memories of endless waits, maybe because I did not like going to the hairdresser…). Some of the background about measurement and other curios carries a sense of Wikipediesque absolute in its minute details.

A last point of disappointment about the book is the poor graphical design or support. While the author insists on the importance of visualisation in grasping the scales of large quantities, and the webpage is full of such entries, there is very little backup with great graphs to be found in “Is that a big number?” Some of the pictures seem taken from an anonymous databank (where are the towers of San Gimignano?!) and there are not enough graphics. For instance, the fantastic graphics of xkcd, such as the xkcd money chart poster. Or the one about the future. Or many, many others.

While the style is sometimes light and funny, an overall impression of dryness remains and in comparison I much preferred Kaiser Fung’s Numbers Rule Your World and even more both Guesstimation books!

chance meeting

Posted in Statistics on July 10, 2018 by xi'an

As I was travelling to Coventry yesterday, I spotted this fellow passenger on the train from Birmingham with a Valencia 9 bag, and had a chat with him. It was a pure chance encounter as he was not attending our summer school, but continued down the line. (These bags are quite sturdy and I kept mine until a zipper broke.)

CHANCE on modern slavery

Posted in Books, Kids, Statistics on December 23, 2017 by xi'an

Just to mention the latest issue of CHANCE dedicated to the statistical issues related to slavery, edited in collaboration with the Walk Free Foundation. (I remember discussing the possibility of such an issue at the CHANCE editors meeting at JSM, Boston. I also remember Bernard Silverman discussing the case as Senior Scientist to the UK Government.) Difficulties range from defining slavery, to estimating the number of slaves, for instance by capture-mark-recapture methods, to designing ways to protect against slavery. (A stunning figure is the estimated 180,000 slaves in Poland and 20,000 in The Netherlands…)
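As a reminder of how capture-mark-recapture works in its simplest form, here is a minimal Python sketch of the two-list Lincoln-Petersen estimator and its Chapman small-sample correction. This is the textbook version only, not the method used in the CHANCE issue (applications to slavery presumably rely on several overlapping lists and more elaborate models), and the counts in the example are made up for illustration.

```python
def lincoln_petersen(n_first, n_second, n_both):
    """Two-list estimate of a hidden population size.

    n_first:  individuals recorded on the first list
    n_second: individuals recorded on the second list
    n_both:   individuals recorded on both lists
    """
    if n_both <= 0:
        raise ValueError("the two lists must overlap for the estimate to exist")
    return n_first * n_second / n_both


def chapman(n_first, n_second, n_both):
    """Chapman's variant, less biased for small overlaps and defined when n_both is 0."""
    return (n_first + 1) * (n_second + 1) / (n_both + 1) - 1


# Hypothetical counts: 200 cases known to one agency, 150 to another, 30 to both.
estimate = lincoln_petersen(200, 150, 30)
corrected = chapman(200, 150, 30)
print(f"Lincoln-Petersen: {estimate:.0f}, Chapman: {corrected:.0f}")
```

The estimate rests on the assumptions that the two lists are independent and that every individual has the same chance of being recorded, both of which are questionable for hidden populations.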

Practicals of Uncertainty [book review]

Posted in Books, Statistics, University life on December 22, 2017 by xi'an

On my way to the O’Bayes 2017 conference in Austin, I [paradoxically!] went through Jay Kadane’s Pragmatics of Uncertainty, which had been published earlier this year by CRC Press. The book is to be seen as a practical illustration of the Principles of Uncertainty Jay wrote in 2011 (and I reviewed for CHANCE). The avowed purpose is to allow the reader to check through Jay’s applied work whether or not he had “made good” on setting out clearly the motivations for his subjective Bayesian modelling. (While I presume the use of the same P of U in both books is mostly a coincidence, I started wondering what a third P of U volume could be called. Perils of Uncertainty? Peddlers of Uncertainty? The game is afoot!)

The structure of the book is a collection of fifteen case studies undertaken by Jay over the past 30 years, covering paleontology, survey sampling, legal expertise, physics, climate, and even medieval Norwegian history. Each chapter starts with a short introduction that often explains how he came by the problem (most often as an interesting PhD student consulting project at CMU), what the difficulties in the analysis were, and what became of his co-authors. As noted by the author, the main bulk of each chapter is the reprint (in a unified style) of the paper and most of these papers are actually freely available on-line. The chapter always concludes with an epilogue (or post-mortem) that re-considers (very briefly) what had been done and what could have been done and whether or not the Bayesian perspective was useful for the problem (unsurprisingly so for the majority of the chapters!). There are also reading suggestions in the other P of U and a few exercises.

“The purpose of the book is philosophical, to address, with specific examples, the question of whether Bayesian statistics is ready for prime time. Can it be used in a variety of applied settings to address real applied problems?”

The book thus comes as a logical complement to the Principles, to demonstrate how Jay himself did apply his Bayesian principles to specific cases and how one can set the construction of a prior, of a loss function or of a statistical model in identifiable parts that can then be criticised or reanalysed. I find browsing through this series of fourteen different problems fascinating and exhilarating, while I admire the dedication of Jay to every case he presents in the book. I also feel that this comes as a perfect complement to the earlier P of U, in that it makes referring to a complete application of a given principle most straightforward, the problem being entirely described, analysed, and in most cases solved within a given chapter. A few chapters have discussions, having been published in the Valencia meeting proceedings or another journal with discussions.

While all papers have been reset in the book style, I wish the graphs had been edited as well, as they do not always look pretty. Although this would have implied a massive effort, it would have also been great had each chapter and problem been re-analysed or at least discussed by another fellow (?!) Bayesian in order to illustrate the impact of individual modelling sensibilities. This may however be a future project for a graduate class. Assuming all datasets are available, which is unclear from the text.

“We think however that Bayes factors are overemphasized. In the very special case in which there are only two possible “states of the world”, Bayes factors are sufficient. However in the typical case in which there are many possible states of the world, Bayes factors are sufficient only when the decision-maker’s loss has only two values.” (p. 278)

The above is in Jay’s reply to a comment from John Skilling regretting the absence of marginal likelihoods in the chapter. Reply to which I completely subscribe.

[Usual warning: this review should find its way into CHANCE book reviews at some point, with a fairly similar content.]