Archive for springer

Bayes Factors for Forensic Decision Analyses with R [book review]

Posted in Books, R, Statistics on November 28, 2022 by xi'an

My friend EJ Wagenmakers pointed me towards an entire book on the BF by Bozza (from Ca’Foscari, Venezia), Taroni and Biedermann. It provides a sort of blueprint for using Bayes factors in forensics for both investigative and evaluative purposes. With R code and free access. I am of course unable to judge the relevance of the approach for forensic science (I was under the impression that Bayesian arguments were usually not well-received in the courtroom) but find that overall the approach is rather one of repositioning the standard Bayesian tools within a forensic framework.

“The [evaluative] purpose is to assign a value to the result of a comparison between an item of unknown source and an item from a known source.”

And thus I found nothing shocking or striking from this standard presentation of Bayes factors, including the call to loss functions, if a bit overly expansive in its exposition. The style is also classical, with a choice of grey background vignettes for R coding parts that we also picked in our R books! If anything, I would have expected more realistic discussions and illustrations of prior specification across the hypotheses (see e.g. page 34), while the authors are mostly centering on conjugate priors and the (de Finetti) trick of the equivalent prior sample size. Bayes factors are mostly assessed using a conservative version of Jeffreys’ “scale of evidence”. The computational section of the book introduces MCMC (briefly) and mentions importance sampling, harmonic mean (with a minimalist warning), and Chib’s formula (with no warning whatsoever).
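To make the computational remark concrete, here is a minimal sketch (in Python rather than the book's R, and not taken from the book) comparing the exact evidence of a Binomial model under a conjugate Beta prior with two Monte Carlo approximations: importance sampling using the prior as proposal, and the harmonic mean estimator. The function names and the data (7 successes out of 20 trials) are hypothetical choices made purely for illustration.

```python
import math
import random


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def log_binom_coef(y, n):
    """Logarithm of the binomial coefficient C(n, y)."""
    return math.lgamma(n + 1) - math.lgamma(y + 1) - math.lgamma(n - y + 1)


def log_lik(theta, y, n):
    """Binomial log-likelihood; theta is clamped away from 0 and 1."""
    theta = min(max(theta, 1e-12), 1.0 - 1e-12)
    return log_binom_coef(y, n) + y * math.log(theta) + (n - y) * math.log(1.0 - theta)


def exact_evidence(y, n, a, b):
    """Closed-form marginal likelihood under a Beta(a, b) prior."""
    return math.exp(log_binom_coef(y, n) + log_beta(a + y, b + n - y) - log_beta(a, b))


def prior_sampling_evidence(y, n, a, b, draws, rng):
    """Importance sampling with the prior as proposal: average the likelihood."""
    total = sum(math.exp(log_lik(rng.betavariate(a, b), y, n)) for _ in range(draws))
    return total / draws


def harmonic_mean_evidence(y, n, a, b, draws, rng):
    """Harmonic mean of the likelihood over draws from the (conjugate) posterior."""
    total = sum(
        math.exp(-log_lik(rng.betavariate(a + y, b + n - y), y, n))
        for _ in range(draws)
    )
    return draws / total


if __name__ == "__main__":
    rng = random.Random(2022)  # fixed seed, illustrative only
    y, n, a, b = 7, 20, 1.0, 1.0  # hypothetical data and uniform prior
    print("exact         :", exact_evidence(y, n, a, b))
    print("prior sampling:", prior_sampling_evidence(y, n, a, b, 20000, rng))
    print("harmonic mean :", harmonic_mean_evidence(y, n, a, b, 20000, rng))
```

In this setting the inverse likelihood has infinite variance under the posterior (the uniform prior has heavier tails than the likelihood), so the harmonic mean output should not be trusted to stabilise, which is presumably what a less minimalist warning would spell out.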

“The [investigative] purpose is to provide information in investigative proceedings (…) The scientist (…) uses the findings to generate hypotheses and suggestions for explanations of observations, in order to give guidance to investigators or litigants.”

Chapter 2 is about standard models: inferring about a proportion (with some Monte Carlo illustration and the complication of background elements) and about a normal mean, with an improper prior making an appearance [on p.69] and no mention being made of the general prohibition of such generalised priors when using Bayes factors, or even of the Lindley-Jeffreys paradox. Again, the main difference with Bayesian textbooks lies in the chosen examples.
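The issue with improper priors can be sketched in a few lines of Python (an illustration under stated assumptions, not the book's code): for a Normal sample mean with known variance, testing H0: mu = 0 against H1: mu ~ N(0, tau2), the Bayes factor in favour of the null grows without bound as the prior variance tau2 increases, which is the Lindley-Jeffreys paradox. With a flat prior pi(mu) = c, the Bayes factor is only defined up to the arbitrary constant c. All function names and numerical values below are hypothetical.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a N(mean, var) distribution at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def bf01_normal_prior(xbar, n, sigma2, tau2):
    """Bayes factor of H0: mu = 0 against H1: mu ~ N(0, tau2).

    Under H0 the sample mean is N(0, sigma2 / n);
    under H1 it is marginally N(0, tau2 + sigma2 / n).
    """
    se2 = sigma2 / n
    return normal_pdf(xbar, 0.0, se2) / normal_pdf(xbar, 0.0, se2 + tau2)


def bf01_flat_prior(xbar, n, sigma2, c):
    """Same test with the improper flat prior pi(mu) = c under H1.

    The marginal under H1 integrates to c, so the result scales as 1 / c.
    """
    return normal_pdf(xbar, 0.0, sigma2 / n) / c


if __name__ == "__main__":
    xbar, n, sigma2 = 0.2, 100, 1.0  # hypothetical data, z-score of 2
    for tau2 in (1.0, 100.0, 10000.0):
        print(tau2, bf01_normal_prior(xbar, n, sigma2, tau2))
    for c in (1.0, 2.0):
        print(c, bf01_flat_prior(xbar, n, sigma2, c))
```

The same observation, two standard errors away from the null value, ends up supporting the null more and more strongly as tau2 grows, while doubling c in the flat prior halves the Bayes factor with nothing in the data or the model to pin c down.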

Chapter 3 focuses on evidence evaluation [not in the computational sense] but, again, the coverage is about standard models: processing the Binomial, multinomial, Poisson models, again through conjugates. (With the side remark that Fig 3.2 is rather unhelpful: when moving the prior probability of the null from zero to one, its posterior probability also moves from zero to one!) We are back to the Normal mean case with the model variance being known then unknown. (An unintentionally funny remark (p.96) about the dependence between mean and variance being seen as too restrictive and replaced with… independence!) At last (for me!), the book is pointing out [p.99] that the BF is highly sensitive to the choice of the prior variance (Lindley-Jeffreys, where art thou?!), but with a return of the improper prior (on said variance, p.102) with no debate on the ensuing validity of the BF. Multivariate Normals are also presented, with Wishart priors on the precision matrix, and more details about Chib’s estimate of the evidence. This chapter also contains illustrations of the so-called score-based BF, which is simply (?) a Bayes factor using a distribution on a distance summary (between a hypothetical population and the data) and an approximation of the distributions of these summaries, provided enough data is available… I also spotted a potentially interesting foray into BF variability (Section 3.4.2), although not reaching all the way to a notion of BF posterior distributions.
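The side remark on Fig 3.2 follows directly from the identity posterior odds = Bayes factor × prior odds. As a hedged Python sketch (the function name and the value of the Bayes factor are hypothetical, not taken from the book), the map from prior to posterior probability of the null is increasing and runs from zero to one whatever the Bayes factor:

```python
def posterior_prob_null(prior_prob, bf01):
    """Posterior probability of the null from its prior probability and BF01."""
    if not 0.0 <= prior_prob <= 1.0:
        raise ValueError("prior_prob must lie in [0, 1]")
    if bf01 <= 0.0:
        raise ValueError("bf01 must be positive")
    if prior_prob == 1.0:
        return 1.0  # infinite prior odds: the posterior probability stays at one
    prior_odds = prior_prob / (1.0 - prior_prob)
    posterior_odds = bf01 * prior_odds
    return posterior_odds / (1.0 + posterior_odds)


if __name__ == "__main__":
    bf01 = 3.0  # hypothetical Bayes factor in favour of the null
    for k in range(11):
        p = k / 10.0
        print(p, posterior_prob_null(p, bf01))
```

The only information such a curve carries is its curvature, which is entirely determined by the single number BF01.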

Chapter 4 covers Bayes factors for investigation, where the alternative(s) is (are) less specified, as in testing e.g. Basmati rice vs non-Basmati rice. But there is no non-parametric alternative considered in the book. Otherwise, it looks to me rather similar to Chapter 3, i.e. being back to binomial and multinomial models, with more discussion on prior specification, more normal or non-normal models, where the prior distribution is puzzlingly estimated by a kernel density estimator, a portmanteau alternative (p.157), more multivariate Normals with Wishart priors, and an entry on classification & discrimination.

[Disclaimer about potential self-plagiarism: this post or an edited version will eventually appear in my Books Review section in CHANCE. As appropriate for a book about Chance!]

le logiciel R

Posted in Books, R, Statistics, University life on August 25, 2011 by xi'an

For once, here is a book review I wrote in French (given here in English translation) about the book Le logiciel R, written by Pierre Lafaye de Micheaux (Université de Montréal), Rémy Drouilhet (Université de Grenoble 2) and Benoît Liquet (Université de Bordeaux 2):

This book, published by Springer (in the same series as Le Choix Bayesien), offers an exhaustive coverage of the main functions of the R language and illustrates its use in solving the most classical statistical problems. It is designed for a relatively wide audience and can therefore serve both as a textbook for graduate courses and as a reference book for researchers and engineers using R. In less than 500 pages, the book shows how to develop a working platform entirely based on this free software, which forms the basis of many statistical experiments, thanks to the formidable collection of procedures built by the R community.

The book is very well designed from a pedagogical viewpoint, with many typographical finds to distinguish the various levels of discourse. It comes with a dedicated package available online and with appropriate datasets. The first part, devoted to the language itself, is undoubtedly the most successful, with a truly complete coverage of the possibilities of R, to the point of including instructions for creating one's own R package, and a nice section on object-oriented programming. The reference operating system is Windows, which may not be the most adequate choice in terms of both handiness and potential users, but the book also contains details for users of the Mac OS and Linux systems. I would undoubtedly be a bit more critical of the statistical part, in the sense that the authors chose to include many reminders of mathematical statistics and even of probability, which does not seem to me to be the primary purpose of the book. But this is not a fundamental problem in that readers can only feel comforted in their learning of R. For each chapter, the book includes exercises that are elementary but most useful for self-testing, and much more substantial lab sessions (TPs). Easy to read thanks to its typographical choices, the book even contains a few colour graphs. It is in my opinion the best training book on the R language available in French and I plan to use it from the start of the term for my L3 course.

The book is currently under translation by Robin Ryder (who also translated part of my book Introducing Monte Carlo Methods with R) and Robin told me today he was very nearly done, so the translation should soon appear in one of Springer’s collections, most likely Use R!.

StatProb [wiki]

Posted in R, Statistics on August 1, 2010 by xi'an

Via the [financial and technical] support of Springer, probability and statistics societies are launching a specialised wiki called StatProb. It operates as a wiki in that authors can submit short articles on any topic, with further co-authors joining in later to improve those articles, but with the contents guaranteed via the filter of an editorial board. The members of the board and subsequent associate editors are nominated by the statistical societies involved in the project. (For instance, I was nominated by the Royal Statistical Society, Susie Bayarri by ISBA, George Casella by the ASA, etc.) As a starting basis, StatProb will reproduce a few hundred entries from the forthcoming International Encyclopedia of Statistical Sciences edited by Miodrag Lovric (to which I contributed). Obviously, the wiki will only work if enough contributors submit their pieces and make StatProb a reference for statistics. I joined the project because, as opposed to costly encyclopedias, wikis are living things that evolve with the field (if enough activity is maintained by their members) and that can be accessed freely by all. Another good thing about StatProb is that entries are submitted in LaTeX, making the output look fairly reasonable. (To start the ball rolling, we submitted this short piece on random number generation with George Casella, extracted from an older piece that had been sitting around for a while. It is not meant to be the only piece on random number generation, nor on MCMC or Monte Carlo methods. And it can be updated and augmented as in other wikis.) Unless I am confused, I think the site will be officially launched at JSM 2010 in Vancouver this weekend.
