“Enfin, l’envahissement des méthodologistes amène à avoir des reflexions purement mathématiques.” [At last, the invasion of methodologists leads to purely mathematical reasoning.]

“Ce modèle, qui a nourri une quantité de méthodologistes, est devenu une dictature morale.” [This model, which has fed a host of methodologists, has become a moral dictatorship.]

“…il faut nous débarasser des mathématiciens, des metéorologistes [sic] dans ce domaine.” [we must get rid of the mathematicians and meteorologists in this domain]

“…conseil scientifique dans lequel on trouvait deux modélisateurs de l’avenir (qui pour moi représentent l’équivalent de l’astrologie), des maniaques de la méthodologie. Les médecins confrontés au problème du soin représentaient une minorité qui n’avait pas nécessairement l’habitude de s’exprimer et qui se trouvait noyée par cet habillage pseudo-scientifique.” […a scientific committee in which one found two modellers of the future (who to me represent the equivalent of astrology) and maniacs of methodology. The physicians facing the problem of actual care were a minority, not necessarily used to speaking up, who found themselves drowned by this pseudo-scientific dressing.]

Obviously I have no expertise in drug development or even in epidemiology, but the name-calling tone of this tribune, as illustrated by the above quotes, is appalling and populist, more in the spirit of Trump than of a rational search for scientific evidence. On the contrary, the arguments therein are a-scientific, rejecting the use of mathematical and statistical methodology for being… mathematical, and resorting to name-calling, while not considering the more-than-philosophical aspect that opting for this drug rather than another one may reduce survival chances for some groups of patients. *(While the title chosen for this post reflects the title of Raoult’s tribune, with its philosophical pretenses, let me stress that Feyerabend’s book is not mentioned therein and that the article contains no indication that the author relates to Feyerabend’s views.)*

**A** weekly *Le Monde* current mathematical puzzle that reminded me of an earlier one (but I was too lazy to check):

If ADULE-ELUDE=POINT, what is the largest possible value of POINT? With the convention that all letters correspond to different digits and no number can start with a 0. Same question when ADULE+ELUDE=POINT.

A brute-force R search returns 76032 as the largest value (code golfing welcome!):

```r
library(dplyr)  # for distinct()

# difference between ADULE and its (digit) reversal, as five-digit numbers
dify <- function(adule)
  sum(adule * 10^(4:0)) - sum(rev(adule) * 10^(4:0))
# split a number into its five digits, least significant first
num2dig <- function(dif) (dif %/% 10^(0:4)) %% 10

sl <- NULL
for (t in 1:1e6) {
  adule <- sample(0:9, 5)
  # require a positive difference and nonzero leading digits
  while ((dify(adule) <= 0) || (!prod(adule[c(1, 5)])))
    adule <- sample(0:9, 5)
  point <- rev(num2dig(dify(adule)))
  # digits of POINT must be distinct and differ from those of ADULE
  if ((!sum(duplicated(point))) & (prod(point %in% (0:9)[-adule - 1])))
    sl <- rbind(sl, c(adule, point))
}
sl <- as.matrix(distinct(as.data.frame(sl), .keep_all = TRUE))
```

where *distinct* is a dplyr R function.

```
> 94581-18549
[1] 76032
```

The code is easily adapted to solve the second question:
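For the record, a minimal sketch of the adaptation (reusing the digit conventions above), where the difference is simply replaced by a sum:

```r
# sum of ADULE and its digit reversal, as five-digit numbers
sumy <- function(adule)
  sum(adule * 10^(4:0)) + sum(rev(adule) * 10^(4:0))
sumy(c(3, 1, 7, 8, 2))  # ADULE = 31782, reversal = 28713, POINT = 60495
```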

```
> 31782+28713
[1] 60495
```

“The [UK] Department of Health says reported changes to the abortion law, that would allow women to take both pills at home during the coronavirus outbreak, are not going ahead.” *Independent*, 23 March

“Texas and Ohio have included abortions among the nonessential surgeries and medical procedures that they are requiring to be delayed, saying they are trying to preserve precious protective equipment for health care workers and to make space for a potential flood of coronavirus patients.” *The New York Times*, 23 March

“Le ministre de la Santé, Olivier Véran, et la secrétaire d’Etat chargée de l’Égalité femmes-hommes, Marlène Schiappa, ont tenté lundi de rassurer : les IVG « sont considérées comme des interventions urgentes », et leur « continuité doit être assurée ». Le gouvernement veillera à ce que « le droit des femmes à disposer de leur corps ne soit pas remis en cause », ont-ils assuré.” [Health Minister Olivier Véran and Marlène Schiappa, Secretary of State for Gender Equality, attempted to reassure on Monday: abortions “are considered urgent interventions” and their “continuity must be ensured”. The government will make sure that “women’s right to dispose of their own bodies is not called into question”, they assured.] *Le Parisien*, 23 March

“Lawmakers voted on Wednesday to liberalize New Zealand’s abortion law and allow unrestricted access during the first half of pregnancy, ending the country’s status as one of the few wealthy nations to limit the grounds for abortion during that period.” *The New York Times*, 18 March


Data de-duplication is the process of detecting records in one or more datasets that refer to the same entity. In this paper, we tackle the de-duplication process via a latent entity model, where the observed data are perturbed versions of a set of key variables drawn from a finite population of N different entities. The main novelty of our approach is to consider the population size N as an unknown model parameter. As a result, a salient feature of the proposed method is the capability of the model to account for the de-duplication uncertainty in the population size estimation. As by-products of our approach, we illustrate the relationships between de-duplication problems and capture-recapture models, and we obtain a more adequate prior distribution on the linkage structure. Moreover, we propose a novel simulation algorithm for the posterior distribution of the matching configuration, based on the marginalization of the key variables at the population level. We apply our method to two synthetic data sets comprising German names. In addition, we illustrate a real data application, where we match records from two lists reporting information about people killed in the recent Syrian conflict.

As a preliminary warning about my biases, I am always reserved about mixing introductions to probability theory and to (Bayesian) statistics in the same book, as I feel they should be separated to avoid confusion, as for instance between histograms and densities, or between (theoretical) expectations and (empirical) means. I therefore fail to relate to the pace and tone adopted in the book which, in my opinion, seems to dally on overly simple examples [far too often concerned with food or baseball] while skipping over the concepts and background theory. For instance, introducing the concept of subjective probability as early as page 6 is laudable, but I doubt it will engage fresh readers when describing it as a measurement of one’s “belief about the truth of an event”, then stressing that to “make any kind of measurement, one needs a tool like a scale or ruler”. Overall, I have no particularly focused criticism of the probability part, except for the discrete vs continuous imbalance. (With the Poisson distribution not covered in the Discrete Distributions chapter. And the “bell curve” making a weird and unrigorous appearance there.) Galton’s board (with no mention found of the quincunx) could have been better exploited towards the physical definition of a prior, following Stephen Stigler’s analysis, by adding a second level. Or turned into an R coding exercise. In the continuous distributions chapter, I would have seen the cdf coming before the pdf, rather than the opposite. And I disliked the notion that a Normal distribution was supported by a histogram of (marathon) running times, i.e., values lower bounded by 122 (at the moment). Or later (in Chapter 8) by Roger Federer’s serving times. Incidentally, a fun typo on p.191, at least fun for LaTeX users, as

with an extra space between `\’ and `mid’! (I also noticed several occurrences of the unavoidable “the the” typo in the last chapters.) The simulation from a bivariate Normal distribution is hidden behind a customised R function sim_binom(), when it could easily have been described as a two-stage hierarchy. And there is no comment on the fact that a sample of Y-1.5X can be directly derived from the joint sample. (Too unconscious a statistician?)
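For illustration, the two-stage hierarchy alluded to above can be sketched in a few lines of R (with made-up parameter values, as the book’s sim_binom() arguments are not reproduced here): simulate X from its marginal, then Y from its conditional given X, and derive the Y-1.5X sample directly from the joint draws.

```r
# two-stage simulation of a bivariate Normal: X marginal, then Y given X
# (mux, muy, sigx, sigy, rho are illustrative values, not the book's)
set.seed(1)
mux <- 0; muy <- 0; sigx <- 1; sigy <- 1; rho <- 0.7
n <- 1e5
x <- rnorm(n, mux, sigx)                       # stage 1: marginal of X
y <- rnorm(n, muy + rho * sigy / sigx * (x - mux),
           sigy * sqrt(1 - rho^2))             # stage 2: conditional of Y given X
z <- y - 1.5 * x                               # Y - 1.5 X, straight from the joint sample
```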

When moving to Bayesian inference, a large section is spent on very simple models like estimating a proportion or a mean, covering both discrete and continuous priors. And strongly focusing on conjugate priors, despite warnings that they do not necessarily reflect prior information or prior belief. With some debatable recommendations for “large” prior variances as weakly informative or (worse) for an Exp(1) as a reference prior on the sampling precision in the linear model (p.415). But the book also covers Bayesian model checking, either via the prior predictive (hence Bayes factors) or via the posterior predictive (with no mention of using the data twice). A mere marginalia introduces a sufficient statistic for the Normal model. And, in the Normal model checking section, an estimate of the posterior density of the mean is used without (apparent) explanation.
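As a reminder of the conjugate analysis the book leans on, the posterior for a proportion under a Beta prior is available in closed form (a toy illustration with made-up numbers, not taken from the book):

```r
# conjugate update: Beta(a, b) prior with y successes in n trials
# yields a Beta(a + y, b + n - y) posterior (illustrative numbers)
a <- 1; b <- 1                            # uniform prior on the proportion
y <- 7; n <- 10
post_a <- a + y
post_b <- b + n - y
post_mean <- post_a / (post_a + post_b)   # 2/3, pulled from the MLE 0.7 towards 0.5
```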

“It is interesting to note the strong negative correlation in these parameters. If one assigned informative independent priors on and , these prior beliefs would be counter to the correlation between the two parameters observed in the data.”

For the same reasons of having to cut on mathematical validation and rigour, Chapter 9 on MCMC does not explain why MCMC algorithms converge outside of the finite state space case. The proposal in the algorithmic representation is chosen as a Uniform one, since larger dimensional problems are handled by either Gibbs sampling or JAGS. The recommendations about running MCMC do not include how many iterations one “should” run (or other common queries on Stack eXchange), although they do include the sensible advice of running multiple chains and of comparing simulated predictive samples with the actual data as a model check. However, the MCMC chapter very quickly and inevitably turns into commented JAGS code, which I presume would require more from the students than just reading the available code, like the JAGS manual. Chapter 10 is mostly a series of examples of Bayesian hierarchical modeling, with illustrations of the shrinkage effect like the one on the book cover. Chapter 11 covers simple linear regression with some mentions of weakly informative priors, although in a BUGS spirit of using large [enough?!] variances: “If one has little information about the location of a regression parameter, then the choice of the prior guess is not that important and one chooses a large value for the prior standard deviation. So the regression intercept and slope are each assigned a Normal prior with a mean of 0 and standard deviation equal to the large value of 100.” (p.415). Regardless of the scale of y? Standardisation is covered later in the chapter (with the use of the R function scale()) as part of constructing more informative priors, although this sounds more like data-dependent priors to me, in the sense that the scale and location are summarily estimated by empirical means from the data.
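Going back to the MCMC chapter, the algorithmic representation with a Uniform proposal amounts to a random-walk Metropolis step, which can be sketched as follows (with a standard Normal target chosen here purely for illustration, not the book’s example):

```r
# random-walk Metropolis with a Uniform proposal, targeting a standard
# Normal density (illustrative target, not the book's)
set.seed(101)
target <- function(x) dnorm(x)
niter <- 1e4
delta <- 1                                        # half-width of the Uniform proposal
chain <- numeric(niter)                           # chain started at 0
for (t in 2:niter) {
  prop <- chain[t - 1] + runif(1, -delta, delta)  # Uniform random-walk move
  if (runif(1) < target(prop) / target(chain[t - 1]))
    chain[t] <- prop                              # accept the move
  else
    chain[t] <- chain[t - 1]                      # reject, stay put
}
```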
The above quote (from p.415) also strikes me as potentially confusing to the students, as it does not spell out how to design a joint distribution on the linear regression coefficients that translates the concentration of these coefficients along ȳ=β₀+β₁x̄. Chapter 12 expands the setting to multiple regression and generalised linear models, mostly through examples. It however suggests using cross-validation for model checking and then advocates the DIC (deviance information criterion) “to approximate a model’s out-of-sample predictive performance” (p.463), if only because it is covered in JAGS, the definition of the criterion being relegated to the last page of the book. Chapter 13 concludes with two case studies, the (often used) Federalist Papers analysis and a baseball career hierarchical model, which may sound far-reaching considering the modest prerequisites the book started with.
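To illustrate the concentration issue (and the strong negative correlation between intercept and slope quoted earlier), here is a minimal sketch with made-up data, not taken from the book: when the predictor sits far from zero, the least-squares (and, under flat priors, posterior) estimates of intercept and slope are almost perfectly negatively correlated, while centering the predictor removes that correlation.

```r
set.seed(42)
x <- runif(50, 100, 200)                 # made-up predictor, far from zero
y <- 3 + 0.02 * x + rnorm(50, sd = 0.5)
# with the raw predictor, cor(intercept, slope) = -xbar / sqrt(mean(x^2)),
# close to -1 here
V <- vcov(lm(y ~ x))
cov2cor(V)[1, 2]                         # strongly negative
# centering x makes the two estimates uncorrelated
xc <- x - mean(x)
Vc <- vcov(lm(y ~ xc))
cov2cor(Vc)[1, 2]                        # essentially zero
```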

In conclusion of this rambling [lazy Sunday] review, this is not a textbook I would have the opportunity to use in Paris-Dauphine, but I can easily conceive of its adoption for students with limited maths exposure. As such, it offers a decent entry to the use of Bayesian modelling, supported by a specific software (JAGS), and rightly stresses the need for model checking and for comparisons with pseudo-observations. Provided the course is reinforced with a fair amount of computer labs and projects, the book can indeed manage to properly introduce students to Bayesian thinking, hopefully leading them to seek more advanced courses on the topic.

**Update:** Jim Albert sent me the following precisions after this review got on-line:

Thanks for your review of our recent book. We had a particular audience in mind, specifically undergraduate American students with some calculus background who are taking their first course in probability and statistics. The traditional approach (which I took many years ago) teaches some probability one semester and then traditional inference (focusing on unbiasedness, sampling distributions, tests and confidence intervals) in the second semester. There didn’t appear to be any Bayesian books at that calculus-based undergraduate level and that motivated the writing of this book. Anyway, I think your comments were certainly fair and we’ve already made some additions to our errata list based on your comments.