## day one at ISBA 22

Posted in pictures, Statistics, Travel, University life on June 29, 2022 by xi'an

Started the day with a much appreciated swimming practice in the [alas warm⁺⁺⁺] outdoor 50m pool on the Island with no one but me in the slooow lane. And had my first ride with the biXi system, surprised at having to queue behind other bikes at red lights! More significantly, it was a great feeling to reunite at last with so many friends I had not met for more than two years!!!

My friend Adrian Raftery gave the very first plenary lecture on his work on the Bayesian approach to long-term population projections, work that was recently censored by some US States, then counter-censored by the Supreme Court [too busy to kill Roe v. Wade!]. Great to see the use of Bayesian methods validated by the UN Population Division [with at least one branch of the UN

Stephen Lauritzen returned to de Finetti’s notion of a model as something not real or true at all, and hence back to exchangeability. Making me wonder when exchangeability is more than a convenient assumption leading to the Hewitt-Savage theorem. And sufficiency. I mean, without falling into a Keynesian fallacy, each point of the sample has unique specificities that cannot be taken into account in an exchangeable model. Nice to hear some measure theory, though!!! Plus a comment on the median never being sufficient, echoing an older (and presumably not original) point of mine. Stephen’s (or Fisher’s?) argument being that the median cannot be recursively computed!

Antonietta Mira and I had our ABC session this afternoon with Cecilia Viscardi, Sirio Legramanti, and Massimiliano Tamborrino (Warwick) as speakers. Cecilia linked ABC with normalising flows, in collaboration with Dennis Prangle (whose earlier paper on this connection was presented as the first One World ABC seminar). Thus using past simulations to approximate the posterior by a neural network, possibly with a significant increase in computing time when compared with more rudimentary SMC-ABC methods in larger dimensions. Sirio considered summary-free ABC based on discrepancies like Rademacher complexity. Which more or less contains MMD, Kullback-Leibler, Wasserstein and more, although it seems to be dependent on the parameterisation of the observations. An interesting opening at the end was that this approach could apply to non-iid settings. Massi presented a paper coauthored with Umberto that had just been arXived. On sequential ABC with a dependence on the summary statistic (hence guided). Further bringing copulas into the game, although this forces another choice [for the marginals] in the method.

Tamara Broderick talked about a puzzling leverage effect of some observations in economic studies, where a tiny portion of individuals may modify the significance or the sign of a coefficient, for which I cannot tell whether the data or the reliance on statistical significance is to blame. Robert Kohn presented mixture-of-Gaussian copulas [not to be confused with mixture of Gaussian-copulas!] and Nancy Reid concluded my first [and somewhat exhausting!] day at ISBA with a BFF talk on the different statistical paradigms’ takes on confidence (for which the notion of calibration seems to remain frequentist).

Side comments: First, most people in the conference are wearing masks, which is great! Also, I find it hard to read slides from the screen, which I presume is an age issue (?!). Even more aside, I had Korean lunch in a place that refused to serve me a glass of water, which I find amazing.

## out-standing scientist

Posted in Books, Kids, Statistics, University life on November 12, 2021 by xi'an

I noticed quite recently that the [Nature] journal Heredity [managed by the Genetics Society] had published an historical / opinion piece on Ronald Fisher and his views on eugenics and race. The authors are all trustees of the Fisher Memorial Trust. The core of the paper was also contained in [one of the authors] Stephen Senn’s talk at the JSM round table (in which I also took part) and later at the RSS. This is mostly an attempt at resetting Fisher’s position within the era when he lived, in terms of prevalent racism, nationalism, and imperialism. At the core of these woes was a generalised belief in the superiority of some nations, creeds, human groups, even social classes, over others, that was used as a justification in the tragedies of large scale colonialism, the First World War, systemic racism, Nazism, and widespread forced sterilisations…

“More attention to the History of Science is needed, as much by scientists as by historians, and especially by biologists, and this should mean a deliberate attempt to understand the thoughts of the great masters of the past, to see in what circumstances or intellectual milieu their ideas were formed, where they took the wrong turning or stopped short on the right track.” R.A. Fisher (1959)

I think the authors are somewhat stretching the arguments isolating Ronald from the worst manifestations of eugenism and racism. The concept of “voluntary sterilisation” is more than debatable when applied to patients with limited intellectual abilities. And Fisher considered (in 1943) that the Nazi racial laws “have been successful with the best type of German” (which stands as a fairly stupid statement on so many levels, starting with the fact that this racial selection had only started a few years before!) and (in 1948) “that the Party sincerely wished to benefit the German racial stock”. My already made point is rather that the general tendency of turning genii into saints is bound to meet with disappointment. (Hence, if we have to stick with them, named lectures, prizes, memorials, &tc., should come with an expiration date!)

## baseless!

Posted in Books, Statistics on July 13, 2021 by xi'an

## Fisher, Bayes, and predictive Bayesian inference [seminar]

Posted in Statistics on April 4, 2021 by xi'an

An interesting Foundations of Probability seminar at Rutgers University this Monday, at 4:30ET, 8:30GMT, by Sandy Zabell (the password is Angelina’s birthdate):

R. A. Fisher is usually perceived to have been a staunch critic of the Bayesian approach to statistics, yet his last book (Statistical Methods and Scientific Inference, 1956) is much closer in spirit to the Bayesian approach than the frequentist theories of Neyman and Pearson.  This mismatch between perception and reality is best understood as an evolution in Fisher’s views over the course of his life.  In my talk I will discuss Fisher’s initial and harsh criticism of “inverse probability”, his subsequent advocacy of fiducial inference starting in 1930, and his admiration for Bayes expressed in his 1956 book.  Several of the examples Fisher discusses there are best understood when viewed against the backdrop of earlier controversies and antagonisms.

## hard birthday problem

Posted in Books, Kids, R, Statistics on February 4, 2021 by xi'an

Click to access birthday.pdf

Following an X validated question, I found that WordPress now allows for direct links to pdf documents, like the above paper by my old friend Anirban Das Gupta! The question is about estimating a number M of individuals with N distinct birth dates over a year of T days. After looking around I could not find a simpler representation of the probability for N=r other than (1) in my answer,

$\frac{T!}{(T-r)!}\frac{M!}{T^M} \sum_{\substack{(r_1,\ldots,r_M);\\ \sum_1^M r_i=r\ \&\ \sum_1^M ir_i=M}}1\Big/\prod_{j=1}^M r_j! (j!)^{r_j}$

borrowed from a paper by Fisher et al. (Another Fisher!) Checking Feller leads to the probability (p.102)

${T \choose r}\sum_{\nu=0}^r (-1)^{\nu}{r\choose\nu}\left(1-\frac{T-r+\nu}T \right)^M$

which fits simulation frequencies rather nicely, as shown using (with S the number of simulated samples of M birthdates each)

apply(!apply(matrix(sample(1:T,S*M,rep=TRUE),S,M),1,duplicated),2,sum)
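For readers wanting to reproduce this check, here is a minimal sketch in R rather than the exact code behind the comparison. The names `p_distinct`, `n_days`, and `n_sim` are mine, and I deliberately pick a hypothetical short "year" of 10 days: with a full 365 days the alternating sum in Feller's formula loses precision in double arithmetic.

```r
# Feller's occupancy formula: probability that M individuals show exactly r
# distinct birthdates over a year of n_days days (n_days - r days stay empty).
p_distinct <- function(r, M, n_days) {
  nu <- 0:r
  choose(n_days, r) *
    sum((-1)^nu * choose(r, nu) * (1 - (n_days - r + nu) / n_days)^M)
}

set.seed(1)        # arbitrary seed, for reproducibility only
n_days <- 10       # hypothetical short "year" (assumption, keeps the sum stable)
M      <- 8        # number of individuals
n_sim  <- 1e4      # number of simulated samples

# one simulated sample of M birthdates per row
draws <- matrix(sample(1:n_days, n_sim * M, replace = TRUE), n_sim, M)
# number of distinct birthdates in each simulated sample
n_distinct <- apply(draws, 1, function(x) length(unique(x)))

r_max <- min(M, n_days)
freq  <- tabulate(n_distinct, nbins = r_max) / n_sim
exact <- sapply(1:r_max, p_distinct, M = M, n_days = n_days)

# side-by-side comparison of exact probabilities and simulated frequencies
print(round(cbind(r = 1:r_max, exact = exact, freq = freq), 3))
```

The two columns should agree up to Monte Carlo error, which is of order 0.005 at most for 10⁴ samples. I avoid `T` as a variable name in this sketch since it is also R's shorthand for `TRUE`.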


Further, Feller (1970, pp.103-104) justifies an asymptotic Poisson approximation with parameter $\lambda(M)=T\exp\{-M/T\}$ from which an estimate of $M$ can be derived. With the birthday problem as illustration (pp.105-106)!
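One possible way of turning this approximation into an estimate, given as a sketch and not necessarily the route intended in the answer: in Feller's occupancy setting, the Poisson variate is the number of empty cells, here the number T−N of days without a birthdate, writing T for the number of days in the year (the constant multiplying the exponential in λ(M)). Matching the observed count with its approximate mean gives

$T-N=T\exp\{-\hat M/T\}\quad\Longrightarrow\quad\hat M=-T\log(1-N/T)$

For instance, N=100 distinct birthdates over T=365 days would lead to an estimate of about 117 individuals. This is only a moment-matching estimator, which relies on the expected number of empty days being $T(1-1/T)^M\approx T\exp\{-M/T\}$, and it breaks down when N=T.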

It may be that a completion from N to $(R_1,R_2,\ldots)$, where the components are the number of days with one birthdate, two birthdates, &tc., could help design an EM algorithm that would remove the summation in (1), but I did not spend more time on the problem (than finding a SAS approximation to the probability!).