## Francis Bach à l’Académie des Sciences

Posted in Statistics on April 8, 2020 by xi'an

Congrats to Francis Bach, freshly elected to the French Academy of Sciences, joining Stéphane Mallat (2014) and Éric Moulines (2017) as data science academicians!

## efficiency and the Fréchet-Darmois-Cramér-Rao bound

Posted in Books, Kids, Statistics on February 4, 2019 by xi'an

Following some entries on X validated, and after grading a mathematical statistics exam involving Cramér-Rao, or Fréchet-Darmois-Cramér-Rao to include both French contributors pictured above, I wonder as usual at the relevance of a concept of efficiency outside [and even inside] the restricted case of unbiased estimators. The general (frequentist) version is that the variance of an estimator δ of [any transform of] θ with bias b(θ) is bounded from below by

I(θ)⁻¹ (1+b'(θ))²

while a Bayesian version is the van Trees inequality on the integrated squared error loss

(E(I(θ))+I(π))⁻¹

where I(θ) and I(π) are the Fisher information of the model and the Fisher information of the prior, respectively. But this opens a whole can of worms, in my opinion, since

• establishing that a given estimator is efficient requires computing both the bias and the variance of that estimator, not an easy task when considering a Bayes estimator or even the James-Stein estimator. I actually do not know whether any of the estimators dominating the standard Normal mean estimator has been shown to be efficient (although there exist closed-form expressions for the quadratic risk of the James-Stein estimator, including one of mine that the Canadian Journal of Statistics published verbatim in 1988). Or is there a result that a Bayes estimator associated with the quadratic loss is by default efficient in either the first or second sense?
• while the initial Fréchet-Darmois-Cramér-Rao bound is restricted to unbiased estimators (i.e., b(θ)≡0) and unable to produce efficient estimators in all settings but for the natural parameter in the setting of exponential families, moving to the general case means there exists one efficiency notion for every bias function b(θ). This makes the notion quite weak, while not necessarily producing efficient estimators anyway, which is the major impediment to taking this notion seriously;
• moving from the variance to the squared error loss is not more “natural” than using any [other] convex combination of variance and squared bias, creating a whole new class of optimalities (a grocery of cans of worms!);
• I never got into the van Trees inequality so cannot say much, except that the comparison between various priors is delicate since the integrated risks are against different parameter measures.
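As a hedged sanity check on the two bounds, here is an R sketch (the shrinkage estimator c0·x̄ and the Normal-Normal setting are my own illustrative choices, not from the discussion above) comparing Monte Carlo variances and integrated risks with the corresponding bounds:

```r
set.seed(1)
n=100; sigma=1; M=10^4
# (1) FDCR bound for the biased estimator delta = c0*xbar of a Normal mean:
# bias b(theta)=(c0-1)*theta, so the bound is I(theta)^{-1}*(1+b'(theta))^2
theta=2; c0=0.9
xbar=replicate(M, mean(rnorm(n, theta, sigma)))
bound1=(sigma^2/n)*c0^2   # delta attains this bound exactly
var(c0*xbar)              # close to bound1, up to Monte Carlo error
# (2) van Trees bound under a N(0,tau^2) prior, with I(pi)=1/tau^2;
# in this conjugate setting the posterior mean attains the bound
tau=1
thetas=rnorm(M, 0, tau)
xbars=rnorm(M, thetas, sigma/sqrt(n))
w=(n/sigma^2)/(n/sigma^2+1/tau^2)
risk=mean((w*xbars-thetas)^2)   # integrated squared error loss
bound2=1/(n/sigma^2+1/tau^2)
```

In this Normal case both bounds happen to be attained, which is of course the exception rather than the rule.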

## machine learning à l’Académie, au Collège, et dans Le Monde

Posted in Books, Statistics, University life on January 5, 2018 by xi'an

A back-cover story in Le Monde “Sciences & Médecine” on Stéphane Mallat, professor at École Normale and recently elected to the (French) Academy of Sciences and to the Collège de France, on a newly created Chair of Data Sciences. With works on wavelets, image compression, and neural networks, Stéphane Mallat will give his first lesson on Data Sciences at the Collège de France, downtown Paris, on January 11. Entrance is free and open to everyone. (The Collège de France is a unique institution, created by Guillaume Budé and supported by François Ier in 1530 to teach topics not taught (then) at the Sorbonne, as indicated by its motto Docet Omnia, including mathematics! Professors are nominated by the current faculty, and the closest to statistics prior to Stéphane Mallat was Edmond Malinvaud.)

## Le Monde puzzle [#845]

Posted in Books, Kids, Statistics on December 21, 2013 by xi'an

Yet another one of those Le Monde mathematical puzzles whose wording is confusing to me:

Take the set of integers between 1 and 1000. Endow all of them randomly with red or blue tags. Group them into subsets of three or more (grapes), and also group them into pairs so that a switch can change the colour of both integers in a pair. Is it always possible to activate the switches so that one ends up with all grapes being multicoloured? Unicoloured?

I find it (again!) ultimately puzzling since there are configurations where it cannot work. In the first case, take a grape made of four integers of the same colour, linked two by two by switches: activating a switch simply inverts the colours but the grape remains uni-coloured. Conversely, take two integers with opposite colours within the same grape. No matter how long one operates the switch, they will remain of opposite colours, won’t they?! This issue of the Le Monde Science & Médecine leaflet actually had several interesting entries, from one on “the thirst of the sociologist for statistical irregularities“—meaning that regression should account for confounding factors like social class versus school performances—to the above picture about weighing the mass of a neutrino—mostly because it strongly reminds me of Escher, as I cannot understand the 3D structure of the picture—, to another tribune of Marco Zito informing me that “quark” is a word invented by James Joyce—and not by Carroll as I believed—, to an interview of Stanislas Dehaene, a neuroscientist professor at Collège de France and a (fairly young) member of the Académie des Sciences, where he mentions statistical learning patterns that reminded me of the Bayesian constructs Pierre Bessière discussed on France Culture.
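The invariance behind the second configuration can be checked by simulation; this R snippet (with a hypothetical four-integer grape and a pairing of my own choosing) shows that the colour difference within a switched pair never changes:

```r
# colours coded as 0/1; a switch flips both integers of its pair
flip=function(cols, pair){ cols[pair]=1-cols[pair]; cols }
cols=c(0,1,0,0)   # a grape with two integers of opposite colours
pair=c(1,2)       # integers 1 and 2 share the same switch
for (k in 1:7) cols=flip(cols, pair)
xor(cols[1], cols[2])   # still TRUE: the pair stays bicoloured
```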

## Le Monde puzzle [#838]

Posted in Books, Kids, R on November 2, 2013 by xi'an

Another one of those Le Monde mathematical puzzles whose wording is confusing to me:

The 40 members of the Academy vote for two prizes. [Like the one recently attributed to my friend and coauthor Olivier Cappé!] Once the votes are counted for both prizes, it appears that the total votes for each of the candidates take all values between 0 and 12. Is it possible that two academicians never pick the same pair of candidates?

I find it puzzling… First, because the total number of votes is then equal to 78, rather than 80 = 2 x 40. What happened to the vote of the “last” academician? Did she or he abstain? Or did two academicians abstain on candidates for only one prize each? Second, because of the uncertainty in the original wording: can we assume with certainty that each integer between 0 and 12 is only taken once? If so, it would mean that the total number of candidates for the prizes is equal to 13. Third, the question seems unrelated to the “data”: since only the sums are known, switching the votes of academicians Dupond and Dupont for candidates Durand and Martin in prize A (or in prize B) does not change the number of votes for Durand and Martin.

If we assume that each integer between 0 and 12 only appears once in the collection of the sums of the votes and that one academician abstained on both prizes, the number of candidates for one of the prizes can vary between 4 and 9, with compatible solutions provided by these R lines of code:

```
N=5
ok=TRUE
while (ok){
  # pick N of the totals 0..12 for the first prize
  prop=sample(0:12,N)
  # the remaining totals go to the second prize
  los=(1:13)[-(prop+1)]-1
  # both collections of totals must sum to 39 votes
  ok=((sum(prop)!=39)||(sum(los)!=39))}
```

which returns solutions like

```
> N=4
> prop
[1]  9 11  7 12
> los
[1]  0  1  2  3  4  5  6  8 10
```

but does not help in answering the question!

Now, with Robin's help (whose Corcoran memorial prize I should have mentioned in due time!), I reformulate the question as

The 40 members of the Academy vote for two prizes. Once the votes are counted for both prizes, it appears that all values between 0 and 12 are found among the total votes for each of the candidates. Is it possible that two academicians never pick the same pair of candidates?

which has a nicer solution: since all academicians have voted, there are two extra votes (80-78), meaning either the total 2 appears twice or the total 1 appears thrice. So there are either 14 or 15 candidates in toto, with at least 4 for a given prize. I then checked whether or not the above event could occur, using the following (pedestrian) R code:

```
for (t in 1:10^3){
  # pick the number of repeated totals: one extra 2 (R=1) or two extra 1s (R=2)
  R=sample(1:2,1); cand=13+R
  # pick the number of candidates for the first prize
  N=sample(4:(cand-4),1)
  # the candidate totals are 0..12 plus the repeated value(s)
  if (R==2){
    vals=c(0:12,1,1)
  }else{
    vals=c(0:12,2)}
  # split the totals between the two prizes, 40 votes for each
  ok=TRUE
  while (ok){
    pick=sample(1:cand,N)
    prop=vals[pick]
    los=vals[-pick]
    ok=((sum(prop)!=40)||(sum(los)!=40))
  }
  pool=NULL
  for (j in 1:N)
    pool=c(pool,rep(j,prop[j]))
  cool=NULL
  for (j in 1:(cand-N))
    cool=c(cool,rep(100+j,los[j]))
  cool=sample(cool) #random permutation