Archive for philosophy of sciences

severe testing or severe sabotage? [not a book review]

Posted in Books, pictures, Statistics, University life on October 16, 2018 by xi'an

Last week, I received this new book by Deborah Mayo, which I was looking forward to reading and annotating! But, thrice alas, the book had been sabotaged: except for the preface and acknowledgements, the entire book is printed upside down [a minor issue since the entire book is concerned] and with part of the text cut off on each side [a few letters each time, but enough to make reading a chore!]. I am thus waiting for a tested copy of the book to start reading it in earnest!


La déraisonnable efficacité des mathématiques [the unreasonable effectiveness of mathematics]

Posted in Books, pictures, Statistics, University life on May 11, 2017 by xi'an

Although it went completely out of my mind, thanks to a rather heavy travel schedule, I gave a short interview last week about the notion of mathematical models, which got broadcast this week on France Culture, one of the French public radio channels. It was part of the daily La Méthode Scientifique show, a one-hour programme on scientific issues that is always a [rare] pleasure to listen to. (Including the day they invited Claire Voisin.) The theme of the show that day was the unreasonable effectiveness of mathematics, with the [classical] questioning of whether mathematics is an efficient tool for solving scientific (and inference?) problems because the mathematical objects pre-existed their use or because we are (pre-)conditioned to use mathematics to solve problems. I sounded somewhat like a dog in a game of skittles, but it was interesting to listen to the philosopher discussing my relativistic perspective [provided you understand French!]. And I very much appreciated the way Céline Loozen, the journalist who interviewed me, sorted the wheat from the chaff in the original interview to make me sound mostly coherent! (A coincidence: Jean-Michel Marin got interviewed this morning on France Inter, the major public radio, about the Grothendieck papers.)

machine learning and the future of realism

Posted in Books, Kids, Statistics, University life on May 4, 2017 by xi'an

Giles and Cliff Hooker arXived a paper last week with this intriguing title. (Giles Hooker is an associate professor of statistics and biology at Cornell U, with an interesting blog on the notion of models, while Cliff Hooker is a professor of philosophy at Newcastle U, Australia.)

“Our conclusion is that simplicity is too complex”

The debate in this short paper is whether or not machine learning relates to a model. Or is it concerned with sheer (“naked”) prediction? And then does it pertain to science any longer?! While it sounds obvious at first, defining why science is more than prediction of effects given causes is much less obvious, although prediction sounds more pragmatic and engineer-like than scientific. (Furthermore, prediction has a somewhat negative flavour in French, being used as a synonym of divination and opposed to prévision.) In more philosophical terms, prediction offers no ontological feature. As for a machine learning structure like a neural network being scientific or a-scientific, its black box nature makes it much more the latter than the former, in that it brings no explanation for the connection between input and output, between regressed and regressors. It further lacks the potential for universality of scientific models. For instance, as mentioned in the paper, Newton’s law of gravitation applies to any pair of massive bodies, while a neural network built on a series of observations could not be assessed or guaranteed outside the domain where those observations are taken. Plus, it would miss the simple inverse-square law established by Newton. Most fascinating questions, undoubtedly! Putting the stress on models from a totally different perspective from last week at the RSS.
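To make the extrapolation point concrete, here is a minimal sketch, not taken from the paper. It assumes a 1-nearest-neighbour predictor as a stand-in for a generic black-box learner (it is not a neural network), units where G and both masses equal 1, and training distances restricted to the interval [1, 2]. The names inverse_square and nearest_neighbour_predict are made up for the illustration.

```python
import numpy as np

def inverse_square(r):
    """Newtonian attraction between two unit masses at distance r (G = 1)."""
    return 1.0 / r**2

# Observations are only available for distances between 1 and 2.
r_train = np.linspace(1.0, 2.0, 101)
f_train = inverse_square(r_train)

def nearest_neighbour_predict(r_new):
    """Black-box stand-in: return the response of the closest observed distance."""
    idx = np.abs(r_train - r_new).argmin()
    return f_train[idx]

inside = nearest_neighbour_predict(1.5)    # within the observed domain
outside = nearest_neighbour_predict(10.0)  # far outside the observed domain

print(f"r = 1.5: predicted {inside:.4f}, law gives {inverse_square(1.5):.4f}")
print(f"r = 10 : predicted {outside:.4f}, law gives {inverse_square(10.0):.4f}")
```

Inside the observed range the predictor is accurate, while at r = 10 it simply repeats the value observed at r = 2, whereas the inverse-square law keeps applying. Other learners would fail differently outside the training domain, but none comes with a guarantee there.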

As for machine learning being a challenge to realism, I am none the wiser after reading the paper. Utilising machine learning tools to produce predictions of causes given effects does not seem to modify the structure of the World, and modifies our understanding of it very little, since these tools do not bring explanation per se. What would lead to anti-realism is the adoption of those tools as substitutes for scientific theories and models.

principles or unprincipled?!

Posted in Books, Kids, pictures, Statistics, Travel on May 2, 2017 by xi'an

A lively and wide-ranging discussion during the Bayes, Fiducial, Frequentist conference was about whether or not we should look for principles. Someone mentioned Terry Speed’s (2016) claim that being principled does not help statistics, as opposed to being efficient. Which gets quite close, in my opinion, to arguing in favour of a no-U-turn move to machine learning (which requires a significant amount of data to reach this efficiency, as Xiao-Li Meng mentioned). The debate brought me back to my current running or droning argument on the need to accommodate [more] the difference between models and reality. Not throwing away statistics and models altogether, but developing assessments that are not fully chained to those models. While keeping probabilistic models to handle uncertainty. One pessimistic conclusion I drew from the discussion is that while we [as academic statisticians] may set principles and even teach our students how to run principled and ethical statistical analyses, there is not much we can do about the daily practice of users of statistics…


Posted in Books, Statistics, University life on March 7, 2016 by xi'an

Oxford University Press sent me this book by Phyllis Illari and Federica Russo, Causality (Philosophical theory meets scientific practice) a little while ago. (The book appeared in 2014.) Unless I asked for it, I cannot remember…

“The problem is whether and how to use information of general causation established in science to ascertain individual responsibility.” (p.38)

As the subtitle indicates, this is a philosophy book, not a statistics book. And not particularly intended for statisticians. Hence, I am not exactly qualified to analyse its contents, and even less to criticise its lack of connection with statistics. But this being a blog post… I read rather slowly through the book, which lays out a wide range (“a map”, p.8) of approaches and perspectives on the notions of causality, some ways to infer causality, and the point of doing all this, concluding with a relativistic (and thus eminently philosophical) viewpoint defending a “pluralistic mosaic” or a “causal mosaic” that relates to all existing accounts of causality as they “each do something valuable” (p.258). From a naïve bystander perspective, this sounds like a new avatar of deconstructionism applied to causality.

“Simulations can be very illuminating about various phenomena that are complex and have unexpected effects (…) can be run repeatedly to study a system in different situations to those seen for the real system…” (p.15)

This is not to state that the book is uninteresting, as it provides a wide entry into philosophical attempts at categorising and defining causality, if not into the statistical aspects of the issue. (For instance, the problem of whether or not causality can be proven uniquely from a statistical perspective is not mentioned.) Among the interesting points in the early chapters is a section (2.5) about simulation. Which however misses the depth of this earlier book on climate simulations I reviewed while in Monash. Or of the discussions at the interdisciplinary seminar last year in Hanover. I.J. Good’s probabilistic causality is mentioned but hardly detailed. (With the warning remark that one “should not confuse predictability with determinism [and] determinism with causality”, p.82.)

the philosophical importance of Stein’s paradox [a reply from the authors]

Posted in Books, pictures, Statistics, University life on January 15, 2016 by xi'an

[In the wake of my comment on this paper written by three philosophers of Science, I received this reply from Olav Vassend.]

Thank you for reading our paper and discussing it on your blog! Our purpose with the paper was to give an introduction to Stein’s phenomenon for a philosophical audience; it was not meant to — and probably will not — offer a new and interesting perspective for a statistician who is already very familiar with Stein’s phenomenon and its extensive literature.

I have a few more specific comments:

1. We don’t rechristen Stein’s phenomenon as “holistic pragmatism.” Rather, holistic pragmatism is the attitude to frequentist estimation that we think is underwritten by Stein’s phenomenon. Since MLE is sometimes admissible and sometimes not, depending on the number of parameters estimated, the researcher has to take into account his or her goals (whether total accuracy or individual-parameter accuracy is more important) when picking an estimator. To a statistician, this might sound obvious, but to philosophers it’s a pretty radical idea.

2. “The part connecting Stein with Bayes again starts on the wrong foot, since it is untrue that any shrinkage estimator can be expressed as a Bayes posterior mean. This is not even true for the original James-Stein estimator, i.e., it is not a Bayes estimator and cannot be a Bayes posterior mean.”

That seems to depend on what you mean by a “Bayes estimator.” It is possible to have an empirical Bayes prior (constructed from the sample) whose posterior mean is identical to the original James-Stein estimator. But if you don’t count empirical Bayes priors as Bayesian, then you are right.

3. “And to state that improper priors “integrate to a number larger than 1” and that “it’s not possible to be more than 100% confident in anything”… And to confuse the Likelihood Principle with the prohibition of data dependent priors. And to consider that the MLE and any shrinkage estimator have the same expected utility under a flat prior (since, if they had, there would be no Bayes estimator!).”

I’m not sure I completely understand your criticisms here. First, as for the relation between the LP and data-dependent priors — it does seem to me that the LP precludes the use of data-dependent priors.  If you use data from an experiment to construct your prior, then — contrary to the LP — it will not be true that all the information provided by the experiment regarding which parameter is true is contained in the likelihood function, since some of the information provided by the experiment will also be in your prior.

Second, as to our claim that the ML estimator has the same expected utility (under the flat prior) as a shrinkage estimator that it is dominated by: we incorporated this claim into our paper because it was an objection made by a statistician who read and commented on our paper. Are you saying the claim is false? If so, we would certainly like to know so that we can revise the paper to make it more accurate.

4. I was aware of Rubin’s idea that priors and utility functions (supposedly) are non-separable, but I didn’t (and don’t) quite see the relevance of that idea to Stein estimation.

5. “Similarly, very little of substance can be found about empirical Bayes estimation and its philosophical foundations.”

What we say about empirical Bayes priors is that they cannot be interpreted as degrees of belief; they are just tools. It will be surprising to many philosophers that priors are sometimes used in such an instrumentalist fashion in statistics.

6. The reason why we made a comparison between Stein estimation and AIC was two-fold: (a) for sociological reasons, philosophers are much more familiar with model selection than they are with, say, the LASSO or other regularized regression methods. (b) To us, it’s precisely because model selection and estimation are such different enterprises that it’s interesting that they have such a deep connection: despite being very different, AIC and shrinkage both rely on a bias-variance trade-off.

7. “I also object to the envisioned possibility of a shrinkage estimator that would improve every component of the MLE (in a uniform sense) as it contradicts the admissibility of the single component MLE!”

I don’t think our suggestion here contradicts the admissibility of single component MLE. The idea is just that if we have data D and D’ about parameters φ and φ’, then the estimates of both φ and φ’ can sometimes be improved if the estimation problems are lumped together and a shrinkage estimator is used. This doesn’t contradict the admissibility of MLE, because MLE is still admissible on each of the data sets for each of the parameters.

Again, thanks for reading the paper and for the feedback—we really do want to make sure our paper is accurate, so your feedback is much appreciated. Lastly, I apologize for the length of this comment.

Olav Vassend
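As an aside to the exchange above, the dominance at stake can be checked numerically. The following is a minimal sketch, assuming a Normal model x ~ N(θ, I_p) with p = 10, shrinkage towards 0, and an arbitrary θ picked for the illustration; none of these choices comes from the paper or the reply. On point 2, under a N(0, τ²I) prior the posterior mean is (1 − 1/(1+τ²))x, and (p−2)/‖x‖² is an unbiased estimate of 1/(1+τ²) under the marginal distribution of x, which is the sense in which James-Stein is an empirical Bayes rule.

```python
import numpy as np

rng = np.random.default_rng(0)

def james_stein(x):
    """James-Stein estimator shrinking towards 0, for x ~ N(theta, I_p), p >= 3."""
    p = x.shape[-1]
    norm2 = np.sum(x**2, axis=-1, keepdims=True)
    # (p - 2) / ||x||^2 plays the role of the estimated shrinkage 1 / (1 + tau^2)
    return (1.0 - (p - 2) / norm2) * x

p, n_rep = 10, 20000
theta = np.full(p, 0.5)  # arbitrary true mean, chosen for illustration only
x = theta + rng.standard_normal((n_rep, p))  # one observation vector per row

# Monte Carlo estimates of the total quadratic risk E||estimate - theta||^2
risk_mle = np.mean(np.sum((x - theta) ** 2, axis=1))
risk_js = np.mean(np.sum((james_stein(x) - theta) ** 2, axis=1))

print(f"total risk of the MLE:         {risk_mle:.2f}")
print(f"total risk of James-Stein:     {risk_js:.2f}")
```

The comparison concerns the total risk summed over the p components, the criterion under which the MLE is inadmissible for p ≥ 3. It says nothing about the risk of a single component taken in isolation, for which the MLE remains admissible.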

the philosophical importance of Stein’s paradox

Posted in Books, pictures, Statistics, University life on November 30, 2015 by xi'an

I recently came across this paper written by three philosophers of Science, attempting to set the Stein paradox in a philosophical light. Given my past involvement, I was obviously interested in which new perspective could be proposed, close to sixty years after Stein (1956). A paper that we should actually celebrate next year! However, when reading the document, I did not find a significantly innovative approach to the phenomenon…

The paper does not start in the best possible light since it seems to justify the use of a sample mean through maximum likelihood estimation, which is only the case for a limited number of probability distributions (including the Normal distribution, which may be an implicit assumption). For instance, when the data is Student’s t, the MLE is not the sample mean, no matter how shocking that might sound! (And while this is a minor issue, results about the Stein effect taking place in non-normal settings appear much earlier than 1998. And earlier than in my dissertation. See, e.g., Berger and Bock (1975). Or Brandwein and Strawderman (1978).)
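The Student’s t remark is easy to check numerically. The sketch below assumes a location-only model with known degrees of freedom (3) and unit scale, a small made-up dataset with one outlying value, and a standard EM-type reweighting started at the median. It finds a stationary point of the likelihood near the bulk of the data, which I take to be the MLE here without proving global optimality.

```python
import numpy as np

def t_location_mle(x, df, n_iter=200):
    """Location estimate for Student's t (known df, unit scale) by EM reweighting."""
    mu = np.median(x)
    for _ in range(n_iter):
        # observations far from the current location get smaller weights
        w = (df + 1.0) / (df + (x - mu) ** 2)
        mu = np.sum(w * x) / np.sum(w)
    return mu

def t_score(x, df, mu):
    """Derivative of the t log-likelihood in the location; zero at a stationary point."""
    return np.sum((df + 1.0) * (x - mu) / (df + (x - mu) ** 2))

# Made-up data: five values near zero and one outlying observation.
x = np.array([-1.2, -0.4, 0.1, 0.3, 0.9, 8.0])
df = 3

mle = t_location_mle(x, df)
mean = x.mean()

print(f"sample mean:            {mean:.3f}")
print(f"t location estimate:    {mle:.3f}")
print(f"score at the estimate:  {t_score(x, df, mle):.2e}")
print(f"score at the mean:      {t_score(x, df, mean):.2e}")
```

The score vanishes at the reweighted estimate but not at the sample mean, so the sample mean does not solve the likelihood equation for these data. Under a Normal model all the weights would be equal and the two estimates would coincide.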

While the linear regression explanation for the Stein effect is already exposed in Steve Stigler’s Neyman Lecture, I still have difficulties with the argument in that, for instance, we do not know the value of the parameter, which makes the regression and the inverse regression of parameter means over Gaussian observations mere concepts and nothing practical. (Except for the interesting result that two observations make both regressions coincide.) And it does not seem at all intuitive (to me) that imposing a constraint should improve the efficiency of a maximisation program…