Archive for David Hume

the first Bayesian

Posted in Statistics with tags , , , , , , , on February 20, 2018 by xi'an

In the first issue of Statistical Science for this year (2018), Stephen Stiegler pursues the origins of Bayesianism as attributable to Richard Price, main author of Bayes’ Essay. (This incidentally relates to an earlier ‘Og piece on that notion!) Steve points out the considerable inputs of Price on this Essay, even though the mathematical advance is very likely to be entirely Bayes’. It may however well be Price who initiated Bayes’ reflections on the matter, towards producing a counter-argument to Hume’s “On Miracles”.

“Price’s caution in addressing the probabilities of hypotheses suggested by data is rare in early literature.”

A section of the paper is about Price’s approach data-determined hypotheses and to the fact that considering such hypotheses cannot easily fit within a Bayesian framework. As stated by Price, “it would be improbable as infinite to one”. Which is a nice way to address the infinite mass prior.


10 great ideas about chance [book preview]

Posted in Books, pictures, Statistics, University life with tags , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , on November 13, 2017 by xi'an

[As I happened to be a reviewer of this book by Persi Diaconis and Brian Skyrms, I had the opportunity (and privilege!) to go through its earlier version. Here are the [edited] comments I sent back to PUP and the authors about this earlier version. All in  all, a terrific book!!!]

The historical introduction (“measurement”) of this book is most interesting, especially its analogy of chance with length. I would have appreciated a connection earlier than Cardano, like some of the Greek philosophers even though I gladly discovered there that Cardano was not only responsible for the closed form solutions to the third degree equation. I would also have liked to see more comments on the vexing issue of equiprobability: we all spend (if not waste) hours in the classroom explaining to (or arguing with) students why their solution is not correct. And they sometimes never get it! [And we sometimes get it wrong as well..!] Why is such a simple concept so hard to explicit? In short, but this is nothing but a personal choice, I would have made the chapter more conceptual and less chronologically historical.

“Coherence is again a question of consistent evaluations of a betting arrangement that can be implemented in alternative ways.” (p.46)

The second chapter, about Frank Ramsey, is interesting, if only because it puts this “man of genius” back under the spotlight when he has all but been forgotten. (At least in my circles.) And for joining probability and utility together. And for postulating that probability can be derived from expectations rather than the opposite. Even though betting or gambling has a (negative) stigma in many cultures. At least gambling for money, since most of our actions involve some degree of betting. But not in a rational or reasoned manner. (Of course, this is not a mathematical but rather a psychological objection.) Further, the justification through betting is somewhat tautological in that it assumes probabilities are true probabilities from the start. For instance, the Dutch book example on p.39 produces a gain of .2 only if the probabilities are correct.

> gain=rep(0,1e4)
> for (t in 1:1e4){
+ p=rexp(3);p=p/sum(p)
+ gain[t]=(p[1]*(1-.6)+p[2]*(1-.2)+p[3]*(.9-1))/sum(p)}
> hist(gain)

As I made it clear at the BFF4 conference last Spring, I now realise I have never really adhered to the Dutch book argument. This may be why I find the chapter somewhat unbalanced with not enough written on utilities and too much on Dutch books.

“The force of accumulating evidence made it less and less plausible to hold that subjective probability is, in general, approximate psychology.” (p.55)

A chapter on “psychology” may come as a surprise, but I feel a posteriori that it is appropriate. Most of it is about the Allais paradox. Plus entries on Ellesberg’s distinction between risk and uncertainty, with only the former being quantifiable by “objective” probabilities. And on Tversky’s and Kahneman’s distinction between heuristics, and the framing effect, i.e., how the way propositions are expressed impacts the choice of decision makers. However, it is leaving me unclear about the conclusion that the fact that people behave irrationally should not prevent a reliance on utility theory. Unclear because when taking actions involving other actors their potentially irrational choices should also be taken into account. (This is mostly nitpicking.)

“This is Bernoulli’s swindle. Try to make it precise and it falls apart. The conditional probabilities go in different directions, the desired intervals are of different quantities, and the desired probabilities are different probabilities.” (p.66)

The next chapter (“frequency”) is about Bernoulli’s Law of Large numbers and the stabilisation of frequencies, with von Mises making it the basis of his approach to probability. And Birkhoff’s extension which is capital for the development of stochastic processes. And later for MCMC. I like the notions of “disreputable twin” (p.63) and “Bernoulli’s swindle” about the idea that “chance is frequency”. The authors call the identification of probabilities as limits of frequencies Bernoulli‘s swindle, because it cannot handle zero probability events. With a nice link with the testing fallacy of equating rejection of the null with acceptance of the alternative. And an interesting description as to how Venn perceived the fallacy but could not overcome it: “If Venn’s theory appears to be full of holes, it is to his credit that he saw them himself.” The description of von Mises’ Kollectiven [and the welcome intervention of Abraham Wald] clarifies my previous and partial understanding of the notion, although I am unsure it is that clear for all potential readers. I also appreciate the connection with the very notion of randomness which has not yet found I fear a satisfactory definition. This chapter asks more (interesting) questions than it brings answers (to those or others). But enough, this is a brilliant chapter!

“…a random variable, the notion that Kac found mysterious in early expositions of probability theory.” (p.87)

Chapter 5 (“mathematics”) is very important [from my perspective] in that it justifies the necessity to associate measure theory with probability if one wishes to evolve further than urns and dices. To entitle Kolmogorov to posit his axioms of probability. And to define properly conditional probabilities as random variables (as my third students fail to realise). I enjoyed very much reading this chapter, but it may prove difficult to read for readers with no or little background in measure (although some advanced mathematical details have vanished from the published version). Still, this chapter constitutes a strong argument for preserving measure theory courses in graduate programs. As an aside, I find it amazing that mathematicians (even Kac!) had not at first realised the connection between measure theory and probability (p.84), but maybe not so amazing given the difficulty many still have with the notion of conditional probability. (Now, I would have liked to see some description of Borel’s paradox when it is mentioned (p.89).

“Nothing hangs on a flat prior (…) Nothing hangs on a unique quantification of ignorance.” (p.115)

The following chapter (“inverse inference”) is about Thomas Bayes and his posthumous theorem, with an introduction setting the theorem at the centre of the Hume-Price-Bayes triangle. (It is nice that the authors include a picture of the original version of the essay, as the initial title is much more explicit than the published version!) A short coverage, in tune with the fact that Bayes only contributed a twenty-plus paper to the field. And to be logically followed by a second part [formerly another chapter] on Pierre-Simon Laplace, both parts focussing on the selection of prior distributions on the probability of a Binomial (coin tossing) distribution. Emerging into a discussion of the position of statistics within or even outside mathematics. (And the assertion that Fisher was the Einstein of Statistics on p.120 may be disputed by many readers!)

“So it is perfectly legitimate to use Bayes’ mathematics even if we believe that chance does not exist.” (p.124)

The seventh chapter is about Bruno de Finetti with his astounding representation of exchangeable sequences as being mixtures of iid sequences. Defining an implicit prior on the side. While the description sticks to binary events, it gets quickly more advanced with the notion of partial and Markov exchangeability. With the most interesting connection between those exchangeabilities and sufficiency. (I would however disagree with the statement that “Bayes was the father of parametric Bayesian analysis” [p.133] as this is extrapolating too much from the Essay.) My next remark may be non-sensical, but I would have welcomed an entry at the end of the chapter on cases where the exchangeability representation fails, for instance those cases when there is no sufficiency structure to exploit in the model. A bonus to the chapter is a description of Birkhoff’s ergodic theorem “as a generalisation of de Finetti” (p..134-136), plus half a dozen pages of appendices on more technical aspects of de Finetti’s theorem.

“We want random sequences to pass all tests of randomness, with tests being computationally implemented”. (p.151)

The eighth chapter (“algorithmic randomness”) comes (again!) as a surprise as it centres on the character of Per Martin-Löf who is little known in statistics circles. (The chapter starts with a picture of him with the iconic Oberwolfach sculpture in the background.) Martin-Löf’s work concentrates on the notion of randomness, in a mathematical rather than probabilistic sense, and on the algorithmic consequences. I like very much the section on random generators. Including a mention of our old friend RANDU, the 16 planes random generator! This chapter connects with Chapter 4 since von Mises also attempted to define a random sequence. To the point it feels slightly repetitive (for instance Jean Ville is mentioned in rather similar terms in both chapters). Martin-Löf’s central notion is computability, which forces us to visit Turing’s machine. And its role in the undecidability of some logical statements. And Church’s recursive functions. (With a link not exploited here to the notion of probabilistic programming, where one language is actually named Church, after Alonzo Church.) Back to Martin-Löf, (I do not see how his test for randomness can be implemented on a real machine as the whole test requires going through the entire sequence: since this notion connects with von Mises’ Kollektivs, I am missing the point!) And then Kolmororov is brought back with his own notion of complexity (which is also Chaitin’s and Solomonov’s). Overall this is a pretty hard chapter both because of the notions it introduces and because I do not feel it is completely conclusive about the notion(s) of randomness. A side remark about casino hustlers and their “exploitation” of weak random generators: I believe Jeff Rosenthal has a similar if maybe simpler story in his book about Canadian lotteries.

“Does quantum mechanics need a different notion of probability? We think not.” (p.180)

The penultimate chapter is about Boltzmann and the notion of “physical chance”. Or statistical physics. A story that involves Zermelo and Poincaré, And Gibbs, Maxwell and the Ehrenfests. The discussion focus on the definition of probability in a thermodynamic setting, opposing time frequencies to space frequencies. Which requires ergodicity and hence Birkhoff [no surprise, this is about ergodicity!] as well as von Neumann. This reaches a point where conjectures in the theory are yet open. What I always (if presumably naïvely) find fascinating in this topic is the fact that ergodicity operates without requiring randomness. Dynamical systems can enjoy ergodic theorem, while being completely deterministic.) This chapter also discusses quantum mechanics, which main tenet requires probability. Which needs to be defined, from a frequency or a subjective perspective. And the Bernoulli shift that brings us back to random generators. The authors briefly mention the Einstein-Podolsky-Rosen paradox, which sounds more metaphysical than mathematical in my opinion, although they get to great details to explain Bell’s conclusion that quantum theory leads to a mathematical impossibility (but they lost me along the way). Except that we “are left with quantum probabilities” (p.183). And the chapter leaves me still uncertain as to why statistical mechanics carries the label statistical. As it does not seem to involve inference at all.

“If you don’t like calling these ignorance priors on the ground that they may be sharply peaked, call them nondogmatic priors or skeptical priors, because these priors are quite in the spirit of ancient skepticism.” (p.199)

And then the last chapter (“induction”) brings us back to Hume and the 18th Century, where somehow “everything” [including statistics] started! Except that Hume’s strong scepticism (or skepticism) makes induction seemingly impossible. (A perspective with which I agree to some extent, if not to Keynes’ extreme version, when considering for instance financial time series as stationary. And a reason why I do not see the criticisms contained in the Black Swan as pertinent because they savage normality while accepting stationarity.) The chapter rediscusses Bayes’ and Laplace’s contributions to inference as well, challenging Hume’s conclusion of the impossibility to finer. Even though the representation of ignorance is not unique (p.199). And the authors call again for de Finetti’s representation theorem as bypassing the issue of whether or not there is such a thing as chance. And escaping inductive scepticism. (The section about Goodman’s grue hypothesis is somewhat distracting, maybe because I have always found it quite artificial and based on a linguistic pun rather than a logical contradiction.) The part about (Richard) Jeffrey is quite new to me but ends up quite abruptly! Similarly about Popper and his exclusion of induction. From this chapter, I appreciated very much the section on skeptical priors and its analysis from a meta-probabilist perspective.

There is no conclusion to the book, but to end up with a chapter on induction seems quite appropriate. (But there is an appendix as a probability tutorial, mentioning Monte Carlo resolutions. Plus notes on all chapters. And a commented bibliography.) Definitely recommended!

[Disclaimer about potential self-plagiarism: this post or an edited version will eventually appear in my Books Review section in CHANCE. As appropriate for a book about Chance!]

atheism: a very [very] short introduction [book review]

Posted in Books with tags , , , , , , , , , , , , , , , on November 3, 2017 by xi'an

After the rather disappointing Edge of Reason, I gave a try at Baggini’s very brief introduction to atheism, which is very short. And equally very disappointing. Rather than approaching the topic from a (academic) philosophical perspective, ex nihilo,  and while defending himself from doing so, the author indeed adopts a rather militant tone in trying to justify the arguments and ethics of atheism, setting the approach solely in a defensive opposition to religions. That is, in reverse, as an answer to faiths and creeds. Even when his arguments make complete sense, e.g., in the lack of support for agnosticism against atheism, the link with inductive reasoning (and Hume), and the logical [and obvious] disconnection between morality and religious attitudes.

“…once we accept the inductive method, we should, to be consistent, also accept that it points toward a naturalism that supports atheism…” (p.27)

While he mentions “militant atheism” as a fundamentalist position to be as avoided as the numerous religious versions, I find the whole exercise in this book missing the point of both an intellectual criticism of atheism [in the sense of Kant’s best seller!] and of the VSI series. Again, to define atheism as an answer to religions and to their irrationality is reducing the scope of this philosophical branch to a contrarian posture, rather than independently advancing a rationalist and scientific position on the entropic nature of life and the universe, one that does not require for a purpose or a higher cause. And to try to show it provides better answers to the same questions as those addressed by religions stoops down to their level.

“So it is not the case that atheism follows merely from some shallow commitment to the primacy of scientific inquiry.” (p.77)

The link therein with a philosophical analysis seems so weak that I deem the essay rather belongs to journalosophy. The very short history of atheism and its embarrassed debate on the attributed connections between atheism and some modern era totalitarianisms [found in the last chapter] are an illustration of this divergence from scholarly work. That the author felt the need to include pictures to illustrate his points says it all!


Posted in Books, Statistics, University life with tags , , , , , , , , , , on March 7, 2016 by xi'an

Oxford University Press sent me this book by Phyllis Illari and Frederica Russo, Causality (Philosophical theory meets scientific practice) a little while ago. (The book appeared in 2014.) Unless I asked for it, I cannot remember…

“The problem is whether and how to use information of general causation established in science to ascertain individual responsibility.” (p.38)

As the subtitle indicates, this is a philosophy book, not a statistics book. And not particularly intended for statisticians. Hence, I am not exactly qualified to analyse its contents, and even less to criticise its lack of connection with statistics. But this being a blog post…  I read rather slowly through the book, which exposes a wide range (“a map”, p.8) of approaches and perspectives on the notions of causality, some ways to infer about causality, and the point of doing all this, concluding with a relativistic (and thus eminently philosophical) viewpoint defending a “pluralistic mosaic” or a “causal mosaic” that relates to all existing accounts of causality as they “each do something valuable” (p.258). From a naïve bystander perspective, this sounds like a new avatar of deconstructionism applied to causality.

“Simulations can be very illuminating about various phenomena that are complex and have unexpected effects (…) can be run repeatedly to study a system in different situations to those seen for the real system…” (p.15)

This is not to state that the book is uninteresting, as it provides a wide entry into philosophical attempts at categorising and defining causality, if not into the statistical aspects of the issue. (For instance, the problem whether or not causality can be proven uniquely from a statistical perspective is not mentioned.) Among those interesting points in the early chapters, a section (2.5) about simulation. Which however misses the depth of this earlier book on climate simulations I reviewed while in Monash. Or of the discussions at the interdisciplinary seminar last year in Hanover. I.J. Good’s probabilistic causality is mentioned but hardly detailed. (With the warning remark that one “should not confuse predictability with determinism [and] determinism with causality”, p.82.) Continue reading

philosophy at the 2015 Baccalauréat

Posted in Books, Kids with tags , , , , , , , , , , on June 18, 2015 by xi'an

[Here is the pre-Bayesian quote from Hume that students had to analyse this year for the Baccalauréat:]

The maxim, by which we commonly conduct ourselves in our reasonings, is, that the objects, of which we have no experience, resembles those, of which we have; that what we have found to be most usual is always most probable; and that where there is an opposition of arguments, we ought to give the preference to such as are founded on the greatest number of past observations. But though, in proceeding by this rule, we readily reject any fact which is unusual and incredible in an ordinary degree; yet in advancing farther, the mind observes not always the same rule; but when anything is affirmed utterly absurd and miraculous, it rather the more readily admits of such a fact, upon account of that very circumstance, which ought to destroy all its authority. The passion of surprise and wonder, arising from miracles, being an agreeable emotion, gives a sensible tendency towards the belief of those events, from which it is derived.” David Hume, An Enquiry Concerning Human Understanding,

Philosophy of Science, a very short introduction (and review)

Posted in Books, Kids, Statistics, Travel with tags , , , , , , , , , , , on November 3, 2013 by xi'an

When visiting the bookstore on the campus of the University of Warwick two weeks ago, I spotted this book, Philosophy of Science, a very short introduction, by Samir Okasha, and the “bargain” offer of getting two books for £10 enticed me to buy it along with a Friedrich Nietzsche, a very short introduction… (Maybe with the irrational hope that my daughter would take a look at those for her philosophy course this year!)

Popper’s attempt to show that science can get by without induction does not succeed.” (p.23)

Since this is [unsusrprisingly!] a very short introduction, I did not get much added value from the book. Nonetheless, it was an easy read for short trips in the metro and short waits here and there. And would be a good [very short] introduction to any one newly interested in the philosophy of sciences. The first chapter tries to define what science is, with reference to the authority of Popper (and a mere mention of Wittgenstein), and concludes that there is no clear-cut demarcation between science and pseudo-science. (Mathematics apparently does not constitute a science: “Physics is the most fundamental science of all”, p.55) I would have liked to see the quote from Friedrich Nietzsche

“It is perhaps just dawning on five or six minds that physics, too, is only an interpretation and exegesis of the world (to suit us, if I may say so!) and not a world-explanation.”

in Beyond Good and Evil. as it illustrates the main point of the chapter and maybe the book that scientific theories can never be proven true, Plus, it is often misinterpreted as a anti-science statement by Nietzsche. (Plus, it links both books I bought!) Continue reading

the signal and the noise

Posted in Books, Statistics with tags , , , , , , , , , , , on February 27, 2013 by xi'an

It took me a while to get Nate Silver’s the signal and the noise: why so many predictions fall – but some don’t (hereafter s&n) and another while to read it (blame A Memory of Light!).

“Bayes and Price are telling Hume, don’t blame nature because you are too daft to understand it.” s&n, p.242

I find s&n highly interesting and it is rather refreshing to see the Bayesian approach so passionately promoted by a former poker player, as betting and Dutch book arguments have often been used as argument in favour of this approach. While it works well for some illustrations in the book, like poker and the stock market, as well as political polls and sports, I prefer more decision theoretic motivations for topics like weather prediction, sudden epidemics, global warming or terrorism. Of course, this passionate aspect makes s&n open to criticisms, like this one by Marcus and Davies in The New Yorker about seeing everything through the Bayesian lenses. The chapter on Bayes and Bayes’ theorem (Chapter 8) is a wee caricaturesque in this regard. Indeed, Silver sees too much in Bayes’ Essay, to the point of mistakenly attributing to Bayes a discussion of Hume’s sunrise problem. (The only remark is made in the Appendix, which was written by Price—like possibly the whole of the Essay!—, and  P.S. Laplace is the one who applied Bayesian reasoning to the problem, leading to Laplace’s succession rule.) The criticisms of frequentism are also slightly over-the-levee: they are mostly directed at inadequate models that a Bayesian analysis would similarly process in the wrong way. (Some critics argue on the opposite that Bayesian analysis is too much dependent on the model being “right”! Or on the availability of a fully-specified  model.) Seeing frequentism as restricted to “collecting data among just a sample of the population rather than the whole population” (p.252) is certainly not presenting a broad coverage of frequentism.

“Prediction serves a very central role in hypothesis testing, for instance, and therefore in all of science.” s&n, p.230

The book is written in a fairly enjoyable style, highly personal (no harm with that) and apart from superlativising (!) everyone making a relevant appearance—which seems the highest common denominator of all those pop’sci’ books I end up reviewing so very often!, maybe this is something like Rule #1 in Scientific Writing 101 courses: “makes the scientists sound real, turn’em into real people”—, I find it rather well-organised as it brings the reader from facts (prediction usually does poorly) to the possibility of higher quality prediction (by acknowledging prior information, accepting uncertainty, using all items of information available, further accepting uncertainty, &tc.). I am not sure the reader is the wiser by the end of the book on how one should improve one’s prediction tools, but there is a least a warning about the low quality of most predictions and predictive tools that should linger in the reader’s ears…. I enjoyed very much the chapter on chess, esp. the core about Kasparov’s misreading the computer reasons for a poor move (no further spoiler!), although I felt it was not much connected to the rest of the book.

In his review, Larry Wasserman argues that the defence Silver makes of his procedure is more frequentist than Bayesian. Because he uses calibration and long-term performances. Well… Having good calibration properties does not mean the procedure is not Bayesian or frequentist, simply that it is making efficient use of the available information. Anyway, I agree (!) with Larry on the point that Silver somehow “confuses “Bayesian inference” with “using Bayes’ theorem”. Or puts too much meaning in the use of Bayes’ theorem, not unlike the editors of Science & Vie a few months ago. To push Larry’s controversial statement a wee further, I would even wonder whether the book has anything to do about inference. Indeed, in the end, I find s&n rather uninformative about statistical modelling and even more (or less!) about model checking. The only “statistical” model that is truly discussed over the book is the power law distribution, applied to earthquakes and terrorist attack fatalities. This is not an helpful model in that (a) it does not explain anything, as it does not make use of covariates or side information, and (b) it has no predictive power, especially in the tails.  On the first point, concluding that Israel’s approach to counter-terrorism is successful because it “is the only country that has been able to bend” the power-law curve (p.442) sounds rather hasty. I’d like to see the same picture for Iraq, say. Actually, I found one in this arXiv paper. And it looks about the same for Afghanistan (Fig.4). On the second point, the modelling is poor in handling extreme values (which are the ones of interest in both cases) and cannot face change-points or lacks of stationary, an issue not sufficiently covered in s&n in my opinion. The difficulty with modelling volatile concepts like the stock market, the next presidential election or the move of your poker opponents is that there is no physical, immutable, law at play. Things can change from one instant to the next. Unpredictably. Esp. in the tails.

There are plenty of graphs in s&n, which is great, but not all of them are at the Tufte quality level. For instance, Figure 11-1 about the “average time U.S. common stock was held” contains six pie charts corresponding to six decades with the average time and a percentage which could be how long compared with the 1950s a stock was held. The graph is not mentioned in the text. (I will not mention Figure 8-2!) I also spotted a minuscule typo (`probabalistic’) on Figure 10-2A.

Maybe one last and highly personal remark about the chapter on poker (feel free to skip!): while I am a very poor card player, I do not mind playing cards (and loosing) with my kids. However, I simply do not understand the rationale of playing poker. If there is no money at stake, the game does not seem to make sense since every player can keep bluffing until the end of time. And if there is money at stake, I find the whole notion unethical. This is a zero sum game, so money comes from someone else’s pocket (or more likely someone else’s retirement plan or someone else’s kids college savings plan). Not much difference with the way the stock market behaves nowadays… (Incidentally, this chapter did not discuss at all the performances of computer poker programs, unexpectedly, as the number of possibilities is very small and they should thus be fairly efficient.)