## Berlin snapshot #2 [jatp]

Posted in pictures, Running, Travel with tags Berlin, Berlin wall, jatp, morning run, murals, Ost-Berlin on March 30, 2017 by xi'an

## estimation versus testing [again!]

Posted in Books, Statistics, University life with tags Bayes factors, Bayesian inference, Harold Jeffreys, hypothesis testing, parameter estimation, point null hypotheses, psychology, refereeing, review, spike-and-slab prior, unification on March 30, 2017 by xi'an

The following text is a review I wrote of the paper “Parameter estimation and Bayes factors”, written by J. Rouder, J. Haaf, and J. Vandekerckhove. (As the journal to which it is submitted gave me the option to sign my review.)

The opposition between estimation and testing as a matter of prior modelling rather than inferential goals is quite unusual in the Bayesian literature. In particular, if one follows Bayesian decision theory as in Berger (1985), there is no such opposition, but rather the use of different loss functions for different inference purposes, while the Bayesian model remains single and unitary.

Following Jeffreys (1939), it sounds more congenial to the Bayesian spirit to return the posterior probability of an hypothesis *H⁰* as an answer to the question whether this hypothesis holds or does not hold. This however proves impossible when the “null” hypothesis *H⁰* has prior mass equal to zero (or is not measurable under the prior). In such a case the mathematical answer is a probability of zero, which may not satisfy the experimenter who asked the question. More fundamentally, the said prior proves inadequate to answer the question and hence to incorporate the information contained in this very question. This is how Jeffreys (1939) justifies the move from the original (and deficient) prior to one that puts some weight on the null (hypothesis) space. It is often argued that the move is unnatural and that the null space does not make sense, but this only applies when believing very strongly in the model itself. When considering the issue from a modelling perspective, accepting the null *H⁰* means using a new model to represent the data, and hence testing becomes a model choice problem, namely whether one should use a complex or a simplified model to represent the generation of the data. This is somehow the “unification” advanced in the current paper, although it appears originally in Jeffreys (1939) [and then numerous others] rather than in the relatively recent Mitchell & Beauchamp (1988), who may have launched the spike & slab denomination.

I have trouble with the analogy drawn in the paper between the spike & slab estimate and the Stein effect. While the posterior mean derived from the spike & slab posterior is indeed a quantity drawn towards zero by the Dirac mass at zero, this is rarely the point of using a spike & slab prior, since this point estimate does not lead to a conclusion about the hypothesis: for one thing, it is never exactly zero (if zero corresponds to the null). For another thing, the construction of the spike & slab prior is both artificial and dependent on the weights given to the spike and to the slab, respectively, to borrow expressions from the paper. This approach thus leads to model averaging rather than hypothesis testing or model choice and therefore fails to answer the (possibly absurd) question as to which model to choose. Or refuse to choose. But there are cases when a decision must be made, like continuing a clinical trial or putting a new product on the market. Or not.
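To make the point about the posterior mean concrete, here is a minimal Python sketch in a toy setting that is my own assumption and not taken from the paper: a single observation x ~ N(θ, 1) with prior ρ δ₀ + (1 − ρ) N(0, τ²). The function names and the values of `rho` and `tau2` are purely illustrative.

```python
from math import exp, pi, sqrt


def normal_pdf(x, var):
    """Density at x of a centred normal distribution with variance var."""
    return exp(-x * x / (2.0 * var)) / sqrt(2.0 * pi * var)


def spike_slab_posterior(x, rho, tau2):
    """Posterior summary for x ~ N(theta, 1) under the (assumed) prior
    rho * delta_0 + (1 - rho) * N(0, tau2)."""
    m0 = normal_pdf(x, 1.0)         # marginal density of x under the spike
    m1 = normal_pdf(x, 1.0 + tau2)  # marginal density of x under the slab
    post_null = rho * m0 / (rho * m0 + (1.0 - rho) * m1)
    # The slab posterior mean is tau2 / (1 + tau2) * x; the spike contributes 0.
    post_mean = (1.0 - post_null) * tau2 / (1.0 + tau2) * x
    return post_null, post_mean


# Same observation, different spike weights (hypothetical values).
for rho in (0.1, 0.5, 0.9):
    p0, mean = spike_slab_posterior(x=2.0, rho=rho, tau2=10.0)
    print(f"rho={rho:.1f}  P(H0|x)={p0:.3f}  E[theta|x]={mean:.3f}")
```

In this sketch the posterior mean shrinks towards zero as the spike weight grows, yet it is never exactly zero when x ≠ 0, and both outputs move with `rho`, which is the dependence on the weights mentioned above.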

In conclusion, the paper surprisingly bypasses the decision-making aspect of testing and hence ends up with an inconclusive setting, staying midstream between Bayes factors and credible intervals. And failing to provide a tool for decision making. The paper also fails to acknowledge the strong dependence of the Bayes factor on the tail behaviour of the prior(s), which cannot be [completely] corrected by a finite sample, hence its relativity and the unreasonableness of a fixed scale like Jeffreys’ (1939).
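As a rough illustration of why a fixed scale is delicate, the following Python sketch uses an assumed toy model (not the paper's): x ~ N(θ, 1), with H⁰: θ = 0 against a N(0, τ²) prior on θ. It varies the prior scale τ² rather than the tail behaviour proper, which is a simplification on my part; the function name is hypothetical.

```python
from math import exp, sqrt


def bayes_factor_01(x, tau2):
    """Bayes factor of H0: theta = 0 against theta ~ N(0, tau2),
    for one observation x ~ N(theta, 1) (assumed toy model).

    It is the ratio of the N(0, 1) and N(0, 1 + tau2) densities at x.
    """
    return sqrt(1.0 + tau2) * exp(-0.5 * x * x * tau2 / (1.0 + tau2))


# Same observation, increasingly diffuse alternative priors.
for tau2 in (1.0, 100.0, 10000.0):
    print(f"tau2={tau2:>8.0f}  B01={bayes_factor_01(2.0, tau2):.3f}")
```

For the same observation x = 2, the Bayes factor moves from below one to well above one as τ² grows, so the same data can land on either side of any fixed threshold depending on the prior chosen under the alternative.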

## Berlin snapshot #1 [jatp]

Posted in Statistics with tags Berlin, bridge, Germany, jatp, running, Schilling, Spree on March 29, 2017 by xi'an

## Fourth Bayesian, Fiducial, and Frequentist Conference

Posted in Books, pictures, Statistics, Travel, University life, Wines with tags Bayesian Analysis, Cambridge, Error-Statistical philosophy, foundations, Harvard University, Philosophy of Science, snow, Statistics done wrong on March 29, 2017 by xi'an

Next May 1-3, I will attend the 4th Bayesian, Fiducial and Frequentist Conference at Harvard University (hopefully not under snow at that time of year), which is a meeting between philosophers and statisticians about foundational thinking in statistics and inference under uncertainty. This should be fun! (Registration is now open.)

## zurück nach Berlin [jatp]

Posted in Statistics with tags Amazon, Berlin, Germany, jatp, OxWaSP, Stadtmitte on March 28, 2017 by xi'an

## Le Monde puzzle [#1000…1025]

Posted in Kids, R with tags Alice and Bob, arithmetics, competition, Le Monde, mathematical puzzle, R, Tangente on March 28, 2017 by xi'an

The Le Monde mathematical puzzle launched a competition to celebrate its 1000th puzzle! A fairly long-term competition as it runs over the 25 coming puzzles (and hence weeks). Starting with puzzle #1001. Here is the 1000th puzzle, not part of the competition:

Alice & Bob spend five (identical) vouchers in five different shops, each time buying the maximum number of items to get close to the voucher value. In these five shops, they buy sofas at 421 euros each, beds at 347 euros each, kitchen appliances at 289 euros each, tables at 251 euros each and bikes at 211 euros each, respectively. Once the buying frenzy is over, they realise that within a single shop, they would have spent exactly four vouchers for the same products. What is the value of a voucher?
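The puzzle lends itself to a brute-force search. The Python sketch below rests on my own reading of the statement, which is an assumption: in each shop they buy as many items as one voucher allows, and the total spent over the five shops equals exactly four times the voucher value. The function names are hypothetical.

```python
PRICES = (421, 347, 289, 251, 211)  # sofas, beds, appliances, tables, bikes


def spent(voucher, prices=PRICES):
    """Total spent when each shop sells the largest affordable number of items."""
    return sum(p * (voucher // p) for p in prices)


def candidate_vouchers(prices=PRICES, multiple=4):
    """Voucher values whose total spending equals `multiple` vouchers."""
    # With five vouchers and four spent, the unspent remainders must add up
    # to one voucher. Each remainder is below its price, so the voucher
    # value is below sum(prices), which bounds the search.
    upper = sum(prices)
    return [v for v in range(1, upper) if spent(v, prices) == multiple * v]


print(candidate_vouchers())
```

Under this reading, the search returns two candidates, 707 and 1000 euros. For instance, with a voucher of 1000 the five shops take 842 + 694 + 867 + 753 + 844 = 4000 euros.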