## Archive for University of Amsterdam

## JASP, a really really fresh way to do stats

Posted in Statistics with tags Bayes factors, Bayesian inference, design, Harold Jeffreys, JASP, tee-shirt, University of Amsterdam on February 1, 2018 by xi'an

## absolutely no Bayesians inside!

Posted in Statistics with tags Amsterdam, cartoon, English grammar, JASP, statistical software, sticker, Trojan horse, University of Amsterdam, Viktor Breekman on December 11, 2017 by xi'an

## bridgesampling [R package]

Posted in pictures, R, Statistics, University life with tags Amsterdam, bridge, bridge sampling, bridgesampling, JAGS, R, R package, STAN, University of Amsterdam, warped bridge sampling on November 9, 2017 by xi'an

**Q**uentin F. Gronau, Henrik Singmann and Eric-Jan Wagenmakers have arXived a detailed documentation about their *bridgesampling* R package. (No wonder that researchers from Amsterdam favour bridge sampling!)

*[The package relates to a [52 pages] tutorial on bridge sampling by Gronau et al. that I will hopefully comment on soon.]*

The bridge sampling methodology for marginal likelihood approximation requires *two* Monte Carlo samples for a ratio of *two* integrals. A nice twist in this approach is to use a dummy integral that is already available, with respect to a probability density that is an approximation to the exact posterior. This means avoiding the difficulties with bridge sampling of bridging two different parameter spaces, in possibly different dimensions, with potentially very little overlap between the posterior distributions. The substitute probability density is chosen as Normal or warped Normal, rather than a t which would provide more stability in my opinion. The **bridgesampling** package also provides an error evaluation for the approximation, although based on spectral estimates derived from the *coda* package. The remainder of the document exhibits how the package can be used in conjunction with either JAGS or Stan. And concludes with the following words of caution:

“It should also be kept in mind that there may be cases in which the bridge sampling procedure may not be the ideal choice for conducting Bayesian model comparisons. For instance, when the models are nested it might be faster and easier to use the Savage-Dickey density ratio (Dickey and Lientz 1970; Wagenmakers et al. 2010). Another example is when the comparison of interest concerns a very large model space, and a separate bridge sampling based computation of marginal likelihoods may take too much time. In this scenario, Reversible Jump MCMC (Green 1995) may be more appropriate.”
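To make the two-sample idea concrete, here is a minimal Python sketch of bridge sampling with a Normal substitute density, using the iterative optimal-bridge update of Meng and Wong (1996). It is not the code of the bridgesampling package: the toy conjugate model (observations from N(θ,1) with a N(0,1) prior on θ), the function names, and the use of exact posterior draws in place of MCMC output are all assumptions made for illustration. The toy model is chosen because its marginal likelihood is available in closed form, which allows a check of the estimate.

```python
import math
import random
import statistics


def log_normal_pdf(x, mean, sd):
    """Log density of N(mean, sd^2) at x."""
    return -0.5 * math.log(2.0 * math.pi * sd * sd) - 0.5 * ((x - mean) / sd) ** 2


def bridge_log_marginal(log_q, posterior_draws, rng, max_iter=1000, tol=1e-10):
    """Estimate log of the integral of exp(log_q) by bridge sampling (sketch)."""
    # First half of the draws fits the Normal proposal, second half feeds the estimator.
    half = len(posterior_draws) // 2
    fit_draws, use_draws = posterior_draws[:half], posterior_draws[half:]
    m, s = statistics.fmean(fit_draws), statistics.stdev(fit_draws)

    n1 = n2 = len(use_draws)
    proposal_draws = [rng.gauss(m, s) for _ in range(n2)]

    # Log ratios q/g at the posterior draws (l1) and at the proposal draws (l2).
    log_l1 = [log_q(t) - log_normal_pdf(t, m, s) for t in use_draws]
    log_l2 = [log_q(t) - log_normal_pdf(t, m, s) for t in proposal_draws]

    # Subtract a constant before exponentiating, for numerical stability.
    shift = statistics.median(log_l1)
    l1 = [math.exp(v - shift) for v in log_l1]
    l2 = [math.exp(v - shift) for v in log_l2]

    s1, s2 = n1 / (n1 + n2), n2 / (n1 + n2)
    r = 1.0  # current estimate of the marginal likelihood, on the shifted scale
    for _ in range(max_iter):
        numerator = sum(v / (s1 * v + s2 * r) for v in l2) / n2
        denominator = sum(1.0 / (s1 * v + s2 * r) for v in l1) / n1
        r_new = numerator / denominator
        converged = abs(r_new - r) <= tol * abs(r_new)
        r = r_new
        if converged:
            break
    return math.log(r) + shift


def make_log_q(y):
    """Unnormalised log posterior for y_i ~ N(theta, 1), theta ~ N(0, 1)."""
    def log_q(theta):
        log_lik = sum(log_normal_pdf(obs, theta, 1.0) for obs in y)
        return log_lik + log_normal_pdf(theta, 0.0, 1.0)
    return log_q


def exact_log_marginal(y):
    """Closed-form log marginal likelihood of the toy model."""
    n, sy, syy = len(y), sum(y), sum(v * v for v in y)
    return (-0.5 * n * math.log(2.0 * math.pi) - 0.5 * math.log(n + 1)
            - 0.5 * (syy - sy * sy / (n + 1)))


def demo(seed=1, n_obs=10, n_draws=4000):
    """Return (bridge estimate, exact value) of the log marginal likelihood."""
    rng = random.Random(seed)
    y = [rng.gauss(0.5, 1.0) for _ in range(n_obs)]
    # Exact posterior draws stand in for MCMC output in this toy example.
    post_mean, post_sd = sum(y) / (n_obs + 1), math.sqrt(1.0 / (n_obs + 1))
    draws = [rng.gauss(post_mean, post_sd) for _ in range(n_draws)]
    return bridge_log_marginal(make_log_q(y), draws, rng), exact_log_marginal(y)
```

Calling `demo()` returns the bridge estimate next to the closed-form value. Since the posterior is exactly Normal here, the Normal proposal overlaps with it almost perfectly and the two numbers should agree closely; with a heavier-tailed or skewed posterior the agreement would depend much more on the choice of substitute density.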

## Bayesian spectacles

Posted in Books, pictures, Statistics, University life with tags Amsterdam, Bayes factors, Bayesian Spectacles, blogging, Holland, JASP, non-informative priors, objective Bayes, reference priors, UMPBTs, uniformly most powerful tests, University of Amsterdam on October 4, 2017 by xi'an

E.J. Wagenmakers and his enthusiastic team of collaborators at the University of Amsterdam and in the JASP software designing team have started a blog called Bayesian spectacles, which I find a fantastic title. And not only because I wear glasses. Plus, they got their own illustrator, Viktor Beekman, which sounds like the epitome of sophistication! (Compared with resorting to vacation or cat pictures…)

In a most recent post they addressed the criticisms we made of the 72-author paper on p-values, one of the co-authors being E.J.! Andrew already re-addressed some of the address, but here is a disagreement he left me to chew on my own [and where the Abandoners are us!]:

“Disagreement 2. The Abandoners’ critique the UMPBTs –the uniformly most powerful Bayesian tests– that features in the original paper. This is their right (see also the discussion of the 2013 Valen Johnson PNAS paper), but they ignore the fact that the original paper presented a series of other procedures that all point to the same conclusion: p-just-below-.05 results are evidentially weak. For instance, a cartoon on the JASP blog explains the Vovk-Sellke bound. A similar result is obtained using the upper bounds discussed in Berger & Sellke (1987) and Edwards, Lindman, & Savage (1963). We suspect that the Abandoners’ dislike of Bayes factors (and perhaps their upper bounds) is driven by a disdain for the point-null hypothesis. That is understandable, but the two critiques should not be mixed up. The first question is Given that we wish to test a point-null hypothesis, do the Bayes factor upper bounds demonstrate that the evidence is weak for p-just-below-.05 results? We believe they do, and in this series of blog posts we have provided concrete demonstrations.”
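For readers who want numbers behind the bounds mentioned in this quote, the following Python sketch computes two of them for a given p-value: the Vovk-Sellke bound, −e·p·log p for p < 1/e, and the bound over all priors on the alternative for a two-sided test on a one-dimensional Normal mean, exp(−z²/2). Both are written here as lower bounds on the Bayes factor in favour of the null, so their inverses are upper bounds on the evidence against it. The function names and the equal prior weights used to turn a bound into a posterior probability are my own choices for illustration.

```python
import math
from statistics import NormalDist


def vovk_sellke_bound(p):
    """Lower bound -e * p * log(p) on the Bayes factor in favour of H0 (for p < 1/e)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if p >= 1.0 / math.e:
        return 1.0  # the bound is uninformative beyond 1/e
    return -math.e * p * math.log(p)


def all_priors_bound_normal(p):
    """Lower bound exp(-z^2 / 2) over all priors, two-sided test of a 1-d Normal mean."""
    z = NormalDist().inv_cdf(1.0 - p / 2.0)  # |z| statistic matching the p-value
    return math.exp(-0.5 * z * z)


def min_posterior_prob(bound, prior_prob=0.5):
    """Smallest posterior probability of H0 implied by a Bayes factor lower bound."""
    prior_odds = prior_prob / (1.0 - prior_prob)
    posterior_odds = prior_odds * bound
    return posterior_odds / (1.0 + posterior_odds)


if __name__ == "__main__":
    p = 0.05
    for name, bound in (("Vovk-Sellke", vovk_sellke_bound(p)),
                        ("all priors, 1-d Normal", all_priors_bound_normal(p))):
        print(f"{name}: BF01 >= {bound:.4f}, BF10 <= {1.0 / bound:.2f}, "
              f"P(H0 | x) >= {min_posterior_prob(bound):.3f}")
```

At p = 0.05 the two bounds are roughly 0.41 and 0.15, which under equal prior weights translate into posterior probabilities of the null of at least about 0.29 and 0.13, both well above 0.05.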

Obviously, this reply calls for an examination of the entire BS blog series, but being short on time at the moment, let me point out that the lower bounds on the Bayes factors showing much more support for H₀ than a p-value at 0.05 only occur in special circumstances. Even though I spend some time in my book discussing those bounds. Indeed, the [interesting] fact that the lower bounds are larger than the p-values does not hold in full generality. Moving to a two-dimensional normal with potentially zero mean is enough to see the order between lower bound and p-value reverse, as I found [quite] a while ago when trying to expand Berger and Sellke (1987, the same year I was visiting Purdue, where both had a position). I am not sure this feature has been much explored in the literature; I did not pursue it when I realised the gap was missing in larger dimensions…

I must also point out I do not have the same repulsion for point nulls as Andrew! While considering whether a parameter, say a mean, is exactly zero [or three or whatever] sounds rather absurd when faced with the strata of uncertainty about models, data, procedures, &tc. (even in theoretical physics!), comparing several [and all wrong!] models with or without some parameters for later use still makes sense. And my reluctance in using Bayes factors does not stem from an opposition to comparing models or from the procedure itself, which is quite appealing within a Bayesian framework [thus appealing *per se*!], but rather from the unfortunate impact of the prior [and its tail behaviour] on the quantity and on the delicate calibration of the thing. And on a lack of reference solution [to avoid the O and the N words!]. As exposed in the demise papers. (Whose main version remains in a publishing limbo, the onslaught from the referees proving just too much for me!)
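The remark above about the two-dimensional normal can be illustrated with a small Python sketch, under an assumption that should be stated loudly: the bound is taken here over all priors on the alternative, which is the crudest class and not necessarily the one considered when extending Berger and Sellke (1987). For x ~ N_d(θ, I) and the point null θ = 0, that bound on the Bayes factor in favour of the null is f(x|0)/sup_θ f(x|θ) = exp(−t/2) with t = ‖x‖², while the p-value is the chi-square(d) tail probability at t. Only d = 1 and d = 2 are handled, since both tails have simple closed forms; the function names are mine.

```python
import math
from statistics import NormalDist


def observed_statistic(p, dim):
    """Value t of ||x||^2 whose chi-square(dim) tail probability equals p."""
    if dim == 1:
        z = NormalDist().inv_cdf(1.0 - p / 2.0)
        return z * z
    if dim == 2:
        return -2.0 * math.log(p)  # chi-square(2) tail probability is exp(-t / 2)
    raise ValueError("this sketch only handles dim 1 and dim 2")


def bayes_factor_lower_bound(t):
    """Bound over all priors: f(x | 0) / sup_theta f(x | theta) = exp(-t / 2)."""
    return math.exp(-0.5 * t)


def posterior_lower_bound(t):
    """Matching lower bound on P(H0 | x) with equal prior weights."""
    b = bayes_factor_lower_bound(t)
    return b / (1.0 + b)


def compare(p=0.05):
    """Rows of (dim, p-value, Bayes factor bound, posterior probability bound)."""
    rows = []
    for dim in (1, 2):
        t = observed_statistic(p, dim)
        rows.append((dim, p, bayes_factor_lower_bound(t), posterior_lower_bound(t)))
    return rows


if __name__ == "__main__":
    for dim, p, bf, post in compare():
        print(f"d={dim}: p={p:.3f}  BF01 bound={bf:.4f}  P(H0|x) bound={post:.4f}")
```

In dimension one the bound on the posterior probability sits well above the p-value, whereas in dimension two the Bayes factor bound coincides with the p-value and the posterior bound p/(1+p) falls just below it, so the ordering is indeed not universal, at least for this class of priors.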