Archive for JRSSB

misspecified [but published!]

Posted in Statistics with tags , , , , , on April 1, 2020 by xi'an

Jeffreys priors for hypothesis testing [Bayesian reads #2]

Posted in Books, Statistics, University life with tags , , , , , , , , , , , , , , , , on February 9, 2019 by xi'an

A second (re)visit to a reference paper I gave to my OxWaSP students for the last round of this CDT joint program. Indeed, this may be my first complete read of Susie Bayarri and Gonzalo Garcia-Donato 2008 Series B paper, inspired by Jeffreys’, Zellner’s and Siow’s proposals in the Normal case. (Disclaimer: I was not the JRSS B editor for this paper.) Which I saw as a talk at the O’Bayes 2009 meeting in Phillie.

The paper aims at constructing formal rules for objective proper priors in testing embedded hypotheses, in the spirit of Jeffreys’ Theory of Probability “hidden gem” (Chapter 3). The proposal is based on symmetrised versions of the Kullback-Leibler divergence κ between null and alternative used in a transform like an inverse power of 1+κ. With a power large enough to make the prior proper. Eventually multiplied by a reference measure (i.e., the arbitrary choice of a dominating measure.) Can be generalised to any intrinsic loss (not to be confused with an intrinsic prior à la Berger and Pericchi!). Approximately Cauchy or Student’s t by a Taylor expansion. To be compared with Jeffreys’ original prior equal to the derivative of the atan transform of the root divergence (!). A delicate calibration by an effective sample size, lacking a general definition.

At the start the authors rightly insist on having the nuisance parameter v to differ for each model but… as we all often do they relapse back to having the “same ν” in both models for integrability reasons. Nuisance parameters make the definition of the divergence prior somewhat harder. Or somewhat arbitrary. Indeed, as in reference prior settings, the authors work first conditional on the nuisance then use a prior on ν that may be improper by the “same” argument. (Although conditioning is not the proper term if the marginal prior on ν is improper.)

The paper also contains an interesting case of the translated Exponential, where the prior is L¹ Student’s t with 2 degrees of freedom. And another one of mixture models albeit in the simple case of a location parameter on one component only.

the paper where you are a node

Posted in Books, Statistics, University life with tags , , , , , , , , , , , , , , , on February 5, 2019 by xi'an

Sophie Donnet pointed out to me this arXived paper by Tianxi Li, Elizaveta Levina, and Ji Zhu, on a network resampling strategy for X validation, where I appear as a datapoint rather than as a [direct] citation! Which reminded me of the “where you are the hero” gamebooks with which my kids briefly played, before computer games took over. The model selection method is illustrated on a dataset made of X citations [reduced to 706 authors]  in all papers published between 2003 and 2012 in the Annals of Statistics, Biometrika, JASA, and JRSS Series B. With the outcome being the determination of a number of communities, 20, which the authors labelled as they wanted, based on 10 authors with the largest number of citations in the category. As it happens, I appear in the list, within the “mixed (causality + theory + Bayesian)” category (!), along with Jamie Robbins, Paul Fearnhead, Gilles Blanchard, Zhiqiang Tan, Stijn Vansteelandt, Nancy Reid, Jae Kwang Kim, Tyler VanderWeele, and Scott Sisson, which is somewhat mind-boggling in that I am pretty sure I never quoted six of these authors [although I find it hilarious that Jamie appears in the category, given that we almost got into a car crash together, at one of the Valencià meetings!].

mixture modelling for testing hypotheses

Posted in Books, Statistics, University life with tags , , , , , , , , , , on January 4, 2019 by xi'an

After a fairly long delay (since the first version was posted and submitted in December 2014), we eventually revised and resubmitted our paper with Kaniav Kamary [who has now graduated], Kerrie Mengersen, and Judith Rousseau on the final day of 2018. The main reason for this massive delay is mine’s, as I got fairly depressed by the general tone of the dozen of reviews we received after submitting the paper as a Read Paper in the Journal of the Royal Statistical Society. Despite a rather opposite reaction from the community (an admittedly biased sample!) including two dozens of citations in other papers. (There seems to be a pattern in my submissions of Read Papers, witness our earlier and unsuccessful attempt with Christophe Andrieu in the early 2000’s with the paper on controlled MCMC, leading to 121 citations so far according to G scholar.) Anyway, thanks to my co-authors keeping up the fight!, we started working on a revision including stronger convergence results, managing to show that the approach leads to an optimal separation rate, contrary to the Bayes factor which has an extra √log(n) factor. This may sound paradoxical since, while the Bayes factor  converges to 0 under the alternative model exponentially quickly, the convergence rate of the mixture weight α to 1 is of order 1/√n, but this does not mean that the separation rate of the procedure based on the mixture model is worse than that of the Bayes factor. On the contrary, while it is well known that the Bayes factor leads to a separation rate of order √log(n) in parametric models, we show that our approach can lead to a testing procedure with a better separation rate of order 1/√n. We also studied a non-parametric setting where the null is a specified family of distributions (e.g., Gaussians) and the alternative is a Dirichlet process mixture. Establishing that the posterior distribution concentrates around the null at the rate √log(n)/√n. We thus resubmitted the paper for publication, although not as a Read Paper, with hopefully more luck this time!

coauthorship and citation networks

Posted in Books, pictures, R, Statistics, University life with tags , , , , , , , , , on February 21, 2017 by xi'an

cozauthorAs I discovered (!) the Annals of Applied Statistics in my mailbox just prior to taking the local train to Dauphine for the first time in 2017 (!), I started reading it on the way, but did not get any further than the first discussion paper by Pengsheng Ji and Jiashun Jin on coauthorship and citation networks for statisticians. I found the whole exercise intriguing, I must confess, with little to support a whole discussion on the topic. I may have read the paper too superficially as a métro pastime, but to me it sounded more like a post-hoc analysis than a statistical exercise, something like looking at the network or rather at the output of a software representing networks and making sense of clumps and sub-networks a posteriori. (In a way this reminded of my first SAS project at school, on the patterns of vacations in France. It was in 1983 on pinched cards. And we spent a while cutting & pasting in a literal sense the 80 column graphs produced by SAS on endless listings.)

It may be that part of the interest in the paper is self-centred. I do not think analysing a similar dataset in another field like deconstructionist philosophy or Korean raku would have attracted the same attention. Looking at the clusters and the names on the pictures is obviously making sense, if more at a curiosity than a scientific level, as I do not think this brings much in terms of ranking and evaluating research (despite what Bernard Silverman suggests in his preface) or understanding collaborations (beyond the fact that people in the same subfield or same active place like Duke tend to collaborate). Speaking of curiosity, I was quite surprised to spot my name in one network and even more to see that I was part of the “High-Dimensional Data Analysis” cluster, rather than of the “Bayes” cluster.  I cannot fathom how I ended up in that theme, as I cannot think of a single paper of mines pertaining to either high dimensions or data analysis [to force the trait just a wee bit!]. Maybe thanks to my joint paper with Peter Mueller. (I tried to check the data itself but cannot trace my own papers in the raw datafiles.)

I also wonder what is the point of looking at solely four major journals in the field, missing for instance most of computational statistics and biostatistics, not to mention machine learning or econometrics. This results in a somewhat narrow niche, if obviously recovering the main authors in the [corresponding] field. Some major players in computational stats still make it to the lists, like Gareth Roberts or Håvard Rue, but under the wrong categorisation of spatial statistics.