Archive for Series B

the end of the Series B’log…

Posted in Books, Statistics, University life on September 22, 2017 by xi'an

Today is the final day of the Series B’log, as David Dunson, Piotr Fryzlewicz and I have decided to stop the experiment, faute de combattants (as we say in French). The authors kindly contributed long abstracts of their papers, for which I am grateful, but with a single exception no one came forward with comments or criticisms, and the idea of turning some Series B papers into discussion papers does not seem to appeal, at least in this format. Maybe the concept will be rekindled in another form in the near future, but for now we let it lie. So be it!

Series B’log

Posted in Books, Statistics, University life on May 31, 2017 by xi'an

Since the above announcement in the RSS newsletter a few months ago, about the Series B’log coming to life, I have received exactly zero comments from readers, despite several authors kindly contributing an extended abstract of their paper. And announcements to various societies…

Hence I now seriously wonder about the survival probability of the blog, given this collective lack of interest. It may be that the information did not reach enough people (despite my mentioning its existence in each talk I give abroad). It may be that the blog still sounds like “under construction”, in which case I’d like to hear suggestions to make it look more definitive! But overall I remain fairly pessimistic [even conditional on my Gallic gloom] about our chances of success with this experiment, which could have turned every Series B paper into a potential discussion paper!

a somewhat hasty announcement

Posted in Books, Statistics, University life on March 13, 2017 by xi'an

When I received the above RSS newsletter on Thursday, I was a bit shocked as I had not planned to make the existence of the Series B’log known to the entire Society. Even though it was already visible and with unrestricted access. The reason being that experimenting with authors and editors was easier without additional email and password exchanges…

Anyway, now that we have jumped that Rubicon, I would more than welcome comments and suggestions to make the blog structure more efficient and readable. I am still confused as to what the front page should look like, because I want to keep the hierarchy of the Journal, i.e., volume/issue/paper, reflected in this structure, rather than piling up comments and authors’ summaries in a haphazard manner. I have started to tag entries by the volume/issue tag, in order to keep some of this hierarchy respected, but I would also like to provide all entries related to a given paper without getting into much extra work. Given that I already have to process most entries through latex2wp in the best scenario.

coauthorship and citation networks

Posted in Books, pictures, R, Statistics, University life on February 21, 2017 by xi'an

As I discovered (!) the Annals of Applied Statistics in my mailbox just prior to taking the local train to Dauphine for the first time in 2017 (!), I started reading it on the way, but did not get any further than the first discussion paper by Pengsheng Ji and Jiashun Jin on coauthorship and citation networks for statisticians. I found the whole exercise intriguing, I must confess, with little to support a whole discussion on the topic. I may have read the paper too superficially as a métro pastime, but to me it sounded more like a post-hoc analysis than a statistical exercise, something like looking at the network, or rather at the output of a software representing networks, and making sense of clumps and sub-networks a posteriori. (In a way this reminded me of my first SAS project at school, on the patterns of vacations in France. It was in 1983 on punched cards. And we spent a while cutting & pasting in a literal sense the 80-column graphs produced by SAS on endless listings.)

It may be that part of the interest in the paper is self-centred. I do not think analysing a similar dataset in another field like deconstructionist philosophy or Korean raku would have attracted the same attention. Looking at the clusters and the names on the pictures is obviously making sense, if more at a curiosity than a scientific level, as I do not think this brings much in terms of ranking and evaluating research (despite what Bernard Silverman suggests in his preface) or understanding collaborations (beyond the fact that people in the same subfield or same active place like Duke tend to collaborate). Speaking of curiosity, I was quite surprised to spot my name in one network and even more to see that I was part of the “High-Dimensional Data Analysis” cluster, rather than of the “Bayes” cluster. I cannot fathom how I ended up in that theme, as I cannot think of a single paper of mine pertaining to either high dimensions or data analysis [to force the trait just a wee bit!]. Maybe thanks to my joint paper with Peter Mueller. (I tried to check the data itself but cannot trace my own papers in the raw datafiles.)

I also wonder what the point is of looking solely at four major journals in the field, missing for instance most of computational statistics and biostatistics, not to mention machine learning or econometrics. This results in a somewhat narrow niche, if obviously recovering the main authors in the [corresponding] field. Some major players in computational stats still make it to the lists, like Gareth Roberts or Håvard Rue, but under the wrong categorisation of spatial statistics.

Elsevier in the frontline

Posted in Books, Statistics, University life on January 27, 2017 by xi'an

“Viewed this way, the logo represents, in classical symbolism, the symbiotic relationship between publisher and scholar. The addition of the Non Solus inscription reinforces the message that publishers, like the elm tree, are needed to provide sturdy support for scholars, just as surely as scholars, the vine, are needed to produce fruit. Publishers and scholars cannot do it alone. They need each other. This remains as apt a representation of the relationship between Elsevier and its authors today – neither dependent, nor independent, but interdependent.”

There were two items of news related to the publishark Elsevier in the latest issue of Nature I read. One was that Germany, Peru, and Taiwan no longer had access to Elsevier journals, after negotiations or funding stopped. Meaning the scientists there have to find alternative ways to procure the papers, from the authors’ webpage [I do not get why authors fail to provide their papers through their publication webpage!] to peer-to-peer platforms like Sci-Hub. Beyond this short-term solution, I hope this pushes for the development of arXiv-based journals, like Gowers’ Discrete Analysis. Actually, we [statisticians] should start planning a Statistics version of it!

The second item is about Elsevier developing its own impact factor index, CiteScore. While I do not deem the competition any more relevant for assessing research “worth”, seeing a publishark developing its own metrics sounds about as appropriate as Breitbart News starting an ethical index for fake news. I checked the assessment of Series B on that platform, which returns the journal as ranking third, with the surprising inclusion of the Annual Review of Statistics and its Application [sic], a review journal that only started two years ago, of Annals of Mathematics, which does not seem to pertain to the category of Statistics, Probability, and Uncertainty, and of Statistics Surveys, an IMS review journal that started in 2009 (of which I was blissfully unaware). And the article in Nature points out that “scientists at the Eigenfactor project, a research group at the University of Washington, published a preliminary calculation finding that Elsevier’s portfolio of journals gains a 25% boost relative to others if CiteScore is used instead of the JIF”. Not particularly surprising, eh?!

When looking for an illustration of this post, I came upon the hilarious quote given at the top: I particularly enjoy the newspeak reversal between the tree and the vine,  the parasite publishark becoming the support and the academics the (invasive) vine… Just brilliant! (As a last note, the same issue of Nature mentions New Zealand aiming at getting rid of all invasive predators: I wonder if publishing predators are also included!)

a new Editor for Series B

Posted in Statistics on January 16, 2017 by xi'an

As every odd year, the Royal Statistical Society is seeking a new joint editor for Series B! After four years of dedication to the (The!) journal, Piotr Fryzlewicz is indeed going to retire from this duty by the end of 2017. Many thanks to Piotr for his unfailing involvement in Series B and the preservation of its uncompromising selection of papers! The call is thus open for candidates for the next round of editorship, from 2018 to 2021, with a deadline of 31 January, 2017. Interested candidates should contact Martin Owen, at the Society’s address or by email at with journal as recipient (local-part). The new editor will work with the current joint editor, David Dunson, whose term runs till December 2019. (I am also looking forward to working with Piotr’s successor in developing the Series B blog, Series’ Blog!)

a Bayesian criterion for singular models [discussion]

Posted in Books, Statistics, University life on October 10, 2016 by xi'an

[Here is the discussion Judith Rousseau and I wrote about the paper by Mathias Drton and Martyn Plummer, a Bayesian criterion for singular models, which was discussed last week at the Royal Statistical Society. There is still time to send a written discussion! Note: This post was written using the latex2wp converter.]

It is a well-known fact that the BIC approximation of the marginal likelihood in a given irregular model {\mathcal M_k} fails or may fail. The BIC approximation has the form

\displaystyle BIC_k = \log p(\mathbf Y_n| \hat \pi_k, \mathcal M_k) - d_k \log n /2

where {d_k} corresponds to the number of parameters to be estimated in model {\mathcal M_k}. In irregular models the dimension {d_k} typically does not provide a good measure of complexity for model {\mathcal M_k}, at least in the sense that it does not lead to an approximation of

\displaystyle \log m(\mathbf Y_n |\mathcal M_k) = \log \left( \int_{\mathcal M_k} p(\mathbf Y_n| \pi_k, \mathcal M_k) dP(\pi_k|k )\right) \,.
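For contrast, here is a minimal sketch of what happens in a regular model, where the BIC does track the log marginal likelihood up to a bounded term. The setting is an assumption chosen only because the marginal likelihood is available in closed form: observations y_i ~ N(theta, 1) with a conjugate N(0, tau2) prior on theta, so that d_k = 1. The function names and the default values of tau2, theta and the seed are illustrative and not taken from the paper under discussion.

```python
import math
import random


def log_marginal_normal_mean(y, tau2):
    """Exact log marginal likelihood for y_i ~ N(theta, 1), theta ~ N(0, tau2).

    Marginally, y ~ N(0, I + tau2 * 11'), whose determinant is 1 + n * tau2
    and whose quadratic form is sum(y^2) - tau2 * sum(y)^2 / (1 + n * tau2).
    """
    n = len(y)
    s1 = sum(y)
    s2 = sum(v * v for v in y)
    quad = s2 - tau2 * s1 * s1 / (1.0 + n * tau2)
    return (-0.5 * n * math.log(2 * math.pi)
            - 0.5 * math.log(1.0 + n * tau2)
            - 0.5 * quad)


def bic_normal_mean(y):
    """BIC in the post's convention: maximised log-likelihood - d * log(n) / 2."""
    n = len(y)
    ybar = sum(y) / n  # maximum likelihood estimate of theta
    rss = sum((v - ybar) ** 2 for v in y)
    max_loglik = -0.5 * n * math.log(2 * math.pi) - 0.5 * rss
    d = 1  # a single free parameter, the mean
    return max_loglik - d * math.log(n) / 2


def gap(n, tau2=1.0, theta=0.5, seed=0):
    """Difference between the exact log marginal likelihood and the BIC."""
    rng = random.Random(seed)
    y = [rng.gauss(theta, 1.0) for _ in range(n)]
    return log_marginal_normal_mean(y, tau2) - bic_normal_mean(y)


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n, gap(n))
```

In this toy model the gap stays bounded as n grows, while both quantities themselves are of order n. It is this behaviour that is lost in irregular models, where {d_k \log n /2} is no longer the right penalty.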

A way to understand the behaviour of {\log m(\mathbf Y_n |\mathcal M_k) } is through the effective dimension

\displaystyle \tilde d_k = -\lim_n \frac{ \log P( \{ KL(p(\mathbf Y_n| \pi_0, \mathcal M_k) , p(\mathbf Y_n| \pi_k, \mathcal M_k) ) \leq 1/n \} | k ) }{ \log n}

when it exists, see for instance the discussions in Chambaz and Rousseau (2008) and Rousseau (2007). Watanabe (2009) provided a more precise formula, which is the starting point of the approach of Drton and Plummer:

\displaystyle \log m(\mathbf Y_n |\mathcal M_k) = \log p(\mathbf Y_n| \hat \pi_k, \mathcal M_k) - \lambda_k(\pi_0) \log n + [m_k(\pi_0) - 1] \log \log n + O_p(1)
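To make the role of the two coefficients concrete, the leading terms of this expansion can be coded as a penalised log-likelihood and compared with the BIC. This is only a sketch: the function names are ours, the O_p(1) remainder is dropped, and the values of lambda and m used for the singular case are hypothetical, not derived for any specific model. It is also not the sBIC itself, which further deals with the fact that {\pi_0} is unknown. In a regular model the learning coefficient is {d_k/2} and the multiplicity is 1, in which case the expression reduces to the BIC.

```python
import math


def bic_penalised(max_loglik, d, n):
    """BIC in the post's convention: max log-likelihood - d * log(n) / 2."""
    return max_loglik - d * math.log(n) / 2


def watanabe_penalised(max_loglik, lam, mult, n):
    """Leading terms of Watanabe's expansion, with the O_p(1) term dropped.

    lam  : learning coefficient lambda_k(pi_0), assumed known here
    mult : multiplicity m_k(pi_0), assumed known here
    """
    return max_loglik - lam * math.log(n) + (mult - 1) * math.log(math.log(n))


if __name__ == "__main__":
    n, max_loglik, d = 1000, -1234.5, 4  # illustrative numbers only

    # Regular case: lambda = d / 2 and multiplicity 1 recover the BIC.
    print(bic_penalised(max_loglik, d, n))
    print(watanabe_penalised(max_loglik, d / 2, 1, n))

    # Hypothetical singular case: lambda < d / 2 gives a lighter penalty.
    print(watanabe_penalised(max_loglik, 1.0, 2, n))
```

Since the learning coefficient is at most {d_k/2}, the BIC over-penalises singular models relative to this expansion, which is what the hypothetical singular case above illustrates.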

where {\pi_0} is the true parameter. The authors propose a clever algorithm to approximate the marginal likelihood. Given the popularity of the BIC criterion for model choice, obtaining a relevant penalized likelihood when the models are singular is an important issue and we congratulate the authors for it. Indeed a major advantage of the BIC formula is that it is an off-the-shelf criterion which is implemented in many software packages, and thus can be used easily by non-statisticians. In the context of singular models, a more refined approach needs to be considered and, although the algorithm proposed by the authors remains quite simple, it requires that the functions {\lambda_k(\pi)} and {m_k(\pi)} be known in advance, which so far limits the number of problems that can be thus processed. In this regard their equation (3.2) is both puzzling and attractive. Attractive because it invokes nonparametric principles to estimate the underlying distribution; puzzling because why should we engage in deriving an approximation like (3.1) and call for Bayesian principles when (3.1) is at best an approximation. In this case why not just use a true marginal likelihood?

1. Why do we want to use a BIC type formula?

The BIC formula can be viewed from a purely frequentist perspective, as an example of penalised likelihood. The difficulty then lies in choosing the penalty, and a common view on these approaches is to choose the smallest possible penalty that still leads to consistency of the model choice procedure, since it then enjoys better separation rates. In this case a {\log \log n} penalty is sufficient, as proved in Gassiat et al. (2013). Now whether or not this is a desirable property is entirely debatable, and one might advocate that for a given sample size, if the data fits the smallest model (almost) equally well, then this model should be chosen. But unless one specifies what equally well means, it does not add much to the debate. This also explains the popularity of the BIC formula (in regular models), since it approximates the marginal likelihood and thus benefits from the Bayesian justification of the measure of fit of a model for a given data set, often described as a Bayesian Ockham’s razor. But then why should we not compute instead the marginal likelihood? Typical answers to this question that are in favour of BIC-type formulae include: (1) BIC is supposedly easier to compute and (2) BIC does not call for a specification of the prior on the parameters within each model. Given that the latter is a difficult task and that the prior can be highly influential in non-regular models, this may sound like a good argument. However, it is only apparently so, since the only justification of BIC is purely asymptotic, namely, in such a regime the difficulties linked to the choice of the prior disappear. This is even more the case for the sBIC criterion, since it is only valid if the parameter space is compact. Then the impact of the prior becomes less of an issue as non-informative priors can typically be used.
With all due respect, the solution proposed by the authors, namely to use the posterior mean or the posterior mode to allow for non-compact parameter spaces, does not seem to make sense in this regard since they depend on the prior. The same comments apply to the authors’ discussion on Prior’s matter for sBIC. Indeed variations of the sBIC could be obtained by penalizing for bigger models via the prior on the weights, for instance as in Mengersen and Rousseau (2011), or by considering repulsive priors as in Petralia et al. (2012), but then it becomes more meaningful to (again) directly compute the marginal likelihood. There remains (as an argument in its favour) the relative computational ease of use of sBIC, when compared with the marginal likelihood. This simplification is however achieved at the expense of requiring a deeper knowledge of the behaviour of the models, and it therefore loses the off-the-shelf appeal of the BIC formula and the range of applications of the method, at least so far. Although the dependence of the approximation of {\log m(\mathbf Y_n |\mathcal M_k)} on {\mathcal M_j}, {j \leq k}, is strange, this does not seem crucial, since marginal likelihoods in themselves bring little information and they are only meaningful when compared to other marginal likelihoods. It becomes much more of an issue in the context of a large number of models.

2. Should we care so much about penalized or marginal likelihoods?

Marginal or penalized likelihoods are exploratory tools in a statistical analysis, as one is trying to define a reasonable model to fit the data. An unpleasant feature of these tools is that they provide numbers which in themselves do not have much meaning and can only be used in comparison with others and without any notion of uncertainty attached to them. A somewhat richer approach to exploratory analysis is to interrogate the posterior distributions by either varying the priors or by varying the loss functions. The former has been proposed in van Havre et al. (2016) in mixture models using the prior tempering algorithm. The latter has been used for instance by Yau and Holmes (2013) for segmentation based on Hidden Markov models. Introducing a decision-analytic perspective in the construction of information criteria sounds to us like a reasonable requirement, especially when accounting for the current surge in studies of such aspects.

[Posted as arXiv:1610.02503]