Archive for Electronic Journal of Statistics

noise contrastive estimation

Posted in Statistics on July 15, 2019 by xi'an

As I was attending Lionel Riou-Durand’s PhD thesis defence in ENSAE-CREST last week, I had a look at his papers (!). The 2018 noise contrastive paper, written with Nicolas Chopin (both authors share the CREST affiliation with me), examines Charlie Geyer’s 1994 way of bypassing the intractable normalising constant problem by virtue of an artificial logit model with additional simulated data from another distribution ψ.

“Geyer (1994) established the asymptotic properties of the MC-MLE estimates under general conditions; in particular that the x’s are realisations of an ergodic process. This is remarkable, given that most of the theory on M-estimation (i.e. estimation obtained by maximising functions) is restricted to iid data.”

Michael Gutmann and Aapo Hyvärinen also use additional simulated data, within the likelihood of a logistic classifier, an approach called noise contrastive estimation. Both methods replace the unknown ratio of normalising constants with an unbiased estimate based on the additional simulated data. The major and impressive result in this paper [now published in the Electronic Journal of Statistics] is that the noise contrastive estimation approach always enjoys a smaller variance than Geyer’s solution, at an equivalent computational cost, when the actual data observations are iid and the artificial data simulations ergodic. The difference between both estimators is however negligible against the Monte Carlo error (Theorem 2).
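
To make the logistic classifier idea concrete, here is a minimal sketch of noise contrastive estimation on a toy problem, not taken from the paper. Everything in it is an assumption made for illustration: the model is a centred Gaussian with unknown precision theta, the noise distribution ψ is a standard normal, the names `log_noise` and `nce_fit` are hypothetical, and the fit uses plain Newton iterations. The log normalising constant is handled as an extra parameter nu and estimated along with theta.

```python
import math
import random


def log_noise(x):
    # Log-density of the noise distribution psi, assumed here to be N(0, 1).
    return -0.5 * x * x - 0.5 * math.log(2.0 * math.pi)


def nce_fit(data, noise, n_iter=25):
    """Newton iterations on the NCE (logistic) log-likelihood.

    Model: log p(x) = -theta * x**2 / 2 + nu, where nu stands in for the
    unknown log normalising constant and is estimated like any parameter.
    """
    # Start from the N(0, 1) density, i.e. from the noise distribution itself.
    theta, nu = 1.0, -0.5 * math.log(2.0 * math.pi)
    offset = math.log(len(data) / len(noise))
    for _ in range(n_iter):
        g_t = g_n = h_tt = h_tn = h_nn = 0.0
        for sample, label in ((data, 1.0), (noise, 0.0)):
            for x in sample:
                f = -0.5 * x * x  # derivative of the logit with respect to theta
                logit = theta * f + nu - log_noise(x) + offset
                p = 1.0 / (1.0 + math.exp(-logit))  # P(x is a real observation)
                g_t += (label - p) * f
                g_n += label - p
                w = p * (1.0 - p)
                h_tt += w * f * f
                h_tn += w * f
                h_nn += w
        # Newton step: solve the 2x2 system by hand.
        det = h_tt * h_nn - h_tn * h_tn
        theta += (h_nn * g_t - h_tn * g_n) / det
        nu += (h_tt * g_n - h_tn * g_t) / det
    return theta, nu


random.seed(0)
true_theta = 2.0  # precision of the data-generating Gaussian (illustrative)
data = [random.gauss(0.0, true_theta ** -0.5) for _ in range(2000)]
noise = [random.gauss(0.0, 1.0) for _ in range(2000)]
theta_hat, nu_hat = nce_fit(data, noise)
# Estimates, followed by the exact log normalising constant for comparison.
print(theta_hat, nu_hat, -0.5 * math.log(2.0 * math.pi / true_theta))
```

In this toy case the exact constant is available, so the last printed value gives a check on nu; in a genuinely intractable model only the classifier output would be.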

This may be a rather naïve question, but I wonder at the choice of the alternative distribution ψ, with a vague notion that it could be optimised from a GAN perspective. A side result of interest in the paper is to provide a minimal (re)parameterisation of the truncated multivariate Gaussian distribution, if only as an exercise for future exams, the truncated multivariate Gaussian being a case where the normalising constant is of course unknown.

a resolution of the Jeffreys-Lindley paradox

Posted in Books, Statistics, University life on April 24, 2019 by xi'an

“…it is possible to have the best of both worlds. If one allows the significance level to decrease as the sample size gets larger (…) there will be a finite number of errors made with probability one. By allowing the critical values to diverge slowly, one may catch almost all the errors.” (p.1527)

When commenting on another post, Michael Naaman pointed out to me his 2016 Electronic Journal of Statistics paper, where he resolves the Jeffreys-Lindley paradox. The argument there is to consider a Type I error going to zero with the sample size n going to infinity, but slowly enough for both Type I and Type II errors to go to zero, which guarantees a finite number of errors as the sample size n grows to infinity. This translates for the Jeffreys-Lindley paradox into a pivotal quantity within the posterior probability of the null that converges to zero with n going to infinity, which makes it (most) agreeable with the Type I error going to zero. Except that there is little reason to assume this pivotal quantity goes to infinity with n, despite its distribution remaining constant in n. Being constant is less unrealistic, by comparison! That there exists a hypothetical sequence of observations such that the p-value and the posterior probability agree, even exactly, does not “solve” the paradox in my opinion.
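
For readers who want to see the paradox numerically, here is a small sketch in the textbook normal-mean setting, which is my own illustration and not Naaman's construction. It assumes a point null θ=0 against a N(0,τ²) prior under the alternative, equal prior weights, and a known σ; the function name `posterior_null` and the sqrt(2 log n) growth rate are arbitrary choices made for the example.

```python
import math


def posterior_null(z, n, tau=1.0, sigma=1.0, prior_null=0.5):
    """P(H0 | data) for H0: theta = 0 against theta ~ N(0, tau^2),
    given the z-statistic z = sqrt(n) * xbar / sigma."""
    ratio = n * tau ** 2 / sigma ** 2
    # Log Bayes factor in favour of the null:
    # B01 = sqrt(1 + ratio) * exp(-z^2 / 2 * ratio / (1 + ratio))
    log_b01 = 0.5 * math.log1p(ratio) - 0.5 * z * z * ratio / (1.0 + ratio)
    odds = prior_null / (1.0 - prior_null) * math.exp(log_b01)
    return odds / (1.0 + odds)


for n in (10, 1000, 10 ** 6):
    # z fixed at the 5% threshold versus z slowly diverging with n
    fixed = posterior_null(1.96, n)
    growing = posterior_null(math.sqrt(2.0 * math.log(n)), n)
    print(n, round(fixed, 3), round(growing, 3))
```

With z held at 1.96 the posterior probability of the null climbs towards one as n grows, while letting z diverge at this (assumed) rate sends it towards zero, which is the agreement between the two answers discussed above.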

Florid’AISTATS

Posted in pictures, R, Statistics, Travel, University life on August 31, 2016 by xi'an

The next AISTATS conference is taking place in Fort Lauderdale, Florida, on April 20-22. (The website keeps the same address one conference after another, which means all my links to the AISTATS 2016 conference in Cadiz are no longer valid. And that the above sunset from Florida is named… cadiz.jpg!) The deadline for paper submission is October 13 and there are two novel features:

  1. Fast-track for Electronic Journal of Statistics: Authors of a small number of accepted papers will be invited to submit an extended version for fast-track publication in a special issue of the Electronic Journal of Statistics (EJS) after the AISTATS decisions are out. Details on how to prepare such extended journal paper submission will be announced after the AISTATS decisions.
  2. Review-sharing with NIPS: Papers previously submitted to NIPS 2016 are required to declare their previous NIPS paper ID, and optionally supply a one-page letter of revision (similar to a revision letter to journal editors; anonymized) in supplemental materials. AISTATS reviewers will have access to the previous anonymous NIPS reviews. Other than this, all submissions will be treated equally.

I find both initiatives worth applauding and replicating in other machine-learning conferences. Particularly with regard to the recent debate we had at Annals of Statistics.

the end of Series B!

Posted in Books, pictures, Statistics, University life on May 25, 2016 by xi'an

I received this news from the RSS today that all the RSS journals are turning 100% electronic. No paper version any longer! I deeply regret this move on which, as an RSS member, I would have appreciated being consulted, as I find it much easier to browse through the current issue when it arrives in my mailbox, rather than being at best reminded by an email that I will most likely ignore and erase. And as I consider the production of the journals the prime goal of the Royal Statistical Society. And as I read that only 25% of the members had opted so far for the electronic format, which does not sound to me like a majority. In addition, moving to electronic-only journals does not bring the perks one would expect from electronic journals:

  • no bonuses like supplementary material, code, open or edited comments
  • no reduction in the subscription rate of the journals and penalty fees if one still wants a paper version, which amounts to a massive increase in the subscription price
  • no disengagement from the commercial publisher, whose role becomes even less relevant
  • no access to the issues of the years one has paid for, once one stops subscribing.

“The benefits of electronic publishing include: faster publishing speeds; increased content; instant access from a range of electronic devices; additional functionality; and of course, environmental sustainability.”

The move is sold with typical marketing noise. But I do not buy it: publishing speeds will remain the same as driven by the reviewing part, I do not see where the contents are increased, and I cannot seriously read a journal article from my phone, so this range of electronic devices remains a gadget. Not happy!

Savage-Dickey published

Posted in Statistics, University life on July 12, 2010 by xi'an

We got this email on Saturday about our Savage-Dickey resolution:

Your article “On resolving the Savage–Dickey paradox” was published in the Electronic Journal of Statistics 2010, Vol. 4, 643-654.
You may access electronic version of your paper in Euclid by DOI link http://dx.doi.org/10.1214/10-EJS564

No extreme wonder that it appeared that quickly (when considering it was written in November and submitted to EJS in February) since EJS is an electronic journal but nice nonetheless!

Savage-Dickey paper accepted

Posted in Statistics, University life on June 3, 2010 by xi'an

After our second (light) round of revision, the [rearXived] paper on the Savage-Dickey paradox was accepted by the Electronic Journal of Statistics. Great! This is actually my first paper in EJS. In fact, I managed to include a short comment inspired by Geoff Nicholls, following a conversation we had at CRiSM. Namely, the three expressions we recover for the Monte Carlo approximations to the Bayes factor can all be seen as different avatars of the bridge sampling family of estimators. Therefore, it could be possible to compare those approaches in terms of their asymptotic variance, or even to improve upon them…
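
As a reminder of what a bridge sampling estimator looks like, here is a generic sketch, unrelated to the three expressions in the paper. It estimates a ratio of normalising constants Z1/Z2 from samples of both distributions, using the geometric bridge (one member of the family); the two unnormalised Gaussian densities and the name `bridge_ratio` are assumptions picked so that the exact answer, 1/2, is known.

```python
import math
import random


def bridge_ratio(q1, q2, sample1, sample2):
    """Geometric-bridge estimate of Z1 / Z2, where q1 and q2 are
    unnormalised densities and sample_i is drawn from q_i / Z_i.

    Uses the identity Z1 / Z2 = E_2[sqrt(q1/q2)] / E_1[sqrt(q2/q1)].
    """
    num = sum(math.sqrt(q1(x) / q2(x)) for x in sample2) / len(sample2)
    den = sum(math.sqrt(q2(x) / q1(x)) for x in sample1) / len(sample1)
    return num / den


def q1(x):
    # Unnormalised N(0, 1) density, Z1 = sqrt(2 * pi)
    return math.exp(-0.5 * x * x)


def q2(x):
    # Unnormalised N(0, 4) density, Z2 = 2 * sqrt(2 * pi)
    return math.exp(-x * x / 8.0)


random.seed(1)
sample1 = [random.gauss(0.0, 1.0) for _ in range(20000)]
sample2 = [random.gauss(0.0, 2.0) for _ in range(20000)]
estimate = bridge_ratio(q1, q2, sample1, sample2)
print(estimate)  # to be compared with the exact ratio Z1 / Z2 = 0.5
```

Other choices of bridge function give other members of the family, with different variances, which is what makes a comparison of their asymptotic variances meaningful.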

Savage-Dickey revised

Posted in Statistics on May 29, 2010 by xi'an

We have at last re-submitted (and rearXived, but only to appear on Monday!) our paper on the Savage-Dickey paradox to the Electronic Journal of Statistics (after wasting a few weeks doing nothing!). The revision was quite easy to write, especially because the comments applied to an earlier version of the paper I had submitted by mistake and they requested examples that actually were in the latest version! The comments were in any case quite supportive, although some hinted at a partial misunderstanding of the nature of the “paradox”. I am afraid the measure-theoretic difficulty with this Savage-Dickey paradox will not vanish once the paper is published.