Archive for Computational Statistics and Data Analysis

alternatives to EM

Posted in Books, Statistics on January 30, 2019 by xi'an

In an arXived preprint submitted to Computational Statistics & Data Analysis, Chan, Han, and Lim study alternatives to EM for latent class models, that is, mixtures of products of Multinomials. (First occurrence of an indicator function being called the “Iverson bracket function”!) The introduction is fairly extensive given that this is a most studied model. The criticisms of EM levelled by the authors are that (a) it does not produce an evaluation of the estimation error, which does not sound correct; (b) the convergence is slow, which is also rather misleading as my [low dimensional] experience with mixtures is that it gets very quickly, and apparently linearly, to the vicinity of one of the modes. The argument in favour of alternative non-linear optimisation approaches is that they can achieve quadratic convergence. One solution is a projected Quasi-Newton method, based on a quadratic approximation to the target, with some additional intricacies that make the claim of being “way easier than EM algorithm” somewhat specious. The second approach proposed in the paper is sequential quadratic programming, which incorporates the Lagrange multiplier in the target. While the different simulations in the paper show that EM may indeed call for a much larger number of iterations, the obtained likelihoods are all comparable.
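For readers who want to see what EM amounts to for a latent class model, here is a minimal sketch in plain Python. It is not the code of Chan, Han, and Lim, and every name in it (`simulate`, `em_step`, `fit_em`, the toy parameter values, the stopping tolerance) is an assumption of mine for illustration. Each observation is a tuple of categorical items, the E-step computes the posterior class probabilities, and the M-step re-estimates the class weights and the per-item Multinomial probabilities in closed form.

```python
import math
import random


def simulate(n, weights, probs, seed=1):
    """Draw n observations from a mixture of products of Multinomials (toy data)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        k = rng.choices(range(len(weights)), weights=weights)[0]
        data.append(tuple(rng.choices(range(len(p)), weights=p)[0] for p in probs[k]))
    return data


def class_joint(x, weights, probs):
    """Joint probability of observation x and each latent class."""
    return [w * math.prod(p[j][c] for j, c in enumerate(x))
            for w, p in zip(weights, probs)]


def log_likelihood(data, weights, probs):
    """Observed-data log-likelihood."""
    return sum(math.log(sum(class_joint(x, weights, probs))) for x in data)


def em_step(data, weights, probs, n_cats):
    """One EM iteration: E-step (responsibilities), then closed-form M-step."""
    n, n_items = len(data), len(data[0])
    resp = []
    for x in data:
        joint = class_joint(x, weights, probs)
        total = sum(joint)
        resp.append([v / total for v in joint])
    new_weights, new_probs = [], []
    for k in range(len(weights)):
        mass = sum(r[k] for r in resp)  # expected number of observations in class k
        new_weights.append(mass / n)
        new_probs.append([
            [sum(r[k] for r, x in zip(resp, data) if x[j] == c) / mass
             for c in range(n_cats)]
            for j in range(n_items)
        ])
    return new_weights, new_probs


def fit_em(data, n_classes, n_cats, n_iter=200, tol=1e-8, seed=0):
    """Run EM from a random start; return estimates and the log-likelihood trace."""
    rng = random.Random(seed)
    weights = [1.0 / n_classes] * n_classes
    probs = []
    for _ in range(n_classes):
        rows = []
        for _ in range(len(data[0])):
            raw = [rng.uniform(0.5, 1.5) for _ in range(n_cats)]
            rows.append([v / sum(raw) for v in raw])
        probs.append(rows)
    trace = [log_likelihood(data, weights, probs)]
    for _ in range(n_iter):
        weights, probs = em_step(data, weights, probs, n_cats)
        trace.append(log_likelihood(data, weights, probs))
        if trace[-1] - trace[-2] < tol:  # assumed stopping rule
            break
    return weights, probs, trace


# Toy example: two latent classes, four binary items (values are made up).
true_weights = [0.4, 0.6]
true_probs = [[[0.9, 0.1]] * 4, [[0.2, 0.8]] * 4]
data = simulate(500, true_weights, true_probs)
weights, probs, trace = fit_em(data, n_classes=2, n_cats=2)
print(len(trace) - 1, "iterations, final log-likelihood", trace[-1])
```

The log-likelihood trace returned by `fit_em` is the natural thing to plot when comparing iteration counts against another optimiser, since EM guarantees it never decreases from one iteration to the next.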

SDSS with friends

Posted in Statistics on May 4, 2018 by xi'an

When browsing the April issue of Amstat News over lunch, I came upon this page advertising rather loudly the SDSS symposium of next month. And realised that not only does it feature “perhaps the most prominent statistician to have repeatedly published material written by others without attribution” (a quote from Gelman and Basbøll, 2013, in American Scientist), namely Ed Wegman, as the guest of honor, but also one co-author of a retracted Computational Statistics paper [still included in Wegman’s list of publications] as program chair and another co-author from the “Hockey Stick” plagiarised report as plenary speaker. A fairly friendly reunion, then, if “networking” is to be understood this way, except that this is a major conference, supported by ASA and other organisations. Rather shocking, isn’t it?! (The entry also made me realise that the three co-authors were the original editors of WIREs, before Wegman and Said withdrew in 2012.)

Bayesian computing, methods and applications (and Elsevier)

Posted in Books, Statistics, University life on April 13, 2012 by xi'an

I received an email this weekend calling for submissions to a special issue of Computational Statistics and Data Analysis on the topic Bayesian computing, methods and applications, edited by Cathy Chen, David Dunson, Sylvia Frühwirth-Schnatter, and Stephen Walker. The theme is:

The last two decades have seen an explosion in the popularity and use of Bayesian methods, largely as a result of the advances in sampling based approaches to inference. At the same time, important advances and developments in methodology have coincided with highly sophisticated breakthroughs in computational techniques. Consequently, practitioners are increasingly turning to Bayesian methods so as to effectively tackle more complex and realistic models and problems, particularly as richer sources of data continue to become available. The primary aim of the issue is to illustrate and showcase recent advances in Bayesian computation and inferential methods, as well as highlight their application to empirical problems in a broad range of areas, including econometrics, biology, finance and medicine, amongst many others. Methodological contributions that highlight recent developments in Bayesian computing are strongly encouraged.

The papers should have a computational or advanced data analytic component in order to be considered for publication. Authors who are uncertain about the suitability of their papers should contact the special issue editors. All submissions must contain original unpublished work not being considered for publication elsewhere. Submissions will be refereed according to standard procedures for Computational Statistics and Data Analysis. The deadline for submissions is 30 June 2012.

Unfortunately, this journal is published by Elsevier, the costly, much too costly, publisher [Computational Statistics and Data Analysis costs for instance 2,763 euros per year for institutions and libraries!, Journal of Multivariate Analysis is 2,704 euros…] and, since I am completely in agreement with the position, I signed the Cost of Knowledge pledge a few weeks ago [although I do not yet appear on the list], meaning I now abstain from supporting the extremely unbalanced business model of Elsevier through publishing, reviewing, or (a)editing in one of the journals it publishes. (Which means I refuse refereeing on this sole ground about once a week now.) Even though Elsevier published a letter to mathematicians a few weeks ago, I doubt they can modify their business model so drastically as to get down to average prices for their journals. Unless the pressure from the community is so committed and shared that the flow of submissions dries up, which I doubt will occur on a short time-scale. (The impact of a reduced submission pool on citation indices and impact factors is on another scale…) Jean-Michel pointed me to this arXiv report [to appear in Notices of the American Mathematical Society] by Douglas Arnold and Henry Cohn on the Cost of Knowledge boycott (analysed by mathematicians, not statisticians). It seems that the boycott has not impacted the statisticians’ community as much, judging from the number of signatures registered so far.

As a coincidence, I read today in the Guardian that the Wellcome Trust is throwing its weight behind open access publishing, threatening to withdraw funding from researchers who do not “ensure that results are freely available to the public within six months of first publication”. Elsevier’s statement is not encouraging, though: “we will also remain committed to the subscription model. We want to be able to offer our customers choice, and we see that, in addition to new models the subscription model remains very much in demand.” (I do not see the connection between high subscription rates and choice, nor a proof that anyone demands high subscription rates!) Another coincidence is that I also got an email about The Open Statistics & Probability Journal, which is a free, peer reviewed, on-line journal. Which shows that some companies have found a way to manage a business model that is compatible with open access, if not a good solution in my opinion: just charge the authors $400 per published paper…