Archive for JASA

reading classics (#7)

Posted in Statistics on January 28, 2013 by xi'an

Last Monday, my student Li Chenlu presented the foundational 1962 JASA paper by Allan Birnbaum, On the Foundations of Statistical Inference. The very paper that derives the Likelihood Principle from the combined Conditionality and Sufficiency principles and that has been discussed [maybe ad nauseam] on this ‘Og!!! Alas, thrice alas!, I was still stuck in the plane flying back from Atlanta as she was presenting her understanding of the paper, as the flight had been delayed four hours thanks to (or rather woe to!) the weather conditions in Paris the day before (chain reaction…).

I am sorry I could not attend this lecture, and this for many reasons: first and foremost, I wanted to attend every talk from my students, both out of respect for them and to draw a comparison between their performances. My PhD student Sofia ran the seminar that day in my stead, for which I am quite grateful, but I do wish I had been there… Second, this a.s. has been the most philosophical paper in the series, and I would have appreciated shedding the proper light on the reasons for and the consequences of this paper, as Li Chenlu stuck very much to the paper itself. (She provided additional references in the conclusion but they did not seem to impact the slides.) Discussing for instance Berger and Wolpert’s (1988) new lights on the topic, as well as Deborah Mayo’s (2010) attacks, and even Chang’s (2012) misunderstandings, would have clearly helped the students.

reading classics (#6)

Posted in Statistics on December 21, 2012 by xi'an

Today my student Xiaolin Cheng presented the mythical 1990 JASA paper of Alan Gelfand and Adrian Smith, Sampling-based approaches to calculating marginal densities. The very one that started the MCMC revolution of the 1990s! Re-reading it through his eyes was quite enlightening, even though he stuck quite closely to the paper. (To the point of not running his own simulation, nor even reporting Gelfand and Smith’s, as shown by the slides below. This would have helped, I think…)

Indeed, those slides focus very much on the idea that such substitution samplers can provide parametric approximations to the marginal densities of the components of the simulated parameters. To the point of resorting to importance sampling as an alternative to the standard Rao-Blackwell estimate, a solution that did not survive long. (We briefly discussed this point during the seminar, as the importance function was itself based on a Rao-Blackwell estimate, with possible tail issues. Gelfand and Smith actually conclude on the higher efficiency of the Gibbs sampler.) Maybe not so surprisingly, the approximation of the “other” marginal, namely the marginal likelihood, is not addressed, as it is much more involved (and would lead to the introduction of the infamous harmonic mean estimator a few years later! And to Chib’s (1995) estimator, which is very close in spirit to the Gibbs sampler). While Xiaolin never mentioned Markov chains in his talk, Gelfand and Smith only report that Gibbs sampling is a Markovian scheme, and refer to both Geman and Geman (1984) and Tanner and Wong (1987) for convergence issues, rather than directly invoking Markov arguments as in Tierney (1994) and others. A fact that I find quite interesting, a posteriori, as it highlights the strong impact Meyn and Tweedie would have, three years later.
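
Since no simulation was shown during the talk, here is a minimal Python sketch of what the Rao-Blackwell estimate of a marginal density looks like on Gibbs output. It is not taken from Gelfand and Smith: the toy target (a standard bivariate normal with correlation rho), the function names, the burn-in length and the seed are all assumptions made for illustration. The marginal density of X at a point x0 is estimated by averaging the full conditional densities f(x0 | y_t) over the simulated y_t.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def gibbs_bivariate_normal(rho, n_iter, rng, burn_in=500):
    """Two-stage Gibbs sampler for a standard bivariate normal with correlation rho.

    Toy target (an assumption of this sketch): X | Y=y ~ N(rho*y, 1-rho^2)
    and Y | X=x ~ N(rho*x, 1-rho^2). Returns the retained (x, y) pairs.
    """
    sd = math.sqrt(1.0 - rho ** 2)
    x, y = 0.0, 0.0
    draws = []
    for t in range(burn_in + n_iter):
        x = rng.gauss(rho * y, sd)  # draw X from its full conditional
        y = rng.gauss(rho * x, sd)  # draw Y from its full conditional
        if t >= burn_in:
            draws.append((x, y))
    return draws


def rao_blackwell_marginal(x0, draws, rho):
    """Rao-Blackwell estimate of the marginal density of X at x0:
    the average of the conditional densities f(x0 | y_t) over the simulated y_t."""
    var = 1.0 - rho ** 2
    return sum(normal_pdf(x0, rho * y, var) for _, y in draws) / len(draws)


rng = random.Random(2012)
draws = gibbs_bivariate_normal(rho=0.5, n_iter=5000, rng=rng)
estimate = rao_blackwell_marginal(0.0, draws, rho=0.5)
exact = normal_pdf(0.0, 0.0, 1.0)  # the true marginal of X is N(0, 1)
print(f"Rao-Blackwell estimate at 0: {estimate:.4f} (exact value {exact:.4f})")
```

In this toy case the exact marginal is known, so the quality of the approximation can be checked directly, which is obviously not the situation the method is meant for.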

Pitman closeness renewal?

Posted in Statistics, University life on July 26, 2012 by xi'an

As noticed there a few months ago, the Pitman closeness criterion for comparing estimators (through the probability

P_θ(|δ-θ| < |δ’-θ|)

which should be larger than .5 for the first estimator to be deemed “better” or “Pitman closer”) has been “resuscitated” by Canadian researchers. In 1993, I wrote a JASA (discussion) paper along with Gene Hwang and Bill Strawderman pointing out the many inconsistencies of this criterion as a decision tool.  It was entitled “Is Pitman Closeness a Reasonable Criterion?” (The answer was in the question, right?!)
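
For concreteness, the probability defining Pitman closeness can be approximated by simulation. The Python sketch below is only an illustration and does not reproduce any result from the papers mentioned here: the normal sampling model, the sample size, the pair of estimators (sample median against the third raw observation of the same sample) and all function names are assumptions of mine. Both estimators are computed on the same simulated sample, hence they are dependent.

```python
import random
import statistics


def pitman_closeness(delta, delta_prime, theta, sampler, n_rep, rng):
    """Monte Carlo estimate of P_theta(|delta - theta| < |delta' - theta|).

    Both estimators are evaluated on the same simulated sample, so they are
    allowed to be dependent.
    """
    wins = 0
    for _ in range(n_rep):
        sample = sampler(theta, rng)
        if abs(delta(sample) - theta) < abs(delta_prime(sample) - theta):
            wins += 1
    return wins / n_rep


def normal_sample(theta, rng, n=11):
    """Hypothetical sampling model: n i.i.d. N(theta, 1) observations."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def third_observation(sample):
    """Fixed-index observation X_3 (index 2 with zero-based lists)."""
    return sample[2]


rng = random.Random(1993)
p_hat = pitman_closeness(statistics.median, third_observation,
                         theta=0.0, sampler=normal_sample, n_rep=5000, rng=rng)
print(f"estimated P(|median - theta| < |X_3 - theta|) = {p_hat:.3f}")
```

An estimate above .5 means the median is deemed “Pitman closer” than X_3 in this particular setting, which says nothing about the decisional issues raised in the 1993 paper.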

In an arXiv posting today, Jozani, Balakrishnan, and Davies propose new characterisations for comparing (in this sense) symmetrically distributed estimators. There is nothing wrong with this mathematical exercise, obviously. However, the approach still seems to suffer from the same decisional inconsistencies as in the past:

  1. the results in the paper (see, e.g., Lemmas 1 and 2) only apply to independent estimators, which is rather unrealistic (to the point of having the authors apply them to dependent estimators, the sample median X_[n/2] versus a fixed-index observation, e.g. X_3, and again at the end of the paper in the comparison of several order statistics). Having independent estimators to compare is a rather rare situation, as one tries to make the most of a given sample;
  2. the setup is highly dependent on considering a single (one-dimensional) location parameter, and the results do not apply to more general settings (except location-scale cases with scale parameters known to some extent, see Lemma 5);
  3. some results (see Remark 4) allow one to find a whole range of estimators dominating a given (again independent) estimator δ’, but they do not give a ranking of those estimators, except in the weak sense of having the above probability maximal for one of the estimators δ (Lemma 9). This is due to the independence constraint on the comparison. There is therefore no possibility (in this setting) of obtaining an estimator that is the “Pitman closest estimator of θ”, as claimed by the authors in the final section of their paper.

Once again, I have nothing against these derivations, which are mostly correct, but I simply argue here that they cannot constitute a competitor to standard decision theory.

computational difficulties [with notations]

Posted in R, Statistics, University life on August 25, 2011 by xi'an

Here is an email I received from Umberto:

I have a doubt regarding the tempered transitions method you considered in your JASA article with Celeux and Hurn.

On page 961 you detail the several steps for building a proposal for a given distribution by simulating through l tempered power densities. I am slightly confused regarding the interpretation of your MCMC(x,π) notation.

For example does MCMC(y_l,\pi^{1/\beta_{l-1}}) means that an MCMC procedure starting at y_l, say Metropolis-Hastings, is used to generate a single proposal y_{l+1} for \pi^{1/\beta_{l-1}} ?

If this is the case, then y_{l+1} might be rejected or accepted and in the former case I would have y_{l+1}=y_l right? In other words I am not required to simulate proposals using MCMC(y_l,\pi^{1/\beta_{l-1}}) until I finally accept y_{l+1}.

By reading the last paragraph in page 962 it seems to me that, indeed, the y_1,…,y_{2l-1} thus generated are not necessarily accepted proposals for the corresponding power densities.

In retrospect, I still like this MCMC(x,π) notation in the simulated tempering “up-and-down” scheme (and the paper!), because it is generic, in the sense of an R function that would take the function MCMC(x,π) as its input. To clarify the notation in this light, MCMC(x,π) returns a value that is the outcome of the corresponding MCMC step. This value may be equal to x (MCMC rejection) or to another value (MCMC acceptance). So the sequence y_1,…,y_{2l-1} is made of consecutive values that differ and of consecutive values that do not (it is even possible that all the terms in the sequence are equal). At the end of this “up-and-down” tempering, the value y_{2l-1} may be the next value of the Markov chain targeted at the original target π. Or the current value may be replicated. This depends on the overall acceptance probability (4) on page 961. (Following Neal, 1996, Statistics and Computing.) This is a very compelling idea, whose mileage may vary depending on the number of required steps and powers.
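
To make the generic nature of the notation concrete, here is a hedged Python sketch where the MCMC step is passed as a function argument and returns either a new value or its input. It is not a transcription of the algorithm on page 961: the random-walk Metropolis step, the parametrisation by a decreasing list of powers, the indexing of the path (2n intermediate values for n tempered levels) and the bimodal toy target are all assumptions of mine, and the global acceptance ratio is the product form of Neal (1996) as I recall it, not checked against formula (4) of the paper.

```python
import math
import random


def mcmc(x, log_density, rng, scale=1.0):
    """One random-walk Metropolis step targeting exp(log_density).

    Returns the proposal when it is accepted and x itself when it is rejected,
    so the output is always a valid "next value".
    """
    proposal = x + rng.gauss(0.0, scale)
    log_u = math.log(1.0 - rng.random())  # 1 - U lies in (0, 1], avoiding log(0)
    if log_u < log_density(proposal) - log_density(x):
        return proposal
    return x


def tempered_transition(x, log_pi, powers, rng, step=mcmc):
    """One "up-and-down" tempered move from x, in the spirit of Neal (1996).

    powers = [1, b_1, ..., b_n] is a decreasing list of exponents in (0, 1]
    (this parametrisation is an assumption of the sketch). Returns the next
    state of the chain, the whole intermediate path, and an acceptance flag.
    """
    def tempered(b):
        return lambda z: b * log_pi(z)

    n = len(powers) - 1
    y, path, log_ratio = x, [x], 0.0
    # Heating: move through flatter and flatter versions of the target.
    for i in range(1, n + 1):
        log_ratio += (powers[i] - powers[i - 1]) * log_pi(y)
        y = step(y, tempered(powers[i]), rng)
        path.append(y)
    # Cooling: come back towards the original target.
    for i in range(n, 0, -1):
        y = step(y, tempered(powers[i]), rng)
        log_ratio += (powers[i - 1] - powers[i]) * log_pi(y)
        path.append(y)
    # Global accept/reject: keep the end of the path or replicate x.
    accepted = math.log(1.0 - rng.random()) < log_ratio
    return (y if accepted else x), path, accepted


def log_mixture(z):
    """Hypothetical bimodal target: equal mixture of N(-4, 1) and N(4, 1),
    evaluated on the log scale (up to a constant) with a log-sum-exp trick."""
    a = -0.5 * (z + 4.0) ** 2
    b = -0.5 * (z - 4.0) ** 2
    m = max(a, b)
    return m + math.log(0.5 * math.exp(a - m) + 0.5 * math.exp(b - m))


rng = random.Random(1996)
state, path, accepted = tempered_transition(-4.0, log_mixture, [1.0, 0.5, 0.25, 0.1], rng)
print(state, accepted, len(path))
```

Any other function with the same signature can be plugged in as the step argument, and consecutive entries of the returned path are equal whenever the corresponding inner step rejects.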

Do we need… apparently not!

Posted in Books, Statistics, University life on March 31, 2011 by xi'an

We had sent our discussion paper of Murray Aitkin’s Statistical Inference, with Andrew Gelman and Judith Rousseau, to the review section of JASA, but were again unsuccessful as the paper was sent back with the comments that “this paper is not a good fit for JASA Reviews. You may wish to consider broadening your discussion so that the paper reads less as an attack on Aitkin’s book”. While I understand that journals cannot publish all critical accounts of all statistics books, I feel a bit depressed by my overall lack of success in publishing extended book reviews. Electronic journals could easily include book discussions and I do not think this would negatively impact the readership as book reviews are generally appreciated by the community.
