commentaries in financial econometrics

My comment(arie)s on the moment approach to Bayesian inference by Ron Gallant have appeared, along with other comment(arie)s:

Invited Article
Reflections on the Probability Space Induced by Moment Conditions with Implications for Bayesian Inference
A. Ronald Gallant, p. 229
Commentaries
Dante Amengual and Enrique Sentana, p. 248
John Geweke, p. 253
Jae-Young Kim, p. 258
Oliver Linton and Ruochen Wu, p. 261
Christian P. Robert, p. 265
Christopher A. Sims, p. 272
Wei Wei and Asger Lunde, p. 278
Author Response
A. Ronald Gallant, p. 284

[Image: formula (4) in Gallant's paper]

While commenting on commentaries is formally bound to induce an infinite loop [or l∞p], I remain puzzled by the main point of the paper, which is that setting a structural distribution on a moment function Z(x,θ) plus a prior p(θ) induces a distribution on the pair (x,θ) in a possibly weaker σ-algebra. (The two distributions may actually be incompatible.) Handling this framework requires checking that a posterior exists, which sounds rather unnatural (even though we also have to check properness of the posterior). And the meaning of such a posterior remains unclear, as for instance in the assertion that (4) above is a likelihood, when it does not define a density in x but in the object inside the exponential.

“…it is typically difficult to determine whether there exists a p(x|θ) such that the implied distribution of m(x,θ) is the one stated, and if not, what damage is done thereby” J. Geweke (p.254)

The first discussion points out a potential link with the Bayesian and empirical likelihood approaches I discussed last week. John Geweke rightly points out the above difficulty of compatibility between the two distributions, while wondering at the motivation for setting a distribution on a moment m(x,θ) as opposed to a likelihood. He also recalls the dangerous features of the harmonic mean estimator, as I did when discussing Gallant in Paris two years ago. (Geweke proposes what amounts to path sampling in his comments.) Note that the final commentary, by Wei and Lunde, does resort to the harmonic mean as well! And it does not seem to mind that (4) is not a proper likelihood. Which means it is likely that the harmonic mean estimate has infinite variance. And that the resulting inference has no Bayesian justification.
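To make the infinite-variance worry concrete, here is a minimal sketch on a toy conjugate model of my own choosing, not the setting of Gallant's paper or of Wei and Lunde's commentary: a single observation x ~ N(θ,1) with prior θ ~ N(0,τ²). The marginal likelihood is then available in closed form, and the inverse likelihood has infinite posterior variance as soon as τ² ≥ 1. All function names below are hypothetical.

```python
import math
import random

def likelihood(theta, x):
    """N(x | theta, 1) density, viewed as a function of theta."""
    return math.exp(-0.5 * (x - theta) ** 2) / math.sqrt(2.0 * math.pi)

def exact_marginal(x, tau2):
    """Marginal density of x when theta ~ N(0, tau2), i.e. x ~ N(0, 1 + tau2)."""
    var = 1.0 + tau2
    return math.exp(-0.5 * x * x / var) / math.sqrt(2.0 * math.pi * var)

def harmonic_mean_estimate(x, tau2, n_draws, rng):
    """Harmonic mean estimate of the marginal, using exact posterior draws."""
    post_mean = x * tau2 / (1.0 + tau2)
    post_sd = math.sqrt(tau2 / (1.0 + tau2))
    inv_lik_sum = 0.0
    for _ in range(n_draws):
        theta = rng.gauss(post_mean, post_sd)
        inv_lik_sum += 1.0 / likelihood(theta, x)
    return n_draws / inv_lik_sum

rng = random.Random(0)
x_obs, tau2 = 1.0, 4.0  # tau2 >= 1: 1/likelihood has infinite posterior variance
truth = exact_marginal(x_obs, tau2)
estimates = [harmonic_mean_estimate(x_obs, tau2, 10_000, rng) for _ in range(20)]
print(f"exact marginal: {truth:.4f}")
print(f"harmonic mean estimates: min {min(estimates):.4f}, max {max(estimates):.4f}")
```

Comparing the spread of the twenty replications with the exact value gives a feel for how unreliable the estimator can be, even in a model where everything else is known exactly.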

“The paper’s argument also fails to justify its recommended procedure as Bayesian, in my view, because it does not show the existence of a pre-sample joint pdf for data and parameters that is consistent with the procedure it proposes.” C. Sims (p.272)

In his discussion, Chris Sims makes another point with which I completely agree, namely that defining a single distribution on a moment m(x,θ) does not turn the inference Bayesian. Even when adding a prior into the picture. Once more, defining a distribution on m(x,θ) does not imply a joint distribution on (x,θ). Sims illustrates this point with a Gamma assumption on x to the power θ>1, for which there is no available joint. Hence no conditional of x given θ. To continue on Sims’ “constructive comments”, I would like to point out [as I did not do in the written comments] the potential link between this GMM-based Bayesian approach and Holmes and Bissiri’s (2016) construction of Gibbs posteriors, where the likelihood is replaced with an empirical loss.

In the reply to the discussion, Ron Gallant rejects such criticisms. An answer I find completely unsatisfactory is that missing a Jacobian term in the likelihood is akin to changing the prior: this is formally correct but then why would the resulting prior be of any relevance? The Jacobian may be strongly dominating the original prior. When re-analysing Chris Sims’ Gamma example, Gallant starts (p.287) by turning the distribution on m(x,θ), namely x to the power θ, into a conditional distribution given θ, when it should be a marginal. This proves to be equivalent to assuming that θ and m(x,θ) are a priori independent. There is no reason to make this assumption.
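As a rough sketch of the Jacobian issue, suppose (purely for illustration, and adopting the conditional reading criticised above) that x>0 is scalar and that, given θ, m = x^θ has the stated Gamma density g. The change of variable from m to x then gives the following, where g and p(θ) are the only ingredients and everything else is my own notation:

```latex
% Assumptions: x > 0 scalar, theta > 1, g is the stated (Gamma) density of m = x^theta given theta.
\begin{align*}
p(x \mid \theta) &= g\!\left(x^{\theta}\right)\,\left|\frac{\partial x^{\theta}}{\partial x}\right|
                  = g\!\left(x^{\theta}\right)\,\theta\,x^{\theta-1}, \\
g\!\left(x^{\theta}\right)\,p(\theta) &= p(x \mid \theta)\,\frac{p(\theta)}{\theta\,x^{\theta-1}} .
\end{align*}
```

Under these assumptions, using g(x^θ)p(θ) without the Jacobian amounts to replacing p(θ) with p(θ)/(θx^{θ-1}), a "prior" that depends on the observation x and whose behaviour in θ can be driven by the Jacobian factor rather than by p(θ).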

“MCMC implicitly assumes that the dominating measure on (XxΘ,C,P) is Lebesgue.” (p.287)

Then comes the fairly surprising assertion above, which blames MCMC for the difficulties with the definition of a density on m(x,θ). An MCMC algorithm does not make measure-theoretic assumptions: it uses as its input a given target distribution, which means a density that is defined with respect to a specified dominating measure, and from there picks a proposal defined with respect to the very same dominating measure to happily hop here and there and eventually towards the target distribution. The only constraint is to use the same dominating measure, not to use Lebesgue. This switch in the discussion is hence definitely puzzling… In any case, the amount of discussion [and ‘Og’s entries] the paper generated can be seen as justifying by itself its publication in the Journal of Financial Econometrics! That we end up having divergent opinions on the problem and the maths behind it stands to reason.
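As a small illustration of the point about dominating measures, here is a sketch of a random walk Metropolis sampler on the integers, where both the target (a Poisson distribution, my own choice of example) and the proposal are densities with respect to counting measure and Lebesgue measure never enters. The function names and the value of `lam` are hypothetical.

```python
import math
import random

def log_poisson_pmf(k, lam):
    """Log density of Poisson(lam) w.r.t. counting measure on {0, 1, 2, ...}."""
    if k < 0:
        return -math.inf
    return k * math.log(lam) - lam - math.lgamma(k + 1)

def metropolis_on_integers(lam, n_iter, rng, start=0):
    """Random walk Metropolis with a symmetric +/-1 proposal on the integers."""
    chain = []
    current = start
    current_lp = log_poisson_pmf(current, lam)
    for _ in range(n_iter):
        proposal = current + rng.choice((-1, 1))
        proposal_lp = log_poisson_pmf(proposal, lam)
        # Symmetric proposal, so the acceptance probability is a ratio of
        # target densities, both taken w.r.t. the same counting measure.
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        chain.append(current)
    return chain

chain = metropolis_on_integers(lam=3.0, n_iter=50_000, rng=random.Random(1))
print(f"chain mean: {sum(chain) / len(chain):.3f} (target mean is 3)")
```

The acceptance step only ever compares target and proposal densities expressed against one common measure, which is all the algorithm needs.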
