## Bayesian inference for partially identified models [book review]

“The crux of the situation is that we lack theoretical insight into even quite basic questions about what is going on. More particularly, we cannot say anything about the limiting posterior marginal distribution of α compared to the prior marginal distribution of α.” (p.142)

*Bayesian inference for partially identified models* is a recent CRC Press book by Paul Gustafson that I received with keen interest for a review in CHANCE! If only because the concept of unidentifiability has always puzzled me, and because I have never fully understood what I felt was a sort of joker card, namely that a Bayesian model was the easy solution to the problem since the prior compensates for the components of the parameter not identified by the data. As defended by Dennis Lindley, who held that “unidentifiability causes no real difficulties in the Bayesian approach”. However, after reading the book, I am less enthusiastic in that I do not feel it answers this type of question about non-identifiable models, and that it is exclusively centred on the [undoubtedly long-term and multifaceted] research of the author on the topic.

“Without Bayes, the feeling is that all the data can do is locate the identification region, without conveying any sense that some values in the region are more plausible than others.” (p.47)

Overall, the book is pleasant to read, with a light and witty style. The notational conventions, distinguishing θ from θ^{*} from θ^{†}, are somewhat unconventional but well explained. The chapters follow a similar format: a definition of the partially identified model, an exhibition of the transparent reparameterisation, the computation of the limiting posterior distribution [of the non-identified part], and a demonstration [which took me several iterations to read as the English exhibition rather than the French proof, pardon my French!]. Chapter titles suffer from an excess of the “further” denomination… The models themselves are mostly of one kind, namely binary observables and non-observables leading to partially observed multinomials with some non-identifiable probabilities, as in missing-at-random models (Chapter 3). In my opinion, it is only in the final chapters that the important questions are spelled out, though not always met with a definitive answer. In essence, I did not get from the book (i) a characterisation of the non-identifiable parts of a model, of the identifiability of unidentifiability, and of the universality of the transparent reparameterisation, (ii) a tool to assess the impact of a particular prior and possibly to set it aside, and (iii) a limit to the amount of unidentifiability that still allows for coherent inference. Hence, when closing the book, I still remain in the dark (or at least in the grey) on how to handle partially identified models. The author convincingly argues that there is no special advantage to using a misspecified but identifiable model over a partially identified model, for this imbues false confidence (p.162); however, we also need the toolbox to verify this is indeed the case.
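To fix ideas on what a transparent reparameterisation looks like, here is a toy sketch of my own (a hypothetical example, not one of the book’s actual models, though in the spirit of its binary-observable settings): a binary outcome Y observed only for respondents, where φ = (P(R=1), P(Y=1|R=1)) is identified while λ = P(Y=1|R=0) is never updated by the data and keeps its prior in the limit. The posterior standard deviation of the quantity of interest P(Y=1) then tends to a positive limit as the sample size grows, instead of shrinking like 1/√n. The “truth” values 0.7 and 0.4 and the function name `posterior_sd_of_p` are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def posterior_sd_of_p(n, draws=100_000):
    """Posterior sd of p = P(Y=1) in a toy nonignorable-missingness model.

    Transparent reparameterisation:
      phi = (P(R=1), P(Y=1|R=1))  -- identified by the data
      lam = P(Y=1|R=0)            -- not identified, keeps its prior
      p   = phi2*phi1 + lam*(1 - phi1)
    """
    # Simulate data from a hypothetical truth P(R=1)=0.7, P(Y=1|R=1)=0.4.
    r = rng.binomial(n, 0.7)            # number of respondents
    y = rng.binomial(r, 0.4)            # respondents with Y = 1
    # Conjugate Beta(1,1) posteriors for the identified part phi.
    phi1 = rng.beta(1 + r, 1 + n - r, draws)
    phi2 = rng.beta(1 + y, 1 + r - y, draws)
    # The non-identified lam simply retains its Uniform(0,1) prior.
    lam = rng.uniform(0, 1, draws)
    p = phi2 * phi1 + lam * (1 - phi1)
    return p.std()

for n in (100, 1_000, 10_000):
    print(n, round(posterior_sd_of_p(n), 3))
```

Running this, the posterior sd of p flattens out at roughly (1−φ₁) times the prior sd of λ rather than vanishing, which is exactly the “posterior variance tending to a positive limit” phenomenon the author emphasises in his closing points.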

“Given the data we can turn the Bayesian computational crank nonetheless and see what comes out.” (p.xix)

“It is this author’s contention that computation with partially identified models is a “bottleneck” issue.” (p.141)

*Bayesian inference for partially identified models* is particularly concerned with computational issues, and rightly so. It is however unclear to me (without more time to invest in investigating the topic) why the “use of general-purpose software is limited to the [original] parametrisation” (p.24) and why importance sampling would do better than MCMC on a general basis. I would definitely have liked more details on this aspect. There is a computational considerations section at the end of the book, but it remains too allusive for my taste. My naïve intuition would be that the lack of identifiability leads to a flatter posterior and hence to easier MCMC moves, but Paul Gustafson instead reports bad mixing from standard MCMC schemes (like WinBUGS).
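One possible resolution of this apparent paradox (my own speculation, illustrated on a made-up example rather than anything from the book) is that the posterior is flat only *along* a ridge: with data identifying a single combination q = 0.8θ + 0.2λ of two parameters, the posterior concentrates on a narrow ridge of width O(1/√n), and an isotropic random-walk Metropolis sampler in the original parameterisation keeps proposing moves off the ridge, hence mixes poorly along it. All names and numbers below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setting: x ~ Binomial(n, q) with q = 0.8*theta + 0.2*lam.
# Only q is identified, so the posterior of (theta, lam) concentrates
# on the narrow ridge 0.8*theta + 0.2*lam = x/n.
n, x = 2_000, 1_000                       # synthetic data, q_hat = 0.5

def log_post(theta, lam):
    """Log-posterior under flat priors on [0,1]^2 (up to a constant)."""
    q = 0.8 * theta + 0.2 * lam
    if not (0.0 < q < 1.0 and 0.0 <= theta <= 1.0 and 0.0 <= lam <= 1.0):
        return -np.inf
    return x * np.log(q) + (n - x) * np.log(1.0 - q)

# Random-walk Metropolis in the ORIGINAL (theta, lam) parameterisation.
theta, lam = 0.5, 0.5                     # start on the ridge
chain = np.empty(20_000)
accept = 0
for t in range(chain.size):
    th_p = theta + 0.05 * rng.standard_normal()
    la_p = lam + 0.05 * rng.standard_normal()
    if np.log(rng.uniform()) < log_post(th_p, la_p) - log_post(theta, lam):
        theta, lam = th_p, la_p
        accept += 1
    chain[t] = theta

acf1 = np.corrcoef(chain[:-1], chain[1:])[0, 1]
print(f"acceptance {accept / chain.size:.2f}, lag-1 autocorrelation {acf1:.2f}")
```

In the transparent parameterisation (q, λ) the same posterior factorises into an identified part and an untouched prior, and one can sample it directly, which may be why general-purpose samplers in the original parameterisation struggle while reparameterised (or importance-sampling) schemes do not.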

In conclusion, the book opens a new perspective on the relevance of partially identifiable models, trying to lift the stigma associated with them, and calls for further theory and methodology to deal with those. Here are the author’s final points (p.162):

- *“Identification is nuanced. Its absence does not preclude a parameter being well estimated, nor its presence guarantee a parameter can be well estimated.”*
- *“If we really took limitations of study designs and data quality seriously, then partially identifiable models would crop up all the time in a variety of scientific fields.”*
- *“Making modeling assumptions for the sole purpose of gaining full identification can be a mug’s game (…)”*
- *“If we accept partial identifiability, then consequently we need to regard sample size differently. There are profound implications of posterior variance tending to a positive limit as the sample size grows.”*

These points may be challenging enough to prompt one to read *Bayesian inference for partially identified models* and make up one’s mind about their possible relevance in statistical modelling.

*[Disclaimer about potential self-plagiarism: this post will also be published as a book review in my CHANCE column.]*

July 9, 2015 at 1:53 am

I’m enormously excited to read this book as I’ve been enjoying a number of his papers recently (Leo Held put me on to them at O’Bayes after I complained that there was essentially no interesting Bayesian theory for hierarchical models).

I’m not sure how fair your criticism is that he appears only to review and expand on his own work. I’m not sure there is much else out there. Would that we put as much work into this type of problem as we do into studying inference for the mean of a normal distribution! (To venture a controversial opinion that isn’t based around colour preference!)

This actually makes me think of an important question that I can’t get my head around: how does hypothesis testing work for these models?

A similar question is “How does hypothesis testing work for a semi-parametric model?”

In both cases, the testing result will be strongly prior dependent (the first case will depend on the restriction of the prior to the identified region, the second will depend on the prior of the scale parameter for the non-parametric effect). So is there a framework for asking these sorts of questions?

As for tying down identifiable vs non-identifiable parts (and indeed all of question (iii) in your second paragraph), I suspect we need to tie up a computationally minded algebraic geometer and tickle them until they give us the answers…

July 9, 2015 at 9:01 pm

I have no idea about testing for semi-parametric models, but I’m sure Judith or Chris have definite ones!