Archive for Bayesian Analysis

the worst possible proof [X’ed]

Posted in Books, Kids, Statistics, University life with tags , , , , , , on July 18, 2015 by xi'an

XX-1Another surreal experience thanks to X validated! A user of the forum recently asked for an explanation of the above proof in Lynch’s (2007) book, Introduction to Applied Bayesian Statistics and Estimation for Social Scientists. No wonder this user was puzzled: the explanation makes no sense outside the univariate case… It is hard to fathom why on Earth the author would resort to this convoluted approach to conclude about the posterior conditional distribution being a normal centred at the least square estimate and with σ²X’X as precision matrix. Presumably, he has a poor opinion of the degree of matrix algebra numeracy of his readers [and thus should abstain from establishing the result]. As it seems unrealistic to postulate that the author is himself confused about matrix algebra, given his MSc in Statistics [the footnote ² seen above after “appropriately” acknowledges that “technically we cannot divide by” the matrix, but it goes on to suggest multiplying the numerator by the matrix

(X^\text{T}X)^{-1} (X^\text{T}X)

which does not make sense either, unless one introduces the trace tr(.) operator, presumably out of reach for most readers]. And this part of the explanation is unnecessarily confusing in that a basic matrix manipulation leads to the result. Or even simpler, a reference to Pythagoras’  theorem.

Bayesian inference for partially identified models [book review]

Posted in Books, Statistics, University life with tags , , , , , , , , , on July 9, 2015 by xi'an

“The crux of the situation is that we lack theoretical insight into even quite basic questions about what is going on. More particularly, we cannot sayy anything about the limiting posterior marginal distribution of α compared to the prior marginal distribution of α.” (p.142)

Bayesian inference for partially identified models is a recent CRC Press book by Paul Gustafson that I received for a review in CHANCE with keen interest! If only because the concept of unidentifiability has always puzzled me. And that I have never fully understood what I felt was a sort of joker card that a Bayesian model was the easy solution to the problem since the prior was compensating for the components of the parameter not identified by the data. As defended by Dennis Lindley that “unidentifiability causes no real difficulties in the Bayesian approach”. However, after reading the book, I am less excited in that I do not feel it answers this type of questions about non-identifiable models and that it is exclusively centred on the [undoubtedly long-term and multifaceted] research of the author on the topic.

“Without Bayes, the feeling is that all the data can do is locate the identification region, without conveying any sense that some values in the region are more plausible than others.” (p.47)

Overall, the book is pleasant to read, with a light and witty style. The notational conventions are somewhat unconventional but well explained, to distinguish θ from θ* from θ. The format of the chapters is quite similar with a definition of the partially identified model, an exhibition of the transparent reparameterisation, the computation of the limiting posterior distribution [of the non-identified part], a demonstration [which it took me several iterations as the English exhibition rather than the French proof, pardon my French!]. Chapter titles suffer from an excess of the “further” denomination… The models themselves are mostly of one kind, namely binary observables and non-observables leading to partially observed multinomials with some non-identifiable probabilities. As in missing-at-random models (Chapter 3). In my opinion, it is only in the final chapters that the important questions are spelled-out, not always faced with a definitive answer. In essence, I did not get from the book (i) a characterisation of the non-identifiable parts of a model, of the  identifiability of unidentifiability, and of the universality of the transparent reparameterisation, (ii) a tool to assess the impact of a particular prior and possibly to set it aside, and (iii) a limitation to the amount of unidentifiability still allowing for coherent inference. Hence, when closing the book, I still remain in the dark (or at least in the grey) on how to handle partially identified models. The author convincingly argues that there is no special advantage to using a misspecified if identifiable model to a partially identified model, for this imbues false confidence (p.162), however we also need the toolbox to verify this is indeed the case.

“Given the data we can turn the Bayesian computational crank nonetheless and see what comes out.” (p.xix)

“It is this author’s contention that computation with partially identified models is a “bottleneck” issue.” (p.141)

Bayesian inference for partially identified models is particularly concerned about computational issues and rightly so. It is however unclear to me (without more time to invest investigating the topic) why the “use of general-purpose software is limited to the [original] parametrisation” (p.24) and why importance sampling would do better than MCMC on a general basis. I would definitely have liked more details on this aspect. There is a computational considerations section at the end of the book, but it remains too allusive for my taste. My naïve intuition would be that the lack of identifiability leads to flatter posterior and hence to easier MCMC moves, but Paul Gustafson reports instead bad mixing from standard MCMC schemes (like WinBUGS).

In conclusion, the book opens a new perspective on the relevance of partially identifiable models, trying to lift the stigma associated with them, and calls for further theory and methodology to deal with those. Here are the author’s final points (p.162):

  • “Identification is nuanced. Its absence does not preclude a parameter being well estimated, not its presence guarantee a parameter can be well estimated.”
  • “If we really took limitations of study designs and data quality seriously, then partially identifiable models would crop up all the time in a variety of scientific fields.”
  • “Making modeling assumptions for the sole purpose of gaining full identification can be a mug’s game (…)”
  • “If we accept partial identifiability, then consequently we need to regard sample size differently. There are profound implications of posterior variance tending to a positive limit as the sample size grows.”

These points may be challenging enough to undertake to read Bayesian inference for partially identified models in order to make one’s mind about their eventual relevance in statistical modelling.

[Disclaimer about potential self-plagiarism: this post will also be published as a book review in my CHANCE column. ]

Current trends in Bayesian methodology with applications

Posted in Books, Statistics, Travel, University life with tags , , , , , on June 20, 2015 by xi'an

When putting this volume together with Umesh Singh, Dipak Dey, and Appaia Loganathan, my friend Satyanshu Upadhyay from Varanasi, India, asked me for a foreword. The book is now out, with chapters written by a wide variety of Bayesians. And here is my foreword, for what it’s worth:

It is a great pleasure to see a new book published on current aspects of Bayesian Analysis and coming out of India. This wide scope volume reflects very accurately on the present role of Bayesian Analysis in scientific inference, be it by statisticians, computer scientists or data analysts. Indeed, we have witnessed in the past decade a massive adoption of Bayesian techniques by users in need of statistical analyses, partly because it became easier to implement such techniques, partly because both the inclusion of prior beliefs and the production of a posterior distribution that provides a single filter for all inferential questions is a natural and intuitive way to process the latter. As reflected so nicely by the subtitle of Sharon McGrayne’s The Theory that Would not Die, the Bayesian approach to inference “cracked the Enigma code, hunted down Russian submarines” and more generally contributed to solve many real life or cognitive problems that did not seem to fit within the traditional patterns of a statistical model.
Two hundred and fifty years after Bayes published his note, the field is more diverse than ever, as reflected by the range of topics covered by this new book, from the foundations (with objective Bayes developments) to the implementation by filters and simulation devices, to the new Bayesian methodology (regression and small areas, non-ignorable response and factor analysis), to a fantastic array of applications. This display reflects very very well on the vitality and appeal of Bayesian Analysis. Furthermore, I note with great pleasure that the new book is edited by distinguished Indian Bayesians, India having always been a provider of fine and dedicated Bayesians. I thus warmly congratulate the editors for putting this exciting volume together and I offer my best wishes to readers about to appreciate the appeal and diversity of Bayesian Analysis.

ISBA 2016 [logo]

Posted in pictures, Statistics, Travel, University life, Wines with tags , , , , , , , , , , on April 22, 2015 by xi'an

Things are starting to get in place for the next ISBA 2016 World meeting, in Forte Village Resort Convention Center, Sardinia, Italy. June 13-17, 2016. And not only the logo inspired from the nuraghe below. I am sure the program will be terrific and make this new occurrence of a “Valencia meeting” worth attending. Just like the previous occurrences, e.g. Cancún last summer and Kyoto in 2012.

However, and not for the first time, I wonder at the sustainability of such meetings when faced with always increasing—or more accurately sky-rocketing!—registration fees… We have now reached €500 per participant for the sole (early reg.) fees, excluding lodging, food or transportation. If we bet on 500 participants, this means simply renting the convention centre would cost €250,000 for the four or five days of the meeting. This sounds enormous, even accounting for the processing costs of the congress organiser. (By comparison, renting the convention centre MCMSki in Chamonix for three days was less than €20,000.) Given the likely high costs of staying at the resort, it is very unlikely I will be able to support my PhD students  As I know very well of the difficulty to find dedicated volunteers willing to offer a large fraction of their time towards the success of behemoth meetings, this comment is by no means aimed at my friends from Cagliari who kindly accepted to organise this meeting. But rather at the general state of academic meetings which costs makes them out of reach for a large part of the scientific community.

Thus, this makes me wonder anew whether we should move to a novel conference model given that the fantastic growth of the Bayesian community makes the ideal of gathering together in a single beach hotel for a week of discussions, talks, posters, and more discussions unattainable. If truly physical meetings are to perdure—and this notion is as debatable as the one about the survival of paper versions of the journals—, a new approach would be to find a few universities or sponsors able to provide one or several amphitheatres around the World and to connect all those places by teleconference. Reducing the audience size at each location would greatly the pressure to find a few huge and pricey convention centres, while dispersing the units all around would diminish travel costs as well. There could be more parallel sessions and ways could be found to share virtual poster sessions, e.g. by having avatars presenting some else’s poster. Time could be reserved for local discussions of presented papers, to be summarised later to the other locations. And so on… Obviously, something would be lost of the old camaraderie, sharing research questions and side stories, as well as gossips and wine, with friends from all over the World. And discovering new parts of the World. But the cost of meetings is already preventing some of those friends to show up. I thus think it is time we reinvent the Valencia meetings into the next generation. And move to the Valenci-e-meetings.

Bayesian propaganda?

Posted in Books, Kids, pictures, Statistics, University life with tags , , , , , , , , , on April 20, 2015 by xi'an

“The question is about frequentist approach. Bayesian is admissable [sic] only by wrong definition as it starts with the assumption that the prior is the correct pre-information. James-Stein beats OLS without assumptions. If there is an admissable [sic] frequentist estimator then it will correspond to a true objective prior.”

I had a wee bit of a (minor, very minor!) communication problem on X validated, about a question on the existence of admissible estimators of the linear regression coefficient in multiple dimensions, under squared error loss. When I first replied that all Bayes estimators with finite risk were de facto admissible, I got the above reply, which clearly misses the point, and as I had edited the OP question to include more tags, the edited version was reverted with a comment about Bayesian propaganda! This is rather funny, if not hilarious, as (a) Bayes estimators are indeed admissible in the classical or frequentist sense—I actually fail to see a definition of admissibility in the Bayesian sense—and (b) the complete class theorems of Wald, Stein, and others (like Jack Kiefer, Larry Brown, and Jim Berger) come from the frequentist quest for best estimator(s). To make my point clearer, I also reproduced in my answer the Stein’s necessary and sufficient condition for admissibility from my book but it did not help, as the theorem was “too complex for [the OP] to understand”, which shows in fine the point of reading textbooks!

Bayesian computation: fore and aft

Posted in Books, Statistics, University life with tags , , , , , , , , , , , , on February 6, 2015 by xi'an

BagneuxWith my friends Peter Green (Bristol), Krzysztof Łatuszyński (Warwick) and Marcello Pereyra (Bristol), we just arXived the first version of “Bayesian computation: a perspective on the current state, and sampling backwards and forwards”, which first title was the title of this post. This is a survey of our own perspective on Bayesian computation, from what occurred in the last 25 years [a  lot!] to what could occur in the near future [a lot as well!]. Submitted to Statistics and Computing towards the special 25th anniversary issue, as announced in an earlier post.. Pulling strength and breadth from each other’s opinion, we have certainly attained more than the sum of our initial respective contributions, but we are welcoming comments about bits and pieces of importance that we miss and even more about promising new directions that are not posted in this survey. (A warning that is should go with most of my surveys is that my input in this paper will not differ by a large margin from ideas expressed here or in previous surveys.)

not Bayesian enough?!

Posted in Books, Statistics, University life with tags , , , , , , , on January 23, 2015 by xi'an

Elm tree in the park, Parc de Sceaux, Nov. 22, 2011Our random forest paper was alas rejected last week. Alas because I think the approach is a significant advance in ABC methodology when implemented for model choice, avoiding the delicate selection of summary statistics and the report of shaky posterior probability approximation. Alas also because the referees somewhat missed the point, apparently perceiving random forests as a way to project a large collection of summary statistics on a limited dimensional vector as in the Read Paper of Paul Fearnhead and Dennis Prarngle, while the central point in using random forests is the avoidance of a selection or projection of summary statistics.  They also dismissed ou approach based on the argument that the reduction in error rate brought by random forests over LDA or standard (k-nn) ABC is “marginal”, which indicates a degree of misunderstanding of what the classification error stand for in machine learning: the maximum possible gain in supervised learning with a large number of classes cannot be brought arbitrarily close to zero. Last but not least, the referees did not appreciate why we mostly cannot trust posterior probabilities produced by ABC model choice and hence why the posterior error loss is a valuable and almost inevitable machine learning alternative, dismissing the posterior expected loss as being not Bayesian enough (or at all), for “averaging over hypothetical datasets” (which is a replicate of Jeffreys‘ famous criticism of p-values)! Certainly a first time for me to be rejected based on this argument!

Follow

Get every new post delivered to your Inbox.

Join 903 other followers