Archive for Brad Carlin

ghost [parameters] in the [Bayesian] shell

Posted in Books, Kids, Statistics with tags , , , , , , , on August 3, 2017 by xi'an

This question appeared on Stack Exchange (X Validated) two days ago. And the equalities indeed seem to suffer from several mathematical inconsistencies, as I pointed out in my Answer. However, what I find most crucial in this question is that the quantity on the left-hand side is meaningless. Parameters for different models only make sense within their own model; hence, when comparing models, parameters cannot co-exist across models. What I suspect [without direct access to Kruschke's Doing Bayesian Data Analysis book, and as was later confirmed by John] is that he is using pseudo-priors in order to apply Carlin and Chib's (1995) resolution [by saturation of the parameter space] of simulating over a trans-dimensional space…

Savage-Dickey supermodels

Posted in Books, Mountains, pictures, Statistics, Travel, University life with tags , , , , , , , , , , , on September 13, 2016 by xi'an

[The Wider Image: Bolivia's cholita climbers. Aymara indigenous women Domitila Alana, Bertha Vedia, Lidia Huayllas, and Dora Magueno posing at the Huayna Potosi mountain, Bolivia, April 6, 2016 (c) REUTERS/David Mercado]

A. Mootoovaloo, B. Bassett, and M. Kunz just arXived a paper on the computation of Bayes factors by the Savage-Dickey representation through a supermodel (or encompassing model). (I wonder why Savage-Dickey is so popular in astronomy and cosmology statistical papers and not so much elsewhere.) Recall that the trick is to write the Bayes factor in favour of the encompassing model as the ratio of the posterior and of the prior for the tested parameter (thus eliminating nuisance or common parameters) at its null value,

B_{10} = \pi(\varphi^0 \mid x) \, / \, \pi(\varphi^0).

This holds modulo some continuity constraints on the prior density, and under the assumption that the conditional prior on the nuisance parameters is the same under the null model and the encompassing model [given the null value φ⁰]. If this sounds confusing or even shocking from a mathematical perspective, check the numerous previous entries on this topic on the 'Og!
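As an aside, the Savage-Dickey ratio is straightforward to approximate once one has posterior draws of the tested parameter under the encompassing model. Here is a minimal sketch [a toy illustration of my own, with a made-up Gaussian prior and fake posterior draws, not the authors' code], using a kernel density estimate of the marginal posterior at the null value:

```python
import numpy as np
from scipy.stats import gaussian_kde, norm

def savage_dickey_bf(phi_samples, phi0, prior_pdf):
    """Estimate pi(phi0 | x) / pi(phi0) from posterior draws of phi,
    via a kernel density estimate of the marginal posterior at phi0."""
    posterior_at_null = gaussian_kde(phi_samples)(phi0)[0]
    return posterior_at_null / prior_pdf(phi0)

# toy illustration: prior phi ~ N(0, 1), fake posterior draws centred at 0.8
rng = np.random.default_rng(0)
draws = rng.normal(0.8, 0.3, size=10_000)
bf = savage_dickey_bf(draws, 0.0, lambda p: norm.pdf(p, 0.0, 1.0))
print(f"Savage-Dickey ratio at phi = 0: {bf:.3f}")
```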

The supermodel created by the authors is a mixture of the original models, as in our paper, and… hold the presses!, it is a mixture of the likelihood functions, as in Phil O'Neill's and Theodore Kypraios' paper. Which is not mentioned in the current paper and obviously should be. In the current representation, the posterior distribution on the mixture weight α is a linear function of α involving both evidences, α(m₁−m₂)+m₂, times the artificial prior on α. The resulting estimator of the Bayes factor thus shares features with bridge sampling, reversible jump, and the importance sampling version of nested sampling we developed in our Biometrika paper, in addition to O'Neill and Kypraios's solution.
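In symbols [my reconstruction from the verbal description above, with f₁, f₂ denoting the two likelihoods, π₁, π₂ the priors on their respective parameters, and m₁, m₂ the evidences], integrating the mixture of likelihoods against the priors gives

\pi(\alpha \mid x) \propto \pi(\alpha) \int \left\{ \alpha\, f_1(x \mid \theta_1) + (1-\alpha)\, f_2(x \mid \theta_2) \right\} \pi_1(\theta_1)\, \pi_2(\theta_2) \,\text{d}\theta_1 \,\text{d}\theta_2 = \left\{ \alpha\,(m_1 - m_2) + m_2 \right\} \pi(\alpha),

hence the linearity in α.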

The following quote is inaccurate, since in realistic settings the MCMC algorithm requires simulating the parameters of the compared models, hence replacing the multidimensional integrals with Monte Carlo approximations.

“Though we have a clever way of avoiding multidimensional integrals to calculate the Bayesian Evidence, this new method requires very efficient sampling and for a small number of dimensions is not faster than individual nested sampling runs.”

I actually wonder at the sheer rationale of running an intensive MCMC sampler in such a setting, when the weight α is completely artificial. It is only used to jump from one model to the next, which sounds quite inefficient when compared with simulating from both models separately and independently. This approach can also be seen as a special case of Carlin and Chib's (1995) alternative to reversible jump. Using instead the Savage-Dickey representation is of course infeasible, which makes the overall reference to this method rather inappropriate in my opinion. Further, the examples processed in the paper all involve (naturally) embedded models where the original Savage-Dickey approach applies. Creating an additional model to apply a pseudo-Savage-Dickey representation does not sound very compelling…

Incidentally, the paper also includes a discussion of a weird notion, the likelihood of the Bayes factor, B₁₂, which is plotted as a distribution in B₁₂, most strangely. The only other place I have met this notion is in Murray Aitkin's book. Something's unclear there, or in my head!

“One of the fundamental choices when using the supermodel approach is how to deal with common parameters to the two models.”

This is an interesting question, although maybe not so relevant for the Bayes factor issue where it should not matter. However, as in our paper, multiplying the number of parameters in the encompassing model may hinder convergence of the MCMC chain or reduce the precision of the approximation of the Bayes factor. Again, from a Bayes factor perspective, this does not matter [while it does in our perspective].

a day for comments

Posted in Mountains, Statistics, Travel, University life with tags , , , , , , , , , , , , , , , , , , , , , , , , , on April 21, 2014 by xi'an

As I was flying over Skye (with [maybe] a first if hazy perspective on the Cuillin ridge!) to Iceland, three long sets of replies to some of my posts appeared on the ‘Og:

Thanks to them for taking the time to answer my musings…


Carlin and Chib (1995) for fixed dimension problems

Posted in Books, Kids, Statistics, University life with tags , , , , , , , , on February 25, 2014 by xi'an

[chantier de désamiantage (asbestos removal works), Université Pierre et Marie Curie, Paris (c) Bouchon/Le Figaro]

Yesterday, I was part of a (public) thesis committee at the Université Pierre et Marie Curie, in downtown Paris. After a bit of a search for the defence room (as the campus is still undergoing a massive asbestos clean-up, 20 years after it started…!), I listened to Florian Maire delivering his talk on an array of work in computational statistics, ranging from the theoretical (Peskun ordering) to the methodological (Monte Carlo online EM) to the applied (unsupervised learning of class shapes via deformable templates). The implementation of the online EM algorithm involved the use of pseudo-priors à la Carlin and Chib (1995), even though the setting was a fixed-dimension one, in order to fight the difficulty of exploring the space of templates by a regular Gibbs sampler. (As usual, the design of the pseudo-priors was crucial to the success of the method.) The thesis also included a recent work with Randal Douc and Jimmy Olsson on ranking inhomogeneous Markov kernels of the type

P \circ Q \circ P \circ Q \circ \cdots

against alternatives with components (P’,Q’). The authors were able to characterise minimal conditions for a Peskun-ordering domination on the components to transfer to the combination. Quite an interesting piece of work for a PhD thesis!
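To make the pseudo-prior idea concrete, here is a minimal sketch of Carlin and Chib's (1995) Gibbs sampler for a fixed-dimension model choice problem [a toy Gaussian comparison of my own, not Florian Maire's template model]: the chain tracks a model indicator k along with the parameter, drawing it from the posterior when its model is active and from a pseudo-prior otherwise, so that the indicator's full conditional remains well-defined.

```python
import numpy as np
from scipy.stats import norm

# Toy fixed-dimension model choice: M1: y_i ~ N(theta, 1), theta ~ N(0, 10^2),
# versus M2: y_i ~ N(0, 1) (no free parameter).
rng = np.random.default_rng(1)
y = rng.normal(0.3, 1.0, size=50)
n, ybar = len(y), y.mean()

# Conjugate posterior of theta under M1: N(mu_n, s2_n)
s2_n = 1.0 / (n + 1.0 / 100.0)
mu_n = s2_n * n * ybar

# Pseudo-prior for theta while the chain visits M2; calibrating it on the
# M1 posterior is the crucial design choice mentioned above.
pseudo_mu, pseudo_sd = mu_n, 1.5 * np.sqrt(s2_n)

k, visits = 1, np.zeros(2)
for _ in range(20_000):
    # 1. theta | k : posterior draw under M1, pseudo-prior draw under M2
    theta = rng.normal(mu_n, np.sqrt(s2_n)) if k == 1 else rng.normal(pseudo_mu, pseudo_sd)
    # 2. k | theta : each model weighted by likelihood x (prior or pseudo-prior)
    log_w1 = norm.logpdf(y, theta, 1.0).sum() + norm.logpdf(theta, 0.0, 10.0)
    log_w2 = norm.logpdf(y, 0.0, 1.0).sum() + norm.logpdf(theta, pseudo_mu, pseudo_sd)
    k = 1 if rng.random() < 1.0 / (1.0 + np.exp(log_w2 - log_w1)) else 2
    visits[k - 1] += 1

# With equal prior model weights, the posterior odds estimate the Bayes factor
print("estimated B12:", visits[0] / visits[1])
```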

Brad Carlin’s take on Warren Buffett’s $1 billion offer [guest post]

Posted in Kids, Statistics, University life with tags , , , , on January 25, 2014 by xi'an

[This is a guest post from Brad Carlin, prepared for a news channel that never called back. If, like me, you have no idea of what March Madness means or of the meaning of brackets and seeds in this context, read the insert below first. Brad kindly added it upon my request…]

In mid-March every year, the national champion in US college basketball is decided through a 6-round tournament called "March Madness": 64 teams are seeded into the tournament based on their perceived strengths, with 1 seeds going to the top 4 teams and 16 seeds going to the weakest teams. A couple of years ago the tourney added a partial 7th round, with just 4 more teams (winning one of those 4 "play-in" games merely gives you the right to be one of the 64 — and probably to lose in the next round to a top-seeded team).

Anyway, every year Americans fill out "their brackets": once the tourney is announced, it's very, very common to participate in a pool (often at your office) where everybody throws in some small amount of money ($10?) and submits a prediction of how *all* 67 games are going to come out, with the guy whose sheet most accurately predicts the actual outcome winning the money. Pool sheets are typically scored in some systematic way, the most common being 1 point per game in the first round (or play-in game, if included; many poolmasters just give you the play-in game winners for free), 2 for the second round, 4 for the 3rd, 8 for the 4th, and so on (the exponential award system reflecting how hard it is to keep getting the predictions right as the tourney progresses). Office pools have become so popular that March Madness is now estimated to be the most wagered-upon event in the world, passing even the Super Bowl (the American pro football championship). Strictly speaking, as a form of gambling, office pools are illegal in most states, but the cops generally look the other way provided the poolmaster isn't taking a cut of the pot as a fee; all moneys collected must be distributed as prizes.
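As a quick check on that scoring scheme [a sketch assuming the standard 64-team bracket, with the per-game award doubling each round], every round then carries the same total of 32 points:

```python
# Exponential pool scoring: the per-game award doubles each round, so each
# of the 6 rounds of a 64-team bracket is worth the same 32 points in total.
games_per_round = [32, 16, 8, 4, 2, 1]
for r, games in enumerate(games_per_round, start=1):
    points = 2 ** (r - 1)
    print(f"round {r}: {games} games x {points} pt(s) = {games * points} points")
```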

There are websites to tell you how best to fill out your poolsheet; poologic is one that I've been involved with, at least in some small way; see my name on a paper at the bottom of the program's info page. My university has allowed (nay, encouraged) me to speak to reporters about March Madness when it comes around every year, and this year the big story is that Omaha billionaire Warren Buffett is underwriting an ad campaign by Quicken Loans that will pay you $1B if you get the whole bracket completely right — all 67 games, even the play-in games. The odds against this are astronomical (even with 10 million players), so my initial reaction was that it's not a very interesting problem; as Jeff Rosenthal might say, you're more likely to be struck by lightning on your way to handing in your pool sheet than you are to win the prize. But the part I did find interesting is speculating about how much Mr. Buffett (the 2nd or 3rd richest man in the nation, and a very shrewd investor) charged Quicken for his agreement to underwrite (i.e., pay off the bet if by some miracle somebody actually won). The blog post below suggests he probably only needed to charge about $78K to make the bet "fair", but in fact he probably charged much, much more than this — a fee he will in all likelihood keep and smile about as he walks to the bank.

Warren Buffett, Quicken Loans offer $1 billion for a perfect bracket during March Madness 2014

Hey, want to win $1 billion? Well, of course you do! All you have to do is fill out a perfect bracket for the NCAA men's basketball tournament, otherwise known as March Madness, and billionaire businessman Warren Buffett and Quicken Loans will send you $25 million a year for 40 years.

No big deal, you say? Think again: there are roughly 148 quintillion (2⁶⁷) ways to fill out your bracket, the result of having to make 67 binary picks (the 63 games in the tournament proper plus the 4 "play-in" games).

Even though some of these picks are easy to make (say, that a 16 seed will not defeat a 1 seed, since no 16 has ever won a game in the men’s tournament), the odds of successfully completing a perfect bracket are astronomically small — somewhere between 1 in a billion and 1 in 128 billion.

So you’re saying there’s a chance? Yes, but a very slim one at best.

Brad Carlin, Ph.D., is a professor of biostatistics in the School of Public Health at the University of Minnesota and has been following this story as it has unfolded. "This is a fascinating case, especially for me as a statistician long interested in NCAA tournaments and wagering. Mr. Buffett is essentially acting as the insurance company for Quicken, and it's fun to speculate on how much he charged Quicken for this 'coverage'."

Rules of the contest specify that only the first 10 million participants who register will be allowed to submit brackets. Assuming independent players, each submitting one entry with the same chance p of perfection, a standard probability calculation gives the probability that at least one of them achieves perfection as 1 − (1 − p)^10,000,000. Multiplying by $1 billion produces the "premium" Mr. Buffett would need to charge Quicken to exactly cover the risk.

However, this premium varies widely depending on how hard you assume a perfect bracket is to achieve. Prof. Ezra Miller of Duke University suggests a skilled player would have a 1 in a billion chance of perfection; there is then about a 1% chance that at least one of 10,000,000 such players would end up perfect, leading to a premium of $1B x .01 = $10,000,000, a tidy sum indeed. However, Prof. Jay Berger of DePaul University thinks even a skilled player would have at best a 1 in 128 billion chance of perfection. The chance of at least one of 10,000,000 such players beating Mr. Buffett is then only 0.0078%, leading to an insurance premium of just $78,000 – a relative pittance. Neither Mr. Buffett nor Quicken will reveal the actual premium, but it seems likely to be much closer to $10M than $78K; after all, Quicken is donating $3M for home loans and other charitable causes as part of the promotion, regardless of whether anyone hits perfection or not.
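Both premium figures follow from that one-line binomial computation; here is a quick sketch reproducing them [player count and per-player odds as quoted above]:

```python
# Reproducing the premium figures quoted above: premium = $1B x P(payout),
# where P(payout) = 1 - (1 - p)^N for N independent players of skill p.
n_players = 10_000_000
for label, p in [("Miller, 1 in 1 billion", 1e-9),
                 ("Berger, 1 in 128 billion", 1 / 128e9)]:
    p_payout = 1 - (1 - p) ** n_players
    print(f"{label}: P(payout) = {p_payout:.4%}, fair premium = ${1e9 * p_payout:,.0f}")
```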

Carlin adds that Mr. Buffett's risk is probably even lower than this, due to the lack of independence among players. "The best way to beat Mr. Buffett here would be to enter the 10,000,000 most likely pool sheets, which can be computed from point spreads and team computer ratings once the brackets come out." But instead, Carlin predicts, the public will likely do as they always do: overvalue "conventional wisdom" and overbet the favorites (higher seeds) in the tournament, leading to many pool sheets that look largely – or perhaps even exactly – alike. "Positive dependence among the 10,000,000 entries makes it even easier for Warren to hang onto his billion."

Regardless of how they make their picks, participants are going to have to overcome monumental odds to rake in the prize. For those who decide to get involved anyway, Carlin counsels diversity: "Don't be afraid to pick a few upsets; unlike most March Madness challenges, here the goal is perfection, not merely point accumulation. Take more than a few chances to make sure your sheet differs from the other 9,999,999."

For more from Carlin on picking March Madness winners, see this past Health Talk blog post.

Hidden Markov mixtures of regression

Posted in Statistics with tags , , , , , on December 1, 2009 by xi'an

It took the RSS feed of Bayesian Analysis disappearing from my screen—because the Bayesian Analysis 4(4) issue was completed—for me to spot this very nice paper by Matthew A. Taddy and Athanasios Kottas on Markov switching regression models. It reminds me of earlier papers of mine with Monica Billio and Alain Monfort, and with Merrilee Hurn and Ana Justel, on Markov switching and mixtures of regression, respectively. At the time, with Merrilee, we had in mind to extend mixtures of regressions to mixtures of generalised linear models, but never found the opportunity to concretise the model. The current paper goes much further by using mixtures of Dirichlet priors, thus giving a semi-parametric flavour to the mixture of regressions. There is also an interesting application to fishery management.

This issue also includes an emotional postnote by Brad Carlin, who is stepping down as Bayesian Analysis Editor-in-chief. Brad unreservedly deserves thanks for steering Bayesian Analysis towards a wider audience and stronger requirements on the papers published in the journal. I think Bayesian Analysis is now a mainstream journal rather than the emanation of a society, albeit one as exciting as ISBA! The electronic format adopted by Bayesian Analysis should be exploited further, towards forums and on-line discussions of all papers rather than singling out one paper per issue, and I am glad Brad agrees on this possible change of editorial policy. All the best to the new Editor-in-chief, Herbie Lee!