## normalising constants of G-Wishart densities

Posted in Books, Statistics with tags , , , , , , on June 28, 2017 by xi'an

Abdolreza Mohammadi, Hélène Massam, and Gérard Letac arXived last week a paper on a new approximation of the ratio of the normalising constants of two G-Wishart densities associated with different graphs G. The G-Wishart is Alberto Roverato's generalisation of the Wishart distribution to the case when some entries of the matrix are equal to zero, the locations of those zeroes being determined by the graph G. While enjoying the same shape as the Wishart density, this generalisation does not enjoy a closed-form normalising constant. Which leads to an intractable ratio of normalising constants when doing Bayesian model selection across different graphs.

Atay-Kayis and Massam (2005) expressed the ratio as a ratio of two expectations, and the current paper shows that this leads to an approximation of the ratio of normalising constants for a graph G against the graph G augmented by the edge e, equal to

$$\frac{\Gamma(\{\delta+d\}/2)}{2\sqrt{\pi}\,\Gamma(\{\delta+d+1\}/2)}$$

where δ is the degrees of freedom of the G-Wishart and d is the number of minimal paths of length 2 linking the two endpoints of e. This is remarkably concise and provides a fast approximation. (The proof is quite involved, by comparison.) Which can then be used in reversible jump MCMC. The difficulty is obviously in evaluating the impact of the approximation on the target density, as there is no manageable alternative available to calibrate the approximation. In a simulation example where such an alternative is available, the error is negligible though.
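Since the approximation only involves Gamma functions, it is essentially free to evaluate. A minimal sketch in Python (the function name and interface are mine):

```python
from math import gamma, pi, sqrt

def gwishart_ratio(delta, d):
    """Approximate ratio of the G-Wishart normalising constants of a
    graph G and of G augmented by an edge e, where delta is the
    degrees of freedom and d is the number of minimal paths of
    length 2 linking the two endpoints of e."""
    return gamma((delta + d) / 2) / (2 * sqrt(pi) * gamma((delta + d + 1) / 2))

# for instance, delta = 3 and d = 0 give Gamma(3/2)/(2*sqrt(pi)*Gamma(2)) = 1/4
```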

## untangling riddle

Posted in Statistics with tags , , on December 23, 2016 by xi'an

This week's riddle is rather anticlimactic [my rewording]:

N wires lead from the top of a tower down to the ground floor. But they have become hopelessly tangled. The wires on the top floor are all labelled, from 1 to N. The wires on the bottom, however, are not.

You need to figure out which ground-floor wire’s end corresponds to which top-floor wire’s end. On the ground floor, you can tie two wire ends together, and you can tie as many of these pairs as you like. You can then walk to the top floor and use a circuit detector to check whether any two of the wires at the top of the tower are connected or not.

What is the smallest number of trips to the top of the tower you need to take in order to correctly label all the wires on the bottom?

Indeed, first, if N=2, the wires cannot be identified; if N=3, they can be identified in 2 steps; and if N=4, they can also be identified in 2 steps (first pair only two wires, which identifies them and the remaining two up to a permutation, then pair one wire of each group). Now, in general, it seems that, by attaching the wires two by two, one can cut the number of unidentified wires in half at each trip. This is due to the identification of pairs: if wires a and b are connected in one step, identifying the other end of a automatically determines the other end of b, provided one keeps track of all connected pairs at the top. Hence, if $N=2^k$, in (k-2) steps we are down to 4 wires, and it takes an extra 2 steps to identify those 4, hence all the other wires, which leads to identification in k trips. If N is not a power of 2, things should go faster, as "left over" wires in the binary division can be paired with set-aside wires, hence identifying 3 wires on that trip, and then further along the way… Which leads to $\lfloor \log_2(N) \rfloor$ trips maybe (or even less).
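The bookkeeping above condenses into a tiny function returning the conjectured trip count; the function name and the handling of the small cases are mine, and the general case is only the conjecture stated above, not a proven bound:

```python
from math import floor, log2

def trips(n):
    """Conjectured number of trips to the top for n wires: the small
    cases worked out above, then the log2-halving bound in general
    (for n a power of two, this is exactly k = log2(n))."""
    if n == 2:
        return None        # the two wires cannot be told apart
    if n in (3, 4):
        return 2
    return floor(log2(n))  # maybe fewer still when n is not a power of 2
```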

## chain event graphs [RSS Midlands seminar]

Posted in pictures, Statistics, University life with tags , , , , , , , , , , on October 16, 2013 by xi'an

Last evening, I attended the RSS Midlands seminar here in Warwick. The theme was chain event graphs (CEGs). As I knew nothing about them, it was worth my time listening to both speakers and discussing with Jim Smith afterwards. CEGs are extensions of Bayes nets, with originally many more nodes since they start from the probability tree involving all modalities of all variables. Intensive Bayesian model comparison is then used to reduce the number of nodes, by merging modalities having the same children or removing variables with no impact on the variable of interest. So this is not exactly a new Bayes net based on modality dummies as nodes (my original question). This is quite interesting, especially the first talk's illustration of using missing-value indicators as a supplementary variable (to determine whether or not data is missing at random). I also wonder how much of a connection there is with variable-length Markov chains (either as a model or as a way to prune the tree). A last, vaguer idea is a potential connection with lumpable Markov chains, a concept I learned from Kemeny & Snell (1960): a finite Markov chain is lumpable if, after merging two or more of its states, it remains a Markov chain. I do not know whether this has ever been studied from a statistical point of view, i.e., testing for lumpability, but it sounds related to the idea of merging modalities of some variables in the probability tree…
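For a given partition, Kemeny & Snell's lumpability condition is easy to check numerically: the probability of jumping from a state into a block must not depend on which state of its own block one starts from. A quick sketch (function name mine):

```python
import numpy as np

def is_lumpable(P, partition, tol=1e-10):
    """Check Kemeny & Snell's lumpability criterion: for every pair of
    blocks (A, B) of the partition, the probability of jumping from a
    state of A into B must be the same for all states of A."""
    for A in partition:
        for B in partition:
            jumps = [P[i, B].sum() for i in A]
            if max(jumps) - min(jumps) > tol:
                return False
    return True

# a symmetric 3-state chain is lumpable when merging states 1 and 2
P = np.array([[0.50, 0.25, 0.25],
              [0.25, 0.50, 0.25],
              [0.25, 0.25, 0.50]])
```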

## Medical illuminations [book review]

Posted in Books, pictures, Statistics with tags , , , , on September 27, 2013 by xi'an

Howard Wainer wrote another book, about to be published by Oxford University Press, called Medical Illuminations. (The book is announced for January 2 on amazon. A great New Year gift to be sure!) While attending WSC 2013 in Hong Kong, and then again at the RSS Annual Conference in Newcastle, I saw a preliminary copy of the book and asked the OUP representative if I could get a copy for CHANCE (by any chance?!)… And they kindly sent me a copy the next day!

“This is an odd book (…) gallop[ing] off in all directions at once.” (p.152)

As can be seen from the cover, which reproduces the great da Vinci's notebook page above (and also from the title, where illuminations flirts with illuminated [manuscript]), the book focuses on the visualisation of medical data to "improve healthcare". Its other themes are using evidence and statistical thinking towards the same goal. Since I was most impressed by the graphical part, I first thought of entitling this post "Howard does his Tufte" (before wondering at the appropriateness of such a title)!

"As hard as this may be to believe, this display is not notably worse than many of the others contained in this remarkable volume." (p.78)

In fact, this first section is very much related to CHANCE in that a large collection of graphics was submitted by CHANCE readers when Howard Wainer launched a competition in the magazine for improving upon a Nightingale-like representation by Burtin of antibiotic efficiency. It starts from an administrative ruling that the New York State Health Department had to publish cancer maps overlaid with potentially hazardous sites without any (interpretation) buffer. From there, Wainer shows how the best as well as the worst can be made of graphical representations of statistical data. It reproduces (with due mention) Tufte's selection of Minard's rendering of the Napoleonic Russian campaign as the best graph ever… The corresponding chapters of the book keep their focus on medical data, with some commentary on the graphical quality of the 2008 National Healthcare Quality Report (answer: could do better!). While this is well done and carries a significant message, I would still favour Tufte for teaching data users to present their findings in the most effective way. An interesting final chapter for the section is about "controlling creativity", where Howard Wainer follows in the steps of John Tukey about the Atlas of United States Mortality, and then shows a perfectly incomprehensible chart taken from Understanding USA, a not very premonitory title… (Besides Howard's conclusion quoted above, you should also read the one-star comments on amazon!)

“Of course, it is impossible to underestimate the graphical skills of the mass media.” (p.164)

Section II is about a better use of statistics, and about communicating those statistics, towards improving healthcare: from fighting diabetes, to picking the right treatment for hip fractures (from an X-ray), to re-evaluating detection tests (for breast and prostate cancers) as possibly very inefficient, and to briefly wondering about accelerated testing. And Section III tries to explain why progress (by applying the previous recommendations) has not been more steady. It starts with a story about the use of check-lists in intensive care and their dramatic impact against infections. (The story hit home as I lost my thumb due to an infection while in intensive care! Maybe a check-list would have helped. Maybe.) The next chapter contrasts the lack of progress in using check-lists with the adoption of the Korean alphabet in Korea, a wee bit of an unrelated example given the focus of the book. (Overall, I find most of the final chapters on the weak side of the book.)

This is indeed an odd book, with a lot of clever remarks and useful insights, but lacking the driving line that would have made Wainer's Medical Illuminations more than the sum of its components. Each section and most chapters (!) contain sensible recommendations for improving the presentation and exploitation of medical data for practitioners and patients. I however wonder how much the book can impact the current state of affairs, like producing better tools for monitoring one's own diabetes. So, in the end, I recommend reading Medical Illuminations as a very pleasant moment, from which examples and anecdotes can be borrowed for courses and friendly discussions. For non-statisticians, it is certainly a worthy entry on the relevance of statistical processing of (raw) data.

## random sudokus

Posted in Books, R, Statistics with tags , , , on June 4, 2013 by xi'an

In a paper arXived on Friday, Roberto Fontana relates the generation of Sudoku grids to that of Latin squares (which is unsurprising) and to maximum cliques of a graph (more surprising). The generation of a random Latin square proceeds in three steps:

1. generate a random Latin square L with identity permutation matrix on symbol 1 (in practice, this implies building the corresponding graph and picking one of the largest cliques at random);
2. modify L into L' using a random permutation of the symbols 2,…,n in L;
3. modify L’ into L” by a random permutation of the columns of L’.

A similar result holds for Sudokus (with the additional constraint on the regions). However, while the result is interesting in its own right, it only covers full Sudokus, rather than partially filled Sudokus with a unique solution, whose random production could be more relevant. (Or maybe not, given that the difficulty matters.) [The code uses some R packages, but then moves to SAS, rather surprisingly.]
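Steps 2 and 3 above are straightforward to sketch in Python, starting from any Latin square produced by step 1 (which I do not reproduce, as it requires the clique machinery); the function name is mine:

```python
import random

def randomise_latin_square(L):
    """Apply steps 2 and 3 to a Latin square L, given as a list of
    rows over symbols 1..n: randomly permute the symbols 2,...,n
    (leaving symbol 1 fixed), then randomly permute the columns."""
    n = len(L)
    new_symbols = [1] + random.sample(range(2, n + 1), n - 1)
    relabel = {s: new_symbols[s - 1] for s in range(1, n + 1)}
    Lp = [[relabel[x] for x in row] for row in L]   # step 2: L -> L'
    cols = random.sample(range(n), n)               # step 3: L' -> L''
    return [[row[j] for j in cols] for row in Lp]
```

Both operations preserve the Latin square property, so the output is again a Latin square.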

## ugly graph of the day

Posted in Books, Statistics with tags , , on February 25, 2012 by xi'an

Julien pointed out to me this terrible graph, where variations cannot be perceived and where the circles are meaningless! Two three-point curves would have been way more explicit!!!

## Back from Philly

Posted in R, Statistics, Travel, University life with tags , , , , , , , , , on December 21, 2010 by xi'an

The conference in honour of Larry Brown was quite exciting, with lots of old friends gathered in Philadelphia and lots of great talks, either recollecting major works of Larry and coauthors or presenting fairly interesting new works. Unsurprisingly, a large chunk of the talks was about admissibility and minimaxity, with John Hartigan starting the day by re-reading Larry's masterpiece, his 1971 paper linking admissibility and recurrence of associated processes, a paper I always had trouble studying because of both its depth and its breadth! Bill Strawderman presented a new if classical minimaxity result on matrix estimation and Anirban DasGupta some large-dimension consistency results where the choice of the distance (total variation versus Kullback deviance) was irrelevant. Ed George and Susie Bayarri both presented their recent work on g-priors and their generalisation, which directly relates to our recent paper on that topic. In the afternoon, Holger Dette showed some impressive mathematics based on Elfving's representation and used in building optimal designs. I particularly appreciated the results of a joint work with Larry presented by Robert Wolpert, where they classified all stationary infinitely divisible time-reversible integer-valued Markov processes. It produced a surprisingly small list of four cases, two being trivial. The final talk of the day was about homology, which sounded a priori off-putting, but Robert Adler made it extremely entertaining, so much so that I even failed to resent the powerpoint tricks! The next morning, Mark Low gave a very emotional but also quite illuminating talk about the first results he got during his PhD thesis at Cornell (completing the thesis when I was using Larry's office!). Brenda McGibbon went back to the three truncated Poisson papers she wrote with Ian Johnstone (via gruesome 13-hour bus rides from Montréal to Ithaca!) and produced an illuminating explanation of the maths at work for moving from the Gaussian to the Poisson case in a most pedagogical and enjoyable fashion. Larry Wasserman explained the concepts at work behind the lasso for graphs, entertaining us with witty acronyms on the side, and leaving out about 3/4 of his slides! (The research group involved in this project produced an R package called huge.) Joe Eaton ended the morning with a very interesting result showing that using the right Haar measure as a prior leads to a matching prior, then showing why the consequences of the result are limited by invariance itself. Unfortunately, it was then time for me to leave and I will miss (in both meanings of the term) the other half of the talks. Especially missing Steve Fienberg's talk for the third time in three weeks! Again, what I appreciated most during those two days (besides the fact that we were all reunited on the very day of Larry's birthday!) was the pains most speakers took to expose older results in a most synthetic and intuitive manner… I also got new ideas about generalising our parallel computing paper for random walk Metropolis-Hastings algorithms and for optimising across permutation transforms.