## ISBA 2016 [#7]

Posted in Mountains, pictures, Running, Statistics, Travel, Wines with tags , , , , , , , , , , , , , on June 20, 2016 by xi'an

This series of posts is most probably getting by now an imposition on the ‘Og readership, which either attended ISBA 2016 and does (do?) not need my impressions or did not attend and hence does (do?) not need vague impressions about talks they (it?) did not see, but indulge me in reminiscing about this last ISBA meeting (or more reasonably ignore this post altogether). Now that I am back home (with most of my Sard wine bottles intact!, and a good array of Sard cheeses).

This meeting seems to be the largest ISBA meeting ever, with hundreds of young statisticians taking part in it (despite my early misgivings about the deterrent represented by the overall cost of attending the meeting. I presume holding the meeting in Europe made it easier and cheaper for most Europeans to attend (and hopefully the same will happen in Edinburgh in 2018!), as was the (somewhat unsuspected) wide availability of rental alternatives in the close vicinity of the conference resort. I also presume the same travel opportunities would not have been true in Banff, although local costs would have been lower. It was fantastic to see so many new researchers interested in Bayesian statistics and to meet some of them. And to have more sessions run by the j-Bayes section of ISBA (although I found it counterproductive that such sessions do not focus on a thematically coherent theme). As a result, the meeting was more intense than ever and I found it truly exhausting, despite skipping most poster sessions. Maybe also because I did not skip a single session thanks to the availability of an interesting theme for each block in the schedule. (And because I attended more [great] Sard dinners than I originally intended.) Having five sessions in parallel indeed means there is a fabulous offer of themes for every taste. It also means there are inevitably conflicts when picking one’s session.

Back to poster sessions, I feel I missed an essential part of the meeting, which made ISBA meetings so unique, but it also seems to me the organisation of those sessions should be reconsidered against the rise in attendance. (And my growing inability to stay up late!) One solution suggested by my recent AISTATS experience is to select posters towards lowering the number of posters in the four poster sessions. The success rate for the Cadiz meeting was 35%.) The obvious downsizes are the selection process (but this was done quite efficiently for AISTATS) and the potential reduction in the number of participants. A medium ground could see a smaller fraction of posters to be selected by this process (and published one way or another as in machine-learning conferences) and presented during the evening poster sessions, with other posters being given during the coffee breaks [which certainly does not help in reducing the intensity of the schedule]. Another and altogether solution is to extend the parallelism of oral sessions to poster sessions, by regrouping them into five or six themes or keywords chosen by the presenters and having those presented in different rooms to split the attendance down to human level and tolerable decibels. Nothing preventing participants to visit several rooms in a given evening. Or to keep posters for several nights in a row if the number of rooms allows.

It may also be that this edition of ISBA 2016 sees the end of the resort-style meeting in the spirit of the early Valencia meetings. Edinburgh 2018 will certainly be an open-space conference in that meals and lodgings will be “on” the participants who may choose where and how much. I have heard many times the argument that conferences held in single hotels or resorts facilitated the contacts between young and senior researchers, but I fear this is not sustainable against the growth of the audience. Holding the meeting in a reasonably close and compact location, as a University building, should allow for a sufficient degree of interaction, as was the case at ISBA 2016. (Kerrie Mengersen also suggested that a few restaurants nearby could be designated as “favourites” for participants to interact at dinner time.) Another suggestion to reinforce networking and interacting would be to hold more satellite workshops before the main conference. It seems there could be a young Bayesian workshop in England the prior week as well as a summer short course on simulation methods.

Organising meetings is getting increasingly complex and provides few rewards at the academic level, so I am grateful to the organisers of ISBA 2016 to have agreed to carry the burden this year. And to the scientific committee for setting the quality bar that high. (A special thought too for my friend Walter Racugno who had the ultimate bad luck of having an accident the very week of the meeting he had contributed to organise!)

[Even though I predict this is my last post on ISBA 2016 I would be delighted to have guest posts on others’ impressions on the meeting. Feel free to send me entries!]

## ISBA 2016

Posted in Kids, Statistics, Travel, University life, Wines with tags , , , , , , , , , , on June 14, 2016 by xi'an

I remember fondly the early Valencia meetings where we did not have to pick between sessions. Then one year there were two sessions and soon more. And we now have to pick among equally tantalising sessions. [Complaint of the super wealthy, I do realise.] After a morning trip to San’Antioco and the southern coast of Sardinia, I started my ISBA 2016 with an not [that Bayesian] high dimension session with Michael Jordan (who gave a talk related to his MCMski lecture), Isa Verdinelli and Larry Wasserman.

Larry gave a [non-Bayesian, what else?!] talk on the problem of data splitting versus double use of the same data. Or rather using a model index estimated from a given dataset to estimate the properties of the mean of the same data. As in model selection. While splitting the data avoids all sorts of problem, not splitting the data but using a different loss function could avoid the issue. (And the infinite regress that if we keep conducting inference, we may have to split further and further the data.) Namely, if we were looking only at quantities that do not vary across models. So it is surprising that prediction get affected by this.

In a second session around Bayesian tests and model choice, Sarah Filippi presented the Bayesian non-parametric test she devised with Chris Holmes, using Polya trees. And mentioned our testing-by-mixture approach as a valuable alternative! Veronika Rockova talked about her new approach to efficient variable selection by spike-and-slab priors, through a mix of particle MCMC and EM, plus some variational Bayes motivations. (She also mentioned extensions by repulsive sampling through the pinball sampler, of which her recent AISTATS paper reminded me.)

Later in the evening, I figured out that the poster sessions that make the ISBA/Valencia meetings so unique are alas out of reach for me as the level of noise and my reduced hearing capacities (!) make impossible any prolonged discussion on any serious notion. No poster session for ‘Og’s men!, then, even though I can hang out at the fringe and chat with friends!

## dynamic mixtures [at NBBC15]

Posted in R, Statistics with tags , , , , , , , , , , , , on June 18, 2015 by xi'an

A funny coincidence: as I was sitting next to Arnoldo Frigessi at the NBBC15 conference, I came upon a new question on Cross Validated about a dynamic mixture model he had developed in 2002 with Olga Haug and Håvård Rue [whom I also saw last week in Valencià]. The dynamic mixture model they proposed replaces the standard weights in the mixture with cumulative distribution functions, hence the term dynamic. Here is the version used in their paper (x>0)

$(1-w_{\mu,\tau}(x))f_{\beta,\lambda}(x)+w_{\mu,\tau}(x)g_{\epsilon,\sigma}(x)$

where f is a Weibull density, g a generalised Pareto density, and w is the cdf of a Cauchy distribution [all distributions being endowed with standard parameters]. While the above object is not a mixture of a generalised Pareto and of a Weibull distributions (instead, it is a mixture of two non-standard distributions with unknown weights), it is close to the Weibull when x is near zero and ends up with the Pareto tail (when x is large). The question was about simulating from this distribution and, while an answer was in the paper, I replied on Cross Validated with an alternative accept-reject proposal and with a somewhat (if mildly) non-standard MCMC implementation enjoying a much higher acceptance rate and the same fit.

## An objective prior that unifies objective Bayes and information-based inference

Posted in Books, pictures, Statistics, Travel, University life with tags , , , , , , , on June 8, 2015 by xi'an

During the Valencia O’Bayes 2015 meeting, Colin LaMont and Paul Wiggins arxived a paper entitled “An objective prior that unifies objective Bayes and information-based inference”. It would have been interesting to have the authors in Valencia, as they make bold claims about their w-prior as being uniformly and maximally uninformative. Plus achieving this unification advertised in the title of the paper. Meaning that the free energy (log transform of the inverse evidence) is the Akaike information criterion.

The paper starts by defining a true prior distribution (presumably in analogy with the true value of the parameter?) and generalised posterior distributions as associated with any arbitrary prior. (Some notations are imprecise, check (3) with the wrong denominator or the predictivity that is supposed to cover N new observations on p.2…) It then introduces a discretisation by considering all models within a certain Kullback divergence δ to be undistinguishable. (A definition that does not account for the assymmetry of the Kullback divergence.) From there, it most surprisingly [given the above discretisation] derives a density on the whole parameter space

$\pi(\theta) \propto \text{det} I(\theta)^{1/2} (N/2\pi \delta)^{K/2}$

where N is the number of observations and K the dimension of θ. Dimension which may vary. The dependence on N of the above is a result of using the predictive on N points instead of one. The w-prior is however defined differently: “as the density of indistinguishable models such that the multiplicity is unity for all true models”. Where the log transform of the multiplicity is the expected log marginal likelihood minus the expected log predictive [all expectations under the sampling distributions, conditional on θ]. Rather puzzling in that it involves the “true” value of the parameter—another notational imprecision, since it has to hold for all θ’s—as well as possibly improper priors. When the prior is improper, the log-multiplicity is a difference of two terms such that the first term depends on the constant used with the improper prior, while the second one does not…  Unless the multiplicity constraint also determines the normalising constant?! But this does not seem to be the case when considering the following section on normalising the w-prior. Mentioning a “cutoff” for the integration that seems to pop out of nowhere. Curiouser and curiouser. Due to this unclear handling of infinite mass priors, and since the claimed properties of uniform and maximal uninformativeness are not established in any formal way, and since the existence of a non-asymptotic solution to the multiplicity equation is neither demonstrated, I quickly lost interest in the paper. Which does not contain any worked out example. Read at your own risk!

## O-Bayes15 [day #1]

Posted in Books, pictures, Running, Statistics, Travel, University life, Wines with tags , , , , , , on June 3, 2015 by xi'an

So here we are back together to talk about objective Bayes methods, and in the City of Valencià as well.! A move back to a city where the 1998 O’Bayes took place. In contrast with my introductory tutorial, the morning tutorials by Luis Pericchi and Judith Rousseau were investigating fairly technical and advanced, Judith looking at the tools used in the frequentist (Bernstein-von Mises) analysis of priors, with forays in empirical Bayes, giving insights into a wide range of recent papers in the field. And Luis covering works on Bayesian robustness in the sense of resisting to over-influential observations. Following works of him and of Tony O’Hagan and coauthors. Which means characterising tails of prior versus sampling distribution to allow for the posterior reverting to the prior in case of over-influential datapoints. Funny enough, after a great opening by Carmen and Ed remembering Susie, Chris Holmes also covered Bayesian robust analysis. More in the sense of incompletely or mis-  specified models. (On the side, rekindling one comment by Susie and the need to embed robust Bayesian analysis within decision theory.) Which was also much Chris’ point, in line with the recent Watson and Holmes’ paper. Dan Simpson in his usual kick-the-anthill-real-hard-and-set-fire-to-it discussion pointed out the possible discrepancy between objective and robust Bayesian analysis. (With lines like “modern statistics has proven disruptive to objective Bayes”.) Which is not that obvious because the robust approach simply reincorporates the decision theory within the objective framework. (Dan also concluded with the comic strip below, whose message can be interpreted in many ways…! Or not.)

The second talk of the afternoon was given by Veronika Ročková on a novel type of spike-and-slab prior to handle sparse regression, bringing an alternative to the standard Lasso. The prior is a mixture of two Laplace priors whose scales are constrained in connection with the actual number of non-zero coefficients. I had not heard of this approach before (although Veronika and Ed have an earlier paper on a spike-and-slab prior to handle multicolinearity that Veronika presented in Boston last year) and I was quite impressed by the combination of minimax properties and practical determination of the scales. As well as by the performances of this spike-and-slab Lasso. I am looking forward the incoming paper!

The day ended most nicely in the botanical gardens of the University of Valencià, with an outdoor reception surrounded by palm trees and parakeet cries…

## O’Bayes 2015: back in València

Posted in pictures, Statistics, Travel, University life with tags , , , , , on September 11, 2014 by xi'an

The next O’Bayes meeting (more precisely the International Workshop on Objective Bayes Methodology, O-Bayes15), will take place in València, Spain, on June 1-4, 2015. This is the second time an O’Bayes conference takes place in València, after the one José Miguel Bernardo organised in 1998 there.  The principal objectives of O-Bayes15 will be to facilitate the exchange of recent research developments in objective Bayes theory, methodology and applications, and related topics (like limited information Bayesian statistics), to provide opportunities for new researchers, and to establish new collaborations and partnerships. Most importantly, O-Bayes15 will be dedicated to our friend Susie Bayarri, to celebrate her life and contributions to Bayesian Statistics. Check the webpage of O-Bayes15 for the program (under construction) and the practical details. Looking forward to the meeting and hopeful for a broadening of the basis of the O’Bayes community and of its scope!

## Cancun, ISBA 2014 [½ day #2]

Posted in pictures, Running, Statistics, Travel, University life with tags , , , , , , , , , , , , on July 19, 2014 by xi'an

Half-day #2 indeed at ISBA 2014, as the Wednesday afternoon kept to the Valencia tradition of free time, and potential cultural excursions, so there were only talks in the morning. And still the core poster session at (late) night. In which my student Kaniav Kamari presented a poster on a current project we are running with Kerrie Mengersen and Judith Rousseau on the replacement of the standard Bayesian testing setting with a mixture representation. Being half-asleep by the time the session started, I did not stay long enough to collect data on the reactions to this proposal, but the paper should be arXived pretty soon. And Kate Lee gave a poster on our importance sampler for evidence approximation in mixtures (soon to be revised!). There was also an interesting poster about reparameterisation towards higher efficiency of MCMC algorithms, intersecting with my long-going interest in the matter, although I cannot find a mention of it in the abstracts. And I had a nice talk with Eduardo Gutierrez-Pena about infering on credible intervals through loss functions. There were also a couple of appealing posters on g-priors. Except I was sleepwalking by the time I spotted them… (My conference sleeping pattern does not work that well for ISBA meetings! Thankfully, both next editions will be in Europe.)

Great talk by Steve McEachern that linked to our ABC work on Bayesian model choice with insufficient statistics, arguing towards robustification of Bayesian inference by only using summary statistics. Despite this being “against the hubris of Bayes”… Obviously, the talk just gave a flavour of Steve’s perspective on that topic and I hope I can read more to see how we agree (or not!) on this notion of using insufficient summaries to conduct inference rather than trying to model “the whole world”, given the mistrust we must preserve about models and likelihoods. And another great talk by Ioanna Manolopoulou on another of my pet topics, capture-recapture, although she phrased it as a partly identified model (as in Kline’s talk yesterday). This related with capture-recapture in that when estimating a capture-recapture model with covariates, sampling and inference are biased as well. I appreciated particularly the use of BART to analyse the bias in the modelling. And the talk provided a nice counterpoint to the rather pessimistic approach of Kline’s.

Terrific plenary sessions as well, from Wilke’s spatio-temporal models (in the spirit of his superb book with Noel Cressie) to Igor Prunster’s great entry on Gibbs process priors. With the highly significant conclusion that those processes are best suited for (in the sense that they are only consistent for) discrete support distributions. Alternatives are to be used for continuous support distributions, the special case of a Dirichlet prior constituting a sort of unique counter-example. Quite an inspiring talk (even though I had a few micro-naps throughout it!).

I shared my afternoon free time between discussing the next O’Bayes meeting (2015 is getting very close!) with friends from the Objective Bayes section, getting a quick look at the Museo Maya de Cancún (terrific building!), and getting some work done (thanks to the lack of wireless…)