Archive for Royal Statistical Society

Arnak Dalalyan at the RSS Journal Webinar

Posted in Books, pictures, Statistics, Travel, University life on October 15, 2023 by xi'an

My friend and CREST colleague Arnak Dalalyan will (re)present [online] a Read Paper at the RSS on 31 October with my friends Hani Doss and Alain Durmus as discussants:

‘Theoretical guarantees for approximate sampling from smooth and log-concave densities’

Arnak Dalalyan, ENSAE Paris, France

Sampling from various kinds of distributions is an issue of paramount importance in statistics since it is often the key ingredient for constructing estimators, test procedures or confidence intervals. In many situations, exact sampling from a given distribution is impossible or computationally expensive and, therefore, one needs to resort to approximate sampling strategies. However, there is no well-developed theory providing meaningful non-asymptotic guarantees for the approximate sampling procedures, especially in high dimensional problems. The paper makes some progress in this direction by considering the problem of sampling from a distribution having a smooth and log-concave density defined on ℝᵖ, for some integer p > 0. We establish non-asymptotic bounds for the error of approximating the target distribution by the distribution obtained by the Langevin Monte Carlo method and its variants. We illustrate the effectiveness of the established guarantees with various experiments. Underlying our analysis are insights from the theory of continuous time diffusion processes, which may be of interest beyond the framework of log-concave densities that are considered in the present work.
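For concreteness, here is a minimal sketch, mine rather than the paper's code, of the unadjusted Langevin Monte Carlo scheme analysed in the paper, run on a toy standard Gaussian target; the target, the dimension p = 10, and the step size h are purely illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def grad_log_density(x):
    # gradient of the log-density of a standard Gaussian target,
    # a smooth and log-concave density on R^p
    return -x

def langevin_monte_carlo(x0, n_iter, h):
    """Unadjusted Langevin algorithm:
    x_{k+1} = x_k + h grad log pi(x_k) + sqrt(2h) xi_k, with xi_k ~ N(0, I)."""
    x = np.asarray(x0, dtype=float)
    samples = np.empty((n_iter, x.size))
    for k in range(n_iter):
        x = x + h * grad_log_density(x) + np.sqrt(2 * h) * rng.standard_normal(x.size)
        samples[k] = x
    return samples

# p = 10 with a small step size; the non-asymptotic bounds of the paper
# control the distance between the law of x_k and the target in terms of p, h, k
draws = langevin_monte_carlo(np.zeros(10), n_iter=10_000, h=0.01)
print(draws.mean(axis=0))  # should be close to the zero mean of the target
```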

Estimating means of bounded random variables by betting

Posted in Books, Statistics, University life on April 9, 2023 by xi'an

Ian Waudby-Smith and Aaditya Ramdas are presenting next month a Read Paper to the Royal Statistical Society in London on constructing a conservative confidence interval on the mean of a bounded random variable. Here is an extended abstract from within the paper:

For each m ∈ [0, 1], we set up a “fair” multi-round game of statistician against nature whose payoff rules are such that if the true mean happened to equal m, then the statistician can neither gain nor lose wealth in expectation (their wealth in the m-th game is a nonnegative martingale), but if the mean is not m, then it is possible to bet smartly and make money. Each round involves the statistician making a bet on the next observation, nature revealing the observation and giving the appropriate (positive or negative) payoff to the statistician. The statistician then plays all these games (one for each m) in parallel, starting each with one unit of wealth, and possibly using a different, adaptive, betting strategy in each. The 1 − α confidence set at time t consists of all m ∈ [0, 1] such that the statistician’s money in the corresponding game has not crossed 1/α. The true mean μ will be in this set with high probability.
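To make the game concrete, here is a minimal numerical sketch, my own illustration rather than the authors' implementation, with a crude constant bet λ instead of the paper's adaptive strategies, and an arbitrary Beta sample standing in for the bounded observations.

```python
import numpy as np

rng = np.random.default_rng(1)
alpha = 0.05
grid = np.linspace(0.001, 0.999, 999)   # candidate means m
w_plus = np.ones_like(grid)             # wealth when betting "mean > m"
w_minus = np.ones_like(grid)            # wealth when betting "mean < m"
lam = 0.5                               # fixed bet size (the paper bets adaptively)
in_set = np.ones_like(grid, dtype=bool)

for xi in rng.beta(2, 5, size=1000):    # bounded observations on [0,1], mean 2/7
    # payoffs 1 +/- lam*(x_i - m) stay positive for this lam, and each wealth
    # process is a nonnegative martingale when the true mean equals m
    w_plus *= 1 + lam * (xi - grid)
    w_minus *= 1 - lam * (xi - grid)
    # the average of two fair games is fair; exclude m once its wealth crosses 1/alpha
    in_set &= 0.5 * (w_plus + w_minus) < 1 / alpha

print(grid[in_set].min(), grid[in_set].max())  # endpoints of a 95% interval for the mean
```

By Ville's inequality, a nonnegative martingale started at one unit of wealth crosses 1/α with probability at most α, which is where the coverage guarantee comes from.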

I read the paper on the flight back from Venice and was impressed by its universality, especially for a non-asymptotic method, while finding the expository style somewhat unusual for Series B, with notions defined late, if defined at all. As an aside, I also enjoyed the historical connection to Jean Ville‘s 1939 PhD thesis (examined by Borel, by his advisor Fréchet, and by Garnier) on a critical examination of [von Mises’] Kollektive. (The story by Glenn Shafer of Ville’s life till the war is remarkable, with the de Beauvoir-Sartre couple making a surprising and rather inglorious appearance!) Ville was himself inspired by a meeting with Wald while in Berlin. The paper remains quite allusive about Ville‘s contribution, though, while arguing for its advances relative to Ville’s work… The confidence intervals (and sequences) depend on a supermartingale construction of the form

M_t(m):=\prod_{i=1}^t \exp\left\{ \lambda_i(X_i-m)-v_i\psi(\lambda_i)\right\}

which allows for a universal coverage guarantee for the derived intervals (and can be optimised in λ). As I am getting confused at this point about the overall purpose of the analysis, besides providing an efficient confidence construction, and as I lack background on martingales, betting, and sequential testing, I will not contribute to the discussion. Especially since ChatGPT cannot help me much, with its main “criticisms” (which I managed to receive while in Italy, despite the Italian Government banning the chatbot!)

However, there are also some potential limitations and challenges to this approach. One limitation is that the accuracy of the method is dependent on the quality of the prior distribution used to set the odds. If the prior distribution is poorly chosen, the resulting estimates may be inaccurate. Additionally, the method may not work well for more complex or high-dimensional problems, where there may not be a clear and intuitive way to set up the betting framework.

and

Another potential consequence is that the use of a betting framework could raise ethical concerns. For example, if the bets are placed on sensitive or controversial topics, such as medical research or political outcomes, there may be concerns about the potential for manipulation or bias in the betting markets. Additionally, the use of betting as a method for scientific or policy decision-making may raise questions about the appropriate role of gambling in these contexts.

being totally off the radar… (No prior involved, no real-life consequence for betting, no gambling.)

Bayesian inference: challenges, perspectives, and prospects

Posted in Books, Statistics, University life on March 29, 2023 by xi'an

Over the past year, Judith, Michael and I edited a special issue of Philosophical Transactions of the Royal Society on Bayesian inference: challenges, perspectives, and prospects, in celebration of the current President of the Royal Society, Adrian Smith, and his contributions to Bayesian analysis that have impacted the field to this day. The issue is now out! The following is the beginning of our introduction to the series.

When contemplating his past achievements, it is striking to align the emergence of massive advances in these fields with some of his papers or books. For instance, Lindley and Smith’s ‘Bayes Estimates for the Linear Model’ (1971), a Read Paper at the Royal Statistical Society, makes the case for the Bayesian analysis of this most standard statistical model, as well as emphasizing the notion of exchangeability that is foundational in Bayesian statistics, and paving the way for the emergence of hierarchical Bayesian modelling. It thus makes a link between the early days of Bruno de Finetti, whose work Adrian Smith translated into English, and the current research in non-parametric and robust statistics. Bernardo and Smith’s masterpiece, Bayesian Theory (1994), sets statistical inference within decision- and information-theoretic frameworks in a most elegant and universal manner that could be deemed a Bourbaki volume for Bayesian statistics, had this classification endeavour reached further than pure mathematics. It also emphasizes the central role of hierarchical modelling in the construction of priors, as exemplified in Carlin et al.’s ‘Hierarchical Bayesian analysis of changepoint problems’ (1992).

The series of papers published in 1990 by Alan Gelfand & Adrian Smith, especially ‘Sampling-Based Approaches to Calculating Marginal Densities’ (1990), is overwhelmingly perceived as the birth date of modern Markov chain Monte Carlo (MCMC) methods, as it brought to the whole statistics community (and quickly to wider communities) the realization that MCMC simulation was the sesame to unlock complex modelling issues. The consequences for the adoption of Bayesian modelling by non-specialists are enormous and long-lasting. Similarly, Gordon et al.’s ‘Novel approach to nonlinear/non-Gaussian Bayesian state estimation’ (1993) is considered the birthplace of sequential Monte Carlo, aka particle filtering, with considerable consequences in tracking, robotics, econometrics and many other fields. Titterington, Smith and Makov’s reference book, ‘Statistical Analysis of Finite Mixture Distributions’ (1985), is a precursor in the formalization of heterogeneous data structures, paving the way for the incoming MCMC resolutions like Tanner & Wong (1987), Gelman & King (1990) and Diebolt & Robert (1990). Denison et al.’s book, ‘Bayesian Methods for Nonlinear Classification and Regression’ (2002), is another testimony to the influence of Adrian Smith on the field, stressing the emergence of robust and general classification and nonlinear regression methods to analyse complex data, prefiguring in a way the later emergence of machine-learning methods, with the additional Bayesian assessment of uncertainty. It also brings forward the capacity of operating Bayesian non-parametric modelling that is now broadly accepted, following a series of papers by Denison et al. in the late 1990s, on Bayesian versions of CART and MARS.

We are quite grateful to the authors contributing to this volume, namely Joshua J. Bon, Adam Bretherton, Katie Buchhorn, Susanna Cramb, Christopher Drovandi, Conor Hassan, Adrianne L. Jenner, Helen J. Mayfield, James M. McGree, Kerrie Mengersen, Aiden Price, Robert Salomone, Edgar Santos-Fernandez, Julie Vercelloni and Xiaoyu Wang, Afonso S. Bandeira, Antoine Maillard, Richard Nickl and Sven Wang, Fan Li, Peng Ding and Fabrizia Mealli, Matthew Stephens, Peter D. Grünwald, Sumio Watanabe, Peter Müller, Noirrit K. Chandra and Abhra Sarkar, Kori Khan and Alicia Carriquiry, Arnaud Doucet, Eric Moulines and Achille Thin, Beatrice Franzolini, Andrea Cremaschi, Willem van den Boom and Maria De Iorio, Sandra Fortini and Sonia Petrone, Sylvia Frühwirth-Schnatter, Sara Wade, Chris C. Holmes and Stephen G. Walker, Lizhen Nie and Veronika Ročková. Some of the papers are open-access, if not all, hence enjoy them!

martingale posteriors

Posted in Books, Statistics, University life on November 7, 2022 by xi'an

A new Royal Statistical Society Read Paper featuring Edwin Fong, Chris Holmes, and Steve Walker. Starting from the predictive

p(y_{n+1:+\infty}|y_{1:n})\ \ \ (1)

rather than from the posterior distribution on the parameter is a fairly novel idea, also pursued by Sonia Petrone and some of her coauthors. It thus adopts a de Finetti perspective, while adding some substance to the rather metaphysical nature of the original. It however relies on the “existence” of an infinite sample in (1), which assumes a form of underlying model à la von Mises, or at least an infinite population. The representation of a parameter θ as a function of an infinite sequence comes as a shock at first, but starts making sense when considering it as a functional of the underlying distribution. Of course, trading (modelling) a random “opaque” parameter θ for (envisioning) an infinite sequence of random (un)observations may sound like a sure loss rather than a great deal, but it gives substance to the epistemic uncertainty about a distributional parameter, even when a model is assumed, as in Example 1, which defines θ in the usual parametric way (i.e., as the mean of the iid variables). Furthermore, the link with the bootstrap, and even more with the Bayesian bootstrap, becomes clear when θ is seen this way.
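For illustration, here is a minimal sketch of the predictive-resampling idea, using the Pólya-urn predictive for which the martingale posterior reduces to Rubin's (1981) Bayesian bootstrap, and taking θ to be the mean as in Example 1; the horizon and the number of replications are arbitrary truncations of the infinite scheme.

```python
import numpy as np

rng = np.random.default_rng(2)
y = rng.normal(loc=1.0, scale=2.0, size=50)       # observed sample y_{1:n}

def predictive_resample(y, horizon=5_000):
    """Forward-sample y_{n+1:n+horizon} from a Polya-urn predictive
    (each new point drawn uniformly from the current pool), then
    return theta = mean of the completed sequence."""
    pool = list(y)
    for _ in range(horizon):
        pool.append(pool[rng.integers(len(pool))])
    return np.mean(pool)

# each completed sequence yields one draw of theta; the spread of these
# draws is the epistemic uncertainty about the mean functional
theta = np.array([predictive_resample(y) for _ in range(500)])
print(theta.mean(), theta.std())
```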

I am always a fan of minimal loss approaches, but I wonder at (2.4), as it defines either a moment or a true parameter value that depends on the parametric family indexed by θ. Hence it does not exist outside the primary definition of said parametric family, which limits its appeal. The subsequent construct of the empirical cdf based on the infinite sequence as providing the θ function is elegant and connects with the bootstrap, but what is its Bayesian justification? (I did not read Appendix C.2 in full detail but could not spot a prior on F.)

“The resemblance of the martingale posterior to a bootstrap estimator should not have gone unnoticed”

While I completely missed the resemblance, it is indeed the case that, if the predictive at each step is built from the earlier “sample”, the support is not going to evolve. However, this is not particularly exciting, as the resulting Bayesian non-parametric estimator is most rudimentary. This seems to bring us back to Rubin (1981)?! A Dirichlet prior is mentioned with no further detail. And I am getting confused at the complete lack of structure, prior, &tc. It seems to contradict the next section:

“While the prescription of (3.1) remains a subjective task, we find it to be no more subjective than the selection of a likelihood function”

Copulas!!! Again, I am very glad to see copulas involved in the analysis. However, I remain unclear as to why Corollary 1 implies that any sequence of copulas could do the job. Further, why does the Gaussian copula appear as the default choice? What is the computing cost of the update (4.4) after k steps? Similarly, (4.7) uses a very special form of copula, with independent-across-dimension increments. I am also missing a guided tour of the implementation, as it sounds explosive in book-keeping and multiplications, while relying on a single hyperparameter in (4.5.2)?
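For the record, here is a schematic grid implementation of a Gaussian-copula update of the predictive; this is my own paraphrase of the general recursion rather than the authors' exact (4.4), and the decaying weight sequence is merely one sensible choice.

```python
import numpy as np
from scipy.stats import norm

def gaussian_copula_density(u, v, rho):
    """Bivariate Gaussian copula density c_rho(u, v), clipped away from 0 and 1."""
    a = norm.ppf(np.clip(u, 1e-10, 1 - 1e-10))
    b = norm.ppf(np.clip(v, 1e-10, 1 - 1e-10))
    return np.exp(-(rho**2 * (a**2 + b**2) - 2 * rho * a * b)
                  / (2 * (1 - rho**2))) / np.sqrt(1 - rho**2)

grid = np.linspace(-10, 10, 2001)
dy = grid[1] - grid[0]
p = norm.pdf(grid)                   # initial predictive density p_0
P = np.cumsum(p) * dy                # its cdf P_0

rng = np.random.default_rng(3)
for i, y_new in enumerate(rng.normal(2.0, 1.0, size=200), start=1):
    a_i = (2 - 1 / i) / (i + 1)      # a decaying weight sequence (one choice)
    v = np.interp(y_new, grid, P)    # P_{i-1}(y_i)
    # schematic recursion: p_i(y) = p_{i-1}(y) [(1 - a_i) + a_i c_rho(P_{i-1}(y), P_{i-1}(y_i))]
    p = p * ((1 - a_i) + a_i * gaussian_copula_density(P, v, rho=0.9))
    P = np.cumsum(p) * dy            # refresh the cdf from the updated density
    P /= P[-1]                       # guard against discretization drift

print(np.sum(grid * p) * dy)         # predictive mean, drifting toward the data mean
```

Each update is a single sweep over the grid, so k steps cost O(k × grid size), which is part of the book-keeping I would have liked to see spelled out.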

In the illustration section, the use of the galaxy dataset may fail to appeal to Radford Neal, in a spirit similar to Chopin’s & Ridgway’s call to leave the Pima Indians alone, since he delivered a passionate lecture on the inappropriateness of a mixture model for this dataset (at ICMS in 2001). I am unclear as to where the number of modes is extracted from the infinite predictive. What is $\theta$ in this case?

statistical aspects of climate change [discuss]

Posted in Books, pictures, Statistics, Travel, University life on August 4, 2022 by xi'an


As part of its annual conference in Aberdeen, Scotland, the RSS is organising a discussion meeting on two papers presented on Wednesday 14 September 2022, 5.00PM – 7.00PM (GMT+1), with free on-line registration.

Two papers will be presented:

‘Assessing present and future risk of water damage using building attributes, meteorology, and topography’ by Heinrich-Mertsching et al.
‘The importance of context in extreme value analysis with application to extreme temperatures in the USA and Greenland’ by Clarkson et al.

“The Discussion Meeting at this year’s RSS conference in Aberdeen will feature two papers on the Statistical Aspects of Climate Change. The Discussion Meetings Committee chose this topic area motivated by the UN Climate Change Conference (COP26) held in Glasgow last year and because climate change and the environment is one of the RSS’s six current campaigning priorities for 2022.

You are welcome to listen to the speakers and join in the discussion of the papers that follows the presentations. All the proceedings will be published in a forthcoming issue of the Journal of the Royal Statistical Society, Series C (Applied Statistics).”

Dr Shirley Coleman, Chair and Honorary Officer for Discussion Meetings