Archive for seminar

complex Cauchys

Posted in Books, pictures, Statistics, Travel, University life with tags , , , , , , , , , , , on February 8, 2018 by xi'an

During a visit of Don Fraser and Nancy Reid to Paris-Dauphine where Nancy gave a nice introduction to confidence distributions, Don pointed out to me a 1992 paper by Peter McCullagh on the Cauchy distribution. Following my recent foray into the estimation of the Cauchy location parameter. Among several most interesting aspects of the Cauchy, Peter re-expressed the density of a Cauchy C(θ¹,θ²) as

f(x;θ¹,θ²) = |θ²| / |x-θ|²

when θ=θ¹+ιθ² [a complex number on the half-plane]. Denoting the Cauchy C(θ¹,θ²) as Cauchy C(θ), the property that the ratio aX+b/cX+d follows a Cauchy for all real numbers a,b,c,d,

C(aθ+b/cθ+d)

[when X is C(θ)] follows rather readily. But then comes the remark that

“those properties follow immediately from the definition of the Cauchy as the ratio of two correlated normals with zero mean.”

which seems to relate to the conjecture solved by Natesh Pillai and Xiao-Li Meng a few years ago. But the fact that  a ratio of two correlated centred Normals is Cauchy is actually known at least from the1930’s, as shown by Feller (1930, Biometrika) and Geary (1930, JRSS B).

Bayesian regression trees [seminar]

Posted in pictures, Statistics, University life with tags , , , , , , , , , , on January 26, 2018 by xi'an
During her visit to Paris, Veronika Rockovà (Chicago Booth) will give a talk in ENSAE-CREST on the Saclay Plateau at 2pm. Here is the abstract
Posterior Concentration for Bayesian Regression Trees and Ensembles
(joint with Stephanie van der Pas)Since their inception in the 1980’s, regression trees have been one of the more widely used non-parametric prediction methods. Tree-structured methods yield a histogram reconstruction of the regression surface, where the bins correspond to terminal nodes of recursive partitioning. Trees are powerful, yet  susceptible to over-fitting.  Strategies against overfitting have traditionally relied on  pruning  greedily grown trees. The Bayesian framework offers an alternative remedy against overfitting through priors. Roughly speaking, a good prior  charges smaller trees where overfitting does not occur. While the consistency of random histograms, trees and their ensembles  has been studied quite extensively, the theoretical understanding of the Bayesian counterparts has  been  missing. In this paper, we take a step towards understanding why/when do Bayesian trees and their ensembles not overfit. To address this question, we study the speed at which the posterior concentrates around the true smooth regression function. We propose a spike-and-tree variant of the popular Bayesian CART prior and establish new theoretical results showing that  regression trees (and their ensembles) (a) are capable of recovering smooth regression surfaces, achieving optimal rates up to a log factor, (b) can adapt to the unknown level of smoothness and (c) can perform effective dimension reduction when p>n. These results  provide a piece of missing theoretical evidence explaining why Bayesian trees (and additive variants thereof) have worked so well in practice.

distributions for parameters [seminar]

Posted in Books, Statistics, University life with tags , , , , , , , , , , , , , on January 22, 2018 by xi'an
Next Thursday, January 25, Nancy Reid will give a seminar in Paris-Dauphine on distributions for parameters that covers different statistical paradigms and bring a new light on the foundations of statistics. (Coffee is at 10am in the Maths department common room and the talk is at 10:15 in room A, second floor.)

Nancy Reid is University Professor of Statistical Sciences and the Canada Research Chair in Statistical Theory and Applications at the University of Toronto and internationally acclaimed statistician, as well as a 2014 Fellow of the Royal Society of Canada. In 2015, she received the Order of Canada, was elected a foreign associate of the National Academy of Sciences in 2016 and has been awarded many other prestigious statistical and science honours, including the Committee of Presidents of Statistical Societies (COPSS) Award in 1992.

Nancy Reid’s research focuses on finding more accurate and efficient methods to deduce and conclude facts from complex data sets to ultimately help scientists find specific solutions to specific problems.

There is currently some renewed interest in developing distributions for parameters, often without relying on prior probability measures. Several approaches have been proposed and discussed in the literature and in a series of “Bayes, fiducial, and frequentist” workshops and meeting sessions. Confidence distributions, generalized fiducial inference, inferential models, belief functions, are some of the terms associated with these approaches.  I will survey some of this work, with particular emphasis on common elements and calibration properties. I will try to situate the discussion in the context of the current explosion of interest in big data and data science. 

Alan Gelfand in Paris

Posted in pictures, Statistics, Travel, University life with tags , , , , , , , on May 11, 2017 by xi'an

Alan Gelfand (Duke University) will be in Paris on the week of May 15 and give several seminars, including one at AgroParisTech on May 16:

Modèles hiérarchiques

and on at CREST (BiPS)  on May 18, 2pm:

Scalable Gaussian processes for analyzing space and space-time datasets

talk at Trinity College

Posted in pictures, Statistics, Travel, University life with tags , , , , , , , on May 7, 2017 by xi'an

Tomorrow noon, I will give a talk at Trinity College Dublin on the asymptotic properties of ABC. (Here are the slides from the talk I gave in Berlin last month.)

Gregynog #2 [jatp]

Posted in Kids, pictures, Running, Statistics, Travel, University life with tags , , , , , , , , , , , , , on April 26, 2017 by xi'an

MDL multiple hypothesis testing

Posted in Books, pictures, Statistics, Travel, University life with tags , , , , , , , , , on September 1, 2016 by xi'an

“This formulation reveals an interesting connection between multiple hypothesis testing and mixture modelling with the class labels corresponding to the accepted hypotheses in each test.”

After my seminar at Monash University last Friday, David Dowe pointed out to me the recent work by Enes Makalic and Daniel Schmidt on minimum description length (MDL) methods for multiple testing as somewhat related to our testing by mixture paper. Work which appeared in the proceedings of the 4th Workshop on Information Theoretic Methods in Science and Engineering (WITMSE-11), that took place in Helsinki, Finland, in 2011. Minimal encoding length approaches lead to choosing the model that enjoys the smallest coding length. Connected with, e.g., Rissannen‘s approach. The extension in this paper consists in considering K hypotheses at once on a collection of m datasets (the multiple then bears on the datasets rather than on the hypotheses). And to associate an hypothesis index to each dataset. When the objective function is the sum of (generalised) penalised likelihoods [as in BIC], it leads to selecting the “minimal length” model for each dataset. But the authors introduce weights or probabilities for each of the K hypotheses, which indeed then amounts to a mixture-like representation on the exponentiated codelengths. Which estimation by optimal coding was first proposed by Chris Wallace in his book. This approach eliminates the model parameters at an earlier stage, e.g. by maximum likelihood estimation, to return a quantity that only depends on the model index and the data. In fine, the purpose of the method differs from ours in that the former aims at identifying an appropriate hypothesis for each group of observations, rather than ranking those hypotheses for the entire dataset by considering the posterior distribution of the weights in the later. The mixture has somehow more of a substance in the first case, where separating the datasets into groups is part of the inference.