Archive for tree

off to BayesComp 20, Gainesville

Posted in pictures, Statistics, Travel, University life on January 7, 2020 by xi'an

little people of Tofino [jatp]

Posted in Kids, pictures, Running, Travel on August 19, 2018 by xi'an

Bayesian regression trees [seminar]

Posted in pictures, Statistics, University life on January 26, 2018 by xi'an
During her visit to Paris, Veronika Ročková (Chicago Booth) will give a talk at ENSAE-CREST on the Saclay Plateau at 2pm. Here is the abstract:
Posterior Concentration for Bayesian Regression Trees and Ensembles
(joint with Stephanie van der Pas) Since their inception in the 1980s, regression trees have been one of the more widely used non-parametric prediction methods. Tree-structured methods yield a histogram reconstruction of the regression surface, where the bins correspond to terminal nodes of recursive partitioning. Trees are powerful, yet susceptible to overfitting. Strategies against overfitting have traditionally relied on pruning greedily grown trees. The Bayesian framework offers an alternative remedy against overfitting through priors. Roughly speaking, a good prior charges smaller trees, where overfitting does not occur. While the consistency of random histograms, trees and their ensembles has been studied quite extensively, the theoretical understanding of the Bayesian counterparts has been missing. In this paper, we take a step towards understanding why/when Bayesian trees and their ensembles do not overfit. To address this question, we study the speed at which the posterior concentrates around the true smooth regression function. We propose a spike-and-tree variant of the popular Bayesian CART prior and establish new theoretical results showing that regression trees (and their ensembles) (a) are capable of recovering smooth regression surfaces, achieving optimal rates up to a log factor, (b) can adapt to the unknown level of smoothness, and (c) can perform effective dimension reduction when p>n. These results provide a piece of missing theoretical evidence explaining why Bayesian trees (and additive variants thereof) have worked so well in practice.
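
The abstract's point that "a good prior charges smaller trees" can be illustrated with the split prior of Chipman, George and McCulloch's Bayesian CART, under which a node at depth d keeps splitting with probability α(1+d)^(−β). The spike-and-tree prior studied in the paper is a different, sparsity-inducing construction, so the base R sketch below (with function names of my own) is only meant to show how such priors concentrate on shallow trees:

# grow a random tree under a CGM-style split prior: a node at depth d splits
# with probability alpha * (1 + d)^(-beta), so deep splits become unlikely
grow_tree <- function(depth = 0, alpha = 0.95, beta = 2, p = 5) {
  if (runif(1) < alpha * (1 + depth)^(-beta)) {
    list(var   = sample.int(p, 1),   # splitting variable
         cut   = runif(1),           # split point (covariates rescaled to [0,1])
         left  = grow_tree(depth + 1, alpha, beta, p),
         right = grow_tree(depth + 1, alpha, beta, p))
  } else {
    list()                           # terminal node, i.e. one histogram bin
  }
}

# number of terminal nodes (bins) of a tree
n_leaves <- function(tree) {
  if (is.null(tree$var)) 1 else n_leaves(tree$left) + n_leaves(tree$right)
}

# prior distribution of the tree size, approximated by simulation: most of the
# mass sits on trees with a handful of leaves, hence the protection against
# overfitting
set.seed(1)
table(replicate(1e4, n_leaves(grow_tree())))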

art brut

Posted in pictures on December 12, 2012 by xi'an

art brut

Posted in Mountains, pictures, Travel on August 16, 2012 by xi'an

winter trees (2)

Posted in pictures, University life on February 17, 2012 by xi'an

Bayesian adaptive sampling

Posted in R, Statistics on December 6, 2010 by xi'an

In continuation of my earlier post on computing evidence, I read a very interesting paper by Merlise Clyde, Joyee Ghosh and Michael Littman, to appear in JCGS, called Bayesian adaptive sampling for variable selection and model averaging. The sound idea at the core of the paper is that, when doing variable selection (i.e., exploring a finite if large state space) in setups where the individual probabilities of the models are known (up to a constant), it is not necessary to return to models that have already been visited. Hence the move to sampling models without replacement, called BAS (for Bayesian adaptive sampling) in the paper. The first part discusses how to sample without replacement from a state space whose elements and probabilities are defined by a binary tree, i.e.

f(\mathbf{\gamma})=\prod_{j=1}^k \rho_{j|<j}^{\gamma_j}(1-\rho_{j|<j})^{1-\gamma_j}

(The connection with variable selection is that each level of the tree corresponds to the binary choice between including and excluding one of the variables. The tree thus has 2^k endpoints/leaves for k potential variables in the model.) The cost of updating the probabilities is actually O(k), where k is the number of levels, rather than O(2^k), because most of the branches of the tree are unaffected by setting one final branch to probability zero. The second part deals with the adaptive and approximation issues.
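
As a rough illustration of this without-replacement sampling over the binary tree of models, here is a minimal R sketch of my own (not the authors' code): it uses fixed marginal inclusion probabilities ρ_j instead of the conditional ρ_{j|<j} that BAS updates adaptively, and the names bas_sampler, prior_mass, etc. are hypothetical. The point is that after each draw only the k+1 nodes along the sampled path need their bookkeeping updated:

# sample models gamma in {0,1}^k without replacement from
# f(gamma) = prod_j rho_j^gamma_j (1 - rho_j)^(1 - gamma_j)
bas_sampler <- function(rho) {
  k <- length(rho)
  removed <- new.env(hash = TRUE)          # mass already drawn below each node
  gone <- function(key)
    if (exists(key, envir = removed, inherits = FALSE))
      get(key, envir = removed, inherits = FALSE) else 0
  prior_mass <- function(key) {            # prior mass of the subtree at key
    bits <- strsplit(substr(key, 2, nchar(key)), "")[[1]]
    if (length(bits) == 0) 1 else
      prod(ifelse(bits == "1", rho[seq_along(bits)], 1 - rho[seq_along(bits)]))
  }
  left_mass <- function(key) prior_mass(key) - gone(key)   # unsampled mass
  function() {                             # one draw; valid for at most 2^k calls
    key <- "r"                             # root of the binary tree
    for (j in seq_len(k)) {                # one level per variable
      p1  <- left_mass(paste0(key, "1"))   # remaining mass of the "include" branch
      key <- paste0(key, if (runif(1) < p1 / left_mass(key)) "1" else "0")
    }
    w <- left_mass(key)                    # remaining mass of the sampled leaf
    for (j in seq_len(nchar(key))) {       # update only the k+1 nodes on the path
      node <- substr(key, 1, j)
      assign(node, gone(node) + w, envir = removed)
    }
    as.integer(strsplit(substr(key, 2, k + 1), "")[[1]])   # the sampled gamma
  }
}

draw <- bas_sampler(rho = c(0.2, 0.5, 0.7))
t(replicate(8, draw()))                    # the 2^3 models, each visited exactly once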
