
a thesis on random forests

Posted in Books, Kids, Statistics, University life on August 4, 2014 by xi'an

During a session of the IFCAM workshop this morning, I noticed a new arXiv posting on random forests. Entitled Understanding Random Forests: From Theory to Practice, it actually corresponds to a PhD thesis written by Gilles Louppe on the topic at the Université de Liège, België/Belgium/Belgique. In this thesis, Gilles Louppe provides a rather comprehensive coverage of the random forest methodology, from specific bias-variance decompositions and convergence properties, to the historical steps towards random forests, to implementation details and recommendations, to describing how to rank (co)variates by order of importance. The last point was of particular relevance for our current work on ABC model choice with random forests, as it relies on the frequency of appearance of a given variable to label its importance. The thesis showed me this was not a great way of selecting covariates, as it does not account for correlation and can easily miss important covariates. It is a very complete, well-written and beautifully LaTeXed thesis (with fancy grey boxes and all that jazz!). As part of his thesis, Gilles Louppe also contributed to the open source machine learning library scikit-learn. The thesis thus makes a most profitable and up-to-date entry into the topic of random forests…
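To make the point about frequency of appearance a bit more concrete, here is a minimal sketch, assuming scikit-learn's RandomForestClassifier and a toy dataset invented for this illustration (it is neither taken from the thesis nor from the ABC model choice work). The helper `split_frequency` is a hypothetical name: it counts how often each variable is used to split a node across the forest, which is one possible reading of "frequency of appearance". The result can be set against the impurity-based importances that scikit-learn reports, with two strongly correlated covariates and one pure-noise covariate thrown in.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def split_frequency(forest, n_features):
    """Share of internal nodes, over all trees, that split on each feature.

    Hypothetical helper for illustration only.
    """
    counts = np.zeros(n_features)
    for tree in forest.estimators_:
        feature = tree.tree_.feature      # negative entries mark leaves
        used = feature[feature >= 0]      # keep internal (splitting) nodes
        counts += np.bincount(used, minlength=n_features)
    return counts / counts.sum()


# Toy data (an assumption of this sketch): x1 is a noisy copy of x0,
# x2 is unrelated noise, and the label depends on x0 alone.
rng = np.random.default_rng(0)
n = 500
x0 = rng.normal(size=n)
x1 = x0 + 0.1 * rng.normal(size=n)
x2 = rng.normal(size=n)
X = np.column_stack([x0, x1, x2])
y = (x0 > 0).astype(int)

forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

freq = split_frequency(forest, X.shape[1])   # frequency of appearance
mdi = forest.feature_importances_            # impurity-based importance

for j in range(X.shape[1]):
    print(f"x{j}: split frequency {freq[j]:.3f}, impurity importance {mdi[j]:.3f}")
```

Comparing the two columns on data like these is a quick way to see how correlated covariates share, and possibly distort, a frequency-based ranking; it is only a sanity check, not a substitute for the analysis in the thesis.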
