Archive for machine learning

AISTATS 2016 [#2]

Posted in Kids, pictures, Running, Statistics, Travel, University life, Wines on May 13, 2016 by xi'an

The second and third days of AISTATS 2016 passed like a blur, with not even the opportunity to write my impressions in real time! Maybe long tapa breaks are mostly to blame for this… In any case, we had two further exciting plenary talks, on privacy-preserving data analysis by Kamalika Chaudhuri and on crowdsourcing and machine learning by Adam Tauman Kalai. Kamalika's talk covered recent results by her and her coauthors on optimal privacy preservation in classification and a generalisation to correlated data, with the neat notion of a Markov Quilt. Other talks that same day also dwelt on this privacy issue, but I could not attend them all. The talk by Adam was full of fun illustrations on humans training learning systems (with the unsolved difficulty of those humans deliberately mis-training the system, as exhibited recently by the short-lived Microsoft Tay experiment).

Both poster sessions were equally exciting, with the addition of MLSS student posters on the final day. Among many, I particularly enjoyed Iain Murray's pseudo-marginal slice sampling, David Duvenaud's fairly intriguing use of early stopping for non-parametric inference, Garrett Bernstein's work on aggregated Markov chains, Ye Wang's scalable geometric density estimation [with a special bonus for his typo on the University of Turing, instead of Torino], Gemma Moran's and Chengtao Li's posters on determinantal processes, and Matej Balog's Mondrian forests with a Laplace kernel [envisioning potential applications for ABC]. Again, just to mention a few…

The participants [incl. myself] also took one evening off to visit a sherry winery in Jerez, with a well-practiced spiel on the story of the company, a building designed by Gustave Eiffel, and a wine-tasting session. As I personally find this type of fortified wine too strong in alcohol, I am not a big fan of sherry, but it was nonetheless an amusing trip! With no visible after-effects the next morning, since the audience was as large as usual for Adam's talk [although I did not cross a machine-learning soul on my 6am run…].

In short, I enjoyed very much AISTATS 2016 and remain deeply impressed by the efficiency of the selection process and the amount of involvement of the actors of this selection, as mentioned earlier on the ‘Og. Kudos!

AISTATS 2016 [#1]

Posted in pictures, R, Running, Statistics, Travel, Wines on May 11, 2016 by xi'an

Travelling through Seville, I arrived in Cádiz on Sunday night, along with a massive depression [weather-speaking!]. Walking through the city from the station was nonetheless pleasant, as this is a town full of small streets and nice houses. If with fewer churches than Seville! Richard Samworth gave the first plenary talk of AISTATS 2016, with a presentation on random projections for classification. His classifier is based on an average of a large number of linear random projections of the original data, where the projections are chosen to minimise the prediction error over a subset of the components. The performance of this approach seems to be consistently better than that of random forests, which makes it definitely worth investigating further. (A related R package is available.)
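
To make the idea concrete, here is a rough sketch of such a random-projection ensemble in Python [my own reading of the talk, not the actual method nor the accompanying R package]: within each block, several random linear projections are drawn, the one whose fitted base classifier (LDA here, an arbitrary choice) has the smallest validation error is kept, and the selected classifiers' votes are averaged.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split

def rp_ensemble_predict(X_train, y_train, X_test, n_blocks=50, n_per_block=20, d=2, seed=0):
    """Sketch of a random-projection ensemble classifier: within each block,
    draw several random linear projections to dimension d, keep the one whose
    base classifier has the lowest validation error, and average the votes
    of the selected classifiers across blocks. Assumes 0/1 labels."""
    rng = np.random.default_rng(seed)
    p = X_train.shape[1]
    votes = np.zeros(X_test.shape[0])
    # hold out part of the training data to estimate the prediction error
    X_fit, X_val, y_fit, y_val = train_test_split(X_train, y_train, test_size=0.3, random_state=seed)
    for _ in range(n_blocks):
        best_err, best_proj, best_clf = np.inf, None, None
        for _ in range(n_per_block):
            A = rng.standard_normal((p, d))               # random linear projection
            clf = LinearDiscriminantAnalysis().fit(X_fit @ A, y_fit)
            err = np.mean(clf.predict(X_val @ A) != y_val)
            if err < best_err:
                best_err, best_proj, best_clf = err, A, clf
        votes += best_clf.predict(X_test @ best_proj)     # vote of the best projection in this block
    return (votes / n_blocks > 0.5).astype(int)           # majority vote over blocks
```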

The following talks that day covered Bayesian optimisation and probabilistic numerics, with Javier Gonzales introducing glasses for Bayesian optimisation in order to solve its myopia (!), by which he meant predicting the output of the optimisation over n future steps. And a first mention of the Pima Indians by Daniel Hernandez-Lobato in his talk about EP with stochastic gradient steps towards optimisation. (As well as much larger datasets.) And Mark Girolami bringing quasi-Monte Carlo into control variates. A kernel-based ABC by Mijung Park, which uses kernels and maximum mean discrepancy to avoid defining summary statistics, and a version of parallel MCMC by Guillaume Basse. Plus another session on deep learning.
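
As a rough illustration of the summary-free idea behind the kernel-based ABC talk [my own sketch, not the authors' implementation], the maximum mean discrepancy between observed and simulated samples can serve directly as the ABC distance; the Gaussian kernel and its bandwidth below are arbitrary choices.

```python
import numpy as np

def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel matrix between two 1-d arrays of observations."""
    diff = x[:, None] - y[None, :]
    return np.exp(-diff**2 / (2 * bandwidth**2))

def mmd2(observed, simulated, bandwidth=1.0):
    """Biased estimate of the squared maximum mean discrepancy between two
    samples; a small value means the simulated data resemble the observed
    data, without having to choose summary statistics."""
    k_xx = gaussian_kernel(observed, observed, bandwidth).mean()
    k_yy = gaussian_kernel(simulated, simulated, bandwidth).mean()
    k_xy = gaussian_kernel(observed, simulated, bandwidth).mean()
    return k_xx + k_yy - 2 * k_xy

# In an ABC loop, a proposed parameter value would be kept whenever
# mmd2(observed, simulate(theta)) falls below the tolerance threshold.
```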

As usual with AISTATS conferences, the central activity of the day was the noon poster session, including speakers discussing their paper, and I had several interesting chats about MCMC related topics, with e.g. one alternative notion of ensemble MCMC [centred on estimating the normalising constant].

We awarded the notable student paper awards before the welcoming cocktail: the winners are Bo Dai, Nedelina Teneva, and Ye Wang. And this first day ended up with a companionable evening in a most genuine tapa bar, tasting local blood sausage and local blue cheese. (If you do not mind the corrida theme!)

position opening at ENSAE ParisTech

Posted in Kids, Statistics, Travel, University life on March 28, 2016 by xi'an

[Paris and la Seine, from Pont du Garigliano, Oct. 20, 2011]

There is an opening for an associate or full professor position in Statistics and Machine Learning at ENSAE, Paris (soon to move to the Paris-Saclay campus, next to École Polytechnique). The details are provided here. The deadline is April 18, 2016, for a hiring in September or October 2016.

AISTATS 2016 [programme]

Posted in Books, Kids, pictures, Statistics, Travel, University life on March 14, 2016 by xi'an

The full programme for AISTATS 2016 in Cádiz is now on-line, including the posters (except for the additional posters by MLSS participants). Richard Samworth is scheduled to talk on Monday morning, May 9, Kamalika Chaudhuri on Tuesday morning, May 10, and Adam Tauman Kalai on Wednesday morning, May 11. As at the previous AISTATS meeting, poster sessions are central to the day, while evenings are free (which shows this is not a Bayesian meeting!!!). See you in Cádiz, hopefully! (Registration is still open, just in case.)

MLSS 2016: machine learning summer school in Cádiz [deadline]

Posted in Kids, pictures, Running, Statistics, Travel, University life on March 11, 2016 by xi'an

Following [time-wise] the AISTATS 2016 meeting, a machine learning summer school is organised in Cádiz (as is the tradition for AISTATS meetings in Europe, i.e., in even years). With an impressive [if downright scary] poster! There is no strong statistics component in the programme, apart from a course by Tamara Broderick on non-parametric Bayes, but the list of speakers is impressive and the ten-day school is worth recommending to all interested students. (I remember giving a short course at MLSS 2004 on Berder Island in Brittany, with the immediate reward of running the Auray-Vannes half-marathon that year…) The deadline for applications is March 25, 2016.

go, go, go…deeper!

Posted in pictures, Statistics on February 19, 2016 by xi'an

While visiting Warwick last week, I came across the very issue of Nature with the highly advertised paper by David Silver and co-authors from DeepMind, detailing how they designed the Go-playing algorithm that bested a European Go master five games in a row last September. Which is a rather unexpected and definitely brilliant feat given the state of the art! And compares (in terms of importance, if not of approach) with the victory of IBM's Deep Blue over Garry Kasparov 20 years ago… (Another deep algorithm, showing that the attraction of programmers for this label has not died off over the years!)

This paper is not the easiest to read (especially over breakfast), with (obviously) missing details, but I gathered interesting titbits from this cursory read. One being the reinforcement learning step, where the predictor is improved by being played against earlier versions of itself. While this can lead to overfitting, the authors used randomisation to reduce this feature. This made me wonder if a similar step could be used on predictors like random forests, e.g., by reweighting the trees or the probability of including one predictor or another.

Another feature of major interest is the parallel use of two neural networks in the decision-making: a first one estimating a probability distribution over moves, learned from millions of human Go games, and a second one returning a utility or value for each possible move. The first network is used for tree exploration with Monte Carlo steps, while the second leads to the final decision.
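
Schematically [and only schematically, as the actual DeepMind architecture is far more involved], the interaction between the two networks during tree exploration can be sketched as a scoring rule trading the value estimate of a move against an exploration bonus guided by the policy network; the class, constant, and function names below are mine, not the paper's.

```python
import math

C_EXPLORE = 1.0  # trade-off between value and policy-guided exploration (assumed value)

class Node:
    """One candidate move in the search tree."""
    def __init__(self, prior):
        self.prior = prior       # probability assigned to this move by the policy network
        self.visits = 0          # number of times this move was explored
        self.value_sum = 0.0     # accumulated value-network evaluations

    def mean_value(self):
        return self.value_sum / self.visits if self.visits else 0.0

def select_move(children):
    """Pick the child maximising its mean value plus a bonus that favours
    moves the policy network likes but that have been explored little."""
    total_visits = sum(c.visits for c in children.values())
    def score(child):
        exploration = C_EXPLORE * child.prior * math.sqrt(total_visits + 1) / (1 + child.visits)
        return child.mean_value() + exploration
    return max(children, key=lambda move: score(children[move]))
```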

This is a fairly good commercial argument for machine learning techniques (and for DeepMind as well), but I do not agree with the doom-sayers predicting the rise of the machines and our soon-to-be annihilation! (Which is the major theme of Superintelligence.) This result shows that, with enough learning data and sufficiently high optimising power and skills, it is possible to produce an excellent predictor of the set of Go moves leading to a victory. Without the brute-force strategy of Deep Blue, which simply explored the tree of possible games to a much more remote future than a human player could (along with a perfect memory of a lot of games). I actually wonder if DeepMind has also designed a chess algorithm on the same principles: there is no reason why it should not work. However, this success does not predict the soon-to-come emergence of AIs able to deal with vaguer and broader scopes: in that sense, automated drivers are much more of an advance (unless they start bumping into other cars and pedestrians on a regular basis!).

Bayesian composite likelihood

Posted in Books, Statistics, University life on February 11, 2016 by xi'an

“…the pre-determined weights assigned to the different associations between observed and unobserved values represent strong a priori knowledge regarding the informativeness of clues. A poor choice of weights will inevitably result in a poor approximation to the “true” Bayesian posterior…”

Last Xmas, Alexis Roche arXived a paper on Bayesian inference via composite likelihood. I find the paper quite interesting in that [and only in that] it defends the innovative notion of writing a composite likelihood as a pool of opinions about some features of the data. Recall that each term in the composite likelihood is a marginal likelihood for some projection z=f(y) of the data y, as in ABC settings, although it is rare to derive closed-form expressions for those marginals. The composite likelihood is parameterised by powers of those components. Each component is associated with an expert, whose weight reflects its importance. The sum of the powers is constrained to be equal to one, even though I do not understand why the dimensions of the projections play no role in this constraint. Simplicity is advanced as an argument, which sounds rather weak… Even though this may be infeasible in any realistic problem, it would be more coherent to see the weights as producing the best Kullback approximation to the true posterior. Or to use a prior on the weights and estimate them along with the parameter θ. The former could be incorporated into the latter following the approach of Holmes & Walker (2013). While the ensuing discussion is most interesting, it falls short of connecting the different components in terms of the (joint) information they bring about the parameters. Especially because the weights are assumed to be given rather than inferred, and even more so when they depend on θ. I also wonder why the variational Bayes interpretation is not exploited any further. And I see no clear way to exploit this perspective in an ABC environment.
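
For concreteness, the composite posterior described above can be written [in my notation, which need not match the paper's] as

$$\pi_c(\theta\mid y)\;\propto\;\pi(\theta)\,\prod_{j=1}^{J} p_j\big(f_j(y)\mid\theta\big)^{w_j},\qquad w_j\ge 0,\quad \sum_{j=1}^{J} w_j=1,$$

where each $p_j(\cdot\mid\theta)$ is the marginal likelihood of the projection $z_j=f_j(y)$ and the weight $w_j$ reflects the importance granted to the corresponding expert.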

