Archive for deep learning

Nature snippets

Posted in Statistics on October 1, 2019 by xi'an

In the August 1 issue of Nature that I took with me to Japan, there were many entries of interest. The first pages included a tribune (“personal take on events”) by a professor of oceanography calling for a halt to the construction of the TMT telescope on the Mauna Kea mountain. While I am totally ignorant of the conditions of this construction and in particular of its possible ecological effects on a fragile high-altitude environment, the tribune is fairly confusing, invoking mostly communitarian and religious arguments rather than scientific ones. And referring to Western science and Protestant missionaries as misrepresenting a principle of caution. While not seeing the contradiction in suggesting the move of the observatory to the Canary Islands, which were (also) invaded by Spanish settlers in the 15th century.

Among other news, Indonesia is following regional tendencies to nationalise research, by forcing foreign researchers to have their data vetted by the national research agency and to include Indonesian nationals in their projects. And, although this now sounds like stale news, the worry about the buffoonesque Prime Minister of the UK. And about the eugenic tendencies of his cunning advisor… A longer article by Patrick Riley from Google covers three pitfalls of machine learning, from splitting the data inappropriately (biases in the data collection), to hidden variables (unsuspected confounders), to mistaking the objective (impact of the loss function used to learn the predictive function); the first of these pitfalls is illustrated in the toy sketch below. (Were these warnings heeded in the following paper, claiming that deep learning was better at predicting kidney failure?) Another paper of personal interest reported a successful experiment in Guangzhou, China, infecting tiger mosquitoes with a bacterium to make the wild population sterile. While tiger mosquitoes have reached the Greater Paris area, and are thus becoming a nuisance, releasing 5 million more mosquitoes per week in the wild may not sound like the desired solution, but since the additional mosquitoes are overwhelmingly male, we would not feel the sting of this measure! The issue also contained a review paper on memory editing for the clinical treatment of psychopathology, which is part of the 150 years of Nature anniversary collection, but which I did not read (or else I forgot!)
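As a toy illustration of Riley's first pitfall, here is a minimal sketch on entirely synthetic data (nothing to do with the kidney paper!) of how randomly splitting time-ordered observations inflates the apparent predictive accuracy, compared with the honest train-on-the-past, test-on-the-future split:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 4000
t = np.linspace(-1.0, 1.0, n)        # time index of each observation
x = rng.normal(size=(n, 1))          # a single synthetic feature
y = (x[:, 0] > t).astype(int)        # the decision boundary drifts with time

# inappropriate: a random split mixes past and future, hiding the drift
Xa, Xb, ya, yb = train_test_split(x, y, test_size=0.3, random_state=0)
print("random split       :", LogisticRegression().fit(Xa, ya).score(Xb, yb))

# appropriate: train on the past, test on the future, exposing the drift
cut = int(0.7 * n)
chrono = LogisticRegression().fit(x[:cut], y[:cut])
print("chronological split:", chrono.score(x[cut:], y[cut:]))
```

The random split reports a comfortably higher accuracy, not because the model is any better, but because future observations have leaked into the training set.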

Hippocratic oath for maths?

Posted in Statistics on August 23, 2019 by xi'an

On a free day in Nachi-Katsuura, I came across this call for a professional oath for mathematicians (and computer engineers and scientists in related fields), by UCL mathematician Hannah Fry. The theme is the same as in Weapons of math destruction, namely that algorithms have a potentially huge impact on everyone’s life and that those who design these algorithms should be accountable for it. And aware of the consequences when these are used by non-specialists. As illustrated by preventive-justice software. And child-abuse prediction software. Some form of ethics course should indeed appear in data science programs, if only to point out the limitations of automated decision making. However, I remain skeptical of the idea, as (a) taking an oath does not make it impossible to break that oath, especially when one is blissfully unaware of breaking it, (b) acting as ethically as possible should be part of everyone’s job, whether designing deep learning algorithms or making soba noodles, and (c) the Hippocratic oath is mostly a moral statement that varies from place to place and from one epoch to the next (as, e.g., with respect to abortion, which was prohibited in Hippocrates’ version) and does not prevent some doctors from engaging in unsavoury activities. Or from being influenced by drug companies. And such an oath would not force companies to open-source their code, which in my opinion is a better way towards the assessment of such algorithms. Neither does the article mention the Montréal Déclaration for a responsible AI, which goes further than a generic and most likely ineffective oath.

deep learning in Toulouse [post-doc position]

Posted in pictures, Travel, University life on April 25, 2019 by xi'an

An opening for an ERC post-doc position on Bayesian deep learning with Cédric Févotte in Toulouse.

tenure track position in Clermont, Auvergne

Posted in pictures, Travel, University life on April 23, 2019 by xi'an

My friend Arnaud Guillin pointed out this opening of a tenure-track professor position at his University of Clermont Auvergne, in central France, with a specialty in statistics and machine learning, especially deep learning. The deadline for applications is 12 May 2019. (Tenure-track positions are quite rare in French universities and this offer includes a limited teaching load over three years, potential tenure and titularisation at the end of a five-year period, and is restricted to candidates who did their PhD or their postdoc abroad.)

deep and embarrassingly parallel MCMC

Posted in Books, pictures, Statistics on April 9, 2019 by xi'an

Diego Mesquita, Paul Blomstedt, and Samuel Kaski (from Helsinki, like the above picture) just arXived a paper on embarrassingly parallel MCMC, following a series of papers discussed on this ‘og in the past. They use the deep learning approach of Dinh et al. (2017) to compute the probability density of a convoluted and non-volume-preserving transform of a given random variable, in order to turn multiple samples from sub-posteriors [corresponding to the k-th roots of the true posterior] into a sample from the true posterior. If I understand the argument [on page 4] correctly, the deep neural network provides a density estimate that apparently does better than traditional non-parametric density estimates. Maybe by being more efficient than a Parzen-Rosenblatt estimator, whose evaluation cost is of the order of the number of simulations… For any value of θ, the estimate of the true target is the product of these sub-posterior estimates, and for a value of θ simulated from one of the sub-posteriors an importance weight naturally ensues (a minimal sketch of this combination step is given below). However, for a one-dimensional transform h(θ) of θ, I would prefer estimating first the density of h(θ) for each sample and then constructing an importance weight. If only to avoid the curse of dimensionality.
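To make the combination step concrete, here is a minimal Python sketch, with a Gaussian (Parzen-Rosenblatt) KDE standing in for the paper's normalising-flow density estimator, and made-up Gaussian samples in place of actual sub-posterior output:

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(1)
k = 4
# stand-in sub-posterior samples of a scalar theta, as if each of the
# k workers had targeted [a density proportional to] p(theta)^(1/k)
subsamples = [rng.normal(loc=0.1 * j, scale=1.0, size=5000) for j in range(k)]

# one density estimate per sub-posterior (the paper fits a flow instead)
kdes = [gaussian_kde(s) for s in subsamples]

def log_target(theta):
    # estimate of the true target = product of the k sub-posterior estimates
    return sum(kde.logpdf(theta) for kde in kdes)

# importance weights for the draws from, say, the first sub-posterior
theta = subsamples[0]
log_w = log_target(theta) - kdes[0].logpdf(theta)
w = np.exp(log_w - log_w.max())
w /= w.sum()

# self-normalised importance sampling estimate of the posterior mean
print("posterior mean estimate:", np.sum(w * theta))
```

The KDE version already exhibits the cost issue mentioned above, each evaluation of a kernel estimate being of the order of the number of simulations, which is precisely what the amortised flow estimator avoids.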

On various benchmarks, like the banana-shaped 2D target above, the proposed method (NAP) does better, even in relatively high dimensions. Given that overall computing times are not reported, with only the calibration that the same number of subsamples was produced for all methods, it would be interesting to test the same performances in even higher dimensions and with larger population sizes.

ICM 2018

Posted in pictures, Statistics, Travel, University life on August 4, 2018 by xi'an

While I am not following the International Congress of Mathematicians which just started in Rio, and even less attending, I noticed an entry on their webpage about my friend and colleague Maria Esteban, which I would have liked to repost verbatim but cannot figure out how. (ICM 2018 also features a plenary lecture by Michael Jordan on gradient-based optimisation [which was also Michael’s topic at ISBA 2018] and another one by Sanjeev Arora on the maths of deep learning, two talks broadly related to statistics, which is presumably a première at this highly selective maths conference!)

JSM 2018 [#1]

Posted in Mountains, Statistics, Travel, University life on July 30, 2018 by xi'an

As our direct flight from Paris landed in the morning in Vancouver, we found ourselves in the unusual situation of having a few hours to kill before accessing our rental, and where better to spend them than at a general introduction to deep learning in the first round of sessions at JSM 2018?! In my humble opinion, or maybe just because it was past midnight Paris time!, the talk was pretty uninspiring, in that it missed the natural question of the possible connections between the construction of a prediction function and statistics. Watching improving performances at classifying human faces does not tell much more than that one can create a massively non-linear function in high dimensions with nicely designed error penalties. Most of the talk droned on about neural networks and their fitting by back-propagation and the variations on stochastic gradient descent, without addressing rather natural (?) questions about the choice of functions at each level, the number of levels, the penalty term or regulariser, and even less the reason why no sparsity is imposed on the structure, despite the humongous number of parameters involved. What came close [but not that close] to sparsity is the notion of dropout, which is a sort of purely automated culling of the nodes, and which was new to me. More like a sort of randomisation that turns the optimisation criterion into an average (a minimal sketch follows below). Only at the end of the presentation did more relevant questions emerge, presenting unsupervised learning as density estimation, the pivot being the generative features of (most) statistical models. And GANs of course. But it nonetheless missed an explanation as to why models with massive numbers of parameters can be considered in this setting and not in standard statistics. (One slide about deterministic auto-encoders was somewhat puzzling in that it seemed to repeat the “fiducial mistake”.)
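For readers as unfamiliar with dropout as I was, here is a minimal numpy sketch, under the simplifying assumption of a single ReLU layer with made-up weights, of how the random culling of nodes averages out at test time:

```python
import numpy as np

rng = np.random.default_rng(2)
W = rng.normal(size=(100, 100))   # hypothetical hidden-layer weights
x = rng.normal(size=100)          # one (made-up) input vector
p_keep = 0.8                      # each node survives with probability 0.8

def hidden(x, train=True):
    h = np.maximum(W @ x, 0.0)            # ReLU activations
    if train:
        mask = rng.random(h.shape) < p_keep
        return h * mask / p_keep          # cull nodes, rescale the survivors
    return h                              # test time: keep all nodes, which
                                          # matches the average trained network

# averaging many randomly culled passes recovers the test-time output
train_avg = np.mean([hidden(x) for _ in range(1000)], axis=0)
print("max gap to test pass:", np.abs(train_avg - hidden(x, train=False)).max())
```

The rescaling by p_keep is what makes the expectation of the randomly culled layer coincide with the full layer, hence the above reading of dropout as turning the optimisation criterion into an average over sub-networks.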