Le Monde has launched a series of tribunes offering career advice from 35 personalities, including this week (Jan. 4, 2017) Cédric Villani. His suggestion for younger generations is to invest in artificial intelligence and machine learning, while acknowledging this is still a research topic, before switching to robotics [although this is mostly a separate field]. The most powerful advice in this interview is to start with a specialisation when aiming at a large spectrum of professional opportunities, gaining openness from exchanges with other people, places, and cultures. It concludes with a federalist statement I fully share.
As I had read many comments and reviews about this book, including one by Arthur Charpentier on Freakonometrics, I eventually decided to buy it with my Amazon Associate savings (!). With a strong a priori bias, I am afraid, gathered from reading some excerpts, comments, and the overall advertising about it. And also because the book reminded me of another quantic swan. Not to mention the title. After reading it, I am afraid I cannot say my assessment has changed much.
“Models are opinions embedded in mathematics.” (p.21)
The core message of this book is that the use of algorithms and AI methods to evaluate and rank people is unsatisfactory and unfair. From predicting recidivism to firing high-school teachers, from rejecting loan applications to enticing the most challenged categories to enrol in for-profit colleges. Which is indeed unsatisfactory and unfair. Just like using the h-index and citation rankings for promotion or hiring. (The book mentions the controversial hiring of many adjunct faculty members by KAU to boost its ranking.) But this conclusion is not enough of an argument to write a whole book. Or even to blame mathematics for the unfairness: as far as I can tell, mathematics has nothing to do with it. Some analysts crunch numbers and produce a score, and then managers make poor decisions. The use of the term mathematics throughout the book is thus inappropriate, when the author means statistics, machine learning, data mining, predictive algorithms, neural networks, &tc. (OK, there is a small section on operations research on p.127, but I figure deep learning can bypass the maths.)
A few weeks after the editorial “Algorithms and Blues“, Nature offers another (general-audience) entry on AIs and their impact on society, entitled “The Black Box of AI“. The call is less for open-source AIs and more for accountability, namely that decisions produced by AIs and impacting people one way or another should be accountable, rather than excused by the cop-out “the computer said so”. What the article exposes is how (close to) impossible this is when the algorithms are based on black-box structures like neural networks and other deep-learning methods. While optimised to predict one outcome as accurately as possible given a vector of inputs, hence learning how the inputs impact this output [within the same range of values], these methods do not learn in a more profound way, in that they very rarely explain why the output occurs given the inputs. Hence, given a neural network that predicts go moves or operates a self-driving car, there is a priori no knowledge to be gathered from this network about the general rules of how humans play go or drive cars. This rather obvious feature means that algorithms determining the severity of a sentence cannot be argued to be rational and hence should not be used per se (or that the judicial system exploiting them should be sued). The article is not particularly deep (learning), but it mentions a few machine-learning players like Pierre Baldi, Zoubin Ghahramani and Stéphane Mallat, who comments on the distance between those networks and true (and transparent) explanations. And on the fact that the human brain itself goes mostly unexplained. [I did not know I could include such dynamic images on WordPress!]
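To make the opacity point concrete, here is a minimal sketch in pure Python, with hand-picked weights standing in for trained ones (the argument does not depend on how the weights were obtained): a tiny ReLU network computes XOR perfectly, yet staring at its weight matrices yields no human-readable rule for why a given input produces a given output.

```python
# A minimal two-layer ReLU network that computes XOR exactly.
# The weights are hand-picked rather than trained, since the point
# is only that a perfectly accurate network offers no human-readable
# explanation of *why* an input maps to an output.

def relu(z):
    return max(0.0, z)

# hidden layer: two units
W1 = [[1.0, 1.0],   # unit 1 computes x1 + x2
      [1.0, 1.0]]   # unit 2 computes x1 + x2 - 1 (after its bias)
b1 = [0.0, -1.0]
# output layer: h1 - 2*h2
W2 = [1.0, -2.0]

def predict(x1, x2):
    h = [relu(W1[j][0] * x1 + W1[j][1] * x2 + b1[j]) for j in range(2)]
    return round(W2[0] * h[0] + W2[1] * h[1])

# the network reproduces XOR on all four inputs...
for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    assert predict(x1, x2) == (x1 ^ x2)

# ...but inspecting (W1, b1, W2) yields no rule a judge or a driver
# could cite: the "reason" for each output is smeared across the
# arithmetic of the weights.
```

And the smearing obviously only worsens when the number of weights grows from six to millions.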
The next AISTATS conference is taking place in Fort Lauderdale, Florida, on April 20-22. (The website keeps the same address from one conference to the next, which means all my links to the AISTATS 2016 conference in Cadiz are no longer valid. And that the above sunset from Florida is named… cadiz.jpg!) The deadline for paper submission is October 13 and there are two novel features:
- Fast-track for Electronic Journal of Statistics: Authors of a small number of accepted papers will be invited to submit an extended version for fast-track publication in a special issue of the Electronic Journal of Statistics (EJS) after the AISTATS decisions are out. Details on how to prepare such an extended journal submission will be announced after the AISTATS decisions.
- Review-sharing with NIPS: Papers previously submitted to NIPS 2016 are required to declare their previous NIPS paper ID, and optionally supply a one-page letter of revision (similar to a revision letter to journal editors; anonymized) in supplemental materials. AISTATS reviewers will have access to the previous anonymous NIPS reviews. Other than this, all submissions will be treated equally.
I find both initiatives worth applauding and replicating in other machine-learning conferences. Particularly with regard to the recent debate we had at the Annals of Statistics.
In recent days, we have had a lively discussion among AEs of the Annals of Statistics as to whether or not to set up a policy regarding publication of papers that have already appeared in a shortened (8-page) version in a machine-learning conference like NIPS. Or AISTATS. While I obviously cannot disclose details here, the debate is quite interesting and may bring the machine-learning and statistics communities closer if resolved in a certain way. My own personal opinion on the matter is that what matters most is what is best for the Annals of Statistics, rather than the authors’ tenure cases or the different standards of the machine-learning community. If the submitted paper is based on a brilliant and novel idea that can appeal to a sufficiently wide part of the readership, and if the mathematical support of that idea is strong enough, we should publish the paper. Whether or not an eight-page preliminary version has previously appeared in a conference proceeding like NIPS does not seem particularly relevant to me, as I find those short papers mostly unreadable and hence do not read them. Since the Annals of Statistics runs anti-plagiarism software that is most likely effective, blatant cases of duplication should be caught. Of course, this does not solve all issues, and papers with similar contents can and will end up being published. However, this is also the case between statistics journals, in the sense that brilliant ideas sometimes end up being split between two or three major journals.
[David Dowe sent me the following ad for a position of research fellow in statistics, machine learning, and Astrophysics at Monash University, Melbourne.]
RESEARCH FELLOW: in Statistics and Machine Learning for Astrophysics, Monash University, Australia, deadline 31 July.
We seek to fill a 2.5 year post-doctoral fellowship dedicated to extensions and applications of the Bayesian Minimum Message Length (MML) technique to the analysis of spectroscopic data from recent large astronomical surveys, such as GALAH (GALactic Archaeology with HERMES). The position is based jointly within the Monash Centre for Astrophysics (MoCA, in the School of Physics and Astronomy) and the Faculty of Information Technology (FIT).
The successful applicant will develop and extend the MML method as needed, applying it to spectroscopic data from the GALAH project, with an aim to understanding nucleosynthesis in stars as well as the formation and evolution of our Galaxy (“galactic archaeology”). The position is based at the Clayton campus (in suburban Melbourne, Australia) of Monash University, which hosts approximately 56,000 equivalent full-time students spread across its Australian and off-shore campuses, and approximately 3,500 academic staff. The successful applicant will work with world experts in both the Bayesian information-theoretic MML method and nuclear astrophysics. The immediate supervisors will be Professor John Lattanzio (MoCA), Associate Professor David Dowe (FIT) and Dr Aldeida Aleti (FIT).
The second and third days of AISTATS 2016 passed in a blur, with not even the opportunity to write down my impressions in real time! Maybe the long tapas breaks are mostly to blame for this… In any case, we had two further exciting plenary talks, on privacy-preserving data analysis by Kamalika Chaudhuri and on crowdsourcing and machine learning by Adam Tauman Kalai. Kamalika’s talk covered recent results by her and her coauthors on optimal privacy preservation in classification and a generalisation to correlated data, with the neat notion of a Markov Quilt. Other talks that same day also dwelt on this privacy issue, but I could not attend them. The talk by Adam was full of fun illustrations of humans training learning systems (with the unsolved difficulty of those humans deliberately mis-training the system, as exhibited recently by the short-lived Microsoft Tay experiment).
Both poster sessions were equally exciting, with the addition of MLSS student posters on the final day. Among many, I particularly enjoyed Iain Murray’s pseudo-marginal slice sampling, David Duvenaud’s fairly intriguing use of early stopping for non-parametric inference, Garrett Bernstein’s work on aggregated Markov chains, Ye Wang’s scalable geometric density estimation [with a special bonus for his typo on the University of Turing, instead of Torino], Gemma Moran’s and Chengtao Li’s posters on determinantal processes, and Matej Balog’s Mondrian forests with a Laplace kernel [envisioning potential applications for ABC]. Again, just to mention a few…
The participants [incl. myself] also took one evening off to visit a sherry winery in Jerez, with a well-practiced spiel on the history of the company, a building designed by Gustave Eiffel, and a wine-tasting session. As I personally find this type of fortified wine too strong in alcohol, I am not a big fan of sherry, but it was nonetheless an amusing trip! With no visible after-effects the next morning, since the audience was as large as usual for Adam’s talk [although I did not cross paths with a machine-learning soul on my 6am run…]
In short, I very much enjoyed AISTATS 2016 and remain deeply impressed by the efficiency of the selection process and the commitment of those involved in it, as mentioned earlier on the ‘Og. Kudos!