Archive for INRIA

keep meetings hybrid

Posted in Statistics, Travel, University life on September 30, 2022 by xi'an

I was reading the latest ISBA Bulletin and the column by ISBA President Sudipto Banerjee celebrating the return to the physical ISBA World meeting, along with worries about participants who caught COVID there. (Unfortunately, one good friend of mine experienced symptoms that went beyond the mild cold-like ones I zoomed through a few days ago.) This particular issue of creating a COVID cluster [during coffee breaks?!] provides [me with] one further argument for supporting hybrid and multimodal meetings on a general basis. Which should [imho] appear in the proposals for the 2026 and 2028 World Meetings (deadline on 31 October)… (The 2024 meeting in Venezia will certainly involve hybridicity! As will BayesComp in Levi.) Discussing the topic with others in some scientific committees recently made me realise this was not such a shared perspective, for reasons varying from worries about balancing the budget, to Zoom fatigue, to the added value of informal interactions. Still, there are also reasons for hybridising our meetings, from a reduced travel impact to more inclusiveness on geographical, diversity, affordability, and seniority grounds. Holding hybrid conferences with multiple regional mirrors allows for a potentially higher degree of interaction and local input. And for a minimal organisational effort.

ABC in… everywhere [programme]

Posted in Mountains, pictures, Statistics, Travel, University life on April 8, 2021 by xi'an

The ABC in Svalbard workshop is taking place on-line next week (and most sadly not in Svalbard). The programme is available on the ABC site. It starts (in Australia) at 4:00 GMT (14:00 AEST) and finishes (in France) at 15:30 GMT (17:30 CEST). Registration is free but needed to access the Zoom codes! See you on Zoom next week!!!

missing bit?

Posted in Books, Statistics, University life on January 9, 2021 by xi'an

Nature of 7 December 2020 has a Nature Index (a supplement made of a series of articles, more journalistic than scientific, with corporate backers who “have no influence over the content”) on Artificial Intelligence, including the above graph representing “the top 200 collaborations among 146 institutions based between 2015 and 2019, sized according to each institution’s share in artificial intelligence”, with only the UK, Germany, Switzerland and Italy identified for Europe… Missing e.g. the output from France and from its major computer science institute, INRIA. Maybe because “the articles picked up by [their] database search concern specific applications of AI in the life sciences, physical sciences, chemistry, and Earth and environmental sciences”. Or maybe because of the identification of INRIA as such.

“Access to massive data sets on which to train machine-learning systems is one advantage that both the US and China have. Europe, on the other hand, has stringent data laws, which protect people’s privacy, but limit its resources for training AI algorithms. So, it seems unlikely that Europe will produce very sophisticated AI as a consequence”

This comment somewhat contradicts the attached articles calling for a more ethical AI. Like making AI more transparent and robust. While unrestricted access to personal data helps with the social engineering and control favoured by dictatorships and corporate behemoths, a culture of data privacy may (and should) lead to developing new methodology for working with protected data (as in an Alan Turing Institute project) and to inspiring more trust from the public. Working with less data does not mean less sophistication in handling it, but quite the opposite! Another clash of events is that one of the six trailblazers portrayed in the special supplement is Timnit Gebru, “former co-lead of the Ethical AI Team at Google”, who parted ways with Google at the time the issue was published. (See Andrew’s blog for a discussion of her firing. And the MIT Technology Review for an analysis of the paper potentially at the source of it.)

Francis Bach à l’Académie des Sciences

Posted in Statistics on April 8, 2020 by xi'an

Congrats to Francis Bach, freshly elected to the French Academy of Sciences, joining Stéphane Mallat²⁰¹⁴ and Éric Moulines²⁰¹⁷ as data science academicians!

double descent

Posted in Books, Statistics, University life on November 7, 2019 by xi'an

Last Friday, I [and a few hundred others!] went to the SMILE (Statistical Machine Learning in Paris) seminar where Francis Bach was giving a talk. (With a pleasant ride from Dauphine along the Seine river.) Francis was talking about the double descent phenomenon observed in recent papers by Belkin & al. (2018, 2019) and Mei & Montanari (2019). (As the seminar room at INRIA was quite crowded and as I was sitting X-legged on the floor close to the screen, I took a few slides from below!) The phenomenon is that the usual U curve warning about over-fitting, reproduced in most statistics and machine-learning courses, can under the right circumstances be followed by a second decrease in the testing error when the number of features goes beyond the number of observations. This is rather puzzling and counter-intuitive, so I briefly checked the 2019 [8 page] article by Belkin & al., who study two examples, including a standard “large p small n” Gaussian regression, where the authors state that

“However, as p grows beyond n, the test risk again decreases, provided that the model is fit using a suitable inductive bias (e.g., least norm solution).”

One explanation [I found after checking the paper] is that the variates (features) in the regression are selected at random rather than in an optimal sequential order. Double descent is missing with interpolating and deterministic estimators, hence the requirement, in principle, that all candidate variates be included to achieve minimal averaged error. The infinite spike occurs when the number p of variates is near the number n of observations. (The expectation also accounts for the randomisation in T, a randomisation that remains an unclear feature of this framework…)
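To illustrate the phenomenon, here is a minimal simulation sketch (not the authors' own experiment): a Gaussian linear regression fitted by minimum-norm least squares on p randomly selected variates, with the sample size, total number of features, noise level, and replication numbers below being arbitrary picks of mine. The averaged test risk should climb as p nears n and decrease again past the interpolation threshold, tracing the second descent.

```python
# Minimal sketch of double descent for minimum-norm least squares on randomly
# selected variates, in the spirit of Belkin & al.'s Gaussian example; the
# dimensions, noise level, and replication numbers are arbitrary, not theirs.
import numpy as np

rng = np.random.default_rng(0)
n, d, sigma = 40, 200, 0.5              # sample size, total variates, noise sd
beta = rng.normal(size=d) / np.sqrt(d)  # true regression coefficients

def test_risk(p, n_test=1000, n_rep=100):
    """Averaged test error when regressing on p randomly chosen variates."""
    risks = []
    for _ in range(n_rep):
        S = rng.choice(d, size=p, replace=False)  # random subset of variates
        X = rng.normal(size=(n, d))
        y = X @ beta + sigma * rng.normal(size=n)
        coef = np.linalg.pinv(X[:, S]) @ y        # least norm (pseudo-inverse) fit
        Xt = rng.normal(size=(n_test, d))
        yt = Xt @ beta + sigma * rng.normal(size=n_test)
        risks.append(np.mean((Xt[:, S] @ coef - yt) ** 2))
    return np.mean(risks)

for p in (5, 10, 20, 30, 38, 40, 42, 60, 100, 200):
    print(p, round(test_risk(p), 3))
# the averaged risk spikes as p nears n = 40 and decreases again beyond the
# interpolation threshold, i.e., the second descent
```

(The random subset S above plays the part of the randomised selection of variates discussed in the previous paragraph.)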
