Archive for INRIA

crowd-based peer review

Posted in Statistics on June 20, 2017 by xi'an

In clear connection with my earlier post on Peer Community In… and my visit this week to Montpellier towards starting a Peer Community In Computational Statistics, I read a tribune in Nature (1 June, p.9) by the editor of Synlett, Benjamin List, describing an experiment conducted by this journal in chemical synthesis. The approach was to post (volunteered) submitted papers on a platform accessible to a list of 100 reviewers, nominated by the editorial board, who could anonymously comment on the papers and read others’ equally anonymous comments. With a 72-hour deadline! According to Benjamin List (and based on a large dataset of … 10 papers!), the outcome of the experiment is reviews of better quality than under traditional reviewing policies. While Peer Community In… does not work exactly this way, and does not aim at operating as a journal, it is exciting and encouraging to see such experiments unfold!

Michael Jordan’s seminar in Paris next week

Posted in Statistics, University life on June 3, 2016 by xi'an

Next week, on June 7, at 4pm, Michael will give a seminar at INRIA, rue du Charolais, Paris 12 (map). Here is the abstract:

A Variational Perspective on Accelerated Methods in Optimization

Accelerated gradient methods play a central role in optimization, achieving optimal rates in many settings. While many generalizations and extensions of Nesterov’s original acceleration method have been proposed, it is not yet clear what is the natural scope of the acceleration concept. In this paper, we study accelerated methods from a continuous-time perspective. We show that there is a Lagrangian functional that we call the Bregman Lagrangian which generates a large class of accelerated methods in continuous time, including (but not limited to) accelerated gradient descent, its non-Euclidean extension, and accelerated higher-order gradient methods. We show that the continuous-time limit of all of these methods corresponds to travelling the same curve in space-time at different speeds, and in this sense the continuous-time setting is the natural one for understanding acceleration. Moreover, from this perspective, Nesterov’s technique and many of its generalizations can be viewed as a systematic way to go from the continuous-time curves generated by the Bregman Lagrangian to a family of discrete-time accelerated algorithms. [Joint work with Andre Wibisono and Ashia Wilson.]

(Interested readers need to register to attend the lecture.)
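
For readers who have not met the acceleration mentioned in the abstract, here is a minimal sketch (mine, not taken from the talk) comparing plain gradient descent with a textbook form of Nesterov's accelerated gradient on a toy ill-conditioned quadratic. The objective, the curvature values, the step size 1/L and the momentum schedule (k-1)/(k+2) are all assumptions chosen for illustration.

```python
# Toy objective (an assumption for illustration): f(x) = 0.5 * sum_i a_i * x_i^2,
# with curvatures a_i spread over three orders of magnitude.
CURVATURES = [1.0, 0.1, 0.001]


def f(x):
    """Value of the quadratic objective at x."""
    return 0.5 * sum(a * xi * xi for a, xi in zip(CURVATURES, x))


def grad(x):
    """Gradient of the quadratic objective at x."""
    return [a * xi for a, xi in zip(CURVATURES, x)]


def gradient_descent(x0, step, n_iter):
    """Plain gradient descent with a fixed step size."""
    x = list(x0)
    for _ in range(n_iter):
        g = grad(x)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


def nesterov(x0, step, n_iter):
    """Nesterov-style accelerated gradient: extrapolate, then take a gradient step."""
    x_prev = list(x0)
    x = list(x0)
    for k in range(1, n_iter + 1):
        momentum = (k - 1) / (k + 2)  # one common schedule for convex problems
        # Look-ahead point built from the last two iterates.
        y = [xi + momentum * (xi - xpi) for xi, xpi in zip(x, x_prev)]
        g = grad(y)
        x_prev = x
        x = [yi - step * gi for yi, gi in zip(y, g)]
    return x


x0 = [1.0, 1.0, 1.0]
step = 1.0 / max(CURVATURES)  # 1/L, with L the largest curvature
x_gd = gradient_descent(x0, step, n_iter=200)
x_nag = nesterov(x0, step, n_iter=200)
print("gradient descent:", f(x_gd))
print("accelerated     :", f(x_nag))
```

On this example the accelerated iterates should reach a markedly lower objective value after the same number of gradient evaluations, which is the discrete-time phenomenon the continuous-time analysis in the abstract sets out to explain.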

Bangalore workshop [ಬೆಂಗಳೂರು ಕಾರ್ಯಾಗಾರ] and new book

Posted in Books, pictures, R, Statistics, Travel, University life on August 13, 2014 by xi'an

On the last day of the IFCAM workshop in Bangalore, Marc Lavielle from INRIA presented a talk on mixed effects where he illustrated his original computer language Monolix. And mentioned that his CRC Press book on Mixed Effects Models for the Population Approach was out! (Appropriately listed as out on a 14th of July on amazon!) He actually demonstrated the abilities of Monolix live and on diabetes data provided by an earlier speaker from Kolkata, which was a perfect way to initiate a collaboration! Nice cover (which is all I saw from the book at this stage!) that maybe will induce candidates to write a review for CHANCE. Estimation of those mixed effect models relies on stochastic EM algorithms developed by Marc Lavielle and Éric Moulines in the 90’s, as well as MCMC methods.

David Blei smile in Paris (seminar)

Posted in Statistics, Travel, University life on October 30, 2013 by xi'an

Nicolas Chopin just reminded me of a seminar given by David Blei in Paris tomorrow (at 4pm, SMILE seminar, INRIA, 23 avenue d’Italie, 5th floor, orange room) on Stochastic Variational Inference and Scalable Topic Models, a machine learning seminar that I will alas miss, being busy giving mine at CMU. Here is the abstract:

Probabilistic topic modeling provides a suite of tools for analyzing
large collections of electronic documents.  With a collection as
input, topic modeling algorithms uncover its underlying themes and
decompose its documents according to those themes.  We can use topic
models to explore the thematic structure of a large collection of
documents or to solve a variety of prediction problems about text.

Topic models are based on hierarchical mixed-membership models,
statistical models where each document expresses a set of components
(called topics) with individual per-document proportions. The
computational problem is to condition on a collection of observed
documents and estimate the posterior distribution of the topics and
per-document proportions. In modern data sets, this amounts to
posterior inference with billions of latent variables.

How can we cope with such data?  In this talk I will describe
stochastic variational inference, a general algorithm for
approximating posterior distributions that are conditioned on massive
data sets.  Stochastic inference is easily applied to a large class of
hierarchical models, including time-series models, factor models, and
Bayesian nonparametric models.  I will demonstrate its application to
topic models fit with millions of articles.  Stochastic inference
opens the door to scalable Bayesian computation for modern data

PhD+postdoc grant on ABC

Posted in Statistics, University life on April 3, 2010 by xi'an

I have received the following email announcement about a joint INRA/INRIA PhD grant on statistical methods for high-throughput genomics, backed by an additional two-year postdoc contract:

“Identifier les signatures de sélection dans les données issues de la génomique haut-débit : développement de modèles et de méthodes d’analyse statistique”.
Le développement rapide des technologies de séquençage et de génotypage à haut débit permet désormais de produire de très grandes quantités de données de polymorphisme génétique à une échelle populationnelle, y compris chez des espèces « non-modèles ». Dans ce contexte, la recherche de marqueurs moléculaires portant des signatures de sélection est primordiale pour comprendre la dynamique de l’adaptation. Cette thèse aura donc pour objet de développer des méthodes d’analyse statistique innovantes, pour caractériser la typologie des marqueurs génétiques vis-à-vis de leur statut évolutif. Ces méthodes seront développées dans un cadre bayésien, et se concentreront sur les outils stochastiques afférents (méthodes MCMC et approche ABC lorsque la vraisemblance n’est pas accessible) et les techniques de sélection de variables.

whose Google translation (lightly corrected) is

The fast development of high-throughput sequencing and genotyping technologies now makes it possible to produce very large quantities of genetic polymorphism data at a population scale, including in “non-model” species. In this context, the search for molecular markers carrying selection signatures is essential to understand the dynamics of adaptation. This thesis will thus aim to develop innovative statistical analysis methods to characterize the typology of genetic markers with respect to their evolutionary status. These methods will be developed in a Bayesian framework, and will concentrate on the related stochastic tools (MCMC methods, and the ABC approach when the likelihood is not available) and on variable selection techniques.

It involves my friend and coauthor Gilles Celeux (Paris Sud, Orsay) as one of the advisors, as well as two researchers from the place that taught me everything about ABC, the INRA CBGP (Centre de Biologie et de Gestion des Populations) lab in Montpellier: Mathieu Gautier and Renaud Vitalis. It is thus a highly interesting proposal whose deadline is April 23.
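
As a reminder of what ABC does when the likelihood is not available, here is a minimal rejection-sampler sketch. It has nothing to do with the genomic models of the proposal: the Gaussian toy model, the uniform prior on (-5, 5), the sample mean as summary statistic, and the tolerance of 0.1 are all assumptions made for illustration only.

```python
import random
import statistics


def simulate(mu, n, rng):
    """Simulate n observations from the toy model N(mu, 1)."""
    return [rng.gauss(mu, 1.0) for _ in range(n)]


def abc_rejection(observed, n_draws, tolerance, rng):
    """Keep prior draws whose simulated summary lands close to the observed one."""
    obs_summary = statistics.fmean(observed)
    accepted = []
    for _ in range(n_draws):
        mu = rng.uniform(-5.0, 5.0)  # draw a parameter from the (assumed) prior
        sim_summary = statistics.fmean(simulate(mu, len(observed), rng))
        # The likelihood is never evaluated: only simulated and observed
        # summaries are compared.
        if abs(sim_summary - obs_summary) < tolerance:
            accepted.append(mu)
    return accepted


rng = random.Random(2010)
observed = simulate(2.0, 30, rng)  # pseudo-observed data with true mean 2
posterior_sample = abc_rejection(observed, n_draws=5000, tolerance=0.1, rng=rng)
print(len(posterior_sample), "accepted draws")
print("ABC posterior mean:", statistics.fmean(posterior_sample))
```

The accepted values form an approximate posterior sample. Shrinking the tolerance sharpens the approximation at the cost of fewer acceptances, which is the basic trade-off any ABC scheme has to manage.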