Tuesday 26 November 2013, 2:00 pm
Salle de Conseil, 4th floor (LINCS), 23 avenue d'Italie, 75013 Paris
Talk title: Feature Selection for Neuro-Dynamic Programming
Neuro-Dynamic Programming encompasses techniques from both reinforcement learning and approximate dynamic programming. Feature selection refers to the choice of basis that defines the function class that is required in the application of these techniques. This talk reviews two popular approaches to neuro-dynamic programming, TD-learning and Q-learning. The main goal of this work is to demonstrate how insight from idealized models can be used as a guide for feature selection for these algorithms. Several approaches are surveyed, including fluid and diffusion models, and the application of idealized models arising from mean-field game approximations. The theory is illustrated with several examples.
Archive for Gainesville
I was attending a lecture this morning at CREST by Patrice Bertail, where he used estimated renewal parameters of a Markov chain to build (asymptotically) convergent bootstrap procedures. Estimating renewal parameters is obviously of interest for MCMC algorithms, as they can be used to assess the convergence of the associated Markov chain, provided the estimation does not induce a significant bias. Another question that came to me during the talk is that, since those convergence assessment techniques formally hold for any small set, choosing the small set that maximises the renewal rate also maximises the number of renewal events, and hence the number of terms in the control sequence. Thus the maximal renewal rate þ is definitely a quantity of interest. Now, is this quantity þ an intrinsic parameter of the chain, i.e., a quantity that drives its mixing and/or convergence behaviour? For instance, an iid sequence has a renewal rate of 1, because the whole space is then a "small" set. Informally, the time between two consecutive renewal events is akin to the time between two simulations from the target and stationary distribution, according to the Kac representation we used in our AAP paper with Jim Hobert. So it could be that þ is directly related to the effective sample size of the chain, hence to the autocorrelation. (A quick web search did not produce anything relevant.) Too bad this question did not pop up last week, when I had the opportunity to discuss it with Sean Meyn in Gainesville!
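To make the musing concrete, here is a toy sketch, entirely mine and not from Patrice's talk: a symmetric two-state chain that stays put with probability p, for which the whole space is a small set. The minorisation constant and the effective-sample-size ratio both have closed forms and indeed move together; the splitting construction and all function names are hypothetical choices for the illustration.

```python
import random

def renewal_rate(p):
    # the whole space {0, 1} is a small set: P(x, y) >= eps * nu(y)
    # with nu uniform holds for eps = 2 * min(p, 1 - p)
    return 2 * min(p, 1 - p)

def ess_ratio(p):
    # lag-one autocorrelation of the chain is rho = 2p - 1; for this
    # AR(1)-like autocorrelation structure, ESS / n = (1 - rho) / (1 + rho)
    rho = 2 * p - 1
    return (1 - rho) / (1 + rho)

def regeneration_frequency(p, n, seed=1):
    # Nummelin splitting: at each step, with probability eps the next
    # state is a fresh draw from nu (a renewal event); otherwise it
    # comes from the residual kernel (P - eps * nu) / (1 - eps)
    eps = renewal_rate(p)
    rng = random.Random(seed)
    x, regen = 0, 0
    for _ in range(n):
        if rng.random() < eps:
            x = rng.randint(0, 1)   # renewal: fresh draw from nu
            regen += 1
        else:
            stay = (p - eps / 2) / (1 - eps)
            x = x if rng.random() < stay else 1 - x
    return regen / n

for p in (0.5, 0.7, 0.9):
    print(p, renewal_rate(p), round(ess_ratio(p), 3))
```

As p moves away from 1/2 the chain becomes stickier: the renewal rate drops from its iid value of 1 and the effective sample size drops with it, which is at least consistent with the conjectured link between þ and the autocorrelation.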
On day #2, besides my talk on "empirical Bayes" (ABCel) computation (mostly recycled from Varanasi, photos included), Christophe Andrieu gave a talk on exact approximations, using unbiased estimators of the likelihood and characterising estimators guaranteeing geometric convergence (essentially bounded weights, a condition popping up again and again in the Monte Carlo literature). Then Art Owen (father of empirical likelihood, among other things!) spoke about QMC for MCMC, a topic that has always intrigued me.
Indeed, while I see the point of using QMC for specific integration problems, I am more uncertain about its relevance for statistics as a simulation device. Having points distributed over the unit hypercube much more evenly than a random sample does not help much when only a tiny region of the unit hypercube, namely the one where the likelihood concentrates, matters. (In other words, we are rarely interested in the uniform distribution over the unit hypercube: we instead want to simulate from a highly irregular and definitely concentrated distribution.) I have the same reservation about the applicability of stratified sampling: the strata have to be constructed in relation to the target distribution. The method Art advocates uses a CUD (completely uniformly distributed) sequence as the underlying (deterministic) pseudo-uniform sequence. Highly interesting, and I want to read the paper in greater detail, but the fact that most simulation steps use a random number of uniforms seems detrimental to the performance of the method in general.
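For readers unfamiliar with the QMC gain on plain integration problems (as opposed to simulation), here is a minimal sketch using a van der Corput sequence rather than Art's CUD construction; the test function and sample size are arbitrary choices of mine:

```python
import random

def van_der_corput(n, base=2):
    """First n points of the van der Corput low-discrepancy sequence."""
    points = []
    for i in range(n):
        x, denom, k = 0.0, 1.0, i
        while k > 0:                 # radical-inverse digit expansion
            denom *= base
            x += (k % base) / denom
            k //= base
        points.append(x)
    return points

def estimate(f, points):
    return sum(f(x) for x in points) / len(points)

f = lambda x: x * x                  # true integral over [0, 1] is 1/3
n = 2 ** 12
rng = random.Random(0)
qmc_err = abs(estimate(f, van_der_corput(n)) - 1 / 3)
mc_err = abs(estimate(f, [rng.random() for _ in range(n)]) - 1 / 3)
print(qmc_err, mc_err)
```

On this smooth one-dimensional integrand the deterministic QMC error is of order 1/n, against the usual O(n^(-1/2)) Monte Carlo rate; the reservation above is precisely that this uniformity argument says little once the mass concentrates in a tiny corner of the hypercube.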
After a lunch break at a terrific BBQ place, with a stop at Lake Alice to watch the alligator(s) I had missed during my morning run, I was able this time to attend Xiao-Li Meng's talk till the end; he presented new improvements on bridge sampling based on location-scale (or warping) transforms of the two original samples that make them share mean and variance. Hani Doss concluded the meeting with a talk on the computation of Bayes factors under (non-parametric) Dirichlet mixture priors, whose resolution does not require simulations for each value of the scale parameter of the Dirichlet prior, thanks to a Radon-Nikodym derivative representation. (Which nicely connected with Art's talk, in that Art mentioned that most simulation methods are actually based on Riemann integration rather than Lebesgue integration. Hani's representation is not, with nested sampling being another example.)
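As a rough illustration of the warping idea (my own toy version, not Xiao-Li's actual estimator), here is a geometric-bridge estimate of a ratio of normalising constants in which the second sample is first location-scale warped to share the mean and variance of the first; the densities and all names are made up for the example:

```python
import math, random

def moments(xs):
    m = sum(xs) / len(xs)
    s = math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))
    return m, s

def bridge_ratio(q1, q2, x1, x2):
    """Geometric-bridge estimate of Z1/Z2, after warping sample 2
    (location-scale) so that it shares the mean and variance of sample 1."""
    m1, s1 = moments(x1)
    m2, s2 = moments(x2)
    # warp sample 2 onto sample 1's scale; the Jacobian factor s2/s1
    # keeps the warped density's normalising constant equal to Z2
    y = [m1 + (x - m2) * s1 / s2 for x in x2]
    q2w = lambda t: q2(m2 + (t - m1) * s2 / s1) * (s2 / s1)
    num = sum(math.sqrt(q1(t) / q2w(t)) for t in y) / len(y)
    den = sum(math.sqrt(q2w(t) / q1(t)) for t in x1) / len(x1)
    return num / den

rng = random.Random(42)
q1 = lambda t: math.exp(-t * t / 2)            # unnormalised N(0,1), Z1 = sqrt(2 pi)
q2 = lambda t: math.exp(-((t - 3) ** 2) / 8)   # unnormalised N(3,4), Z2 = sqrt(8 pi)
x1 = [rng.gauss(0, 1) for _ in range(10000)]
x2 = [rng.gauss(3, 2) for _ in range(10000)]
r = bridge_ratio(q1, q2, x1, x2)               # true Z1/Z2 = 0.5
print(r)
```

Here the warp makes the two samples overlap almost perfectly, which is what keeps the bridge estimator stable even though the original samples hardly overlap at all.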
We ended the day with a(nother) barbecue outside, under the stars, in the peace and quiet of a local wood, with wine and laughs, just as George would have concluded the workshop. This was a fitting ending to a meeting dedicated to his memory…
After a rather long flight, I arrived in Gainesville for this special Winter workshop. We indeed had to wait for hours in Paris for the plane to be de-iced, and then the ride to Atlanta was terribly lengthy (esp. after I realised that all the files I had arXived for the trip were on my "other" computer… and that the book intended for the trip still stood under the Xmas tree…!) I just managed to read a book for review, rewrite my slides, and watch two movies, plus the last part of one I had started on my way back from India…
Anyway, here I am, back in Gainesville, a few years after my last visit, quite glad to meet again with old friends while terribly missing George Casella. The conference is actually dedicated to his memory. The schedule is well done, once again giving speakers plenty of time and participants plenty of breaks, along with a superb conference room with tables and plugs. I listened to and enjoyed all the talks, but the one that did not overlap with the latest workshop at ICERM was Dawn Woodard's, with a challenging data-analysis problem about ambulance routes in Toronto. I clearly was not the only one finding this problem interesting and coming up with (mostly hare-brained!) alternatives. (Another highlight of the day was finding a 2007 Beaune premier cru at a very reasonable price in a local store!)
Today I am off (again!) to Florida, taking part in the Winter Workshop at the University of Florida, Gainesville. The theme this year is New Directions in Monte Carlo Methods. I am quite excited to meet again with many old friends (this almost sounds like a rehearsal for MCMSki 4!), but also sad that George Casella, who would have been my oldest friend there, will be missing. Dearly missing and missed. At the same time, I appreciate that this workshop gives me the opportunity to meet at last with George's family (with whom I share so many memories) and his colleagues at UFL. I am sure we will have plenty of pizzas, wine(s), and laughs in remembrance of the numerous good times we all had with George. And I will run the streets we ran together, quite a while ago…
Got the following email from Amazon:
Today we have added a new feature, Amazon Author Rank, the definitive list of best-selling authors on Amazon.com. This list makes it easy for readers to discover the best-selling authors on Amazon.com overall and within a selection of major genres. Your Amazon Author Rank is 44,881 in Print Books.
It is a new feature, so, with a very limited past horizon, this rank seems to be moving wildly! (For instance, it is now 36,776, just a few hours later.) But so are the individual book sales. Hence a clear lack of smoothing in the indicator.
Another interesting feature of this Author Central facility is the display of US sales by district. Not only because it shows that New York and San Francisco are the cities where I sell the most books (great!), but also because it uses the notion of "combined areas", aggregating "the copies sold in these sparsely populated areas in order to obscure any single retailer's sales". A good display of data protection (even though the level of aggregation sounds too high to me, resulting in "combined areas" being the 3rd highest sales area). And including Gainesville, Florida, and Ithaca, New York, the two latest locations of George Casella, in this combination!