Statistics month in Marseilles (CIRM)

Posted in Books, Kids, Mountains, pictures, Running, Statistics, Travel, University life, Wines on June 24, 2015 by xi'an

Calanque de Morgiou, Marseille, July 7, 2010

Next February, the fabulous Centre International de Recherche en Mathématiques (CIRM) in Marseilles, France, will hold a Statistics month, with a programme spread over five weeks.

Each week will see minicourses of a few hours (2-3) and advanced talks, leaving time for interactions and collaborations. (I will give one of those minicourses on Bayesian foundations.) The scientific organisers of the B’ week are Gilles Celeux and Nicolas Chopin.

The CIRM is a wonderful meeting place, in the mountains between Marseilles and Cassis, with many trails to walk and run, and hundreds of fantastic climbing routes in the Calanques at all levels. (In February, the sea is too cold to contemplate swimming. The good side is that it is not too warm to climb and the risk of bush fire is very low!) We stayed there with Jean-Michel Marin a few years ago when preparing Bayesian Essentials. The maths and stats library is well-stocked, with permanent access for quiet working sessions. This is the French version of the equally fantastic German Mathematisches Forschungsinstitut Oberwolfach. There will be financial support available from the supporting societies and research bodies, at least for young participants, and the costs, if any, are low for excellent food and lodging. Definitely not a scam conference!

whazzat?! [scam conferences inc.]

Posted in Kids, Mountains, pictures, Travel, University life on June 24, 2015 by xi'an

Tour Eiffel from Pont de l'Alma, Paris, Dec. 16, 2012

Earlier today, I received an invitation to give a plenary talk at a Probability and Statistics Conference in Marrakech, a nice location if any! As it came from a former graduate student from the University of Rouen (where I taught before Paris-Dauphine), and despite an already heavy travelling schedule for 2016!, I considered his offer. And looked for the conference webpage to find the dates, as my correspondent had forgotten to include those. Instead of the genuine conference webpage, which had not yet been created, what I found was a fairly unpleasant scheme playing on the same conference name and location, but run by a predatory conglomerate called WASET. WASET stands for World Academy of Science, Engineering, and Technology. Their website lists thousands of conferences, all in nice, touristy places, and all with an identical webpage. For instance, there is the ICMS 2015: 17th International Conference on Mathematics and Statistics next week. With a huge “conference committee” but not a single name I can identify. And no-one from France. Actually, the website kindly offers entry by city as well as by topic, which helps in spotting that a large number of ICMS conferences all take place on the same dates and at the same hotel in Paris… The trick is indeed to attract speakers with the promise of publication in a special issue of a bogus journal and to have them pay 600€ in registration and publication fees, only to have all topics mixed together in a few conference rooms, according to many testimonies I later found on the web. And as is clear from the posted conference program! In the “best” of cases, since other testimonies mention lost fees and rejected registrations. Testimonies also mention this tendency to reproduce the acronym of a local conference. While it is not unheard of for conferences to amount to academic tourism, even from the most established scientific societies!, I am quite amazed at the scale of this enterprise, even though I cannot completely understand how people can fall for it. Looking at the website, the fees, the unrelated scientific committee, and the lack of scientific program should be enough to put those victims off. Unless they truly want to partake in academic tourism, obviously.

arXiv frenzy

Posted in R, Statistics, University life on June 23, 2015 by xi'an

In the past few days, there have been so many arXiv postings of interest—presumably the NIPS submission effect!—that I cannot hope to cover them in the coming weeks! Hopefully, some will still come out on the ‘Og in the near future:

  • arXiv:1506.06629: Scalable Approximations of Marginal Posteriors in Variable Selection by Willem van den Boom, Galen Reeves, David B. Dunson
  • arXiv:1506.06285: The MCMC split sampler: A block Gibbs sampling scheme for latent Gaussian models by Óli Páll Geirsson, Birgir Hrafnkelsson, Daniel Simpson, Helgi Sigurðarson [also deserves a special mention for gathering only ***son authors!]
  • arXiv:1506.06268: Bayesian Nonparametric Modeling of Higher Order Markov Chains by Abhra Sarkar, David B. Dunson
  • arXiv:1506.06117: Convergence of Sequential Quasi-Monte Carlo Smoothing Algorithms by Mathieu Gerber, Nicolas Chopin
  • arXiv:1506.06101: Robust Bayesian inference via coarsening by Jeffrey W. Miller, David B. Dunson
  • arXiv:1506.05934: Expectation Particle Belief Propagation by Thibaut Lienart, Yee Whye Teh, Arnaud Doucet
  • arXiv:1506.05860: Variational Gaussian Copula Inference by Shaobo Han, Xuejun Liao, David B. Dunson, Lawrence Carin
  • arXiv:1506.05855: The Frequentist Information Criterion (FIC): The unification of information-based and frequentist inference by Colin H. LaMont, Paul A. Wiggins
  • arXiv:1506.05757: Bayesian Inference for the Multivariate Extended-Skew Normal Distribution by Mathieu Gerber, Florian Pelgrin
  • arXiv:1506.05741: Accelerated dimension-independent adaptive Metropolis by Yuxin Chen, David Keyes, Kody J.H. Law, Hatem Ltaief
  • arXiv:1506.05269: Bayesian Survival Model based on Moment Characterization by Julyan Arbel, Antonio Lijoi, Bernardo Nipoti
  • arXiv:1506.04778: Fast sampling with Gaussian scale-mixture priors in high-dimensional regression by Anirban Bhattacharya, Antik Chakraborty, Bani K. Mallick
  • arXiv:1506.04416: Bayesian Dark Knowledge by Anoop Korattikara, Vivek Rathod, Kevin Murphy, Max Welling [a special mention for this title!]
  • arXiv:1506.03693: Optimization Monte Carlo: Efficient and Embarrassingly Parallel Likelihood-Free Inference by Edward Meeds, Max Welling
  • arXiv:1506.03074: Variational consensus Monte Carlo by Maxim Rabinovich, Elaine Angelino, Michael I. Jordan
  • arXiv:1506.02564: Gradient-free Hamiltonian Monte Carlo with Efficient Kernel Exponential Families by Heiko Strathmann, Dino Sejdinovic, Samuel Livingstone, Zoltan Szabo, Arthur Gretton [comments coming soon!]

ABC for big data

Posted in Books, Statistics, University life on June 23, 2015 by xi'an

“The results in this paper suggest that ABC can scale to large data, at least for models with a fixed number of parameters, under the assumption that the summary statistics obey a central limit theorem.”

In a week rich with arXiv submissions about MCMC and “big data”, like the Variational consensus Monte Carlo of Rabinovich et al., or the scalable Bayesian inference via particle mirror descent of Dai et al., Wentao Li and Paul Fearnhead contributed an impressive paper entitled Behaviour of ABC for big data. However, a word of warning: the title is somewhat misleading in that the paper does not address the issue of big or tall data per se, e.g., the impossibility to handle the whole data at once and to reproduce it by simulation, but rather the asymptotics of ABC. The setting is not dissimilar to the earlier Fearnhead and Prangle (2012) Read Paper. The central theme of this theoretical paper [with 24 pages of proofs!] is to study the connection between the number N of Monte Carlo simulations and the tolerance value ε when the number of observations n goes to infinity. A main result in the paper is that the ABC posterior mean can have the same asymptotic distribution as the MLE when ε = o(n^{-1/4}). This is however of no direct use in practice, as the second main result shows that the Monte Carlo variance is well-controlled only when ε = O(n^{-1/2}). There is therefore a sort of tension in the conclusions, between the tolerance rate that delivers the positive equivalence with the MLE and the more stringent rate needed to keep the Monte Carlo error under control.
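
As a purely illustrative numerical check of this tension, here is a toy rejection-ABC sketch in Python for the mean of a N(θ,1) sample, with the sample mean as summary statistic; the flat prior, the Monte Carlo size N, and the tolerance sequences ε ∝ n^{-1/4} and ε ∝ n^{-1/2} are my choices for illustration, not the authors' recommendations:

```python
import numpy as np

rng = np.random.default_rng(1)

def abc_reject(y, eps, N=10**6, prior_width=10.0):
    """Plain rejection ABC for theta in a N(theta, 1) model, using the
    sample mean as (sufficient) summary statistic and a flat prior."""
    n = len(y)
    s_obs = y.mean()
    theta = rng.uniform(-prior_width, prior_width, N)
    # simulate the summary directly: the mean of n N(theta, 1) draws
    s_sim = theta + rng.normal(0.0, 1.0 / np.sqrt(n), N)
    keep = np.abs(s_sim - s_obs) < eps
    return theta[keep], keep.mean()

for n in (10**2, 10**4, 10**6):
    y = rng.normal(0.0, 1.0, n)
    for eps, label in ((n**-0.25, "n^-1/4"), (n**-0.5, "n^-1/2")):
        sample, acc = abc_reject(y, eps)
        print(f"n={n:>7}, eps ~ {label}: posterior mean "
              f"{sample.mean():+.4f}, acceptance rate {acc:.6f}")
```

The run illustrates the trade-off stated above: the tighter ε = O(n^{-1/2}) tolerance keeps the ABC output honest, but the acceptance rate collapses as n grows, so the Monte Carlo effort must grow with it.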

Something I have (slight) trouble with is the construction of an importance sampling function of the form f_ABC(s|θ)^α when, obviously, this function cannot be used for simulation purposes. The authors point out this fact, but still build an argument about the optimal choice of α, namely away from 0 and 1, like ½. Actually, any value different from 0 and 1 is sensible, meaning that the range of acceptable importance functions is wide. Most interestingly (!), the paper constructs an iterative importance sampling ABC in a spirit similar to the ABC-PMC of Beaumont et al. (2009). Even more interestingly, the ½ factor amounts to updating the scale of the proposal as twice the scale of the target, just as in PMC.
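
To make the parallel concrete, here is a minimal ABC-PMC sketch in Python in the spirit of Beaumont et al. (2009); the toy model, prior, and tolerance sequence are placeholders of mine, and the line of interest is the one doubling the particle-cloud variance for the next proposal:

```python
import numpy as np

rng = np.random.default_rng(2)

def abc_pmc(s_obs, simulate, prior_draw, prior_pdf, eps_seq, N=500):
    """Minimal ABC-PMC: particles are moved with a Gaussian kernel whose
    variance is TWICE the weighted variance of the previous cloud, then
    reweighted by prior / mixture-proposal density."""
    theta = np.array([prior_draw() for _ in range(N)])
    w = np.full(N, 1.0 / N)
    for eps in eps_seq:
        var = 2.0 * np.cov(theta, aweights=w)        # the 'twice the scale' rule
        new_theta, new_w = np.empty(N), np.empty(N)
        for i in range(N):
            while True:
                parent = theta[rng.choice(N, p=w)]   # resample a parent
                prop = parent + rng.normal(0.0, np.sqrt(var))
                if prior_pdf(prop) > 0 and abs(simulate(prop) - s_obs) < eps:
                    break                            # ABC acceptance
            kern = (np.exp(-(prop - theta) ** 2 / (2 * var))
                    / np.sqrt(2 * np.pi * var))
            new_theta[i] = prop
            new_w[i] = prior_pdf(prop) / np.sum(w * kern)
        theta, w = new_theta, new_w / new_w.sum()
    return theta, w

# toy run: one N(theta, 1) observation equal to 1.3, flat prior on (-10, 10)
theta, w = abc_pmc(
    1.3,
    simulate=lambda th: th + rng.normal(),
    prior_draw=lambda: rng.uniform(-10, 10),
    prior_pdf=lambda th: 0.05 if -10 < th < 10 else 0.0,
    eps_seq=[2.0, 1.0, 0.5],
)
print("weighted ABC posterior mean:", np.sum(w * theta))
```

The kernel variance being twice the weighted empirical variance of the current cloud is exactly the scale update that the ½ factor leads to in the paper.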

Another aspect of the analysis I fail to grasp is the reason for keeping the Monte Carlo sample size at a fixed value N, while setting a sequence of acceptance probabilities (or of tolerances) along iterations. This is a very surprising result in that the Monte Carlo error does remain under control and does not dominate the overall error!

“Whilst our theoretical results suggest that point estimates based on the ABC posterior have good properties, they do not suggest that the ABC posterior is a good approximation to the true posterior, nor that the ABC posterior will accurately quantify the uncertainty in estimates.”

Overall, this is clearly a paper worth reading for understanding the convergence issues related to ABC, with more theoretical support than the earlier Fearnhead and Prangle (2012). However, it does not provide guidance on the construction of a sequence of Monte Carlo samples, nor does it discuss the selection of the summary statistic, which obviously has a major impact on the efficiency of the estimation. And, to relate to the earlier warning, it does not cope with “big data” in that it reproduces the original simulation of the n-sized sample.

on Markov chain Monte Carlo methods for tall data

Posted in Books, Statistics, University life on June 22, 2015 by xi'an

Rémi Bardenet, Arnaud Doucet, and Chris Holmes arXived a long paper (with the above title) a month ago, a paper that I did not have time to read in detail till today. The paper is quite comprehensive in its analysis of the current literature on MCMC for huge, tall, or big data. Even including our delayed acceptance paper! Now, it is indeed the case that we are all still struggling with this size difficulty, making proposals in a wide range of directions and hopefully improving the efficiency of dealing with tall data. However, we are not there yet, in that the outcome is either about as costly as the original MCMC implementation, or its degree of approximation is unknown, even when bounds are available.

Most of the paper's proposal is based on deriving an unbiased estimator of the likelihood function in a pseudo-marginal manner à la Andrieu and Roberts (2009), via a random subsampling scheme that presumes (a) iid-ness and (b) a lower bound on each term in the likelihood. It seems to me slightly unrealistic to assume that a much cheaper and tight lower bound on those terms could be available. Firmly set in the iid framework, the problem itself is unclear: do we need 10⁸ observations of a logistic model with a few parameters? The real challenge is rather in non-iid hierarchical models with random effects and complex dependence structures, for which subsampling gets much more delicate. None of the methods surveyed in the paper touches upon such situations where the entire data cannot be explored at once.
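
For concreteness, here is a schematic pseudo-marginal-style loop in Python with a subsampled log-likelihood—a sketch of mine, not the authors' algorithm; the comments flag where the unbiasedness requirement, and hence their lower-bound assumption, enters:

```python
import numpy as np

rng = np.random.default_rng(3)

def subsampled_loglik(theta, x, m):
    """Unbiased estimate of the full N(theta, 1) log-likelihood from a
    subsample of size m. Caveat: exp() of an unbiased log-likelihood
    estimate is a *biased* likelihood estimate, while exact pseudo-marginal
    validity requires an unbiased likelihood estimate -- this is where the
    lower bounds on the individual terms come into play in the paper."""
    idx = rng.choice(len(x), size=m, replace=False)
    return (len(x) / m) * np.sum(-0.5 * (x[idx] - theta) ** 2)

def noisy_mh(x, n_iter=5000, m=100, step=0.05):
    """Metropolis-Hastings with a noisy likelihood, recycling the estimate
    at the current state as in pseudo-marginal schemes (flat prior).
    Because of the bias above, this chain targets only an approximation."""
    theta = x.mean()
    ll = subsampled_loglik(theta, x, m)
    chain = np.empty(n_iter)
    for t in range(n_iter):
        prop = theta + step * rng.normal()
        ll_prop = subsampled_loglik(prop, x, m)
        if np.log(rng.uniform()) < ll_prop - ll:
            theta, ll = prop, ll_prop
        chain[t] = theta
    return chain

x = rng.normal(1.0, 1.0, 10**5)          # 'tall' toy dataset
print("posterior mean estimate:", noisy_mh(x).mean())
```

Running this toy also makes plain how noisy the subsampled log-likelihood ratio is for small m, which is the practical obstacle the whole literature wrestles with.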

An interesting experiment therein, based on the Glynn and Rhee (2014) unbiased representation, shows that the approach does not work well. This could lead the community to reconsider the focus on unbiasedness by coming full circle to the opposition between bias and variance. And between intractable likelihood and representative subsample likelihood.

Reading the (superb) coverage of earlier proposals made me go back over the perceived appeal of the decomposition of Neiswanger et al. (2014), as I came to realise that the product of functions renormalised into densities has no immediate probabilistic connection with its components. As an extreme example, terms may fail to integrate. (Of course, there are many Monte Carlo features that exploit such a decomposition, from pseudo-marginal to accept-reject algorithms. And more to come.) Taking samples from terms in the product is thus not directly related to taking samples from each term, in contrast with the arithmetic mixture representation. I was first convinced by the use of a fraction of the prior in each term, but now find it unappealing, because there is no reason the prior should change for a smaller sample and no equivalent of the prohibition against using the data several times. At this stage, I would be much more in favour of raising a random portion of the likelihood function to the right power. An approach that I suggested to a graduate student earlier this year and which is also discussed in the paper. And considered too naïve and a “very poor approach” (Section 6, p.18), even though there must be versions that do not run afoul of the non-Gaussian nature of the log likelihood ratio. I am certainly going to peruse this Section 6 of the paper more thoroughly.
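
For reference, the decomposition in question (in my notation) splits the data x into S batches x_1,…,x_S and writes the posterior as a product of S “subposteriors”, each carrying a fraction of the prior:

```latex
\pi(\theta \mid x) \;\propto\; p(\theta) \prod_{s=1}^{S} p(x_s \mid \theta)
\;=\; \prod_{s=1}^{S} \underbrace{p(\theta)^{1/S}\, p(x_s \mid \theta)}_{\pi_s(\theta)}
```

Each factor π_s(θ) is sampled on a separate machine and the draws are then recombined; the point above is that an individual factor need not even be integrable in θ, so the “subposteriors” may fail to be posteriors in any probabilistic sense.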

Another interesting suggestion in this definitely rich paper is the foray into an alternative that bypasses the uniform sampling in the Metropolis-Hastings step, using instead the subsampled likelihood ratio. The authors call this “exchanging acceptance noise for subsampling noise” (p.22). However, there is no indication about the resulting stationary distribution, and I find the notion of only moving to higher likelihoods (or estimates thereof) counter to the spirit of Metropolis-Hastings algorithms. (I have also eventually realised the meaning of the log-normal “difficult” benchmark that I had missed earlier: it means log-normal data is modelled by a normal density.) And yet another innovation along the lines of a control variate for the log likelihood ratio, even though it sounds somewhat surrealistic.
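
The following lines are my schematic rendering of that exchange, not the authors' procedure: the uniform variate of Metropolis-Hastings is dropped and the subsampling noise of the estimated log likelihood ratio is left to play the part of the acceptance noise (flat prior, symmetric proposal):

```python
import numpy as np

rng = np.random.default_rng(4)

def noisy_accept(theta, prop, x, m):
    """Deterministic test on a subsampled log likelihood ratio for a
    N(theta, 1) model: no uniform variate, the subsampling noise itself
    acts as the acceptance noise. The chain only moves to (estimated)
    higher likelihoods, and its stationary distribution, if any, is
    unclear -- which is precisely the reservation voiced above."""
    idx = rng.choice(len(x), size=m, replace=False)
    llr = (len(x) / m) * np.sum(
        0.5 * (x[idx] - theta) ** 2 - 0.5 * (x[idx] - prop) ** 2
    )
    return llr > 0.0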

La Rochambelle 2015 [10K – 38:28 – 73rd & 4th V2]

Posted in Kids, pictures, Running, Travel on June 21, 2015 by xi'an

Another year attending La Rochambelle, the massive women-only race or walk against breast cancer in Caen, Normandy! With the fantastic sight of 20,000 runners in the same pink tee-shirt swarming downtown Caen and the arrival stadium. Which made it quite hard to spot my three relatives in the race! I also ran my fourth iteration of the 10k the next day, from the British War Cemetery of Cambes-en-Plaine to the Memorial for Peace in Caen. The conditions were not as good as last year, especially in terms of wind, and I lost one minute on my total time, as well as one position, the third V2 remaining tantalisingly a dozen metres in front of me till the end of the race. A mix of too light training, travel fatigue, and the psychological conviction I was going to end up fourth! Here are my split times, with a very fast start that showed up in the second half, near 4 min/km, when the third V2 passed me.

[split times from the Cambes-en-Plaine 10k]

Altos de Losada [guest wine post by Susie]

Posted in pictures, Travel, University life, Wines on June 20, 2015 by xi'an

[Here is a wine review written by Susie Bayarri in 2013 about a 2008 bottle of Altos de Losada, a wine from León:]

The cork is fantastic. Very good presentation and labelling of the bottle. The wine colour is like dark cherry, I would almost say the colour of blood. Very bright, although unfiltered. The robe is definitely deep. The tears are very nice (at least in my glass), slow, wide, running in parallel streams… but they do not stain my glass at all.

The bouquet is its best feature… it is simply voluptuous… with ripe plums as well as vanilla, some mineral tone plus a smoky hint. I cannot quite detect which wood is used… I have always loved the bouquet of this wine…

In the mouth, it remains a bit closed. Next time, I will make sure I decant it (or I will use that Venturi device), but it is nonetheless excellent… the wine is truly fruity, but complex as well (nothing like grape juice). The tannins are definitely present, but tamed and assimilated (I think they will continue to mellow), and it has just a hint of acidity… Despite its alcohol content, it remains light, neither overly sweet nor heavy. The after-taste offers a pleasant bitterness… It is just delicious, an awesome wine!
