Archive for trains

Robert’s paradox [reading in Reading]

Posted in Statistics, Travel, University life on January 28, 2015 by xi'an

On Wednesday afternoon, Richard Everitt and Dennis Prangle organised an RSS workshop in Reading on Bayesian Computation, and invited me to give a talk there, along with John Hemmings, Christophe Andrieu, Marcelo Pereyra, and themselves. Given the proximity between Oxford and Reading, this felt like a neighbourly visit, especially when I realised I could take my bike on the train! John Hemmings gave a presentation on synthetic models for climate change and their evaluation, which could have some connection with Tony O’Hagan’s recent talk in Warwick. Dennis told us about “the lazier ABC” version in connection with his “lazy ABC” paper [from my very personal view]. Marcelo expanded on the Moreau-Yosida expansion he had presented in Bristol about six months ago, with the notion that using a Gaussian tail regularisation of a super-Gaussian target in a Langevin algorithm could produce better convergence guarantees than the competition, including Hamiltonian Monte Carlo. Luke Kelly spoke about an extension of phylogenetic trees using a notion of lateral transfer. And Richard introduced a notion of biased approximation to Metropolis-Hastings acceptance ratios, a notion that I found quite attractive if not completely formalised, as there should be a Monte Carlo equivalent to the improvement brought by biased Bayes estimators over unbiased classical counterparts. (Repeating a remark by Persi Diaconis made more than 20 years ago.) Christophe Andrieu also presented some recent developments of his on exact approximations à la Andrieu and Roberts (2009).

Since those developments are not yet finalised into an archived document, I will not delve into the details, but I found the results quite impressive and worth exploring, so I am looking forward to the forthcoming publication. One aspect of the talk which I can comment on is related to the exchange algorithm of Murray et al. (2006). Let me recall that this algorithm handles doubly intractable problems (i.e., likelihoods with intractable normalising constants like the Ising model), by introducing auxiliary variables with the same distribution as the data given the new value of the parameter and computing an augmented acceptance ratio whose expectation is the targeted acceptance ratio and which conveniently removes the unknown normalising constants. This auxiliary scheme produces a random acceptance ratio and hence differs from the exact-approximation MCMC approach, which targets the intractable likelihood directly. It somewhat replaces the unknown constant with the density taken at a plausible realisation, hence providing a proper scale. At least for the new value. I wonder if a comparison has been conducted between both versions, the naïve intuition being that the ratio of estimates should be more variable than the estimate of the ratio. More generally, it seemed to me [during the introductory part of Christophe’s talk] that those different methods always faced a harmonic mean danger when being phrased as expectations of ratios, since those ratios were not necessarily square integrable. And not necessarily bounded. Hence my rather gratuitous suggestion of using other tools than the expectation, like maybe a median, thus circling back to the biased estimators of Richard. (And later cycling back, unscathed, to Reading station!)
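To make the mechanics of the exchange step concrete, here is a minimal R sketch on a toy model chosen purely for illustration, not anything shown at the workshop: exponential data with rate theta, where the normalising constant of exp(-theta*y) is pretended to be unknown. The simulated data, the Exp(1) prior, the random-walk scale and the names logf, logprior and exchange are all assumptions.

```r
# Toy exchange algorithm (sketch): y_i ~ Exp(theta), pretending that the
# normalising constant Z(theta) = 1/theta of exp(-theta*y) is unavailable.
set.seed(1)
y <- rexp(20, rate = 2)                        # simulated data (assumed true rate 2)
logf <- function(x, theta) -theta * sum(x)     # unnormalised log-likelihood
logprior <- function(theta) dexp(theta, 1, log = TRUE)  # Exp(1) prior (assumption)

exchange <- function(niter, y, theta0 = 1, scale = 0.5) {
  theta <- numeric(niter)
  cur <- theta0
  for (t in 1:niter) {
    prop <- cur + rnorm(1, 0, scale)           # symmetric random-walk proposal
    if (prop > 0) {                            # non-positive proposals are rejected
      # auxiliary data simulated exactly at the proposed parameter value
      w <- rexp(length(y), rate = prop)
      # augmented log-ratio: the constants Z(cur) and Z(prop) cancel out
      logr <- logprior(prop) - logprior(cur) +
        logf(y, prop) - logf(y, cur) +
        logf(w, cur) - logf(w, prop)
      if (log(runif(1)) < logr) cur <- prop
    }
    theta[t] <- cur
  }
  theta
}

out <- exchange(20000, y)
post_mean <- mean(out[-(1:2000)])              # drop a burn-in of 2000 iterations
exact_mean <- (length(y) + 1) / (1 + sum(y))   # Gamma(n+1, 1+sum(y)) posterior mean
```

Because the toy posterior is available in closed form, post_mean can be compared with exact_mean as a sanity check on the sampler.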

On top of the six talks in the afternoon, there was a small poster session during the tea break, where I met Garth Holloway, working in agricultural economics, who happened to be an (unsuspected) fan of mine, to the point of entitling his poster “Robert’s paradox”!!! The problem covered by this undeserved denomination is connected to the bias in Chib’s approximation of the evidence in mixture estimation, a phenomenon that I related to the exchangeability of the component parameters in an earlier paper or set of slides. So “my” paradox is essentially label (un)switching and its consequences. For which I cannot claim any fame! Still, I am looking forward to the completed version of this poster to discuss Garth’s solution, and we had a beer together after the talks, drinking to the health of our mutual friend John Deely.

Au Luxembourg

Posted in pictures, Statistics, Travel, University life on December 3, 2013 by xi'an

In a “crazy travelling week” (dixit my daughter), I gave a talk at an IYS 2013 conference organised by Stephen Senn (formerly at Glasgow) and colleagues in the city of Luxembourg, Grand Duché du Luxembourg. I very much enjoyed the morning train trip there as it was a misty morning, with the sun rising over the frosted-white countryside. (I cannot say much about the city of Luxembourg itself though, as I only walked the kilometre from the station to the conference hotel and the same way back. There was a huge gap in the plateau due to a river in the middle, which would have been a nice place to run, I presume…)

One of the few talks I attended there was about an econometric model with instrumental variables. In general, and this dates back to my student years at ENSAE, I do not get the motivation for the distinction between endogenous and exogenous variables in econometric models. Especially in non-parametric models since, if we do not want to make parametric assumptions, we have difficulties in making correlation hypotheses instead… My bent would be to parametrise everything under the suspicion of this everything being correlated with everything. The instrumental variables econometricians seem so fond of appear to me like magical beings, since we have to know they are instrumental. And because they seem to always allow for a return to a linear setting, by eliminating the non-linear parts. Sounds like a “more for less” free-lunch deal. (Any pointer would be appreciated.) The speaker there actually acknowledged (verbatim) that they are indeed magical and that they cannot be justified by mathematics or statistics. A voodoo part of econometrics then?!
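For what it is worth, the textbook argument can be reproduced in a few lines of R. The sketch below is a simulation under assumed values (a true slope of 2, a single instrument z, Gaussian noise), not the model from the talk. The regressor x is correlated with the error u, so least squares is biased, while the ratio cov(z,y)/cov(z,x) recovers the slope only because z is built to be uncorrelated with u, which is precisely the property one has to know in advance.

```r
# Simulated instrumental-variable example (all values are assumptions)
set.seed(42)
n <- 5000
z <- rnorm(n)              # instrument: drives x, independent of the error u
u <- rnorm(n)              # unobserved error
x <- z + u + rnorm(n)      # endogenous regressor, correlated with u
y <- 2 * x + u             # outcome, with true slope 2

ols <- coef(lm(y ~ x))[["x"]]    # least squares slope, biased upwards here
iv  <- cov(z, y) / cov(z, x)     # simple IV (Wald) estimator of the slope
```

In this design the least squares slope converges to 7/3 rather than 2, since cov(x,y)=7 and var(x)=3, while the IV ratio converges to 2.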

A second talk that left me perplexed was about a generalised finite mixture model. The model sounded like a mixture along time of individuals, i.e., a sort of clustering of longitudinal data. It looked like it should be easier to estimate than usual mixtures of regressions because an individual contributed to the same regression line for all the times when it was observed. The talk was uninspiring as it missed connections to EM and to Bayesian solutions, focussing instead on a gradient method that sounded inappropriate for a multimodal likelihood. (Funnily enough, the choice of the number of regressions was made by BIC.)

snowed-in Paris

Posted in pictures on March 17, 2013 by xi'an

[Picture: Fontaine des Innocents, Paris, March 12, 2013] Just like my astronomy colleague from Brighton, I do wonder why so little snow on the ground creates so much havoc on the transportation network. RER trains have been crawling at a snail-like pace, buses cancelled and trams inoperative for the past few days. I presume this is because there is no degree of freedom left in the way those public transportation systems are operated. They are bursting at every possible seam for lack of sufficient investment in equipment and people and, as a result, cannot handle any event that would require any extra manpower (like running the trains all night long to keep the rails free from snow and ice…) The electricity company EDF has started relying on its (young) retirees in cases of major crises and I think the rail companies should do the same.

train chaos

Posted in Kids, Running, Travel on October 13, 2012 by xi'an

Saturday morning on an omnibus train bound for Granville is not a relaxing experience without one’s noise-cancelling headphones! It starts with passengers unable to work out their carriage and seat numbers from their train tickets (“Sorry, this is my seat.”, “Are you sure? I thought it was here…”, “I could not remember my seat number.”, “Isn’t this carriage 3?!”, etc.) Then comes the piling of an unbelievable number of rolling trolleys in the aisle, efficiently blocking the corridor. Once everyone is apparently settled, phones start to inanely transfer the information to innumerable correspondents (“Hello, uncle? I am on the train!”, “Yes, I am on my way. Around a quarter to twelve, I think…”, “I took my grey bag, you know, the grey bag.”, “I have the big suitcase, don’t worry!”, “Do you know what time I got up this morning?”, etc.) And, thanks to poor quality headphones (for which the retailers should be fined!), the whole carriage is provided with syrupy French variety music. Not to mention the woman behind me travelling with a small girl, a dog, and a wandering cat, and insanely spending her time discussing dogs with another traveller across the aisle. Unsurprisingly leading to her daughter complaining and then crying after a while. Even after ten attempts at reading “Petit ours brun”…

What are the distributions on the positive k-dimensional quadrant with parametrizable covariance matrix? (solved)

Posted in R, Statistics, University life on April 8, 2012 by xi'an

Paulo (from the Instituto de Matemática e Estatística, Universidade de São Paulo, Brazil) has posted an answer to my earlier question both as a comment on the ‘Og and as a solution on StackOverflow (with a much more readable LaTeX output). His solution is based on the observation that the multidimensional log-normal distribution still allows for closed form expressions of both the mean and the variance and that those expressions can further be inverted to impose the pair (μ,Σ) on the log-normal vector. In addition, he shows that the only constraint on the covariance matrix is that the covariance σij is larger than −μiμj. Very neat!
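A minimal R sketch of this inversion, using the standard log-normal moment formulas; the function name lognormal_params and the numerical values are illustrative assumptions, not Paulo's own code. If X=exp(Y) with Y a normal vector with mean m and covariance S, then E[X_i]=exp(m_i+S_ii/2) and cov(X_i,X_j)=E[X_i]E[X_j](exp(S_ij)-1), which inverts as follows. As an extra precaution beyond the constraint quoted above, the sketch also reports whether the resulting S is positive definite.

```r
# Map a target mean vector mu and covariance matrix Sigma of the log-normal
# vector to the parameters (m, S) of the underlying normal vector (sketch).
lognormal_params <- function(mu, Sigma) {
  ratio <- Sigma / outer(mu, mu)
  # constraint from the text: Sigma[i,j] > -mu[i]*mu[j]
  if (any(ratio <= -1)) stop("need Sigma[i,j] > -mu[i]*mu[j]")
  S <- log(1 + ratio)
  m <- log(mu) - diag(S) / 2
  # additional check (assumption of this sketch): S must be positive definite
  valid <- min(eigen(S, symmetric = TRUE)$values) > 0
  list(m = m, S = S, valid = valid)
}

mu <- c(1, 2)                                  # illustrative target mean
Sigma <- matrix(c(1, 0.5, 0.5, 2), 2, 2)       # illustrative target covariance
ln <- lognormal_params(mu, Sigma)

# Plug (m, S) back into the closed-form moments to recover (mu, Sigma)
mean_back <- exp(ln$m + diag(ln$S) / 2)
cov_back <- outer(mean_back, mean_back) * (exp(ln$S) - 1)
```

Here mean_back and cov_back should reproduce mu and Sigma up to rounding error.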

In the meantime, I corrected my earlier R code on the gamma model, thanks to David Epstein pointing out a mistake in the resolution of the moment equation, and I added the constraint on the covariance, already noticed by David in his question. Here is the full code:

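# mu: the two target means; sigma: c(variance 1, variance 2, covariance)
sol=function(mu,sigma){
  solub=TRUE   # set to FALSE when the moment equations have no valid solution
  alpha=rep(0,3)
  beta=rep(0,2)
  # gamma moment matching: alpha[1]/beta[1]=mu[1], alpha[1]/beta[1]^2=sigma[1]
  beta[1]=mu[1]/sigma[1]
  alpha[1]=mu[1]*beta[1]
  coef=mu[2]*sigma[1]-mu[1]*sigma[3]
  if (coef<0){
    solub=FALSE
  }else{
    beta[2]=coef/(sigma[1]*sigma[2]-sigma[3]^2)
    alpha[2]=sigma[3]*beta[2]/sigma[1]^2
    alpha[3]=mu[2]*beta[2]-mu[1]*alpha[2]
    # assumed check, mirroring the coef<0 test: a negative alpha[3] is rejected
    if (alpha[3]<0) solub=FALSE
  }
  list(solub=solub,alpha=alpha,beta=beta)
}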

mu=runif(2,0,10)
sig=c(mu[1]^2/runif(1),mu[2]^2/runif(1))
sol(mu,c(sig,runif(1,max(-sqrt(prod(sig)),-mu[1]*mu[2]),sqrt(prod(sig)))))

and I did not get any FALSE outcome when running this code several times.

Statistics on train delays

Posted in pictures, Statistics, Travel on February 22, 2011 by xi'an

An interesting item of news on the French public radio: instead of measuring rail delays per train, unions created an alternative indicator measured per passenger. This sounds fairer, as the impact of delays is felt by every passenger rather than by the train itself! Since most delays occur at rush hour, when trains are fullest, the consequences are obviously negative for the train companies: only 36% of SNCF passengers arrive on time, while I presume it is much worse for RER passengers.
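The gap between the two indicators is simple arithmetic, as in this small R sketch. The figures are entirely hypothetical (five trains, with the two crowded rush-hour ones running late) and are not SNCF data.

```r
# Hypothetical figures, for illustration only
trains <- data.frame(
  passengers = c(900, 850, 200, 150, 100),
  on_time    = c(FALSE, FALSE, TRUE, TRUE, TRUE)
)

# Indicator per train: share of trains arriving on time
per_train <- mean(trains$on_time)

# Indicator per passenger: share of passengers arriving on time
per_passenger <- sum(trains$passengers[trains$on_time]) / sum(trains$passengers)
```

With these made-up numbers, 60% of trains are on time but only 450 of 2200 passengers, about 20%, arrive on time.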

Of hunter-gatherers and R packagers

Posted in Books, Kids, Statistics on April 17, 2009 by xi'an

After an exhausting day spent in the train to escort my daughter back from Petite Bretagne, I came home to read about the on-going action of the North Sea fishermen, who blockaded the North Sea ports protesting against the EU fishing quotas.

I usually find the train a great environment to work in, and this was true on the morning trip, where I spent three hours building the R package for our new MCMC book with George Casella. [But on the way back, there were noisy people all over the place and concentrating was a problem…] It took me two days to understand the structure of R packages, first by mimicking the LearnBayes package of Jim Albert, then by reading the documentation available on-line. Once I got over the error messages that seemed to imply I did not have the right version of R, and once I had installed the additional codetools package, due to Luke Tierney, I managed to run

R CMD check mcsm

R CMD build mcsm

R CMD INSTALL mcsm

satisfactorily, including the documentation (the worst part!)… I have done the first four chapters so far and the remaining chapters should follow rather quickly. This is quite comforting because this is the very last step of writing the draft of Enter Monte Carlo Statistical Methods (this is the current title, by the way).

PS: Getting back to those fishermen, I quite understand their plight, i.e. that the current quotas are pushing them out of business, but the answer from the French government, namely to sponsor them for not fishing rather than for changing jobs, is absurd. There is enough evidence to support the thesis of a depletion of the fish population in the North Sea and the Atlantic to understand that the culture of hunting-gathering that still underlies commercial fishing is not sustainable. Some species, like tuna, are already close to extinction, and nothing short of a ban will prevent it. This is obviously tough on tuna hunter-gatherers, but they must be stopped…