Archive for the Books Category

the Flatland paradox

Posted in Books, Kids, R, Statistics, University life on May 13, 2015 by xi'an

Pierre Druilhet arXived a note a few days ago about the Flatland paradox (due to Stone, 1976) and his arguments against the flat prior. The paradox in this highly artificial setting is as follows: consider a sequence θ of N independent draws from {a,b,1/a,1/b} such that

  1. N and θ are unknown;
  2. a draw immediately followed by its inverse is removed from θ, along with that inverse;
  3. the successor x of θ is observed, meaning an extra draw is made and the above rule applied.

Then the frequentist probability that x is longer than θ given θ is at least 3/4 (at least because θ could be empty, of length zero), while the posterior probability that x is longer than θ given x is 1/4 under the flat prior over θ. The paradox is that 3/4 and 1/4 clash. This is not so much of a paradox, though, because there is no joint probability distribution over (x,θ).
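The frequentist side of the statement is easy to check by simulation. Below is a minimal sketch, assuming a hypothetical encoding of the four symbols as 1, 2, -1, -2 (so that a symbol and its inverse sum to zero) and an arbitrary choice of N=20; the function name reduce_word is mine, not Stone's or Pierre's.

```r
# Hypothetical encoding: a=1, b=2, 1/a=-1, 1/b=-2
reduce_word <- function(draws) {
  word <- numeric(0)
  for (d in draws) {
    n <- length(word)
    # a draw immediately following its inverse annihilates it
    if (n > 0 && word[n] == -d) word <- word[-n] else word <- c(word, d)
  }
  word
}

set.seed(1)
nsim <- 1e4
N <- 20  # arbitrary value, for illustration only
longer <- replicate(nsim, {
  theta <- reduce_word(sample(c(1, 2, -1, -2), N, replace = TRUE))
  # successor: one extra draw, with the same annihilation rule
  x <- reduce_word(c(theta, sample(c(1, 2, -1, -2), 1)))
  length(x) > length(theta)
})
mean(longer)  # frequency of "x longer than theta"
```

For a non-empty θ, exactly one of the four possible extra draws shortens the path, so the frequency should sit close to 3/4, slightly above because of the simulations where θ ends up empty.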

The paradox was actually discussed at length in Larry Wasserman’s now defunct Normal Deviate. From which I borrowed Larry’s graphical representation of the four possible values of θ given the (green) endpoint of x. Larry uses the Flatland paradox hammer to drive another nail into the coffin he contemplates for improper priors. And all things Bayes. Pierre (like others before him) argues against the flat prior on θ and shows that a flat prior on the length of θ leads to recovering 3/4 as the posterior probability that x is longer than θ.

As I was reading the paper in the métro yesterday morning, I became less and less satisfied with the whole analysis of the problem in that I could not perceive θ as a parameter of the model. While this may sound like a pedantic distinction, θ is a latent variable (or a random effect) associated with x in a model where the only unknown parameter is N, the total number of draws used to produce θ and x. The distributions of both θ and x are entirely determined by N. (In that sense, the Flatland paradox can be seen as a marginalisation paradox in that an improper prior on N cannot be interpreted as projecting a prior on θ.) Given N, the distribution of x of length l(x) is then 1/4^N times the number of ways of picking (N-l(x)) annihilation steps among N. Using a prior on N like 1/N, which is improper, then leads to favouring the shortest path as well. (After discussing the issue with Pierre Druilhet, I realised he had a similar perspective on the issue. Except that he puts a flat prior on the length l(x).) Looking a wee bit further for references, I also found that Bruce Hill had adopted the same perspective of a prior on N.

terrible graph of the day

Posted in Books, Kids, R, Statistics on May 12, 2015 by xi'an

A truly terrible graph in Le Monde about overweight and obesity in the EU countries (and Switzerland). The circle presentation makes no logical sense. Countries are ordered by 2030 overweight percentages, which implies the order differs for men and women. (With a neat sexist differentiation between male and female figures.) The allocation of the (2010) grey bar to its country is unclear (left or right?). And there is no uncertainty associated with the 2030 predictions. There is no message coming out of the graph, like the massive explosion in the obesity and overweight percentages in EU countries. Now, given that the data is available for women and men, ‘Og’s readers should feel free to send me alternative representations!

quantile functions: mileage may vary

Posted in Books, R, Statistics on May 12, 2015 by xi'an

When experimenting with various quantile functions in R, I was shocked [ok this is a bit excessive, let us say surprised] by how widely the execution times would vary. To the point of blaming a completely different feature of R. Borrowing from Charlie Geyer’s webpage on the topic of probability distributions in R, here is a table for some standard distributions: I ran

u=runif(1e7)
system.time(x<-qcauchy(u))

choosing an arbitrary parameter whenever needed.

Distribution   Function   Time (s)
Cauchy         qcauchy     2.2
Chi-Square     qchisq     43.8
Exponential    qexp        0.95
F              qf         34.2
Gamma          qgamma     37.2
Logistic       qlogis      1.7
Log Normal     qlnorm      2.2
Normal         qnorm       1.4
Student t      qt         31.7
Uniform        qunif       0.86
Weibull        qweibull    2.9
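For readers who want to rerun the comparison, here is a rough sketch of a loop over the same quantile functions. The parameter values are arbitrary assumptions of mine (not necessarily those behind the table), and the sample is shrunk to 1e5 uniforms to keep the run short, so the absolute times will not match.

```r
# Sketch only: parameter values below are arbitrary assumptions
u <- runif(1e5)  # smaller than the 1e7 used above, to keep the run short
qfuns <- list(
  qcauchy  = function(p) qcauchy(p),
  qchisq   = function(p) qchisq(p, df = 3),
  qexp     = function(p) qexp(p),
  qf       = function(p) qf(p, df1 = 3, df2 = 5),
  qgamma   = function(p) qgamma(p, shape = 1.5),
  qlogis   = function(p) qlogis(p),
  qlnorm   = function(p) qlnorm(p),
  qnorm    = function(p) qnorm(p),
  qt       = function(p) qt(p, df = 3),
  qunif    = function(p) qunif(p),
  qweibull = function(p) qweibull(p, shape = 2)
)
# elapsed seconds for one inversion of the whole vector u
times <- sapply(qfuns, function(f) system.time(f(u))[["elapsed"]])
round(sort(times), 3)
```

A single system.time call is noisy, so repeating each timing a few times and taking the median would be a safer basis for comparison.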

Of course, it does not mean much in that the slow distributions are exactly the parameterised ones (except for Weibull, which is parameterised but fast). Nonetheless, that a chi-square inversion takes 50 times longer than a uniform inversion remains puzzling, as one wonders why it is not coded more efficiently. In particular, I was wondering why the chi-square inversion was slower than the Gamma inversion. Rerunning both inversions showed that they are equivalent:

> u=runif(1e7)
> system.time(x<-qgamma(u,sha=1.5))
   user  system elapsed
 21.534   0.016  21.532
> system.time(x<-qchisq(u,df=3))
   user  system elapsed
 21.372   0.008  21.361

Which also shows how variable system.time can be.
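The equivalence is no surprise since a chi-square distribution with df degrees of freedom is a Gamma distribution with shape df/2 and scale 2. A quick check, as a sketch on a smaller sample than above:

```r
set.seed(101)
u <- runif(1e5)
# chi-square(df=3) versus Gamma(shape=3/2, scale=2)
same <- all.equal(qchisq(u, df = 3), qgamma(u, shape = 1.5, scale = 2))
same
# a few repeated timings give an idea of the variability of system.time
elapsed <- replicate(5, system.time(qchisq(u, df = 3))[["elapsed"]])
range(elapsed)
```

Both calls thus solve the same inversion problem, up to the scale factor 2 when qgamma is run with its default rate of one as in my comparison.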

arbitrary distributions with set correlation

Posted in Books, Kids, pictures, R, Statistics, University life on May 11, 2015 by xi'an

A question recently posted on X Validated by Antoni Parellada: given two arbitrary cdfs F and G, how can we simulate a pair (X,Y) with marginals F and G, and with set correlation ρ? The answer posted by Antoni Parellada was to reproduce the Gaussian copula solution: produce (X’,Y’) as a Gaussian bivariate vector with correlation ρ and then turn it into (X,Y)=(F⁻¹(Φ(X’)),G⁻¹(Φ(Y’))). Unfortunately, this does not work, because the correlation is not preserved under the double transform. The graph above is part of my answer for a χ² and a log-Normal cdf for F and G: while corr(X’,Y’)=ρ, corr(X,Y) drifts quite a lot from the diagonal! The graph was obtained with my function

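tacor=function(rho=0,nsim=1e4,fx=qnorm,fy=qnorm)
{
  # two independent N(0,1) samples, shared across all values of rho
  x1=rnorm(nsim);x2=rnorm(nsim)
  coeur=rho  # output vector, same length as rho
  rho2=sqrt(1-rho^2)
  for (t in 1:length(rho)){
     # Gaussian pair with correlation rho[t], mapped to uniforms by pnorm
     y=pnorm(cbind(x1,rho[t]*x1+rho2[t]*x2))
     # correlation after applying the quantile transforms fx and fy
     coeur[t]=cor(fx(y[,1]),fy(y[,2]))}
  return(coeur)
}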

Playing further, I managed to get an almost flat correlation graph for the admittedly convoluted call

tacor(seq(-1,1,.01),
      fx=function(x) qchisq(x^59,df=.01),
      fy=function(x) qlogis(x^59))

Now, the most interesting question is how to produce correlated simulations. A pedestrian way is to start with a copula, e.g. the above Gaussian copula, and to twist the correlation coefficient ρ of the copula until the desired correlation is attained for the transformed pair. That is, to draw the above curve and invert it. (Note that, as clearly exhibited by the graph just above, not all desired correlations can be achieved for arbitrary cdfs F and G.) This is however very pedestrian and I wonder whether or not there is a generic and somewhat automated solution…
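The pedestrian inversion can be sketched as a root-finding problem on the Monte Carlo correlation curve. The function match_cor below is a hypothetical helper of mine, not part of the X Validated answers; it reuses the same Gaussian draws for every value of ρ so that the curve is smooth enough for uniroot, and it stops when the target is outside the attainable range.

```r
# Hypothetical helper: find the Gaussian copula rho that produces a target
# correlation between fx(Phi(X')) and fy(Phi(Y')), by Monte Carlo
match_cor <- function(target, fx = qnorm, fy = qnorm, nsim = 1e5) {
  x1 <- rnorm(nsim); x2 <- rnorm(nsim)  # common random numbers for all rho
  gap <- function(rho) {
    y <- pnorm(cbind(x1, rho * x1 + sqrt(1 - rho^2) * x2))
    cor(fx(y[, 1]), fy(y[, 2])) - target
  }
  # the target must lie between the correlations reached at rho=-1 and rho=1
  if (gap(-1) > 0 || gap(1) < 0) stop("target correlation not attainable")
  uniroot(gap, interval = c(-1, 1))$root
}

set.seed(42)
rho_norm <- match_cor(0.3)  # Gaussian marginals: should return about 0.3
rho_mix  <- match_cor(0.5, fx = function(u) qchisq(u, df = 3), fy = qlnorm)
c(rho_norm, rho_mix)
```

With Gaussian marginals the copula correlation is returned nearly unchanged, while for the χ² and log-Normal pair a larger ρ is needed to reach the same correlation after transformation. The answer is only as precise as the Monte Carlo approximation of the curve.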

the buried giant [book review]

Posted in Books, Kids, pictures, Travel on May 10, 2015 by xi'an

Last year, I posted a review of Ishiguro’s “When we were orphans”, with the comment that I enjoyed the novel and appreciated its multiple layers, while missing a strong enough grasp on the characters… I brought back from New York Ishiguro’s latest novel, “The Buried Giant”, with high expectations, doubled by the location of the story in an Arthurian setting, at a time when Britons had not yet been subsumed into Anglo-Saxon culture or forced to migrate to little Britain (Brittany). Looking forward to a re-creation of an Arthurian cycle, possibly with a post-modern twist. (Plus, the book as an object is quite nice, with a black slice.)

“I respect what I think he was trying to do, but for me it didn’t work. It couldn’t work. No writer can successfully use the ‘surface elements’ of a literary genre — far less its profound capacities — for a serious purpose, while despising it to the point of fearing identification with it. I found reading the book painful. It was like watching a man falling from a high wire while he shouts to the audience, “Are they going to say I’m a tight-rope walker?”” Ursula Le Guin, March 2, 2015.

Alas, thrice alas, after reading it within a fortnight, I am quite disappointed by the book. Which, like the giant, would have better remained buried. Ishiguro pursues his delving into the notion of memories and remembrances, with the twisted reality they convey. After the detective cum historical novel of “When we were orphans”, he moves to the allegory of the early medieval tale, where characters have to embark upon a quest and face supernatural dangers like pixies and ogres. But mostly suffer from a collective amnesia they cannot shake. The idea is quite clever and once again attractive, but the resulting story sounds too artificial and contrived to involve me in the fate of its characters. As an aside, the two central characters, Beatrice and Axl, hardly have Briton names. Beatrice derives from the Latin Beatrix and means traveller, while Axl is of Scandinavian origin and means father of peace. Appropriate symbols for their roles in the allegory, obviously. But this also makes me wonder how deep the allegory is, that is, how many levels of references and stories are hidden behind the bland trek of A & B through a fantasy Britain.

A book review in The Guardian links this book with Tolkien’s Lord of the Rings. I fail to see the connection: Tolkien was immersed for his whole life in Norse sagas and Saxon tales, creating his own myth out of his studies without a thought for parody or allegory. Here, the whole universe is misty and vague, and characters act with no reason or rationale. The whole episode in the monastery and the subsequent tunnel exploration do not make sense in terms of the story, while I cannot fathom what they are supposed to stand for. The theme of the ferryman carrying couples to an island where they may rest, together or not, sounds too obvious to just mean this. What else does it stand for?! The encounters with the rag woman, first in the Roman ruins where she threatens to cut a rabbit’s neck, then in a boat where she acts as a decoy, are completely obscure as to what they are supposed to mean. Maybe this accumulation of senseless events is the whole point of the book, but such a degree of deconstruction does not make for a pleasant read. Eventually, I came to hope that the mists rise again and carry away all past memories of “The Buried Giant”!

headlines

Posted in Books, pictures on May 8, 2015 by xi'an


Le Monde puzzle [#910]

Posted in Books, Kids, Statistics, University life on May 8, 2015 by xi'an

A game-theoretic Le Monde mathematical puzzle:

A two-person game consists in choosing an integer N and for each player to successively pick a number in {1,…,N} under the constraint that a player cannot pick a number next to a number this player has already picked. Is there a winning strategy for either player and for all values of N?

for which I simply coded a recursive optimal strategy function:

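gain=function(mine,yours,none){
  # mine, yours: numbers already picked by the player to move and by the opponent
  # none: numbers not yet picked
  # returns 1 if the player to move can force a win, 0 otherwise
  fine=none
  # legal picks are the free numbers at distance >1 from all of mine
  if (length(mine)>0)
    fine=none[apply(abs(outer(mine,none,"-")),
              2,min)>1]
  if (length(fine)>0){
   rwrd=0
   # roles swap after each pick, so my gain is 1 minus the opponent's gain
   for (i in 1:length(fine))
    rwrd=max(rwrd,1-gain(yours,c(mine,fine[i]),
         none[none!=fine[i]]))
   return(rwrd)}
  return(0)}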

which returned a zero gain, hence no winning strategy for all values of N except 1.

> gain(NULL,NULL,1)
[1] 1
> gain(NULL,NULL,1:2)
[1] 0
> gain(NULL,NULL,1:3)
[1] 0
> gain(NULL,NULL,1:4)
[1] 0

Meaning that the starting player is always the loser!
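As a cross-check, here is a minimal sketch of the same recursion written with logical values. It assumes, as the gain function does when returning zero, that a player with no legal pick loses; the name wins is mine.

```r
# TRUE if the player about to move can force a win
# (assumption: a player with no legal pick loses)
wins <- function(mine, yours, free) {
  # legal picks: free numbers not adjacent to any number already in mine
  legal <- free[vapply(free, function(k) all(abs(k - mine) > 1), logical(1))]
  for (k in legal)
    # a pick is winning if it leaves the opponent in a losing position
    if (!wins(yours, c(mine, k), setdiff(free, k))) return(TRUE)
  FALSE
}

first <- sapply(1:6, function(N) wins(integer(0), integer(0), seq_len(N)))
first
```

The first four entries should agree with the gains returned above for N=1,…,4, and the loop can be pushed to larger N at a quickly growing computing cost, since the recursion visits every possible sequence of picks.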
