importance sampling schemes for evidence approximation [revised]

Posted in Statistics, University life on November 18, 2014 by xi'an

After a rather intense period of new simulations and versions, Jeong Eun (Kate) Lee and I have now resubmitted our paper on (some) importance sampling schemes for evidence approximation in mixture models to Bayesian Analysis. There is no fundamental change in the new version, rather a more detailed description of what those importance schemes mean in practice. The original idea in the paper is to improve upon the Rao-Blackwellisation solution proposed by Berkhof et al. (2002) and later by Marin et al. (2005) to avoid the impact of label switching on Chib’s formula. The Rao-Blackwellisation consists in averaging over all permutations of the labels, while the improvement relies on the elimination of useless permutations, namely those that produce a negligible conditional density in Chib’s (candidate’s) formula. While the improvement implies truncating the overall sum and hence induces a potential bias (which was the concern of one referee), the determination of the irrelevant permutations after relabelling towards a single mode does not appear to cause any bias, while reducing the computational overload. Referees also made us aware of many recent proposals that lead to different evidence approximations, albeit not directly related to our purpose. (One was Rodrigues and Walker, 2014, discussed and commented in a recent post.)
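
In schematic notation (mine, not necessarily the paper's), Chib's (candidate's) formula estimates the evidence of a k-component mixture as

m(x) = \dfrac{f(x\mid\theta^*)\,\pi(\theta^*)}{\hat\pi(\theta^*\mid x)}

at a fixed parameter value θ*, where the Rao-Blackwellised denominator averages the conditional posterior density over the M simulated allocation vectors z^{(m)} and over all k! permutations σ of the component labels,

\hat\pi(\theta^*\mid x) = \dfrac{1}{M\,k!}\sum_{m=1}^{M}\sum_{\sigma}\pi\left(\sigma(\theta^*)\mid x, z^{(m)}\right)\,,

the truncated version simply restricting the inner sum to those permutations with non-negligible terms.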

a probabilistic proof to a quasi-Monte Carlo lemma

Posted in Books, Statistics, Travel, University life on November 17, 2014 by xi'an

As I was reading in the Paris métro a new textbook on quasi-Monte Carlo methods, Introduction to Quasi-Monte Carlo Integration and Applications, written by Gunther Leobacher and Friedrich Pillichshammer, I came upon the lemma that, given two sequences (u_i) and (v_i) in (0,1) such that, for all i,

|u_i-v_i|\le\delta\quad\text{then}\quad\left|\prod_{i=1}^s u_i-\prod_{i=1}^s v_i\right|\le 1-(1-\delta)^s

and the geometric bound made me wonder whether there was an easy probabilistic proof of this inequality, rather than the algebraic proof contained in the book. Unsurprisingly, there is one: assuming, as one may since replacing each pair by its minimum and maximum only increases the difference of the products, that u_i≤v_i for every i, it relies on associating with each pair (u_i,v_i) a pair of nested events (A_i,B_i), the pairs being independent across i, such that, for all i,

A_i\subset B_i\,,\ u_i=\mathbb{P}(A_i)\,,\ v_i=\mathbb{P}(B_i)

and representing

\left|\prod_{i=1}^s u_i-\prod_{i=1}^s v_i\right| = \mathbb{P}\left(\cap_{i=1}^s B_i\right) - \mathbb{P}\left(\cap_{i=1}^s A_i\right)\,.
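
Spelling out the remaining step, the intersection of the B_i's deprived of the intersection of the A_i's is contained in the union of the differences B_i∖A_i, hence, by independence across i,

\mathbb{P}\left(\cap_{i=1}^s B_i\right)-\mathbb{P}\left(\cap_{i=1}^s A_i\right)\le\mathbb{P}\left(\cup_{i=1}^s B_i\setminus A_i\right)=1-\prod_{i=1}^s\left\{1-(v_i-u_i)\right\}\le 1-(1-\delta)^s\,.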

Obviously, there is no visible consequence to this remark, but it was a good way to switch off the métro hassle for a while! (The book is under review and the review will hopefully be posted on the ‘Og as soon as it is completed.)

snapshot from UF campus (#2)

Posted in pictures, Running, Travel, University life on November 16, 2014 by xi'an

[photo: tree with moss]

Le Monde puzzle [#887bis]

Posted in Kids, R, Statistics, University life on November 16, 2014 by xi'an

As mentioned in the previous post, an alternative consists in building the permutation of {1,…,N} by “adding” compatible integers left and right, so that each new pair sums to a perfect square, until the permutation is complete or no completion is available. While this sounds like the dual of the initial solution, it brings a considerable improvement in computing time, as shown below. I thus redefined the construction of the solution by initialising the permutation at a random integer (it could just as well start at 1)

perfect=(1:trunc(sqrt(2*N)))^2          # perfect squares up to 2N, covering every possible pair sum
perm=friends=(1:N)                      # perm holds the sequence, friends the unused integers
t=1
perm[t]=sample(friends,1)               # random starting integer
friends=friends[friends!=perm[t]]

and then completing only with possible neighbours, left

out=outer(perfect-perm[t],friends,"==")   # which unused integers turn perm[t] into a perfect square?
if (max(out)==1){
  t=t+1
  perm[t]=sample(rep(perfect[apply(out,1,max)==1],2),1)-perm[t-1]  # pick a compatible square, deduce the neighbour
  friends=friends[friends!=perm[t]]}

or right

out=outer(perfect-perm[1],friends,"==")   # same test at the other end of the sequence
if (max(out)==1){
  t=t+1
  perf=sample(rep(perfect[apply(out,1,max)==1],2),1)-perm[1]       # new neighbour of perm[1]
  perm[1:t]=c(perf,perm[1:(t-1)])         # prepend it to the current sequence
  friends=friends[friends!=perf]}
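
For the record, here is one way the above pieces could be assembled into a single function (a sketch of my own, since the post does not reproduce the actual puck() behind the timings below; the name, the restart strategy, and the cap on the number of attempts are assumptions):

puck=function(N,maxtry=10^3){
  perfect=(1:trunc(sqrt(2*N)))^2
  for (attempt in 1:maxtry){
    perm=friends=(1:N)                # restart from a random initial integer
    t=1
    perm[t]=sample(friends,1)
    friends=friends[friends!=perm[t]]
    repeat{
      if (t==N) return(perm)          # complete permutation found
      grown=FALSE
      out=outer(perfect-perm[t],friends,"==")       # try to extend at one end
      if (max(out)==1){
        t=t+1
        perm[t]=sample(rep(perfect[apply(out,1,max)==1],2),1)-perm[t-1]
        friends=friends[friends!=perm[t]]
        grown=TRUE}
      if (t<N){
        out=outer(perfect-perm[1],friends,"==")     # try to extend at the other end
        if (max(out)==1){
          t=t+1
          perf=sample(rep(perfect[apply(out,1,max)==1],2),1)-perm[1]
          perm[1:t]=c(perf,perm[1:(t-1)])
          friends=friends[friends!=perf]
          grown=TRUE}}
      if (!grown) break               # stuck at both ends, start afresh
      }
    }
  return(NULL)                        # no sequence found within maxtry attempts
  }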

(If you wonder about the rep in the sample step, it is a trick I just found to bypass the insufferable feature that sample(n,1) is equivalent to sample(1:n,1) when n is a single positive integer! It costs basically nothing and avoids reprogramming sample() each time I use it… I am very glad I found it!) The gain in computing time is amazing:

> system.time(for (i in 1:50) pick(15))
   user  system elapsed
  5.397   0.000   5.395
> system.time(for (i in 1:50) puck(15))
   user  system elapsed
  0.285   0.000   0.287
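
To make the sample() quirk behind the rep trick explicit (a standalone illustration, not part of the puzzle code):

x=7                   # a single remaining candidate
sample(x,1)           # may return any integer between 1 and 7, since sample(7,1) means sample(1:7,1)
sample(rep(x,2),1)    # always returns 7: the duplicated vector has length two, so no 1:n expansion occurs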

An unrelated point is that a more interesting (?) alternative problem consists in adding a toroidal constraint, namely requiring the first and last entries of the permutation to also sum to a perfect square. Is it at all possible?

snapshot from UF campus

Posted in pictures, Running, Travel, University life on November 15, 2014 by xi'an

[photo: campus]

Le Monde puzzle [#887]

Posted in Books, Kids, R, Statistics on November 15, 2014 by xi'an

A simple combinatorics Le Monde mathematical puzzle:

N is a golden number if the sequence {1,2,…,N} can be reordered so that the sum of any consecutive pair is a perfect square. What are the golden numbers between 1 and 25?

Indeed, from an R programming point of view, all I have to do is go over all possible permutations of {1,2,…,N} until one works or until all permutations for a given N have been exhausted. However, 25! is of order 10²⁵, which is a wee bit too large… Instead, I resorted once again to brute-force simulation, by first introducing the possible neighbours of each integer

  perfect=(1:trunc(sqrt(2*N)))^2        # perfect squares up to 2N
  friends=NULL                          # friends[[i]] will list the admissible neighbours of i
  le=1:N                                # number of admissible neighbours of each integer
  for (perm in 1:N){
    amis=perfect[perfect>perm]-perm     # candidates m such that perm+m is a perfect square
    amis=amis[amis<=N&amis!=perm]       # keep those within {1,...,N}, excluding perm itself
    le[perm]=length(amis)
    friends=c(friends,list(amis))
    }

and then proceeding to construct the permutation one integer at a time, by picking among the remaining potential neighbours of the current endpoint until none is left or the sequence is complete

orderin=0*(1:N)                 # the sequence under construction
t=1
orderin[1]=sample((1:N),1)      # random starting integer
for (perm in 1:N){              # remove it from every neighbour list
  friends[[perm]]=friends[[perm]][friends[[perm]]!=orderin[1]]
  le[perm]=length(friends[[perm]])
  }
while (t<N){
  if (length(friends[[orderin[t]]])==0)   # dead end: the current endpoint has no neighbour left
        break
  if (length(friends[[orderin[t]]])>1){
    orderin[t+1]=sample(friends[[orderin[t]]],1)}else{
    orderin[t+1]=friends[[orderin[t]]]    # a single neighbour left, no sampling needed
    }
  for (perm in 1:N){                      # remove the new entry from every neighbour list
    friends[[perm]]=friends[[perm]][friends[[perm]]!=orderin[t+1]]
    le[perm]=length(friends[[perm]])
    }
  t=t+1}

and then repeating this attempt until a full sequence is produced or a certain number of failed attempts has been reached. I gained in efficiency by proposing a second completion on the left of the first integer once a break occurs:

while (t<N){
  if (length(friends[[orderin[1]]])==0)   # the first integer has no neighbour left either
        break
  orderin[2:(t+1)]=orderin[1:t]           # shift the current sequence to the right
  if (length(friends[[orderin[2]]])>1){
    orderin[1]=sample(friends[[orderin[2]]],1)}else{
    orderin[1]=friends[[orderin[2]]]
    }
  for (perm in 1:N){                      # remove the new first entry from every neighbour list
    friends[[perm]]=friends[[perm]][friends[[perm]]!=orderin[1]]
    le[perm]=length(friends[[perm]])
    }
  t=t+1}
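
As an aside on the neighbour structure built in the first snippet, for N=15 the integers 8 and 9 are each compatible with a single other integer, which forces them to the two ends of any valid sequence (a quick check, assuming the snippet above has been run with N=15):

friends[[8]]   # 8 can only sit next to 1, as 8+1=9 is the only reachable square
friends[[9]]   # 9 can only sit next to 7, as 9+7=16 is the only reachable square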

(An alternative would have been to complete left and right by squared numbers taken at random…) The result of running this program showed that permutations with the above property exist for N=15,16,17,23,25,26,…,77. Here is a solution for N=49:

25 39 10 26 38 43 21 4 32 49 15 34 30 6 3 22 42 7 9 27 37 12 13 23 41 40 24 1 8 28 36 45 19 17 47 2 14 11 5 44 20 29 35 46 18 31 33 16 48
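
Such a sequence is straightforward to check in R (a small verification sketch, not part of the original code):

s=c(25,39,10,26,38,43,21,4,32,49,15,34,30,6,3,22,42,7,9,27,37,12,13,23,41,40,24,1,8,28,
    36,45,19,17,47,2,14,11,5,44,20,29,35,46,18,31,33,16,48)
all(sort(s)==1:49)             # s is a permutation of {1,...,49}
all(sqrt(s[-1]+s[-49])%%1==0)  # every consecutive pair sums to a perfect square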

As an aside, the authors of the Le Monde puzzle claimed (in the Tuesday, Nov. 12, edition) that there was no solution for N=23, while this sequence

22 3 1 8 17 19 6 10 15 21 4 12 13 23 2 14 11 5 20 16 9 7 18

looks fine enough to me… I wonder more generally about the principle behind the existence of such sequences. It sounds quite probable that they exist for N>24. (The published solution does not shed any light on this issue, so I assume the authors have no mathematical analysis to offer.)

that the median cannot be a sufficient statistic

Posted in Kids, Statistics, University life on November 14, 2014 by xi'an

When reading an entry on The Chemical Statistician suggesting that a sample median could often be a choice of sufficient statistic, it attracted my attention as I had never thought a median could be sufficient. After thinking a wee bit more about it, and even posting a question on Cross Validated without getting an immediate answer, I came to the conclusion that medians (and other quantiles) cannot be sufficient statistics for arbitrary (large enough) sample sizes (a condition that excludes the obvious cases of one or two observations, where the sample median equals the sample mean).

In the case when the support of the distribution does not depend on the unknown parameter θ, sufficiency of the median (a statistic of fixed dimension whatever the sample size) would allow us to invoke the Darmois-Pitman-Koopman theorem, namely that the density of the observations is necessarily of the exponential family form,

\exp\{ \theta T(x) - \psi(\theta) \}h(x)

to conclude that, if the natural sufficient statistic

S=\sum_{i=1}^n T(x_i)

is minimal sufficient, then S must be a function of the median (as of any sufficient statistic), which is impossible since modifying an extreme value among the n>2 observations modifies S but not the median.

In the other case when the support does depend on the unknown parameter θ, we can consider the case when

f(x|\theta) = h(x) \mathbb{I}_{A_\theta}(x) \tau(\theta)

where the set A_θ indexed by θ is the support of f. In that case, sufficiency of the median and the factorisation theorem would imply that

\prod_{i=1}^n \mathbb{I}_{A_\theta}(x_i)

is a 0-1 function of the sample median. Adding a further observation y⁰ that does not modify the median then leads to a contradiction, since y⁰ may fall inside or outside the support set while leaving the median unchanged.

Incidentally, when looking for examples, I played with the distribution

\dfrac{1}{2}\mathfrak{U}(0,\theta)+\dfrac{1}{2}\mathfrak{U}(\theta,1)

which has θ as its theoretical median, if not its mean. In this example, not only is the sample median not sufficient (the only sufficient statistic is the order statistic, and rightly so since the support is fixed and the distribution is not from an exponential family), but the MLE also differs from the sample median. Here is an example with n=30 observations, the sienna bar being the sample median:
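
Here is a minimal R sketch of that experiment (my own reconstruction of the figure, with an arbitrary value of θ and a crude grid search for the MLE):

theta=.4; n=30
x=ifelse(runif(n)<.5,runif(n,0,theta),runif(n,theta,1))   # sample from .5 U(0,theta) + .5 U(theta,1)
loglik=function(th) sum(log(.5*dunif(x,0,th)+.5*dunif(x,th,1)))
grid=seq(.01,.99,by=.001)
ll=sapply(grid,loglik)
plot(grid,ll,type="l",xlab=expression(theta),ylab="log-likelihood")
abline(v=median(x),col="sienna")      # sample median, the sienna bar of the figure
abline(v=grid[which.max(ll)],lty=2)   # grid MLE, generally away from the median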

[figure: log-likelihood for the n=30 sample, with the sienna bar at the sample median]
