Archive for partition

coupling for the Gibbs sampler

Posted in Books, Mountains, pictures, Running, Statistics, Travel, University life with tags , , , , , , , , , , on November 27, 2022 by xi'an

At BNP13, Brian Trippe presented the AISTAT 2022 paper he recently wrote with Tin D. Nguyen and Tamara Broderick. Which made me read their 2021 paper on the topic. There, they note that coupling may prove challenging, which they blame on label switching. Considering a naïve Gibbs sampler on the space of partitions, meaning allocating each data-point to one of the existing partitions or to a singleton, they construct an optimal transport coupling under Hamming distance. Which appears to be achievable in O(NK³log{K}), if K is the maximal number of partitions among both chains. The paper does not goes deeply into the implementation, which involves [to quote] (a) computing the distances between each pair of partitions in the Cartesian product of supports of the Gibbs conditionals and (b) solving the optimal transport problem. Except in the appendix where the book-keeping necessary to achieve O(K²) for pairwise distances and the remaining complexity follows from the standard Orlin’s algorithm. What remains unclear from the paper is that, while the chains couple faster (fastest?), the resulting estimators do not necessarily improve upon budget-equivalent alternatives. (The reason for the failure of the single chain in Figure 2 is hard to fathom.)

Le Monde puzzle [#1164]

Posted in Books, Kids, R with tags , , , , , , , , , , , , , on November 16, 2020 by xi'an

The weekly puzzle from Le Monde is quite similar to older Diophantine episodes (I find myself impossible to point out):

Give the maximum integer that cannot be written as 105x+30y+14z. Same question for 105x+70y+42z+30w.

These are indeed Diophantine equations and the existence of a solution is linked with Bézout’s Lemma. Take the first equation. Since 105 and 30 have a greatest common divisor equal to 3×5=15, there exists a pair (x⁰,y⁰) such that

105 x⁰ + 30 y⁰ = 15

hence a solution to every equation of the form

105 x + 30 y = 15 a

for any relative integer a. Similarly, since 14 and 15 are co-prime,

there exists a pair (a⁰,b⁰) such that

15 a⁰ + 14 b⁰ = 1

hence a solution to every equation of the form

15 a⁰ + 14 b⁰ = c

for every relative integer c. Meaning 105x+30y+14z=c can be solved in all cases. The same result applies to the second equation. Since algorithms for Bézout’s decomposition are readily available, there is little point in writing an R code..! However, the original question must impose the coefficients to be positive, which of course kills the Bézout’s identity argument. Stack Exchange provides the answer as the linear Diophantine problem of Frobenius! While there is no universal solution for three and more base integers, Mathematica enjoys a FrobeniusNumber solver. Producing 271 and 383 as the largest non-representable integers. Also found by my R code

o=function(i,e,x){
  if((a<-sum(!!i))==sum(!!e))sol=(sum(i*e)==x) else{sol=0
    for(j in 0:(x/e[a+1]))sol=max(sol,o(c(i,j),e,x))}
  sol}
a=(min(e)-1)*(max(e)-1)#upper bound
M=b=((l<-length(e)-1)*prod(e))^(1/l)-sum(e)#lower bound
for(x in a:b){sol=0
for(i in 0:(x/e[1]))sol=max(sol,o(i,e,x))
M=max(M,x*!sol)}

(And this led me to recover the earlier ‘Og entry on the coin problem! As of last November.) The published solution does not bring any useful light as to why 383 is the solution, except for demonstrating that 383 is non-representable and any larger integer is representable.

Le Monde puzzle [#1159]

Posted in Books, Kids, R with tags , , , , , , on October 6, 2020 by xi'an

The weekly puzzle from Le Monde is quite similar to #1157:

Is it possible to break the ten first integers, 1,…,10, into two groups such that the sum over the first group is equal to the product over the second? Is it possible that the second group is of cardinal 4? of cardinal 3?

An exhaustive R search returns 3 solutions by

library(R.utils)
bitz<-function(i)
  c(rev(as.binary(i)),rep(0,10))[d<-1:10]
for (i in 1:2^10)
  if (sum(d[!!bitz(i)])==prod(b<-d[!bitz(i)])) print(b)
[1]  1  4 10 #40
[1] 6 7 #42
[1] 1 2 3 7 #42

which brings a positive reply to the question. Moving from N=10 to N=19 produces similar results

[1]  1  9 18 #162
[1]  2  6 14 #168
[1]  1  3  4 14 #168
[1]  1  2  7 12 #168

with this interesting pattern of only two acceptable products, but I am obviously unable to run the same code for N=49, which is the subsidiary question to the puzzle. Turning to a more conceptual (!) approach, over a long insomnia bout (!!) and a subsequent run, I realised that if there are three terms, x¹,x² and x³, in the second group, they need satisfy

x¹x²x³+x¹+x²+x³=½N(N+1)

and if in addition one of them is equal to 1, x¹ say, this equation simplifies into

(x²+1)(x³+1)=½N(N+1)

which always leads to a solution, as e.g. for N=49,

x¹=1, x²=24 and x³=48.

A brute-force search also led to a four term solution in that case

x¹=1, x²=7, x³=10 and x⁴=17.

Le Monde puzzle [#1650]

Posted in Books, Kids, R with tags , , , , , , , , , on September 5, 2018 by xi'an

A penultimate Le Monde mathematical puzzle  before the new competition starts [again!]

For a game opposing 40 players over 12 questions, anyone answering correctly a question gets as reward the number of people who failed to answer. Alice is the single winner: what is her minimal score? In another round, Bob is the only lowest grade: what is his maximum score?

For each player, the score S is the sum δ¹s¹+…+δ⁸s⁸, where the first term is an indicator for a correct answer and the second term is the sum over all other players of their complementary indicator, which can be replaced with the sum over all players since δ¹(1-δ¹)=0. Leading to the vector of scores

worz <- function(ansz){
  scor=apply(1-ansz,2,sum)
  return(apply(t(ansz)*scor,2,sum))}

Now, running by brute-force a massive number of simulations confirmed my intuition that the minimal winning score is 39, the number of players minus one [achieved by Alice giving a single good answer and the others none at all], while the maximum loosing score appeared to be 34, for which I had much less of an intuition!  I would have rather guessed something in the vicinity of 80 (being half of the answers replied correctly by half of the players)… Indeed, while in SIngapore, I however ran in the wee hours a quick simulated annealing code from this solution and moved to 77.

And the 2018 version of Le Monde maths puzzle competition starts today!, for a total of eight double questions, starting with an optimisation problem where the adjacent X table is filled with zeros and ones, trying to optimise (max and min) the number of positive entries [out of 45] for which an even number of neighbours is equal to one. On the represented configuration, green stands for one (16 ones) and P for the positive entries (31 of them). This should be amenable to a R resolution [R solution], by, once again!, simulated annealing. Deadline for the reply on the competition website is next Tuesday, midnight [UTC+1]

Le Monde puzzle [#1033]

Posted in Books, Kids, R with tags , , , , on December 19, 2017 by xi'an

lemondapariA simple Le Monde mathematical puzzle after two geometric ones I did not consider:

  1. Bob gets a 2×3 card with three integer entries on the first row and two integer entries on the second row such that (i) entry (1,1) is 1, (ii) summing up subsets of adjacent entries produces all integers from 1 to 21. (Adjacent means sharing an index.) Deduce Bob’s voucher.
  2.  Alice gets Bob’s voucher completed into a 2×4 card with further integer entries. What is the largest value of N such that all integers from 1 to N are available through summing up all subsets of entries?

The first question only requires a few attempts but it can be solves by brute force simulation. Here is a R code that leads to the solution:

alsumz<-function(sol){return( 
  c(sol,sum(sol[1:2]),sum(sol[2:3]),sum(sol[4:5]),
  sum(sol[c(1,4)]), sum(sol[c(1,5)]),sum(sol[1:3]),
  sum(sol[c(1,4,5)]),sum(sol[c(1,2,5)]),
  sum(sol[c(2,4,5)]), sum(sol[c(1,2,3,5)]),sum(sol[2:5]), 
  sum(sol[c(1,2,4)]),sum(sol[c(1,2,4,5)]),sum(sol[1:4]),
  sum(sol)),sum(sol[c(2,3,5)]))}

produces (1,8,7,3,2) as the only case for which

(length(unique(alsum(sol)))==21)

The second puzzle means considering all sums and checking there exists a solution for all subsets. There is no constraint in this second question, hence on principle this could produce N=2⁸-1=255, but I have been unable to exceed 175 through brute force simulation. (Which entitled me to use the as.logical(intToBits(i) R command!)

%d bloggers like this: