Archive for sorting

Le Monde puzzle [#1110]

Posted in Books, Kids, R, Travel with tags , , , , , , , , , on September 16, 2019 by xi'an

A low-key sorting problem as Le Monde current mathematical puzzle:

If the numbers from 1 to 67 are randomly permuted and if the sorting algorithm consists in picking a number i with a position higher than its rank i and moving it at the correct i-th position, what is the maximal number of steps to sort this set of integers when the selected integer is chosen optimaly?

As the question does not clearly state what happens to the number j that stood in the i-th position, I made the assumption that the entire sequence from position i to position n is moved one position upwards (rather than having i and j exchanged). In which case my intuition was that moving the smallest moveable number was optimal, resulting in the following R code

sorite<-function(permu){ n=length(permu) p=0 while(max(abs(permu-(1:n)))>0){
    j=min(permu[permu<(1:n)])
    p=p+1
    permu=unique(c(permu[(1:n)<j],j,permu[j:n]))} 
  return(p)}

which takes at most n-1 steps to reorder the sequence. I however checked this insertion sort was indeed the case through a recursive function

resorite<-function(permu){ n=length(permu);p=0 while(max(abs(permu-(1:n)))>0){
    j=cand=permu[permu<(1:n)]
    if (length(cand)==1){
      p=p+1
      permu=unique(c(permu[(1:n)<j],j,permu[j:n]))
    }else{
      sol=n^3
      for (i in cand){
        qermu=unique(c(permu[(1:n)<i],i,permu[i:n]))
        rol=resorite(qermu)
        if (rol<sol)sol=rol}
      p=p+1+sol;break()}} 
  return(p)}

which did confirm my intuition.

an inverse permutation test

Posted in Books, Kids, R, Statistics with tags , , , , on September 23, 2016 by xi'an

A straightforward but probabilistic riddle this week in the Riddler, which is to find the expected order of integer i when the sequence {1,2,…,n} is partitioned at random into two sets, A and B, each of which is then sorted before both sets are merged. For instance, if {1,2,3,4} is divided in A={1,4} and B={2,3}, the order of 2 in {1,4,2,3} is 3. An R rendering of the experiment is

m=rbinom(1,n,.5)
if (m*(n-m)>0){
fist=sort(sample(1:n,m))
return(order(c(fist,sort((1:n)[-fist])))[i])
}else{
return(i)}

It is rather easy to find that the probability that the order of i takes the value j is

{i-1 \choose j-1}(1/2)^i

if j<i (i is in A) and

{n-i \choose n-j}(1/2)^{n-i+1}

if $j>i$ (i is in B), the case i=j being the addition of both cases, but the mean can be found (almost) immediately by considering that, when i is in A, its average value is (i+1)/2 and when it is in B, its average value is (n+i)/2 [by symmetry]. Hence a global mean of (2i+n+1)/4….

Le Monde puzzle [#855]

Posted in Books, Kids, Statistics with tags , , , , , , on March 7, 2014 by xi'an

A Le Monde mathematical puzzle that reminds me of an earlier one:

Given ten tokens with different unknown weights, and a scale that can rank three tokens at a time, starting with ranking three tokens, what is the minimum number of actions necessary to rank the ten of them if (a) one token at a time is added, (b) one or two tokens are added? If no restriction is imposed on the token introduction, is there a more efficient strategy?

It indeed relates to earlier posts on sorting and ternary sorting. Checking further on StackOverflow I found this reply to a question about ternary sorting:

Average number of comparisons:

in ternary search = ((1/3)*1+(2/3)*2)*ln(n)/ln(3)~1.517*ln(n)
in binary search  =                 1*ln(n)/ln(2)~1.443*ln(n)

Worst number of comparisons:

in ternary search = 2*ln(n)/ln(3)~1.820*ln(n)
in binary search  = 1*ln(n)/ln(2)~1.443*ln(n)

albeit with no reference. So this somewhat answers part (c) of the question. (If asymptotically since this does not work for n=10! Even for n=100, it is way too large.) Looking for a solution to part (a), I looked at the performances of a dyadic sorting algorithm, partitioning recursively the already sorted part of the sample into three parts to locate each time the proper position of the new token. This led me to the following code

rang=function(x,ranj,taken){

  mag=max(ranj)-min(ranj)+1
  i1=ranj[1]-1+trunc(mag/3)
  i2=ranj[1]-1+trunc(2*mag/3)
  locrk=rank(c(taken[c(i1,i2)],x))

  if (locrk[3]==1) return(ifelse(i1>3,
     1+rang(x,ranj=1:(i1-1),taken),
     1+(i1>1)))
  if (locrk[3]==2) return(ifelse(i2-i1>4,
     1+rang(x,ranj=(i1+1):(i2-1),taken),ranj=1:(i1-1),taken),
     1+(i1>1)))
  if (locrk[3]==3) return(ifelse(mag-i2>2,
     1+rang(x,ranj=(i2+1):mag,taken),ranj=1:(i1-1),taken),
     1+(i1>1)))
}

ternone=function(toke){

toke[1:3]=sort(toke[1:3])
counter=1
for (ul in (4:length(toke))){

  counter=counter+rang(toke[ul],1:(ul-1),toke)
  toke[1:ul]=sort(toke[1:ul])
  }

return(counter)
}

ternone(sample(1:10))

which produces a minimum of eight (of course!) and a maximum of 20 uses of the scale (based on 10⁶ random permutations of (1,…,10)). Unsurprisingly, the solution proposed in Le Monde the week after does better as it obtains 16 as the worst case figure. Repeating the experiment with n=100 values, the maximum was 303 (with a mean of 270 and a minimum of 240 ternary comparisons).

Moving to an insertion two tokens at a time, I tested a scheme where two new tokens were tested against the current median, then the median of one half, and so on until they split on both sides of this median. Here is the corresponding R code

ring=function(x,y,ranj,taken){

mag=max(ranj)-min(ranj)+1
i1=ranj[1]-1+trunc(mag/2)
locrk=rank(c(taken[i1],x,y))

if (locrk[1]==3) return(ifelse(i1>2,
    1+ring(x,y,ranj=1:(i1-1),taken),
    1+(i1==2)))
if (locrk[1]==2) return(ifelse(mag>4,
    1+rang(min(x,y),ranj=min(ranj):(i1-1),taken)
    +rang(max(x,y),(i1+1):max(ranj),taken),1))
if (locrk[1]==1) return(ifelse(mag-i1>2,
    1+ring(x,y,ranj=(i1+1):mag,taken),
    1+(mag-i1>1)))
}

terntwo=function(toke){

toke[1:3]=sort(toke[1:3])
counter=1+rang(toke[4],1:3,toke)
toke[1:4]=sort(toke[1:4])

for (ul in -1+2*(3:(length(toke)/2))){

  counter=counter+ring(toke[ul],toke[ul+1],1:(ul-1),toke)
  toke[1:ul]=sort(toke[1:ul])
  ul=ul+1
  }

return(counter)
}

leading to a value of 13.

This Feb. 19 issue of Le Monde Science&Médecine leaflet also contains a two page report on growing psychological work-related issues like stress, burn-out and depression, in research labs, mostly connected with the increase in administrative duties and grant writing, duties most of us have not been trained for. There is also a short column by Etienne Gys telling about the short lived claim by Moukhtarbaï Otelbaev to have solved the Navier-Stokes problem. Until Terry Tao wrote a paper stating why the proof could not work.

Ternary sorting

Posted in R with tags , , , , , on July 25, 2011 by xi'an

The last Le Monde puzzle made me wonder about the ternary version of the sorting algorithms, which all seem to be binary (compare x and y, then…). The problem is, given (only) a blackbox procedure that returns the relative order of three arbitrary numbers, how many steps are necessary to sort a series of n nnumbers? The heapsort entry in Wikipedia mentions a ternary sorting version, but does not get into details. Robert Sedgewick (author of a fabulous book on algorithmic I enjoyed very much when I started programming) has a talk about the optimality of quicksort where he mentions ternary sorting, but this seems to be more related with ties than with my problem… It is of course highly formal in that I do not know of a physical device that would justify moving from binary to ternary comparisons.