a Simpson paradox of sorts

The riddle from The Riddler this week is about finding an undirected graph with N nodes and no isolated node such that the number of nodes with more connections than the average of their neighbours is maximal. A representation of a connected graph is through a matrix X of zeros and ones, on which one can spot the nodes satisfying the above condition as the positive entries of the vector (X1)^2-(X^21), if 1 denotes the vector of ones. I thus wrote an R code aiming at optimising this target

targe <- function(F){
  sum(F%*%F%*%rep(1,N)/(F%*%rep(1,N))^2<1)}

by mere simulated annealing:

rate <- function(N){ 
# generate matrix F
# 1. no single 
F=matrix(0,N,N) 
F[sample(2:N,1),1]=1 
F[1,]=F[,1] 
for (i in 2:(N-1)){ 
if (sum(F[,i])==0) 
F[sample((i+1):N,1),i]=1 
F[i,]=F[,i]} 
if (sum(F[,N])==0) 
F[sample(1:(N-1),1),N]=1 
F[N,]=F[,N] 
# 2. more connections 
F[lower.tri(F)]=F[lower.tri(F)]+
  sample(0:1,N*(N-1)/2,rep=TRUE,prob=c(N,1)) 
F[F>1]=1
F[upper.tri(F)]=t(F)[upper.tri(t(F))]
#simulated annealing
T=1e4
temp=N
targo=targe(F)
for (t in 1:T){
  #1. local proposal
  nod=sample(1:N,2)
  prop=F
  prop[nod[1],nod[2]]=prop[nod[2],nod[1]]=
     1-prop[nod[1],nod[2]]
  while (min(prop%*%rep(1,N))==0){
    nod=sample(1:N,2)
    prop=F
    prop[nod[1],nod[2]]=prop[nod[2],nod[1]]=
     1-prop[nod[1],nod[2]]}
  target=targe(prop)
  if (log(runif(1))*temp<target-targo){ 
    F=prop;targo=target} 
#2. global proposal 
  prop=F prop[lower.tri(prop)]=F[lower.tri(prop)]+
   sample(c(0,1),N*(N-1)/2,rep=TRUE,prob=c(N,1)) 
prop[prop>1]=1
  prop[upper.tri(prop)]=t(prop)[upper.tri(t(prop))]
  target=targe(prop)
  if (log(runif(1))*temp<target-targo){
      F=prop;targo=target}
   temp=temp*.999
   }
return(F)}

Eward SimpsonThis code returns quite consistently (modulo the simulated annealing uncertainty, which grows with N) the answer N-2 as the number of entries above average! Which is rather surprising in a Simpson-like manner since all entries but two are above average. (Incidentally, I found out that Edward Simpson recently wrote a paper in Significance about the Simpson-Yule paradox and him being a member of the Bletchley Park Enigma team. I must have missed out the connection with the Simpson paradox when reading the paper in the first place…)

One Response to “a Simpson paradox of sorts”

  1. […] Tip from X’ian’s OG, by way of R Bloggers. […]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s