## (x=scan())%in%(2*4^(n=0:x)-2^n-1)

Posted in Books, Kids, R on March 28, 2019 by xi'an

One challenge on code golf is to find the shortest possible code to identify whether or not an integer belongs to the binary cyclops numbers, whose binary expansions are 0, 101, 11011, 1110111, 111101111, &tc. The n-th such number being

$a(n) = 2^{2n + 1} - 2^n - 1 = 2\,4^n - 2^n - 1 = (2^n - 1)(2\,2^n + 1)$

this leads to the above solution in R (26 bytes). The same length as the C solution [which I do not get]

f(n){n=~n==(n^=-~n)*~n/2;}

And with shorter versions in many esoteric languages I had never heard of, like the 8 bytes Brachylog code

ḃD↔Dḍ×ᵐ≠

or the 7 bytes Jelly

B¬ŒḂ⁼SƊ

As a side remark, since this was not the purpose of the game, the R code is most inefficient in creating a set of size (x+1), with most terms being Inf.
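As a hedged illustration of this remark, the search over n can stop much earlier: since a(n) + 1 = 2^n (2^(n+1) - 1) ≥ 2^n, any n with a(n) = x satisfies n ≤ log2(x + 1). A minimal sketch, where the names `cyclops` and `is_cyclops` are mine and not part of the golfed answer:

```r
# a(n) = 2*4^n - 2^n - 1, the n-th binary cyclops number
cyclops <- function(n) 2 * 4^n - 2^n - 1

# membership test scanning only n = 0, ..., floor(log2(x + 1)),
# instead of n = 0, ..., x as in the golfed version
is_cyclops <- function(x) {
  if (x < 0) return(FALSE)
  n <- 0:floor(log2(x + 1))
  x %in% cyclops(n)
}

cyclops(0:4)    # first five cyclops numbers
is_cyclops(27)  # 11011 in binary
is_cyclops(26)  # 11010 in binary
```

This is obviously longer than the golfed version, the point being only that the candidate set grows like log(x) rather than x.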

## València summer school

Posted in Kids, pictures, R, Running, Statistics, Travel, University life, Wines on January 31, 2018 by xi'an

In another continuation of the summer of Bayesian conferences in Europe, the Universitat de València is organising a summer school on Bayesian statistics, from 16 July till 20 July, 2018. Which thus comes right after our summer school on computational statistics at Warwick. With a basic course on Bayesian learning (2 days). And a more advanced course on Bayesian modeling with BayesX. And a final day workshop.

## Extending R

Posted in Books, Kids, R, Statistics on July 13, 2016 by xi'an

As I was previously unaware of this book coming up, my surprise and excitement were both extreme when I received it from CRC Press a few weeks ago! John Chambers, one of the fathers of S, precursor of R, had just published a book about extending R. It covers some reflections of the author on programming and the story of R (Parts 2 and 1), and then focuses on object-oriented programming (Part 3) and the interfaces from R to other languages (Part 4). While this is “only” a programming book, and thus not strictly appealing to statisticians, reading one of the original actors’ thoughts on the past, present, and future of R is simply fantastic!!! And John Chambers is definitely not calling to simply start over and build something better, as Ross Ihaka did in this [most read] post a few years ago. (It is also great to see the names of friends appearing at times, like Julie, Luke, and Duncan!)

“I wrote most of the original software for S3 methods, which were useful for their application, in the early 1990s.”

In the (hi)story part, Chambers delves into the details of the evolution of S at Bell Labs, as described in his [first] “blue book” (which I kept on my shelf until very recently, next to the “white book”!) and of the occurrence of R in the mid-1990s. I find those sections fascinating, maybe all the more because I am somewhat of a contemporary, having first learned Fortran (and Pascal) in the mid-1980s, before moving in the early 1990s to C (that I mostly coded as translated Pascal!), S-plus and eventually R, in conjunction with a (forced) migration from Unix to Linux, as my local computer managers abandoned Unix and mainframe in favour of some virtual Windows machines. And as I started running R on laptops with the help of friends more skilled than I (again keeping some of the early R manuals on my shelf until recently). Maybe one of the most surprising things about those reminiscences is that the very first version of R was dated Feb 29, 2000! Not because of Feb 29, 2000 (which, as Chambers points out, is the first use of the third-order correction to the Gregorian calendar, although I would have thought 1600 was the first one), but because I would have thought it appeared earlier, in conjunction with my first Linux laptop, but this memory is alas getting too vague!

As indicated above, the book is mostly about programming, which means in my case that some sections are definitely beyond my reach! For instance, reading “the onus is on the person writing the calling function to avoid using a reference object as the argument to an existing function that expects a named list” is not immediately clear… Nonetheless, most sections are readable [at my level] and enlightening about the mottoes “everything that exists is an object” and “everything that happens is a function” repeated throughout.  (And about my psycho-rigid ways of translating Pascal into every other language!) I obviously learned about new commands and notions, like the difference between

`x <- 3`

and

`x <<- 3`

(but I was disappointed to learn that the number of <‘s was not related to the depth or height of the allocation!) In particular, I found the part about replacement fascinating, explaining how a command like

`diag(x)[i] = 3`

could modify x directly. (While definitely worth reading, the chapter on R packages could have benefited from more details. But as Chambers points out there are whole books about this.) Overall, I am afraid the book will not improve my (limited) way of programming in R but I definitely recommend it to anyone even moderately skilled in the language.
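To make the two notions above concrete, here is a minimal sketch (my own toy example, not taken from the book): the first part contrasts local assignment with `<<-`, the second part spells out the nested replacement calls that a command like `diag(x)[i] = 3` amounts to. The names `f`, `g`, `m`, and `m2` are hypothetical.

```r
# local versus enclosing-environment assignment
x <- 0
f <- function() { x <- 3; invisible(NULL) }   # only changes a local x
g <- function() { x <<- 3; invisible(NULL) }  # changes the x found in the enclosing environment
f()
x_after_f <- x   # x is unchanged by f()
g()
x_after_g <- x   # x has been modified by g()

# replacement: diag(m)[2] <- 3 is evaluated, in essence, as nested calls
# to the replacement functions `[<-` and `diag<-`
m <- matrix(0, 3, 3)
diag(m)[2] <- 3

m2 <- matrix(0, 3, 3)
m2 <- `diag<-`(m2, value = `[<-`(diag(m2), 2, value = 3))
identical(m, m2)
```

In other words, nothing is modified "in place" from the language's point of view: a new diagonal is built, then a new matrix, which is reassigned to the same name.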

## Glibc GHOST vulnerability

Posted in Linux on January 28, 2015 by xi'an

Just heard about a security vulnerability on Linux machines running Red Hat versions 5 to 7, Ubuntu 10.04 and 12.04, Debian version 7, Fedora versions 19 and older, and SUSE versions 11 and older. The vulnerability occurs through a buffer overflow in some functions of the C library Glibc, which allows for remote code execution, and the fix to the problem is indicated on that nixCraft webpage. (It is also possible to run the GHOST C code if you want to live dangerously!)

## simulated annealing for Sudokus [2]

Posted in Books, pictures, R, Statistics, University life on March 17, 2012 by xi'an

On Tuesday, Eric Chi and Kenneth Lange arXived a paper on a comparison of numerical techniques for solving sudokus. (The very Kenneth Lange who wrote this fantastic book on numerical analysis.) One of these techniques is the simulated annealing approach I had played with a long while ago. They seem to use the same penalisation function as mine, i.e., the number of constraint violations, but the moves are different in that they pick pairs of cells without clues (i.e., not constrained) and swap their contents. The pairs are not picked at random but with probability proportional to exp(k), if k is the number of constraint violations. The temperature decreases geometrically and the simulated annealing program stops when the zero cost is achieved or when a maximum of 10⁵ iterations is reached. The R program I wrote while visiting SAMSI had more options, but it was also horrendously slow! The CPU time reported by the authors is far far lower, almost in the range of the backtracking solution that serves as their reference. (Of course, it is written in Fortran 95, not in R…) As in my case, the authors mentioned they sometimes get stuck in a local minimum with only 2 cells with constraint violations.

So I reprogrammed an R code following (as much as possible) their scheme. However, I do not get a better behaviour than with my earlier code, and certainly no solution within seconds, if any. For instance, the temperature decrease in $100\,(.99)^t$ seems too steep to manage 10⁵ steps. So, either I am missing a crucial element in the code, or my R code is very poor and clever Fortran programming does the trick! Here is my code

```r
# overall cost: number of duplicated digits over rows, columns and 3x3 boxes
target=function(s){
tar=sum(apply(s,1,duplicated)+apply(s,2,duplicated))
for (r in 1:9){
bloa=(1:3)+3*((r-1)%%3)
blob=(1:3)+3*trunc((r-1)/3)
tar=tar+sum(duplicated(as.vector(s[bloa,blob])))
}
return(tar)
}

cost=function(i,j,s){
#constraint violations in cell (i,j)
#(the cell itself is counted once in its column, its row and its box)
cos=sum(s[,j]==s[i,j])+sum(s[i,]==s[i,j])
boxa=3*trunc((i-1)/3)+1
boxb=3*trunc((j-1)/3)+1
cos+sum(s[boxa:(boxa+2),boxb:(boxb+2)]==s[i,j])
}

#random starting grid: fills the free cells of the (global) clue grid con
#so that each digit appears nine times overall
entry=function(){
s=con
pop=NULL
for (i in 1:9)
pop=c(pop,rep(i,9-sum(con==i)))
s[s==0]=sample(pop)
return(s)
}

move=function(tau,s,con){
#local cost of each free cell (con==0), used as sampling weight
pen=(1:81)
for (i in pen[con==0])
pen[i]=cost(((i-1)%%9)+1,trunc((i-1)/9)+1,s)
#pick two free cells and propose swapping their contents
wi=sample((1:81)[con==0],2,prob=exp(pen[(1:81)[con==0]]))
prop=s
prop[wi[1]]=s[wi[2]]
prop[wi[2]]=s[wi[1]]

#Metropolis acceptance: the cost difference is divided by tau inside exp()
if (runif(1)<exp((target(s)-target(prop))/tau))
s=prop
return(s)
}

#Example:
s=matrix(0,ncol=9,nrow=9)
s[1,c(1,6,7)]=c(8,1,2)
s[2,c(2:3)]=c(7,5)
s[3,c(5,8,9)]=c(5,6,4)
s[4,c(3,9)]=c(7,6)
s[5,c(1,4)]=c(9,7)
s[6,c(1,2,6,8,9)]=c(5,2,9,4,7)
s[7,c(1:3)]=c(2,3,1)
s[8,c(3,5,7,9)]=c(6,2,1,9)

con=s
tau=100
s=entry()
#geometric cooling, with 100 moves per temperature level
for (t in 1:10^4){
for (v in 1:100) s=move(tau,s,con)
tau=tau*.99
if (target(s)==0) break
}
```
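As a quick check of the remark on the cooling schedule, the temperature sequence used in the loop above can be computed directly. This is only a sketch of the arithmetic, with variable names of my own (`t_steps`, `tau_seq`, &tc.), not part of the Chi and Lange code:

```r
# geometric cooling schedule: tau_t = 100 * 0.99^t
t_steps <- 0:10^4
tau_seq <- 100 * 0.99^t_steps

# first cooling step at which the temperature drops below 1
first_below_one <- t_steps[which(tau_seq < 1)[1]]

# temperature reached after 10^3 and 10^4 cooling steps
tau_1e3 <- tau_seq[t_steps == 10^3]
tau_1e4 <- tau_seq[t_steps == 10^4]

# acceptance probability of a move increasing the cost by one unit
# at those two temperatures
exp(-1 / c(tau_1e3, tau_1e4))
```

The temperature falls below 1 after 459 cooling steps and is of order 10⁻³ after a thousand steps, at which point a move increasing the cost by a single violation is essentially never accepted, so most of the run behaves like a greedy descent.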

## speed of R, C, &tc.

Posted in R, Running, Statistics, University life on February 3, 2012 by xi'an

My Paris colleague (and fellow-runner) Aurélien Garivier has produced an interesting comparison of 4 (or 6 if you consider scilab and octave as different from matlab) computer languages in terms of speed for producing the MLE in a hidden Markov model, using EM and the Baum-Welch algorithms. His conclusions are that

• matlab is a lot faster than R and python, especially when vectorization is important: this is why the difference is spectacular on filtering/smoothing, not so much on the creation of the sample;
• octave is a good matlab emulator, if no special attention is paid to execution speed…;
• scilab appears as a credible, efficient alternative to matlab;
• still, C is a lot faster; the inefficiency of matlab in loops is well-known, and clearly shown in the creation of the sample.

(In this implementation, R is “only” three times slower than matlab, so this is not so damning…) All the codes are available and you are free to make suggestions to improve the speed of your favourite language!
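To illustrate why the creation of the sample and the rest of the computation behave differently, here is a toy R sketch of my own (not Aurélien's code, and with a hypothetical two-state transition matrix `P` and state means `mu`): the hidden chain is inherently sequential and calls for a loop, while the observations given the chain can be produced in one vectorised call.

```r
set.seed(1)
n <- 1e4
P <- matrix(c(0.9, 0.1,
              0.2, 0.8), 2, 2, byrow = TRUE)  # hypothetical transition matrix
mu <- c(-1, 1)                                # hypothetical state means

# hidden Markov chain: each state depends on the previous one, hence a loop
z <- integer(n)
z[1] <- 1L
for (t in 2:n) z[t] <- sample.int(2, 1, prob = P[z[t - 1], ])

# observations: a single vectorised call, no loop needed
y <- rnorm(n, mean = mu[z])
```

Wrapping each part in `system.time()` gives a rough idea of where an interpreted language spends its time.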

## the Art of R Programming [guest post]

Posted in Books, R, Statistics, University life on January 31, 2012 by xi'an

(This post is the preliminary version of a book review by Alessandra Iacobucci, to appear in CHANCE. Enjoy [both the review and the book]!)

As Rob J. Hyndman enthusiastically declares in his blog, “this is a gem of a book”. I would go even further and argue that The Art of R Programming is a whole mine of gems. The book is well constructed, and has a very coherent structure.

After an introductory chapter, where the reader gets a quick overview of R basics that allows her to work through the examples in the following chapters, the rest of the book can be divided into three main parts. In the first part (Chapters 2 to 6) the reader is introduced to the main R objects and to the functions built to handle and operate on each of them. The second part (Chapters 7 to 13) is focussed on general programming issues: R structures and object-oriented nature, I/O, string handling and manipulating issues, and graphics. Chapter 13 is entirely devoted to the topic of debugging. The third part deals with more advanced topics, such as speed of execution and performance issues (Chapter 14), mix-matching functions written in R and C (or Python), and parallel processing with R. Even though this last part is intended for more experienced programmers, the overall programming skills of the intended reader “may range anywhere from those of a professional software developer to `I took a programming course in college’.” (p.xxii).

With a fluent style, Matloff is able to deal with a large number of topics in a relatively limited number of pages, resulting in an astonishingly complete yet handy guide. At almost every page we discover a new command, most likely the command we had always looked for and done without by means of more or less cumbersome workarounds. As a matter of fact, it is possible that there exists a ready-made and perfectly suited R function for nearly anything that comes to one’s mind. Users coming from compiled programming languages may find it difficult to get used to this wealth of functions, just as they may feel uncomfortable not declaring variable types, not initializing vectors and arrays, or getting rid of loops. Nevertheless, through numerous examples and a precise knowledge of its strengths and limitations, Matloff masterfully introduces the reader to the flexibility of R. He repeatedly underlines the functional nature of R in every part of the book and stresses from the outset how this feature has to be exploited for effective programming.