## Archive for C

## València summer school

Posted in Kids, pictures, R, Running, Statistics, Travel, University life, Wines with tags Bayesian statistics, BayesX, C, R, short courses, Spain, summer school, València, workshop on January 31, 2018 by xi'an**I**n another continuation of the summer of Bayesian conferences in Europe, the Universitat de València is organising a summer school on Bayesian statistics, from 16 July till 20 July, 2018, which thus comes right after our summer school on computational statistics at Warwick. It includes a basic course on Bayesian learning (2 days), a more advanced course on Bayesian modelling with BayesX, and a final day workshop.

## Extending R

Posted in Books, Kids, R, Statistics with tags Bell Labs, book review, C, CRAN, extending R, Fortran, John Chambers, laptop, Linux, Luke Tierney, object-oriented programming, packages, Pascal, R, Ross Ihaka, S, S-plus, unix on July 13, 2016 by xi'an**A**s I was previously unaware of this book coming up, my surprise and excitement were both extreme when I received it from CRC Press a few weeks ago! John Chambers, one of the fathers of S, the precursor of R, had just published a book about extending R. It covers some reflections of the author on programming and the story of R (Parts 2 and 1), and then focuses on object-oriented programming (Part 3) and the interfaces from R to other languages (Part 4). While this is “only” a programming book, and thus not strictly appealing to statisticians, reading one of the original actors’ thoughts on the past, present, and future of R is simply fantastic!!! And John Chambers is definitely not calling to simply start over and build something better, as Ross Ihaka did in this [most read] post a few years ago. (It is also great to see the names of friends appearing at times, like Julie, Luke, and Duncan!)

“I wrote most of the original software for S3 methods, which were useful for their application, in the early 1990s.”

In the (hi)story part, Chambers delves into the details of the evolution of S at Bell Labs, as described in his [first] “blue book” (which I kept on my shelf until very recently, next to the “white book”!) and of the appearance of R in the mid-1990s. I find those sections fascinating, maybe all the more because I am somewhat of a contemporary, having first learned Fortran (and Pascal) in the mid-1980s, before moving in the early 1990s to C (which I mostly coded as translated Pascal!), S-plus and eventually R, in conjunction with a (forced) migration from Unix to Linux, as my local computer managers abandoned Unix and mainframes in favour of virtual Windows machines, and as I started running R on laptops with the help of friends more skilled than I (again keeping some of the early R manuals on my shelf until recently). Maybe the most surprising thing about those reminiscences is that the very first version of R was dated Feb 29, 2000! Not because of Feb 29, 2000 itself (which, as Chambers points out, is the first use of the third-order correction to the Gregorian calendar, although I would have thought 1600 was the first one), but because I would have thought R appeared earlier, in conjunction with my first Linux laptop; alas, this memory is getting too vague!

As indicated above, the book is mostly about programming, which means in my case that some sections are definitely beyond my reach! For instance, reading “*the onus is on the person writing the calling function to avoid using a reference object as the argument to an existing function that expects a named list*” is not immediately clear… Nonetheless, most sections are readable [at my level] and enlightening about the mottoes “*everything that exists is an object*” and “*everything that happens is a function*” repeated throughout. (And about my psycho-rigid ways of translating Pascal into every other language!) I obviously learned about new commands and notions, like the difference between

```
x <- 3
```

and

```
x <<- 3
```
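To make the difference concrete, here is a minimal sketch of my own (not an example from the book): `<-` assigns in the current environment, while `<<-` reaches into the enclosing one, which is what makes closures with state possible.

```
counter = function(){
  n = 0
  function(){
    n <<- n + 1  # updates the n of the enclosing environment, not a local copy
    n
  }
}
tick = counter()
tick()  # returns 1
tick()  # returns 2
```

Had the inner function used `n <- n + 1` instead, a new local `n` would be created at each call and the counter would never advance.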

(but I was disappointed to learn that the number of `<`‘s was not related to the depth or height of the allocation!) In particular, I found the part about replacement functions fascinating, explaining how a command like

```
diag(x)[i] = 3
```

could modify x directly. (While definitely worth reading, the chapter on R packages could have benefited from more details. But as Chambers points out there are whole books about this.) Overall, I am afraid the book will not improve *my* (limited) way of programming in R but I definitely recommend it to anyone even moderately skilled in the language.
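For the record, such replacement calls rely on functions whose names end in `<-`; here is a minimal sketch with a made-up `second<-` replacement function (my example, not the book’s):

```
# any function named `name<-` can appear on the left-hand side of an assignment
`second<-` = function(x, value){
  x[2] = value
  x
}
x = c(10, 20, 30)
second(x) = 99  # R evaluates this as x = `second<-`(x, value = 99)
x               # now 10 99 30
```

The same mechanism chains for `diag(x)[i] = 3`, combining the `[<-` and `diag<-` replacement functions behind the scenes.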

## Glibc GHOST vulnerability

Posted in Linux with tags C, Glibc, Kubuntu 12.04, Linux, security vulnerability, Ubuntu 10.10 on January 28, 2015 by xi'an**J**ust heard about a security vulnerability on Linux machines running Red Hat versions 5 to 7, Ubuntu 10.04 and 12.04, Debian version 7, Fedora versions 19 and older, and SUSE versions 11 and older. The vulnerability occurs through a buffer overflow in some functions of the C library Glibc, which allows remote code execution, and the fix to the problem is indicated on that nixCraft webpage. (It is also possible to run the GHOST C code if you want to live dangerously!)

## simulated annealing for Sudokus [2]

Posted in Books, pictures, R, Statistics, University life with tags C, Colosseo, Fortran 95, R, Roma, simulated annealing, sudoku, temperature on March 17, 2012 by xi'an**O**n Tuesday, Eric Chi and Kenneth Lange arXived a paper on a comparison of numerical techniques for solving sudokus. (The very Kenneth Lange who wrote this fantastic book on numerical analysis.) One of these techniques is the simulated annealing approach I had played with a long while ago. They seem to use the same penalisation function as mine, i.e., the number of constraint violations, but the moves are different in that they pick pairs of cells without clues (i.e., not constrained) and swap their contents. The pairs are not picked at random but with probability proportional to *exp(k)*, where *k* is the number of constraint violations. The temperature decreases geometrically and the simulated annealing program stops when zero cost is achieved or when a maximum of 10⁵ iterations is reached. The R program I wrote while visiting SAMSI had more options, but it was also horrendously slow! The CPU time reported by the authors is far, far lower, almost in the range of the backtracking solution that serves as their reference. (Of course, it is written in Fortran 95, not in R…) As in my case, the authors mention they sometimes get stuck in a local minimum with only 2 cells with constraint violations.

**S**o I reprogrammed an R code following (as much as possible) their scheme. However, I do not get a better behaviour than with my earlier code, and certainly no solution within seconds, if any. For instance, the temperature decrease in *100(.99)^t* seems too steep to manage 10⁵ steps. So, either I am missing a crucial element in the code, or my R code is very poor and clever Fortran programming does the trick! Here is my code:

```
target=function(s){
  #total number of constraint violations in grid s
  tar=sum(apply(s,1,duplicated)+apply(s,2,duplicated))
  for (r in 1:9){
    bloa=(1:3)+3*((r-1)%%3)
    blob=(1:3)+3*trunc((r-1)/3)
    tar=tar+sum(duplicated(as.vector(s[bloa,blob])))
  }
  return(tar)
}

cost=function(i,j,s){
  #constraint violations involving cell (i,j)
  cos=sum(s[,j]==s[i,j])+sum(s[i,]==s[i,j])
  boxa=3*trunc((i-1)/3)+1
  boxb=3*trunc((j-1)/3)+1
  cos+sum(s[boxa:(boxa+2),boxb:(boxb+2)]==s[i,j])
}

entry=function(){
  #random completion of the clued grid con
  s=con
  pop=NULL
  for (i in 1:9) pop=c(pop,rep(i,9-sum(con==i)))
  s[s==0]=sample(pop)
  return(s)
}

move=function(tau,s,con){
  #swap two clue-free cells, picked with probability
  #proportional to exp(number of violations)
  pen=(1:81)
  for (i in pen[con==0])
    pen[i]=cost(((i-1)%%9)+1,trunc((i-1)/9)+1,s)
  wi=sample((1:81)[con==0],2,prob=exp(pen[(1:81)[con==0]]))
  prop=s
  prop[wi[1]]=s[wi[2]]
  prop[wi[2]]=s[wi[1]]
  #Metropolis acceptance step at temperature tau
  if (runif(1)<exp((target(s)-target(prop))/tau)) s=prop
  return(s)
}

#Example:
s=matrix(0,ncol=9,nrow=9)
s[1,c(1,6,7)]=c(8,1,2)
s[2,c(2:3)]=c(7,5)
s[3,c(5,8,9)]=c(5,6,4)
s[4,c(3,9)]=c(7,6)
s[5,c(1,4)]=c(9,7)
s[6,c(1,2,6,8,9)]=c(5,2,9,4,7)
s[7,c(1:3)]=c(2,3,1)
s[8,c(3,5,7,9)]=c(6,2,1,9)
con=s
tau=100
s=entry()
for (t in 1:10^4){
  for (v in 1:100) s=move(tau,s,con)
  tau=tau*.99
  if (target(s)==0) break
}
```

## speed of R, C, &tc.

Posted in R, Running, Statistics, University life with tags Baum-Welch algorithm, C, EM, HMM, Matlab, Octave, Python, R, Scilab, speed on February 3, 2012 by xi'an**M**y Paris colleague (and fellow-runner) Aurélien Garivier has produced an interesting comparison of 4 (or 6 if you consider scilab and octave as different from matlab) computer languages in terms of speed for producing the MLE in a hidden Markov model, using EM and the Baum-Welch algorithms. His conclusions are that

- matlab is a lot faster than R and python, especially when vectorization is important: this is why the difference is spectacular on filtering/smoothing, not so much on the creation of the sample;
- octave is a good matlab emulator, if no special attention is paid to execution speed…;
- scilab appears as a credible, efficient alternative to matlab;
- still, C is **a lot** faster; the inefficiency of matlab in loops is well-known, and clearly shown in the creation of the sample.

(In this implementation, R is “only” three times slower than matlab, so this is not so damning…) All the codes are available and you are free to make suggestions to improve the speed of your favourite language!
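To illustrate the vectorisation point in R itself, here is a minimal sketch of mine (unrelated to Aurélien's code), comparing an explicit loop with the vectorised equivalent:

```
n = 1e6
x = rnorm(n)
s = 0
# explicit loop over the elements
t.loop = system.time(for (i in 1:n) s = s + x[i]^2)["elapsed"]
# vectorised version of the same computation
t.vec = system.time(s2 <- sum(x^2))["elapsed"]
c(loop = t.loop, vectorised = t.vec)  # the loop is typically much slower
```

Both versions compute the same sum (up to floating-point rounding); only the elapsed times differ, which is the whole point of pushing loops down into compiled code.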

## the Art of R Programming [guest post]

Posted in Books, R, Statistics, University life with tags C, Norman Matloff, programming, R, software on January 31, 2012 by xi'an*(This post is the preliminary version of a book review by Alessandra Iacobucci, to appear in CHANCE. Enjoy [both the review and the book]!)*

**A**s Rob J. Hyndman enthusiastically declares in his blog, “this is a gem of a book”. I would go even further and argue that *The Art of R programming* is a whole mine of gems. The book is well constructed, and has a very coherent structure.

**A**fter an introductory chapter, where the reader gets a quick overview on R basics that allows her to work through the examples in the following chapters, the rest of the book can be divided into three main parts. In the first part (Chapters 2 to 6) the reader is introduced to the main R objects and to the functions built to handle and operate on each of them. The second part (Chapters 7 to 13) is focussed on general programming issues: R structures and object-oriented nature, I/O, string handling and manipulation, and graphics. Chapter 13 is all devoted to the topic of debugging. The third part deals with more advanced topics, such as speed of execution and performance issues (Chapter 14), mixing functions written in R and C (or Python), and parallel processing with R. Even though this last part is intended for more experienced programmers, the overall programming skills of the intended reader “may range anywhere from those of a professional software developer to `I took a programming course in college’.” (p.xxii).

**W**ith a fluent style, Matloff is able to deal with a large number of topics in a relatively limited number of pages, resulting in an astonishingly complete yet handy guide. At almost every page we discover a new command, most likely *the* command we had always looked for and done without by means of more or less cumbersome roundabouts. As a matter of fact, it is possible that there exists a ready-made and perfectly suited R function for nearly anything that comes to one’s mind. Users coming from compiled programming languages may find it difficult to get used to this wealth of functions, just as they may feel uncomfortable not declaring variable types, not initializing vectors and arrays, or getting rid of loops. Nevertheless, through numerous examples and a precise knowledge of its strengths and limitations, Matloff masterfully introduces the reader to the flexibility of R. He repeatedly underlines the functional nature of R in every part of the book and stresses from the outset how this feature has to be exploited for effective programming.

## Dennis Ritchie 1941-2011

Posted in Books, R, University life with tags C, Dennis Ritchie, Linux, Pascal, Purdue University, Steve Jobs on October 29, 2011 by xi'an**I** just got the “news” that Dennis Ritchie died, although this happened on October 12… The announcement was surprisingly missing from my information channels and certainly got little media coverage, compared with Steve Jobs‘ demise. (I did miss the obituaries in *the New York Times* and in *the Guardian*. *The Economist* has the most appropriate heading, *printf(“goodbye, Dennis”);*!!!) Still, Dennis Ritchie contributed to computer science to extents comparable to Steve Jobs’, if on a lesser commercial plane: he is a founding father of both the C language and the Unix operating system. I remember spending many days perusing his reference book, *The C programming language*, co-written with Brian Kernighan. (I kept trying programming in C until Olivier Cappé kindly pointed out to me that I was merely translating my Pascal vision into C code, missing most of the appeal of the language!) And, of course, I also remember discovering Unix when arriving at Purdue as a logical and much more modern operating system: just four years after programming principal components on punched cards and in SAS, this was a real shock! I took a few evening classes at Purdue run by the Computer Department and I still carry around the Purdue University UNIX Pocket Guide. Although I hardly ever use it, it is there on the first shelf on top of my desk… As is *The C programming language*, even though I have not opened it in years!

**S**o we (geeks, computer users, Linuxians, R users, …) owe a lot to Dennis Ritchie and it is quite sad both that he passed away by himself and that his enormous contribution was not better acknowledged. Thus, indeed,

```
for (i=0; i<ULONG_LONG_MAX; i++) printf("thanks a lot, Dennis");
```