Archive for January, 2012

ultimate R recursion

Posted in Books, R, Statistics, University life on January 31, 2012 by xi'an

One of my students wrote the following code for his R exam, trying to do accept-reject simulation (of a Rayleigh distribution) and constant approximation at the same time:

fAR1=function(n){
 u=runif(n)
 x=rexp(n)
 f=(C*(x)*exp(-2*x^2/3))
 g=dexp(n,1)
 test=(u<f/(3*g))
 y=x[test]
 p=length(y)/n #acceptance probability
 M=1/p
 C=M/3
 hist(y,20,freq=FALSE)
 return(x)
 }

which I find remarkable if alas doomed to fail! I wonder if there exists a (real as opposed to fantasy) computer language where you could introduce constants C and only define them later… (What’s rather sad is that I keep insisting on the fact that accept-reject does not need the constant C to operate. And that I found the same mistake in several of the students’ code. There is a further mistake in the above code when defining g. I also wonder where the 3 came from…)
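For the record, here is a minimal sketch of how the accept-reject step can run without C. It keeps the student's unnormalised target x exp(-2x²/3) and the Exp(1) proposal, and evaluates the proposal density at x rather than n. The names ftilde, ratio, Mtilde and fAR2 are mine, and bounding the ratio numerically with optimize() over (0,10) is an assumption of this sketch, not something taken from the exam.

```r
# Unnormalised target taken from the student's code, with the constant C dropped
ftilde <- function(x) x * exp(-2 * x^2 / 3)

# Ratio target/proposal for an Exp(1) proposal; note dexp(x, 1), not dexp(n, 1)
ratio <- function(x) ftilde(x) / dexp(x, 1)

# Approximate upper bound on the ratio, found numerically
# (assumption: the maximum lies inside the interval (0, 10))
Mtilde <- optimize(ratio, interval = c(0, 10), maximum = TRUE)$objective

fAR2 <- function(n) {
  x <- rexp(n)                     # proposals from Exp(1)
  u <- runif(n)
  keep <- u < ratio(x) / Mtilde    # accept-reject step, no C involved
  p <- mean(keep)                  # empirical acceptance rate
  # the normalising constant is only estimated afterwards, as a by-product
  list(y = x[keep], p = p, C = 1 / (Mtilde * p))
}
```

Since the integral of x exp(-2x²/3) over the positive half-line is 3/4, the estimate returned in C should settle near 4/3 when n is large, for instance with fAR2(1e5)$C.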

the Art of R Programming [guest post]

Posted in Books, R, Statistics, University life on January 31, 2012 by xi'an

(This post is the preliminary version of a book review by Alessandra Iacobucci, to appear in CHANCE. Enjoy [both the review and the book]!)

As Rob J. Hyndman enthusiastically declares in his blog, “this is a gem of a book”. I would go even further and argue that The Art of R Programming is a whole mine of gems. The book is well constructed, and has a very coherent structure.

After an introductory chapter, where the reader gets a quick overview of R basics that allows her to work through the examples in the following chapters, the rest of the book can be divided into three main parts. In the first part (Chapters 2 to 6) the reader is introduced to the main R objects and to the functions built to handle and operate on each of them. The second part (Chapters 7 to 13) is focussed on general programming issues: R structures and object-oriented nature, I/O, string handling and manipulating issues, and graphics. Chapter 13 is entirely devoted to the topic of debugging. The third part deals with more advanced topics, such as speed of execution and performance issues (Chapter 14), mix-matching functions written in R and C (or Python), and parallel processing with R. Even though this last part is intended for more experienced programmers, the overall programming skills of the intended reader “may range anywhere from those of a professional software developer to `I took a programming course in college’.” (p.xxii).

With a fluent style, Matloff is able to deal with a large number of topics in a relatively limited number of pages, resulting in an astonishingly complete yet handy guide. On almost every page we discover a new command, most likely the command we had always looked for and done without by means of more or less cumbersome workarounds. As a matter of fact, there may well exist a ready-made and perfectly suited R function for nearly anything that comes to mind. Users coming from compiled programming languages may find it difficult to get used to this wealth of functions, just as they may feel uncomfortable not declaring variable types, not initializing vectors and arrays, or getting rid of loops. Nevertheless, through numerous examples and a precise knowledge of its strengths and limitations, Matloff masterfully introduces the reader to the flexibility of R. He repeatedly underlines the functional nature of R in every part of the book and stresses from the outset how this feature has to be exploited for effective programming. Continue reading

an academic book reviewer???

Posted in Books, Statistics on January 30, 2012 by xi'an

I just noticed two recent and highly negative reviews of Monte Carlo Statistical Methods on amazon:

 1. I was trying to read this book in details on importance sampling. It wasted me a few hours looking at the detailed mathematically formula in the corresponding section in the book without getting a clear high level picture. The convoluted examples given the section is more than necessary. Eventually, I found this series of video lecture from mathematicalmonk on youtube.

If you compare the book with this series of video, I believe you will agree this book diverse a 2 star. Technically this book may be sophisticated. But just by sampling the important sampling section and checking a few other sections in the book, I think that I can conclude fairly safely that if there is anything that a reader don’t understand in the book, it is the author’s fault but not the readers.

and

2. This review is about the material quality of the printing in the copy I received. This is not about the content.

I have access to a real copy of this edition in the local library. It is the usual high quality hardcover: it has a matte cover with texture, beautifully bound; the paper inside is high-quality, very soft and slightly off-white; and the printing of the text is very sharp. The version I received from Amazon claimed to be exactly the same, but was very different:
– The hardcover was shiny, did not have texture, and had a natural tendency to bend strongly outwards, it even cannot stay opened if I leave it alone, it will close.
– The paper inside is whiter, horribly white, like standard printing A4 paper;
– The text printing looks like a cheap photocopy of the original. It don’t even match a home laser printer. Some formulas are difficult to read. Moreover, some pages are not even centered.

It looks and feels like a cheap knock-off photocopy done in a garage. When I pay a lot of money for a hardcover edition I want the real thing, not a cheap knock-off. Authors should avoid their work being degraded with this cheap printing.

and I thought the ‘Og readers might be interested! The second reviewer’s complaint may be about a scam my friend Julien also fell victim to: people pretending to sell the original and making cheap copies. The reviewer should have asked for a refund or else should have returned the book, that’s all. There is nothing we authors can do about it. Now it may also be a case of poor print-on-demand output from the publisher itself. I have enquired with Springer to see if this may be the case.

The first review from “academic book reviewer” is much more hilarious. And not only for the grammar. The on-line course by mathematicalmonk is a nice explanation I would also recommend to students. However, this on-line course uses about the same arguments as ours and, at some point, the reader (of a graduate mathematical text) needs to get to the foundations of the method(s) and this requires some advanced mathematics. This is missed by a reader who is apparently not much of an academic [reviewer]. (Most of his/her reviews are of the same whining nature.) Still, I love the above line “if there is anything that a reader don’t (sic) understand in the book, it is the author’s fault but not the readers”! I will certainly keep that in mind for future book reviews.

room with a view (2)

Posted in pictures, Travel, University life on January 29, 2012 by xi'an

l’affriolé, Paris

Posted in Kids, Travel, Wines on January 29, 2012 by xi'an

sunrise on the Cam

Posted in pictures, Running, Travel, University life on January 28, 2012 by xi'an

Le Monde puzzle [#755?]

Posted in R on January 28, 2012 by xi'an

Le Monde puzzle of last weekend was about sudoku-like matrices.

Consider an (n,n) matrix containing the integers from 1 to n². The matrix is “friendly” if the set of the sums of the rows is equal to the set of the sums of the columns. Find examples for n=4,5,6. Why is there no friendly matrix when n=9?

Checking for small n’s seems easy enough:

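friend=function(n){
  s=1  # discrepancy between row and column sums, positive to enter the loop
  while (s>0){
    # random arrangement of the integers 1,...,n^2 in an (n,n) matrix
    A=matrix(sample(1:n^2),ncol=n)
    # zero only when the sorted row sums match the sorted column sums
    s=sum(abs(sort(apply(A,1,sum))-
          sort(apply(A,2,sum))))}
  A
}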

For instance, running

> friend(3)
     [,1] [,2] [,3]
[1,]    8    4    2
[2,]    1    9    5
[3,]    6    3    7
> friend(4)
     [,1] [,2] [,3] [,4]
[1,]   14   10   11    6
[2,]   13    3   12    8
[3,]    4    5   16   15
[4,]    9    1    2    7
> friend(5)
     [,1] [,2] [,3] [,4] [,5]
[1,]   17   14   10   16    5
[2,]    8   19    7   15   22
[3,]    2    3   25   13    6
[4,]   11   12   18    1   21
[5,]   24   23   20    4    9
> friend(6)
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]   14    4   36   30   10   18
[2,]    7   16   24   27   32   11
[3,]   21   25   12    5   22   17
[4,]   26   20    6   31   19   34
[5,]    3   35    1   28    9   29
[6,]   23    2   33   15   13    8

produces the right answers. But the case n=6 proved almost too hard for brute-force handling!!! I have no time to devise a simulated-annealing code to speed up the resolution so this will have to wait till the next weekend edition of Le Monde. As to why n=9 does not enjoy a solution… (When n=2 there is no solution, as proven by the brute force exhibition of all cases.)
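As a rough sketch of what a faster search could look like, one may start from a random matrix and swap two entries at a time, always keeping a swap that does not increase the discrepancy and occasionally keeping one that does. This is a fixed-temperature Metropolis-type search, not a full simulated-annealing schedule, and the names score and friend_swap, the temperature temp and the cap max_iter are illustrative choices of this sketch rather than part of the original code.

```r
# Same criterion as in friend(): zero when the sorted row sums
# coincide with the sorted column sums
score <- function(A) sum(abs(sort(rowSums(A)) - sort(colSums(A))))

friend_swap <- function(n, temp = 1, max_iter = 1e5) {
  A <- matrix(sample(1:n^2), ncol = n)   # random starting arrangement
  s <- score(A)
  iter <- 0
  while (s > 0 && iter < max_iter) {
    iter <- iter + 1
    ij <- sample(n^2, 2)                 # two distinct cells (linear indices)
    B <- A
    B[ij] <- A[rev(ij)]                  # swap their contents
    t <- score(B)
    # keep the swap if it does not hurt, or with a small probability otherwise
    if (t <= s || runif(1) < exp((s - t) / temp)) {
      A <- B
      s <- t
    }
  }
  if (s > 0) warning("no friendly matrix found within max_iter swaps")
  A
}
```

Calling friend_swap(6) and then score() on the result gives a quick check that the returned matrix is friendly. I have not tuned temp, so a different value may well work better for larger n.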
