computational methods for numerical analysis with R [book review]

Posted in Books, Kids, pictures, R, Statistics, University life on October 31, 2017 by xi'an

This is a book by James P. Howard, II, that I received from CRC Press for review in CHANCE. (As usual, the customary warning applies: most of this blog post will appear later in my book review column in CHANCE.) It consists of a traditional introduction to numerical analysis, backed up by R codes and packages. The early chapters set the scene, from basics on R to notions of numerical errors, before moving to linear algebra, interpolation, optimisation, integration, differentiation, and ODEs. The book comes with a package, cmna, that reproduces the algorithms and tests. While I do not find much originality in the book, given its adherence to simple resolutions of the above topics, I could nonetheless use it for an elementary course in our first year classes, with maybe the exception of the linear algebra chapter, which I did not find very helpful.

“…you can have a solution fast, cheap, or correct, provided you only pick two.” (p.27)

The (minor) issue I have with the book, and that a potential mathematically keen student could face as well, is that there is little in the way of justifying a particular approach to a given numerical problem (as opposed to others) and of characterising the limitations and failures of the presented methods (although this happens from time to time, as e.g. for gradient descent, p.191). [Steeped in my Gallic “mal-être”, I am prone to over-criticise methods during classes, to the (increased) despair of my students!, but I also feel that avoiding over-rosy presentations is a good way to avoid later disappointments or even disasters.] In the case of this book, finding [more] ways of detecting would-be disasters would have been nice.

An uninteresting and highly idiosyncratic side comment is that the author preferred the French style for long division to the American one, reminding me of my first exposure to the latter, a few months ago! Another comment from a statistician is that mentioning time series inter- or extra-polation without a statistical model sounds close to anathema! And makes extrapolation a weapon without a cause.

“…we know, a priori, exactly how long the [simulated annealing] process will take since it is a function of the temperature and the cooling rate.” (p.199)

Unsurprisingly, the section on Monte Carlo integration is disappointing for a statistician/probabilistic numericist like me, as it fails to give a complete enough picture of the methodology. All simulations seem to proceed there from a large enough hypercube. And recommending the “fantastic” (p.171) R function integrate as a default is scary, given the ability of the selected integration bounds to mislead its users. Similarly, I feel that the simulated annealing section does not provide enough of a cautionary tale about the highly sensitive impact of cooling rates and absolute temperatures. It is only through the raw output of the algorithm applied to the travelling salesman problem that the novice reader can perceive the impact of some of these factors. (The acceptance bound on the jump (6.9) is incidentally wrongly called a probability on p.199, since it can take values larger than one.)
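To illustrate the danger with integration bounds, here is a minimal sketch along the lines of the examples in the help page of integrate. The integrand (the standard Normal density, whose integral over the positive half-line is 1/2) and the list of upper bounds are purely illustrative choices, not taken from the book:

```r
# reference value: the upper bound is declared as infinite
exact <- integrate(dnorm, 0, Inf)$value

# same integrand with increasingly wide *finite* upper bounds
upper <- c(2, 20, 200, 2000, 20000)
finite <- sapply(upper, function(b)
  integrate(dnorm, 0, b, stop.on.error = FALSE)$value)

# side-by-side comparison: the widest bounds need not return 1/2
print(cbind(upper, finite, exact))
```

With a very wide finite range, the integrand is essentially zero over nearly all of the interval and the adaptive quadrature may miss the mass near the origin, while declaring the bound as Inf lets integrate handle the infinite range through a change of variable.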

[Disclaimer about potential self-plagiarism: this post or an edited version will eventually appear in my Books Review section in CHANCE.]

ABC for wargames

Posted in Books, Kids, pictures, Statistics on February 10, 2016 by xi'an

I recently came across an ABC paper in PLoS ONE by Xavier Rubio-Campillo applying this simulation technique to the validation of some differential equation models linking force sizes and values for both sides. The dataset is made of battle casualties separated into four periods, from pike and musket to the American Civil War. The outcome is used to compute an ABC Bayes factor but it seems this computation is highly dependent on the tolerance threshold, with highly variable numerical values. The most favoured model includes some fatigue effect about the decreasing efficiency of armies over time. While the paper somehow reminded me of a most peculiar book, I have no idea on the depth of this analysis, namely on how relevant it is to model a battle through a two-dimensional system of differential equations, given the numerous factors involved in the matter…

the Grumble distribution and an ODE

Posted in Books, Kids, R, Statistics, University life on December 3, 2014 by xi'an

As ‘Og’s readers may have noticed, I paid some recent visits to Cross Validated (although I find this too addictive to be sustainable on a long term basis!, and as already reported a few years ago frustrating at several levels from questions asked without any preliminary personal effort, to a lack of background material to understand hints towards the answer, to not even considering answers [once the homework due date was past?], &tc.). Anyway, some questions are nonetheless great puzzles, to wit this one about the possible transformation of a random variable R with density

$p(r|\lambda) = \dfrac{2\lambda r\exp\left(\lambda\exp\left(-r^{2}\right)-r^{2}\right)}{\exp\left(\lambda\right)-1}$

into a Gumbel distribution. While the better answer is that it translates into a power law,

$V=e^{e^{-R^2}}\sim q(v|\lambda)\propto v^{\lambda-1}\mathbb{I}_{(1,e)}(v)$,

I thought using the S=R² transform could work but obtained a wrong sign in the pseudo-Gumbel density

$W=S-\log(\lambda)\sim \eth(w)\propto\exp\left(\exp(-w)-w\right)$

and then went into seeking another transform into a Gumbel rv T, which amounted to solving the differential equation

$\exp\left(-e^{-t}-t\right)\text{d}t=\exp\left(e^{-w}-w\right)\text{d}w$
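Before turning to this ODE, the power-law answer can be checked numerically. A minimal sketch, where the function names and the value λ=2.5 are arbitrary illustrative choices: since V≤v is equivalent to R≥√(−log log v), integrating the density of R over that range should return the cdf $(v^\lambda-1)/(e^\lambda-1)$ implied by $q(v|\lambda)$.

```r
# density of R as given in the post
p <- function(r, lambda)
  2 * lambda * r * exp(lambda * exp(-r^2) - r^2) / (exp(lambda) - 1)

# P(V <= v) for V = exp(exp(-R^2)), by numerical integration of p
cdfV_num <- function(v, lambda)
  integrate(p, lower = sqrt(-log(log(v))), upper = Inf,
            lambda = lambda)$value

# P(V <= v) under the power law q(v|lambda) on (1, e)
cdfV_pow <- function(v, lambda)
  (v^lambda - 1) / (exp(lambda) - 1)

lambda <- 2.5            # illustrative value
v <- c(1.2, 1.7, 2.3)    # test points inside (1, e)
num <- sapply(v, cdfV_num, lambda = lambda)
pow <- cdfV_pow(v, lambda)
print(cbind(v, num, pow))
```

The two columns should agree up to the quadrature error.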

As I could not solve the ODE analytically, I programmed a simple (first-order, Euler-type) Runge-Kutta numerical resolution as follows:

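solvR=function(prec=10^3,maxz=1){
  # Euler steps for dt/dz=exp(exp(-z)-z+exp(-t)+t), starting from t(1)=1
  z=seq(1,maxz,length.out=prec) # forward grid from 1 to maxz
  t=rep(1,prec) # t[1]=1 is the starting value
  for (i in 2:prec)
    t[i]=t[i-1]+(z[i]-z[i-1])*exp(-z[i-1]+
      exp(-z[i-1])+t[i-1]+exp(-t[i-1]))
  zold=z
  z=seq(.1/maxz,1,length.out=prec) # backward grid from .1/maxz up to 1
  t=c(t[-prec],t) # prepend prec-1 slots, overwritten by the backward pass
  for (i in (prec-1):1)
    t[i]=t[i+1]+(z[i]-z[i+1])*exp(-z[i+1]+
      exp(-z[i+1])+t[i+1]+exp(-t[i+1]))
  return(cbind(c(z[-prec],zold),t))
}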


Which shows that [the increasing] t(w) quickly gets too large for the function to be depicted. But this is a fairly useless result in that a transform of the original variable and of its parameter into an arbitrary distribution is always possible, given that W above has a fixed distribution… Hence the pun on Gumbel in the title.
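In hindsight, a check on such a numerical resolution is available, since both sides of the ODE are exact differentials: integrating them gives $\exp(-e^{-t}) = K - \exp(e^{-w})$, with K set by the starting value t(1)=1. A minimal sketch, assuming a plain Euler scheme restricted to the range [1,1.5] where this implicit solution is finite (the function names, the range, and the number of steps are illustrative choices):

```r
# right-hand side of dt/dw, read off the ODE
rhs <- function(w, t) exp(exp(-w) - w + exp(-t) + t)

# plain Euler scheme on [1, wmax], started at t(1) = 1
euler <- function(wmax = 1.5, n = 1e4) {
  w <- seq(1, wmax, length.out = n)
  t <- numeric(n)
  t[1] <- 1
  for (i in 2:n)
    t[i] <- t[i - 1] + (w[i] - w[i - 1]) * rhs(w[i - 1], t[i - 1])
  cbind(w = w, t = t)
}

# implicit solution exp(-exp(-t)) = K - exp(exp(-w)), solved in t
K <- exp(-exp(-1)) + exp(exp(-1))
closed <- function(w) -log(-log(K - exp(exp(-w))))

sol <- euler()
err <- max(abs(sol[, "t"] - closed(sol[, "w"])))
print(err)  # largest gap between the Euler path and the closed form
```

The closed form also makes the explosion explicit, as t(w) stops being finite once $K - \exp(e^{-w})$ leaves the interval (0,1).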