The R book with George Casella has progressed further over the X’mas break, in that the first drafts of the first four chapters

- Introduction to R programming
- Random variable generation
- Monte Carlo methods
- Controlling and accelerating convergence

are now completed. The second chapter focusses on accept-reject algorithms, with an entry on the ABC algorithm. I was expecting to complete the fifth chapter on optimisation methods by tomorrow, but this sounds rather unrealistic at this late stage, since I have not yet finished the first part on stochastic search… The manuscript is now 115 pages long with 32K words, and it looks about half completed since the goal should be closer to 200 than 300 pages. I also implemented an additional index counter for R commands.

In fact, as when I was writing *Monte Carlo Statistical Methods*, I find it harder to write about optimisation methods than about integration methods, presumably because optimisation problems have a clearer goal and because the corresponding methods are more easily assessed (they either hit the target or they do not), and thus more easily found wrong. For instance, I very much like the motivation behind simulated annealing methods but always find them a pain to implement if one wants to hit a well-known minimum exactly (as, for instance, when designing an annealing algorithm for solving sudokus) in less than geological time! Integration methods are somehow more amenable to the intuition of a statistician than optimisation methods because there is a natural notion of error that goes with the former. When looking for the maximum of a function *h* using, say, a brute-force random walk algorithm, the solution produced by the algorithm is off the truth by a quantity, an error, that has a less direct statistical meaning. Obviously, this is also a probabilistic algorithm and, as such, it is endowed with an intrinsic variation. Thus, repeating the algorithm many times will provide an evaluation of this variation. But, given a *single* sequence ending up at the final value *h*(*θ*) for the argument *θ*, there is no intuitive way to assess how close *θ* is to the true argument *θ*^{*} or how close *h*(*θ*) is to the true maximum (or minimum) *h*(*θ*^{*}), unless one starts using more information about *h*, like its gradient and its Hessian, and then it immediately gets more case-dependent. This is obviously natural in that the nature of optimisation is sharper than that of integration: it is local instead of global, supported by the properties of the target *h* at a single point rather than over the whole domain. (Without aiming at raising controversies, this is also a reason for opting for Bayesian solutions like posterior means rather than maximum likelihood estimators or maxima a posteriori….)
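To make the contrast between the two notions of error concrete, here is a minimal sketch, written in Python purely for illustration and not taken from the book. Everything in it is an assumption: the toy target `h` (whose maximum is known to be 1 at θ = 2), the interval [0, 4], the step scale, and the helper names `mc_integral` and `random_walk_max`. The Monte Carlo integral comes with an estimated standard error for free, while a single random walk run only returns a point and a value; repeating the run measures the variation of the algorithm, not its distance to the true maximum.

```python
import math
import random
import statistics


def h(theta):
    # Hypothetical target with a known maximum h(2) = 1, chosen for illustration.
    return math.exp(-(theta - 2.0) ** 2)


def mc_integral(f, n, rng):
    # Monte Carlo estimate of the integral of f over [0, 4] from n uniform draws,
    # returned together with its estimated standard error.
    width = 4.0
    values = [width * f(rng.uniform(0.0, width)) for _ in range(n)]
    estimate = statistics.fmean(values)
    std_error = statistics.stdev(values) / math.sqrt(n)
    return estimate, std_error


def random_walk_max(f, start, steps, scale, rng):
    # Brute-force random walk: propose a Gaussian move, keep it only if f increases.
    theta = start
    for _ in range(steps):
        proposal = theta + rng.gauss(0.0, scale)
        if f(proposal) > f(theta):
            theta = proposal
    return theta, f(theta)


rng = random.Random(2009)  # fixed seed so the sketch is reproducible

estimate, std_error = mc_integral(h, 10_000, rng)
print(f"integral estimate: {estimate:.4f} (standard error {std_error:.4f})")

# A single run returns a point and a value, with no error attached.
theta_hat, h_hat = random_walk_max(h, start=0.0, steps=200, scale=0.5, rng=rng)
print(f"single run: theta = {theta_hat:.4f}, h(theta) = {h_hat:.4f}")

# Repeating the run exposes the algorithm's variation, not its distance to the truth.
finals = [random_walk_max(h, 0.0, 200, 0.5, rng)[0] for _ in range(100)]
print(f"spread of final theta over 100 runs: {statistics.stdev(finals):.4f}")
```

In this toy case the true maximum is known, so the output of a run can be checked directly; in a real problem only the spread over repeated runs would be available, which is exactly the difficulty described above.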