## a new method to solve the transformation of calculus

Posted in Statistics on December 23, 2018 by xi'an

A hilariously ridiculous email I just received (warning: book cover unrelated):

Good day! this is very important to the “Mathematics” and the related fields,
“The Simulator”,“Probability theory”,”Statistics”,”Numerical Analysis”,
“Cryptography”,“Data mining”,“The big data analysis”and“Artificial Intelligence”.
The transformation of random variables in Calculus is very difficult and sometimes
is impossible to be done. The simulator can get the accuracy and precise simulated data
and the database could be the probability distributution if the data size is 100,000,000
or more. The probabilistic model can be designed and getting the probability distribution
and the coefficient using the simulator.

(1)“The Simulator” has four methods,
1) the basic method is the inverse function of the distribution function,
2) the transformation method to get the simulated data,
3) the numerical analysis method to build the simulated database,
4) the simulated database and the estimated line of random variable to get the simulated data.
(2) “Probability Theory” can use the simulator to a tool.
(3) ”Statistics”, the sampling distribution of the point estimator and the test statistic
can be seen as the transformation equation and the critical point and p value is from
the sampling distribution.
(4) ”Numerical Analysis”, the simulator data can overcome the limit of numerical analysis,
the number of random variables could be more 10000.
(5) “Cryptography”, the simulator of the probabilistic model will derive the lock code
which cannot be unlocked.
(6) “Data mining”, the data set can be a specific probability distribution using
“goodness of fit” or “Curve-fitting” or “Curvilinear”.
1) “goodness of fit”, there are 45 distributions for the null hypothesis.
2) “Curve-fitting”, the estimated line of random variable and the estimated line
of the distribution function.
3) “Curvilinear”, the data set is not arithmetic series.
(7) “The big data analysis”, the number of random variables could be more 10000
about the simulator of the probabilistic model.
(8) “Artificial Intelligence”, the model after analysis can be the transformation
equation, the simulator of the probabilistic model can get the simulated data.
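For the record, the one recognisable ingredient in the list above, the “inverse function of the distribution function”, is plain inverse-cdf simulation, which takes a handful of lines in any language (here a Python sketch of my own, obviously not the Simulator’s):

```python
import math
import random

rng = random.Random(0)

def inverse_cdf_sample(inv_cdf, n):
    """Inverse-transform sampling: if U ~ Uniform(0,1) and F is a cdf,
    then F^{-1}(U) is distributed according to F."""
    return [inv_cdf(rng.random()) for _ in range(n)]

# Exponential(1): F(x) = 1 - exp(-x), hence F^{-1}(u) = -log(1 - u)
draws = inverse_cdf_sample(lambda u: -math.log(1.0 - u), 100_000)
print(sum(draws) / len(draws))   # close to the true mean, 1
```

No database of 100,000,000 entries required.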

The first book name is “The simulator” will be public, the context contains
(1) The simulation methods,
(2)“Probability Theory”,
(3) ”Statistics” and how to write the statistical package even the population is not
Normal distribution or a special statistical model.
(4)“Cryptography”,
(5)“Explored the arithmetic data”,

## a concise introduction to statistical inference [book review]

Posted in Statistics on February 16, 2017 by xi'an

[Just to warn readers and avoid emails about Xi’an plagiarising Christian!, this book was sent to me by CRC Press for a review. To be published in CHANCE.] This is an introduction to statistical inference. And with 180 pages, it indeed is concise! I could actually stop the review at this point as a concise review of a concise introduction to statistical inference, as I do not find much originality in this introduction, intended for the “mathematically sophisticated first-time student of statistics”. Although sophistication is in the eye of the sophist, of course, as this book has margin symbols in the guise of integrals to warn of sections using “differential or integral calculus” and a remark that the book is still accessible without calculus… (Integral calculus as in Riemann integrals, not Lebesgue integrals, mind you!) It even includes appendices with the Greek alphabet, summation notations, and exponentials/logarithms.

“In statistics we often bypass the probability model altogether and simply specify the random variable directly. In fact, there is a result (that we won’t cover in detail) that tells us that, for any random variable, we can find an appropriate probability model.” (p.17)

Given its limited mathematical requirements, the book does not get very far into the probabilistic background of statistical methods, which makes the corresponding chapter not particularly helpful, as opposed to requiring a prerequisite on probability basics. Not much can be proven without “all that complicated stuff about for any ε>0” (p.29), which also makes it impossible to define notions like the Central Limit Theorem correctly. For instance, Chebychev’s inequality comes within a list of admitted results. There is no major mistake in the chapter, even though mentioning that two correlated Normal variables are jointly Normal (p.27) is inexact.
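For readers curious about that last point, the classical counterexample (a sketch of mine, not the book’s) takes X standard Normal and Y equal to X when |X| ≤ 1 and to −X otherwise: Y is again standard Normal and strongly correlated with X, yet X+Y equals zero whenever |X| > 1, an atom no bivariate Normal vector can produce:

```python
import random

rng = random.Random(1)

def xy_pair():
    # X ~ N(0,1); Y = X when |X| <= 1, Y = -X otherwise.
    # By symmetry of the Normal density, Y is also N(0,1).
    x = rng.gauss(0.0, 1.0)
    y = x if abs(x) <= 1.0 else -x
    return x, y

pairs = [xy_pair() for _ in range(200_000)]
corr = sum(x * y for x, y in pairs) / len(pairs)          # ≈ -0.60: correlated
atom = sum(x + y == 0.0 for x, y in pairs) / len(pairs)   # ≈ 0.32: P(X+Y=0) > 0
print(corr, atom)
```

Hence marginal Normality plus correlation does not imply joint Normality.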

“The power of a test is the probability that you do not reject a null that is in fact correct.” (p.120)

Most of the book follows the same pattern as other textbooks at that level, covering inference on a mean and a probability, confidence intervals, hypothesis testing, p-values, and linear regression. With some words of caution about the interpretation of p-values. (And the unfortunate inversion of the interpretation of power above.) Even mentioning the Cult [of Significance] I reviewed a while ago.
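To state the definition the right way round: the power is the probability of rejecting the null when the alternative is true. For a one-sided z-test on a Normal mean (a toy setting of my own, not taken from the book), it is essentially a one-liner:

```python
from statistics import NormalDist

def z_test_power(mu_alt, n, alpha=0.05, sigma=1.0):
    """Power P(reject H0 | true mean = mu_alt) of the one-sided z-test
    of H0: mu = 0 against H1: mu > 0, based on n observations."""
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha)          # critical value for the standardised mean
    shift = mu_alt * n ** 0.5 / sigma    # location shift under the alternative
    return 1 - z.cdf(crit - shift)

print(z_test_power(0.5, n=30))   # about 0.86
```

At the null value the function returns α, as it should, since power evaluated at the null is just the size of the test.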

Given all that, the final chapter comes as a surprise, being about Bayesian inference! Which should make me rejoice, obviously, but I remain skeptical of introducing the concept to readers with so little mathematical background. And hence a very shaky understanding of a notion like conditional distributions. (Which reminds me of repeated occurrences on X validated when newcomers hope to bypass textbooks and courses to grasp the meaning of posteriors and such. Like when asking why Bayes Theorem does not apply for expectations.) I can feel the enthusiasm of the author for this perspective and it may diffuse to some readers, but apart from being aware of the approach, I wonder how much they carry away from this brief (decent) exposure. The chapter borrows from Lee (2012, 4th edition) and from Berger (1985) for the decision-theoretic part. The limitations of the exercise are shown for hypothesis testing (or comparison) by the need to restrict the parameter space to two possible values. And for decision making. Similarly, introducing improper priors and the likelihood principle [distinguished there from the law of likelihood] is likely to get over the head of most readers and clashes with the level of the previous chapters. (And I do not think this is the most efficient way to argue in favour of a Bayesian approach to the problem of statistical inference: I have now dropped all references to the likelihood principle from my lectures. Not because of the controversy, but simply because the students do not get it.) By the end of the chapter, it is unclear a neophyte would be able to spell out how one could specify a prior for one of the problems processed in the earlier chapters. The appendix on de Finetti’s formalism on personal probabilities is very much unlikely to help in this regard. While it sounds so far beyond the level of the remainder of the book.
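As an aside, once the parameter space is restricted to two values, the whole Bayesian testing apparatus indeed reduces to a direct application of Bayes’ theorem, as in this sketch of mine (with arbitrary numbers, not the book’s example):

```python
from statistics import NormalDist

def two_point_posterior(x, theta0=0.0, theta1=1.0, prior0=0.5, sigma=1.0):
    """Posterior probability of theta0 when x ~ N(theta, sigma^2)
    and theta is restricted to the two values theta0 and theta1."""
    w0 = prior0 * NormalDist(theta0, sigma).pdf(x)
    w1 = (1.0 - prior0) * NormalDist(theta1, sigma).pdf(x)
    return w0 / (w0 + w1)

# an observation sitting between the two hypothesised means
print(two_point_posterior(0.8))
```

Whether a neophyte can then extend this to a continuous parameter space is precisely the question.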

## a maths mansion!

Posted in Books, Kids, pictures, Travel on October 11, 2015 by xi'an

I read in The Guardian today about James Stewart’s house being for sale. James Stewart was a prolific author of many college and high-school books on calculus and pre-calculus. I have trouble understanding how one can write so many books on the same topic, but he apparently managed, to the point of having this immense house designed by architects to his taste. Which sounds a bit passé in my opinion. Judging from the covers of the books, and from the shape of the house, he had a fascination for the integral sign (which indeed has an intrinsic beauty!). Still amazing considering it was paid for by his royalties. Less amazing when checking the price of those books: they are about \$250 a piece. Multiplied by hundreds of thousands of copies sold every year, it sums up to being able to afford such a maths mansion! (I am not so sure I can take over the undergrad market by recycling the Bayesian Choice..!)

## Numerical analysis for statisticians

Posted in Books, R, Statistics, University life on August 26, 2011 by xi'an

“In the end, it really is just a matter of choosing the relevant parts of mathematics and ignoring the rest. Of course, the hard part is deciding what is irrelevant.”

Somehow, I had missed the first edition of this book and thus I started reading it this afternoon with a newcomer’s eyes (obviously, I will not comment on the differences with the first edition, sketched by the author in the Preface). Past the initial surprise of discovering it was a mathematics book rather than an algorithmic book, I became engrossed in my reading and could not let it go! Numerical Analysis for Statisticians, by Kenneth Lange, is a wonderful book. It provides most of the necessary background in calculus and some algebra to conduct rigorous numerical analyses of statistical problems. This includes expansions, eigen-analysis, optimisation, integration, approximation theory, and simulation, in less than 600 pages. It may be due to the fact that I was reading the book in my garden, with the background noise of the wind in the tree leaves, but I cannot find any solid fact to grumble about! Not even about the MCMC chapters! I simply enjoyed Numerical Analysis for Statisticians from beginning till end.

“Many fine textbooks (…) are hardly substitutes for a theoretical treatment emphasizing mathematical motivations and derivations. However, students do need exposure to real computing and thoughtful numerical exercises. Mastery of theory is enhanced by the nitty gritty of coding.”

From the above, it may sound as if Numerical Analysis for Statisticians does not fulfill its purpose and is too much of a mathematical book. Be assured this is not the case: the contents are firmly grounded in calculus (analysis) but the (numerical) algorithms are only one code away. An illustration (among many) is found in Section 8.4: Finding a Single Eigenvalue, where Kenneth Lange shows how the Rayleigh quotient algorithm of the previous section can be exploited to this aim, when supplemented with a good initial guess based on Gerschgorin’s circle theorem. This is brilliantly executed in two pages and the code is just one keyboard away. The EM algorithm is immersed into a larger M[&]M perspective. Problems are numerous and mostly of high standards, meaning one (including me) has to sit and think about them. References are kept to a minimum: they are mostly (highly recommended) books, plus a few papers primarily exploited in the problem sections. (When reading the Preface, I found that “John Kimmel, [his] long suffering editor, exhibited extraordinary patience in encouraging [him] to get on with this project”. The quality of Numerical Analysis for Statisticians is also a testimony to John’s editorial acumen!)
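To give a flavour of those two pages (in a Python sketch of my own, not Lange’s code): a Gerschgorin disc centre supplies the initial shift, and Rayleigh quotient iteration then homes in on a nearby eigenpair of a symmetric matrix:

```python
def solve(mat, b):
    """Gaussian elimination with partial pivoting, for small dense systems."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(mat)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(m[i][k]))
        m[k], m[p] = m[p], m[k]
        for i in range(k + 1, n):
            f = m[i][k] / m[k][k]
            for j in range(k, n + 1):
                m[i][j] -= f * m[k][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def rayleigh_quotient_iteration(mat, mu, iters=12):
    """Shift-and-invert iteration: each step solves (A - mu I) w = v and
    updates mu with the Rayleigh quotient v'Av, converging cubically."""
    n = len(mat)
    v = [1.0 / n ** 0.5] * n
    for _ in range(iters):
        shifted = [[mat[i][j] - (mu if i == j else 0.0) for j in range(n)]
                   for i in range(n)]
        try:
            w = solve(shifted, v)
        except ZeroDivisionError:   # the shift hit an eigenvalue exactly
            break
        norm = sum(t * t for t in w) ** 0.5
        v = [t / norm for t in w]
        av = [sum(mat[i][j] * v[j] for j in range(n)) for i in range(n)]
        mu = sum(v[i] * av[i] for i in range(n))
    return mu, v

A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
# Gerschgorin: every eigenvalue lies in a disc centred at a diagonal entry,
# so a disc centre is a sensible starting shift.
lam, vec = rayleigh_quotient_iteration(A, A[0][0])
print(lam)   # one of the eigenvalues 3 ± √3 or 3
```

The better the Gerschgorin-based shift, the fewer (already few) iterations needed, which is the whole point of combining the two results.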

“Every advance in computer architecture and software tempts statisticians to tackle numerically harder problems. To do so intelligently requires a good working knowledge of numerical analysis. This book equips students to craft their own software and to understand the advantages and disadvantages of different numerical methods. Issues of numerical stability, accurate approximation, computational complexity, and mathematical modeling share the limelight in a broad yet rigorous overview of those parts of numerical analysis most relevant to statisticians.”

While I am reacting so enthusiastically to the book (imagine, there is even a full chapter on continued fractions!), it may be that my French math background is biasing my evaluation and that graduate students over the world would find the book too hard. However, I do not think so: the style of Numerical Analysis for Statisticians is very fluid and the rigorous mathematics are mostly at the level of undergraduate calculus. The more advanced topics like wavelets, Fourier transforms and Hilbert spaces are very well introduced and do not require prerequisites in complex calculus or functional analysis. (Although I take no joy in this, even measure theory does not appear to be a prerequisite!) On the other hand, there is a prerequisite of a good background in statistics. This book will clearly involve a lot of work from the reader, but the respect shown by Kenneth Lange to those readers will sufficiently motivate them to keep going till those essential notions are assimilated. Numerical Analysis for Statisticians is also recommended for more senior researchers, and not only for building one or two courses on the bases of statistical computing. It contains most of the math bases that we need, even if we do not know we need them! Truly an essential book.