## simulating hazard

Posted in Books, Kids, pictures, Statistics, Travel on May 26, 2020 by xi'an

A rather straightforward X validated question that nonetheless leads to an interesting simulation problem: when given the hazard function h(·) rather than the probability density f(·), how does one simulate from the distribution? Mathematically, h(·) identifies the probability distribution as well as f(·) does,

$1-F(x)=\exp\left\{ -\int_{-\infty}^x h(t)\,\text{d}t \right\}=\exp\{-H(x)\}$

which means cdf inversion could be implemented in principle. But in practice, assuming the integral is intractable, what would an exact solution look like? Including MCMC versions exploiting one fixed-point representation or the other. Since

$f(x)=h(x)\,\exp\left\{ -\int_{-\infty}^x h(t)\,\text{d}t \right\}$

using an unbiased estimator of the exponential term in a pseudo-marginal algorithm would work. And getting an unbiased estimator of the exponential term can be achieved by Glynn & Rhee debiasing. But this is rather costly… Having Devroye's book under my nose [at my home desk] should however have driven me earlier to the obvious solution of… simply opening it!!! A whole section (VI.2) is indeed dedicated to simulation when the distribution is given by the hazard rate. (Which made me realise this problem is related to PDMPs, in that thinning and composition tricks are common to both.) Besides the inversion method, i.e., X=H⁻¹(−log U), Devroye suggests thinning a Poisson process when h(·) is bounded by a manageable g(·). Or a generic dynamic thinning approach that converges when h(·) is non-increasing.
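As a quick illustration, here is a minimal R sketch of the thinning approach, with a hypothetical decreasing hazard h(t)=1/(1+t) bounded by the constant M=1: simulate the arrival times of a rate-M Poisson process and accept each in turn with probability h(t)/M, the first accepted time being a draw from the target.

```r
set.seed(1)
# thinning sketch: hazard h(·) bounded above by a constant M (both hypothetical)
h <- function(t) 1 / (1 + t)           # decreasing hazard, so h(t) <= M = 1
M <- 1
simhaz <- function() {
  t <- 0
  repeat {
    t <- t + rexp(1, rate = M)         # next event of a rate-M Poisson process
    if (runif(1) < h(t) / M) return(t) # accept with probability h(t)/M
  }
}
x <- replicate(1e4, simhaz())
```

For this particular h, H(x)=log(1+x), hence 1−F(x)=1/(1+x) and the median is one, which the simulated sample reproduces.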

## one or two?

Posted in Books, Kids, R on March 12, 2020 by xi'an

A superposition of two random walks from The Riddler:

Starting from zero, a random walk is produced by choosing moves between ±1 and ±2 at each step. If the choice between both is made towards maximising the probability of ending up positive after 100 steps, what is this probability?

Although the optimal path is not necessarily made of moves that optimise the probability of ending up positive after the remaining steps, I chose to follow a dynamic programming approach by picking between ±1 and ±2 at each step based on that probability:

bs=matrix(0,405,101) #best strategy with value i-203 at time j-1
bs[204:405,101]=1
for (t in 100:1){
  tt=2*t
  bs[203+(-tt:tt),t]=.5*apply(cbind(
    bs[204+(-tt:tt),t+1]+bs[202+(-tt:tt),t+1],
    bs[201+(-tt:tt),t+1]+bs[205+(-tt:tt),t+1]),1,max)}


resulting in the probability

> bs[203,1]
[1] 0.6403174


Just checking that a simple strategy of picking ±1 above zero and ±2 below leads to the same value

N=1e7 #number of simulated walks
ga=rep(0,N)
for(v in 1:100) ga=ga+(1+(ga<1))*sample(c(-1,1),N,rep=TRUE)


or sort of

> mean(ga>0)
[1] 0.6403494


With highly similar probabilities when switching at ga<2

> mean(ga>0)
[1] 0.6403183


or ga<0

> mean(ga>0)
[1] 0.6403008


and too little difference to spot a significant improvement between the three boundaries.

## BayesComp 2020 at a glance

Posted in Statistics, Travel, University life on December 18, 2019 by xi'an

## an independent sampler that maximizes the acceptance rate of the MH algorithm

Posted in Books, Kids, Statistics, University life on September 3, 2019 by xi'an

An ICLR 2019 paper by Neklyudov, Egorov and Vetrov on an optimal choice of the proposal in an independent Metropolis algorithm, which I discovered via an X validated question, namely whether or not the expected Metropolis-Hastings acceptance ratio is always one (it is not when the support of the proposal is strictly smaller than the support of the target). The paper mentions the domination of the accept-reject algorithm by the associated independent Metropolis-Hastings algorithm, which was actually stated in our Monte Carlo Statistical Methods (1999, Lemma 6.3.2) and may prove even older. The authors also note that the expected acceptance probability is equal to one minus the total variation distance between the joint defined as target × Metropolis-Hastings proposal distribution and its time-reversed version. Which seems to suffer from the same difficulty as the one mentioned in the X validated question, namely that it only holds when the support of the Metropolis-Hastings proposal contains the support of the target (or else when the joint defined as target × Metropolis-Hastings proposal distribution is somewhat symmetric). Replacing total variation with Kullback-Leibler then leads to a manageable optimisation target if the proposal is a parameterised independent distribution, with a GAN version when the proposal is not explicitly available. I find it rather strange that one still seeks independent proposals for running Metropolis-Hastings algorithms, as the result will depend on the family of proposals considered and as performances will deteriorate with dimension (the authors mention a 10% acceptance rate, which sounds quite low). [As an aside, ICLR 2020 will take place in Addis Ababa next April.]
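For concreteness, a minimal R sketch of an independent Metropolis-Hastings sampler, with an entirely hypothetical choice of a N(0,2²) proposal for a N(0,1) target, monitoring the average acceptance rate:

```r
set.seed(1)
# independent Metropolis-Hastings: N(0,1) target, N(0,2^2) independent proposal
w <- function(x) dnorm(x) / dnorm(x, sd = 2)  # importance weight f/g
n <- 1e5
x <- 0; acc <- 0; chain <- numeric(n)
for (i in 1:n) {
  y <- rnorm(1, sd = 2)                       # proposal ignores current state
  if (runif(1) < w(y) / w(x)) { x <- y; acc <- acc + 1 }
  chain[i] <- x
}
acc / n                                       # empirical acceptance rate
```

The quantity acc/n estimates the expected acceptance probability, i.e., one minus the total variation distance between f(x)g(y) and f(y)g(x) in the notation above.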

## off to SimStat2019, Salzburg

Posted in Mountains, Running, Statistics, University life on September 2, 2019 by xi'an

Today, I am off to Salzburg for the SimStat 2019 workshop, or more formally the 10th International Workshop on Simulation and Statistics, where I give a talk on ABC. The program of the workshop is quite diverse and rich and so I do not think I will have time to take advantage of the Hohe Tauern or the Berchtesgaden Alps to go climbing. Especially since I am also discussing papers in an ABC session.

## simulation fodder for future exams

Posted in Books, Kids, R, Statistics on February 20, 2019 by xi'an

Here are two nice exercises for a future simulation exam, seen and solved on X validated.

The first one is about simulating a Gibbs sampler associated with the joint target

$\exp\{-|x|-|y|-a|y-x|\}$

defined over ℝ² for a≥0 (or possibly a>-1). The conditionals are identical and non-standard, but a simple bound on the conditional density is the corresponding standard double exponential density, which makes for a straightforward accept-reject implementation. However, it is also feasible to break the full conditional into three parts, depending on the respective positions of x, y, and 0, and to obtain easily invertible cdfs on the three intervals.

The second exercise is about simulating from the cdf

$F(x)=1-\exp\{-ax-bx^{p+1}/(p+1)\}$

which can be numerically inverted. It is however more fun to call for an accept-reject algorithm, by bounding the density with an equal-weight mixture of an Exponential Exp(a) and of the 1/(p+1)-th power of an Exponential Exp(b/(p+1)). Since no extra constant appears in the solution, I suspect the (p+1) in b/(p+1) was introduced on purpose. As seen in the above fit for 10⁶ simulations (and a=1, b=2, p=3), there is no deviation from the target! There is nonetheless an even simpler and rather elegant resolution to the exercise: since the tail function 1-F(x) appears as the product of two tail functions, exp(-ax) and the other one, the cdf is that of the minimum of two random variates, one with the Exp(a) distribution and the other being the 1/(p+1)-th power of an Exponential Exp(b/(p+1)) variate. Which of course returns a very similar histogram fit.
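The minimum representation translates into a one-liner in R, here with the same a=1, b=2, p=3 as above:

```r
set.seed(1)
a <- 1; b <- 2; p <- 3
n <- 1e6
# X is the minimum of an Exp(a) variate and of the 1/(p+1)-th power of an
# Exp(b/(p+1)) variate, so that its survival function is the product
# exp(-ax) x exp(-b x^(p+1)/(p+1)), i.e., 1-F(x)
x <- pmin(rexp(n, a), rexp(n, b / (p + 1))^(1 / (p + 1)))
```

Indeed, since P(min(U,V)>x)=P(U>x)P(V>x) for independent U and V, the product of the two tail functions is recovered exactly.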

## a new method to solve the transformation of calculus

Posted in Statistics on December 23, 2018 by xi'an

An hilariously ridiculous email I just received (warning: book cover unrelated):

Good day! this is very important to the “Mathematics” and the related fields,
“The Simulator”,“Probability theory”,”Statistics”,”Numerical Analysis”,
“Cryptography”,“Data mining”,“The big data analysis”and“Artificial Intelligence”.
The transformation of random variables in Calculus is very difficult and sometimes
is impossible to be done. The simulator can get the accuracy and precise simulated data
and the database could be the probability distributution if the data size is 100,000,000
or more. The probabilistic model can be designed and getting the probability distribution
and the coefficient using the simulator.

(1)“The Simulator” has four methods,
1) the basic method is the inverse function of the distribution function,
2) the transformation method to get the simulated data,
3) the numerical analysis method to build the simulated database,
4) the simulated database and the estimated line of random variable to get the simulated data.
(2) “Probability Theory” can use the simulator to a tool.
(3) ”Statistics”, the sampling distribution of the point estimator and the test statistic
can be seen as the transformation equation and the critical point and p value is from
the sampling distribution.
(4) ”Numerical Analysis”, the simulator data can overcome the limit of numerical analysis,
the number of random variables could be more 10000.
(5) “Cryptography”, the simulator of the probabilistic model will derive the lock code
which cannot be unlocked.
(6) “Data mining”, the data set can be a specific probability distribution using
“goodness of fit” or “Curve-fitting” or “Curvilinear”.
1) “goodness of fit”, there are 45 distributions for the null hypothesis.
2) “Curve-fitting”, the estimated line of random variable and the estimated line
of the distribution function.
3) “Curvilinear”, the data set is not arithmetic series.
(7) “The big data analysis”, the number of random variables could be more 10000
about the simulator of the probabilistic model.
(8) “Artificial Intelligence”, the model after analysis can be the transformation
equation, the simulator of the probabilistic model can get the simulated data.

The first book name is “The simulator” will be public, the context contains
(1) The simulation methods,
(2)“Probability Theory”,
(3) ”Statistics” and how to write the statistical package even the population is not
Normal distribution or a special statistical model.
(4)“Cryptography”,
(5)“Explored the arithmetic data”,