## conditioning on zero probability events

Posted in Books, Kids, pictures, Statistics, University life with tags , , , , on November 15, 2019 by xi'an

An interesting question on X validated as to how come a statistic T(X) can be sufficient when its support depends on the parameter θ behind the distribution of X. The reasoning there being that the distribution of X given T(X)=t does depend on θ since it is not defined for some values of θ … Which is not correct in that the conditional distribution of X depends on the realisation of T, meaning that if this realisation is impossible, then the conditional is arbitrary and of no relevance. Which also led me to tangentially notice and bemoan that most (Stack) exchanges on conditioning on zero probability events are pretty unsatisfactory in that they insist on interpreting P(X=x) [equal to zero] in a literal sense when it is merely a notation in the continuous case. And undefined when X has a discrete support. (Conditional probability is always a sore point for my students!)

## shortened iterations [code golf]

Posted in Kids, pictures, Statistics, Travel with tags , , , , , , , , on October 29, 2019 by xi'an

A codegolf lazy morning exercise towards finding the sequence of integers that starts with an arbitrary value n and gets updated by blocks of four as

$a_{4k+1} = a_{4k} \cdot(4k+1)\\ a_{4k+2} = a_{4k+1} + (4k+2)\\ a_{4k+3} = a_{4k+2} - (4k+3)\\ a_{4k+4} = a_{4k+3} / (4k+4)$

until the last term is not an integer. While the update can be easily implemented with the appropriate stopping rule, a simple congruence analysis shows that, depending on n, the sequence is 4, 8 or 12 values long when

$n\not\equiv 1(4)\\ n\equiv 1(4)\ \text{and}\ 3(n-1)+4\not\equiv 0(32)\\ 3(n-1)+4\equiv 0(32)$

respectively. But sadly the more interesting fixed length solution

~=rep #redefine function
b=(scan()-1)*c(32~4,8,40~4,1,9~3)/32+c(1,1,3,0~3,6,-c(8,1,9,-71,17)/8)
b[!b%%1] #keep integers only


ends up being longer than the more basic one:

a=scan()
while(!a[T]%%1)a=c(a,d<-a[T]*T,d+T+1,e<-d-1,e/((T<-T+4)-1))
a[-T]


where Robin’s suggestion of using T rather than length is very cool as T has double meaning, first TRUE (and 1) then the length of a…

## natural LaTeX

Posted in Books, Statistics, University life with tags , , , , , , on July 13, 2019 by xi'an

Nature must have been out of inspiration in the past weeks for running a two-page article on LaTeX [as the toolbox column] and how it compares with… Word! Which is not so obvious since most articles in Nature are not involving equations (much) and are from fields where Word prevails. Besides the long-running whine that LaTeX is not he selling argument for this article seemed to be the increasing facility to use (basic) LaTeX commands in forums (like Stack Exchange) and blogs (like WordPress) via MathJax. But the author also pushes for the lighter (R?)Markdown as, “in LaTeX, there is a greater risk that contributors will make changes that prevent the code compiling into a PDF” (which does not make sense to me). This tribune also got me to find out that there is a blog dedicated to the “LaTeX fetish”, which sounds to me like an perfect illustration of Internet vigilantism, especially with arguments like “free and open source software has a strong tendency towards being difficult to install and get up and running”.

## (x=scan())%in%(2*4^(n=0:x)-2^n-1)

Posted in Books, Kids, R with tags , , , , , , , , , on March 28, 2019 by xi'an

One challenge on code golf is to find the shortest possible code to identify whether or not an integer belongs to the binary cyclops numbers which binary expansion is 0, 101, 11011, 1110111, 111101111, &tc. The n-th such number being

$a(n) = 2^{2n + 1} - 2^n - 1 = 2\,4^n - 2^n - 1 = (2^n - 1)(2\,2^n + 1)$

this leads to the above solution in R (26 bits). The same length as the C solution [which I do not get]

f(n){n=~n==(n^=-~n)*~n/2;}

And with shorter versions in many esoteric languages I had never heard of, like the 8 bits Brachylog code

ḃD↔Dḍ×ᵐ≠

or the 7 bits Jelly

B¬ŒḂ⁼SƊ

As a side remark, since this was not the purpose of the game, the R code is most inefficient in creating a set of size (x+1), with most terms being Inf.

## I’m getting the point

Posted in Statistics with tags , , , , , , on February 14, 2019 by xi'an

A long-winded X validated discussion on the [textbook] mean-variance conjugate posterior for the Normal model left me [mildly] depressed at the point and use of answering questions on this forum. Especially as it came at the same time as a catastrophic outcome for my mathematical statistics exam.  Possibly an incentive to quit X validated as one quits smoking, although this is not the first attempt

## generating from a failure rate function [X’ed]

Posted in Books, Kids, Statistics, University life with tags , , , , , , , , on July 4, 2015 by xi'an

While I now try to abstain from participating to the Cross Validated forum, as it proves too much of a time-consuming activity with little added value (in the sense that answers are much too often treated as disposable napkins by users who cannot be bothered to open a textbook and who usually do not exhibit any long-term impact of the provided answer, while clogging the forum with so many questions that the individual entries seem to get so little traffic, when compared say with the stackoverflow forum, to the point of making the analogy with disposable wipes more appropriate!), I came across a truly interesting question the other night. Truly interesting for me in that I had never considered the issue before.

The question is essentially wondering at how to simulate from a distribution defined by its failure rate function, which is connected with the density f of the distribution by

$\eta(t)=\frac{f(t)}{\int_t^\infty f(x)\,\text{d}x}=-\frac{\text{d}}{\text{d}t}\,\log \int_t^\infty f(x)\,\text{d}x$

From a purely probabilistic perspective, defining the distribution through f or through η is equivalent, as shown by the relation

$F(t)=1-\exp\left\{-\int_0^t\eta(x)\,\text{d}x\right\}$

but, from a simulation point of view, it may provide a different entry. Indeed, all that is needed is the ability to solve (in X) the equation

$\int_0^X\eta(x)\,\text{d}x=-\log(U)$

when U is a Uniform (0,1) variable. Which may help in that it does not require a derivation of f. Obviously, this also begs the question as to why would a distribution be defined by its failure rate function.

## the Grumble distribution and an ODE

Posted in Books, Kids, R, Statistics, University life with tags , , , , , , on December 3, 2014 by xi'an

As ‘Og’s readers may have noticed, I paid some recent visits to Cross Validated (although I find this too addictive to be sustainable on a long term basis!, and as already reported a few years ago frustrating at several levels from questions asked without any preliminary personal effort, to a lack of background material to understand hints towards the answer, to not even considering answers [once the homework due date was past?], &tc.). Anyway, some questions are nonetheless great puzzles, to with this one about the possible transformation of a random variable R with density

$p(r|\lambda) = \dfrac{2\lambda r\exp\left(\lambda\exp\left(-r^{2}\right)-r^{2}\right)}{\exp\left(\lambda\right)-1}$

into a Gumble distribution. While the better answer is that it translates into a power law,

$V=e^{e^{-R^2}}\sim q(v|\lambda)\propto v^{\lambda-1}\mathbb{I}_{(1,e)}(v)$,

I thought using the S=R² transform could work but obtained a wrong sign in the pseudo-Gumble density

$W=S-\log(\lambda)\sim \eth(w)\propto\exp\left(\exp(-w)-w\right)$

and then went into seeking another transform into a Gumbel rv T, which amounted to solve the differential equation

$\exp\left(-e^{-t}-t\right)\text{d}t=\exp\left(e^{-w}-w\right)\text{d}w$

As I could not solve analytically the ODE, I programmed a simple Runge-Kutta numerical resolution as follows:

solvR=function(prec=10^3,maxz=1){
z=seq(1,maxz,le=prec)
t=rep(1,prec) #t(1)=1
for (i in 2:prec)
t[i]=t[i-1]+(z[i]-z[i-1])*exp(-z[i-1]+
exp(-z[i-1])+t[i-1]+exp(-t[i-1]))
zold=z
z=seq(.1/maxz,1,le=prec)
t=c(t[-prec],t)
for (i in (prec-1):1)
t[i]=t[i+1]+(z[i]-z[i+1])*exp(-z[i+1]+
exp(-z[i+1])+t[i+1]+exp(-t[i+1]))
return(cbind(c(z[-prec],zold),t))
}


Which shows that [the increasing] t(w) quickly gets too large for the function to be depicted. But this is a fairly useless result in that a transform of the original variable and of its parameter into an arbitrary distribution is always possible, given that  W above has a fixed distribution… Hence the pun on Gumble in the title.