## mean simulations

Posted in Books, Statistics on May 10, 2023 by xi'an

A rather intriguing question on X validated, namely about a simulation approach to sampling a bivariate distribution fully specified by one conditional p(x|y) and the symmetric conditional expectation IE[Y|X=x]. The book Conditional Specification of Statistical Models, by Arnold, Castillo and Sarabia, referenced in the question, contains (§7.7) illustrations of such cases, as for instance with some power series distributions on ℕ, but also for some exponential families (think Laplace transforms). An example is when

$P(X=x|Y=y) = c(x)y^x/c^*(y)\quad c(x)=\lambda^x/x!$

which means X conditional on Y=y is Poisson P(λy). The expectation IE[Y|X=x] is then sufficient to identify the joint. As I figured out before checking the book, this result is rather immediate to establish by solving a linear system, but it does not help in finding a way to simulate the joint. (I am afraid it cannot be connected to the method of simulated moments!)
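As a quick numerical sanity check (my own sketch, not from the book), one can verify that the conditional above, with normalising constant c*(y)=exp(λy), is exactly the Poisson P(λy) pmf:

```python
import math

# Sketch (assumptions mine): checking that
#   P(X=x | Y=y) = c(x) y^x / c*(y), with c(x) = lambda^x / x!,
# is the Poisson(lambda*y) pmf, since c*(y) = sum_x c(x) y^x = exp(lambda*y).
lam, y = 1.7, 0.8

def conditional_pmf(x, lam, y):
    # c(x) y^x / c*(y), using c*(y) = exp(lam*y)
    return (lam**x / math.factorial(x)) * y**x / math.exp(lam * y)

def poisson_pmf(x, mu):
    return math.exp(-mu) * mu**x / math.factorial(x)

for x in range(10):
    assert abs(conditional_pmf(x, lam, y) - poisson_pmf(x, lam * y)) < 1e-12
```

The assertions pass for any λ, y > 0, confirming the Poisson identification term by term.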

## powering a probability [a Bernoulli factory tale]

Posted in Books, R, Statistics on April 21, 2023 by xi'an

Starting from an X validated question on finding an unbiased estimator of an integral raised to a non-integer power, I came across a somewhat interesting Bernoulli factory solution! Thanks to Peter Occil's encyclopedic record of cases, pointing to Mendo's (2019) solution for functions of ρ that can be expressed as power series, like ρ^γ, since

$(1-[1-\rho])^\gamma=1-\gamma(1-\rho)+\frac{\gamma(\gamma-1)(1-\rho)^2}{2}+\cdots$

which rather magically turns into the reported algorithm

Set k=1
Repeat the following process, until it returns a value x:
1. Generate a Bernoulli B(ρ) variate z; if z=1, return x=1
2. Else, with probability γ/k, return x=0
3. Else, add 1 to k and go back to step 1


since

$\rho^\gamma=\rho+(1-\rho)(1-\gamma)\big\{\rho+\frac{(1-\rho)(2-\gamma)}{2}\big[\rho+\cdots\big]\big\}$
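The algorithm above is a one-liner to implement; here is a minimal sketch (parameter values mine) that checks by Monte Carlo that the returned Bernoulli variate has success probability ρ^γ:

```python
import random

# Sketch of the Bernoulli factory above (Mendo, 2019): an unbiased coin
# with success probability rho**gamma, built only from B(rho) flips,
# for gamma in (0,1).
def power_coin(rho, gamma, rng=random):
    k = 1
    while True:
        if rng.random() < rho:        # step 1: B(rho) flip; heads -> return 1
            return 1
        if rng.random() < gamma / k:  # step 2: else return 0 with prob gamma/k
            return 0
        k += 1                        # step 3: else increment k and repeat

random.seed(1)
rho, gamma = 0.3, 0.5
est = sum(power_coin(rho, gamma) for _ in range(10**5)) / 10**5
# est should be close to rho**gamma = 0.5477...
```

The expansion above corresponds to summing, over n, the probability of n−1 tails (each survived with probability 1−γ/k) followed by a head, which is exactly what the loop simulates.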

## signed mixtures [X’ed]

Posted in Books, Kids, Statistics on March 26, 2023 by xi'an

Following a question on X validated about the hypoexponential distribution, I came across (for the second time) a realistic example of a mixture (of exponentials) whose density writes as a signed mixture, i.e., one involving both negative and positive weights (with sum still equal to one). Namely,

$\displaystyle f(x)=\sum_{i=1}^d \lambda_i e^{-\lambda_ix}\prod_{j=1,\,j\neq i}^{d}\frac{\lambda_j}{\lambda_j-\lambda_i}\qquad x>0,\ \lambda_j>0$

representing the density of a sum of d independent Exponential variates. The above is only well-defined when all rates differ, while a more generic definition involving matrix exponentials exists. But the case when (only) two rates are equal can rather straightforwardly be derived by a direct application of L'Hôpital's rule, which my friend George considered the number one calculus rule!
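A short sketch (settings and rates mine) evaluating the signed weights, checking that they still sum to one, and comparing the resulting CDF against a simulation of the sum of exponentials:

```python
import math, random

# Sketch: the hypoexponential density as a signed mixture of exponentials,
# with weights w_i = prod_{j != i} lambda_j / (lambda_j - lambda_i),
# some of which are negative.
def signed_weights(rates):
    return [
        math.prod(l2 / (l2 - l1) for j, l2 in enumerate(rates) if j != i)
        for i, l1 in enumerate(rates)
    ]

def hypoexp_cdf(x, rates):
    # F(x) = sum_i w_i (1 - exp(-lambda_i x)), integrating the density term-wise
    return sum(w * (1 - math.exp(-l * x)) for w, l in zip(signed_weights(rates), rates))

rates = [1.0, 2.0, 3.5]
assert abs(sum(signed_weights(rates)) - 1) < 1e-12  # weights still sum to one

random.seed(2)
x0 = 1.5
emp = sum(
    sum(random.expovariate(l) for l in rates) <= x0 for _ in range(10**5)
) / 10**5
# emp should be close to hypoexp_cdf(x0, rates)
```

Note that the individual weights can be large in magnitude when two rates are close, which is precisely where the L'Hôpital limit (and the matrix-exponential formulation) takes over.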

## ABConic mean evidence approximation

Posted in Statistics on March 7, 2023 by xi'an

Following a question on X validated about evidence approximation in ABC settings, i.e., on returning an approximation of the evidence based on the outputs of parallel ABC runs for the models under comparison, I wondered at the relevance of a harmonic mean estimator in that context.

Rather than using the original ABC algorithm that proposes a model, a parameter from that model, and a simulated dataset from that model with that parameter, an alternate, cost-free, solution would be to run an ABC version of the harmonic mean evidence approximation à la Newton & Raftery (1994). Since

$\mathcal Z=1\Big/\int \dfrac{\pi(\theta|D)}{p(D|\theta)}\,\text d\theta$

the evidence can formally be approximated by

$\hat{\mathcal Z} =1\Big/\frac{1}{N}\sum_{i=1}^N\frac{1}{p(D|\theta_i)}\qquad\theta_i\sim\pi(\theta|D)$

and its ABC version is

$\hat{\mathcal Z} =1\Big/\frac{1}{N}\sum_{i=1}^N\frac{1}{K_\epsilon(d(D,D^\star(\theta_i)))}\qquad\theta_i\sim\pi^\epsilon(\theta|D)$

where Kε(.) is the kernel used for the ABC acceptance/rejection step and d(.,.) is the distance used to measure the discrepancy between samples. Since the kernel values are already computed for the ABC acceptance step, the additional cost is null. Obviously, an indicator kernel does not return a useful estimate, but something like a Cauchy kernel could do.
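Here is a minimal sketch of this ABC harmonic mean estimator on a toy normal-normal model (all settings, including the data value, scale ε, and sample size, are my own hypothetical choices, not those of the experiment reported below):

```python
import math, random

# Sketch: ABC harmonic mean evidence estimate with a Cauchy kernel
#   K_eps(d) = 1 / (pi * eps * (1 + (d/eps)^2)),
# on the toy model theta ~ N(0,1), D | theta ~ N(theta,1), one observation.
random.seed(3)
D, eps, N = 1.2, 0.3, 10**5

def cauchy_kernel(d, eps):
    return 1 / (math.pi * eps * (1 + (d / eps) ** 2))

kmax = cauchy_kernel(0.0, eps)  # kernel mode, used in the accept/reject step
kept = []
for _ in range(N):
    theta = random.gauss(0, 1)      # draw from the prior
    Dstar = random.gauss(theta, 1)  # simulate a dataset
    k = cauchy_kernel(abs(D - Dstar), eps)
    if random.random() < k / kmax:  # ABC acceptance with kernel weight
        kept.append(k)

# harmonic mean of the kernel values over the ABC sample
Zhat = 1 / (sum(1 / k for k in kept) / len(kept))
# exact evidence for comparison: marginally, D ~ N(0, 2)
Ztrue = math.exp(-D**2 / 4) / math.sqrt(4 * math.pi)
```

The kernel values k are exactly the quantities already evaluated during the ABC run, so forming Zhat indeed costs nothing extra; the quality of the estimate is another matter, as the experiment below illustrates.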

However, when toying with a normal-normal model and calibrating the Cauchy scale to fit the actual posterior as in the above graph, the estimated evidence, 5·10⁻⁵, proved much smaller than the actual one, 8·10⁻².

## conditional confusion [X validated]

Posted in Statistics with tags , on January 23, 2023 by xi'an