This paper was arXived today by Martino and Louzada. It starts from the adaptive rejection sampling algorithm of Gilks and Wild (1992), which provided an almost automatic random generator for univariate log-concave target probability densities, by creating an envelope of the true target based on the concavity of its logarithm. This algorithm was at the core of BUGS in its early days, and it makes a good example of an accept-reject algorithm that I often use in class. It is however not so popular nowadays, because of the unidimensional constraint and because it is not particularly adequate for simulating a single realisation from a given target, as is the case when used within a Gibbs sampler. The authors only consider here the issue of the growing cost of running the adaptive rejection sampling algorithm, which is due to the update of the envelope at each rejection. They however fail to quantify this increase, which involves (a) finding the interval where the current rejected value lies, among the n existing intervals, which is of order O(n) by a linear scan (or O(log n) by binary search), and (b) creating the modified envelopes on both new intervals, which is of order O(1). The proposal of the authors, called cheap adaptive rejection sampling, settles for a fixed number τ of support points (and intervals), solely swapping a rejected point with the closest support point when this improves the acceptance rate. The test for swapping involves first computing the alternative envelope (on a pair of intervals) and the corresponding normalising constant, while swapping the rejected point with the closest support point involves finding that closest point, which is of order O(log τ). I am rather surprised that the authors do not even mention the execution time orders, resorting instead to a simulation where the gain brought by their proposal is not overwhelming. There is also an additional cost in figuring out the fixed number τ of support points, another issue not mentioned in the paper.
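As a reminder of what the envelope update entails, here is a minimal (and deliberately unoptimised) sketch of plain tangent-based adaptive rejection sampling for a log-concave target, in the spirit of Gilks and Wild, not of the cheap variant under review; the function names and the initialisation are mine:

```python
import math
import random

def ars_sample(logf, dlogf, support, n_draws, seed=0):
    """Tangent-based adaptive rejection sampling for a log-concave density.

    logf/dlogf: (possibly unnormalised) log-density and its derivative.
    The initial support points must satisfy dlogf > 0 at the first one and
    dlogf < 0 at the last one, so both exponential tails of the envelope
    have finite mass.  Every rejection adds a support point: this is the
    growing-cost behaviour discussed above.  (The envelope is rebuilt from
    scratch at each step for clarity; real implementations update it
    locally, which is the O(1) part of the update.)
    """
    rng = random.Random(seed)
    xs = sorted(support)
    draws = []
    while len(draws) < n_draws:
        h = [logf(x) for x in xs]
        hp = [dlogf(x) for x in xs]
        # breakpoints of the envelope: intersections of consecutive tangents
        z = [-math.inf] + [
            (h[i + 1] - h[i] - xs[i + 1] * hp[i + 1] + xs[i] * hp[i])
            / (hp[i] - hp[i + 1])
            for i in range(len(xs) - 1)
        ] + [math.inf]
        # mass of each piecewise-exponential segment exp(m x + c)
        masses = []
        for i in range(len(xs)):
            m, c = hp[i], h[i] - hp[i] * xs[i]
            ea = 0.0 if z[i] == -math.inf else math.exp(m * z[i] + c)
            eb = 0.0 if z[i + 1] == math.inf else math.exp(m * z[i + 1] + c)
            masses.append((eb - ea) / m)
        # pick a segment proportionally to its mass, then invert its cdf
        u = rng.random() * sum(masses)
        i = 0
        while i < len(masses) - 1 and u > masses[i]:
            u -= masses[i]
            i += 1
        m, c = hp[i], h[i] - hp[i] * xs[i]
        ea = 0.0 if z[i] == -math.inf else math.exp(m * z[i] + c)
        x = (math.log(ea + u * m) - c) / m
        # accept against the envelope; on rejection, enrich the support
        if math.log(rng.random()) <= logf(x) - (h[i] + hp[i] * (x - xs[i])):
            draws.append(x)
        else:
            xs = sorted(xs + [x])
    return draws
```

For a standard normal target, `ars_sample(lambda x: -x*x/2, lambda x: -x, [-1.0, 1.0], 4000)` returns draws whose mean and variance are close to 0 and 1; the list `xs` keeps growing with each rejection, which is precisely the cost the cheap variant caps at τ points.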
Estimating both parameters of a negative binomial distribution NB(N,p) by maximum likelihood sounds like an obvious exercise. But it is not, because some samples lead to degenerate solutions, namely p=0 and N=∞… This occurs when the mean of the sample is larger than its empirical variance, x̄>s², not an impossible instance: I discovered this when reading a Cross Validated question asking about the action to take in such a case. A first remark of interest is that this only happens when the negative binomial distribution is defined in terms of failures (since otherwise the number of successes is bounded), a major difference I had never realised till now, for estimating N is not a straightforward exercise. A second remark is that a negative binomial NB(N,p) variate is a Poisson compound of LSD variates with parameter p, the Poisson being with parameter η=-N log(1-p), and the LSD being the logarithmic series distribution with probabilities proportional to pᵏ/k, rather than a psychedelic drug. Since this is not an easy framework, Adamidis (1999) introduces an extra auxiliary variable that is a truncated exponential on (0,1) with parameter -log(1-p), a very neat trick that removes the nasty normalising constant of the LSD variate.
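The compound representation is easy to check by simulation; here is a minimal sketch (function names mine), under the parameterisation where NB(N,p) has mean Np/(1-p):

```python
import math
import random

def rlogarithmic(p, rng):
    """Logarithmic series variate, P(K=k) = -p**k / (k*log(1-p)), k >= 1,
    by plain cdf inversion."""
    u = rng.random()
    k = 1
    pk = -p / math.log(1 - p)          # P(K=1)
    cdf = pk
    while u > cdf:
        pk *= p * k / (k + 1)          # P(k+1) = P(k) * p*k/(k+1)
        k += 1
        cdf += pk
    return k

def rnegbin_compound(N, p, rng):
    """NB(N,p) variate (mean Np/(1-p)) as a Poisson(eta) sum of
    logarithmic series variates, with eta = -N*log(1-p)."""
    eta = -N * math.log(1 - p)
    # Poisson draw by Knuth's multiplication method (fine for small eta)
    L, m, prod = math.exp(-eta), 0, rng.random()
    while prod > L:
        m += 1
        prod *= rng.random()
    return sum(rlogarithmic(p, rng) for _ in range(m))
```

Averaging many such draws for, say, N=3 and p=0.4 recovers the NB mean Np/(1-p)=2 and variance Np/(1-p)²≈3.33, confirming the compound construction.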
“Convergence was achieved in all cases, even when the starting values were poor and this emphasizes the numerical stability of the EM algorithm.” K. Adamidis
Adamidis then constructs an EM algorithm on the completed set of auxiliary variables with a closed form update on both parameters. Unfortunately, the algorithm only works when s²>x̄. Otherwise, it gets stuck at the boundary p=0 and N=∞. I was hoping for a replica of the mixture case where local maxima are more interesting than the degenerate global maximum… (Of course, there is always the alternative of using a Bayesian noninformative approach.)
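The boundary behaviour is easy to exhibit directly on the profile log-likelihood in N (a plain illustration, not Adamidis' EM; the data and the parameterisation below are mine):

```python
from math import lgamma, log

def nb_profile_loglik(data, N):
    """Profile log-likelihood of NB(N,p) at the conditional MLE of p.

    Parameterisation: P(X=k) = C(k+N-1,k) (1-p)^N p^k with mean Np/(1-p),
    so for fixed N the MLE of p matches the sample mean: p = xbar/(xbar+N).
    """
    n, xbar = len(data), sum(data) / len(data)
    p = xbar / (xbar + N)
    ll = sum(lgamma(k + N) - lgamma(N) - lgamma(k + 1) for k in data)
    return ll + sum(data) * log(p) + n * N * log(1 - p)

# underdispersed sample (mean 3.5 > variance 0.25): the profile keeps
# increasing with N, i.e. the likelihood degenerates towards the Poisson limit
under = [3, 3, 4, 4, 3, 4, 3, 4]
# overdispersed sample (mean 3 < variance 12.75): interior maximum in N
over = [0, 1, 2, 8, 0, 10, 1, 2]
```

Evaluating the profile over a grid of N values shows the maximum drifting to the largest N for the underdispersed sample, while the overdispersed sample produces a proper interior maximum.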
“Recently, El Moselhy et al. proposed a method to construct a map that pushed forward the prior measure to the posterior measure, casting Bayesian inference as an optimal transport problem. Namely, the constructed map transforms a random variable distributed according to the prior into another random variable distributed according to the posterior. This approach is conceptually different from previous methods, including sampling and approximation methods.”
Yesterday, Kim et al. arXived a paper with the above title, linking transport theory with Bayesian inference. Rather strangely, they motivate transport theory with Galton's quincunx, when the apparatus is a discrete version of the inverse cdf transform… Of course, in higher dimensions, there is no longer such a straightforward transform, and the paper shows (or recalls) that there exists a unique solution with positive Jacobian for log-concave posteriors, for instance with log-concave priors and likelihoods. This solution remains however a virtual notion in practice, and an approximation is constructed via a (finite) functional polynomial basis and by minimising an empirical version of the Kullback-Leibler divergence.
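In one dimension the monotone transport map is simply the posterior quantile function composed with the prior cdf, which the quincunx discretises; a toy Gaussian conjugate illustration (the numbers are mine, not from the paper):

```python
import math
import random

# Toy 1-D illustration: with a N(0,1) prior on theta and a single
# observation y ~ N(theta,1), the posterior is N(y/2, 1/2).  The monotone
# map pushing the prior forward to the posterior is F_post^{-1} o F_prior,
# which for two Gaussians reduces to an affine map.
rng = random.Random(42)
y = 1.8

def transport(x):
    return y / 2 + x / math.sqrt(2)   # posterior mean + posterior sd * x

prior_draws = [rng.gauss(0.0, 1.0) for _ in range(50000)]
post_draws = [transport(x) for x in prior_draws]
```

The pushed-forward sample has mean close to y/2=0.9 and variance close to 1/2, i.e. it is (exactly, here) a posterior sample; the whole difficulty of the paper lies in approximating this map when no closed form exists and the dimension grows.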
I am somewhat uncertain as to how and why to apply such a transform to simulations from the prior (which thus has to be proper). Producing simulations from the posterior certainly is a traditional way to approximate Bayesian inference, and this is thus one approach to this simulation. However, the discussion of the advantage of this approach over, say, MCMC, is quite limited: there is no comparison with alternative simulation or non-simulation methods, no assessment of the computing time required to derive the transport function, and no study of the impact of the dimension of the parameter space on that computing time. In connection with recent discussions on probabilistic numerics and super-optimal convergence rates, and given that it relies on simulations, I doubt optimal transport can do better than O(√n) rates. One side remark about deriving posterior credible regions from (HPD) prior credible regions: there is no reason the resulting region is optimal in volume (HPD), given that the transform is non-linear.
A fairly good race in Argentan, despite (relatively) hot weather that saw the winning time increase by two and a half minutes. (Last year's winner actually lost three and a half minutes on the same track.) Gaining 18 seconds over my time from last year was thus a significant achievement. This was my 17th (or 18th, including one I ran on my own after it had been cancelled) Argentan half-marathon. The first half [of the half] was a bit unpleasant under the relentless sun, and each wee slope made me miss my 3:52-per-kilometre goal. But very few runners passed me for good after the 5th kilometre, and reaching the forest part was a blessing, providing shade and cool air. A single runner passed me there, although I had not slowed down, and I did not realise he was another runner in my V2 category, as I could have tried to keep up. The last kilometres were indeed much smoother than in previous years, thanks to a change in my training where I increased the number of long-distance sessions, and presumably thanks to the previous week abroad, when I trained twice a day. Anyway, I was still surprised to end up as the second V2, 15 seconds from both the first and third V2 runners. Which got me a nice Timex watch as a reward! And a pretty ugly cup… [Thanks again and again to the photographers of Normandiecourseapied for their free pictures!]
“For example, a religiously affiliated college that receives federal grants could fire a professor simply for being gay and still receive those grants. Or federal workers could refuse to process the tax returns of same-sex couples simply because of bigotry against their marriages. It doesn’t stop there. As critics of the bill quickly pointed out, the measure’s broad language — which also protects those who believe that “sexual relations are properly reserved to” heterosexual marriages alone — would permit discrimination against anyone who has sexual relations outside such a marriage. That would appear to include women who have children outside of marriage, a class generally protected by federal law.” The New York Times
An excerpt from this week’s New York Times Sunday Review editorial about what it qualifies as “a nasty bit of business congressional Republicans call the First Amendment Defense Act.” A bill whose first line states it is intended to “prevent discriminatory treatment of any person on the basis of views held with respect to marriage” and which in essence would shield from prosecution the discriminatory treatment of homosexual and unmarried couples. A fine example of Newspeak if ever there was one! (Maybe they could also borrow Orwell’s notion of a Ministry of Love.) Another excerpt from the bill that similarly competes for Newspeak of the Year:
(5) Laws that protect the free exercise of religious beliefs and moral convictions about marriage will encourage private citizens and institutions to demonstrate tolerance for those beliefs and convictions and therefore contribute to a more respectful, diverse, and peaceful society.
This reminded me of a story I was recently told about a friend of a friend who is currently employed by a Catholic school in Australia and is afraid of being fired if found to be pregnant outside of marriage. What kind of “freedom” is being defended by such “tolerant” behaviours?!
When I started the ‘Og, in 2008, I was about to run the 23rd edition of the Argentan half-marathon… Seven years later, I am once again getting ready for the race, after a rather good training season, between the mountains of the North Cascades and the track of Malakoff, with the last week in England, Holland, and Canada having seen close to two training sessions a day. (Borderline stress injury, maybe!) The weather does not look too bad this year, so we’ll see tomorrow how I fare against myself (and against the other V2 runners, incidentally!).