The first circle (reserved for virtuous pagans) is about treating integral reals as if they were integers; the second circle (attributed here to gluttons, although Dante’s second is for the lustful) is about allocating more space along the way, as in the question I answered and in most of my students’ code! The third circle (allocated here to blasphemous sinners, who are destined for Dante’s seventh circle, when Dante’s third circle is for the gluttons) points out the consequences of not vectorising, with for instance the impressive capacities of the ifelse() function [exploited to the max in R codegolfing!]. And the fourth circle (made for the lustful rather than Dante’s avaricious and prodigal) is a short warning about the opposite sin of over-vectorising. Circle five (destined for the traitors, and not Dante’s wrathful) pushes for and advises about writing R functions. Circle six recovers Dante’s classification, welcoming (!) heretics, and prohibiting global assignments, in another short chapter. Circle seven (allotted to the simoniacs, who should be sharing the eighth circle with many other sinners, rather than the violent as in Dante’s seventh) discusses object attributes, with the distinction between S3 and S4 methods somewhat lost on me. Circle eight (targeting the fraudulent, as in Dante’s original) is massive as it covers “a large number of ghosts, chimeras and devils”, a collection of difficulties, dangers and freak occurrences, with the initial warning that “It is a sin to assume that code does what is intended”. A lot of these came as surprises to me and I was rarely able to spot the difficulty without the guidance of the book. Plenty to learn from these examples and counter-examples. Circle nine (where live (!) the thieves, rather than Dante’s traitors) is a “special place for those who feel compelled to drag the rest of us into hell”, discussing the proper ways to get help on fora like Stack Exchange. It concludes with the tongue-in-cheek comment that “there seems to be positive correlation between a person’s level of annoyance at [being asked several times the same question] and ability to answer questions.” This being a hidden test, right?!, as the correlation should be negative.
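To make the second and third circles concrete, here is a quick illustration of mine (not taken from the book) of growing an object versus preallocating and vectorising:

sinner = NULL
for (i in 1:1e5) sinner = c(sinner, sqrt(i))   # second circle: growing the object at each step

penitent = numeric(1e5)                        # preallocation avoids the repeated copies
for (i in 1:1e5) penitent[i] = sqrt(i)

saved = sqrt(1:1e5)                            # third circle escaped: vectorise the loop away

x = rnorm(10)
y = ifelse(x > 0, x, -x)                       # ifelse() as a vectorised conditional, here |x|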

(with another version replacing 2/π with the square root of π/8) and

not to mention a rational fraction. All of which are more efficient (in R), if barely, than the resident pnorm() function.

      test replications elapsed relative user.self
3 logistic       100000   0.410    1.000     0.410
2    polya       100000   0.411    1.002     0.411
1 resident       100000   0.455    1.110     0.455
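The formulas themselves did not survive here, but for the record a sketch of my take on two of these approximations, with a benchmark call of the kind producing the above table; the √(π/8) scale in the logistic version is a guess, consistent with matching the Normal density at zero:

library(rbenchmark)

# Pólya (1945) approximation to the standard Normal cdf
polya = function(x)
  .5 * (1 + sign(x) * sqrt(1 - exp(-2 * x^2 / pi)))

# logistic approximation, with scale sqrt(pi/8) matching the density at zero
logistic = function(x)
  1 / (1 + exp(-x / sqrt(pi / 8)))

x = rnorm(1e3)
benchmark("logistic" = logistic(x),
          "polya"    = polya(x),
          "resident" = pnorm(x),
          replications = 1e5)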

For the inverse cdf, the approximations there involve numerical inversion, except for

which proves slightly faster than qnorm()

       test replications elapsed relative user.self
2 inv-polya       100000   0.401    1.000     0.401
1  resident       100000   0.450    1.122     0.450
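Pólya’s formula inverts in closed form, which is presumably what the inv-polya entry above refers to; here is my sketch of the inversion, derived by solving p = ½(1 + √(1 − exp(−2x²/π))) for x and extending to p < ½ by symmetry:

# closed-form inverse of the Pólya approximation
inv_polya = function(p)
  sign(p - .5) * sqrt(-pi * log(1 - (2 * p - 1)^2) / 2)

p = runif(1e3)
max(abs(inv_polya(p) - qnorm(p)))  # worst absolute error on this sample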

When looking at the OP’s R code, I did not notice anything amiss at first glance (I was about to drive back from Annecy, hence did not look too closely) and reran the attached code with a larger variance in the proposal, which returned the above picture for the MCMC sample, close enough (?) to the target. Later, from home, I checked the code further and noticed that the Metropolis ratio was only using the ratio of the targets. Dividing by the ratio of the proposals made a significant (?) difference to the representation of the target.

More interestingly, the OP was fundamentally confused between independent and random-walk Rosenbluth algorithms, from using the wrong ratio to aiming at the wrong scale factor and average acceptance ratio, and furthermore challenged by the very notion of Hessian matrix, which is often suggested as a default scale.
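Since the fix is easy to get wrong, here is a minimal sketch of one step of an independent Metropolis(-Rosenbluth) algorithm, with hypothetical arguments targ (target density), dprop and rprop (proposal density and simulator), showing where the ratio of the proposals enters:

# one independent Metropolis step: the acceptance ratio must include
# the ratio of the proposal densities, not only the ratio of the targets
mh_step = function(x, targ, dprop, rprop){
  y = rprop()
  if (runif(1) < targ(y) * dprop(x) / (targ(x) * dprop(y))) y else x
}

# toy run: standard Normal target, Cauchy proposal
niter = 1e4
mcmc = numeric(niter)
for (t in 2:niter)
  mcmc[t] = mh_step(mcmc[t-1], dnorm, dcauchy, function() rcauchy(1))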

Looking for a broader perspective, I thus wonder at what we would instead need to assess the lack of convergence of an MCMC chain without much massaging of the said chain. An evaluation of the (Kullback, Wasserstein, or else) distance between the distribution of the chain at iteration n, or across iterations, and the true target? A percentage of the mass of the posterior visited so far, which relates to estimating the normalising constant, with a relatively vast array of solutions made available in recent years? I remain perplexed and frustrated by the fact that, 30 years later, the computed values of the visited likelihoods are not better exploited, through for instance machine-learning approximations of the target, which could themselves be utilised for approximating the normalising constant and potential divergences from other approximations.

The optimal strategy is to follow A while the score is zero, C when the score is 3, and B otherwise. The corresponding winning probability is 0.8548, as checked by the following code

# winning probability under optimal play, recursing over rounds n
# with current score s; at the last round (n=4), the winning chances
# are 1 when s=3, .4 when s=0, and .8 when s=2
win = function(n=1, s=0){
  if (n == 4) return((s == 3) + .4 * (!s) + .8 * (s == 2))
  # otherwise pick the best of the three options A, B, and C
  return(max(c(
    .4 * win(n+1, s+3) + .3 * win(n+1, s+1) + .3 * win(n+1, s),  # A
    .1 * win(n+1, s+3) + .8 * win(n+1, s+1) + .1 * win(n+1, s),  # B
    win(n+1, s))))                                               # C
}
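As a check, calling the recursion from the start indeed returns the stated winning probability:

> win()
[1] 0.8548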

is distributed as an Exp(1) random variate. Meaning that for every scale μ, the integer part and the fractional part of an Exponential variate are independent, the former a Geometric. A refinement of the above consists in choosing

exp(-μ) = ½, i.e. μ = log 2,

as the generation of M then consists in counting the number of 0’s before the first 1 in the binary expansion of U∼U(0,1). Actually, the loop used in Ahrens & Dieter (1972) seems to be much less efficient than counting these 0’s:

> benchmark(
+  "a"={u=runif(1)
+       while(u<.5){
+         u=2*u
+         F=F+log(2)}},
+  "b"={v=as.integer(rev(intToBits(2^31*runif(1))))
+       sum(cumprod(!v))},
+  "c"={sum(cumprod(sample(c(0,1),32,rep=T)))},
+  "g"={rgeom(1,prob=.5)},
+  replications=1e4)
  test elapsed relative user.self
1    a  32.920  557.966    32.885
2    b   0.123    2.085     0.122
3    c   0.113    1.915     0.106
4    g   0.059    1.000     0.058

Obviously, trying to code the change directly in R resulted in much worse performance than the resident rexp(), coded in C.
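For completeness, here is my R rendering of the decomposition (a sketch, not Ahrens & Dieter’s actual algorithm): with μ = log 2, an Exp(1) variate is log 2 times a Geometric(½) count plus an independent Exponential truncated to (0, log 2), the latter simulated by inversion:

# Exp(1) as log(2) * Geometric(1/2) + a truncated Exponential on (0, log 2)
myrexp = function(n){
  M = rgeom(n, prob = .5)       # integer part of X / log(2)
  R = -log(1 - runif(n) / 2)    # Exp(1) truncated to (0, log(2)), by inversion
  M * log(2) + R
}

# sanity check against the resident generator
qqplot(myrexp(1e4), rexp(1e4))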

The problem sounds impossible to solve without an ability to compute the density value at a given point, since any convex combination *αf¹+(1-α)f²* would return the same two samples. Assuming continuity of the density *f* at the boundary point *a* between *A* and its complement, a desperate solution for *p(A)/(1-p(A))* is to take the ratio of the density estimates at the value *a*, which turns out not so poor an approximation, if seemingly biased. This was surprising to me as kernel density estimates are notoriously bad at boundary points.
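A minimal sketch of this desperate solution, assuming *A = (-∞, a)* and using stand-in samples for the two restricted samples of the question; the kernel boundary bias hits both estimates similarly and partly cancels in the ratio:

# odds p(A)/(1-p(A)) estimated by the ratio of the two kernel density
# estimates evaluated at the boundary point a
a = 0
samp1 = -abs(rnorm(1e4))   # stand-in for the sample from f restricted to A
samp2 =  abs(rnorm(1e4))   # stand-in for the sample from f on the complement
d1 = density(samp1, n = 1, from = a, to = a)$y   # estimates f(a) / p(A)
d2 = density(samp2, n = 1, from = a, to = a)$y   # estimates f(a) / (1-p(A))
odds = d2 / d1             # estimates p(A) / (1-p(A)), here close to 1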

If *f(x)* can be computed [up to a constant] at an arbitrary *x*, it is obviously feasible to simulate from *f* and approximate *p(A)*. But the problem is then moot as a resolution would not even need the initial samples. If exploiting those to construct a single kernel density estimate, this estimate can be used as a proposal in an MCMC algorithm. Surprisingly (?), using instead the empirical cdf as proposal does not work.

means that it is the superposition of shifted Cauchys on the unit circle (with nice complex representations). As such, it is easily simulated by re-shifting a Cauchy back to (-π,π), i.e. using the inverse transform
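The transform itself did not survive here, but the wrapping step is short enough to sketch, assuming location mu and scale gam for the underlying Cauchy:

# simulating a wrapped Cauchy by shifting a Cauchy variate back to (-pi, pi]
rwrapc = function(n, mu = 0, gam = 1)
  (rcauchy(n, location = mu, scale = gam) + pi) %% (2 * pi) - pi

hist(rwrapc(1e4, mu = 1, gam = .1), breaks = 100)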

sucz = 0
for (i in 1:2^12){
  # binary representation of the i-th configuration of the 12 edges
  path = intToBits(i)[1:12]
  sol = 0
  # the configuration succeeds if at least one of the 12 paths is open:
  # edges with positive index in paz[[j]] must be on (01), those with
  # negated index must be off (00)
  for (j in 1:12)
    sol = max(sol,
      prod(path[paz[[j]][paz[[j]] > 0]] == 01) *
      prod(path[-paz[[j]][paz[[j]] < 0]] == 00))
  sucz = sucz + sol}

where paz is the list of the 12 possible paths from North-West to South-East (excluding loops!), leading to a probability of 1135/2¹². I could not find a logical reasoning to reach this number. The paths of length 4, 6, and 8 are valid in 2⁸, 2⁶, and 2⁴ of the cases, respectively (and logically!), but this does not help as the paths are dependent.
