## inf R ! [book review]

Posted in Books, R, Travel with tags , , , , , , , , , , , on June 10, 2021 by xi'an

Thanks to my answering a (basic) question on X validated involving an R code, R mistakes and some misunderstanding about Bayesian hierarchical modelling, I got pointed out to Patrick Burns’ The R inferno. This is not a recent book as the second edition is of 2012, with a 2011 version still available on-line. Which is the version I read. As hinted by the cover, the book plays on Dante’s Inferno and each chapter is associated with a circle of Hell… Including drawings by Botticelli. The style is thus most enjoyable and sometimes hilarious. Like hell!

The first circle (reserved for virtuous pagans) is about treating integral reals as if they were integers, the second circle (attributed to gluttons, although Dante’s is for the lustful) is about allocating more space along the way, as in the question I answered and in most of my students’ codes! The third circle (allocated here to blasphemous sinners, destined for Dante’s seven circle, when Dante’s third circle is to the gluttons) points out the consequences of not vectorising, with for instance the impressive capacities of the ifelse() function [exploited to the max in R codecolfing!].  And the fourth circle (made for the lustfuls rather than Dante’s avaricious and prodigals) is a short warning about the opposite over-vectorising. Circle five (destined for the treasoners, and not Dante’s wrathfuls) pushes for and advises about writing R functions. Circle six recovers Dante’s classification, welcoming (!) heretics, and prohibiting global assignments, in another short chapter. Circle seven (alloted to the simoniacs—who should be sharing the eight circle with many other sinners—rather than the violents as in Dante’s seventh) discusses object attributes, with the distinction between S3 and S4 methods somewhat lost on me. Circle eight (targeting the fraudulents, as in Dante’s original) is massive as it covers “a large number of ghosts, chimeras and devils”, a collection of difficulties and dangers and freak occurences, with the initial warning that “It is a sin to assume that code does what is intended”. A lot of these came as surprises to me and I was rarely able to spot the difficulty without the guidance of the book. Plenty to learn from these examples and counter-examples. Reaching Circle nine (where live (!) the thieves, rather than Dante’s traitors). A “special place for those who feel compelled to drag the rest of us into hell.” Discussing the proper ways to get help on fori. Like Stack Exchange. Concluding with the tongue-in-cheek comment that “there seems to be positive correlation between a person’s level of annoyance at [being asked several times the same question] and ability to answer questions.” This being a hidden test, right?!, as the correlation should be negative.

## Kempner Fi

Posted in Books, Kids, R, Statistics with tags , , , , , , , on January 19, 2021 by xi'an

A short code-golf challenge led me to learn about the Kempner series, which is the series made of the inverted integers, excluding all those containing the digit 9. Most surprisingly this exclusion is enough to see the series converging (close to 23). The explanation for this convergence is that, citing Wikipedia,

“The number of n-digit positive integers that have no digit equal to ‘9’ is 8 × 9n−1

and since the inverses of these n-digit positive integers are less than 101−n the series is bounded by 80. In simpler terms, it converges because the fraction of remaining terms in the series is geometrically decreasing as (9/10)1−n. Unsurprisingly (?) the series is also atrociously slow to converge (for instance the first million terms sum up to 11) and there exist recurrence representations that speed up its computation.  Here is the code-golf version

n=scan()+2while(n<-n-1){  F=F+1/T  while(grepl(9,T<-T+1))0}F


that led me to learn about the R function grepl. (The explanation for the pun in the title is that Semper Fidelis is the motto of the corsair City of Saint-Malo or Sant-Maloù, Brittany.)

## on arithmetic derivations of square roots

Posted in Books, Kids, pictures, R, Statistics with tags , , , , , , , , , on November 13, 2020 by xi'an

An intriguing question made a short-lived appearance on the CodeGolf section of Stack Exchange, before being removed, namely the (most concise possible) coding of an arithmetic derivation of the square root of an integer, S, with a 30 digit precision and using only arithmetic operators. I was not aware of the myriad of solutions available, as demonstrated on the dedicated WIkipedia page. And ended playing with three of them during a sleepless pre-election night!

The first solution for finding √S is based on a continued fraction representation of the root,

$\sqrt{S}=a+\cfrac{r}{2a+\cfrac{r}{2a+\ddots}}$

with a²≤S and r=S-a². It is straightforward to code-golf:

while((r<-S-T*T)^2>1e-9)T=(F<-2*T+r/(2*T+F))-T;F


but I found it impossible to reach the 30 digit precision (even when decreasing the error bound from 10⁻⁹). Given the strict rules of the game, this would have been considered to be a failure.

The second solution is Goldschmidt’s algorithm

b=S
T=1/sum((1:S)^2<S)
while((1-S*prod(T)^2)^2>1e-9){
b=b*T[1]^2
T=c((3-b)/2,T)}
S*prod(T)


which is longer for code-golfing but produces both √S and 1/√S (and is faster than the Babylonian method and Newton-Raphson). Again no luck with high precision and almost surely unacceptable for the game.

The third solution is the most interesting [imho] as it mimicks long division, working two digits at a time (and connection with Napier’s bones)

~=length
D=~S
S=c(S,0*(1:30))
p=d=0
a=1:9
while(~S){
F=c(F,x<-sum(a*(20*p+a)<=(g<-100*d+10*S[1]+S[2])))
d=g-x*(20*p+x)
p=x+10*p
S=S[-1:-2]}
sum(10^{1+D/2-1:~F}*F)


plus providing an arbitrary number of digits with no error. This code requires S to be entered as a sequence of digits (with a possible extra top digit 0 to make the integer D even). Returning one digit at a time, it would further have satisfied the constraints of the question (if in a poorly condensed manner).

## not a Bernoulli factory

Posted in Books, Kids, pictures, R with tags , , , , , , , on May 20, 2020 by xi'an

A Riddler riddle I possibly misunderstood:

Four isolated persons are given four fair coins, which can be either flipped once or returned without being flipped. If all flipped coins come up heads, the team wins! Else, if any comes up tails, or if no flip at all is done, it looses. Each person is further given an independent U(0,1) realisation. What is the best strategy?

Since the players are separated, I would presume the same procedure is used by all. Meaning that a coin is tossed with probability p, ie if the uniform is less than p, and untouched otherwise. The probability of winning is then

4(1-p)³p½+6(1-p)³p½²+4(1-p)p³½³+p⁴½⁴

which is maximum for p=0.3420391, with a winning probability of 0.2848424.

solve $$x⌊x⌊x⌊x⌋⌋⌋=2020$$

Where the integral part is the integer immediately below x. Puzzle that I first fail solving by brute force, because I did not look at negative x’s… Since the fourth root of 2020 is between 6 and 7, the solution is either x=6+ε or x=-7+ε, with ε in (0,1). The puzzle then becomes either

$$(6+ε)⌊(6+ε)⌊(6+ε)⌊6+ε⌋⌋⌋ = (6+ε)⌊(6+ε)⌊36+6ε⌋⌋ = (6+ε)⌊(6+ε)(36+⌊6ε⌋)⌋ = 2020$$

where there are 6 possible integer values for $$⌊6ε⌋, with only ⌊6ε⌋=5 being possible, turning the equation into$$

$$(6+ε)⌊41(6+ε)⌋ = (6+ε)(246+⌊41ε⌋) = 2020$$

where again only $$⌊42ε⌋$$=40 being possible, ending up with

$$1716+286ε = 2020$$

which has no solution in (0,1). In the second case

(-7+$$ε$$)$$⌊(-7+ε)⌊(-7+ε)⌊-7+ε⌋⌋⌋$$ = (-7+$$ε$$)$$⌊(-7+ε)(49+⌊-7ε⌋)⌋ = 2020$$

shows that only $$⌊-7ε⌋$$=-3 is possible, leading to

(-7+$$ε$$)$$⌊46(-7+ε))⌋$$ = (-7+$$ε$$) ($$-322+⌊46ε⌋)=2020$$

with only $$⌊46ε⌋$$=17 possible, hence

2135-305$$ε$$=2020

and

$$ε$$=115/305.

A brute force simulated annealing resolution returns x=-6.622706 after 10⁸ iterations. A more interesting question is to figure out the discontinuity points of the function

$$ℵ(x) = x$$$$⌊x⌊x⌊x⌋⌋⌋$$

as they seem to be numerous:

For instance, only 854 of the first 2020 integers enjoy a solution to $$ℵ(x)$$=n.

## stack explode

Posted in Books, Kids, University life with tags , , , , , , , , on October 21, 2019 by xi'an

To say the least, most Stack Exchange communities have been quite active in the past days, not towards solving an unusual flow of questions from new or old users, but in protesting against the exclusion of a moderator who disputed on a moderator forum the relevance of a code of conduct change proposed or imposed by the private company behind Stack Exchange, now called Stack Overflow (like the honomym forum on Stack Exchange). A change about the use of gender pronouns in comments and answers (an announcement that attracted the second largest number of negative votes for the entire site). And an exclusion followed by a sequence of apologies from the company highest officers that did not seem to pacify anyone (first largest number of negative votes!) and that kept the excluded moderator excluded. And leading to close to one hundred moderators resigning or going AWOL. Including one of the most active members of X validated, Glen_b. Who posted a detailed description of the chain of events and a most rational explanation of why he was resigning from being a moderator. And then another major moderator, gung… A flak overflow as put by another report.

“We recognise that Stack Exchange is in no way obliged to take our input. We know that we are guests in the home of a private company. We don’t own the platform, and while we want to help to steer the ship, we don’t have the right to determine how it is governed. What built this network is a sense of community and common purpose, and a big part of that has always been the close relationship and communication between Stack Exchange and stakeholders, such as moderators and users. It’s a shame that we’ve lost something so fundamental.” dearstackexchange.com

What can be learned from this fiasco is that it is not a very good idea to let a technical Q&A forum such as Stack Exchange to be run by a private company. Even though many contributors may have never realised till now this is the case. And even when the company is using A/B tests, Bayesian GLMs, and Stan to decrease the number of “unfriendly comments” on the site. Companies are primarily there to make profit and report to stakeholders, rather than the millions of people contributing to the site for free, sometimes investing a considerable amount of time and energy towards making the questions answered in a constructive manner that benefits the entire audience. Despite the facade coolness as in the nerdy, geeky chatter on the company blog, the company  executives and employees obviously do not share the same goal as the volunteers in the numerous communities of the network. Dealing in public relations rather than sheer exchange and in public image rather than openness and in management rather than empowerment. And in advertising rather than sharing.

Another basic remark is that by growing into so many subjects beyond computer programming, and in particular non-technical topics, the SE platform has hit a stage where some communities goals will inevitably clash with others’. I deem it rather characteristic that the (one?) source of the crisis is the issue of using pronouns as stated by the OP (if any) or else using ungendered pronouns. (Pronouns like they which apparently works in English for both plural and singular—as does you—as early as the 14th century.) As some raised religious arguments against using one or several versions. As well as grammatical ones and further ones of being challenging for some non-native-English speakers. I do not think that a corporate imposition (with threats of exclusionary consequences) one single version of inclusion and tolerance is going to work and especially not within each and all of the communities constituting Stack Exchange, which is why working towards an alternative and decentralised network could be timely.