## understanding elections through statistics [book review]

Posted in Books, Kids, R, Statistics, Travel on October 12, 2020 by xi'an

A book to read most urgently if hoping to make an informed decision by 03 November! Written by a political scientist cum statistician, Ole Forsberg. (If you were thinking of another political scientist cum statistician, he wrote Red State, Blue State a while ago! And is currently forecasting the outcome of the November election for The Economist.)

“I believe [omitting educational level] was the main reason the [Brexit] polls were wrong.”

The first part of the book is about the statistical analysis of opinion polls (assuming their outcome is given, rather than designing them in the first place). Starting with the Scottish independence referendum of 2014. The first chapter covers the cartoon case of simple sampling from a population, with or without replacement, Bayes and non-Bayes. In somewhat too much detail imho, given that this is an unrealistic description of poll outcomes. The second chapter expands to stratified sampling (with a confusing title [Polling 399] and entry, since it discusses repeated polls that are not processed in said chapter). Mentioning the famous New York Times experiment where five groups of pollsters analysed the same data, making different decisions in adjusting the sample and identifying likely voters, and coming out with a range of five points in the percentage. Starting to get a wee bit more advanced when designing priors for the population proportions. But still studying a weighted average of the voting intentions for each category. Chapter three reaches the challenging task of combining polls, with the 2017 (South) Korea presidential election as an illustration, involving five polls. It includes a solution to handling older polls by proposing a simple linear regression against time.

Chapter 4 sums up the challenges of real-life polling by examining the disastrous 2016 Brexit referendum in the UK. Exposing for instance the complicated biases resulting from polling by phone or on-line. The part that weights polling institutes according to quality does not provide any quantitative detail. (And also a weird averaging between the levels of “support for Brexit” and “maybe-support for Brexit”, see Fig. 4.5!) Concluding, as quoted above, that omitting the educational stratification was the reason for missing the shock wave of referendum day is a possible explanation, but the massive difference in turnout between the age groups, itself possibly induced by the reassuring figures of the published polls and predictions, certainly played a role in missing the (terrible) outcome.
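
To make the "weighted average of the voting intentions for each category" concrete, here is a minimal Python sketch of a stratified estimate. This is not the book's R code: the function name `stratified_estimate`, the two education strata, and all the figures are made up for illustration, and the standard error assumes independent simple random samples within each stratum.

```python
from math import sqrt

def stratified_estimate(strata):
    """Weighted average of within-stratum proportions.

    strata: list of (population_share, sample_size, sample_yes) tuples,
    where the population shares are assumed known and sum to one.
    Returns the estimate and its approximate standard error.
    """
    total_share = sum(share for share, _, _ in strata)
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError("population shares must sum to one")
    estimate, variance = 0.0, 0.0
    for share, n, yes in strata:
        p = yes / n                                 # proportion within the stratum
        estimate += share * p                       # weight by population share
        variance += share ** 2 * p * (1 - p) / n    # binomial variance per stratum
    return estimate, sqrt(variance)

# Hypothetical poll that over-samples graduates (all numbers are invented).
strata = [
    (0.7, 200, 120),  # non-graduates: 70% of voters, 200 polled, 120 "leave"
    (0.3, 300, 120),  # graduates: 30% of voters, 300 polled, 120 "leave"
]
raw_share = sum(yes for _, _, yes in strata) / sum(n for _, n, _ in strata)
weighted_share, std_error = stratified_estimate(strata)
print(f"raw: {raw_share:.2f}, stratified: {weighted_share:.2f} (se {std_error:.3f})")
```

With these invented figures the unweighted share sits below one half while the stratified one sits above it, which is the kind of reversal the quote about educational level points at.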

“The fabricated results conformed to Benford’s law on first digits, but failed to obey Benford’s law on second digits.” Wikipedia

The second part of this 200-page book is about election analysis, towards testing for fraud. Hence involving the ubiquitous Benford law. Although applied to the leading digit, which I do not think should necessarily follow Benford's law, due to both the varying sizes and the non-uniform political inclinations of the voting districts (of which there are 39 for the 2009 presidential Afghan election illustration, although the book sticks to 34 (p.106)). My impression was that lesser digits should be tested instead. Chapter 5 actually supports the use of the generalised Benford distribution that accounts for differences in turnouts between the electoral districts. But it cannot come up with a real-life election where the Benford test points out a discrepancy (and hence a potential fraud). Concluding with the author’s verdict [repeated from his PhD thesis] that these Benford tests “are specious at best”, which makes me wonder why spend 20 pages on the topic. The following chapter thus considers other methods, checking for differential [i.e., not-at-random] invalidation by linear and generalised linear regression on the supporting rate in the district. Once again concluding that there is no evidence of such fraud when analysing the 2010 Côte d’Ivoire elections (that led to civil war). With an extension in Chapter 7 to account for spatial correlation. The book concludes with an analysis of the Sri Lankan presidential elections between 1994 and 2019, with conclusions of significant differential invalidation in almost every election (even those not including Tamil provinces from the North).

R code is provided and discussed within the text. Some simple mathematical derivations are found, albeit with a huge dose of warnings (“math-heavy”, “harsh beauty”) and excuses (“feel free to skim”, “the math is entirely optional”). Often, one wonders about the relevance of said derivations for the intended audience and the overall purpose of the book. Nonetheless, it provides an interesting entry into (relatively simple) models applied to election data and could certainly be used as an original textbook on modelling aggregated count data, in particular as it should spark the interest of (some) students.

[Disclaimer about potential self-plagiarism: this post or an edited version will eventually appear in my Books Review section in CHANCE.]

## Barolo [서울에서]

Posted in Statistics on April 20, 2020 by xi'an

## Parasite in chief [verbatim]

Posted in Statistics on March 9, 2020 by xi'an

“How bad were the Academy Awards this year? Did you see? And the winner is a movie from South Korea. What the hell was that all about? We’ve got enough problems with South Korea, with trade. On top of it, they give them the best movie of the year? Was it good? I don’t know. Can we get, like, ‘Gone With the Wind’ back, please?” DT, 20022020

## my country

Posted in Books, Kids, pictures, Travel on December 1, 2019 by xi'an

Watching a new Korean drama TV series, called My country: The New Age, which relates to the early years of the Joseon dynasty and the power shifts between the first King, Yi Seung-Gye, later King Taejo, a former general taking over by assassinating the previous ruler (and making him the last of his dynasty!), and his sons, but involving heavily three young characters growing through these years and making them instrumental to solving these power fights.

First, Seo Hwi, the son of the greatest warrior of the (previous) era who was wrongly dishonored, himself a fabulous warrior, who witnessed his father’s suicide and was left alone to protect his sick sister Seo Yeon. Continuously persecuted by Naem Jeon, an eminence grise behind Yi Seung-Gye and definitely the arch-evil character of the series, with Hwi aiming at taking revenge on him, although he gets several opportunities to kill him and does not. (The number of violent deaths in the series is staggering!)

Second, Han Hui-Jae, a rebellious young woman raised in a gisaeng (geisha equivalent) residence by the owner of the place, after the political murder of her mother under Hui-Jae’s eyes. Rising to prominence in the structure, which is used to gather information and gain influence through the clients of the place. And in love with Hwi.

Third, Nam Seon-Ho, illegitimate son of Naem Jeon, whom he hates for causing the suicide of his low-born mother and who raises him after the death of his legitimate son (it however requires most of the show for Seon-Ho to turn completely against his father). Close friend of Hwi, but also his competitor in the military exams, which he unfairly wins by his father weighing in, and equally in love with Hui-Jae. Protector of Yeon when her brother Hwi gets sent to a disciplinary military unit for rebelling against the exam result. With the two never truly and actively fighting one another, as the old friendship keeps Seon-Ho saving Hwi and counter-acting his father’s nefarious moves, even though Seon-Ho pretends to aim only at gaining power.

A complex if classical triangle, with a theatrical (in the sense of formal and unrealistic) organisation of the plot. The most interesting character in the whole show may be Yi Bang-Won, the fifth son of the new king Yi Seung-Gye, who supported him in acceding to the throne and who strongly resents not being his nominated successor. (Maybe because he has a deeper presence, thanks to historical sources. He would eventually become King Taejong.) Very similar in many ways to the Scholar who walks the night, except without a fantastic part. Almost indistinguishable music as well. But still an interesting experience, esp. when watching it in Seoul (which was Hanseong at that time).

## fake conference

Posted in Books, Kids, University life on November 25, 2019 by xi'an

One of my (former) master students approached me last week for support to attend an AI conference in London next May, as he had been invited there as a speaker with the prospect of publishing a paper in an AI journal. And very excited about it. As the letter of invitation definitely sounded fake to me and as Conference Series LLC did not seem connected to anything scientific, I had a quick check whether or not this was another instance of a predatory conference and indeed the organisation is an outlet of the (in)famous OMICS International company. Setting up conferences all around the year and all around the world, charging participants a significant amount and cramming all speakers on potentially any topic in the same room of a suburban motel (near Heathrow in that case). It is somewhat surprising that they still manage to capture victims, but if they aim wide enough to cover students like the one who contacted me, who had no idea of the possibility of such scams, no wonder the operation is still running. Coincidentally, I was reading a news article in Nature, while in Seoul, that “South Korea’s education ministry wants to stop academics from participating in conferences that it considers “weak” and of little academic value”. I hope it works better than India’s earlier attempt at banning publications in predatory journals.