Archive for PhD thesis

monomial representations on Netflix

Posted in Books, Kids, pictures, Travel on February 16, 2021 by xi'an

When watching the first episode of The Queen’s Gambit, following the recommendations of my son, I glimpsed the cover of a math thesis defended at Cornell by the mother of the main character..! Prior to 1957, the year of her death. Searching a wee bit further, I found that there exists an actual thesis with this very title, albeit defended by Stephen Stanley in 1998 at the University of Birmingham, that is, Birmingham, UK [near Coventry]. Apart from this amusing piece of trivia, I also enjoyed watching the first episodes of the series, the lead actress being really outstanding and the plot unfolding rather nicely, except for the chess games, which are unrealistically hurried, presumably because watching people think is anathema on TV! The depiction of the misogyny of the time is however most realistic (I presume!) and definitely shocking. (The first competition game that Beth Harmon loses is somewhat disappointing, as failing to predict a queen exchange is implausible at this level…) However, the growing self-destructive behaviour of Beth made me cringe to the point of stopping the series. The early episodes also reminded me of the days when my son had started playing chess with me, winning on a regular basis, had then joined a Saturday chess club nearby, was moved to the adult section within a few weeks, and… stopped altogether a few weeks later as he (mistakenly) thought the older players were making fun of him!!! He never reached any competitive level but still plays on a regular basis and thrashes me just as regularly. Coincidence or not, the Guardian had a “scandalous” chess story to relate last week, when the Dutch champion defeated the world’s top two players, with one game won thanks to his having prepared the Najdorf Sicilian opening up to the 17th move! (The chess problem below is from the same article but relates to Antonio Medina v Svetozar Gligoric, Palma 1968.)

generalised Poisson difference autoregressive processes

Posted in pictures, Statistics, Travel, University life on February 14, 2020 by xi'an

Yesterday, Giulia Carallo arXived the paper on generalised Poisson difference autoregressive processes that is a component of her Ph.D. thesis at Ca’ Foscari Università di Venezia and to which I contributed while visiting Venezia last Spring. The stochastic process under study is integer valued, being defined as the difference of two generalised Poisson variates, made dependent by an INGARCH process that expresses the mean as a regression over past values of the process and past means. Which makes it easy to simulate, as a difference of (correlated) Poisson variates (see the sketch below). These two variates can in their turn be (re)defined through a thinning operator that I find most compelling, namely as a sum of Poisson variates with a number of terms given by a (quasi-)Binomial variate depending on the previous value. This representation proves useful in establishing stationarity conditions on the process. Beyond deriving various properties of the process, the paper also examines how to conduct Bayesian inference in this context, with specialised Gibbs samplers in action, and compares models on real datasets via Geyer’s (1994) logistic approximation to Bayes factors.
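As a simple illustration of that simulation route, here is a minimal R sketch of a Poisson-difference series with an INGARCH(1,1)-type recursion on the conditional mean, where plain Poisson draws stand in for the generalised Poisson variates of the paper and the coefficients are arbitrary rather than taken from the paper:

# minimal illustration (not the paper's exact specification): integer-valued series
# as the difference of two Poisson draws whose common conditional mean follows an
# INGARCH(1,1)-type recursion; coefficients are purely illustrative
set.seed(1)
Tmax <- 500
omega <- 1; alpha <- 0.3; beta <- 0.4             # illustrative INGARCH coefficients
lambda <- rep(omega / (1 - alpha - beta), Tmax)   # conditional mean of each component
X <- integer(Tmax)
for (t in 2:Tmax) {
  lambda[t] <- omega + alpha * abs(X[t - 1]) + beta * lambda[t - 1]
  X[t] <- rpois(1, lambda[t]) - rpois(1, lambda[t])  # difference of two Poisson draws
}
plot(X, type = "h")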

MCMC, with common misunderstandings

Posted in Books, pictures, R, Statistics, University life on January 27, 2020 by xi'an

As I was asked to write a chapter on MCMC methods for a forthcoming Handbook of Computational Statistics and Data Science, published by Wiley, rather than cautiously declining!, I decided to recycle the answers I wrote on X validated to what I considered to be the most characteristic misunderstandings about MCMC and other computing methods, using as background the introduction produced by Wu Changye in his PhD thesis. Waiting for the opinion of the editors of the Handbook on this Q&A style. The outcome is certainly lighter than other recent surveys, like the one we wrote with Peter Green, Krys Latuszinski, and Marcelo Pereyra for Statistics and Computing, or the one with Victor Elvira, Nick Tawn, and Changye Wu.

Frobenius coin problem

Posted in pictures, R, Statistics on November 29, 2019 by xi'an

A challenge from The Riddler last weekend came out as the classical Frobenius coin problem, namely to find the largest amount that cannot be obtained using only coins of n specified coprime denominations (i.e., with gcd equal to one), in which case there is always such a largest value. For the units a=19 and b=538, I ran a basic R code (a readable version of which is sketched at the end of this post) that returned 9665 as the largest impossible value, which happens to be 19×538-538-19, the Sylvester solution to the problem when n=2. A recent paper by Tripathi (2017) manages the case n=3, for “almost all triples”, which decomposes into a myriad of sub-cases. (As an aside, Tripathi (2017) thanks his PhD advisor, Prof. Thomas W. Cusick, for contributing to the proof, the problem having been a part of his own dissertation, but does not explain why Cusick did not join as co-author.) The specific case when a=19, b=101, and c=538 suggested by The Riddler happens to fall in one of the simplest categories since, as ⌊cb⁻¹⌋ and ⌊cb⁻¹⌋ (mod a) are equal and gcd(a,b)=1 (Lemma 2), the solution is then the same as for the pair (a,b), namely 1799. As this was quite a light puzzle, I went looking for a codegolf challenge that addressed this problem and lo and behold! found one. And proposed the condensed R function

function(a)max((1:(b<-prod(a)))[-apply(combn(outer(a,0:b,"*"),sum(!!a)),2,sum)])

that assumes no duplicates and no particular ordering in the input a. (And I learned about combn from Robin.) It is of course very inefficient, to the point of crashing R, to look at the upper bound

\prod_{i=1}^n a_i \qquad (1)

for the Frobenius number since

\min_{(i,j);\text{gcd}(a_i,a_j)=1} (a_i-1)(a_j-1) \qquad (2)

is already an upper bound, by Sylvester’s formula. But coding (2) would alas take much more space…
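Going back to the “basic R code” mentioned above, here is a readable (and still inefficient) brute-force sieve that recovers both numerical values of the post; the function name and the use of the product of the first two denominations as an upper bound (valid when these two are coprime) are my own choices, not the golfed solution’s:

# readable brute-force sieve (not the golfed function above): mark every amount up
# to an upper bound that is reachable with the given denominations and return the
# largest unreachable one; assumes the first two denominations are coprime, so that
# their product bounds the Frobenius number
frobenius <- function(coins) {
  bound <- coins[1] * coins[2]
  reach <- rep(FALSE, bound + 1)    # reach[x + 1] is TRUE iff the amount x is representable
  reach[1] <- TRUE                  # the amount 0 is always representable
  for (x in 0:bound)
    if (reach[x + 1])
      for (co in coins)
        if (x + co <= bound) reach[x + co + 1] <- TRUE
  max(which(!reach)) - 1            # largest non-representable amount
}
frobenius(c(19, 538))        # 9665, the Sylvester value 19*538-19-538
frobenius(c(19, 101, 538))   # 1799, the same as for the pair (19, 101)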

ABC-SAEM

Posted in Books, Statistics, University life on October 8, 2019 by xi'an

In connection with the recent PhD thesis defence of Juliette Chevallier, in which I took a somewhat virtual part, being physically in Warwick, I read a paper she wrote with Stéphanie Allassonnière on stochastic approximation versions of the EM algorithm. Computing the MAP estimator can be done via versions of EM adapted to simulated annealing, possibly using MCMC, as for instance in the Monolix software and its MCMC-SAEM algorithm. Where SA stands sometimes for stochastic approximation and sometimes for simulated annealing, the algorithm originally developed by Gilles Celeux and Jean Diebolt and later reframed by Marc Lavielle and Eric Moulines [friends and coauthors]. With an MCMC step because the simulation of the latent variables involves an intractable normalising constant. (Contrary to this paper, Umberto Picchini and Adeline Samson proposed in 2015 a genuine ABC version of this approach, in a paper that I thought I had missed, although I now remember discussing it with Adeline at JSM in Seattle. In their version, ABC is used as a substitute for the conditional distribution of the latent variables given data and parameter, that is, for the Q step of the (SA)EM algorithm. One more approximation step and one more simulation step and we would reach a form of ABC-Gibbs!) In the paper under discussion, very few assumptions are made on the approximating sequence, except that it must converge with the iteration index to the true distribution (for a fixed observed sample) if convergence of ABC-SAEM is to happen. The paper takes as an illustrative sequence a collection of tempered versions of the true conditionals, but this is quite formal, as I cannot fathom a setting where simulating from the tempered version would be feasible while simulating from the untempered one would not. It is thus much more a version of tempered SAEM than one truly connected with ABC (although a genuine ABC-EM version could be envisioned).
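To fix ideas on the stochastic approximation side, here is a toy SAEM sketch in R for a latent Gaussian model, where an exact conditional draw of the latent variables stands in for the MCMC or ABC simulation discussed above; the model, step sizes, and code are purely illustrative and are not taken from the paper:

# toy SAEM illustration (not the algorithm of the paper): latent z_i ~ N(mu, 1),
# observed x_i ~ N(z_i, 1), and the target is the MLE of mu (which is mean(x) here)
set.seed(1)
n <- 100
x <- rnorm(n, mean = 2, sd = sqrt(2))   # marginally x_i ~ N(mu, 2) with mu = 2
mu <- 0; s <- 0
for (k in 1:500) {
  z <- rnorm(n, mean = (x + mu) / 2, sd = sqrt(1 / 2))  # exact conditional draw; an MCMC
                                                        # or ABC draw would replace this line
  gam <- 1 / k                          # decreasing stochastic approximation step size
  s <- s + gam * (mean(z) - s)          # SA update of the complete-data sufficient statistic
  mu <- s                               # M step: complete-data MLE of mu
}
c(mu, mean(x))                          # the two should nearly coincide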