Archive for Cornell University

monomial representations on Netflix

Posted in Books, Kids, pictures, Travel on February 16, 2021 by xi'an

When watching the first episode of The Queen’s Gambit, on the recommendation of my son, I glimpsed the cover of a math thesis defended at Cornell by the mother of the main character, prior to 1957, the year of her death! Searching a wee bit further, I found that there exists an actual thesis with this very title, albeit defended by Stephen Stanley in 1998 at the University of Birmingham, that is, Birmingham, UK [near Coventry]. Apart from this amusing piece of trivia, I also enjoyed watching the first episodes of the series, the main actor being really outstanding in her acting, and the plot unfolding rather nicely, except for the chess games, which are unrealistically hurried, presumably because watching people think is anathema on TV! The representation of the misogyny of the time is however most realistic (I presume!) and definitely shocking. (The first competition game that Beth Harmon loses is somewhat disappointing, as failing to predict a Queen exchange is implausible at this level…) However, the growing self-destructive behaviour of Beth made me cringe to the point of stopping the series. The early episodes also reminded me of the days when my son started playing chess with me, winning on a regular basis, then joined a Saturday chess club nearby, was moved to the adult section within a few weeks, and… stopped altogether a few weeks later as he (mistakenly) thought the older players were making fun of him! He never got to any competitive level but still plays on a regular basis and thrashes me just as regularly. Coincidence or not, the Guardian had a “scandalous” chess story to relate last week, when the Dutch champion defeated the world's top two players, winning one game after having prepared the Najdorf Sicilian opening up to the 17th move! (The chess problem below is from the same article but relates to Antonio Medina v Svetozar Gligoric, Palma 1968.)

your GAN is secretly an energy-based model

Posted in Books, Statistics, University life on January 5, 2021 by xi'an

As I was reading this NeurIPS 2020 paper by Che et al., and trying to make sense of it, I came across a citation to our paper Casella, Robert and Wells (2004) on a generalized accept-reject sampling scheme where the proposal changes at each simulation, a citation that is surprising, if appreciated! But, after checking, this paper also appears as the first reference on the Wikipedia page for rejection sampling, which makes me wonder how many of those citing it have actually read it. (On the side, we mostly wrote this paper on a drive from Baltimore to Ithaca, after JSM 1999.)

“We provide more evidence that it is beneficial to sample from the energy-based model defined both by the generator and the discriminator instead of from the generator only.”

The paper seems to propose a post-processing of the output of the GAN generator, generating from the mixture of both generator and discriminator, via an (unscented) Langevin algorithm. The core idea is that, if p(.) is the true data generating process, g(.) the estimated generator and d(.) the discriminator, then

p(x) ≈ p⁰(x)∝g(x) exp(d(x))

(The approximation would be exact were the discriminator optimal.) The authors work with the latent z’s, in the GAN sense, meaning that generating pseudo-data x from g amounts to taking a deterministic transform of z, x=G(z). When considering the above p⁰, a generation from p⁰ can be seen as accept-reject with acceptance probability proportional to exp[d{G(z)}]. (On the side, Lemma 1 is the standard validation for accept-reject sampling schemes.)
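As an illustration of this accept-reject reading, here is a minimal Python sketch on a toy example. Everything in it is an assumption made for illustration and not taken from the paper: the linear generator `generator`, the quadratic logit `discriminator_logit`, and the known upper bound `D_MAX` on d, which is what turns "proportional to exp[d{G(z)}]" into an actual probability. With these choices g is N(1,4) and p⁰ is N(1,2), so the output can be checked directly.

```python
import numpy as np

D_MAX = 0.0  # assumed known upper bound on the logit d(x)

def generator(z):
    """Hypothetical generator G: deterministic transform of the latent z."""
    return 2.0 * z + 1.0  # z ~ N(0,1) gives x ~ N(1,4)

def discriminator_logit(x):
    """Hypothetical discriminator logit d(x), bounded above by D_MAX."""
    return -((x - 1.0) ** 2) / 8.0

def sample_p0(n, rng):
    """Accept-reject draws from p0(x) proportional to g(x) exp(d(x))."""
    accepted = []
    while len(accepted) < n:
        z = rng.standard_normal(n)           # latent proposals
        x = generator(z)                     # pseudo-data x = G(z)
        u = rng.uniform(size=n)
        # accept with probability exp(d(G(z)) - D_MAX), which is <= 1
        keep = u < np.exp(discriminator_logit(x) - D_MAX)
        accepted.extend(x[keep])
    return np.array(accepted[:n])

rng = np.random.default_rng(0)
draws = sample_p0(20_000, rng)
print(draws.mean(), draws.var())  # should be close to 1 and 2
```

In a realistic GAN no such bound on d is likely to be available, and the acceptance rate would presumably collapse in high dimension, which is one way to motivate a Langevin move on the latent z instead.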

Reading this paper made me realise how much the field had evolved since my previous GAN-related read, with directions like Metropolis-Hastings GANs and Wasserstein GANs. (And I noticed a “broader impact” section past the conclusion section about possible misuses with societal consequences, which is a new requirement for NeurIPS publications.)

Grand Central Terminal

Posted in Books, pictures, Travel on April 22, 2020 by xi'an

admissible estimators that are not Bayes

Posted in Statistics on December 30, 2017 by xi'an

A question that popped up on X validated made me search a little while for point estimators that are both admissible (under a certain loss function) and not generalised Bayes (under the same loss function), before asking Larry Brown, Jim Berger, or Ed George. The answer came through Larry’s book on exponential families, with the two examples attached. (Following our 1989 collaboration with Roger Farrell at Cornell U, I knew about the existence of testing procedures that were both admissible and not Bayes.) The most surprising feature is that the associated loss function is strictly convex, as I would have thought that a less convex loss would have helped in finding such counter-examples.

machine learning and the future of realism

Posted in Books, Kids, Statistics, University life on May 4, 2017 by xi'an

Giles and Cliff Hooker arXived a paper last week with this intriguing title. (Giles Hooker is an associate professor of statistics and biology at Cornell U, with an interesting blog on the notion of models, while Cliff Hooker is a professor of philosophy at Newcastle U, Australia.)

“Our conclusion is that simplicity is too complex”

The debate in this short paper is whether or not machine learning relates to a model. Or is it concerned with sheer (“naked”) prediction? And then does it pertain to science any longer?! While it sounds obvious at first, defining why science is more than prediction of effects given causes is much less obvious, although prediction sounds more pragmatic and engineer-like than scientific. (Furthermore, prediction has a somewhat negative flavour in French, being used as a synonym for divination and opposed to prévision.) In more philosophical terms, prediction offers no ontological feature. As for a machine learning structure like a neural network being scientific or a-scientific, its black box nature makes it much more the latter than the former, in that it brings no explanation for the connection between input and output, between regressed and regressors. It further lacks the potential for universality of scientific models. For instance, as mentioned in the paper, Newton’s law of gravitation applies to any pair of massive bodies, while a neural network built on a series of observations could not be assessed or guaranteed outside the domain where those observations are taken. Plus, it would miss the simple inverse-square law established by Newton. Most fascinating questions, undoubtedly! And ones that put the stress on models from a perspective totally different from last week's at the RSS.
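The point about extrapolation can be illustrated with a small numerical sketch. It is only a stand-in and not an example from the paper: a degree-5 polynomial plays the role of the black-box learner (an assumption made to keep the code short, a neural network would need more machinery), units are chosen so that G·m₁·m₂ = 1, and the training distances are restricted to [1, 2].

```python
import numpy as np

def newton_force(r):
    """Inverse-square law, in units where G * m1 * m2 = 1."""
    return 1.0 / r ** 2

# "observations" taken only for distances between 1 and 2
r_train = np.linspace(1.0, 2.0, 50)

# flexible black-box fit: a degree-5 polynomial (stand-in for a learner)
fit = np.polynomial.Polynomial.fit(r_train, newton_force(r_train), deg=5)

def max_abs_error(r):
    """Largest absolute gap between the fit and the true law over r."""
    return float(np.max(np.abs(fit(r) - newton_force(r))))

in_range_error = max_abs_error(np.linspace(1.0, 2.0, 200))
out_of_range_error = max_abs_error(np.linspace(5.0, 10.0, 200))

print("max error inside the training domain: ", in_range_error)
print("max error outside the training domain:", out_of_range_error)
```

The fit is excellent where the observations were taken and useless outside, while the one-line law `newton_force` holds at every distance, and nothing in the fitted coefficients hints at the inverse square.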

As for machine learning being a challenge to realism, I am none the wiser after reading the paper. Utilising machine learning tools to produce predictions of causes given effects does not seem to modify the structure of the World, and modifies our understanding of it very little, since these tools do not bring explanation per se. What would lead to anti-realism is the adoption of those tools as substitutes for scientific theories and models.