Archive for independence

haunting of tram car 015 [book review]

Posted in Statistics with tags , , , , , , , , , , , on April 20, 2019 by xi'an

A mix of steampunk and urban magic in an enlightened 1912 Cairo sounded like a good prolegomenon and I bought P. Djèli Clark’s The haunting of tram car 015 on this basis. As it happens, this is actually a novella of 123 pages building on the same universe as a previous work of the author, A dead djinn in Cairo, which however is even shorter and only available as a Kindle book… I really enjoyed the short read and its description of an alternate Cairo that is competing with Paris and London, thanks to the advantage brought by the supernatural powers of djinns. (And apparently also gaining the independence Egypt could not secure under the British protectorate.) The English suffragettes also have their counterparts in Egypt and the country is about to decide on women’s right to vote. The story itself is nice if not stratospheric, with mostly well-drawn characters and good dialogues. (The core of the plot relies on smuggling sweets from Armenia, though, a rather weak link.) As in an earlier order, the book itself was not properly printed, with a vertical white band of erased characters on most odd pages, presumably another illustration of the shortcomings of the print-on-demand principle. (Which means that I sent the book back to Amazon rather than leaving it in the common room.)

revisiting the Gelman-Rubin diagnostic

Posted in Books, pictures, Statistics, Travel, University life with tags , , , , , , , , , , , , , on January 23, 2019 by xi'an

Just before Xmas, Dootika Vats (Warwick) and Christina Knudson arXived a paper on a re-evaluation of the ultra-popular 1992 Gelman and Rubin MCMC convergence diagnostic. Which compares within-variance and between-variance on parallel chains started from hopefully dispersed initial values. Or equivalently an under-estimating and an over-estimating estimate of the variance of the MCMC average. In this paper, the authors take advantage of the variance estimators developed by Galin Jones, James Flegal, Dootika Vats and co-authors, which are batch mean estimators consistently estimating the asymptotic variance. They also discuss the choice of a cut-off on the ratio R of variance estimates, i.e., how close to one it needs to be. Relating R to the effective sample size (for which we also have reservations) gives another way of calibrating this cut-off. The main conclusion of the study is that the recommended 1.1 bound is too large for a reasonable proximity to the true value of the Bayes estimator. (Disclaimer: The above ABCruise header is unrelated with the paper, apart from its use of the Titanic dataset!)
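For concreteness, here is a minimal sketch of the original 1992 statistic, the ratio of the pooled (over-estimating) to the within-chain (under-estimating) variance estimate. (This is the classical version, not the batch-means construction of the paper, and the chain setup below is purely illustrative.)

```python
import numpy as np

def gelman_rubin(chains):
    """Classical (1992) potential scale reduction factor for m parallel
    chains of length n, stored as rows, for a scalar functional."""
    m, n = chains.shape
    W = chains.var(axis=1, ddof=1).mean()        # mean within-chain variance
    B = n * chains.mean(axis=1).var(ddof=1)      # between-chain variance
    var_plus = (n - 1) / n * W + B / n           # pooled (over-)estimate
    return np.sqrt(var_plus / W)                 # R: ratio of the two estimates

rng = np.random.default_rng(0)
mixed = rng.normal(size=(4, 5000))               # four chains already at target
stuck = mixed + np.array([[-3.], [0.], [0.], [3.]])  # chains in separate modes
print(gelman_rubin(mixed), gelman_rubin(stuck))  # near 1 vs. well above 1.1
```

The cut-off question is then visible in the code: `mixed` returns a value barely above 1, `stuck` a much larger one, and the paper's point is where in between the alarm threshold should sit.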

In fact, I have other difficulties than setting the cut-off point with the original scheme as a way to assess MCMC convergence or lack thereof, among which

  1. its dependence on the parameterisation of the chain and on the estimation of a specific target function
  2. its dependence on the starting distribution which makes the time to convergence not absolutely meaningful
  3. the confusion between getting to stationarity and exploring the whole target
  4. its missing the option to resort to subsampling schemes to attain pseudo-independence or scale time to convergence (albeit see 3. above)
  5. a potential bias brought by the stopping rule.

maximal spacing around order statistics [#2]

Posted in Books, R, Statistics, University life with tags , , , , , , , , on June 8, 2018 by xi'an

The proposed solution of the riddle from the Riddler discussed here a few weeks ago is rather approximate, in that the distribution of

\Delta_n=\max_i\,\min_{j\ne i}\,|X_{i}-X_{j}|

when the n-sample is made of iid Normal variates is (a) replaced with the distribution of one arbitrary minimum and (b) the distribution of the minimum is based on an assumption of independence between the absolute differences. Which does not hold, as shown by the above correlation matrix (plotted via corrplot) for N=11 and 10⁴ simulations. One could think that this correlation decreases with N, but it remains essentially 0.2 for larger values of N. (On the other hand, the minima are essentially independent.)
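The claimed dependence is quick to check by simulation. The sketch below (in NumPy rather than the R/corrplot code behind the post's plot, and taking X₁ as the arbitrary reference point) estimates the correlations between the absolute differences |X₁−X_j|:

```python
import numpy as np

rng = np.random.default_rng(1)
N, M = 11, 10_000                    # sample size and number of replications
x = rng.normal(size=(M, N))          # M iid samples of N standard Normals

# absolute differences to the first point, |X_1 - X_j| for j = 2, ..., N
diffs = np.abs(x[:, :1] - x[:, 1:])

# their (N-1) x (N-1) correlation matrix, as displayed by corrplot
corr = np.corrcoef(diffs, rowvar=False)
off_diag = corr[~np.eye(N - 1, dtype=bool)]
print(off_diag.mean())               # hovers around 0.2, hence no independence
```

Indeed, the pairs (X₁−X_j, X₁−X_k) are bivariate Normal with correlation ½ through the shared X₁, which keeps the correlation of the absolute values near 0.2 whatever N.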

Imperial postdoc in Bayesian nonparametrics

Posted in pictures, R with tags , , , , , , , , on April 27, 2018 by xi'an

Here is another announcement for a post-doctoral position in London (UK) to work with Sarah Filippi. In the Department of Mathematics at Imperial College London. (More details on the site or in this document. Hopefully, the salary is sufficient for staying in London, if not in South Kensington!)

The post holder will work on developing a novel Bayesian Non-Parametric Test for Conditional Independence. This is at the core of modern causal discovery, itself of paramount importance throughout the sciences and in Machine Learning. As part of this project, the post holder will derive a Bayesian non-parametric testing procedure for conditional independence, scalable to high-dimensional conditioning variables. To ensure maximum impact and allow experimenters in different fields to easily apply this new methodology, the post holder will then create an open-source software package available on the R statistical programming platform. In doing so, the post holder will investigate applying this approach to real-world data from our established partners, who have a track record of informing national and international bodies such as Public Health England and the World Health Organisation.

Scottish polls…

Posted in pictures, Statistics, Travel with tags , , , , , , , , on September 11, 2014 by xi'an

Hillhead Street from the Great Western Road, Glasgow westside, Apr. 20, 2012

As much as I love Scotland, or because of it, I would not dream of suggesting to Scots that one side of the referendum sounds better than the other. However, I am rather annoyed at the yoyo-like reactions to the successive polls about the result, because, just like during the US elections, each poll is analysed separately rather than being pooled with the earlier ones in a reasonable meta-analysis… Where is Nate Silver when we need him?!

Sufficiency [BC]

Posted in Books, Statistics with tags , , , , on May 10, 2011 by xi'an

Here is an email I received about The Bayesian Choice a few days ago:

I am an undergraduate student in Japan. I am self-studying your classical book The Bayesian Choice. The book is wonderful with many instructive examples. Although it is a little bit hard for me right now, I think it will be very useful for my future research.

There is one point that I do not understand in Example 1.3.2 (p.14-15). I know a standard result that the sample mean and sample variance are independent, with the sample mean following

\mathcal{N}(\mu,(1/n)\sigma^2)

while s^2/\sigma^2 follows a chi-square with n-1 degrees of freedom. In this example, is it correct that one must factorize the likelihood function into g(T(x)|\theta), which must be the product of these two normal and chi-square densities, and h(x|T(x)), which is free of \theta?

In the book I do not see why g(T(x)|\theta) is the product of normal and chi-square densities. The first part correctly corresponds to the density of \mathcal{N}(\mu,(1/n)\sigma^2). But the second part is not the density of the chi-square with n-1 degrees of freedom of s^2/\sigma^2.

The example, as often, skips a lot of details, meaning that when one starts from the likelihood

\sigma^{-n} e^{-(\bar x-\theta)^2 n/2\sigma^2} \, e^{-s^2/2\sigma^2} / (2\pi)^{n/2},

this expression only depends on T(x). Furthermore, it involves the normal density on \bar x and part of the chi-square density on s^2. One can then plug in the missing power of s^2 to make g(T(x)|\theta) appear. The extra terms are then canceled by a function we can call h(x|T(x)). However, there is a typo in this example in that \sigma^n in the chi-square density should be \sigma^{n-1}!
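Spelling the factorisation out (a sketch, writing T(x)=(\bar x,s^2) with s^2=\sum_i(x_i-\bar x)^2), the sufficient part reads

g(T(x)|\theta)=\underbrace{\sqrt{\frac{n}{2\pi\sigma^2}}\,e^{-n(\bar x-\theta)^2/2\sigma^2}}_{\text{density of }\bar x}\times\underbrace{\frac{(s^2)^{(n-3)/2}}{2^{(n-1)/2}\Gamma\left(\frac{n-1}{2}\right)\sigma^{n-1}}\,e^{-s^2/2\sigma^2}}_{\text{density of }s^2}

where the second factor indeed carries \sigma^{n-1} rather than \sigma^n, since the density of s^2 is the \chi^2_{n-1} density of s^2/\sigma^2 divided by \sigma^2. Multiplying and dividing the likelihood by (s^2)^{(n-3)/2} and the normalising constants produces this g and leaves a remainder h(x|T(x)) free of \theta.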