A few days ago, I got an anonymous comment complaining about my tendency to post pictures “no one is interested in” on the ‘Og and suggesting I move them to another electronic medium like Twitter or Instagram, so as to spare readers from sorting through the blog entries for the statistics-related ones, to separate the wheat from the chaff… While my first reaction was (unsurprisingly) one of irritation, a more constructive one is to point out to all (un)interested readers that they can always subscribe by RSS to the Statistics category (and skip the chaff), just like R bloggers only posts my R-related entries. Now, if more of the ‘Og’s readers find the presumably increasing flow of pictures a nuisance, just let me know and I will try to curb this avalanche of pixels… Not certain that I will succeed, though!
Archive for blogging
But we are undefeated
The WordPress.com stats helper monkeys prepared a 2014 annual report for the ‘Og…
.. and among the collected statistics for 2014, what I found most amazing are the three accesses from Greenland and the one access from Afghanistan!
Click here to see the complete report. (Assuming you have nothing better to do on Boxing Day…)
When reading an entry on The Chemical Statistician stating that a sample median could often be a choice for a sufficient statistic, my attention was caught, as I had never thought a median could be sufficient. After thinking a wee bit more about it, and even posting a question on Cross Validated without getting an immediate answer, I came to the conclusion that medians (and other quantiles) cannot be sufficient statistics for arbitrary (large enough) sample sizes (a condition that excludes the obvious cases of one & two observations where the sample median equals the sample mean).
In the case when the support of the distribution does not depend on the unknown parameter θ, we can invoke the Darmois-Pitman-Koopman theorem, namely that the density of the observations is necessarily of the exponential family form,
$$f(x|\theta) = h(x)\,\exp\{\theta^{\mathsf{T}} S(x) - \psi(\theta)\},$$
to conclude that, if the natural sufficient statistic
$$S(x_1,\ldots,x_n) = \sum_{i=1}^n S(x_i)$$
is minimal sufficient, then S, being a function of every sufficient statistic, would have to be a function of the median, which is impossible since modifying an extreme in the n>2 observations modifies S but not the median.
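The last step of this argument can be checked numerically. The sketch below is only an illustration, assuming the simplest possible natural statistic, S(x) = x (a choice made here for convenience, not taken from the post): moving the largest of n = 5 observations changes the sum but leaves the median where it was, so the sum cannot be recovered from the median.

```python
import statistics

def natural_statistic(sample, s=lambda x: x):
    """Sum of s(x_i) over the sample; s defaults to the identity
    (an illustrative assumption, not the post's specific family)."""
    return sum(s(x) for x in sample)

sample = [0.3, 1.1, 1.9, 2.4, 5.0]
# Move only the largest observation further out.
perturbed = sample[:-1] + [50.0]

median_before = statistics.median(sample)
median_after = statistics.median(perturbed)
s_before = natural_statistic(sample)
s_after = natural_statistic(perturbed)

print("medians:", median_before, median_after)
print("sums:   ", s_before, s_after)
```

Two samples sharing the same median but with different values of S are enough to rule out S being a function of the median.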
In the other case when the support does depend on the unknown parameter θ, we can consider the case when
$$f(x|\theta) = h(x)\,\mathbb{I}_{A_\theta}(x)\,\tau(\theta)$$
where the set $A_\theta$ indexed by θ is the support of f. In that case, the factorisation theorem implies that
$$\prod_{i=1}^n \mathbb{I}_{A_\theta}(x_i)$$
is a 0-1 function of the sample median. Adding a further observation y⁰ which does not modify the median then leads to a contradiction since it may be in or outside the support set.
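To make the contradiction concrete, here is a small sketch under an assumption of my own choosing, namely a Uniform(0, θ) model, whose support (0, θ) depends on θ. Two samples with the same median can give different values of the product of support indicators, so that product cannot be a function of the median alone.

```python
import statistics

def support_indicator(sample, theta):
    """Product of the indicators 1{0 < x_i < theta}, i.e. 1 if every
    observation lies in the assumed Uniform(0, theta) support, else 0."""
    return int(all(0 < x < theta for x in sample))

theta = 1.0
inside = [0.2, 0.5, 0.9]    # all observations within (0, theta)
outside = [0.2, 0.5, 3.0]   # same median, but one observation beyond theta

same_median = statistics.median(inside) == statistics.median(outside)
ind_inside = support_indicator(inside, theta)
ind_outside = support_indicator(outside, theta)

print("same median:", same_median)
print("indicator products:", ind_inside, ind_outside)
```

In this assumed model the indicator product is driven by the largest observation rather than by the median.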
Incidentally, if an aside, when looking for examples, I played with the distribution
which has θ as its theoretical median if not mean. In this example, not only is the sample median not sufficient (the only sufficient statistic is the order statistic, and rightly so since the support is fixed and the distributions are not in an exponential family), but the MLE is also different from the sample median. Here is an example with n=30 observations, the sienna bar being the sample median:
The editors of a new blog entitled Marauders of the Lost Sciences (Learn from the giants) sent me an email to signal the start of this blog with a short excerpt from a giant in maths or stats posted every day:
There is a new blog I wanted to tell you about which excerpts one interesting or classic paper or book a day from the mathematical sciences. We plan on daily posting across the range of mathematical fields and at any level, but about 20-30% of the posts in queue are from statistics. The goal is to entice people to read the great works of old. The first post today was from an old paper by Fisher applying Group Theory to the design of experiments.
Interesting concept, which will hopefully generate comments to put the quoted passage into context. Somewhat connected to my Reading Statistical Classics posts. Which, incidentally, I had first and sadly announced would not take place this year since only two students had registered, but which should take place in the end since more students registered! (I am unsure about the references behind the title of that blog, besides Spielberg’s Raiders of the Lost Ark and Norman’s Marauders of Gor… I just hope Statistics does not qualify as a lost science!)
As I was hurriedly trying to cram several ‘Og posts into a conference paper (!), I looked around for a way of including Unicode characters straight away. And found this solution on StackExchange:
which just suited me fine!