reading classics (#1)
This year, a lot of my Master's students (plus all of my PhD students) registered for the Reading Classics Seminar course, so we should spend half of the year going through those “classics” and have lively discussions thanks to the size of the group. The first student to present a paper, Céline Beji, chose Hartigan and Wong’s 1979 K-Means Clustering Algorithm paper in JRSS C. She did quite well, especially considering she had two weeks to learn Beamer in addition to getting through the paper! She also managed to find an online demo of the algorithm. Here are her slides.
This was not the easiest paper on the list, by far: it is short, mostly algorithmic, and requires some background on why clustering was of interest and on how the paper impacted the field. Tellingly, the discussion with the class then focussed on the criterion rather than on the algorithm itself. In a sense, this is the most striking feature of the paper, namely that it is completely a-statistical in picking a criterion to minimise. There is neither randomness nor error involved at this stage; it is simply an extended least-squares approach. This is why the number of clusters (and again the class spent some time discussing this) cannot be inferred via this method: the minimised criterion can only decrease as clusters are added. An auspicious start to the course!
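To make the point about the number of clusters concrete, here is a minimal sketch in Python. It is not Hartigan and Wong's algorithm: it is a brute-force search on made-up one-dimensional data, relying on the fact that in one dimension the optimal groups are contiguous once the points are sorted. The function names `wss` and `best_criterion` and the data are my own illustrative choices.

```python
from itertools import combinations


def wss(cluster):
    """Sum of squared deviations of one cluster's points from their mean."""
    mean = sum(cluster) / len(cluster)
    return sum((x - mean) ** 2 for x in cluster)


def best_criterion(points, k):
    """Smallest total within-cluster sum of squares over all ways of cutting
    the sorted 1-D points into k contiguous, non-empty groups (brute force)."""
    xs = sorted(points)
    n = len(xs)
    best = float("inf")
    # Choose k - 1 cut positions among the n - 1 gaps between sorted points.
    for cuts in combinations(range(1, n), k - 1):
        bounds = (0, *cuts, n)
        total = sum(wss(xs[a:b]) for a, b in zip(bounds, bounds[1:]))
        best = min(best, total)
    return best


# Hypothetical data with three visible groups.
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9, 9.1, 9.0]

criteria = [best_criterion(points, k) for k in range(1, len(points) + 1)]
for k, value in enumerate(criteria, start=1):
    print(k, round(value, 3))
```

The printed criterion keeps shrinking as k grows and hits zero when every point is its own cluster, so minimising it over k would always pick the largest k available. Choosing the number of clusters needs something outside the least-squares criterion, such as a penalty or a probabilistic model.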