This is the next chapter of my Statistics course, definitely more standard, with some notions on statistical models, limit theorems, and exponential families. In the first class, I recalled the convergence notions with no proofs but with counterexamples, and spent some time on a slide not included here, borrowed from Chris Holmes’ talk last Friday, on the linear relation between blood pressure and the log odds ratio of a heart condition. This was a great example, illustrating both the power of increasing the number of observations and that of using a logistic regression model. Students kept asking questions about it.
This summer, for the first time, I took three Dauphine undergraduate students into research projects, thinking they had had enough R training (with me!) and enough stats classes to undertake such projects. In all cases, the concept was pre-defined and “all they had to do” was to run a massive flow of simulations in R (or whatever language suited them best!) to check whether or not the idea was sound. Unfortunately, for two of the projects, by the end of the summer, we had not made any progress in any of the directions I wanted to explore… Despite a fairly regular round of meetings and emails with those students. In one case, the student had not even managed to reproduce the (fairly innocuous) method I wanted to improve upon. In the other case, despite programming inputs from me, the outcome was impossible to trust. A mostly failed experiment, which makes me wonder why it went that way. Granted, those students had no earlier training in research, either in exploiting the literature or in pushing experiments towards logical extensions. But I gave them entry points, discussed those possible new pathways with them, and kept updating schedules and work-charts. And the students were volunteers with no other incentive than discovering research (I even had two more candidates in the queue). So it may be (based on this sample of 3!) that our local training system is lacking in this respect, somewhat failing to promote critical thinking and innovation by imposing too many contact hours and by evaluating the students only through standard formalised tests. I do wonder, as I regularly see [abroad] undergraduate internships and seminars advertised in the stats journals. Or even conferences.
Here are my R midterm exams, version A and version B, in English (two versions as students are sitting next to one another in the computer rooms), on simulation methods for my undergrad exploratory statistics course. Nothing particularly exciting or innovative! Dedicated ‘Og’s readers may spot a few Le Monde puzzles in the lot…
Two rather entertaining if mundane occurrences related to this R exam: one hour prior to the exam, a student came to my office to beg to be allowed to bring the solution manual with her (as those midterm exercises are actually picked from an exercise booklet, some students cooperated towards producing a complete solution manual, and this within a week!), kind of missing the main point of having an exam. (I have not yet seen this manual, but I’d be quite interested in checking the code they produced on that occasion…) During the exam, another student asked me what the R command was to turn any density into a random generator: he had written a density function called mydens and could not fathom why rmydens(n) was not working. The same student later called me as his computer was “stuck”: he was not aware that a “+” prompt on the command line meant R was waiting for him to complete the command… A less comical event, which ended well, is that a student failed to save her R code (periodically and) at the end of the exam, and, as only the .RData file was available at first, we had to dig very deep into the machine to salvage her R commands from the safeguard copies rkward keeps in \tmp. I am glad we found them before turning the machine off, otherwise they would have been lost.
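There is of course no generic command that turns a density into a generator; a sampler has to be built. One possible route is accept-reject with a uniform proposal, sketched below in Python rather than R. Only the names mydens and rmydens come from the anecdote above: the Beta(2,2) density, the support [0,1], and the bound 1.5 are assumptions made up for the illustration, and the method requires a density that is bounded on a bounded support.

```python
import random


def mydens(x):
    """Hypothetical target density: Beta(2, 2), i.e. 6 x (1 - x) on [0, 1]."""
    return 6.0 * x * (1.0 - x) if 0.0 <= x <= 1.0 else 0.0


def rmydens(n, dens=mydens, lower=0.0, upper=1.0, bound=1.5, rng=None):
    """Draw n values from dens by accept-reject with a uniform proposal.

    Assumes dens is zero outside [lower, upper] and never exceeds bound
    (1.5 is the maximum of the Beta(2, 2) density, reached at x = 0.5).
    """
    rng = rng or random.Random()
    sample = []
    while len(sample) < n:
        x = rng.uniform(lower, upper)  # candidate from the uniform proposal
        u = rng.uniform(0.0, bound)    # uniform height under the flat envelope
        if u <= dens(x):               # keep the candidate if it falls under the density
            sample.append(x)
    return sample


draws = rmydens(10_000, rng=random.Random(42))
mean = sum(draws) / len(draws)  # should be close to 0.5, the Beta(2, 2) mean
```

With this envelope, about two candidates in three are accepted (the ratio of the area under the density, 1, to the area of the box, 1.5); a tighter proposal would waste fewer draws.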
Arthur Charpentier (from the awesome Freakonometrics) pointed out to me these two blog posts about teaching statistics. One by Meg Dillon about the joy of teaching statistics in France, of all places!, entitled Statistics à la Mode. And another one by Douglas Andrews commenting on the first one, entitled the Big Mistake: teaching stat as though it was math… (It appeared on an ASA community blog/forum I did not know about.)
“…there is almost invariably a peculiar pair of caveats presented as from on high: Never accept the alternative hypothesis, and never say the probability is 0.95 that the mean lies in a 95% confidence interval for the mean.” Meg Dillon, After Math
Both posts managed to bemuse me (this is not going to be a very coherent post, I am afraid!): the first one because it has this condescending tone of pure mathematicians about statistics, or at least about statistics courses (i.e. “anyone can teach statistics!” mixed with “I hate teaching statistics!”), that I meet too often, esp. on this side of the pond. Plus it seemed to miss the fundamental distinction between probability and statistics (check the above quote). And it did not say why the contents of the French course were much nicer than those of the equivalent course designed by Meg Dillon at her university (except for the fact that she could use measure theory from the start). Maybe the French idiosyncrasy the author basks in is the fact that statistics is not recognised as a field in French universities (there is no stat department, for instance) but is instead a subfield of mathematics…
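The second caveat in the quote is easy to illustrate by simulation: the 0.95 is a statement about the procedure over repeated samples, while any single realised interval either contains the fixed mean or does not. A minimal sketch in Python, assuming normal data with known variance; the values of mu, sigma, n, and the number of replications are arbitrary choices made for the illustration.

```python
import random
import statistics


def coverage(n_rep=5000, n=30, mu=1.0, sigma=2.0, level=0.95, seed=1):
    """Fraction of repeated samples whose known-variance interval covers mu."""
    rng = random.Random(seed)
    # Standard normal quantile for a two-sided interval at the given level
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(n_rep):
        # New sample, hence a new (random) interval around the sample mean
        xbar = statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / n_rep


rate = coverage()  # long-run frequency, expected to be near 0.95
```

The randomness sits in the interval endpoints, not in mu, which is the probability-versus-statistics distinction at stake here.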
“…stat is a different intellectual discipline. She longs for a so-called stat course based on sigma-algebras and probability spaces. Well, that has been tried many times over many years, and it fails miserably at helping students understand the important stat concepts.” Douglas Andrews, ASA Blog Viewer
The second post makes sense in stressing that stat is not math. (Or rather, as it should have been stated, it is not only math.) And that (non-statistician) mathematicians should get some preliminary training in, or exposure to, real data when teaching statistics courses. I can certainly remember a few of my (French) stat teachers who had never approached data in their whole life! However, the comment that “foundation of stat is in empirical science and in learning from observed data, not in math” seems to go overboard. It is the mirror image of the complaint from the math teacher that intro statistics courses were “a hodgepodge of recipes” with no mathematical backbone. My feeling there is that, while we certainly do not need measure theory for the earliest statistics courses (Riemann integration is good enough for my second and third year students), we have to anchor statistical techniques into a mathematical bed to avoid them looking like a bag of tricks. I remember being puzzled, after my first (mathematical) statistics course, by the lack of direction and/or the multiplicity of methods, when compared with a standard math course. I was missing the decision-theoretic part that was to come later! Had I been exposed to a non-mathematical intro stat course, I do not think I would have persevered in this field! (And I would have moved to differential geometry instead…)