[This review is under construction]
This was the kind of problem that motivated me to look for a full course of study that would give me the background that's missing.
In September 2011 I started a graduate certificate in statistics, at Sheffield (my wife's alma mater, coincidentally). preparatory course, which gets you ready to start an MSc in statistics. It's a nine-month affair with three courses (Math, Probability, Statistics) which are taught more or less in parallel (they cleverly stagger the assignment submissions so that one is only working on one of the three at any one time). The course guide says it's a time commitment of 15 hours a week, which seems amazingly little (imagine that you devote the last three hours of your day to working on something like this---you could easily exceed 15 hours a week). But it's a realistic number; I think I spent about that much time every week on it. Occasionally I have spent more (when I ran into trouble).
This is an incomplete review; I am working on it as I do the course.
Short version: the course is *excellent*. I wish I had done this years ago. But I would recommend it only to people who are really willing to sweat it out. You have to be able to do your own research when you need more detail on a particular topic; I am not sure all my classmates in this course are in full control of the information-search process, esp. as regards the use of R (the course is R based). I don't blame them though; R can be scarily obscure.
Long version:
Preparation for the course: I wish I had reviewed permutations and combinations, and trigonometry before starting the course. I found it the hardest to recall these things while working through the course. I would recommend the following sites for trig:
http://en.wikipedia.org/wiki/Trig_identities
http://www.mecmath.net/trig/
For permutations and combinations, this is badly formatted but it could be useful: http://www.bmlc.ca/PureMath30/Pure%20Math%2030%20-%20Permutations%20and%20Combinations.html
The assignments:
Each assignment I did (a total of 18 to be assigned by the time the course ends; only the last 15 are graded) took a lot of time, and if you are typesetting the submission in LaTeX, you should expect a full day to be devoted to that (at least). I gave up typesetting---to my great regret---in the math and probability courses after assignment 2 because it takes too long.
Most of the questions directly use knowledge that was recently covered in the course, but some of the questions require some thought. There was one situation where one had to provide a closed form solution to a finite geometric series that was something like \sum n+1 r^n. There's a really nifty trick in the book Concrete Math by Knuth and colleagues, that shows you how to solve that (the key observation is that n+1 r^n is the differential of r^{n+1}), but if I hadn't read that book long ago I would probably not hit on the solution... The solution provided by the instructor for that problem was simpler and seemed magical, I have yet to study it carefully.
I wish the assignments were prepared with greater attention to precision in writing. A couple of times I had a hard time trying to figure out what precisely was meant, and judging from the others' questions, I was not the only one.
The grading is tough, but I kind of like that the graders are so anal retentive about giving every point. You have to respect them for that. When I saw how they grade, I felt a bit ashamed that I am so soft on my students in Potsdam. These guys are hard-asses. I guess I've gone soft after eight years in a soft-skills oriented linguistics department. Computer science and math at Ohio State was about as strict.
My grades so far are below (percentages). This excludes the first assignments; they do not count for the final grade. In general, these assignments count for 20% of the final grade. You have to get minimum 70% overall in the final grade (after the exams in June 2012) to be allowed to continue on to the MSc.
Mathematics:
Assignment-1: 100%
## Notes: I was pleasantly surprised to get this grade, but I do admit the problems were not too hard. The hardest part of this course is not the math but the probability theory.
Assignment-2: 97%
## Made a stupid mistake. Don't ask.
Assignment-3: 81%
## Here, I lost a lot of points for not generalizing my solution for a particular proof, and I made a stupid mistake in a partial derivative computation (forgot to treat y as a constant when differentiating with respect to y) that caused a snowball-effect error in the final answer, and I thought that level curves are three-dimensional and plotted them as such. All errors could have been avoided if I had carefully re-read what I'd done, but there was no time (this was also the only submission so far which was not typeset, perhaps another reason why I didn't notice the errors---when typesetting I usually discover a lot of mistakes in my solutions, esp. since I check my solutions at that stage with R or a computer algebra system like yacas).
Assignment-4:
Assignment-5:
Probability:
Assignment-1: 76%
## Notes: I did relatively badly in the first assignment because I reversed a couple of signs accidentally, and because I didn't leave enough time for working on the assignment. I lost a lot of points on a stupid and mindless exercise involving reading numbers off of a binomial probability table --- shame on me.
Assignment-2: 96%
## Notes: The instructor cut 4 percentage points when I said that a particular distribution (I discovered later that this was the Cauchy distribution) had expectation infinity. He said I should have said "expectation does not exist." He's right, of course, but it was a painful loss, given that I didn't know at that time that infinity cannot be considered to exist in this particular context. That's a painful lesson.
Assignment-3:
Assignment-4:
Assignment-5:
Statistics:
Assignment-1: 84%
Assignment-2: 82%
### It is amazing that the one thing I thought I could do---data analysis---is
## giving me the most trouble in this course. The grading is pretty harsh; for example, if you report a p=0.30 and don't say explicitly that you reject the null hypothesis, you lose points. I did make some horrible mistakes in this assignment though, so I do deserve losing some marks.
Assignment-3:
Assignment-4:
Assignment-5:
The textbooks:
One minor gripe I have in the course is some of the textbooks assigned. I think the course designers should probably invest some time into building complete course-specific lecture notes. Sometimes they do provide lectures notes, and these are great (my only complaint is that the authors don't believe in page numbers, which can be a real hassle if you print out the material and accidentally drop them on the floor). But that said, I realize that producing customized lecture notes is a major undertaking and I don't blame them for relying on existing books.
The math textbook is by Gilbert and Jordan, Guide$^2$ Mathematical Methods. The title is a bit strange, I mean the 2 for to (although the authors cannot be blamed for this naming decision---apparently Macmillan has a Guide 2... series). The book has one positive aspect: it covers the relevant material in the sense that it goes through the topics. So, for someone like me, who doesn't know exactly what I need to know as background to read more advanced textbooks on statistics, this is a good extended listing of the things I should know (or recall from high school). This is all good. What makes the reader's life hard are the super-terse proofs/solutions to exercises, and the large number of typos (especially in the solutions). The course organizers released a list of typos for the book at the start of the course, but there are even more typos than in the errata. The notation can also be sloppy and the reader has to be careful (e.g., at one point they write F-p, where F are the *names* of a function; what they meant was F(x)-p(x)).
An example of the terseness is the proof that the limit of sin(x)/x when x approaches 0 is 1. I had no idea where that proof was going until I watched Strang's online lecture (MIT open courseware). After watching Strang's lecture, I was able to unpack the proof myself, but I doubt that it could be done easily by just working through it (are proofs meant to be hard work to unpack? Read the Salle et al book on Calculus to see that the answer may be no). Gilbert and Jordan should read Knuth's book on writing mathematics. I would recommend Strang's lectures, and Calculus by Salas, Hille and Etgen (I have the 9th Edition); this last book really nailed it for me. This book has the smoothness and feel of Cormen, Rivest et al on Algorithms.
The probability theory book (Ross, A First Course in Probability) is OK in that it covers all the points. But it has the irritating property that each definition is followed by half a dozen totally unrelated examples. This in itself is not bad, but the examples are SCARY. I'm not sure I need to see the toughest applications of the latest idea learnt right away. After a bit of reading this sort of teaching-by-example at least this reader just gets depressed (there is no way I would have worked out those example answers myself after just reading the definitions provided immediately earlier). I found the online book by Jay Kerns much, much more useful for the present course---the course organizers should consider switching to that or at least assigning it alongside Ross, with some warnings to not get intimidated by Ross' style. Here is an excellent review of the Ross book on amazon that pretty much summarizes the main problems in the book.
There is an accompanying book on probability (by Freund, but authors are Miller and Miller) which is a straightforward and formal introduction to mathematical statistics; I like that more. It's part of the statistics course, not the probability course.
The statistics textbook is by Moore, McCabe and someone else, is absolutely terrible (apparently I'm not the only one complaining; see here). The book seems to be written for first year undergrads in statistics (nothing wrong with that of course, but this graduate certificate has a different audience). The large number of disconnected and silly examples (for example, for planned experiments vs observational studies) following every new concept lead to a feeling of total disorientation. There is also a painful attempt to make the book relevant to the modern user: examples about cell phones and iPhone Apps abound, presumably to draw in the young reader's eyes away from the cell phone as they read.
There are so many really good books on introductory statistics using R (e.g., in the Use R series); I wish they had used one of them. As it is, you have to be either real good at R, or be able to quickly get on board with R, if you want to do this course. It's completely based on R. This makes the course absolutely wonderful for me, but several of the students went into a state of blind panic (for example, a beginner often cannot easily figure out how to find out how to change a directory within R---in my own courses at Potsdam, I think we spend about 90 minutes just getting them used to the interface). Using an R-based book for introductory statistics would have been much better than Moore et al's. I have to admit though that I cannot name an alternative to Moore et al's right away that covers exactly the same material.
Some other minor gripes about the course
1. In one of the pages of Ross' book (our assigned text), he writes "...if \sigma=\infty....". Now, in one of the probability theory assignments (Assignment 3) I lost four percentage points for saying that E[X]=\infty. I lost marks because I should have written, "the mean does not exist." This is correct. But when I pointed out that Ross makes the same mistake, I was told that that was a slight abuse of notation but it's fine. Seems like if a statistician writes something incorrectly, it's OK, but not if a student writes the same thing. If I had seen Ross' statement before writing my assignment solution, would I still deserve to lose 4 percentage points? I found this double standard irritating.
2. The statement of some of the probability theory problems is very unclear. It's hard to write unambiguously, that's easy to understand; but I would have expected clearly worded problems. Even worse, the clarifications one got after asking for more detail led to so much confusion among us students (at one point the lecturer was contradicting an earlier statement) that in one case we just gave up and went with our interpretation, which seemed according to the messages from lecturer to be wrong (I will report later if our interpretation got full marks or not). I was not the only one facing this problem; there was a flurry of unhappiness about the question.
3. Too often, the response to clarification questions can take up to a week or more; this is simply too long a period for a course moving as fast as this one. In some cases, I just had to go with what I understood. This isn't too serious; one often leaves a course with many questions unanswered, and after all this course is just a prep course for the real thing, the MSc. But it could be optimized by asking the lecturers to check the message board at least once a day or every other day.
Conclusion
Overall, a thumbs-up. This is a course every non-statistician who needs to work with data should take. Even in these few months I learnt a lot of interesting and even downright cool things (mostly in the math segment, but also in probability theory).
The big advantages of doing this kind of structured course are that:
- you have to solve problems on a daily basis in order to the get the assignments done on time, and someone carefully checks your work. If you try to read books on topics that are specifically relevant to you, like Gelman recommends (of course he recommends his own book), you are not going to get that quality of feedback (no, not even with a solutions manual).
- you can ask a statistician questions that come up as you read or work on real problems that affect your own life, and they will take the time to answer them fully. This is virtually impossible if you just try to talk to a random statistician (generally, they either heap scorn on you, or give a rambling answer that doesn't really answer the question, because they just don't want to pay attention long enough to try to understand the problem---not that I blame them for that; why should they care what your problem is?).
Material I consulted while doing this course (incomplete):
1. On writing math
2. Kerns' book on probability
3. Grinstead and Snell on probability
4. Salle et al Calculus
5. Spivak Calculus
6. MIT Open courseware (Strang, Auroux on calculus and linear algebra)
Software I used in this course:
1. R
2. Yacas (with Ryacas and without), to check my answers, arrived at analytically.
Notes I kept on the course:
I've been keeping notes on what I learnt, these are located here (please send corrections to me). Ask me for the source .Rnw file if you want to expand on them or modify them for your own use.
[to be continued]
0 comments:
Post a Comment