
Saturday, January 22, 2022

Review of Writing Science by Joshua Schimel: Good advice on writing, but ignore his advice on statistical inference because it's just plain wrong

These days I am quite obsessed with figuring out how to improve the writing coming out of my lab. My postdocs generally produce solid writing, but my students struggle, just as I struggled when I was a student. So I bought a bunch of books written by experts, to figure out the best advice out there on writing scientific articles. One of the very best books I have read is by Schimel:

Schimel, Joshua. Writing Science: How to Write Papers That Get Cited and Proposals That Get Funded. Oxford University Press. Kindle Edition.

Schimel seems to be a heavyweight in his field.


What I like most about his book is his attitude that the goal of writing is to get people to actually read your paper. He puts it differently: people should cite your paper. But I think he means that people should want to read your paper (unfortunately, it's pretty common to cite someone's work without reading it, just because someone else cited it; one simply copies over the citations).

His book treats writing as storytelling. A clear storyline has to be planned out before one puts pen to paper (or fingers to keyboard). Several story structures are suggested, but the most sensible one for standard technical writing addressed to an expert audience is what he calls the OCAR structure (the text below is quoted directly from his excellent blog: https://schimelwritingscience.wordpress.com/):

1. Opening: This should identify the larger problem you are contributing to, give readers a sense of the direction your paper is going, and make it clear why it is important. It should engage the widest audience practical. The problem may be applied or purely conceptual and intellectual—this is the reason you’re doing the work. 

2. Challenge: What is your specific question or hypothesis? You might have a few, but there is often one overarching question, which others flesh out.

3. Action: What are the key results of your work? Identify no more than 2-3 points. 

4. Resolution: What is your central conclusion and take home message? What have you learned about nature? If readers remember only one thing from your work, this should be it. The resolution should show how the results (Action) answer the question in the Challenge, and how doing so helps solve the problem you identified in the Opening.

The book spends a lot of time unpacking these ideas; I won't repeat the details here.

One problem I had with his examples is that they all lie outside my area of expertise, so I couldn't really appreciate the difference between good and bad style when looking at his specific examples. I think such books really have to be written for people working in particular fields, and the title should reflect that. There is an urgent need for such a book specifically for psycholinguistics, with examples from our own field. I don't think a student of psycholinguistics can pick up this book and learn much from the examples. The high-level advice is great, but it's hard to translate into actionable steps in one's own field.

I have one major complaint about this book: Schimel gives absurdly incorrect advice to the reader about how to present and draw inferences from statistical results. To me it is quite surprising that you can become such a senior and well-cited scientist in an empirically driven field and have absolutely zero understanding of basic statistical concepts. Schimel would fail my intro stats 1 class.

Here is what he has to say (p. 78 in my Kindle edition) about how to present statistical results. The most egregious statements are the ones I take up below.

"As an example, consider figure 8.3. In panel A there is a large difference (the  treatment is 2.3 x the control) that is unquestionably statistically significant. Panel  B shows data with the same statistical significance ( p = 0.02), but the difference  between the treatments is smaller. You could describe both of these graphs by  saying, “The treatment significantly increased the response ( p = 0.02).” That would  be true, but the stories in panels A and B are different — in panel A, there is a  strong effect and in panel B, a weak one. I would describe panel A by saying, “The  treatment increased the response by a factor of 2.3 ( p = 0.02)”; for panel B, I might  write, “The treatment increased the response by only 30 percent, but this increase  was statistically significant ( p = 0.02).” 


Well, panel A is probably a Type M error (just look at the uncertainty of the estimates compared to panel B), and what he calls a weak effect in panel B is more likely to be the accurate estimate (again, just look at those uncertainty intervals). A Type M (magnitude) error arises when a noisy study that happens to cross the significance threshold reports an estimate that is exaggerated relative to the true effect. So it is very misleading to call A a strong effect and B a weak effect. Given data like those in panels A and B, I would take panel B more seriously. I have ranted extensively about this point in a 2018 paper. And of course, others have long complained about this kind of misunderstanding (Gelman and Carlin, 2014).
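
To see concretely what a Type M error looks like, here is a small simulation in the spirit of Gelman and Carlin (2014). The true effect and standard error below are assumptions chosen to make the point; the pattern holds whenever power is low:

```python
# Type M (magnitude) error: when power is low, the estimates that
# happen to reach significance systematically exaggerate the true effect.
import numpy as np

rng = np.random.default_rng(1)
true_effect = 0.3     # assumed modest true effect
se = 0.5              # assumed standard error: a noisy design
n_sims = 100_000

estimates = rng.normal(true_effect, se, n_sims)  # sampling distribution of the estimate
significant = np.abs(estimates / se) > 1.96      # the "p < 0.05" results

print(f"power: {significant.mean():.2f}")
print(f"mean significant estimate: {estimates[significant].mean():.2f}")
print(f"exaggeration ratio: {estimates[significant].mean() / true_effect:.1f}x")
```

With these settings power is around 9%, and the average statistically significant estimate is roughly three to four times the true effect. A "strong effect" like panel A's is exactly what a noisy study produces when it happens to cross the significance threshold.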

But it gets worse. Here is what Schimel has to say about panel C. Again, the absurd part of his advice is discussed below:

"The tricky question is what to write about panel C. The difference between  treatment and control is the same as in panel A (a factor of 2.3), but the data are  more variable and so the statistics are weaker, in this case above the threshold that  many use to distinguish whether there is a “significant” diff erence at all. Many  would describe this panel by writing, “There was no significant effect of the treatment (p > 0.05).” Such a description, however, has several problems.  The first problem is that many readers would infer that there was no difference  between treatment and control. In fact though, they differed by a factor of 2.3.  That is never the “same.” Also, with a p value of 0.07, the probability that the effect  was due to the experimental treatment is still greater than 90 percent. Thus, a  statement like this is probably making a Type II error — rejecting a real effect.  The second problem is that just saying there was no significant effect mixes  results and interpretation. When you do a statistical test, the F and p values are  results . Deciding whether the test is significant is interpretation. When you  describe the data solely in terms of whether the difference was significant, you  present an interpretation of the data as the data, which violates an important principle of science. Any specific threshold for significance is an arbitrary choice with  no fundamental basis in either science or statistics."

It is kind of fascinating, in a horrifying kind of way, to think that even today there are people out there who think that a p-value of 0.07 implies that the probability of the null being true is 0.07; Schimel takes a p-value of 0.07 to mean that there is a 93% chance that the null is false, i.e., that the effect is real. In fact, the p-value is the probability of obtaining a statistic at least as extreme as the one observed, computed under the assumption that the null hypothesis is true; it says nothing directly about the probability that the null (or the effect) is true. To support the last sentence in the quote above, Schimel cites an introductory textbook, An Introduction to the Practice of Statistics, by Moore and McCabe, who are professional statisticians. I wanted to read this book to see what they say about p-values, but it's not available as a Kindle edition and I can't be bothered to spend 80 Euros to get a hard copy.
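
If there is any doubt that this is a misreading, a simulation makes it obvious. Suppose, purely as an assumption for the sketch, that half of all experiments study a real effect of a given size; now condition on having observed p near 0.07:

```python
# What p ~ 0.07 does (not) tell you about whether the effect is real.
# The base rate (50%), effect size, and SE are all assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
n_sims = 400_000
effect_real = rng.random(n_sims) < 0.5         # 50% of experiments study a real effect
true_effect = np.where(effect_real, 0.3, 0.0)  # assumed effect size when real
se = 0.5

z = rng.normal(true_effect, se, n_sims) / se   # observed z-statistics
p = 2 * stats.norm.sf(np.abs(z))               # two-sided p-values

near_007 = np.abs(p - 0.07) < 0.01             # experiments that landed near p = 0.07
print(f"P(effect is real | p ~ 0.07) = {effect_real[near_007].mean():.2f}")
```

Under these particular assumptions the answer comes out around 0.6, not 0.93, and with a lower base rate or lower power it drops further. The point is not the specific number; it is that the probability of a real effect is simply not determined by the p-value.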

Could it be that Schimel got his statistical education, such as it is, from misleading textbooks written by professional statisticians? Or did he just misunderstand what he read? I have no idea, but I find it depressing that such misleading and outright wrong recommendations can appear in a section on how to report one's results, and that this was written not by some obscure guy who knows nothing about nothing, but by a leading scientist in his field.

Anyway, despite my complaints, overall the book is great and worth reading. One can get a lot out of his other advice on writing. Just ignore everything he says about statistics and consult someone who actually knows what they are talking about, maybe someone like Andrew Gelman. Gelman has written plenty on presenting one's data analyses and on statistical inference.

As mentioned above, Schimel also has a very cool blog (it seems to be dormant these days) with a lot of interesting and very readable posts: https://schimelwritingscience.wordpress.com/