
Friday, January 02, 2015

A weird and unintended consequence of Barr et al.'s Keep It Maximal paper

Barr et al.'s well-intentioned paper is starting to lead to some seriously weird behavior in psycholinguistics! As a reviewer, I'm seeing submissions where people take the following approach:

1. Try to fit a "maximal" linear mixed model. If you get a convergence failure (this happens a lot, since we routinely run low-power studies!), move to step 2.

[Aside:
By the way, the word maximal is ambiguous here, because a "maximal" model can be fit either with the correlation parameters estimated or with no correlation parameters estimated. For a 2x2 design, the difference would look like:

correlations estimated: (1+factor1+factor2+interaction|subject) etc.

no correlations estimated: (factor1+factor2+interaction || subject) etc.

Both options can be considered maximal.]
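To make the aside concrete, here is a minimal sketch of the two specifications as lmer calls. The data frame dat, the dependent variable lrt, and the sum-coded numeric contrasts f1 and f2 are hypothetical placeholders, not from the post:

## A minimal sketch, assuming lme4 and a hypothetical data frame dat
## with log reading times lrt, numeric +1/-1 contrasts f1 and f2,
## and crossed subjects and items.
library(lme4)

## Maximal model, correlations among random effects estimated:
m_corr <- lmer(lrt ~ f1 + f2 + f1:f2 +
                 (1 + f1 + f2 + f1:f2 | subject) +
                 (1 + f1 + f2 + f1:f2 | item),
               data = dat)

## Maximal model, correlations fixed at zero (double-bar syntax;
## in lme4 this expansion behaves as intended only for numeric
## predictors, which is why the contrasts are coded numerically):
m_nocorr <- lmer(lrt ~ f1 + f2 + f1:f2 +
                   (f1 + f2 + f1:f2 || subject) +
                   (f1 + f2 + f1:f2 || item),
                 data = dat)

## lmer reports convergence problems (step 1 above) as warnings.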

2. Fit a repeated measures ANOVA. This means that you average over items to get F1 scores in the by-subject ANOVA. But this is cheating and amounts to p-value hacking. It effectively sets the between-items variance to 0, because we have aggregated over items for each subject in each condition. That is the whole reason why linear mixed models are so important: we can take both between-item and between-subject variance into account simultaneously. People mistakenly think that the linear mixed model and the rmANOVA are exactly identical. If your experimental design calls for crossed varying intercepts and varying slopes (and it always does in psycholinguistics), an rmANOVA is not identical to the LMM, for the reason I give above. In the old days we used to compute minF. In 2014, I mean, 2015, it makes no sense to do that if you have a tool like lmer.
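To see concretely why the aggregation matters, here is a minimal sketch contrasting the F1-style analysis with the crossed LMM; the data frame dat and the variables lrt, cond, subject, and item are hypothetical placeholders:

## F1-style by-subject analysis: averaging over items leaves one
## value per subject per condition, so the between-items variance
## can no longer be estimated; it is effectively set to zero.
## (subject and cond must be coded as factors for aov.)
dat_f1 <- aggregate(lrt ~ subject + cond, data = dat, FUN = mean)
f1_anova <- aov(lrt ~ cond + Error(subject/cond), data = dat_f1)

## The crossed LMM estimates both variance components at once:
library(lme4)
m <- lmer(lrt ~ cond + (1 + cond | subject) + (1 + cond | item),
          data = dat)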

As always, I'm happy to get comments on this.

5 comments:

Anonymous said...

On the plus side, in the recent past many of those authors probably would have just gone straight for the RM-ANOVA without trying a mixed model at all. So it's probably still progress.

Dani said...

If one is unable to fit a maximal model due to low power, what should be done instead? For me, this is especially important in cases where neither IV1 nor IV2 has a stronger case for being included over the other.

Shravan Vasishth said...

You should try to find the simplest model possible. Examples coming soon.

Dani said...

Looking forward to it.

Titus von der Malsburg said...

@Dani, Barr et al. say that you need random slopes for all fixed effects about which you want to make inferences. The last part is often forgotten. Specifically, this means that you don't need random slopes for covariates, and this can be used to solve your problem. Assume a 2x2 design and a non-converging model with the following structure:

y ~ a + b + (a+b|subj) + (a+b|item)

Now, you can decompose this into two models:

y ~ a + b + (a|subj) + (a|item)
y ~ a + b + (b|subj) + (b|item)

Use the first model to make inferences about a (b is just a covariate there) and the second to make inferences about b (now a is the covariate). Since both models have a simpler random-effects structure, there is a higher chance that they will converge. Not sure if this approach is generally valid, but it should at least work when a and b are orthogonal. Check the Barr et al. paper; I think they describe this approach somewhere in the discussion.
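As a sketch of the decomposition Titus describes, here are the two fits in lme4 syntax; dat, y, a, b, subj, and item are placeholder names for illustration, not code from the comment:

library(lme4)

## Use this fit for inferences about a (b serves only as a covariate):
m_a <- lmer(y ~ a + b + (1 + a | subj) + (1 + a | item), data = dat)

## Use this fit for inferences about b (a serves only as a covariate):
m_b <- lmer(y ~ a + b + (1 + b | subj) + (1 + b | item), data = dat)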