A little-known fact: The paired t-test is equivalent to a linear mixed model with varying intercepts
Shravan Vasishth
4/11/2018
Introduction
I have noticed that when people fail to get a maximal model (in the sense of Barr et al 2013) to converge, they back off to a paired t-test (or the equivalent ANOVA), which involves aggregating their data by items or by subjects. I have even heard that a prominent psycholinguist only fits repeated measures ANOVAs now (because at least he understands them). But note that the paired t-test and repeated measures ANOVA are the same thing (see Maxwell and Delaney book), so they are really doing a paired t-test.
What is the relationship between the paired t-test and the linear mixed model? This has never been discussed in any textbook I know (please let me know if this appears in some book).
First, note that the variance of the difference of two random variables, which are correlated with correlaiton \(\rho\), is:
\[\begin{equation} Var(X_1-X_2)=Var(X_1) + Var(X_2) - 2\times Cov(X_1, X2) \end{equation}\]Also note that \(Cov(X_1, X_2) = \rho \sqrt{Var(X_1)}\sqrt{Var(X_2)}\).
You can find the proofs of these assertions in books like Rice.
A paired t-test applies when you have paired data from subject \(i=1,...,n\) in two conditions 1 and 2. Let’s write the data as two vectors \(X_{1}, X_{2}\). Because the pairs of data-points are coming from the same subject, they are correlated with correlation \(\rho\). Assume that condition 1 has standard deviation \(\sigma\) and condition 2 has the same standard deviation \(\sigma\).
Generate some data as an example. Here, I assume that \(\sigma = 1\), \(\rho=0.5\), and that the data are balanced. (When there is unbalanced data, the situation changes, more on that some day.)
library(MASS)
samplesize<-12
mu <- c(.3, .2)
rho<-0.5
stddev<-1
Sigma <- matrix(stddev, nrow=2, ncol=2) + diag(2)
Sigma<-Sigma/2
Sigma
## [,1] [,2]
## [1,] 1.0 0.5
## [2,] 0.5 1.0
## fake data:
randvars <- mvrnorm(n=samplesize, mu=mu, Sigma=Sigma, empirical=TRUE)
head(randvars)
## [,1] [,2]
## [1,] -0.1923371 -0.3500540
## [2,] 0.5728776 -0.8589497
## [3,] 1.8310886 0.9958840
## [4,] 1.5166467 0.9893283
## [5,] -1.4027736 -2.0602480
## [6,] -0.1208778 -0.4637518
n<-samplesize
x1<-randvars[,1]
x2<-randvars[,2]
x1
## [1] -0.19233709 0.57287757 1.83108857 1.51664667 -1.40277357
## [6] -0.12087784 -0.80506229 0.04637409 0.70615898 -0.72291347
## [11] 0.92484196 1.24597641
x2
## [1] -0.35005400 -0.85894973 0.99588404 0.98932826 -2.06024800
## [6] -0.46375184 1.27557341 0.17289232 1.26881165 0.63975069
## [11] 0.07865768 0.71210553
To carry out the paired t-test, we need to know the variance of \(X_1-X_2\) because the t-statistic will be:
\(t_{n-1} = \frac{X_1 - X_2}{\sqrt{Var(X_1 - X_2)/n}}\)
Now,\(Var(X_1 - X_2) = \sigma^2 + \sigma^2 - 2 \rho \sigma\sigma = 2\sigma^2 (1-\rho)\)
So, now let’s compute the t-statistic using the above formula. Let the actual data be \(x_1, x_2\).
\(t_{n-1} = \frac{mean(x_1) - mean(x_2)}{\sqrt{Var(X_1 - X_2)/n}}\)
This simplifies to:
\(t_{n-1} = \frac{mean(x_1) - mean(x_2)}{\sqrt{2\sigma^2(1-\rho)/n}}\)
Now compare the paired t-test output and the by-hand calculation:
t.test(x1,x2,paired=TRUE)$statistic
## t
## 0.3464102
(mean(x1)-mean(x2))/sqrt((2*stddev^2*(1-rho))/n)
## [1] 0.3464102
Now, the linear mixed model is the following situation.
Suppose we have i subjects and j=1,2 conditions. For simplicity, assume that each subject sees each condition once, so we have two data points from each subject.
Then, for condition 1, the dependent variable is
\(y_{i1} = \beta + b_i + \epsilon_{i1}\)
Similarly, for condition 2, the dependent variable is
\(y_{i2} = \beta + \delta + b_i + \epsilon_{i2}\)
The difference between the two is:
\(d_i=y_{i1} - y_{i2}= \delta + (\epsilon_{i1}-\epsilon_{i2})\)
Now, assuming that the error terms are correlated with correlation \(\rho\), the result presented at the beginning of this note applies.
\(Var(y_1-y_2) = \sigma^2 -2\rho\sigma^2=2\sigma^2(1-\rho)\)
The generative distribution for \(d\), the vector of pairwise differences in the two conditions, is
\(d \sim Normal(\delta, 2\sigma^2(1-\rho))\)
So, the paired t-test will deliver exactly the same t-score as the varying intercepts linear mixed model.
We check this next using our simulated data:
library(lme4)
## Loading required package: Matrix
dat<-data.frame(y=c(x1,x2),cond=rep(letters[1:2],each=n),subj=rep(1:n,2))
contrasts(dat$cond)<-contr.sum(2)/2
contrasts(dat$cond)
## [,1]
## a 0.5
## b -0.5
summary(m<-lmer(y~cond+(1|subj),dat))
## Linear mixed model fit by REML ['lmerMod']
## Formula: y ~ cond + (1 | subj)
## Data: dat
##
## REML criterion at convergence: 64.2
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -1.54889 -0.41356 -0.02765 0.67697 1.53499
##
## Random effects:
## Groups Name Variance Std.Dev.
## subj (Intercept) 0.5 0.7071
## Residual 0.5 0.7071
## Number of obs: 24, groups: subj, 12
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 0.2500 0.2500 1.000
## cond1 0.1000 0.2887 0.346
##
## Correlation of Fixed Effects:
## (Intr)
## cond1 0.000
The t-statistic is exactly the same as the paired t-test.
Conclusion
Given the assumptions above, the paired t-test is equivalent to fitting a varying intercept linear mixed model. So if you think you are following the advice of Barr et al 2013) to back off to using using repeated measures ANOVA or the equivalent paired t-test, then you are doing exactly what they said you should not do. Oops.
In our desperation to get the p-value below 0.05, we tend to fool ourselves using whatever tools we have at hand, here the paired t-test. If you are serious about following Barr et al 2013’s recommendations, you’re better off using brms and going all the way to a maximal model.