A little-known fact: The paired t-test is equivalent to a linear mixed model with varying intercepts
Shravan Vasishth
4/11/2018
Introduction
I have noticed that when people fail to get a maximal model (in the sense of Barr et al 2013) to converge, they back off to a paired t-test (or the equivalent ANOVA), which involves aggregating their data by items or by subjects. I have even heard that a prominent psycholinguist only fits repeated measures ANOVAs now (because at least he understands them). But note that the paired t-test and repeated measures ANOVA are the same thing (see Maxwell and Delaney book), so they are really doing a paired t-test.
What is the relationship between the paired t-test and the linear mixed model? This has never been discussed in any textbook I know (please let me know if this appears in some book).
First, note that the variance of the difference of two random variables, which are correlated with correlaiton \(\rho\), is:
\[\begin{equation} Var(X_1-X_2)=Var(X_1) + Var(X_2) - 2\times Cov(X_1, X2) \end{equation}\]Also note that \(Cov(X_1, X_2) = \rho \sqrt{Var(X_1)}\sqrt{Var(X_2)}\).
You can find the proofs of these assertions in books like Rice.
A paired t-test applies when you have paired data from subject \(i=1,...,n\) in two conditions 1 and 2. Let’s write the data as two vectors \(X_{1}, X_{2}\). Because the pairs of data-points are coming from the same subject, they are correlated with correlation \(\rho\). Assume that condition 1 has standard deviation \(\sigma\) and condition 2 has the same standard deviation \(\sigma\).
Generate some data as an example. Here, I assume that \(\sigma = 1\), \(\rho=0.5\), and that the data are balanced. (When there is unbalanced data, the situation changes, more on that some day.)
library(MASS)
samplesize<-12
mu <- c(.3, .2)
rho<-0.5
stddev<-1
Sigma <- matrix(stddev, nrow=2, ncol=2) + diag(2)
Sigma<-Sigma/2
Sigma##      [,1] [,2]
## [1,]  1.0  0.5
## [2,]  0.5  1.0## fake data:
randvars <- mvrnorm(n=samplesize, mu=mu, Sigma=Sigma, empirical=TRUE)
head(randvars)##            [,1]       [,2]
## [1,] -0.1923371 -0.3500540
## [2,]  0.5728776 -0.8589497
## [3,]  1.8310886  0.9958840
## [4,]  1.5166467  0.9893283
## [5,] -1.4027736 -2.0602480
## [6,] -0.1208778 -0.4637518n<-samplesize
x1<-randvars[,1]
x2<-randvars[,2]
x1##  [1] -0.19233709  0.57287757  1.83108857  1.51664667 -1.40277357
##  [6] -0.12087784 -0.80506229  0.04637409  0.70615898 -0.72291347
## [11]  0.92484196  1.24597641x2##  [1] -0.35005400 -0.85894973  0.99588404  0.98932826 -2.06024800
##  [6] -0.46375184  1.27557341  0.17289232  1.26881165  0.63975069
## [11]  0.07865768  0.71210553To carry out the paired t-test, we need to know the variance of \(X_1-X_2\) because the t-statistic will be:
\(t_{n-1} = \frac{X_1 - X_2}{\sqrt{Var(X_1 - X_2)/n}}\)
Now,\(Var(X_1 - X_2) = \sigma^2 + \sigma^2 - 2 \rho \sigma\sigma = 2\sigma^2 (1-\rho)\)
So, now let’s compute the t-statistic using the above formula. Let the actual data be \(x_1, x_2\).
\(t_{n-1} = \frac{mean(x_1) - mean(x_2)}{\sqrt{Var(X_1 - X_2)/n}}\)
This simplifies to:
\(t_{n-1} = \frac{mean(x_1) - mean(x_2)}{\sqrt{2\sigma^2(1-\rho)/n}}\)
Now compare the paired t-test output and the by-hand calculation:
t.test(x1,x2,paired=TRUE)$statistic##         t 
## 0.3464102(mean(x1)-mean(x2))/sqrt((2*stddev^2*(1-rho))/n)## [1] 0.3464102Now, the linear mixed model is the following situation.
Suppose we have i subjects and j=1,2 conditions. For simplicity, assume that each subject sees each condition once, so we have two data points from each subject.
Then, for condition 1, the dependent variable is
\(y_{i1} = \beta + b_i + \epsilon_{i1}\)
Similarly, for condition 2, the dependent variable is
\(y_{i2} = \beta + \delta + b_i + \epsilon_{i2}\)
The difference between the two is:
\(d_i=y_{i1} - y_{i2}= \delta + (\epsilon_{i1}-\epsilon_{i2})\)
Now, assuming that the error terms are correlated with correlation \(\rho\), the result presented at the beginning of this note applies.
\(Var(y_1-y_2) = \sigma^2 -2\rho\sigma^2=2\sigma^2(1-\rho)\)
The generative distribution for \(d\), the vector of pairwise differences in the two conditions, is
\(d \sim Normal(\delta, 2\sigma^2(1-\rho))\)
So, the paired t-test will deliver exactly the same t-score as the varying intercepts linear mixed model.
We check this next using our simulated data:
library(lme4)## Loading required package: Matrixdat<-data.frame(y=c(x1,x2),cond=rep(letters[1:2],each=n),subj=rep(1:n,2))
contrasts(dat$cond)<-contr.sum(2)/2
contrasts(dat$cond)##   [,1]
## a  0.5
## b -0.5summary(m<-lmer(y~cond+(1|subj),dat))## Linear mixed model fit by REML ['lmerMod']
## Formula: y ~ cond + (1 | subj)
##    Data: dat
## 
## REML criterion at convergence: 64.2
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -1.54889 -0.41356 -0.02765  0.67697  1.53499 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  subj     (Intercept) 0.5      0.7071  
##  Residual             0.5      0.7071  
## Number of obs: 24, groups:  subj, 12
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)   0.2500     0.2500   1.000
## cond1         0.1000     0.2887   0.346
## 
## Correlation of Fixed Effects:
##       (Intr)
## cond1 0.000The t-statistic is exactly the same as the paired t-test.
Conclusion
Given the assumptions above, the paired t-test is equivalent to fitting a varying intercept linear mixed model. So if you think you are following the advice of Barr et al 2013) to back off to using using repeated measures ANOVA or the equivalent paired t-test, then you are doing exactly what they said you should not do. Oops.
In our desperation to get the p-value below 0.05, we tend to fool ourselves using whatever tools we have at hand, here the paired t-test. If you are serious about following Barr et al 2013’s recommendations, you’re better off using brms and going all the way to a maximal model.
 
 
2 comments:
This was a great read! I'm trying to apply it to what I'm doing, though, and having a little trouble. It's a little tricky to explain in a comment but let me try:
I have 3 treatments across 6 sites. Within each site, I have near and far points(location) at each site where I counted birds. I'm interested to see if the difference in the number of birds counted near and far is significantly different across treatments.
At first, I did a # birds ~ treatment*location + (1|site), but now I'm thinking it is more paired, similar to your example. But, because it is paired within a site within a treatment, I am thinking of doing #near-#far ~ treatment + (1|site). My situation seems to be rare, though, and not sure if this is the right approach. If you have an opinion/advice I would greatly appreciate it! Still figuring this out.
Anyway, that's my relation to your article. Thanks for posting it!
A reader informs me that this question appears in Agresti (2015), Foundations of Linear and Generalized Linear Models. Chapter 9, question 6.
Post a Comment