Wednesday, April 11, 2018

A little-known fact: The paired t-test is equivalent to a linear mixed model with varying intercepts

Introduction

I have noticed that when people fail to get a maximal model (in the sense of Barr et al 2013) to converge, they back off to a paired t-test (or the equivalent ANOVA), which involves aggregating their data by items or by subjects. I have even heard that a prominent psycholinguist only fits repeated measures ANOVAs now (because at least he understands them). But note that, in the two-condition case, the paired t-test and the repeated measures ANOVA are the same thing (see the Maxwell and Delaney book), so such people are really doing a paired t-test.

What is the relationship between the paired t-test and the linear mixed model? This has never been discussed in any textbook I know of (please let me know if it appears in some book).

First, note that the variance of the difference of two random variables that are correlated with correlation \(\rho\) is:

\[\begin{equation} Var(X_1-X_2)=Var(X_1) + Var(X_2) - 2\times Cov(X_1, X_2) \end{equation}\]

Also note that \(Cov(X_1, X_2) = \rho \sqrt{Var(X_1)}\sqrt{Var(X_2)}\).

You can find the proofs of these assertions in books like Rice's Mathematical Statistics and Data Analysis.
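As a quick numerical sanity check, here is a minimal simulation in R (the variable names are mine) confirming both identities; note that both identities also hold exactly for sample moments:

set.seed(42)
nsim<-1e6
z1<-rnorm(nsim)
z2<-rnorm(nsim)
rho_sim<-0.5
x1sim<-z1
## construct x2sim so that cor(x1sim,x2sim) is approximately rho_sim:
x2sim<-rho_sim*z1+sqrt(1-rho_sim^2)*z2
## these two quantities agree exactly (the identity holds for sample variances):
var(x1sim-x2sim)
var(x1sim)+var(x2sim)-2*cov(x1sim,x2sim)
## and Cov(X1,X2) = rho*sd(X1)*sd(X2) holds by the definition of cor():
cov(x1sim,x2sim)
cor(x1sim,x2sim)*sd(x1sim)*sd(x2sim)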

A paired t-test applies when you have paired data from subjects \(i=1,...,n\) in two conditions, 1 and 2. Write the data as two vectors \(X_{1}, X_{2}\). Because each pair of data points comes from the same subject, the two measurements are correlated with correlation \(\rho\). Assume that both conditions have the same standard deviation \(\sigma\).

Let's generate some data as an example. Here, I assume that \(\sigma = 1\), \(\rho=0.5\), and that the data are balanced. (When the data are unbalanced, the situation changes; more on that some day.)

library(MASS)
samplesize<-12
mu <- c(.3, .2)
rho<-0.5
stddev<-1
## variance-covariance matrix of the two conditions:
Sigma <- matrix(c(stddev^2, rho*stddev^2,
                  rho*stddev^2, stddev^2), nrow=2, ncol=2)
Sigma
##      [,1] [,2]
## [1,]  1.0  0.5
## [2,]  0.5  1.0
## fake data: empirical=TRUE forces the sample mean and sample
## covariance of the generated data to match mu and Sigma exactly
randvars <- mvrnorm(n=samplesize, mu=mu, Sigma=Sigma, empirical=TRUE)
head(randvars)
##            [,1]       [,2]
## [1,] -0.1923371 -0.3500540
## [2,]  0.5728776 -0.8589497
## [3,]  1.8310886  0.9958840
## [4,]  1.5166467  0.9893283
## [5,] -1.4027736 -2.0602480
## [6,] -0.1208778 -0.4637518
n<-samplesize
x1<-randvars[,1]
x2<-randvars[,2]
x1
##  [1] -0.19233709  0.57287757  1.83108857  1.51664667 -1.40277357
##  [6] -0.12087784 -0.80506229  0.04637409  0.70615898 -0.72291347
## [11]  0.92484196  1.24597641
x2
##  [1] -0.35005400 -0.85894973  0.99588404  0.98932826 -2.06024800
##  [6] -0.46375184  1.27557341  0.17289232  1.26881165  0.63975069
## [11]  0.07865768  0.71210553

To carry out the paired t-test, we need to know the variance of \(X_1-X_2\) because the t-statistic will be:

\(t_{n-1} = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{Var(X_1 - X_2)/n}}\)

Now, \(Var(X_1 - X_2) = \sigma^2 + \sigma^2 - 2 \rho \sigma \sigma = 2\sigma^2 (1-\rho)\)

So, now let’s compute the t-statistic using the above formula. Let the actual data be \(x_1, x_2\).

\(t_{n-1} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{Var(X_1 - X_2)/n}}\)

This simplifies to:

\(t_{n-1} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{2\sigma^2(1-\rho)/n}}\)

Now compare the paired t-test output and the by-hand calculation:

t.test(x1,x2,paired=TRUE)$statistic
##         t 
## 0.3464102
(mean(x1)-mean(x2))/sqrt((2*stddev^2*(1-rho))/n)
## [1] 0.3464102
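One more equivalence worth noting: the paired t-test is just a one-sample t-test on the pairwise differences, so this too gives the same t-value:

## one-sample t-test on the differences; same t-value as above:
t.test(x1-x2)$statistic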

Now consider the corresponding linear mixed model.

Suppose we have subjects \(i=1,\dots,n\) and conditions \(j=1,2\). For simplicity, assume that each subject sees each condition once, so we have two data points from each subject.

Then, for condition 1, the dependent variable is

\(y_{i1} = \beta + b_i + \epsilon_{i1}\)

Similarly, for condition 2, the dependent variable is

\(y_{i2} = \beta + \delta + b_i + \epsilon_{i2}\)

The difference between the two conditions (taking condition 2 minus condition 1, so that the effect \(\delta\) enters with a positive sign) is:

\(d_i = y_{i2} - y_{i1} = \delta + (\epsilon_{i2}-\epsilon_{i1})\)

Note that the varying intercept \(b_i\) cancels out in the difference.

In this model the residuals \(\epsilon_{i1}, \epsilon_{i2}\) are independent; the two responses from the same subject are correlated because they share the same varying intercept \(b_i\). Writing \(\sigma_b^2\) for the variance of the \(b_i\) and \(\sigma_e^2\) for the residual variance, the marginal variance of each response is \(\sigma^2 = \sigma_b^2 + \sigma_e^2\), and the correlation between the two responses from the same subject is \(\rho = \sigma_b^2/\sigma^2\). So the result presented at the beginning of this note applies:

\(Var(y_{i2}-y_{i1}) = \sigma^2 + \sigma^2 - 2\rho\sigma^2 = 2\sigma^2(1-\rho)\)

The generative distribution for \(d\), the vector of pairwise differences in the two conditions, is

\(d \sim Normal(\delta, 2\sigma^2(1-\rho))\)

So, the paired t-test will deliver exactly the same t-score as the varying intercepts linear mixed model.
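As an aside, the relations above tell us what variance components the mixed model should estimate for our simulated data: \(\sigma_b^2 = \rho\sigma^2\) and \(\sigma_e^2 = (1-\rho)\sigma^2\). A quick computation (the variable names are mine):

## implied variance components for stddev=1, rho=0.5:
sigma_b2<-rho*stddev^2       ## varying-intercepts variance: 0.5
sigma_e2<-(1-rho)*stddev^2   ## residual variance: 0.5
## variance of the pairwise differences, 2*sigma^2*(1-rho):
2*sigma_e2

These values, 0.5 and 0.5, are exactly the subject and residual variances that lmer recovers below.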

We check this next using our simulated data:

library(lme4)
## Loading required package: Matrix
## long format: one row per subject-condition pair
dat<-data.frame(y=c(x1,x2),cond=factor(rep(letters[1:2],each=n)),subj=rep(1:n,2))
## +/-0.5 sum contrasts, so that the cond1 coefficient estimates
## the difference between the two condition means:
contrasts(dat$cond)<-contr.sum(2)/2
contrasts(dat$cond)
##   [,1]
## a  0.5
## b -0.5
summary(m<-lmer(y~cond+(1|subj),dat))
## Linear mixed model fit by REML ['lmerMod']
## Formula: y ~ cond + (1 | subj)
##    Data: dat
## 
## REML criterion at convergence: 64.2
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -1.54889 -0.41356 -0.02765  0.67697  1.53499 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  subj     (Intercept) 0.5      0.7071  
##  Residual             0.5      0.7071  
## Number of obs: 24, groups:  subj, 12
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)   0.2500     0.2500   1.000
## cond1         0.1000     0.2887   0.346
## 
## Correlation of Fixed Effects:
##       (Intr)
## cond1 0.000
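We can also extract and compare the two t-values programmatically:

## mixed-model t-value vs paired t-test t-value:
coef(summary(m))["cond1","t value"]
t.test(x1,x2,paired=TRUE)$statistic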

The t-statistic for the condition effect (cond1) is exactly the same as that of the paired t-test. Note also that the estimated variance components, 0.5 for subjects and 0.5 for the residual, are exactly the \(\sigma_b^2\) and \(\sigma_e^2\) we computed above.

Conclusion

Given the assumptions above, the paired t-test is equivalent to fitting a varying intercepts linear mixed model. So if, when the maximal model fails to converge, you back off to a repeated measures ANOVA or the equivalent paired t-test, you are fitting exactly the varying-intercepts-only model that Barr et al (2013) said you should not settle for. Oops.

In our desperation to get the p-value below 0.05, we tend to fool ourselves with whatever tools we have at hand, in this case the paired t-test. If you are serious about following Barr et al 2013's recommendations, you're better off using brms and going all the way to a maximal model.
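For concreteness, here is a minimal sketch of what such a maximal model could look like in brms for the long-format data above (formula only; priors, chains, and iterations are left at the brms defaults here, although in a real analysis you would want to set priors deliberately):

library(brms)
## maximal model: varying intercepts and varying slopes for
## condition by subject, plus their correlation
m_max<-brm(y~cond+(1+cond|subj),data=dat)

With only one observation per subject per condition, lme4 will typically return a degenerate (singular) fit for this random-effects structure; the priors in brms regularize the estimates and make the model estimable.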

Friday, April 06, 2018

The smaller the p-value, the higher the likelihood ratio: true or false?

[Thanks to Scott Glover for correcting me.] Someone recently said to me that the lower the p-value, the higher the likelihood ratio of the alternative versus the null. The arXiv paper by Michael Lew makes analogous points (thanks to Titus von der Malsburg for pointing me to this paper).
I show below why this question is irrelevant. The problem lies, again, with Type M errors under low power. The bottom line is: should I care about a significant result if it is based on an overestimate, that is, a biased estimate, of the true mean? PS: I corrected a typo, thanks to Keith Goldfeld.

Here is the blog post.

Thursday, April 05, 2018

Posterior probability of the null hypothesis being true, given a significant effect

For some reason, I am unable to upload this post to Google Blogger, so I have linked it to an html file on my home page. Please comment here on this blog.
Here is the post:
Posterior probability of the null hypothesis being true, given a significant effect