Shravan Vasishth's Slog (Statistics blog)

Friday, March 15, 2013

How are the random effects (BLUPs) `predicted' in linear mixed models?

In linear mixed models, we fit models like these (the Laird-Ware formulation; see Pinheiro and Bates 2000, for example):

\begin{equation}
Y = X\beta + Zu + \epsilon
\end{equation}

Let $u\sim N(0,\sigma_u^2)$, independent of $\epsilon\sim N(0,\sigma^2)$.

Given $Y$, the ``minimum mean square error predictor'' of $u$ is the conditional expectation:

\begin{equation}
\hat{u} = E(u\mid Y)
\end{equation}

We can find $E(u\mid Y)$ as follows. We write the joint distribution of $Y$ and $u$ as:

\begin{equation}
\begin{pmatrix}
Y \\
u
\end{pmatrix}
\sim
N\left(
\begin{pmatrix}
X\beta\\
0
\end{pmatrix},
\begin{pmatrix}
V_Y & C_{Y,u}\\
C_{u,Y} & V_u \\
\end{pmatrix}
\right)
\end{equation}

$V_Y, C_{Y,u}, C_{u,Y}, V_u$ are the various variance-covariance matrices.
It is a standard property of the multivariate normal distribution that

\begin{equation}
u\mid Y \sim N\left(C_{u,Y}V_Y^{-1}(Y-X\beta),\;
V_u - C_{u,Y} V_Y^{-1} C_{Y,u}\right)
\end{equation}

Since $\hat{u} = E(u\mid Y)$, the mean of this conditional distribution gives the predictor:

\begin{equation}
\hat{u}= C_{u,Y}V_Y^{-1}(Y-X\beta)
\end{equation}

Substituting $\hat{\beta}$ for $\beta$, we get:

\begin{equation}
BLUP(u)= \hat{u}(\hat{\beta}) = C_{u,Y}V_Y^{-1}(Y-X\hat{\beta})
\end{equation}
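
To make the covariance blocks concrete, here is a sketch under one added assumption: the random effects are independent with a common variance, so that $V_u = \sigma_u^2 I$, and $\epsilon$ has covariance $\sigma^2 I$. The blocks then follow directly from $Y = X\beta + Zu + \epsilon$:

\begin{equation}
V_Y = \sigma_u^2 ZZ^T + \sigma^2 I, \qquad
C_{u,Y} = \sigma_u^2 Z^T, \qquad
C_{Y,u} = \sigma_u^2 Z
\end{equation}

Plugging these into the formula for the BLUP gives

\begin{equation}
BLUP(u) = \sigma_u^2 Z^T \left(\sigma_u^2 ZZ^T + \sigma^2 I\right)^{-1} (Y-X\hat{\beta})
\end{equation}

With a more general $V_u$ (for example, correlated intercepts and slopes), $\sigma_u^2 I$ would be replaced by $V_u$ throughout.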

Here is a working example:
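
The sketch below is one way to run the computation in Python with NumPy. It is an illustration under stated assumptions, not the original example: the data are simulated, the design is a balanced random-intercepts model, the variance components are treated as known, and all variable names are made up for this sketch. It computes $C_{u,Y}V_Y^{-1}(Y-X\hat{\beta})$ directly and compares it with the closed-form shrinkage of the group means that holds in this balanced case.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical balanced design: 10 groups (e.g. subjects), 5 observations each.
n_groups, n_per = 10, 5
sigma_u, sigma = 2.0, 1.0  # standard deviations, assumed known for this sketch

# Fixed effects: intercept only. Random effects: one intercept per group.
X = np.ones((n_groups * n_per, 1))
Z = np.kron(np.eye(n_groups), np.ones((n_per, 1)))

# Simulate Y = X beta + Z u + epsilon.
beta = np.array([10.0])
u = rng.normal(0.0, sigma_u, n_groups)
Y = X @ beta + Z @ u + rng.normal(0.0, sigma, n_groups * n_per)

# Covariance blocks of the joint distribution of (Y, u),
# assuming V_u = sigma_u^2 I and Var(epsilon) = sigma^2 I.
V_u = sigma_u**2 * np.eye(n_groups)
V_Y = Z @ V_u @ Z.T + sigma**2 * np.eye(n_groups * n_per)
C_uY = V_u @ Z.T

# Generalized least squares estimate of beta.
V_Y_inv = np.linalg.inv(V_Y)
beta_hat = np.linalg.solve(X.T @ V_Y_inv @ X, X.T @ V_Y_inv @ Y)

# BLUP(u) = C_{u,Y} V_Y^{-1} (Y - X beta_hat)
u_hat = C_uY @ V_Y_inv @ (Y - X @ beta_hat)

# In this balanced intercept-only case the BLUP reduces to shrinking each
# group's deviation from the grand mean towards zero.
group_means = Y.reshape(n_groups, n_per).mean(axis=1)
shrinkage = n_per * sigma_u**2 / (n_per * sigma_u**2 + sigma**2)
u_shrunk = shrinkage * (group_means - Y.mean())

print(np.allclose(u_hat, u_shrunk))
```

In practice the variance components are estimated (for example by REML) rather than known, so fitted BLUPs from a mixed-model package will use estimates in place of `sigma_u` and `sigma`.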

Correlations of fixed effects in linear mixed models

Ever wondered what those correlations are in a linear mixed model? For example:

The estimated correlation between $\hat{\beta}_1$ and $\hat{\beta}_2$ is $0.988$. Note that

$\hat{\beta}_1 = (Y_{1,1} + Y_{2,1} + \dots + Y_{10,1})/10=10.360$

and

$\hat{\beta}_2 = (Y_{1,2} + Y_{2,2} + \dots + Y_{10,2})/10 = 11.040$

From this we can recover the correlation $0.988$ as follows:

By comparison, in the linear model version of the above:

because $Var(\hat{\beta}) = \hat{\sigma}^2 (X^T X)^{-1}$.
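
To see where a correlation between fixed-effect estimates can come from, here is a sketch in Python with NumPy. It assumes a specific setup that is not confirmed by the text above: 10 subjects, each measured once in two conditions, cell-means coding (one column of $X$ per condition), random intercepts by subject, and hypothetical variance components. Under these assumptions the mixed-model covariance of $\hat{\beta}$ is $(X^T V^{-1} X)^{-1}$ with $V = \sigma_u^2 ZZ^T + \sigma^2 I$, and the correlation works out to $\sigma_u^2/(\sigma_u^2+\sigma^2)$, while in the linear model $\hat{\sigma}^2 (X^T X)^{-1}$ is diagonal and the correlation is zero.

```python
import numpy as np

n_subj = 10
sigma_u, sigma = 3.0, 0.5  # hypothetical standard deviations for illustration

# Rows are ordered subject by subject, two conditions per subject.
# Cell-means coding: column j of X indicates condition j.
X = np.kron(np.ones((n_subj, 1)), np.eye(2))
# One random intercept per subject.
Z = np.kron(np.eye(n_subj), np.ones((2, 1)))


def cov_to_corr(cov):
    """Convert a covariance matrix to a correlation matrix."""
    sd = np.sqrt(np.diag(cov))
    return cov / np.outer(sd, sd)


# Mixed model: Var(beta_hat) = (X^T V^{-1} X)^{-1}
V = sigma_u**2 * Z @ Z.T + sigma**2 * np.eye(2 * n_subj)
cov_lmm = np.linalg.inv(X.T @ np.linalg.inv(V) @ X)
corr_lmm = cov_to_corr(cov_lmm)[0, 1]

# Linear model: Var(beta_hat) = sigma^2 (X^T X)^{-1}.
# The value used for sigma does not affect the correlation.
cov_lm = sigma**2 * np.linalg.inv(X.T @ X)
corr_lm = cov_to_corr(cov_lm)[0, 1]

print(corr_lmm, sigma_u**2 / (sigma_u**2 + sigma**2))
print(corr_lm)
```

The shared subject intercept is what induces the correlation: both condition means average over the same $u_i$, so a large $\sigma_u$ relative to $\sigma$ pushes the correlation towards 1.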

Wednesday, January 23, 2013

Linear models summary sheet

As part of my long slog towards statistical understanding, I started making notes on the very specific topic of linear models. The details are tricky and hard to keep in mind, and it is difficult to go back and forth between books and notes to try to review them. So I tried to summarize the basic ideas into a few pages (the summary sheet is not yet complete).

It's not quite a cheat sheet, so I call it a summary sheet.

Here is the current version:

https://github.com/vasishth/StatisticsNotes

Needless to say (although I feel compelled to say it), the document is highly derivative of lecture notes I've been reading. Corrections, comments, and suggestions for improvement are most welcome.