Study Notes 158: INTRODUCTION TO LINEAR MIXED MODELS

This page briefly introduces linear mixed models (LMMs) as a method for analyzing data that are non-independent, multilevel/hierarchical, longitudinal, or correlated. We focus on the general concepts and interpretation of LMMs, with less time spent on the theory and technical details.

Background

Linear mixed models are an extension of simple linear models that allows both fixed and random effects. They are particularly useful when there is non-independence in the data, such as arises from a hierarchical structure. For example, students could be sampled from within classrooms, or patients from within doctors.

When there are multiple levels, such as patients seen by the same doctor, the variability in the outcome can be thought of as being either within group or between group. Patient-level observations are not independent, because patients of the same doctor tend to be more similar to each other. Units sampled at the highest level (in our example, doctors) are independent. The figure below shows a sample where the dots are patients within doctors, the larger circles.

There are multiple ways to deal with hierarchical data. One simple approach is to aggregate. For example, suppose 10 patients are sampled from each doctor. Rather than using the individual patients’ data, which are not independent, we could take the average of all patients within a doctor. These aggregated data would then be independent.

Although aggregate data analysis yields consistent effect estimates and standard errors, it does not really take advantage of all the data, because patient data are simply averaged. Looking at the figure above, at the aggregate level, there would only be six data points.
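The aggregation approach can be sketched in a few lines of Python. The data are made up for illustration (three hypothetical doctors rather than the six in the figure), and the names `patients` and `aggregate_by_doctor` are assumptions, not part of the original page.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical patient-level records: (doctor_id, outcome).
patients = [
    ("A", 4.0), ("A", 5.0), ("A", 6.0),
    ("B", 7.0), ("B", 9.0),
    ("C", 2.0), ("C", 3.0), ("C", 4.0), ("C", 3.0),
]

def aggregate_by_doctor(records):
    """Collapse patient-level outcomes to one mean per doctor."""
    groups = defaultdict(list)
    for doctor, outcome in records:
        groups[doctor].append(outcome)
    return {doctor: mean(values) for doctor, values in groups.items()}

doctor_means = aggregate_by_doctor(patients)
print(doctor_means)
print(len(patients), "patient rows ->", len(doctor_means), "doctor rows")
```

Nine patient rows shrink to three doctor rows, which is the loss of information described above: any analysis on `doctor_means` no longer sees the spread of patients within a doctor.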

Another approach to hierarchical data is analyzing data from one unit at a time. In our example, we could run six separate linear regressions, one for each doctor in the sample. Although this does work, there are many models, and each one does not take advantage of the information in data from other doctors. This can also make the results “noisy” in that the estimates from each model are not based on very much data.
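A minimal sketch of the one-regression-per-doctor approach follows, again with invented data (two hypothetical doctors, three patients each). The helper `ols_fit` is written out by hand so the example depends only on the standard library.

```python
from collections import defaultdict

def ols_fit(xs, ys):
    """Return (intercept, slope) of a simple least-squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical records: (doctor_id, predictor, outcome).
records = [
    ("A", 1.0, 3.0), ("A", 2.0, 5.0), ("A", 3.0, 7.0),
    ("B", 1.0, 2.0), ("B", 2.0, 2.5), ("B", 3.0, 3.0),
]

# Split the data by doctor, then fit each doctor's line separately.
by_doctor = defaultdict(lambda: ([], []))
for doctor, x, y in records:
    by_doctor[doctor][0].append(x)
    by_doctor[doctor][1].append(y)

fits = {doctor: ols_fit(xs, ys) for doctor, (xs, ys) in by_doctor.items()}
for doctor, (intercept, slope) in fits.items():
    print(f"doctor {doctor}: intercept={intercept:.2f}, slope={slope:.2f}")
```

Each fit uses only that doctor's three points, so with real data the per-doctor estimates can swing widely from one doctor to the next.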

Linear mixed models (also called multilevel models) can be thought of as a trade-off between these two alternatives. The individual regressions have many estimates and lots of data, but are noisy. The aggregate is less noisy, but may lose important differences by averaging all samples within each doctor. LMMs are somewhere in between.

Beyond getting standard errors corrected for non-independence in the data, there can be important reasons to explore the difference between effects within and between groups. An example of this is shown in the figure below. Here we have patients from the six doctors again, and are looking at a scatter plot of the relation between a predictor and outcome. Within each doctor, the relation between predictor and outcome is negative. However, between doctors, the relation is positive. LMMs allow us to explore and understand these important effects.
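The opposite signs of the within-doctor and between-doctor relations can be reproduced with a small constructed dataset. This is only a sketch: the three hypothetical doctors and their values are chosen so the pattern is exact, and the within-doctor slope is computed by centering each patient on their own doctor's mean, which is one common way to separate the two effects.

```python
from collections import defaultdict

def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical (doctor_id, predictor, outcome) triples.
records = [
    ("A", 1.0, 3.0), ("A", 2.0, 2.0), ("A", 3.0, 1.0),
    ("B", 4.0, 6.0), ("B", 5.0, 5.0), ("B", 6.0, 4.0),
    ("C", 7.0, 9.0), ("C", 8.0, 8.0), ("C", 9.0, 7.0),
]

groups = defaultdict(list)
for doctor, x, y in records:
    groups[doctor].append((x, y))

# Doctor means carry the between-doctor information.
means = {
    d: (sum(x for x, _ in pts) / len(pts), sum(y for _, y in pts) / len(pts))
    for d, pts in groups.items()
}

# Deviations from each doctor's own mean carry the within-doctor information.
x_within = [x - means[d][0] for d, x, _ in records]
y_within = [y - means[d][1] for d, _, y in records]

within_slope = ols_slope(x_within, y_within)
between_slope = ols_slope([m[0] for m in means.values()],
                          [m[1] for m in means.values()])
print("within-doctor slope:", within_slope)
print("between-doctor slope:", between_slope)
```

In this toy data the within-doctor slope is negative and the between-doctor slope is positive, so a single pooled regression would blur two distinct effects.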

Random Effects

The core of mixed models is that they incorporate fixed and random effects. A fixed effect is a parameter that does not vary. For example, we may assume there is some true regression line in the population, $β$

$β \sim N(μ, σ)$

This is really the same as in linear regression, where we assume the data are random variables, but the parameters are fixed effects. Now the data are random variables, and the parameters are random variables (at one level), but fixed at the highest level (for example, we still assume some overall population mean, $μ$).
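To make the idea of a parameter that is itself a random variable concrete, here is a small simulation sketch. It draws one effect per doctor from a normal distribution, treating the second argument of $N(μ, σ)$ as a standard deviation; that reading of the notation, the values of `mu` and `sigma`, and the number of doctors are all assumptions made for illustration.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

mu, sigma = 10.0, 2.0   # hypothetical population mean and SD (fixed at the top level)
n_doctors = 1000        # hypothetical number of doctors

# Each doctor's effect is one random draw around the fixed population mean.
doctor_effects = [random.gauss(mu, sigma) for _ in range(n_doctors)]

sample_mean = sum(doctor_effects) / n_doctors
sample_sd = (
    sum((b - sample_mean) ** 2 for b in doctor_effects) / (n_doctors - 1)
) ** 0.5

print(f"first three doctor effects: {doctor_effects[:3]}")
print(f"sample mean = {sample_mean:.2f}, sample SD = {sample_sd:.2f}")
```

The individual doctor effects vary, while `mu` and `sigma` stay fixed: the random quantities live at the doctor level and the fixed ones at the population level.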

Theory of Linear Mixed Models

$y = Xβ + Zu + ε$

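Where $y$ is the vector of outcomes, $X$ is the design matrix of the predictor variables, $β$ is the vector of fixed-effects regression coefficients, $Z$ is the design matrix for the random effects, $u$ is the vector of random effects, and $ε$ is the vector of residuals.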
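The pieces of the model equation can be assembled by hand for a tiny example. The sketch below assumes a random intercept only (one column of $Z$ per doctor), six hypothetical patients spread over three hypothetical doctors, and made-up values for every vector; `matvec` is a small helper written here, not a library function.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

# Hypothetical layout: 6 patients, 3 doctors, 2 patients per doctor.
doctor_of_patient = [0, 0, 1, 1, 2, 2]
n_doctors = 3

# X: intercept column plus one predictor (N x 2).
X = [[1.0, x] for x in [2.0, 4.0, 1.0, 3.0, 5.0, 6.0]]
beta = [1.0, 0.5]                       # fixed effects (assumed values)

# Z: indicator of which doctor each patient belongs to (N x J).
Z = [[1.0 if j == d else 0.0 for j in range(n_doctors)]
     for d in doctor_of_patient]
u = [0.3, -0.2, -0.1]                   # random intercepts (assumed values)

eps = [0.1, -0.1, 0.0, 0.2, -0.2, 0.0]  # residuals (assumed values)

# y = X beta + Z u + eps, computed element by element.
fixed_part = matvec(X, beta)
random_part = matvec(Z, u)
y = [f + r + e for f, r, e in zip(fixed_part, random_part, eps)]

for row, value in zip(Z, y):
    print(row, round(value, 2))
```

Because each row of `Z` has a single 1, multiplying by `u` just picks out the random intercept of that patient's doctor.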

To make this more concrete, let’s consider an example from a simulated dataset. Doctors ( $J = 407$

In our example, $N = 8525$

and by stacking observations from all groups together we would have:

Because $Z$

Pictorial representation of the sparse matrix Z

In order to see the structure in more detail, we could also zoom in on just the first 10 doctors. The filled space indicates rows of observations belonging to the doctor in that column, whereas the white space indicates not belonging to the doctor in that column.

Pictorial representation of the first 10 columns of the sparse matrix Z
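The sparsity of $Z$ can be quantified with the sizes quoted in the text. The sketch below assumes a random intercept only, so each row of $Z$ contains exactly one nonzero entry, and it assigns patients to doctors round-robin purely for illustration; the real dataset's group sizes are not reproduced here.

```python
N, J = 8525, 407  # number of patients and doctors quoted in the text

# Hypothetical assignment of patients to doctors (round-robin).
doctor_of_patient = [i % J for i in range(N)]

# Sparse storage: one column index per row instead of J mostly-zero entries.
nonzeros = len(doctor_of_patient)
dense_entries = N * J
density = nonzeros / dense_entries

def z_row(i, n_cols=10):
    """Dense view of row i of Z, restricted to the first n_cols doctors."""
    return [1 if doctor_of_patient[i] == j else 0 for j in range(n_cols)]

# Peek at the first few rows, as in the zoomed-in picture.
for i in range(3):
    print(z_row(i))

print(f"{nonzeros} nonzeros out of {dense_entries} entries "
      f"(density {density:.4f})")
```

Under this assumption the density is exactly 1/J, which is why software stores $Z$ in a sparse format rather than as a full $N \times J$ array.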

If we estimated it, $u$

$u \sim N(0, G)$

Which is read: “u is distributed as normal with mean zero and variance G”. Where $G$

Because $G$

$G=σ(θ)$

In other words, $G$

The final element in our model is the variance-covariance matrix of the residuals, $ε$

where $I$

So the final fixed elements are $y$

$(y \mid β; u = u) \sim N(Xβ + Zu, R)$

We could also frame our model in a two level-style equation for the $i$

$\begin{matrix} \end{matrix}$

Substituting in the level 2 equations into level 1, yields the mixed model specification. Here we grouped the fixed and random intercept parameters together to show that combined they give the estimated intercept for a particular doctor.
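As a generic illustration of that substitution (not the page's own equations), consider a random-intercept model with a single hypothetical predictor $x_{ij}$ for patient $i$ of doctor $j$. The symbols $γ_{00}$, $γ_{10}$, $u_{0j}$, and $e_{ij}$ follow common multilevel notation and are assumptions here.

```latex
\begin{aligned}
\text{Level 1:}\quad & Y_{ij} = \beta_{0j} + \beta_{1j} x_{ij} + e_{ij} \\
\text{Level 2:}\quad & \beta_{0j} = \gamma_{00} + u_{0j}, \qquad \beta_{1j} = \gamma_{10} \\
\text{Combined:}\quad & Y_{ij} = (\gamma_{00} + u_{0j}) + \gamma_{10} x_{ij} + e_{ij}
\end{aligned}
```

The parenthesized term $(γ_{00} + u_{0j})$ is the intercept for doctor $j$: a fixed part shared by all doctors plus that doctor's random deviation.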

Reference link: https://stats.idre.ucla.edu/other/mult-pkg/introduction-to-linear-mixed-models/