This chapter introduces core concepts of xxM as a modeling framework and software from a practical standpoint. Key elements of model specification in xxM are introduced in the context of fitting a bivariate random-intercepts model.

We use the bivariate random intercepts model example from Mehta, Neale, and Flay (2005).

Bivariate random-intercepts model

Equations

level 1

\[ y_{pij} = 1 \times \eta_{pj} + e_{pij}, \] \[ e ~ N(0, R) \]

level 2

\[ \eta_{pj} = \alpha_{p0} + u_{pj}, \] \[ u~N(0,G) \]

Mixed-effects model matrices \( R \) and \( G \) correspond to \( \Theta \) and \( \Psi \) matrices in SEM. Henceforth, we will use to \( \Theta \) and \( \Psi \) for observed and latent residual covariance matrix, respectively. For two dependent variables, we can write the equation as:

\[ y_{1ij} = 1 \times \eta_{1j} + e_{1ij} \] \[ y_{2ij} = 1 \times \eta_{2j} + e_{2ij} \]

\[ \Theta = \begin{bmatrix} \theta_{1,1} & \\ \theta_{2,1} & \theta_{2,2} \end{bmatrix} \] \[ \Psi= \begin{bmatrix} \psi_{1,1} & \\ \psi_{2,1} & \psi_{2,2} \end{bmatrix} \]

The above bivariate random-intercepts model has 8 parameters:

Path-diagram

The following two-level path diagram accurately represents all parameters except the grand means.

alt brim

Fitting bivariate random intercepts model in xxM

A complete model involves just three objects:

  1. Main Model
  2. One or more submodels
  3. Parameter matrices.

These objects are very simple and are constructed and added in a sequential fashion. An advantage of this approach is that complex models just involve more of these objects. The process of constructing and adding these objects remains the same.

  1. Construct main model.
  2. Constuct submodels for each level and add to the main model.
  3. Construct parameter matrices and add to the model.

A complete xxM model script has just five succinct commands to accomplish these goals:

  1. xxmModel()
  2. xxmSubmodel()
  3. xxmWithinMatrix()
  4. xxmBetweenMatrix()
  5. xxmRun()

The following sections describes how these five xxM commands are used to construct a complete model.

Model

An n-level xxM model is composed of n submodels. The very first step in specifying an xxM model is to create a model object by invoking xxmModel(). The idea is to declare names of all levels.

brim <- xxmModel(levels = c("student", "teacher"))

Internally, xxM assigns level numbers \( (lnum = {1,2,\dotsc,P}) \) to each level declared in xxmModel(). The order of levels is also important. In this case, students are influenced by teachers. Hence, the student level must be declared before the teacher level. The intent behind this function is obvious.The function creates an object called brim (i.e., bivariate random intercepts model) from a list of level names. The left hand side is the name of the xxM model. The choice of the name is arbitrary and other names such as ladyGaga or cow would work. However, it is better to use a short, but descriptive name, as this name will be used in all subsequent commands. Internally, the above invocation literally creates a model with placeholders for the student and teacher submodels, as depicted below.

alt brim.xxmModel

More formally, the function takes a single parameter aptly called levels and expects to receive a list of level names. The function creates an xxM object and returns a handle or a pointer to the object in memory.
The following command presents a generic invocation of xxmModel().

xxmObject <- xxmModel(levels = list)

But what is a level?

At this stage it appears that we have data for students and teachers. Presumably students are nested within teachers. In multilevel modeling jargon, we have two levels with students hierarchically nested within teachers. While the term level is obvious in this simple case, xxM can be used with fairly complex data-structures in which the notion of levels may not be so easy to identify. Levels represent any concrete or abstract set of entities across which some attribute is expected to vary. Very simply, a level involves multiple entities of some kind (e.g., students, situations, responses, occasions etc.) for whom there is an attribute or a variable (e.g., IQ or achievement) of interest. Each level may have its own set of observed and/or latent variables. The definition of levels also implies a flow of influence. Levels have a one-directional relationship. A higher order level is said to influence a lower order level. In this case, we know that teachers impact student outcomes. The directional relationships are implicitly conveyed to xxM in how the levels are ordered in the list.

Our goal is to be able to specify a latent variable model within and across levels to capture the hypothesized flow of influence. The actual model will be presented later, for now we focus on the mechanics of specifying a model in xxM. We begin with the submodels for students and teachers.

Submodels

Each level may have its very own complete SEM model with observed dependent and exogenous independent variables, latent variables, measurement model, and structural model involving all possible regressions (observed on observed, observed on latent, latent on observed, and latent on latent). Before we can begin to specify the actual model, we need to provide our model object- brim, with basic information about each level. This is accomplished by the xxmSubmodel() function.

brim <- xxmSubmodel( model = “brim”,
 level = “student”, 
 parents = c(“teacher”),
 ys = c(“y1”, “y2”),
 xs = ,
 etas = ,
 data = student )

These assignment statements build the submodel with the constituent parts. The xxmSubmodel() function adds basic information to our xxM model object, brim.

parents = c( “family”, “teacher”, “school”, “neighborhood” )

The corresponding submodel for the teacher level is:

brim <- xxmSubmodel( model = “brim”,
 level = “teacher”, 
 parents = ,
 ys = ,
 xs = ,
 etas = c( “eta1”, “eta2” ),
 data = teacher )

The teacher level does not have a parent, nor does it have observed dependent or independent variables. The teacher level does have two latent variables. The latent variables represent random-intercepts of student level dependent variables (y1 and y2). If teachers were nested within a higher level such as school, the parents argument would be:

parents = c( “school” )

How are datasets structured?

For two level data structures, a single dataset is adequate. However with complex dependent data-structures it is most convenient to provide data for each level separately. Each dataset must include information about how each observation at a lower level is linked to a higher level. In general, datasets must have three types of variables:

  1. One or more columns of IDs or variables with linking information.
  2. Zero or more columns of dependent variables corresponding to the list of ys.
  3. Sero or more columns of independent variables corresponding to the list of xs.

The student data has four columns student, teacher, y1 and y2. The first column is for the ID variable for the current level, in this case student. Student has a single parent, teacher. The ID columns must have the same name as the name of the corresponding level. Practically, it means that if in your dataset the ID variables are named SID and TID, these must be renamed to student and teacher.

The teacher dataset (teacher) has no observed variables, nor does it have parents. Yet, it is necessary to have a dataset listing teacher IDs. The teacher data has a single column teacher.

A complete checklist to ensure that the data requirements are met:

  1. Each level must have a corresponding R dataset. To avoid confusion, the name of the dataset should match the level name.
  2. If a level does not have observed dependent or independent variables, the dataset would inclde a single column of level IDs.
  3. The first (1 + p) columns of a dataset include ID variables.
  4. The names of the ID columns must match the corresponding level names.
  5. The order of parent levels in the dataset must be ascending. Lower level ID columns must come before higher level ID columns.
  6. ID columns must be of type integer. R routinely converts categorical variables to a factor type.
  7. Dependent and independent variables must be of type numeric
  8. Use R command for examining structure of a dataset to ensure that the above requirements are met, e.g. str(myLevel1Data) and str(myLevel2Data).

So far we created a model object called brim by invoking xxmModel() and declared submodels for student and teacher by invoking xxmSubModel(). At this point, our model object called brim, looks as follows and is just a shell of the final desired model. The next logical step would be to specify the actual model, i.e., how observed and latent variables relate to each other. This is accomplished by defining parameter matrices. alt brim.xxmSubmodel

Matrices

From an xxM perspective, the above model is specified in terms of parameters and matrices associated with each level and links among variables across levels. This sounds complicated, but in reality we will simply repeat what we have already stated in previous sections:

What parameters are we estimating?

Parameters of interest in the above model specification are best specified as matrices. This allows complex models to be expressed succinctly. We begin by translating our scalar model formulation into xxM parameter matrices.

Within-student model matrices (Level 1)

At level-1, we only have variances and a covariance for the residuals. The residual covariance matrix is called the theta matrix (\( \Theta \)). The matrix is symmetric with three free parameters: two variances and a covariance \( (\theta_{12}= \theta_{21}) \).

\[ \Theta = \begin{bmatrix} \theta_{1,1} & \\ \theta_{2,1} & \theta_{2,2} \end{bmatrix} \]

Within-teacher model matrices (Level-2)

At level-2, we have a covariance and variances among the latent variables, along with their means. Latent covariance and mean matrices are called \( Psi \) (psi) and \( \alpha \) (alpha), respectively:

\[ \Psi = \begin{bmatrix} \psi_{1,1} & \\ \psi_{2,1} & \psi_{2,2} \end{bmatrix} \] \[ \alpha = \begin{bmatrix} \alpha_{1,1} \\ \alpha_{2,1} \end{bmatrix} \]

Across-level model matrices: From Teachers To Students ( Level 2 -> Level 1)

The above three matrices (\( \Theta \), \( \Psi \), & \( \alpha \)) capture information about variables within each level. The missing piece is the link among variables across levels. In this case, the teacher level latent variables influence student level observed variables. Very concretely, the effects are:

\[ y_{1ij} = 1 \times \eta_{1j} + 0 \times \eta_{2j} + e_{1ij} \] \[ y_{2ij} = 0 \times \eta_{1j} + 1 \times \eta_{2j} + e_{2ij} \]

We can represent the effects in a table, with columns representing the independent variables and rows representing the dependent variables. In xxM matrices, columns independent variables and rows represent dependent variables. With this convention we can now represent the coefficients in a single matrix \( \Lambda \) (lambda):

\[ \Lambda = \begin{bmatrix} 1 & 0\\ 0 & 1 \end{bmatrix} \]

In LISREL and xxM, \( \Lambda \) matrix is used to capture measurement relationship. In this case, level-2 latent variables are said to be measured by level-1 observed variables.

How are parameter matrices added to the model?

We have now defined four matrices that completely specify the underlying bivariate random-intercepts model. Once the model itself is clearly defined, the actual specification is really very trivial.There are just two commands for specifying parameter matrices:

brim <- xxmWithinMatrix(…)
brim <- xxmBetweenMatrix(…)

Essentially, we want to add the above four matrices to complete the model. Of the four matrices, the first matrix is a within-student matrix \( \theta \), the next two matrices are within-teacher matrices \( \alpha \) & \( \Psi \) and the last matrix (\( \Lambda \)) connects the two levels and is therefore an across-level or between matrix. At this point several things must be obvious:

  1. We will need to call xxmWithinMatrix() three times, first for the student level and then twice for the teacher level.
  2. xxmBetweenMatrix() will be called once connecting the teacher level to the student level.
brim <- xxmWithinMatrix( model = “brim”, 
 level = “student”, 
 type = “theta”… )
brim <- xxmWithinMatrix( model = “brim”, 
 level = “teacher”, 
 type = “psi”, … )
brim <- xxmWithinMatrix( model = “brim”, 
 level = “teacher”, 
 type = “alpha”, … )
brim <- xxMBetweenMatrix( model = “brim”, 
 parent = “teacher”, 
 child = “student”, 
 type = “lambda”, …)

Now we know the general procedure for adding a matrix to the model. Let us now examine how a matrix to be added is actually specified.

What is a fixed parameter?

Note that the first three parameter matrices (\( \Theta \), \( \Psi \), and \( \alpha \)) are somewhat different from the last matrix (\( \Lambda \)). The first three matrices include model parameters that are to be freely estimated. In contrast, all four elements of the last matrix are fixed. We already know their values. This idea of free vs. fixed parameters is central in SEM. In essence, for each parameter we need to tell xxM if the parameter is to be estimated or if the parameter is to be fixed to some known value. For each parameter matrix, we need to define two separate matrices:

  1. pattern matrix indicating the pattern of free (=1) or fixed (= 0) parameters and
  2. value matrix providing numeric values for fixed-parameters or start-values for free parameters.

It is easier done than said. We use a two part name including:

lambda_pattern <- matrix(c(0, 0, 0, 0), 2, 2)
lambda_value <- matrix(c(1, 0, 0, 1), 2, 2)
\[ \Lambda_{pattern} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \] \[ \Lambda_{value} = \begin{bmatrix} 1.0 & 0.0 \\ 1.0 & 1.0 \end{bmatrix} \]

All elements of pattern matrix for \( \Lambda \) are zero indicating that none of the parameters are free to be estimated. Instead, all parameters are to be fixed to some known values. The value matrix provides the corresponding values.
The diagonal elements are to be fixed to 1.0, whereas the off-diagonal elements are to be fixed to 0.0. Compare the specification of \( \Lambda \) with that of the \( \Theta \) matrix:

theta_pattern <- matrix(c(1, 1, 1, 1), 2, 2)
theta_value <- matrix(c(1.1, 0.2, 0.2, 2.3), 2, 2)
\[ \Theta_{pattern} = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} \] \[ \Theta_{value} = \begin{bmatrix} 1.1 & 0.2 \\ 0.2 & 2.3 \end{bmatrix} \]

All four elements of the \( \Theta \) matrix are to be freely estimated. Hence all four elements in the pattern matrix are 1s. The value matrix provides start values. At this point, you may complain that you do not actually have any idea as to what these values may be. In general, almost any reasonable set of values will work. More specifically, for a residual covariance matrix, the following rules work very well in practice:

So far, we have described

Now we will see how these matrices are constructed in R and added to our xxM model.

How do I construct and add matrices to the model?

We create pattern and value matrices for each of the four model matrices, and add these matrices to our model as described earlier:

Within-Student Model Matrices

 # Theta create
theta_pattern <- matrix( c(1,1,1,1), 2,2 )
theta_value   <- matrix( c(1.1,.2,.2,2.3), 2,2 )
 # Theta add
brim <- xxmWithinMatrix( model = “brim”, 
 level = “student”, 
 type = “theta”, 
 pattern = theta_pattern,
 value = theta_value)

Within-Teacher Model Matrices

  # Psi create
psi_pattern <- matrix( c(1,1,1,1), 2,2 )
psi_value   <- matrix( c(.1,.05,.05,.2), 2,2 )
  # psi add
brim <- xxmWithinMatrix( model = “brim”, 
 level = “teacher”, 
 type = “psi”, 
 pattern = psi_pattern,
 value = psi_value) 
 # Alpha create
alpha_pattern <- matrix( c(1,1) , 2,1) 
alpha_value <- matrix( c(1.1,2.1) , 2,1) 
 # Alpha add
brim <- xxmWithinMatrix( model = “brim”, 
 level = “teacher”, 
 type = “alpha”, 
 pattern = alpha_pattern,
 value = alpha_value)

Teacher->Student Across Matrices

 # Lambda create
lambda_pattern <- matrix( c(0,0,0,0), 2,2 )
lambda_value   <- matrix( c(1,0,0,1), 2,2 ) 
 # Lambda add
brim <- xxmBetweenMatrix( model = “brim”, 
 parent = “teacher”, 
 child = “student”, 
 type = “lambda”,  
 pattern = lambda_pattern,
 value = lambda_value)

The following diagram illustrates a one-to-one correspondence between the path-diagram, corresponding model matrices, and xxM specification.

Note:

Run

If all went well above and xxM did not produce any error messages, then our model object brim has all the information it needs to estimate the model parameters. We can begin estimation by issuing a simple command:

brim <- xxmRun(brim)

Code listing

[su_tabs] [su_tab title=“xxM”]

Load xxM and data

library(xxm)
data(brim.xxm, package = "xxm")

Construct R-matrices

For each parameter matrix, construct three related matrices:

  1. pattern matrix: A matrix indicating free or fixed parameters.
  2. value matrix: with start or fixed values for corresponding parameters.
  3. label matrix: with user friendly label for each parameter. label matrix is optional.
theta_pattern <- matrix(c(1, 1, 1, 1), nrow = 2, ncol = 2)
theta_value <- matrix(c(1, 0, 0, 1), nrow = 2, ncol = 2)
alpha_pattern <- matrix(1, 2, 1)
alpha_value <- matrix(c(0, 0), 2, 1)
psi_pattern <- matrix(c(1, 1, 1, 1), , 2, 2)
psi_value <- matrix(c(0.1, 0, 0, 0.1), 2, 2)
lambda_pattern <- matrix(c(0, 0, 0, 0), 2, 2)
lambda_value <- matrix(c(1, 0, 0, 1), 2, 2)

Construct main model object

xxmModel() is used to declare level names. The function returns a model object that is passed as a parameter to subsequent stattements.

xm <- xxmModel(levels = c("student", "teacher"))
## A new model was created.

Add submodels to the model objects

For each declared level xxmSubmodel() is invoked to add corresponding submodel to the model object. The function adds three pieces of information: 1. parents declares a list of parents of the current level. 2. variables declares names of observed dependent (ys), observed independent (xs) and latent variables (etas) for the level. 3. data R data object for the current level.

### Submodel: Student
xm <- xxmSubmodel(model = xm, level = "student", parents = c("teacher"), ys = c("y1", 
    "y2"), xs = , etas = , data = brim.student)
## Submodel for level `student` was created.
### Submodel: Teacher
xm <- xxmSubmodel(model = xm, leve = "teacher", parents = , ys = , xs = , etas = c("eta1", 
    "eta2"), data = brim.teacher)
## Submodel for level `teacher` was created.

Add Within-level parameter matrices for each submodel

For each declared level xxmWithinMatrix() is used to add within-level parameter matrices. For each parameter matrix, the function adds the three matrices constructed earlier:

### Within-Matrix: Level :: Student, Type :: observed residual-covariance
xm <- xxmWithinMatrix(model = xm, level = "student", type = "theta", pattern = theta_pattern, 
    value = theta_value)
## 
## 'theta' matrix does not exist and will be added.
##  Added `theta` matrix.

### Within-Matrix: Level :: Teacher, Type :: latent means
xm <- xxmWithinMatrix(model = xm, level = "teacher", type = "alpha", pattern = alpha_pattern, 
    value = alpha_value)
## 
## 'alpha' matrix does not exist and will be added.
##  Added `alpha` matrix.

### Within-Matrix: Level :: Teacher, Type :: latent residual covariance
xm <- xxmWithinMatrix(model = xm, level = "teacher", type = "psi", pattern = psi_pattern, 
    value = psi_value)
## 
## 'psi' matrix does not exist and will be added.
##  Added `psi` matrix.

Add Across-level parameter matrices to the model

Pairs of levels that share parent-child relationship have regression relationships. xxmBetweenMatrix() is used to add corresponding parameter matrices connecting the two levels.

For each parameter matrix, the function adds the three matrices constructed earlier:

### Between-Matrix: Parent :: Teacher, Child :: Student, Type ::
### factor-loadings
xm <- xxmBetweenMatrix(model = xm, parent = "teacher", child = "student", type = "lambda", 
    pattern = lambda_pattern, value = lambda_value)
## 
## 'lambda' matrix does not exist and will be added.
##  Added `lambda` matrix.

Estimate model parameters

Estimation process is initiated by xxmRun(). If all goes well, a quick printed summary of results is produced.

xm <- xxmRun(xm)
## ------------------------------------------------------------------------------
## Estimating model parameters
## ------------------------------------------------------------------------------
##                 456.5781921615 
##                 454.3359125436 
##                 452.6051730894 
##                 451.0652787053 
##                 450.0531684945 
##                 448.6173091428 
##                 447.5320743183 
##                 447.1954369085 
##                 445.8740258784 
##                 445.7721481833 
##                 445.4488373694 
##                 445.4357270893 
##                 445.4338550574 
##                 445.4331653124 
##                 445.4331434919 
##                 445.4331407646 
##                 445.4331407251 
## Model converged normally
## nParms: 8
## ------------------------------------------------------------------------------
## *
##  1:                             student_theta_1_1 ::      0.969 [     0.000,      0.000]
## 
##  2:                             student_theta_1_2 ::      0.338 [     0.000,      0.000]
## 
##  3:                             student_theta_2_2 ::      0.893 [     0.000,      0.000]
## 
##  4:                               teacher_psi_1_1 ::      0.752 [     0.000,      0.000]
## 
##  5:                               teacher_psi_1_2 ::      0.128 [     0.000,      0.000]
## 
##  6:                               teacher_psi_2_2 ::      0.159 [     0.000,      0.000]
## 
##  7:                             teacher_alpha_1_1 ::     -0.107 [     0.000,      0.000]
## 
##  8:                             teacher_alpha_2_1 ::      0.278 [     0.000,      0.000]
## 
## ------------------------------------------------------------------------------

Estimate profile-likelihood confidence intervals

Once parameters are estimated, confidence inetrvals are estimated by invoking xxmCI() . Depending on the the number of observations and the complexity of the dependence structure xxmCI() may take very long. xxMCI() displays a table of parameter estimates and CIS.

View results

A summary of results may be retrived as an R list by a call to xxmSummary()

s <- xxmSummary(xm)
s
## $fit
## $fit$deviance
## [1] 445.4
## 
## $fit$nParameters
## [1] 8
## 
## $fit$nObservations
## [1] 150
## 
## $fit$aic
## [1] 461.4
## 
## $fit$bic
## [1] 485.5
## 
## 
## $estimates
##      child  parent   to from             label estimate       low   high
## 1  student student   y1   y1 student_theta_1_1   0.9691  0.670784 1.4741
## 2  student student   y1   y2 student_theta_1_2   0.3381  0.090586 0.6823
## 3  student student   y2   y2 student_theta_2_2   0.8935  0.618407 1.3589
## 8  teacher teacher eta1 eta1   teacher_psi_1_1   0.7524  0.293264 1.6633
## 9  teacher teacher eta1 eta2   teacher_psi_1_2   0.1283 -0.156114 0.5437
## 10 teacher teacher eta2 eta2   teacher_psi_2_2   0.1592  0.001000 0.5548
## 11 teacher teacher eta1  One teacher_alpha_1_1  -0.1073 -0.529914 0.3154
## 12 teacher teacher eta2  One teacher_alpha_2_1   0.2781  0.002623 0.5537

Free moodel object

xxM model object may hog a large amount of RAM outside of R's memory. This memory will automatically be released, when R's workspace is cleared by a call to rm(list=ls()) or at the end of the R session. Alternatively, xxmFree() may be called to release memory.

xm <- xxmFree(xm)

[/su_tab] [su_tab title=“SAS: Proc Mixed”]

Proc Mixed data = brim covtest;
 CLASS teacher vars;
 MODEL y = vars / solution noint;
 RANDOM vars / subject = teacher type = un;
 REPEATED vars/ subject = student(teacher) type = un;
RUN;

[/su_tab] [/su_tabs]

For the current dataset, the parameter estimates are:

alt brim.results

[latexpage]