Table of Contents
Dyestuff: Random-Intercepts model
Dyestuff example included with lme4 involves a random-intercepts model of yield in different batches of production of chemicals. Level-1 is response and level-2 is batch
\[ y_{i}^{1} = \Lambda_{i,j}^{1,2} \times \eta_j^{2}+ e_{i}, \]
\[ e_i^{1} \sim N(0, \Theta) \] \[ \eta_{j}^2 \sim N(\alpha^{2}, \Psi^{2,2}) \]
xxM Model Matrices
Level-1 (Response): Within matrices
Residual Covariance Matrix
We have a single dependent variable and hence a single parameter level-1 residual variance \( (\Theta_{1,1}) \) at level-1. Hence, the residual covariance or theta matrix is a (1×1) matrix:
\Theta^{1,1} =
\begin{bmatrix}
\theta^{1,1}_{1,1}
\end{bmatrix}
\]
Level-2 (Batch): Within Matrices
At level-2, we have a single latent variable: Random-intercept of yield.
Latent Means
With a single latent variable, the latent variable mean matrix is a (1×1) matrix:
\alpha^2 =
\begin{bmatrix}
\alpha^2_1
\end{bmatrix}
\]
\( (\alpha^2_1) \) is the mean of the intercept for yield.
Latent Factor Covariance Matrix
The latent covariance matrix is a (1×1) matrix with single variance of the intercept:
\Psi^{2,2}=
\begin{bmatrix}
\psi_{1,1}^{2,2}
\end{bmatrix}
\]
Across level matrices: Batch to Response
Across-level factor-loading matrix
The factor-loading matrix \( (\Lambda^{1,2}) \) has a single element fixed to 1.0.
\Lambda^{1,2} =
\begin{bmatrix}
1.0_{i,j}^{1,2}
\end{bmatrix}
\]
Path Diagram
(insert diagram)
Code Listing
xxM
Load xxM
xxM library needs to be loaded first.
library(xxm)
Prepare R datasets
For this analysis, we use data packaged with lme4.
For analysis with xxM separate response and batch datasets are created as follows:
library(lme4)
## Loading required package: lattice
## Loading required package: Matrix
### Creating datasets for xxM
convert.magic1 <- function(obj, types) {
out <- lapply(1:length(obj), FUN = function(i) {
FUN1 <- switch(types[i], character = as.character, numeric = as.numeric,
factor = as.factor, integer = as.integer)
FUN1(obj[, i])
})
names(out) <- colnames(obj)
as.data.frame(out)
}
ds <- Dyestuff
# assign proper names
names(ds) <- c("batch", "yield")
# create a two-level dataset with response as level-1 and batch as level-2.
# create an ID column for the 'response' level with the same column name
ds$response <- 1:nrow(ds)
# Ensure that the first column is the ID column (for)response)
response <- ds[c("response", "batch", "yield")]
# ID columns (response and batch) must be integer, Outcome must be numeric.
response <- convert.magic1(response, c("integer", "integer", "numeric"))
# create level-2 dataset.
batch <- data.frame(batch = unique(response$batch))
str(response)
## 'data.frame': 30 obs. of 3 variables:
## $ response: int 1 2 3 4 5 6 7 8 9 10 ...
## $ batch : int 1 1 1 1 1 2 2 2 2 2 ...
## $ yield : num 1545 1440 1440 1520 1580 ...
str(batch)
## 'data.frame': 6 obs. of 1 variable:
## $ batch: int 1 2 3 4 5 6
response dataset includes the following columns:
- response Unique response IDs (integer)
- batch IDs for the parent level (integer)
- yield the dependent varaible (numeric)
batch dataset includes the following columns:
- batch Unique batch IDs (integer)
Construct R-matrices
For each parameter matrix, construct three related matrices:
- pattern matrix: A matrix indicating free or fixed parameters.
- value matrix: with start or fixed values for corresponding parameters.
- label matrix: with user friendly label for each parameter. label matrix is optional.
# level-1 (response) residual variance
theta_pattern <- matrix(1, nrow = 1, ncol = 1)
theta_value <- matrix(4000, 1, 1)
# level-2 (batch) latent variance
psi_pattern <- matrix(1, nrow = 1, ncol = 1)
psi_value <- matrix(1, nrow = 1, ncol = 1)
# level-2 (batch) latent mean
alpha_pattern <- matrix(1, nrow = 1, ncol = 1)
alpha_value <- matrix(1500, nrow = 1, ncol = 1)
# batch -> response: Factor-loading
lambda_pattern <- matrix(0, nrow = 1, ncol = 1)
lambda_value <- matrix(1, nrow = 1, ncol = 1)
Construct model
xxmModel()
is used to declare level names. The function returns a model object that is passed as a parameter to subsequent statements. Variable name for the return value can be anything.
dyes <- xxmModel(levels = c("response", "batch"))
## A new model was created.
Add submodels
For each declared level xxmSubmodel()
is invoked to add corresponding submodel to the model object. The function adds three types of information to the model object:
- parents declares a list of all parents of the current level.
- Level with the independent variable is the parent level.
- Level with the dependent variable is the child level.
- variables declares names of observed dependent (ys), observed independent (xs) and latent variables (etas) for the level.
- data R data object for the current level.
### Submodel: Response
dyes <- xxmSubmodel(model = dyes, level = "response", parents = "batch", ys = "yield",
xs = , etas = , data = response)
## Submodel for level `response` was created.
### Submodel: Batch
dyes <- xxmSubmodel(model = dyes, level = "batch", parents = , ys = , xs = ,
etas = "intercept", data = batch)
## Submodel for level `batch` was created.
Add within-level matrices
For each declared level xxmWithinMatrix()
is used to add within-level parameter matrices. For each parameter matrix, the function adds the three matrices constructed earlier:
- pattern
- value
- label (optional)
### Within-Matrix: Level :: Response, Type :: observed residual-covariance
dyes <- xxmWithinMatrix(model = dyes, level = "response", type = "theta", pattern = theta_pattern,
value = theta_value, label = )
##
## 'theta' matrix does not exist and will be added.
## Added `theta` matrix.
### Within-Matrix: Level :: Batch, Type ::latent-covariance
dyes <- xxmWithinMatrix(model = dyes, level = "batch", type = "psi", pattern = psi_pattern,
value = psi_value, label = )
##
## 'psi' matrix does not exist and will be added.
## Added `psi` matrix.
### Within-Matrix: Level :: Batch, Type ::latent-mean
dyes <- xxmWithinMatrix(model = dyes, level = "batch", type = "alpha", pattern = alpha_pattern,
value = alpha_value, label = )
##
## 'alpha' matrix does not exist and will be added.
## Added `alpha` matrix.
Add across-level matrices
Pairs of levels that share parent-child relationship have regression relationships. xxmBetweenMatrix()
is used to add corresponding rergession matrices connecting the two levels.
- Level with the independent variable is the parent level.
- Level with the dependent variable is the child level.
For each parameter matrix, the function adds the three matrices constructed earlier:
- pattern
- value
- label (optional)
### Between-Matrix: Child :: Response, Parent :: Batch, Type :: factor-loading
dyes <- xxmBetweenMatrix(model = dyes, parent = "batch", child = "response",
type = "lambda", pattern = lambda_pattern, value = lambda_value, label = )
##
## 'lambda' matrix does not exist and will be added.
## Added `lambda` matrix.
Estimate model parameters
Estimation process is initiated by xxmRun()
. If all goes well, a q&d summary of the results is printed.
dyes <- xxmRun(dyes)
## ------------------------------------------------------------------------------
## Estimating model parameters
## ------------------------------------------------------------------------------
## 338.3147186475
## 337.4404249706
## 334.7574504898
## 333.6774516610
## 333.5732949180
## 333.5095319120
## 333.3473806261
## 333.0668046496
## 332.6954956766
## 332.4430076297
## 332.3671913446
## 332.3362465828
## 332.1836952341
## 331.4390119131
## 330.4236242031
## 329.9603694067
## 328.3308444258
## 327.4561124889
## 327.3876794899
## 327.3329362482
## 327.3271607404
## 327.3270605553
## 327.3270599048
## 327.3270598811
## Model converged normally
## nParms: 3
## ------------------------------------------------------------------------------
## *
## 1: response_theta_1_1 :: 2451.250 [ 0.000, 0.000]
##
## 2: batch_psi_1_1 :: 1388.333 [ 0.000, 0.000]
##
## 3: batch_alpha_1_1 :: 1527.500 [ 0.000, 0.000]
##
## ------------------------------------------------------------------------------
Estimate profile-likelihood confidence intervals
Once parameters are estimated, confidence intervals are estimated by invoking xxmCI()
. Depending on the the number of observations and the complexity of the model, xxmCI()
may take a long time to compute. xxmCI()
also prints a summary of parameter estimates and CIS.
dyes <- xxmCI(dyes)
View results
A summary of results may be retrived as an R list by a call to xxmSummary()
. The returned list has two elements:
- fit is a list with five elements:
- deviance is \( -2 Log Likelihood \) for the maximum likelihood fit function.
- nParameters is the total number of unique parameters.
- nObservations is the total number of observations across all levels.
- aic is Akaike’s Information Criterion or AIC computed as \( -2ll + 2*p \).
- bic is Bayesian Information Criterion or BIC computed as \( -2ll + p*\log(n) \).
- estimates is a single table of free parameter estimates
All xxM parameters have superscripts {child, parent} and subscripts {to, from}. xxM adds a descriptive parameter label if one is not already provided by the user.
xxmSummary(dyes)
## $fit
## $fit$deviance
## [1] 327.3
##
## $fit$nParameters
## [1] 3
##
## $fit$nObservations
## [1] 30
##
## $fit$aic
## [1] 333.3
##
## $fit$bic
## [1] 337.5
##
##
## $estimates
## child parent to from label estimate low
## 1 response response yield yield response_theta_1_1 2451 1461.5
## 3 batch batch intercept intercept batch_psi_1_1 1388 148.8
## 4 batch batch intercept One batch_alpha_1_1 1527 1486.5
## high
## 1 4578
## 3 7067
## 4 1569
Free model object
xxM
model object may hog a significant amount of RAM outside of R’s workspace. This memory will automatically be released, when the workspace is cleared by a call to rm(list=ls())
or at the end of the R session. Alternatively, it is recommended that xxmFree()
may be called to release the memory.
dyes <- xxmFree(dyes)
rm(list = ls())
Proc Mixed
PROC MIXED data=rdata method=ML covtest;
CLASS batch;
MODEL yield = /s;
RANDOM intercept /subject=batch type=UN G GCORR;
RUN;
R: lmer
library(lme4)
(fm01 <- lmer(Yield ~ 1 + (1 | Batch), REML = FALSE, Dyestuff))
## Linear mixed model fit by maximum likelihood ['lmerMod']
## Formula: Yield ~ 1 + (1 | Batch)
## Data: Dyestuff
## AIC BIC logLik deviance
## 333.3 337.5 -163.7 327.3
## Random effects:
## Groups Name Std.Dev.
## Batch (Intercept) 37.3
## Residual 49.5
## Number of obs: 30, groups: Batch, 6
## Fixed Effects:
## (Intercept)
## 1527