# Dyestuff: Random-Intercepts model

Dyestuff example included with lme4 involves a random-intercepts model of yield in different batches of production of chemicals. Level-1 is response and level-2 is batch

$y_{i}^{1} = \Lambda_{i,j}^{1,2} \times \eta_j^{2}+ e_{i},$

$e_i^{1} \sim N(0, \Theta)$ $\eta_{j}^2 \sim N(\alpha^{2}, \Psi^{2,2})$

# xxM Model Matrices

## Level-1 (Response): Within matrices

### Residual Covariance Matrix

We have a single dependent variable and hence a single parameter level-1 residual variance $$(\Theta_{1,1})$$ at level-1. Hence, the residual covariance or theta matrix is a (1×1) matrix:

$\Theta^{1,1} = \begin{bmatrix} \theta^{1,1}_{1,1} \end{bmatrix}$

## Level-2 (Batch): Within Matrices

At level-2, we have a single latent variable: Random-intercept of yield.

### Latent Means

With a single latent variable, the latent variable mean matrix is a (1×1) matrix:

$\alpha^2 = \begin{bmatrix} \alpha^2_1 \end{bmatrix}$

$$(\alpha^2_1)$$ is the mean of the intercept for yield.

### Latent Factor Covariance Matrix

The latent covariance matrix is a (1×1) matrix with single variance of the intercept:

$\Psi^{2,2}= \begin{bmatrix} \psi_{1,1}^{2,2} \end{bmatrix}$

## Across level matrices: Batch to Response

The factor-loading matrix $$(\Lambda^{1,2})$$ has a single element fixed to 1.0.

$\Lambda^{1,2} = \begin{bmatrix} 1.0_{i,j}^{1,2} \end{bmatrix}$

(insert diagram)

# Code Listing

xxMSAS: Proc MixedR: lmer

## xxM

xxM library needs to be loaded first.

library(xxm)


### Prepare R datasets

For this analysis, we use data packaged with lme4.
For analysis with xxM separate response and batch datasets are created as follows:

library(lme4)

## Loading required package: lattice

### Creating datasets for xxM
convert.magic1 <- function(obj, types) {
out <- lapply(1:length(obj), FUN = function(i) {
FUN1 <- switch(types[i], character = as.character, numeric = as.numeric,
factor = as.factor, integer = as.integer)
FUN1(obj[, i])
})
names(out) <- colnames(obj)
as.data.frame(out)
}
ds <- Dyestuff
# assign proper names
names(ds) <- c("batch", "yield")
# create a two-level dataset with response as level-1 and batch as level-2.
# create an ID column for the 'response' level with the same column name
ds$response <- 1:nrow(ds) # Ensure that the first column is the ID column (for)response) response <- ds[c("response", "batch", "yield")] # ID columns (response and batch) must be integer, Outcome must be numeric. response <- convert.magic1(response, c("integer", "integer", "numeric")) # create level-2 dataset. batch <- data.frame(batch = unique(response$batch))
str(response)

## 'data.frame':    30 obs. of  3 variables:
##  $response: int 1 2 3 4 5 6 7 8 9 10 ... ##$ batch   : int  1 1 1 1 1 2 2 2 2 2 ...
##  $yield : num 1545 1440 1440 1520 1580 ...  str(batch)  ## 'data.frame': 6 obs. of 1 variable: ##$ batch: int  1 2 3 4 5 6


response dataset includes the following columns:

• response Unique response IDs (integer)
• batch IDs for the parent level (integer)
• yield the dependent varaible (numeric)

batch dataset includes the following columns:

• batch Unique batch IDs (integer)

### Construct R-matrices

For each parameter matrix, construct three related matrices:

1. pattern matrix: A matrix indicating free or fixed parameters.
2. value matrix: with start or fixed values for corresponding parameters.
3. label matrix: with user friendly label for each parameter. label matrix is optional.
# level-1 (response) residual variance
theta_pattern <- matrix(1, nrow = 1, ncol = 1)
theta_value <- matrix(4000, 1, 1)
# level-2 (batch) latent variance
psi_pattern <- matrix(1, nrow = 1, ncol = 1)
psi_value <- matrix(1, nrow = 1, ncol = 1)
# level-2 (batch) latent mean
alpha_pattern <- matrix(1, nrow = 1, ncol = 1)
alpha_value <- matrix(1500, nrow = 1, ncol = 1)
lambda_pattern <- matrix(0, nrow = 1, ncol = 1)
lambda_value <- matrix(1, nrow = 1, ncol = 1)


### Construct model

xxmModel() is used to declare level names. The function returns a model object that is passed as a parameter to subsequent statements. Variable name for the return value can be anything.

dyes <- xxmModel(levels = c("response", "batch"))

## A new model was created.


For each declared level xxmSubmodel() is invoked to add corresponding submodel to the model object. The function adds three types of information to the model object:

• parents declares a list of all parents of the current level.
• Level with the independent variable is the parent level.
• Level with the dependent variable is the child level.
• variables declares names of observed dependent (ys), observed independent (xs) and latent variables (etas) for the level.
• data R data object for the current level.
### Submodel: Response
dyes <- xxmSubmodel(model = dyes, level = "response", parents = "batch", ys = "yield",
xs = , etas = , data = response)

## Submodel for level response was created.

### Submodel: Batch
dyes <- xxmSubmodel(model = dyes, level = "batch", parents = , ys = , xs = ,
etas = "intercept", data = batch)

## Submodel for level batch was created.


For each declared level xxmWithinMatrix() is used to add within-level parameter matrices. For each parameter matrix, the function adds the three matrices constructed earlier:

• pattern
• value
• label (optional)
### Within-Matrix: Level :: Response, Type :: observed residual-covariance
dyes <- xxmWithinMatrix(model = dyes, level = "response", type = "theta", pattern = theta_pattern,
value = theta_value, label = )

##
## 'theta' matrix does not exist and will be added.
##  Added theta matrix.

### Within-Matrix: Level :: Batch, Type ::latent-covariance
dyes <- xxmWithinMatrix(model = dyes, level = "batch", type = "psi", pattern = psi_pattern,
value = psi_value, label = )

##
## 'psi' matrix does not exist and will be added.
##  Added psi matrix.

### Within-Matrix: Level :: Batch, Type ::latent-mean
dyes <- xxmWithinMatrix(model = dyes, level = "batch", type = "alpha", pattern = alpha_pattern,
value = alpha_value, label = )

##
## 'alpha' matrix does not exist and will be added.
##  Added alpha matrix.


Pairs of levels that share parent-child relationship have regression relationships. xxmBetweenMatrix() is used to add corresponding rergession matrices connecting the two levels.

• Level with the independent variable is the parent level.
• Level with the dependent variable is the child level.

For each parameter matrix, the function adds the three matrices constructed earlier:

• pattern
• value
• label (optional)
### Between-Matrix: Child :: Response, Parent :: Batch, Type :: factor-loading
dyes <- xxmBetweenMatrix(model = dyes, parent = "batch", child = "response",
type = "lambda", pattern = lambda_pattern, value = lambda_value, label = )

##
## 'lambda' matrix does not exist and will be added.
##  Added lambda matrix.


### Estimate model parameters

Estimation process is initiated by xxmRun(). If all goes well, a q&d summary of the results is printed.

dyes <- xxmRun(dyes)

## ------------------------------------------------------------------------------
## Estimating model parameters
## ------------------------------------------------------------------------------
##                 338.3147186475
##                 337.4404249706
##                 334.7574504898
##                 333.6774516610
##                 333.5732949180
##                 333.5095319120
##                 333.3473806261
##                 333.0668046496
##                 332.6954956766
##                 332.4430076297
##                 332.3671913446
##                 332.3362465828
##                 332.1836952341
##                 331.4390119131
##                 330.4236242031
##                 329.9603694067
##                 328.3308444258
##                 327.4561124889
##                 327.3876794899
##                 327.3329362482
##                 327.3271607404
##                 327.3270605553
##                 327.3270599048
##                 327.3270598811
## Model converged normally
## nParms: 3
## ------------------------------------------------------------------------------
## *
##  1:                            response_theta_1_1 ::   2451.250 [     0.000,      0.000]
##
##  2:                                 batch_psi_1_1 ::   1388.333 [     0.000,      0.000]
##
##  3:                               batch_alpha_1_1 ::   1527.500 [     0.000,      0.000]
##
## ------------------------------------------------------------------------------


### Estimate profile-likelihood confidence intervals

Once parameters are estimated, confidence intervals are estimated by invoking xxmCI(). Depending on the the number of observations and the complexity of the model, xxmCI() may take a long time to compute. xxmCI() also prints a summary of parameter estimates and CIS.

dyes <- xxmCI(dyes)


### View results

A summary of results may be retrived as an R list by a call to xxmSummary(). The returned list has two elements:

1. fit is a list with five elements:
• deviance is $$-2 Log Likelihood$$ for the maximum likelihood fit function.
• nParameters is the total number of unique parameters.
• nObservations is the total number of observations across all levels.
• aic is Akaike’s Information Criterion or AIC computed as $$-2ll + 2*p$$.
• bic is Bayesian Information Criterion or BIC computed as $$-2ll + p*\log(n)$$.
2. estimates is a single table of free parameter estimates
All xxM parameters have superscripts {child, parent} and subscripts {to, from}. xxM adds a descriptive parameter label if one is not already provided by the user.
xxmSummary(dyes)

## $fit ##$fit$deviance ## [1] 327.3 ## ##$fit$nParameters ## [1] 3 ## ##$fit$nObservations ## [1] 30 ## ##$fit$aic ## [1] 333.3 ## ##$fit$bic ## [1] 337.5 ## ## ##$estimates
##      child   parent        to      from              label estimate    low
## 1 response response     yield     yield response_theta_1_1     2451 1461.5
## 3    batch    batch intercept intercept      batch_psi_1_1     1388  148.8
## 4    batch    batch intercept       One    batch_alpha_1_1     1527 1486.5
##   high
## 1 4578
## 3 7067
## 4 1569


### Free model object

xxM model object may hog a significant amount of RAM outside of R’s workspace. This memory will automatically be released, when the workspace is cleared by a call to rm(list=ls()) or at the end of the R session. Alternatively, it is recommended that xxmFree() may be called to release the memory.

dyes <- xxmFree(dyes)
rm(list = ls())


## Proc Mixed

PROC MIXED data=rdata method=ML covtest;
CLASS batch;
MODEL yield = /s;
RANDOM intercept /subject=batch type=UN G GCORR;
RUN;


## R: lmer

library(lme4)
(fm01 <- lmer(Yield ~ 1 + (1 | Batch), REML = FALSE, Dyestuff))

## Linear mixed model fit by maximum likelihood ['lmerMod']
## Formula: Yield ~ 1 + (1 | Batch)
##    Data: Dyestuff
##      AIC      BIC   logLik deviance
##    333.3    337.5   -163.7    327.3
## Random effects:
##  Groups   Name        Std.Dev.
##  Batch    (Intercept) 37.3
##  Residual             49.5
## Number of obs: 30, groups: Batch, 6
## Fixed Effects:
## (Intercept)
##        1527