Multivariate Covariance Generalized Linear Models for the Analysis of Experimental Data
|
Wagner H. Bonat | Walmes M. Zeviani |
---|---|
wbonat@ufpr.br |
walmes@ufpr.br |
LEG/DEST/UFPR | LEG/DEST/UFPR |
62a RBras & 17o SEAGRO July 24–28, 2017 UFLA, Lavras/MG
The dataset contains the values of chemical variables of 3 layers of the soil in 50 sites cultivated with teak (Tectona grandis). The study was carried out in 2015, in Teca plantations belonging to two farms located in the western region of the state of Mato Grosso. The selection of study areas within the farms was done by means of free walking, covering the entire cultivated area of 1869 ha, making field observations and delimiting plots within the stands from the characteristics of the soils, position in the landscape and development of culture. Fifty plots with 600 m2 (20 x 30 m) each were allocated. As a selection criterion, only fields with an area of more than seven hectares (7 ha) were selected, using only the areas with the same planting density, management practices aged 13-14 years.
The main goal of the study is to conduct a profile analysis to verify if there is a general pattern of cation concentration (K+, Ca2+ and Mg2+) in the soil layers.
This dataset is part of data set with more tables (teca_arv
, teca_crapar
, teca_cra
, teca_gran
and teca_qui
). These data are available on the package EACS
(http://leg.ufpr.br/~walmes/pacotes/EACS/index.html). Please do not use datasets without the researchers’ permission because the results of these researches are being submitted to journals at the moment. If you want to use the data to propose statistical methodologies, contact the researchers as well.
#-----------------------------------------------------------------------
# Packages.
library(latticeExtra)
library(car)
library(reshape2)
library(corrplot)
library(grid)
library(gridExtra)
library(mcglm)
library(Matrix)
#-----------------------------------------------------------------------
# Getting the dataset.
# Online documentation of the EACS::teca_qui dataset.
u <- "http://leg.ufpr.br/~walmes/pacotes/EACS/reference/teca_qui.html"
browseURL(u)
csv <- "https://raw.githubusercontent.com/walmes/EACS/master/data-raw/teca_qui.csv"
teca <- read.csv2(file = csv, dec = ".")
str(teca)
## 'data.frame': 150 obs. of 15 variables:
## $ loc: int 1 1 1 2 2 2 3 3 3 4 ...
## $ cam: Factor w/ 3 levels "[0, 5)","[40, 80)",..: 1 3 2 1 3 2 1 3 2 1 ...
## $ ph : num 6.8 6.7 6.7 4.7 4.7 4.9 7.6 6.8 6.9 6.6 ...
## $ p : num 22.51 0.83 0.01 3.89 0.69 ...
## $ k : num 72.24 13.42 7.23 48.13 12.34 ...
## $ ca : num 8.27 2.91 2.33 0.97 0.76 0.21 7.63 3.22 2.22 5.54 ...
## $ mg : num 1.7 1.77 0.51 0.16 0.14 0 1.53 1.31 1.09 1.77 ...
## $ al : num 0 0 0 0.3 0.6 0.6 0 0 0 0 ...
## $ ctc: num 12.47 6.57 4.52 5.3 4.17 ...
## $ sat: num 81.4 71.7 63.2 23.7 22.4 ...
## $ mo : num 72.2 25.6 9.7 34.4 8.7 9.7 50.4 15.6 9.5 50.2 ...
## $ arg: num 184 215 286 232 213 ...
## $ are: num 770 750 674 741 775 ...
## $ cas: num 1.8 2.2 3.7 0.4 1.1 1.7 8.4 19 14.3 5.5 ...
## $ acc: num 770 750 676 741 775 ...
teca$cam <- factor(teca$cam, labels = 1:3)
#-----------------------------------------------------------------------
# Graphical exploratory analysis.
# The cations.
v <- c("k", "ca", "mg")
summary(teca[v])
## k ca mg
## Min. : 4.13 Min. : 0.210 Min. :0.0000
## 1st Qu.: 15.64 1st Qu.: 1.305 1st Qu.:0.3825
## Median : 31.47 Median : 2.445 Median :1.1300
## Mean : 44.46 Mean : 3.567 Mean :1.2361
## 3rd Qu.: 59.25 3rd Qu.: 4.835 3rd Qu.:1.8025
## Max. :188.96 Max. :14.410 Max. :4.9200
# Only the cations.
densityplot(~k + ca + I(mg + 0.1),
outer = TRUE,
groups = cam,
scales = list(relation = "free",
x = list(log = 10)),
as.table = TRUE,
data = teca)
scatterplotMatrix(~log(k) + log(ca) + log(mg + 0.1) | cam,
data = teca,
gap = 0,
smooth = FALSE,
reg.line = FALSE,
ellipse = TRUE,
by.groups = TRUE,
diagonal = "qqplot")
# Transformed copies of the variables.
teca <- transform(teca,
lk = log(k),
lca = log(ca),
lmg = log(mg + 0.1))
# Names of the transformed variables.
v <- tail(names(teca), n = 3)
v
## [1] "lk" "lca" "lmg"
#-----------------------------------------------------------------------
# Repeated measures design.
# Outer factor: none.
# Inner factor: soil layer (`cam`)
# Responses: 3 cations x 3 layers = 9 conditions.
# Long format.
tecal <- melt(data = teca[c("loc", "cam", v)],
measure.vars = v,
variable.name = "res")
str(tecal)
## 'data.frame': 450 obs. of 4 variables:
## $ loc : int 1 1 1 2 2 2 3 3 3 4 ...
## $ cam : Factor w/ 3 levels "1","2","3": 1 3 2 1 3 2 1 3 2 1 ...
## $ res : Factor w/ 3 levels "lk","lca","lmg": 1 1 1 1 1 1 1 1 1 1 ...
## $ value: num 4.28 2.6 1.98 3.87 2.51 ...
bwplot(value ~ cam | res,
pch = "|",
# scales = list(y = list(relation = "free")),
layout = c(NA, 1),
xlab = "Soil layer",
ylab = "Observed values in the transformed scale",
data = tecal) +
layer(panel.xyplot(x = x,
y = y,
jitter.x = TRUE,
type = c("p", "a")))
# Combine 3 responses x 3 layers = 9 response variables.
tecal$res.cam <- with(tecal, paste(res, cam, sep = "."))
# Wide format.
tecaw <- dcast(data = tecal,
formula = loc ~ res.cam,
value = "value",
stringsAsFactors = FALSE)
# Presented order of the responses.
ord <- c(t(outer(Y = levels(tecal$cam),
X = v,
FUN = paste,
sep = ".")))
tecaw <- tecaw[, c("loc", ord)]
str(tecaw)
## 'data.frame': 50 obs. of 10 variables:
## $ loc : int 1 2 3 4 5 6 7 8 9 10 ...
## $ lk.1 : num 4.28 3.87 4.74 3.99 4.04 ...
## $ lk.2 : num 1.98 2.16 3.29 2.61 2.61 ...
## $ lk.3 : num 2.6 2.51 3.25 2.41 3.1 ...
## $ lca.1: num 2.1126 -0.0305 2.0321 1.712 1.0953 ...
## $ lca.2: num 0.8459 -1.5606 0.7975 -0.2877 0.0198 ...
## $ lca.3: num 1.0682 -0.2744 1.1694 0.8109 -0.0726 ...
## $ lmg.1: num 0.588 -1.347 0.489 0.626 0.798 ...
## $ lmg.2: num -0.494 -2.303 0.174 -1.347 -1.966 ...
## $ lmg.3: num 0.626 -1.427 0.344 -1.171 -0.478 ...
#-----------------------------------------------------------------------
# Repeated measures analysis.
dput(names(tecaw)[-1])
## c("lk.1", "lk.2", "lk.3", "lca.1", "lca.2", "lca.3", "lmg.1",
## "lmg.2", "lmg.3")
# Multivariate linear model for 9 responses.
m0 <- lm(as.matrix(tecaw[, 2:10]) ~ 1)
m0
##
## Call:
## lm(formula = as.matrix(tecaw[, 2:10]) ~ 1)
##
## Coefficients:
## lk.1 lk.2 lk.3 lca.1 lca.2 lca.3 lmg.1
## (Intercept) 4.2521 2.9425 3.1651 1.5684 0.3443 0.7683 0.4740
## lmg.2 lmg.3
## (Intercept) -0.4681 -0.1056
# summary(m0)
# summary.aov(m0)
anova(m0)
## Analysis of Variance Table
##
## Df Pillai approx F num Df den Df Pr(>F)
## (Intercept) 1 0.99413 771.53 9 41 < 2.2e-16 ***
## Residuals 49
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Extract the raw residuals.
r <- residuals(m0)
# Checking the models assumptions on the residuals.
scatterplotMatrix(r,
gap = 0,
smooth = FALSE,
reg.line = FALSE,
ellipse = TRUE,
diagonal = "qqplot")
corrplot(cor(r),
type = "upper",
tl.pos = "d",
outline = TRUE,
method = "ellipse")
# Inner factors data design.
# DANGER this order changes de SS.
idata <- expand.grid(cam = levels(tecal$cam),
res = v,
KEEP.OUT.ATTRS = FALSE)
str(idata)
## 'data.frame': 9 obs. of 2 variables:
## $ cam: Factor w/ 3 levels "1","2","3": 1 2 3 1 2 3 1 2 3
## $ res: Factor w/ 3 levels "lk","lca","lmg": 1 1 1 2 2 2 3 3 3
Anova(m0,
idata = idata,
idesign = ~res * cam)
## Note: model has only an intercept; equivalent type-III tests substituted.
##
## Type III Repeated Measures MANOVA Tests: Pillai test statistic
## Df test stat approx F num Df den Df Pr(>F)
## (Intercept) 1 0.86275 308.00 1 49 < 2.2e-16 ***
## res 1 0.97606 978.44 2 48 < 2.2e-16 ***
## cam 1 0.86876 158.87 2 48 < 2.2e-16 ***
## res:cam 1 0.32750 5.60 4 46 0.000929 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#-----------------------------------------------------------------------
# Regression parameters.
B <- coef(m0)
colnames(B)
## [1] "lk.1" "lk.2" "lk.3" "lca.1" "lca.2" "lca.3" "lmg.1" "lmg.2" "lmg.3"
# The L matrix is only a scalar.
L <- matrix(1, nrow = 1, ncol = 1)
# Check the orders.
cbind(idata, res.cam = cbind(colnames(B)))
## cam res res.cam
## 1 1 lk lk.1
## 2 2 lk lk.2
## 3 3 lk lk.3
## 4 1 lca lca.1
## 5 2 lca lca.2
## 6 3 lca lca.3
## 7 1 lmg lmg.1
## 8 2 lmg lmg.2
## 9 3 lmg lmg.3
# To test effect of responses.
M_res <- cbind(c(-1, -1, -1, -1, -1, -1, 2, 2, 2),
c(-1, -1, -1, 1, 1, 1, 0, 0, 0))
linearHypothesis(m0,
test = "Pillai",
hypothesis.matrix = L,
P = M_res)
##
## Response transformation matrix:
## [,1] [,2]
## lk.1 -1 -1
## lk.2 -1 -1
## lk.3 -1 -1
## lca.1 -1 1
## lca.2 -1 1
## lca.3 -1 1
## lmg.1 2 0
## lmg.2 2 0
## lmg.3 2 0
##
## Sum of squares and products for the hypothesis:
## [,1] [,2]
## [1,] 8764.849 5083.320
## [2,] 5083.320 2948.156
##
## Sum of squares and products for error:
## [,1] [,2]
## [1,] 254.64335 60.57668
## [2,] 60.57668 175.97410
##
## Multivariate Test:
## Df test stat approx F num Df den Df Pr(>F)
## Pillai 1 0.9760584 978.4396 2 48 < 2.22e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# To test effect of soil layer.
M_cam <- cbind(c(-1, -1, 2, -1, -1, 2, -1, -1, 2),
c(-1, 1, 0, -1, 1, 0, -1, 1, 0))
linearHypothesis(m0,
test = "Pillai",
hypothesis.matrix = L,
P = M_cam)
##
## Response transformation matrix:
## [,1] [,2]
## lk.1 -1 -1
## lk.2 -1 1
## lk.3 2 0
## lca.1 -1 -1
## lca.2 -1 1
## lca.3 2 0
## lmg.1 -1 -1
## lmg.2 -1 1
## lmg.3 2 0
##
## Sum of squares and products for the hypothesis:
## [,1] [,2]
## [1,] 106.2188 253.3122
## [2,] 253.3122 604.1031
##
## Sum of squares and products for error:
## [,1] [,2]
## [1,] 108.22479 19.95788
## [2,] 19.95788 94.89703
##
## Multivariate Test:
## Df test stat approx F num Df den Df Pr(>F)
## Pillai 1 0.8687589 158.8695 2 48 < 2.22e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# GOOD: To easy get the M matrices, creates a model matrix.
M <- model.matrix(~res * cam,
data = idata,
contrasts = list(res = contr.sum,
cam = contr.sum))
a <- attr(M, "assign")
# This design matrix has all the slices needed.
M
## (Intercept) res1 res2 cam1 cam2 res1:cam1 res2:cam1 res1:cam2 res2:cam2
## 1 1 1 0 1 0 1 0 0 0
## 2 1 1 0 0 1 0 0 1 0
## 3 1 1 0 -1 -1 -1 0 -1 0
## 4 1 0 1 1 0 0 1 0 0
## 5 1 0 1 0 1 0 0 0 1
## 6 1 0 1 -1 -1 0 -1 0 -1
## 7 1 -1 -1 1 0 -1 -1 0 0
## 8 1 -1 -1 0 1 0 0 -1 -1
## 9 1 -1 -1 -1 -1 1 1 1 1
## attr(,"assign")
## [1] 0 1 1 2 2 3 3 3 3
## attr(,"contrasts")
## attr(,"contrasts")$res
## [,1] [,2]
## lk 1 0
## lca 0 1
## lmg -1 -1
##
## attr(,"contrasts")$cam
## [,1] [,2]
## 1 1 0
## 2 0 1
## 3 -1 -1
# ATTENTION: The order or lines in the `idesign` and in the `B`
# parameter matrix must match.
linearHypothesis(m0,
test = "Pillai",
hypothesis.matrix = L,
P = M[, a == 3])
##
## Response transformation matrix:
## res1:cam1 res2:cam1 res1:cam2 res2:cam2
## lk.1 1 0 0 0
## lk.2 0 0 1 0
## lk.3 -1 0 -1 0
## lca.1 0 1 0 0
## lca.2 0 0 0 1
## lca.3 0 -1 0 -1
## lmg.1 -1 -1 0 0
## lmg.2 0 0 -1 -1
## lmg.3 1 1 1 1
##
## Sum of squares and products for the hypothesis:
## res1:cam1 res2:cam1 res1:cam2 res2:cam2
## res1:cam1 12.875668 5.5941310 3.5528360 -1.5596193
## res2:cam1 5.594131 2.4304993 1.5436116 -0.6776126
## res1:cam2 3.552836 1.5436116 0.9803486 -0.4303522
## res2:cam2 -1.559619 -0.6776126 -0.4303522 0.1889154
##
## Sum of squares and products for error:
## res1:cam1 res2:cam1 res1:cam2 res2:cam2
## res1:cam1 36.581045 28.060288 9.006497 9.776567
## res2:cam1 28.060288 40.915091 7.408952 15.453836
## res1:cam2 9.006497 7.408952 16.163552 9.554192
## res2:cam2 9.776567 15.453836 9.554192 25.045473
##
## Multivariate Test:
## Df test stat approx F num Df den Df Pr(>F)
## Pillai 1 0.3275004 5.600383 4 46 0.00092895 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#-----------------------------------------------------------------------
# The univariate regression model assumes independent observations
# (between responses and between soil layers) inside each location (a
# strong assumption in this case).
an0 <- lm(value ~ res * cam, data = tecal)
anova(an0)
## Analysis of Variance Table
##
## Response: value
## Df Sum Sq Mean Sq F value Pr(>F)
## res 2 978.30 489.15 890.2577 <2e-16 ***
## cam 2 106.58 53.29 96.9932 <2e-16 ***
## res:cam 4 3.75 0.94 1.7044 0.1479
## Residuals 441 242.31 0.55
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# The point estimates for means are the same.
doBy::LSmeans(an0, effect = c("cam", "res"))
## estimate se df t.stat p.value cam res
## 1 4.2521180 0.104828 441 40.562826 6.381831e-151 1 lk
## 2 2.9425024 0.104828 441 28.069825 3.424403e-100 2 lk
## 3 3.1650635 0.104828 441 30.192934 2.185772e-109 3 lk
## 4 1.5683802 0.104828 441 14.961469 3.182203e-41 1 lca
## 5 0.3442527 0.104828 441 3.283978 1.105057e-03 2 lca
## 6 0.7683067 0.104828 441 7.329216 1.113877e-12 3 lca
## 7 0.4740344 0.104828 441 4.522023 7.882329e-06 1 lmg
## 8 -0.4681484 0.104828 441 -4.465874 1.014363e-05 2 lmg
## 9 -0.1055624 0.104828 441 -1.007006 3.144841e-01 3 lmg
cbind(c(coef(m0)))
## [,1]
## [1,] 4.2521180
## [2,] 2.9425024
## [3,] 3.1650635
## [4,] 1.5683802
## [5,] 0.3442527
## [6,] 0.7683067
## [7,] 0.4740344
## [8,] -0.4681484
## [9,] -0.1055624
mcglm
package# Installing the package from the github repository.
devtools::install_github("wbonat/mcglm", ref = "devel")
# Load the package.
library(mcglm)
# packageVersion("mcglm")
# Sort the lines per subject (`loc`) followed by `cam`.
teca <- plyr::arrange(teca, loc, cam)
# Keep only the variables that will be used.
teca <- teca[, c(1, 2, 16, 17, 18)]
str(teca)
## 'data.frame': 150 obs. of 5 variables:
## $ loc: int 1 1 1 2 2 2 3 3 3 4 ...
## $ cam: Factor w/ 3 levels "1","2","3": 1 2 3 1 2 3 1 2 3 1 ...
## $ lk : num 4.28 1.98 2.6 3.87 2.16 ...
## $ lca: num 2.1126 0.8459 1.0682 -0.0305 -1.5606 ...
## $ lmg: num 0.588 -0.494 0.626 -1.347 -2.303 ...
# Head and tail of the data.frame.
head(teca)
## loc cam lk lca lmg
## 1 1 1 4.279994 2.11263451 0.5877867
## 2 1 2 1.978239 0.84586827 -0.4942963
## 3 1 3 2.596746 1.06815308 0.6259384
## 4 2 1 3.873906 -0.03045921 -1.3470736
## 5 2 2 2.156403 -1.56064775 -2.3025851
## 6 2 3 2.512846 -0.27443685 -1.4271164
tail(teca)
## loc cam lk lca lmg
## 145 49 1 4.750914 2.534490 0.55961579
## 146 49 2 2.749832 1.211941 0.25464222
## 147 49 3 3.155297 1.691939 0.66782937
## 148 50 1 4.541165 2.431857 0.73716407
## 149 50 2 2.931727 1.501853 -0.01005034
## 150 50 3 3.491343 2.129421 0.30748470
#-----------------------------------------------------------------------
# Covariance structures for the inner subject factor `cam`.
# Independent covariance structure.
Z0 <- mc_id(teca)
Z0[[1]][1:6,1:6]
## 6 x 6 diagonal matrix of class "ddiMatrix"
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 1 . . . . .
## [2,] . 1 . . . .
## [3,] . . 1 . . .
## [4,] . . . 1 . .
## [5,] . . . . 1 .
## [6,] . . . . . 1
# Unstructured model for covariance among cam.
Z_ns <- mc_ns(teca, id = "loc")
Z_ns[[1]][1:6,1:6]
## 6 x 6 sparse Matrix of class "dgCMatrix"
##
## [1,] . 1 . . . .
## [2,] 1 . . . . .
## [3,] . . . . . .
## [4,] . . . . 1 .
## [5,] . . . 1 . .
## [6,] . . . . . .
Z_ns[[2]][1:6,1:6]
## 6 x 6 sparse Matrix of class "dgCMatrix"
##
## [1,] . . 1 . . .
## [2,] . . . . . .
## [3,] 1 . . . . .
## [4,] . . . . . 1
## [5,] . . . . . .
## [6,] . . . 1 . .
Z_ns[[3]][1:6,1:6]
## 6 x 6 sparse Matrix of class "dgCMatrix"
##
## [1,] . . . . . .
## [2,] . . 1 . . .
## [3,] . 1 . . . .
## [4,] . . . . . .
## [5,] . . . . . 1
## [6,] . . . . 1 .
# Moving average first order structure.
Z_ma1 <- mc_ma(id = "loc",
time = "cam",
data = teca,
order = 1)
Z_ma1[[1]][1:6,1:6]
## 6 x 6 sparse Matrix of class "nsCMatrix"
##
## [1,] . | . . . .
## [2,] | . | . . .
## [3,] . | . . . .
## [4,] . . . . | .
## [5,] . . . | . |
## [6,] . . . . | .
# Distance based covariance structure.
Z_dist <- mc_dist(id = "loc",
time = "cam",
data = teca,
method = "euclidean")
Z_dist[[1]][1:6,1:6]
## 6 x 6 sparse Matrix of class "dsCMatrix"
##
## [1,] . 1 0.5 . . .
## [2,] 1.0 . 1.0 . . .
## [3,] 0.5 1 . . . .
## [4,] . . . . 1 0.5
## [5,] . . . 1.0 . 1.0
## [6,] . . . 0.5 1 .
# Random walk covariance structure.
Z_rw <- mc_rw(id = "loc",
time = "cam",
data = teca,
order = 1,
proper = TRUE)
Z_rw[[1]][1:6,1:6]
## 6 x 6 sparse Matrix of class "dtCMatrix"
##
## [1,] 1 . . . . .
## [2,] . 2 . . . .
## [3,] . . 1 . . .
## [4,] . . . 1 . .
## [5,] . . . . 2 .
## [6,] . . . . . 1
Z_rw[[2]][1:6,1:6]
## 6 x 6 sparse Matrix of class "dsCMatrix"
##
## [1,] 0 -1 . . . .
## [2,] -1 0 -1 . . .
## [3,] . -1 0 . . .
## [4,] . . . 0 -1 .
## [5,] . . . -1 0 -1
## [6,] . . . . -1 0
#-----------------------------------------------------------------------
#-----------------------------------------------------------------------
# Fitting several competing models.
# Linear predictors (the same RHS for all responses).
form <- list(lk ~ cam,
lca ~ cam,
lmg ~ cam)
# Standard MANOVA (ignores the correlation among soil layers).
fit1 <- mcglm(linear_pred = form,
matrix_pred = list(Z0, Z0, Z0),
data = teca)
## Automatic initial values selected.
# MANOVA + repeated measures using unstructured matrix.
fit2 <- mcglm(linear_pred = form,
matrix_pred = list(c(Z0, Z_ns),
c(Z0, Z_ns),
c(Z0, Z_ns)),
control_algorithm = list(tunning = 0.9,
max_iter = 100),
data = teca)
## Automatic initial values selected.
# MANOVA + repeated measures using moving average first order.
fit3 <- mcglm(linear_pred = form,
matrix_pred = list(c(Z0, Z_ma1),
c(Z0, Z_ma1),
c(Z0, Z_ma1)),
control_algorithm = list(tunning = 0.8),
data = teca)
## Automatic initial values selected.
# MANOVA + repeated measures using distance based.
fit4 <- mcglm(linear_pred = form,
matrix_pred = list(c(Z0, Z_dist),
c(Z0, Z_dist),
c(Z0, Z_dist)),
control_algorithm = list(tunning = 0.8),
data = teca)
## Automatic initial values selected.
# MANOVA + repeated measures using distance based + expm covariance link
# function.
fit5 <- mcglm(linear_pred = form,
matrix_pred = list(c(Z0, Z_dist),
c(Z0, Z_dist),
c(Z0, Z_dist)),
covariance = c("expm",
"expm",
"expm"),
control_algorithm = list(tunning = 0.8),
data = teca)
## Automatic initial values selected.
# MANOVA + repeated measures using random walk + inverse covariance link
# function.
fit6 <- mcglm(linear_pred = form,
matrix_pred = list(c(Z_rw),
c(Z_rw),
c(Z_rw)),
covariance = c("inverse",
"inverse",
"inverse"),
control_algorithm = list(tunning = 0.5, max_iter = 100),
data = teca)
## Automatic initial values selected.
#-----------------------------------------------------------------------
# Comparing fitted models.
m <- c("Indep",
"Unstr",
"MA(1)",
"Eucl",
"Eucl expm",
"RW expm")
meas <- rbind(gof(fit1),
gof(fit2),
gof(fit3),
gof(fit4),
gof(fit5),
gof(fit6))
meas <- cbind(model = m,
as.data.frame(meas),
stringsAsFactors = FALSE)
meas$model <- with(meas,
factor(model,
levels = model[order(Df)]))
meas <- plyr::arrange(meas, Df)
meas
## model plogLik Df pAIC pKLIC pBIC
## 1 Indep -422.22 15 874.44 885.3233 936.0787
## 2 MA(1) -373.19 18 782.38 860.1365 856.3465
## 3 Eucl -362.21 18 760.42 857.5303 834.3865
## 4 Eucl expm -355.59 18 747.18 803.5611 821.1465
## 5 RW expm -380.48 18 796.96 886.0974 870.9265
## 6 Unstr -343.01 24 734.02 710.6130 832.6419
barchart(plogLik + pAIC + pBIC + pKLIC~ model,
outer = TRUE,
layout = c(1, NA),
as.table = TRUE,
ylab = "Measures of model fitting",
xlab = "Fitted models",
data = meas,
scales = "free") +
layer(grid.text(x = unit(1:nlevels(x), "native"),
y = unit(0.03, "npc"),
label = meas$Df))
McGLM
#-----------------------------------------------------------------------
# Multivariate hypotheses tests for McGLM.
source("../review/functions.R")
library(Matrix)
manova.mcglm(fit1)
## Note: method with signature 'Matrix#numLike' chosen for function '%*%',
## target signature 'dgTMatrix#numeric'.
## "TsparseMatrix#ANY" would also be valid
## Note: method with signature 'sparseMatrix#matrix' chosen for function '%*%',
## target signature 'dgTMatrix#matrix'.
## "TsparseMatrix#ANY" would also be valid
## Effects Df Hotelling.Lawley Qui.square p_value
## 1 Intercept 3 20.449 3067.386 0.000000e+00
## 2 cam 6 0.992 148.753 1.419538e-29
manova.mcglm(fit2)
## Effects Df Hotelling.Lawley Qui.square p_value
## 1 Intercept 3 17.218 2582.655 0.000000e+00
## 2 cam 6 3.074 461.140 1.963509e-96
#--------------------------------------------
# Correlations.
# Correlation between responses.
summary(fit2, print = "Correlation")
## Correlation matrix:
## Parameters Estimates Std.error Z value
## 1 rho12 0.2474244 0.06883107 3.594661
## 2 rho13 0.3700137 0.08349973 4.431316
## 3 rho23 0.3229868 0.08022769 4.025877
##
## Algorithm: chaser
## Correction: TRUE
## Number iterations: 29
Sigma_b <- matrix(NA,ncol = 3, nrow = 3)
Sigma_b[lower.tri(Sigma_b)] <- fit2$Covariance[1:3]
diag(Sigma_b) <- 1
Sigma_b <- forceSymmetric(Sigma_b, uplo = FALSE)
# Correlation among `cam` for the `lk` response.
COR_lk <- matrix(NA, 3, 3)
COR_lk[lower.tri(COR_lk)] <- fit2$Covariance[5:7]
COR_lk[upper.tri(COR_lk)] <- fit2$Covariance[5:7]
diag(COR_lk) <- fit2$Covariance[4]
cov2cor(COR_lk)
## [,1] [,2] [,3]
## [1,] 1.0000000 0.4560899 0.6310219
## [2,] 0.4560899 1.0000000 0.7648163
## [3,] 0.6310219 0.7648163 1.0000000
Sigma_r1 <- COR_lk
# Correlation among `cam` for the `lca` response.
COR_lca <- matrix(NA, 3, 3)
COR_lca[lower.tri(COR_lca)] <- fit2$Covariance[9:11]
COR_lca[upper.tri(COR_lca)] <- fit2$Covariance[9:11]
diag(COR_lca) <- fit2$Covariance[8]
cov2cor(COR_lca)
## [,1] [,2] [,3]
## [1,] 1.0000000 0.8082929 0.8286790
## [2,] 0.8082929 1.0000000 0.7943085
## [3,] 0.8286790 0.7943085 1.0000000
Sigma_r2 <- COR_lca
# Correlation among `cam` for the `lmg` response.
COR_lmg <- matrix(NA, 3, 3)
COR_lmg[lower.tri(COR_lmg)] <- fit2$Covariance[13:15]
COR_lmg[upper.tri(COR_lmg)] <- fit2$Covariance[13:15]
diag(COR_lmg) <- fit2$Covariance[12]
cov2cor(COR_lmg)
## [,1] [,2] [,3]
## [1,] 1.0000000 0.4039425 0.4442434
## [2,] 0.4039425 1.0000000 0.7645949
## [3,] 0.4442434 0.7645949 1.0000000
Sigma_r3 <- COR_lmg
# Joint variance covariance matrix
part1 <- bdiag(t(chol(Sigma_r1)),t(chol(Sigma_r2)),t(chol(Sigma_r3)))
image(part1)
I <- Diagonal(3,1)
part2 <- kronecker(Sigma_b,I)
image(part2)
C <- as.matrix(part1%*%part2%*%t(part1))
corrplot(cov2cor(C),
tl.pos = "d",
outline = TRUE,
method = "ellipse")
# devtools::session_info()
Sys.time()
## [1] "2017-07-26 19:31:42 BRT"
cbind(Sys.info())
## [,1]
## sysname "Linux"
## release "4.4.0-87-generic"
## version "#110-Ubuntu SMP Tue Jul 18 12:55:35 UTC 2017"
## nodename "class"
## machine "x86_64"
## login "unknown"
## user "walmes"
## effective_user "walmes"
sessionInfo()
## R version 3.4.1 (2017-06-30)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 16.04.2 LTS
##
## Matrix products: default
## BLAS: /usr/lib/libblas/libblas.so.3.6.0
## LAPACK: /usr/lib/lapack/liblapack.so.3.6.0
##
## locale:
## [1] LC_CTYPE=pt_BR.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=pt_BR.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=pt_BR.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=pt_BR.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=pt_BR.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] grid stats graphics grDevices utils datasets base
##
## other attached packages:
## [1] Matrix_1.2-10 mcglm_0.4.0 gridExtra_2.2.1
## [4] corrplot_0.77 reshape2_1.4.2 car_2.1-4
## [7] latticeExtra_0.6-28 RColorBrewer_1.1-2 lattice_0.20-35
## [10] knitr_1.15.1 rmarkdown_1.3
##
## loaded via a namespace (and not attached):
## [1] Rcpp_0.12.9 compiler_3.4.1 nloptr_1.0.4
## [4] plyr_1.8.4 methods_3.4.1 tools_3.4.1
## [7] digest_0.6.12 lme4_1.1-12 evaluate_0.10
## [10] nlme_3.1-131 gtable_0.2.0 mgcv_1.8-17
## [13] yaml_2.1.14 parallel_3.4.1 SparseM_1.74
## [16] stringr_1.2.0 MatrixModels_0.4-1 rprojroot_1.2
## [19] nnet_7.3-12 minqa_1.2.4 magrittr_1.5
## [22] backports_1.0.5 htmltools_0.3.5 MASS_7.3-47
## [25] splines_3.4.1 assertthat_0.2.0 pbkrtest_0.4-6
## [28] quantreg_5.29 stringi_1.1.2 doBy_4.5-15
mcglm4aed - Multivariate Covariance Generalized
Linear Models for the Analysis of Experimental Data
|
Bonat & Zeviani (2017) |