Medical studies often collect physiological and/or psychological measurements over time from multiple subjects in order to study dynamics such as circadian rhythms and interactions between variables. Experiments are typically conducted in such a way that variables of interest are measured several times during a time period, e.g. 24 hours, from a group of normal (or ill) human subjects. The problems of interest are: (1) do circadian rhythms exist? (2) do demographic, environmental and psychological variables affect circadian rhythms, and if so, how? and (3) how are variables associated with each other?
In an experiment to study immunological responses in humans, blood samples were collected every two hours for 24 hours from 9 healthy normal volunteers, 11 patients with major depression and 16 patients with Cushing's disease. These blood samples were analyzed for parameters that measure immune functions and hormones of the HPA axis (Kronfol et al., 1997). We will concentrate on one hormone: cortisol (cort).The data frame horm.cort contains the following variables: ID of subjects, time when measurements are taken, hormone measurements conc, and subject type (normal, depression, cushing). For simplicity, the time variable of 24-hour period is transformed into [0,1].
Observations are shown in Figures , and . The data are noisy and it is difficult to identify patterns among subjects. The investigator hypothesizes that there is a common curve for all individuals. However, the time axis may be shifted and the magnitude of the values may differ between subjects; i.e., there may be phase and amplitude differences between subjects. Since the 24-hour periodicity is entrained, the cycle length is fixed. The common practice is to fit a sinusoidal function to each subject. Problems with this approach are: (1) the pattern over time may not be symmetric. That is, the peak and nadir may not be separated by 12 hours and/or the amplitude and width of the peak may differ from those of the nadir; (2) sometimes there are local minimum and maximum points (Wang and Brown, 1996). Although adding harmonics can improve the fit, it is difficult to decide how many harmonics to include in the model and the results are difficult to interpret.
We first show how to fit a SIM to this kind of data. Consider the
following model (Wang and Brown, 1996)
We now fit model () to cortisol measurements from normal subjects.
> options(contrasts=c("contr.treatment", "contr.poly")) > data(horm.cort) > cort.nor <- horm.cort[horm.cort$type=="normal",] > M <- model.matrix(~as.factor(ID), data=cort.nor)[,-1] > cort.nor.snr.fit1 <- snr(conc~b1+exp(b2)*f(time-alogit(b3)), func=f(u)~list(periodic(u)),params=list(b1~as.factor(ID), b2+b3~M-1), start=list(params=c(mean(cort.nor$conc),rep(0,24))), data=cort.nor, spar="m", control=list(prec.out=0.001,converg="PRSS")) > summary(cort.nor.snr.fit1) Semi-parametric Nonlinear Regression Model Fit Model: horm ~ b1 + exp(b2) * f(time - alogit(b3)) Data: cort.nor.dat AIC BIC logLik 113.9554 183.449 -30.9777 Coefficients: ... Standardized residuals: Min Q1 Med Q3 Max -2.11257 -0.5740086 0.04326369 0.5185883 2.521769 GML estimate(s) of smoothing spline parameter(s): 1.505551e-06 Equivalent Degrees of Freedom (DF) for spline function: 10.13926 Degrees of freedom: 107 total; 71.86074 residual Residual standard error: 0.4213113 Converged after 5 iterations
Notice that we used the option converg="PRSS" instead of the default converg="COEF" because this option usually requires fewer number of iterations (the same is true for the snm function). We also relaxed the tolerance for convergence criterion. The potential risk of these options is that the algorithm may stop too early. The fits shown in Figure seems adequate.
Random errors in model () may be correlated. In the following we fit with an AR(1) within-subject correlation structure.
> cort.nor.snr.fit2 <- update(cort.nor.snr.fit1, cor=corAR1(form=~1|ID)) > summary(cort.nor.snr.fit2) Semi-parametric Nonlinear Regression Model Fit Model: conc ~ b1 + exp(b2) * f(time - alogit(b3)) Data: cort.nor AIC BIC logLik 117.6520 189.8184 -31.82602 Correlation Structure: AR(1) Formula: ~1 | ID Parameter estimate(s): Phi -0.1641137 Coefficients: ... GML estimate(s) of smoothing spline parameter(s): 0.0001425255 Equivalent Degrees of Freedom (DF) for spline function: 10.33217 Residual standard error: 0.4311652 Converged after 5 iterationsThe lag 1 autocorrelation coefficient is small.
Parameters , and in model () are deterministic. Thus model () has following drawbacks: (1) they ignore correlations among observations; (2) they have the number of parameters proportional to the number of subjects; and (3) it is difficult to investigate covariate effects on parameters and/or the common curve. In the remaining of this section we show how to fit hormone data using SNM models.
We first show how to fit individual hormone for each group.
Consider the following model
We now fit model () to cortisol measurements from normal subjects.
> cort.nor.fit1 <- snm(conc~b1+exp(b2)*f(time-alogit(b3)), func=f(u)~list(periodic(u)), data=cort.nor, fixed=list(b1~1), random=list(b1+b2+b3~1), start=c(mean(cort.nor$conc)), groups=~ID, spar="m", control=list(prec.out=0.005,converg="PRSS")) > summary(cort.nor.fit1) Semi-parametric Nonlinear Mixed Effects Model fit Model: conc ~ b1 + exp(b2) * f(time - alogit(b3)) Data: cort.nor AIC BIC logLik 176.4104 224.4590 -70.2004 Random effects: Formula: list(b1 ~ 1, b2 ~ 1, b3 ~ 1) Level: ID Structure: General positive-definite, Log-Cholesky parametrization StdDev Corr b1 0.2475622 b1 b2 b2 0.1889174 -0.629 b3 0.2483884 0.035 -0.479 Residual 0.3948469 Fixed effects: list(b1 ~ 1) Value Std.Error DF t-value p-value b1 1.670316 0.07686368 98 21.73089 <.0001 GML estimate(s) of smoothing parameter(s): 0.0001218930 Equivalent Degrees of Freedom (DF): 10.00480 Converged after 4 iterations > cort.dep <- horm.cort[horm.cort$type=="depression",] > cort.dep.fit1 <- snm(conc~b1+exp(b2)*f(time-alogit(b3)), func=f(u)~list(periodic(u)), data=cort.dep, fixed=list(b1~1), random=list(b1+b2+b3~1), start=c(mean(cort.dep$conc)), groups=~ID, spar="m", control=list(prec.out=0.005,converg="PRSS")) > summary(cort.dep.fit1) Semi-parametric Nonlinear Mixed Effects Model fit Model: conc ~ b1 + exp(b2) * f(time - alogit(b3)) Data: cort.dep AIC BIC logLik 248.2489 293.5048 -108.4047 Random effects: Formula: list(b1 ~ 1, b2 ~ 1, b3 ~ 1) Level: ID Structure: General positive-definite, Log-Cholesky parametrization StdDev Corr b1 0.4139559 b1 b2 b2 0.3883549 -0.815 b3 0.3806670 0.195 -0.099 Residual 0.4390506 Fixed effects: list(b1 ~ 1) Value Std.Error DF t-value p-value b1 1.956112 0.09046153 121 21.62369 <.0001 GML estimate(s) of smoothing parameter(s): 0.0005380155 Equivalent Degrees of Freedom (DF): 7.719702 Converged after 5 iterations > cort.cush <- horm.cort[horm.cort$type=="cushing",] > cort.cush.fit1 <- snm(conc~b1+exp(b2)*f(time-alogit(b3)), func=f(u)~list(periodic(u)), data=cort.cush, fixed=list(b1~1), random=list(b1+b2+b3~1), start=c(mean(cort.cush$conc)), groups=~ID, spar="m", control=list(prec.out=0.005,converg="PRSS")) > summary(cort.cush.fit1) Model: conc ~ b1 + exp(b2) * f(time - alogit(b3)) Data: cort.cush AIC BIC logLik -38.94984 -13.18616 27.47518 Random effects: Formula: list(b1 ~ 1, b2 ~ 1, b3 ~ 1) Level: ID Structure: General positive-definite, Log-Cholesky parametrization StdDev Corr b1 3.429996e-01 b1 b2 b2 1.028451e+04 -0.690 b3 6.520918e+03 -0.107 0.593 Residual 1.685278e-01 Fixed effects: list(b1 ~ 1) Value Std.Error DF t-value p-value b1 3.034334 0.07610914 170 39.8682 <.0001 GML estimate(s) of smoothing parameter(s): 999.9996 Equivalent Degrees of Freedom (DF): 0.0002583048 Converged after 2 iterations
The fits are shown in Figures , and .
We calculate the posterior means and variances of the common function for all three groups:
u <- seq(0,1,len=50) cort.nor.ci <- intervals(cort.nor.fit.1, newdata=data.frame(u=u)) cort.dep.ci <- intervals(cort.dep.fit.1, newdata=data.frame(u=u)) cort.cush.ci <- intervals(cort.cush.fit.1, newdata=data.frame(u=u))The estimated common functions are shown in Figure together with their 95% Bayesian confidence intervals.
It is obvious that the common function for Cushing group is
almost zero, which suggests that circadian rhythms are lost
for Cushing patients except subjects 3044 and 3069.
It seems that the shape functions for normal and depression
groups are similar. We now test this hypothesis by fitting
data from these two groups simultaneous. Consider the following
model
(98) |
> cort.nordep.dat <- horm[horm$type=="cort"&horm$group!="cushing",] > cort.nordep.dat$group <- as.vector(cort.nordep.dat$group) > cort.nordep.fit.1 <- snm(horm~b1+exp(b2)*f(group,time-alogit(b3)), func=f(g,u)~list(list(periodic(u), rk.prod(shrink1(g),periodic(u)))), data=cort.nordep.dat, fixed=list(b1~group), random=pdStrat(b1+b2+b3~1,strata=~as.factor(group)), weights=varIdent(form=~1|group), control=list(prec.out=0.005,converg="PRSS"), spar="m", start=c(1.8,-.2), groups=~ID) > summary(cort.nordep.fit.1) Semi-parametric Nonlinear Mixed Effects Model fit Model: cort ~ b1 + exp(b2) * f(group, time - alogit(b3)) Data: cort.nordep.dat AIC BIC logLik 430.7103 516.9962 -190.4965 Random effects: Formula: list(b1 ~ 1, b2 ~ 1, b3 ~ 1) Level: ID Structure: General positive-definite stratified by as.factor(group) Stratum: depression StdDev Corr b1.(Intercept) 0.4025420 b1.(I) b2 b2 0.3441774 -0.779 b3 0.3352246 0.080 -0.013 Stratum: normal StdDev Corr b1.(Intercept) 0.2526908 b1.(I) b2 b2 0.2352516 -0.535 b3 0.2431394 -0.050 -0.384 Within-group standard deviation: 0.4516581 Variance function: Structure: Different standard deviations per stratum Formula: ~ 1 | group Parameter estimates: depression normal 1 0.9344799 Fixed effects: list(b1 ~ group) Value Std.Error DF t-value p-value b1.(Intercept) 1.860474 0.0962724 218 19.32511 <.0001 b1.group -0.160235 0.1274092 218 -1.25764 0.2099 Correlation: b1.(I) b1.group -0.756 GML estimate(s) of smoothing parameter(s): 5.037821e-04 1.487554e+02 Equivalent Degrees of Freedom (DF): 8.858698 Converged after 4 iterations > u <- seq(0,1,len=50) > cort.nordep.ci <- intervals(cort.nordep.fit.1, newdata=data.frame( g=rep(c("normal","depression"),c(50,50)),u=rep(u,2)), terms=c(0,1))
The smoothing parameter for the interaction term is large, which means that the interaction is small. In fact, is essentially zero: the posterior means are on the magnitude of while the posterior standard deviations are on the magnitude of . Therefore normal and depression groups have the same shape function.
Under the assumption of one shape function for both groups, we now can investigate differences of 24-hour mean, amplitude, and phase between two groups. For this purpose, consider the following model
> cort.nordep.fit.2 <- snm(horm~b1+exp(b2+d1*I(group=="normal"))* f(time-alogit(b3+d2*I(group=="normal"))), func=f(u)~list(periodic(u)), data=cort.nordep.dat, fixed=list(b1~group,d1+d2~1), random=pdStrat(b1+b2+b3~1,strata=~as.factor(group)), weights=varIdent(form=~1|group), control=list(prec.out=0.005,converg="PRSS"), spar="m", start=c(1.9,-0.3,0,0), groups=~ID) > summary(cort.nordep.fit.2) Semi-parametric Nonlinear Mixed Effects Model fit Model: cort ~ b1 + exp(b2 + d1 * I(group == "normal")) * f(time - alogit(b3 + d2 * I(group == "normal"))) Data: cort.nordep.dat AIC BIC logLik 438.1875 531.3631 -192.2045 Random effects: Formula: list(b1 ~ 1, b2 ~ 1, b3 ~ 1) Level: ID Structure: General positive-definite stratified by as.factor(group) Stratum: depression StdDev Corr b1.(Intercept) 0.4072866 b1.(I) b2 b2 0.3789050 -0.784 b3 0.3324878 0.086 -0.046 Stratum: normal StdDev Corr b1.(Intercept) 0.2448359 b1.(I) b2 b2 0.1626245 -0.522 b3 0.2661856 -0.032 -0.640 Within-group standard deviation: 0.4515871 Variance function: Structure: Different standard deviations per stratum Formula: ~ 1 | group Parameter estimates: depression normal 1 0.9358295 Fixed effects: list(b1 ~ group, d1 + d2 ~ 1) Value Std.Error DF t-value p-value b1.(Intercept) 1.907384 0.0956961 216 19.93168 <.0001 b1.group -0.272381 0.1310609 216 -2.07828 0.0389 d1 0.234996 0.0766247 216 3.06684 0.0024 d2 0.029918 0.0915166 216 0.32691 0.7441 Correlation: b1.(I) b1.typ d1 b1.group -0.730 d1 0.000 -0.225 d2 0.000 -0.017 -0.428 GML estimate(s) of smoothing parameter(s): 0.0005858115 Equivalent Degrees of Freedom (DF): 8.889222 Converged after 4 iterationsThe differences of 24-hour mean and amplitude are significant, while the difference of phase is not. We refit without the term.
> acth.nordep.fit.3 <- snm(horm~b1+exp(b2+d1*I(group=="normal"))* f(time-alogit(b3)), func=f(u)~list(periodic(u)), data=acth.nordep.dat, fixed=list(b1~group,d1~1), random=pdStrat(b1+b2+b3~1,strata=~as.factor(group)), weights=varIdent(form=~1|group), control=list(prec.out=0.005,converg="PRSS"), spar="m", start=c(2.8,-0.4,0), groups=~ID) > summary(acth.nordep.fit.3) Semi-parametric Nonlinear Mixed Effects Model fit Model: acth ~ b1 + exp(b2 + d1 * I(group == "normal")) * f(time - alogit(b3)) Data: acth.nordep.dat AIC BIC logLik 417.8258 518.0793 -180.2195 Random effects: Formula: list(b1 ~ 1, b2 ~ 1, b3 ~ 1) Level: ID Structure: General positive-definite stratified by as.factor(group) Stratum: depression StdDev Corr b1.(Intercept) 0.4414536 b1.(I) b2 b2 0.3572042 -0.606 b3 0.3117688 0.212 0.238 Stratum: normal StdDev Corr b1.(Intercept) 0.3941497 b1.(I) b2 b2 0.3102877 -0.532 b3 0.1961919 0.067 0.113 Within-group standard deviation: 0.4178081 Variance function: Structure: Different standard deviations per stratum Formula: ~ 1 | group Parameter estimates: depression normal 1 0.9706348 Fixed effects: list(b1 ~ group, d1 ~ 1) Value Std.Error DF t-value p-value b1.(Intercept) 2.913696 0.1101288 222 26.45717 <.0001 b1.group -0.509698 0.1732330 222 -2.94227 0.0036 d1 0.340090 0.1283891 222 2.64890 0.0087 Correlation: b1.(I) b1.group b1.group -0.636 d1 0.000 -0.314 GML estimate(s) of smoothing parameter(s): 0.0002518025 Equivalent Degrees of Freedom (DF): 11.69337 Converged after 6 iterations
The fits are shown in Figures and . The right panel of Figure shows the estimated common function and its 95% Bayesian confidence intervals. Data from two groups are pooled to estimate the common function which has narrower confidence intervals. We conclude that the depressed subjects have their mean cortisol level elevated and less profound circadian rhythm than normal subjects.