Canadian Temperature Data

Next: Human Circadian Rhythms Up: Examples Previous: Core Body Temperature Data

Canadian Temperature Data

The dataset, downloaded from the website http://www.psych.mcgill.ca/faculty/ramsay/
fda.html, includes mean monthly temperatures at 35 Canadian weather stations. The left panel in Figure shows the plots of mean temperature from 4 selected weather stations.

**Figure:** (a) Plot of `temperature` for four selected `stations`. With dots representing observations and lines representing SNMM fits. (b) Fitted curves after alignment for all 35 stations. Left: plot of temperatures for 4 selected stations.
$\begin{figure}\centerline{\psfig{figure=canada1.ps,height=2.75in,width=5.2in}} \end{figure}$

Ramsay and Silverman (1997) and Ramsay and Li (1998) used this dataset to illustrate the usefulness of Functional Data Analysis (FDA). Among others, they addressed the following two questions: how to measure the departure from a sinusoidal model and how to register the curves for further analyses. In the following, we approach these questions with SNM models.

We assume that mean temperatures over month at all weather stations share a common shape function. Variation between stations are modeled by vertical shift, vertical scale and horizontal shift parameters. That is, we assume that

$\displaystyle \mbox{\tt temp}_{ij}=\mu + b_{1i} + e^{b_{2i}} f(\mbox{\tt month}... ...^{b_{3i}}}) + \epsilon_{ij}, \,\,\,\, i=1,\cdots, 35, \,\,\,\, j=1, \cdots, 12,$

(86)

where $\mbox{\tt temp}_{ij}$ is the temperature of the

th month at station

; $\mbox{\tt month}_{ij}$ is the time point scaled to

; $\mbox{\boldmath$b$}_{i}=(b_{1i}, b_{2i}, b_{3i})^{T}\sim \mbox{N}({\bf0}, \sigma^{2}\mbox{\boldmath$D$})$ , where $\mbox{\boldmath$D$}$ is an unstructured covariance matrix; $\epsilon_{ij}$ 's are errors modeled by an AR(1) within-subject correlation structure with lag 1 autocorrelation coefficient $\rho$ and variance $\sigma^2$ ; $\mu$ , fixed, is the annual mean temperature for all stations; $b_{1i}$ is the vertical shift; $e^{b_{2i}}$ is the amplitude; and $e^{b_{3i}}/(1+e^{b_{3i}})$ is the horizontal shift of station

. Since the mean temperature is a periodic function with a period equal to one year,

is periodic with a period equal to 1. Thus we assume $f\in W_{2}(per)$ . To make the constant $\mu$ and

identifiable, we need the side condition $\int_0^1 f=0$ . This is equivalent to assuming that $f\in W_{2}^{0}(per)=W_2(per)\ominus \mbox{span}\{1\}$ , where $\mbox{span}\{1\}$ represents all constant functions. Since the sinusoidal form provides rough approximation to the temperature profile, we use the

-spline with $L=D^{2} +(2\pi)^{2}$ . The GML estimate of the smoothing parameter approaches to zero slowly as iteration increases, which leads to wiggly estimates. To get smoother estimates, we set a lower bound for the smoothing parameter. Specifically, $\log _{10} (n \lambda) \ge -4$ . To save time, we also change the convergence criterion from the default

> canada.fit <- snm(temp~b1+exp(b2)*f(month-alogit(b3)),
   func=f(u)~list(~sin(2*pi*u)+cos(2*pi*u)-1,lspline(u,type=''sine0'')),
   fixed=list(b1~1), random=list(b1+b2+b3~1), cor=corAR1(),
   groups=~station, data=canadaTemp.dat,
   control=list(rkpk.control=list(limnla=c(-4,0)),prec.out=0.005))
> summary(canada.fit)
...
       AIC    BIC    logLik
  1529.772 1607.5 -745.6426

Random effects:
 Formula: list(b1 ~ 1, b2 ~ 1, b3 ~ 1)
 Level: station
 Structure: General positive-definite, Log-Cholesky parametrization
         StdDev     Corr
b1       5.94427125 b1     b2
b2       0.30031948 -0.745
b3       0.06308751 -0.013 -0.462
Residual 1.48044625

Correlation Structure: AR(1)
 Formula: ~1 | station
 Parameter estimate(s):
      Phi
0.7258356
Fixed effects: list(b1 ~ 1)
      Value Std.Error  DF  t-value p-value
b1 1.690498 0.6188728 385 2.731576  0.0066

GCV estimate(s) of smoothing parameter(s): 0.0001000000
Equivalent Degrees of Freedom (DF):  10.24364

Converged after 5 iterations

The resulting fits are plotted in the left panel of Figure

for the four selected stations. Fits for other stations are similar. The estimated correlation coefficient is $\hat{\rho}=0.726$ with an approximate $95\%$ confidence interval

. Thus there is evidence that serial correlation does exist in the data. Figure

shows the overall fit of the common curve and its projections on subspaces ${\cal H}_1$ (sinusoidal part) and ${\cal H}_2$ (remaining part) together with their $95\%$ Bayesian confidence intervals. The small departure from sinusoidal form is statistically significant, especially in Spring and Fall. Our conclusion here is comparable to that of Ramsay and Silverman (1997), but their approach is on individual stations and does not provide a formal test.

**Figure:** The overall fit of the common shape function (a), and its projection onto ${\cal H}_1$ (b) and ${\cal H}_2$ (c). Solid lines are fitted values. Dotted lines are 95% Bayesian confidence intervals.
$\begin{figure}\centerline{\psfig{figure=canada2.ps,height=2in,width=5.2in}}\end{figure}$

By removing the horizontal shift in each curve, we can easily align all curves. Therefore our method can be used for curve registration.

grid <- NULL
for (i in 1:35) {
    grid <- c(grid,seq(0,1,len=40)+
              alogit(canada.fit$coef$random$station[70+i])+.5)
}
p.canada.2 <- predict(canada.fit, newdata=data.frame(month=grid, 
                      station=rep(1:35,rep(40,35))))

The right panel in Figure

displays the fitted curves for all 35 stations after aligning them together by removing shifts.

Below is new

Ramsay and Silverman (1997) also fitted various functional linear models (FLM) to this data set. The functional data can appear in the FLM as (i) the dependent variable, (ii) the independent variable, or (iii) both. We now show that those FLM are special cases of general smoothing spline models. Therefore they can be fitted by the ssr function.

To investigate geographical differences in the pattern of the annual tempreture function, Ramsay and Silverman (1997) divided Canada into four meteorological zones: Atlantic, Pacific, Continental and Arctic. Then they considered the following FLM (Model (9.1) in Ramsay and Silverman (1997))

$\displaystyle \mbox{\tt temp}_{kg}(t)=\mu(t) + \alpha_g(t) + \epsilon_{kg}(t),$

(87)

where

and

correspond to climate zones Atlantic, Pacific, Continental and Arctic respectively, $\mbox{\tt temp}_{kg}(t)$ is the

th temperature function in the

th zone, $\mu(t)$ is the grand mean function, $\alpha_g(t)$ represent zone effect, and $\epsilon_{kg}(t)$ are random errors. Model (

) is an example of case (i).

Now consider expected temperature as a function of both zone and time : $f(g,t)=\mbox{E}(\mbox{\tt temp}_{kg}(t))$ . Suppose that we want to model zone effect using an one-way ANOVA model and time effect using a periodic spline. Then we have the following SS ANOVA decomposition

$\displaystyle f(g,t) = f_0 + f_1(g) + f_2(t) + f_{12}(g,t),$

(88)

where

is a constant,

and

are the main effects of zone and time, and $f_{12}$ is the interaction between zone and time. Comparing (

) with (

), it is obvious that $\mu(t)=f_0 + f_2(t)$ and $\alpha_g(t) = f_1(g) + f_{12}(g,t)$ . Therefore, model (

) can be regarded as an SS ANOVA model. Discretize the functional response by 12 months, we can fit the data as follows. Dense discretization such as by days could also be used.

> temps <- matrix(scan(``monthtemp.dat'',0), 35, 12, byrow=T)
> atlindex <- c(1,2,3,4,5,6,7,8,9,10,11,13,14,16)
> pacindex <- c(25,26,27,28,29)
> conindex <- c(12,15,17,18,19,20,21,22,23,24,30,31,35)
> artindex <- c(32,33,34)
> t <- seq(0.5, 11.5, 1)/12
> tgrid <- seq(0,1,len=50)
> x <- apply(temps,1,sum)
> n <- 35*12
> temp <- as.vector(t(temps[c(atlindex,pacindex,conindex,artindex),]))
> grp <- as.factor(rep(1:4,12*c(14,5,13,3)))
> tempdata <- data.frame(temp=temp,t=rep(t,35),grp=grp)

> canada4 <- ssr(temp ~ grp, spar="m",data=tempdata,
                 rk=list(periodic(t),rk.prod(periodic(t),shrink1(grp))))
> summary(canada4)
Smoothing spline regression fit by GML method
Call: ssr(formula = temp ~ grp, rk = list(periodic(t), rk.prod(periodic(t), 
    shrink1(grp))), data = tempdata, spar = "m")

Coefficients (d):
(Intercept)        grp2        grp3        grp4 
   4.787500    2.809167   -5.204167  -16.651389 

GML estimate(s) of smoothing parameter(s) : 0.2622870 0.9971152 
Equivalent Degrees of Freedom (DF):  23.74366 
Estimate of sigma:  3.762337 

Number of Observations:  420 

> grid <- list(grp=as.factor(rep(1:4,rep(50,4))),t=rep(tgrid,4))
> p.canada4 <- predict(canada4,newdata=expand.grid(t=tgrid,grp=as.factor(1:4)),
                     terms=c(1,1,1,1,0,1))

Figures and display the estimated zone effects and the fits for four regions. They are very similar to Figures 9.1 and 9.2 in Ramsay and Silverman (1997). In addition to the estimates, we can provide confidence intervals.

**Figure:** Solid lines are zone effects $\alpha_g$ for the temperature functions in model (). Dashed lines are 95% Bayesian confidence intervals. Dotted lines are constant zero. Points are regional monthly averages minus monthly averages of the whole contry plus the average of all temperature observations.
$\begin{figure}\centerline{\psfig{figure=canada3.ps,height=3in,width=4.5in}}\end{figure}$

**Figure:** Solid lines are estimated region tempreture profiles. The dashed curve is the Canada mean function $\mu(t)$ . Points are regional monthly averages.
$\begin{figure}\centerline{\psfig{figure=canada4.ps,height=3in,width=4.5in}}\end{figure}$

To investigate the possible replationship between total annual precipitation and the temperature function, Ramsay and Silverman (1997) considered the following FLM (Model (10.3))

$\displaystyle y_i=\alpha + \int_0^T x_i(s) \beta(s)ds + \epsilon_i,$

(89)

where

is the logarithm of total annual precipitation at station

, $\alpha$ is a constant parameter,

is the temperature function at station

, $\beta(t)$ is a unknown weight function, and $\epsilon_i$ are random errors. The goal is to estimate the weight function $\beta(t)$ . Model (

) is an example of case (ii).

Without loss of the generality, suppose that the time interval has been transformed into . That it, . It is reasonable to assume that $\beta(t)$ is a smooth periodic function. Specifically, we model $\beta(t)$ using cubic periodic spline space . Define linear functionals $L_i \beta = \int_0^T x_i(s) \beta(s)ds$ . are bounded when $x_i \in {\cal L}_2$ which we assume to be true. Then model () is a partial spline model.

Denote $\beta(t)=\beta_0 + \beta_1(t)$ where $\beta_1 \in W_2^0(per)$ . Let be the reproducing kernel (RK) matrix for $\beta_1$ . From the definition in Section , the th elements of the matrix

$\displaystyle R[i,j]$	$\textstyle =$	$\displaystyle < L_{i(\cdot)} R_1 (t,\cdot), L_{j(\cdot)} R_1 (t,\cdot) > = L_{i... ...t)} L_{j(\cdot)} R_1(\cdot,\cdot) = \int_0^T \int_0^T x_i(s)x_j(t) R_1(s,t)dsdt$
	$\textstyle \approx$	$\displaystyle \sum_{k=1}^{12} \sum_{k=1}^{12} x_i(s_k)x_j(t_l) R_1(s_k,t_l) = \mbox{\boldmath$x$}_i^T \Sigma \mbox{\boldmath$x$}_j ,$	(90)

where

is the RK of $W_{2}^{0}(per)$ ,

and

represnet middle point of month

is the temperature of month

at station

, $\mbox{\boldmath$x$}_i=(x_i(s_1),\cdots,x_i(s_{12}))^T$ , and $\Sigma=\{ R_1(s_k,t_l)\}_{k, l=1}^{12}$ . We used simple summations to approximate the integrals. More accurate approximations can be used. We also ignored a scaling constant (

) since it can be absorbed by the smoothing parameter.

> prec <- matrix(scan(``monthprec.dat'',0), 35, 12, byrow=T)
> y <- log(apply(prec,1,sum))
> R <- temps %*% periodic(t) %*% t(temps)
> canada5 <- ssr(y ~ x, rk=R)
> summary(canada5)
Smoothing spline regression fit by GML method
Call: ssr(formula = y ~ x, rk = R, spar = "m")

Coefficients (d):
(Intercept)           x 
6.433069274 0.006122233 

GML estimate(s) of smoothing parameter(s) : 0.006772994 
Equivalent Degrees of Freedom (DF):  4.062615 
Estimate of sigma:  0.3588121 

Number of Observations:  35

The estimated weight function $\beta$ is represented by

$\begin{displaymath}\hat{\beta}(t) = d_2 + \sum_{i=1}^n c_i L_{i(\cdot)} R_1 (t,\cdot),\end{displaymath}$

where

is an estimate of $\alpha$ . To compute $\hat{\beta}(t)$ at grid points $\mbox{\boldmath$t$}_0$ of size

$\begin{eqnarray*} \hat{\beta}(\mbox{\boldmath$t$}_0) &=& d_2 \mbox{\boldmath$1$... ...h$x$}_i = d_2 \mbox{\boldmath$1$}_{n_0} + S \mbox{\boldmath$c$} \end{eqnarray*}$

where $\mbox{\boldmath$1$}_a$ is a vector of 1's of size

, $\mbox{\boldmath$s$}=(s_1,\cdots,s_{12})^T$ , $S=R_1(\mbox{\boldmath$t$}_0,\mbox{\boldmath$s$}) X^T$ ,

is the RK of $W_{2}^{0}(per)$ , $R_1(\mbox{\boldmath$t$}_0,\mbox{\boldmath$s$})$ is a $n_0 \times 12$ matrix of

evaluated at all combinations of $\mbox{\boldmath$t$}_0$ and $\mbox{\boldmath$s$}$ ,

is a $n \times 12$ matrix with each row consists monthly tempretures at each station, and $\mbox{\boldmath$c$}=(c_1, \cdots, c_n)^T$ .

Bayesian confidence intervals are not available for non-evaluational functionals. Therefore we use the following bootstrap procedure to compute confidence intervals. We used simple percentile intervals. Other approaches may be used (Wang and Wahba, 1995).

> S <- periodic(tgrid,t)%*%t(temps)
> f <- canada5$coef$d[2] + S%*%canada5$coef$c

> nboot <- 999
> fb <- NULL
> for (i in 1:nboot) {
  yb <- canada5$fit+sample(canada5$resi,35,replace=T)
  bfit <- ssr(yb ~ x, rk=R)
  fb <- cbind(fb,bfit$coef$d[2] + S%*%bfit$coef$c)
}
> lb <- apply(fb,1,quantile,prob=.025)
> ub <- apply(fb,1,quantile,prob=.975)

We used the default GCV method to select the smoothing parameter in the above computation. We repeated the computation with the GML method to select the smoothing parameter. Figure displays the estimated regression weight functions. It is interesting to note that the estimates with GCV and GML choices of the smoothing parameter are similar to the upper left plot and the lower right plot in Figure 10.3 of Ramsay and Silverman (1997). As indicated in Figure 10.6 of Ramsay and Silverman (1997) that it is difficult to get a precise estimate of the smoothing parameter due to the small sample size. Interestingly that the GCV method picks an estimate of the smoothing parameter that is close to the lower bound and the GML method picks an estimate that is close to the upper bound. Wide confidence intervals are the consequence of the small sample size ().

**Figure:** Estimated regression weight function $\beta$ (solid lines) and its 95% bootstrap confidence intervals (dashed lines) with GCV (left) and GML (right) choices of the smoothing parameter.
$\begin{figure}\centerline{\psfig{figure=canada5.ps,height=2.5in,width=5.8in}}\end{figure}$

To investigate the dependence of the complete log precipitation profile on the complete tempreture profile, Ramsay and Silverman (1997) considered the following FLM

$\displaystyle y_i(t)=\alpha(t) + \int_0^T x_i(s) \beta(s,t)ds + \epsilon_i(t),$

(91)

where

is the logarithm of annual precipitation profile at station

, $\alpha(t)$ plays the part of an intercept in the standard regression,

is the temperature function at station

, $\beta(s,t)$ is a unknown weight function at time

, and $\epsilon_i(t)$ are random errors processes. The goal is to estimate $\alpha(t)$ and $\beta(s,t)$ . Model (

) is an example of case (iii).

The expected annual precipitation profile depends on two non-parametric functions: $\alpha$ and $\beta$ . Again, we assume that , and $\alpha(t)$ and $\beta(s,t)$ are smooth periodic functions. Specifically, we assume that $\alpha(t) \in W_2(per)=\mbox{span} \{ 1 \} \oplus W_{2}^{0}(per)$ , and $\beta(s,t) \in W_2(per) \otimes W_2(per)= (\mbox{span} \{ 1 \} \oplus W_{2}^{0}... ... \{ 1 \} \otimes W_{2}^{0}(per)) \oplus (W_{2}^{0}(per) \otimes W_{2}^{0}(per))$ . Equivalently, we use the following SS ANOVA decomposition for $\alpha$ and $\beta$ :

$\displaystyle \alpha(t)$	$\textstyle =$	$\displaystyle \alpha_0 + \alpha_1(t),$
$\displaystyle \beta(s,t)$	$\textstyle =$	$\displaystyle \beta_0 + \beta_1(s) + \beta_2(t) + \beta_{12}(s,t).$	(92)

Again, we discretize the annual precipitation profile by 12 months. Model () becomes

$\displaystyle y_i(t_j)$	$\textstyle =$	$\displaystyle \alpha(t_j) + \int_0^T x_i(s) \beta(s,t_j)ds + \epsilon_i(t_j)$
	$\textstyle =$	$\displaystyle \alpha_0 + \alpha_1(t_j) + \beta_0 z_i + \int_0^T x_i(s) \beta_1(s) ds + z_i \beta_2(t_j) + \int_0^T x_i(s) \beta_{12}(s,t_j) ds + \epsilon_i(t_j),$	(93)
		$\displaystyle i=1,\cdots,n,~~j=1,\cdots,m,$

where

is the logarithm of precipitation in month

at station

, and $z_i=\int_0^T x_i(s)ds$ . For simplicity, we assume random errors are independent. Models with correlated random errors can be fitted similarly.

Let $\mbox{\boldmath$y$}=(y_1(t_1),\cdots,y_1(t_m),\cdots, y_n(t_1), \cdots,y_n(t_m))^T$ , $\mbox{\boldmath$z$}=(z_1,\cdots,z_n)^T$ , $\mbox{\boldmath$\alpha$}_1=(\alpha(t_1),\cdots,\alpha(t_m))^T$ , $\mbox{\boldmath$\beta$}_1=(\int_0^T x_1(s) \beta_1(s) ds,\cdots, \int_0^T x_n(s) \beta_1(s) ds)^T$ , $\mbox{\boldmath$\beta$}_2=(\beta_2(t_1),\cdots,\beta_2(t_m))^T$ ,
$\mbox{\boldmath$\beta$}_{12}=(\int_0^T x_1(s) \beta_{12}(s,t_1)ds,\cdots, \int_... ...int_0^T x_n(s) \beta_{12}(s,t_1)ds,\cdots, \int_0^T x_n(s) \beta_{12}(s,t_m)ds)$ , and $\mbox{\boldmath$\epsilon$}=(\epsilon_1(t_1),\cdots,\epsilon_1(t_m), \cdots, \epsilon_n(t_1),\cdots,\epsilon_n(t_m))^T$ .

Then model () can be written in a matrix form

$\displaystyle \mbox{\boldmath$y$}= \alpha_0 \mbox{\boldmath$1$}_{mn} + \mbox{\b... ...boldmath$\beta$}_2 + \mbox{\boldmath$\beta$}_{12} + \mbox{\boldmath$\epsilon$}.$

(94)

We have two parametric terms, $\alpha_0$ and $\beta_0$ , and four non-parametric terms, $\alpha_1$ , $\beta_1$ , $\beta_2$ and $\beta_{12}$ , in model (). To fit this model, we need to find RK matrices for four non-parametric terms. It is not difficult to see that the RK matrices for $\alpha_1$ , $\beta_1$ and $\beta_2$ are $\Sigma_1 = \mbox{\boldmath$1$}_n \otimes \mbox{\boldmath$1$}_n^T \otimes R_1(\mbox{\boldmath$t$}, \mbox{\boldmath$t$})$ , $\Sigma_2 = R \otimes \mbox{\boldmath$1$}_m \otimes \mbox{\boldmath$1$}_m^T$ , $\Sigma_3 = \mbox{\boldmath$z$}\otimes \mbox{\boldmath$z$}^T \otimes R_1(\mbox{\boldmath$t$}, \mbox{\boldmath$t$})$ , where is the RK of $W_{2}^{0}(per)$ , $\mbox{\boldmath$t$}=(t_1,\cdots,t_m)^T$ , $R_1(\mbox{\boldmath$t$}, \mbox{\boldmath$t$})$ represents a $m\times m$ matrix of evaluated at all combinations of , and is a $n \times n$ matrix defined in (). To compute the RK matrix for $\beta_{12}$ , we define the linear functional $L_{ij} \beta_{12} = \int_0^T x_i(s) \beta_{12}(s, t_j) ds$ and assume that it is bounded. Since the RK of a tensor product space is the product of the two RK's, the RK of $W_{2}^{0}(per) \otimes W_{2}^{0}(per)$ is . Then elements of the repreducing kernel matrix for $\beta_{12}$ is

$\begin{eqnarray*} L_{ij} L_{kl} R_1(s,u) R_1(t,v) = R_1(t_j,v_l) \int_0^T \int_0^T x_i(s) x_k(u) R_1(s,u)ds du . \end{eqnarray*}$

Thus the RK matrix for $\beta_{12}$ is $\Sigma_4=\mbox{rk.prod}(\Sigma_2,\Sigma_3)$ , where rk.prod prepresents elementwise product of two matrices.

> prec <- matrix(scan(``monthprec.dat'',0), 35, 12, byrow=T)
> y <- as.vector(t(log(prec)))
> xx <- rep(x,rep(12,35))
> R1 <- kronecker(matrix(1,35,35),periodic(t))
> R2 <- kronecker(R,matrix(1,12,12))
> R3 <- kronecker(x%*%t(x),periodic(t))
> R4 <- rk.prod(R1,R2)
> canada6 <- ssr(y ~ xx, rk=list(R1,R2,R3,R4))
> summary(canada6)
Smoothing spline regression fit by GCV method
Call: ssr(formula = y ~ xx, rk = list(R1, R2, R3, R4))

Coefficients (d):
(Intercept)          xx
4.745601623 0.003035296

GCV estimate(s) of smoothing parameter(s) : 
  1.727184e+03 3.630885e-02 1.416582e+07 4.597862e-04
Equivalent Degrees of Freedom (DF):  60.52553
Estimate of sigma:  0.3549322

Number of Observations:  420

We now show how to compute estimated functions evaluated at grid points. We stacked all observations as $\mbox{\boldmath$y$}$ . Denote the corresponding months for $\mbox{\boldmath$y$}$ as $\tilde{\mbox{\boldmath$t$}}=(\mbox{\boldmath$t$}^T,\cdots,\mbox{\boldmath$t$}^T)^T$ . Denote as the total number of observations. Then the estimated functions are represented by

$\begin{eqnarray*} \hat{\alpha}(t) &=& d_1 + N \lambda_1 \sum_{i=1}^N c_i R_1 (t... ...N c_i \int_0^T x_{[i/m]+1}(u) R_1 (s,u) du R_1 (t,\tilde{t}_i), \end{eqnarray*}$

where

is the integer part of

. To compute $\hat{\alpha}$ , $\hat{\beta}_1(s)$ , $\hat{\beta}_2(t)$ at grid points $\mbox{\boldmath$s$}_0$ and $\mbox{\boldmath$t$}_0$ for

and

respectively,

$\begin{eqnarray*} \hat{\alpha}(\mbox{\boldmath$t$}_0) &=& d_1 \mbox{\boldmath$1... ...oldmath$t$}_0,\tilde{t}_i) = N \lambda_3 S_3 \mbox{\boldmath$c$} \end{eqnarray*}$

where $S_1=\mbox{\boldmath$1$}_n^T \otimes R_1(\mbox{\boldmath$t$}_0,\mbox{\boldmath$t$})$ , $S_2=S \otimes \mbox{\boldmath$1$}_{m}^T$ , and $S_3=\mbox{\boldmath$z$}^T \otimes R_1(\mbox{\boldmath$t$}_0,\mbox{\boldmath$t$})$ . The interaction $\beta_{12}$ is a bivariate function. Thus we evaluate it at a bivariate grid $\mbox{\boldmath$s$}_0 \otimes \mbox{\boldmath$t$}_0=\{ (s_{0k},t_{0l}):~k,l=1,\cdots,n_0\}$ :

$\begin{displaymath} \hat{\beta}_{12}(s_{0k},t_{0l}) = N \lambda_4 \sum_{i=1}^N ... ...i S_2[k,i] S_1[l,i] \\ = N \lambda_4 S_4 \mbox{\boldmath$c$}, \end{displaymath}$

where

is a $n_0^2 \times N$ matrix with

. We also compute bootstrap confidence intervals.

> S1 <- kronecker(t(rep(1,35)),periodic(tgrid,t))
> S2 <- kronecker(S,t(rep(1,12)))
> S3 <- kronecker(t(x),periodic(tgrid,t))
> S4 <- NULL
> for (i in 1:50) {for (j in 1:50) S4 <- rbind(S4,S1[i,]*S2[j,])}

> alpha <- canada6$coef$d[1] + n*canada6$lambda[1]*S1%*%canada6$coef$c
> beta0 <- canada6$coef$d[2]
> beta1 <- n*canada6$lambda[2]*S2%*%canada6$coef$c
> beta2 <- n*canada6$lambda[3]*S3%*%canada6$coef$c
> beta12 <- n*canada6$lambda[4]*S4%*%canada6$coef$c
> beta <- beta0 + rep(beta1,rep(50,50)) + rep(beta2,50) + beta12
> nboot <- 99
> alphab <- betab0 <- betab1 <- betab2 <- betab12 <- betab <- NULL
> for (i in 1:nboot) {
    yb <- canada6$fit+rnorm(n,sd=canada6$sig)
    bfit <- ssr(yb ~ xx, rk=list(R1,R2,R3,R4))
    alphab <- cbind(alphab,bfit$coef$d[1] + n*bfit$lambda[1]*S1%*%bfit$coef$c)
    bb0 <- bfit$coef$d[2]
    bb1 <- n*bfit$lambda[2]*S2%*%bfit$coef$c
    bb2 <- n*bfit$lambda[3]*S3%*%bfit$coef$c
    bb12 <- n*bfit$lambda[4]*S4%*%bfit$coef$c
    betab0 <- c(betab0,bb0)
    betab1 <- cbind(betab1,bb1)
    betab2 <- cbind(betab2,bb2)
    betab12 <- cbind(betab12,bb12)
    betab <- cbind(betab, bb0 + rep(bb1,rep(50,50)) + rep(bb2,50) + bb12)
}
> lba <- apply(alphab,1,quantile,prob=.025)
> uba <- apply(alphab,1,quantile,prob=.975)
> lbb1 <- apply(betab1,1,quantile,prob=.025)
> ubb1 <- apply(betab1,1,quantile,prob=.975)
> lbb2 <- apply(betab2,1,quantile,prob=.025)
> ubb2 <- apply(betab2,1,quantile,prob=.975)
> lbb12 <- apply(betab12,1,quantile,prob=.025)
> ubb12 <- apply(betab12,1,quantile,prob=.975)
> lbb <- apply(betab,1,quantile,prob=.025)
> ubb <- apply(betab,1,quantile,prob=.975)

Figure displays the estimates of $\beta_{12}(s,t)$ and $\beta(s,t)$ . The scale, as well as the GCV estimates of the smoothing parameters ( $\lambda_4=4.597862e-04$ ), indicate that the interaction $\beta_{12}$ is not small. Figures and display splices of the estimated interaction function $\beta_{12}$ and their 95% bootstrap confidence intervals with one variable fixed approximately at the middle points of 12 months. Figures and display splices of the estimated function $\beta$ and their 95% bootstrap confidence intervals with one variable fixed approximately at the middle points of 12 months. Figure displays estimates of $\alpha$ , $\beta_1$ and $\beta_2$ functions.

**Figure:** Estimated $\beta_{12}$ (left) and $\beta$ (right) functions with GCV choice of the smoothing parameter.
$\begin{figure}\centerline{\psfig{figure=canada6.ps,height=3in,width=7in}}\end{figure}$

**Figure:** Estimated $\beta_{12}(s,t)$ as a function of with fixed approximately at the middle points of 12 months.
$\begin{figure}\centerline{\psfig{figure=canada7.ps,height=3in,width=5.25in}}\end{figure}$

**Figure:** Estimated $\beta_{12}(s,t)$ as a function of with fixed approximately at the middle points of 12 months.
$\begin{figure}\centerline{\psfig{figure=canada8.ps,height=3in,width=5.25in}}\end{figure}$

**Figure:** Estimated $\beta(s,t)$ as a function of with fixed approximately at the middle points of 12 months.
$\begin{figure}\centerline{\psfig{figure=canada9.ps,height=3in,width=5.25in}}\end{figure}$

**Figure:** Estimated $\beta(s,t)$ as a function of with fixed approximately at the middle points of 12 months.
$\begin{figure}\centerline{\psfig{figure=canada10.ps,height=3in,width=5.25in}}\end{figure}$

**Figure:** Estimates of $\alpha(t)$ (left), $\beta_1(s)$ (middle) and $\beta_2$ (right) functions.
$\begin{figure}\centerline{\psfig{figure=canada11.ps,height=2in,width=7in}}\end{figure}$

We also fit the data using the GML choice of the smoothing parameter.

> canada6.1 <- ssr(y ~ xx, rk=list(R1,R2,R3,R4),spar="m")
> summary(canada6.1)
Smoothing spline regression fit by GML method
Call: ssr(formula = y ~ xx, rk = list(R1, R2, R3, R4), spar = "m")

Coefficients (d):
(Intercept)          xx 
  4.7327087   0.0030902 

GML estimate(s) of smoothing parameter(s) : 
7.195469e-01 4.852421e-02 3.633548e+04 3.613019e+00 
Equivalent Degrees of Freedom (DF):  27.24519 
Estimate of sigma:  0.3733665 

Number of Observations:  420

The GML estimates of smoothing parameters indicates that both the interaction $\beta_{12}$ and the main effect $\beta_2$ are small. Figure displays the estimates of $\beta_{12}(s,t)$ and $\beta(s,t)$ . Figure displays estimates of $\alpha$ , $\beta_1$ and $\beta_2$ functions.

**Figure:** Estimated $\beta_{12}$ (left) and $\beta$ (right) functions with GML choice of the smoothing parameter.
$\begin{figure}\centerline{\psfig{figure=canada12.ps,height=3in,width=7in}}\end{figure}$

**Figure:** Estimates of $\alpha(t)$ (left), $\beta_1(s)$ (middle) and $\beta_2$ (right) functions with GML choice of the smoothing parameter.
$\begin{figure}\centerline{\psfig{figure=canada17.ps,height=2in,width=7in}}\end{figure}$

Next: Human Circadian Rhythms Up: Examples Previous: Core Body Temperature Data

Yuedong Wang 2004-05-19