### Reaction Norms: An Example

Wolf [20] described a reaction norm as a surface landscape determined by genetic and environmental factors. The surface is characterized by a phenotypic trait as a function of different environmental factors such as temperature, light intensity, humidity, etc., and corresponds to a specific genetic effect such as additive, dominant or epistatic [21]. At least in three dimensions, the features of the surface such as "slope", "curvature", "peak valley", and "ridge", can be described graphically to help visualize and elucidate how the underlying factors affect the phenotype.

An example of reaction norms that illustrate a surface landscape is *photosynthesis* [13], the process by which light energy is converted to chemical energy by plants and other living organisms. It is an important yet complex process because it involves several factors such as the age of a leaf (where photosynthesis takes place in most plants), the concentration of carbon dioxide in the environment, temperature, light irradiance, available nutrients and water in the soil. A mathematical expression for the rate of single-leaf photosynthesis, *P*, without photorespiration [22] is

\begin{array}{c}P=\frac{\alpha I\phantom{\rule{0.5em}{0ex}}+{P}_{m}}{2\theta}\\ -\frac{\sqrt{{b}^{2}-4\theta \alpha I{P}_{m}}}{2\theta}\end{array}

(1)

where *b* = (*αI* + *P*_{
m
}, *θ* ∈ (0,1) is a dimensionless parameter, *α* is the photochemical efficiency, *I* is the irradiance, and *P*_{
m
}is the asymptotic photosynthetic rate at a saturating irradiance. *P*_{
m
}is a linear function of the temperature, *T*

{P}_{m}=\{\begin{array}{cc}{P}_{m}(20)P(T)& T\ge T*\\ 0& T<{T}^{*},\end{array}

(2)

where P(T)=\frac{T-{T}^{*}}{20-{T}^{*}}, *P*_{
m
}(20) is the value of *P*_{
m
}at the reference temperature of 20°C and *T** is the temperature at which photosynthesis stops. *T** is chosen over a range of temperatures, such as 5°C-25°C, to provide a good fit to observed data.

Wu et al. [13] studied the reaction norm of photosynthetic rate, defined by Eqs. (1) and (2), as a function of irradiance (*I*) and temperature (*T*). That is, the authors considered *P* = *P*(*I*, *T*). We assume that *T** = 5 so that the reaction norm model parameters are (*α*, *P*_{
m
}(20), *θ*). The surface landscape that describes the reaction norm of *P* (*I,T*), with parameters (*α, P*_{
m
} (20), *θ*) = (0.02, 1, 0.9), is shown in Figure 1. As stated earlier, each reaction norm surface corresponds to a specific genetic effect. Thus, if a QTL is at work, the genetic effects produce different surfaces defined by distinct sets of model parameters corresponding to different genotypes.

### Likelihood

We consider a backcross design with one QTL. Extensions to more complicated designs and the two-QTL case, as in [13], are straightforward. Assume a backcross plant population of size *n* with a single QTL affecting the phenotypic trait of photosynthetic rate. The photosynthetic rate for each progeny *i* (*i* = 1, ..., *n*) is measured at different irradiance (*s* = 1, ..., *S*) and temperature (*t* = 1, ..., *T* ) levels. This choice of variables is adopted for consistency in later discussions as we will be working with spatio-*t*emporal covariance models. The set of phenotype measurements or observations can be written in vector form as

\begin{array}{c}{\text{y}}_{i}\phantom{\rule{0.5em}{0ex}}=\underset{\text{irradiance1}}{\underset{\u23df}{[{y}_{i}(1,1),\mathrm{...},{y}_{i}(1,T),}}\\ \mathrm{...}\phantom{\rule{0.5em}{0ex}},\underset{\text{irradianceS}}{\underset{\u23df}{[{y}_{i}(S,1),\mathrm{...},{y}_{i}(S,T)\text{'},}}\end{array}

(3)

The progeny are genotyped for molecular markers to construct a genetic linkage map for the segregating QTL in the population. This means that the genotypes of the markers are observed and will be used, along with the phenotype measurements, to predict the QTL. With a backcross design, the QTL has two possible genotypes (as do the markers) which shall be indexed by *k* = 1, 2. The likelihood function based on the phenotype and marker data can be formulated as

L(\Omega )={\displaystyle \prod _{i=1}^{n}\left[{\displaystyle \sum _{k=1}^{2}{p}_{k|i}}{f}_{k}({\text{y}}_{i}|\Omega )\right]}

(4)

where *p*_{k|i}is the conditional probability of a QTL genotype given the genotype of a marker interval for progeny *i*. We assume a multivariate normal density for the phenotype vector y _{
i
} with genotype-specific means

\begin{array}{c}{\mu}_{k}\phantom{\rule{0.5em}{0ex}}=\underset{\text{irradiance1}}{\underset{\u23df}{[{\mu}_{k}(1,1),\mathrm{...},{\mu}_{k}(1,T),}}\\ \mathrm{...}\phantom{\rule{0.5em}{0ex}},\underset{\text{irradianceS}}{\underset{\u23df}{[{\mu}_{k}(S,1),\mathrm{...},{\mu}_{k}(S,T)\text{'},}}\end{array}

(5)

and covariance matrix Σ = cov(y_{
i
}).

### Mean and Covariance Models

The mean vector for photosynthetic rate in (5) can be modeled using equations (1) and (2) as

\begin{array}{c}{\mu}_{k}(s,\phantom{\rule{0.5em}{0ex}}t)\phantom{\rule{0.5em}{0ex}}=\frac{{\alpha}_{k}s\phantom{\rule{0.5em}{0ex}}+{P}_{mk}}{2{\theta}_{k}}\\ -\frac{\sqrt{{b}_{k}^{2}-4{\theta}_{k}{\alpha}_{k}s{P}_{mk}}}{2{\theta}_{k}}\end{array}

(6)

Where *b*_{
k
} *= α*_{
k
}*s* + *P*_{
mk
},

{P}_{mk}(t)=\{\begin{array}{cc}{P}_{mk}(20)P(t)& t\ge T*\\ 0& t<{T}^{*}\end{array}

(7)

P(t)=\frac{t-{T}^{*}}{20-{T}^{*}} and *k* = 1, 2.

Wu et al. [13] used a separable structure (Mitchell et al., 2005) for the *ST* × *ST* covariance matrix Σ as

{\Sigma}_{AR(1)}={\Sigma}_{1}\otimes {\Sigma}_{2}

(8)

where Σ_{1} and Σ_{2} are the (*S×S*) and (*T×T*) covariance matrices among different irradiance and temperature levels, respectively, and ⊗ is the Kronecker product operator. Note that Σ_{1} and Σ_{2} are unique only up to multiples of a constant because for some |c| > 0, cΣ_{1} ⊗ (1/c)Σ_{2} = Σ_{1} ⊗ Σ_{2}. Each of Σ_{1} and Σ_{2} is modeled using an AR(1) structure with a common error variance, σ^{2}, and correlation parameters *ρ*_{
k
} (*k* = 1, 2):

{\Sigma}_{k}={\sigma}^{2}\left[\begin{array}{cccc}1& {\rho}_{k}& \cdots & {\rho}_{k}^{S-1}\\ {\rho}_{k}& 1& \cdots & {\rho}_{k}^{S-2}\\ \vdots & \vdots & \ddots & \vdots \\ {\rho}_{k}^{S-1}& {\rho}_{k}^{S-2}& \cdots & 1\end{array}\right]

(9)

Separable covariance structures, however, cannot model interaction effects of each reaction norm to temperature and irradiance. Thus, there is a need for a more general model for this purpose.

Yap et al. [14] proposed to use a data-driven nonparametric covariance estimator in functional mapping. The authors showed that using such estimator provides better estimates for QTL location and mean model parameters when compared to AR(1). Huang et al. [16] showed that the nonparametric estimator works well for large matrices. Functional mapping of reaction norms when there are two environmental signals necessitates the use of large covariance matrices that result from Kronecker products of smaller matrices. Here, we are interested in determining whether the nonparametric covariance estimator of Yap et al. [14] will still work well in this reaction norm setting.

It should be noted that unlike parametric models, e.g. AR(1), there are no parameters being estimated in the nonparametric covariance estimator. The entries of the matrix are determined based on the data. This is different from a model-dependent covariance matrix model with one parameter for each of its elements. Due to over-parametrization, such a model may not lead to convergence to yield reliable results.

Note that with (6)-(9), **Ω** = **Ω**_{
1
} ∪ **Ω**_{
2
} in (4), where **Ω**_{
1
} = {*α*_{1}, *P*_{m1}(20), *θ*_{1}, *σ*^{2}, *ρ*_{
1
} } and **Ω**_{
1
} = {*α*_{2}, *P*_{m2}(20), *θ*_{2}, *σ*^{2}, *ρ*_{2}**}**. These model parameters may be estimated using the ECM algorithm [17], but closed form solutions at the CM-step are be very complicated. A more efficient method is to use the Nelder-Mead simplex algorithm [23] which can be easily implemented using softwares such as Matlab.

### Hypothesis Tests

The features of the surface landscape are important because they can be used as a basis in formulating hypothesis tests. Let *H*_{0} and *H*_{1} denote the null and alternative hypotheses, respectively. Then the existence of a QTL that determines the reaction norm curves can be formulated as

{H}_{0}:{\alpha}_{1}={\alpha}_{2},{P}_{m1}(20)={P}_{m}(20),{\theta}_{1}={\theta}_{2},

versus

*H*_{1} : at least one of the equalities

above does not hold

This means that if the reaction norm curves are distinct (in terms of their respective estimated parameters), then a QTL possibly exists. The estimated location of the QTL is at the point at which the log-likelihood ratio obtained using the null and alternative hypotheses is maximal. Of course a slight difference in parameter estimates does not automatically mean a QTL exists. The significance of the results can be determined by permutation tests [24] which involves a repeated application of the functional mapping model on the data where the phenotype and marker associations are broken to simulate the null hypothesis of no QTL. A significance level is then obtained based on the maximal log-likelihood ratio at each application to infer the presence or absence of a QTL (see ref. [25] for more details). A procedure described in ref. [26] can be used to test the additive effects of a QTL. Other hypotheses can be formulated and tested such as the genetic control of the reaction norm to each environmental factor, interaction effects between environmental factors on the phenotype, and the marginal slope of the reaction norm with respect to each environmental factor or the gradient of the reaction norm itself. The reader is referred to Wu et al. [13] for more details.