- Methodology article
- Open Access
Functional mapping of reaction norms to multiple environmental signals through nonparametric covariance estimation
- John S Yap^{1},
- Yao Li^{2},
- Kiranmoy Das^{3},
- Jiahan Li^{3} and
- Rongling Wu^{4, 3}
https://doi.org/10.1186/1471-2229-11-23
© Yap et al; licensee BioMed Central Ltd. 2011
- Received: 21 November 2009
- Accepted: 26 January 2011
- Published: 26 January 2011
Abstract
Background
The identification of genes or quantitative trait loci that are expressed in response to different environmental factors such as temperature and light, through functional mapping, critically relies on precise modeling of the covariance structure. Previous work used separable parametric covariance structures, such as a Kronecker product of autoregressive one [AR(1)] matrices, that do not account for interaction effects of different environmental factors.
Results
We implement a more robust nonparametric covariance estimator to model these interactions within the framework of functional mapping of reaction norms to two signals. Our results from Monte Carlo simulations show that this estimator can be useful in modeling interactions that exist between two environmental signals. The interactions are simulated using nonseparable covariance models with spatio-temporal structural forms that mimic interaction effects.
Conclusions
The nonparametric covariance estimator has an advantage over separable parametric covariance estimators in the detection of QTL location, thus extending the breadth of use of functional mapping in practical settings.
Keywords
- Photosynthetic Rate
- Covariance Model
- Reaction Norm
- Kronecker Product
- Functional Mapping
Background
The phenotype of a quantitative trait exhibits plasticity if it changes with the environment [1–7]. Such environment-dependent changes, also called reaction norms, are ubiquitous in biology. For example, thermal reaction norms show how performance, such as caterpillar growth rate [8] or growth rate and body size in ectotherms [9], varies continuously with temperature [10]. Another example is the flowering time of Arabidopsis thaliana with respect to changing light intensity [11]. However, QTL mapping of reaction norms is difficult to model because of the inherent complexity in the interplay of the many factors involved. An added difficulty is that reaction norms are "infinite-dimensional": they require an infinite number of measurements to be completely described [12]. Wu et al. [13] proposed a functional mapping-based model which addresses the latter difficulty by using a biologically relevant mathematical function to model reaction norms. The authors considered a parametric model of photosynthetic rate as a function of light irradiance and temperature and studied the genetic mechanism of this process. They showed through simulations that in a backcross population with one or two QTLs, their method accurately and precisely estimated the QTL location(s) and the parameters of the mean model for photosynthetic rate. For a backcross population with one QTL, the mean model consists of two surfaces that describe the photosynthetic rate of two genotypes. However, in their model, they assumed the covariance matrix to be a Kronecker product of two AR(1) structures, each modeling a reaction norm due to one environmental factor. This type of covariance model is said to be separable. Although computationally efficient because of the minimal number of parameters to be estimated, this model only captures separate reaction norm effects and fails to incorporate interactions. A more general approach is therefore needed.
In the context of longitudinal data, Yap et al. [14] proposed a nonparametric covariance estimator in functional mapping. It is nonparametric in the sense that the covariance matrix has an unconstrained set of parameters to be estimated, not in the usual distribution-free sense of nonparametric statistics. This estimator can be obtained by employing a modified Cholesky decomposition of the covariance matrix, which yields component matrices whose elements can be interpreted and modeled as terms in a regression [15]. A penalized likelihood procedure is used to solve the regression with either an L_{1} or L_{2} penalty [16]. Penalized likelihood in regression is a technique used to obtain minimum mean squared error (MSE) of estimated regression coefficients by balancing bias and variance. L_{1} or L_{2} penalties, which are functions of the regression coefficients, are included in a regression model in order to shrink coefficients towards estimates with minimum MSE. In the case of the L_{1} penalty, some of the coefficients are actually shrunk to zero. Thus, with the L_{1} penalty, a more parsimonious regression model is obtained. The use of penalized likelihood with L_{1} or L_{2} penalties is particularly useful when there is multi-collinearity among the covariates in the regression, i.e. when there are near linear dependencies or high correlations among the regressors or predictor variables. An iterative procedure is implemented by using the ECM algorithm [17] to obtain the final estimator. Through Monte Carlo simulations, this nonparametric estimator was found to provide more accurate and precise mean parameter and QTL location estimates than the parametric AR(1) form for the covariance model, especially when the underlying covariance structure of the data is significantly different from the assumed model.
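The modified Cholesky decomposition mentioned above can be illustrated with a small numerical sketch. It assumes the usual form T Σ T' = D, with T unit lower triangular and D diagonal; the function name and the toy AR(1) matrix are hypothetical and are not taken from the authors' code. It does not include the penalized regression step.

```python
import numpy as np

def modified_cholesky(sigma):
    """Return (T, D) such that T @ sigma @ T.T == D.

    T is unit lower triangular; its below-diagonal entries are the negatives
    of the coefficients obtained when each variable is regressed on its
    predecessors. D holds the corresponding innovation variances.
    """
    L = np.linalg.cholesky(sigma)      # sigma = L @ L.T, L lower triangular
    d_sqrt = np.diag(L)                # square roots of the innovation variances
    C = L / d_sqrt                     # divide column j by d_sqrt[j]: unit lower triangular
    T = np.linalg.inv(C)               # sigma = C D C'  =>  T sigma T' = D
    D = np.diag(d_sqrt ** 2)
    return T, D

# Toy example (assumption): a 4 x 4 AR(1) correlation matrix with rho = 0.6
idx = np.arange(4)
sigma = 0.6 ** np.abs(idx[:, None] - idx[None, :])
T, D = modified_cholesky(sigma)
print(np.round(T, 3))
print(np.round(np.diag(D), 3))
```

For an AR(1) matrix only the first subdiagonal of T is nonzero, which is the kind of sparsity a penalty on the regression coefficients is meant to recover when the structure is unknown.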
The question of how to incorporate interaction effects in a model with multiple factors has not, to our knowledge, been thoroughly explored in the biology literature, especially in the context of genetic mapping that incorporates interactions of function-valued traits. The spatio-temporal literature, however, has a wealth of publications that developed more general models such as nonseparable covariance structures which are used to model the underlying interactions of random processes in the space and time domains (see [18, 19]). A nonseparable covariance cannot be expressed as a Kronecker product of two matrices like separable structures can. The random processes being modeled may be the concentration of pollutants in the atmosphere, groundwater contaminants, wind speed, or even disposable household incomes. The main significance of the covariance in this context is in providing a better characterization of the random process to obtain optimal kriging or prediction of unobserved portions of it. It therefore seems natural to consider the utilization of nonseparable structures in the simulation and modeling of reaction norms that react to two environmental factors. More concretely, we consider the photosynthetic rate as a random process, and the irradiance and temperature as the spatial (one dimension) and temporal domains, respectively.
The remainder of this paper is organized as follows. We first describe the functional mapping model proposed by Wu et al. [13] for reaction norms. Then, we formulate separable and nonseparable models used in spatio-temporal analyses and present a simulation study using some nonseparable structures. Lastly, the new model and its implications for genetic mapping are discussed. From here on, the terms covariance matrix, covariance structure and covariance function are used interchangeably.
Functional Mapping of Reaction Norms
Reaction Norms: An Example
Wolf [20] described a reaction norm as a surface landscape determined by genetic and environmental factors. The surface is characterized by a phenotypic trait as a function of different environmental factors such as temperature, light intensity, humidity, etc., and corresponds to a specific genetic effect such as additive, dominant or epistatic [21]. At least in three dimensions, the features of the surface such as "slope", "curvature", "peak", "valley", and "ridge" can be described graphically to help visualize and elucidate how the underlying factors affect the phenotype.
where $P(T)=\frac{T-{T}^{*}}{20-{T}^{*}}$, P_{m}(20) is the value of P_{m} at the reference temperature of 20°C, and T* is the temperature at which photosynthesis stops. T* is chosen over a range of temperatures, such as 5°C-25°C, to provide a good fit to observed data.
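The temperature scaling P(T) can be sketched in code. Because the mean equation itself is not reproduced in this text, the sketch assumes that the light response is the non-rectangular hyperbola described by Thornley and Johnson [22] and that P_{m}(T) = P_{m}(20) P(T). That functional form, the function names, and the value T* = 5°C are assumptions made for illustration; only the parameter values (α, P_{m}(20), θ) = (0.02, 2, 0.9) and the irradiance and temperature levels are taken from the simulation design in this paper.

```python
import numpy as np

def temp_scale(T, T_star):
    """P(T) = (T - T*) / (20 - T*): equals 1 at 20 degrees C and 0 at T*."""
    return (T - T_star) / (20.0 - T_star)

def photosynthetic_rate(I, T, alpha, Pm20, theta, T_star):
    """Assumed non-rectangular hyperbola in irradiance I with a
    temperature-dependent asymptote Pm(T) = Pm20 * P(T)."""
    Pm = Pm20 * temp_scale(T, T_star)
    s = alpha * I + Pm
    # Smaller root of theta*P^2 - s*P + alpha*I*Pm = 0
    return (s - np.sqrt(s ** 2 - 4.0 * theta * alpha * I * Pm)) / (2.0 * theta)

irradiance = np.array([0.0, 50.0, 100.0, 200.0, 300.0])
temperature = np.array([15.0, 20.0, 25.0, 30.0])
T_STAR = 5.0  # hypothetical choice within the 5-25 degrees C range

# Rows index irradiance levels, columns index temperature levels
surface = photosynthetic_rate(irradiance[:, None], temperature[None, :],
                              alpha=0.02, Pm20=2.0, theta=0.9, T_star=T_STAR)
print(np.round(surface, 3))
```

Evaluating the function on the 5 × 4 grid gives one mean surface per QTL genotype once genotype-specific parameters are supplied.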
Likelihood
and covariance matrix Σ = cov(y_{ i }).
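As a rough illustration of the likelihood used in functional mapping of a backcross, the sketch below evaluates a two-component multivariate normal mixture log-likelihood with a common covariance matrix. The names `mvn_logpdf`, `mixture_loglik` and `w1`, the toy data, and the constant mixing weights are assumptions; in practice the weights would be conditional QTL genotype probabilities computed from flanking markers.

```python
import numpy as np

def mvn_logpdf(Y, mu, sigma):
    """Row-wise log density of N(mu, sigma) for Y of shape (n, m)."""
    m = Y.shape[1]
    L = np.linalg.cholesky(sigma)
    Z = np.linalg.solve(L, (Y - mu).T)            # whitened residuals, shape (m, n)
    logdet = 2.0 * np.sum(np.log(np.diag(L)))
    return -0.5 * (m * np.log(2.0 * np.pi) + logdet + np.sum(Z ** 2, axis=0))

def mixture_loglik(Y, w1, mu1, mu2, sigma):
    """Log-likelihood of a two-genotype mixture.

    w1[i] is the (assumed known) probability that individual i carries
    QTL genotype 1; it must lie strictly between 0 and 1 here.
    """
    l1 = np.log(w1) + mvn_logpdf(Y, mu1, sigma)
    l2 = np.log1p(-w1) + mvn_logpdf(Y, mu2, sigma)
    return np.sum(np.logaddexp(l1, l2))           # stable log(exp(l1) + exp(l2))

# Toy data (assumption): 30 individuals, 4 measurements each
rng = np.random.default_rng(0)
m = 4
sigma = 0.5 * np.eye(m) + 0.5                     # unit variances, correlation 0.5
mu1, mu2 = np.full(m, 2.0), np.full(m, 1.5)
Y = rng.multivariate_normal(mu1, sigma, size=30)
w1 = np.full(30, 0.5)
ll = mixture_loglik(Y, w1, mu1, mu2, sigma)
print(ll)
```

When the two genotype means coincide, the mixture collapses to a single normal likelihood, which is the situation described by the null hypothesis of no QTL.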
Mean and Covariance Models
$P(t)=\frac{t-{T}^{*}}{20-{T}^{*}}$ and k = 1, 2.
Separable covariance structures, however, cannot model interaction effects of each reaction norm to temperature and irradiance. Thus, there is a need for a more general model for this purpose.
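A separable covariance of the kind discussed here can be built as a Kronecker product of two AR(1) correlation matrices. The sketch below is a minimal illustration; the values of ρ and σ^{2}, the variable names, and the irradiance-major ordering of the observation vector are assumptions rather than the paper's settings.

```python
import numpy as np

def ar1_corr(m, rho):
    """m x m AR(1) correlation matrix with entries rho^|i - j|."""
    idx = np.arange(m)
    return rho ** np.abs(idx[:, None] - idx[None, :])

sigma2, rho_irr, rho_temp = 1.0, 0.7, 0.5      # hypothetical parameter values
R_irr = ar1_corr(5, rho_irr)                   # 5 irradiance levels
R_temp = ar1_corr(4, rho_temp)                 # 4 temperature levels

# Observation (i, j) = (irradiance i, temperature j) sits at position 4*i + j
Sigma_sep = sigma2 * np.kron(R_irr, R_temp)    # 20 x 20 separable covariance
print(Sigma_sep.shape)
```

Each entry factorizes as σ^{2} ρ_irr^{|i-k|} ρ_temp^{|j-l|}, so the correlation across irradiance levels is the same at every temperature. That factorization is what prevents the model from expressing interaction between the two signals.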
Yap et al. [14] proposed using a data-driven nonparametric covariance estimator in functional mapping. The authors showed that such an estimator provides better estimates of QTL location and mean model parameters than AR(1). Huang et al. [16] showed that the nonparametric estimator works well for large matrices. Functional mapping of reaction norms when there are two environmental signals necessitates the use of large covariance matrices that result from Kronecker products of smaller matrices. Here, we are interested in determining whether the nonparametric covariance estimator of Yap et al. [14] will still work well in this reaction norm setting.
It should be noted that unlike parametric models, e.g. AR(1), there are no parameters being estimated in the nonparametric covariance estimator. The entries of the matrix are determined based on the data. This is different from a model-dependent covariance matrix model with one parameter for each of its elements. Because of over-parametrization, such a model may fail to converge or may yield unreliable results.
Note that with (6)-(9), Ω = Ω_{1} ∪ Ω_{2} in (4), where Ω_{1} = {α_{1}, P_{m1}(20), θ_{1}, σ^{2}, ρ_{1}} and Ω_{2} = {α_{2}, P_{m2}(20), θ_{2}, σ^{2}, ρ_{2}}. These model parameters may be estimated using the ECM algorithm [17], but closed-form solutions at the CM-step are very complicated. A more efficient method is to use the Nelder-Mead simplex algorithm [23], which can be easily implemented using software such as Matlab.
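To give a feel for likelihood maximization with the Nelder-Mead simplex algorithm, the following sketch fits the two parameters of a zero-mean AR(1) covariance with SciPy's `scipy.optimize.minimize`. It is a simplified stand-in and not the authors' Matlab implementation: the toy data, the log and tanh reparameterization, and all names are assumptions, and the full model would also include the mean surface parameters and the mixture over QTL genotypes.

```python
import numpy as np
from scipy.optimize import minimize

def ar1_cov(m, sigma2, rho):
    """AR(1) covariance: sigma2 * rho^|i - j|."""
    idx = np.arange(m)
    return sigma2 * rho ** np.abs(idx[:, None] - idx[None, :])

def neg_loglik(params, Y):
    """Negative log-likelihood of zero-mean normal rows of Y under AR(1).

    params = (log sigma2, atanh rho) keeps the search unconstrained.
    """
    sigma2, rho = np.exp(params[0]), np.tanh(params[1])
    n, m = Y.shape
    S = ar1_cov(m, sigma2, rho)
    _, logdet = np.linalg.slogdet(S)
    quad = np.sum(Y * np.linalg.solve(S, Y.T).T)   # sum_i y_i' S^{-1} y_i
    return 0.5 * (n * (m * np.log(2.0 * np.pi) + logdet) + quad)

# Toy data (assumption): 400 individuals, 5 measurements, sigma2 = 1.5, rho = 0.6
rng = np.random.default_rng(0)
Y = rng.multivariate_normal(np.zeros(5), ar1_cov(5, 1.5, 0.6), size=400)

fit = minimize(neg_loglik, x0=np.zeros(2), args=(Y,), method="Nelder-Mead")
sigma2_hat, rho_hat = np.exp(fit.x[0]), np.tanh(fit.x[1])
print(sigma2_hat, rho_hat)
```

Nelder-Mead needs only function values, which is convenient when derivatives of the mixture likelihood are awkward to write down.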
Hypothesis Tests
versus H_{1}: at least one of the equalities above does not hold.
This means that if the reaction norm curves are distinct (in terms of their respective estimated parameters), then a QTL possibly exists. The estimated location of the QTL is the point at which the log-likelihood ratio obtained using the null and alternative hypotheses is maximal. Of course, a slight difference in parameter estimates does not automatically mean a QTL exists. The significance of the results can be determined by permutation tests [24], which involve repeated application of the functional mapping model to data in which the phenotype and marker associations are broken to simulate the null hypothesis of no QTL. A significance threshold is then obtained from the maximal log-likelihood ratio at each application to infer the presence or absence of a QTL (see ref. [25] for more details). A procedure described in ref. [26] can be used to test the additive effects of a QTL. Other hypotheses can be formulated and tested, such as the genetic control of the reaction norm to each environmental factor, interaction effects between environmental factors on the phenotype, and the marginal slope of the reaction norm with respect to each environmental factor or the gradient of the reaction norm itself. The reader is referred to Wu et al. [13] for more details.
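The permutation procedure can be sketched generically. In the example below, `toy_max_stat` is a hypothetical stand-in for the maximal log-likelihood ratio of the functional mapping model (it uses a squared standardized mean difference on a univariate phenotype), and the independent markers and effect size are toy assumptions; only the shuffling logic is meant to mirror the description above.

```python
import numpy as np

def toy_max_stat(y, markers):
    """Stand-in for the maximal LR over the genome: the largest squared
    standardized difference in phenotype means between marker classes."""
    stats = []
    for j in range(markers.shape[1]):
        g = markers[:, j].astype(bool)
        diff = y[g].mean() - y[~g].mean()
        se = np.sqrt(y[g].var(ddof=1) / g.sum() + y[~g].var(ddof=1) / (~g).sum())
        stats.append((diff / se) ** 2)
    return max(stats)

def permutation_threshold(y, markers, scan_fn, n_perm=200, alpha=0.05, seed=0):
    """Empirical (1 - alpha) quantile of the maximal statistic under the null."""
    rng = np.random.default_rng(seed)
    null_max = np.empty(n_perm)
    for b in range(n_perm):
        y_perm = y[rng.permutation(len(y))]    # break phenotype-marker association
        null_max[b] = scan_fn(y_perm, markers)
    return np.quantile(null_max, 1.0 - alpha), null_max

# Toy data (assumption): 100 individuals, 6 independent 0/1 markers,
# with the fourth marker shifting the phenotype mean by 1.5
rng = np.random.default_rng(1)
markers = rng.integers(0, 2, size=(100, 6))
y = 1.5 * markers[:, 3] + rng.normal(size=100)

observed = toy_max_stat(y, markers)
threshold, null_max = permutation_threshold(y, markers, toy_max_stat)
print(observed, threshold)
```

Taking the maximum over the whole scan within each permutation is what makes the resulting threshold genome-wide rather than pointwise.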
Spatio-Temporal Covariances
We investigate the use of parametric and nonseparable spatio-temporal covariance structures in functional mapping of photosynthetic rate as a reaction norm to the environmental factors irradiance and temperature. As stated earlier, the main idea is to model irradiance as a one-dimensional spatial variable and temperature as a temporal variable. The choice of which environmental signal is modeled as temporal or spatial is arbitrary. For more about spatio-temporal modeling, we refer the reader to [27, 19].
Basic Ideas, Notation, and Assumptions
to characterize unobserved portions of the process. This collection of coordinates is not necessarily a set of ordered, fixed levels of each trait. We will only be concerned with the case d = 1. Aside from those mentioned earlier, Y may also represent ozone levels, disease incidence, ocean current patterns or water temperatures. In our setting, Y represents photosynthetic rate.
Note that C(u, 0) and C(0, v) correspond to purely spatial and purely temporal covariance functions, respectively.
In spatio-temporal analysis, the ultimate goal is optimal prediction (or kriging) of an unobserved part of the random process Y(s, t) using an appropriate covariance function model. Here, we instead use a covariance model to calculate the mixture likelihood associated with functional mapping.
Separable and Nonseparable Covariance Structures
Separable Covariance Structures
where C_{1}(u | θ_{1}) and C_{2}(v | θ_{2}) are purely spatial and purely temporal covariance functions, respectively, and θ = (θ_{1}, θ_{2})'. This representation implies that the observed joint process can be seen as a product of two independent spatial and temporal processes.
where a and b are scale parameters. In this model, the scale parameters correct for the uneven distances between coordinates.
Nonseparable Covariance Structures
Here, we present some nonseparable covariance models that were derived in two different ways. The details of the derivation are omitted as they are rather complicated and lengthy.
where a, b ≥ 0 are scaling parameters of time and space, respectively; c ≥ 0 is an interaction parameter of time and space, and σ^{ 2 } = C(0, 0) ≥ 0. Note that when c = 0, (18) reduces to a separable model.
with (u, v) ∈ ℛ × ℛ and where a, b > 0 are scaling parameters of space and time, respectively; α, β ∈ (0, 1] are smoothness parameters of space and time, respectively; γ ∈ [0, 1]; τ ≥ 1/2; and σ^{2} ≥ 0. γ is a space-time interaction parameter which implies a separable structure when γ = 0 and a nonseparable structure otherwise. Increasing values of γ indicate strengthening spatio-temporal interaction.
Computer Simulation
where a, b ≥ 0; γ ∈ [0, 1] and σ^{2} > 0. C_{1} and C_{2} correspond to (16) and (17), respectively, and C_{3} is a special case of (19) with α = 1/2, β = 1/2 and τ = 1.
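The role of the interaction parameter γ can be checked numerically. Since equation (19) is not reproduced in this text, the sketch below assumes a Gneiting-type form [27] with α = β = 1/2 and τ = 1, namely C(u, v) = σ^{2}/(b|v| + 1) · exp(−a|u|/(b|v| + 1)^{γ/2}). That expression, the function names, and the lag grids are assumptions for illustration and may differ from the paper's C_{3}.

```python
import numpy as np

def c3(u, v, a, b, gamma, sigma2=1.0):
    """Assumed Gneiting-type space-time covariance (alpha = beta = 1/2, tau = 1)."""
    psi = b * np.abs(v) + 1.0
    return sigma2 / psi * np.exp(-a * np.abs(u) / psi ** (gamma / 2.0))

def separability_gap(cov_fn, u, v):
    """Largest violation of C(u, v) * C(0, 0) == C(u, 0) * C(0, v),
    an identity that holds for every separable stationary covariance."""
    lhs = cov_fn(u, v) * cov_fn(0.0, 0.0)
    rhs = cov_fn(u, 0.0) * cov_fn(0.0, v)
    return np.max(np.abs(lhs - rhs))

u = np.linspace(0.0, 3.0, 7)[:, None]      # illustrative spatial lags
v = np.linspace(0.0, 15.0, 4)[None, :]     # illustrative temporal lags

gap_sep = separability_gap(lambda s, t: c3(s, t, a=1.0, b=0.01, gamma=0.0), u, v)
gap_nonsep = separability_gap(lambda s, t: c3(s, t, a=1.0, b=0.01, gamma=0.6), u, v)
print(gap_sep, gap_nonsep)
```

With γ = 0 the exponent no longer depends on v, so the function factorizes into a purely spatial and a purely temporal part and the gap vanishes; with γ > 0 it does not.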
We generated photosynthetic rate data using these nonseparable covariances to simulate interaction effects between the two environmental signals in functional mapping of a reaction norm. The generated data were analyzed using the nonparametric estimator Σ_{NP} proposed by Yap et al. [14] with an L_{2} penalty, and using Σ_{AR(1)} (equation (8)). Note that the underlying covariance structures were very different from the assumed model, Σ_{AR(1)}, and we therefore expected to get biased estimates. The issue we wanted to address was whether the bias is large enough that it cannot be ignored, in which case an alternative estimator such as Σ_{NP} may be more appropriate.
where $\widehat{\Sigma}$ is the estimate of the true underlying covariance Σ [14, 16, 29–31]. Each loss function is 0 when $\widehat{\Sigma}=\Sigma $ and large values suggest significant bias.
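The two loss functions can be sketched as follows. Because their defining equations are not reproduced in this text, the code assumes the forms commonly used in the cited covariance estimation literature, L_E = tr(Σ^{-1}Σ̂) − log|Σ^{-1}Σ̂| − m and L_Q = tr[(Σ^{-1}Σ̂ − I)^{2}], where m is the matrix dimension. The function names and the toy matrices are hypothetical.

```python
import numpy as np

def entropy_loss(sigma, sigma_hat):
    """Assumed entropy loss: tr(A) - log det(A) - m with A = sigma^{-1} sigma_hat."""
    m = sigma.shape[0]
    A = np.linalg.solve(sigma, sigma_hat)
    _, logdet = np.linalg.slogdet(A)
    return np.trace(A) - logdet - m

def quadratic_loss(sigma, sigma_hat):
    """Assumed quadratic loss: tr((A - I)^2) with A = sigma^{-1} sigma_hat."""
    m = sigma.shape[0]
    A = np.linalg.solve(sigma, sigma_hat) - np.eye(m)
    return np.trace(A @ A)

# Toy check (assumption): a 3 x 3 AR(1) matrix and an estimate that is twice too large
idx = np.arange(3)
sigma = 0.5 ** np.abs(idx[:, None] - idx[None, :])
print(entropy_loss(sigma, sigma), quadratic_loss(sigma, sigma))
print(entropy_loss(sigma, 2 * sigma), quadratic_loss(sigma, 2 * sigma))
```

Under these forms an estimate that doubles every entry of a 3 × 3 matrix gives L_E = 3(1 − log 2) and L_Q = 3, which shows how the quadratic loss penalizes overestimation more heavily.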
Using a backcross design for the QTL mapping population, we randomly generated 6 markers equally spaced on a chromosome 100 cM long. One QTL was simulated between the fourth and fifth markers, 12 cM from the fourth marker (or 72 cM from the leftmost marker of the chromosome). The QTL had two possible genotypes which determined two distinct mean photosynthetic rate reaction norm surfaces defined by equations (1) and (2) (see also Figure 1). The surface parameters for each genotype were (α_{1}, P_{m1}(20), θ_{1}) = (0.02, 2, 0.9) and (α_{2}, P_{m2}(20), θ_{2}) = (0.01, 1.5, 0.9). Phenotype observations were obtained by sampling from a multivariate normal distribution with mean surface based on irradiance and temperature levels of {0, 50, 100, 200, 300} and {15, 20, 25, 30}, respectively, and covariance matrix C_{ l }(u, v), l = 1, 2, 3 with a = 0.50, b = 0.01 for C_{1}, a = 1.00, b = 0.01 for C_{2}, a = 1.00, b = 0.01, c = 0.60 for C_{3} and σ^{2} = 1.00 for all three covariances.
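The simulation design can be sketched in a few lines. The marker spacing, QTL position, and sample size follow the description above, but the Haldane map function, the 0/1 genotype coding, and the placeholder mean vectors and identity covariance are assumptions. In the actual study the means come from the reaction norm surfaces and the covariance from C_{1}, C_{2} or C_{3}.

```python
import numpy as np

def haldane(d_cM):
    """Recombination fraction for a map distance in cM (Haldane map function, assumed)."""
    return 0.5 * (1.0 - np.exp(-2.0 * d_cM / 100.0))

def simulate_backcross(n, positions_cM, rng):
    """Simulate 0/1 genotypes at ordered loci for n backcross individuals."""
    geno = np.empty((n, len(positions_cM)), dtype=int)
    geno[:, 0] = rng.integers(0, 2, size=n)
    for j in range(1, len(positions_cM)):
        r = haldane(positions_cM[j] - positions_cM[j - 1])
        flip = rng.random(n) < r                      # recombination events
        geno[:, j] = np.where(flip, 1 - geno[:, j - 1], geno[:, j - 1])
    return geno

rng = np.random.default_rng(1)
positions = np.array([0.0, 20.0, 40.0, 60.0, 72.0, 80.0, 100.0])  # QTL at 72 cM (index 4)
geno = simulate_backcross(400, positions, rng)
qtl = geno[:, 4]                       # unobserved in a real analysis
markers = np.delete(geno, 4, axis=1)   # the six observed markers

# Placeholder genotype-specific means and covariance (assumptions)
m = 20                                 # 5 irradiance x 4 temperature levels
mu = np.vstack([np.full(m, 2.0), np.full(m, 1.5)])
Sigma = np.eye(m)
Y = mu[qtl] + rng.multivariate_normal(np.zeros(m), Sigma, size=len(qtl))
print(markers.shape, Y.shape)
```

Replacing `mu` and `Sigma` with a mean surface per genotype and a nonseparable covariance evaluated at the irradiance and temperature levels would give data of the kind analyzed in the tables.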
Averaged QTL position, mean curve parameters, entropy and quadratic losses and their standard errors (given in parentheses) for two QTL genotypes in a backcross population under different sample sizes (n) based on 100 simulation replicates (Σ_{NP}). Subscripts 1 and 2 refer to QTL genotypes 1 and 2.
Covariance | n | QTL location | ${\widehat{\alpha}}_{1}$ | ${\widehat{P}}_{m1}(20)$ | ${\widehat{\theta}}_{1}$ | ${\widehat{\alpha}}_{2}$ | ${\widehat{P}}_{m2}(20)$ | ${\widehat{\theta}}_{2}$ | L_{E} | L_{Q}
---|---|---|---|---|---|---|---|---|---|---
C_{1} | 200 | 71.68 (0.28) | 0.02 (0.00) | 2.02 (0.01) | 0.90 (0.00) | 0.01 (0.00) | 1.52 (0.02) | 0.88 (0.01) | 1.04 (0.01) | 2.03 (0.02)
C_{1} | 400 | 72.16 (0.23) | 0.02 (0.00) | 2.00 (0.01) | 0.90 (0.00) | 0.01 (0.00) | 1.52 (0.01) | 0.88 (0.01) | 0.53 (0.00) | 1.06 (0.01)
C_{2} | 200 | 71.88 (0.29) | 0.02 (0.00) | 2.00 (0.01) | 0.90 (0.00) | 0.01 (0.00) | 1.53 (0.01) | 0.88 (0.01) | 1.00 (0.01) | 1.96 (0.02)
C_{2} | 400 | 71.92 (0.17) | 0.02 (0.00) | 2.00 (0.01) | 0.90 (0.00) | 0.01 (0.00) | 1.52 (0.01) | 0.89 (0.01) | 0.52 (0.00) | 1.02 (0.01)
C_{3} | 200 | 72.12 (0.37) | 0.02 (0.00) | 2.01 (0.01) | 0.89 (0.01) | 0.01 (0.00) | 1.54 (0.02) | 0.87 (0.01) | 0.88 (0.01) | 1.70 (0.02)
C_{3} | 400 | 72.08 (0.20) | 0.02 (0.00) | 2.01 (0.01) | 0.90 (0.00) | 0.01 (0.00) | 1.52 (0.01) | 0.89 (0.01) | 0.48 (0.00) | 0.94 (0.01)
True | | 72.00 | 0.02 | 2.00 | 0.90 | 0.01 | 1.50 | 0.90 | |
Averaged QTL position, mean curve parameters, entropy and quadratic losses and their standard errors (given in parentheses) for two QTL genotypes in a backcross population under different sample sizes (n) based on 100 simulation replicates (Σ_{AR(1)}). Subscripts 1 and 2 refer to QTL genotypes 1 and 2.
Covariance | n | QTL location | ${\widehat{\alpha}}_{1}$ | ${\widehat{P}}_{m1}(20)$ | ${\widehat{\theta}}_{1}$ | ${\widehat{\alpha}}_{2}$ | ${\widehat{P}}_{m2}(20)$ | ${\widehat{\theta}}_{2}$ | L_{E} | L_{Q}
---|---|---|---|---|---|---|---|---|---|---
C_{1} | 200 | 72.32 (0.45) | 0.02 (0.00) | 2.03 (0.01) | 0.90 (0.01) | 0.01 (0.00) | 1.53 (0.02) | 0.87 (0.01) | 19.43 (0.07) | 681.78 (6.16)
C_{1} | 400 | 71.72 (0.27) | 0.02 (0.00) | 2.03 (0.01) | 0.90 (0.00) | 0.01 (0.00) | 1.51 (0.01) | 0.89 (0.01) | 19.45 (0.05) | 684.11 (4.40)
C_{2} | 200 | 71.96 (0.34) | 0.02 (0.00) | 2.01 (0.01) | 0.90 (0.00) | 0.01 (0.00) | 1.55 (0.02) | 0.87 (0.01) | 4.83 (0.02) | 58.60 (1.01)
C_{2} | 400 | 71.84 (0.20) | 0.02 (0.00) | 2.01 (0.01) | 0.90 (0.00) | 0.01 (0.00) | 1.52 (0.01) | 0.89 (0.01) | 4.83 (0.02) | 58.61 (0.77)
C_{3} | 200 | 72.00 (0.35) | 0.02 (0.00) | 2.01 (0.01) | 0.89 (0.01) | 0.01 (0.00) | 1.54 (0.02) | 0.87 (0.01) | 0.60 (0.00) | 1.51 (0.10)
C_{3} | 400 | 71.96 (0.22) | 0.02 (0.00) | 2.01 (0.01) | 0.89 (0.00) | 0.01 (0.00) | 1.52 (0.01) | 0.89 (0.01) | 0.60 (0.00) | 1.43 (0.08)
True | | 72.00 | 0.02 | 2.00 | 0.90 | 0.01 | 1.50 | 0.90 | |
1. σ^{2} = 2, 4 with irradiance and temperature levels of {0, 50, 100, 200, 300} and {15, 20, 25, 30}, respectively.
2. σ^{2} = 1, 2 with irradiance and temperature levels of {0, 50, 100, 150, 200, 250, 300} and {15, 18, 21, 24, 27, 30}, respectively.
Averaged QTL position, mean curve parameters, log-likelihood values, maximum log-likelihood ratios (maxLR), entropy and quadratic losses and their standard errors (given in parentheses) for two QTL genotypes in a backcross population based on 100 simulation replicates (C_{1} with n = 400 and σ^{2} = 2, 4). Subscripts 1 and 2 refer to QTL genotypes 1 and 2; H_{0} and H_{1} are the log-likelihood values under the null and alternative hypotheses.
Covariance | σ^{2} | QTL location | ${\widehat{\alpha}}_{1}$ | ${\widehat{P}}_{m1}(20)$ | ${\widehat{\theta}}_{1}$ | ${\widehat{\alpha}}_{2}$ | ${\widehat{P}}_{m2}(20)$ | ${\widehat{\theta}}_{2}$ | H_{0} | H_{1} | maxLR | L_{E} | L_{Q}
---|---|---|---|---|---|---|---|---|---|---|---|---|---
Σ_{AR(1)} | 2 | 72.40 (0.44) | 0.02 (0.00) | 2.05 (0.01) | 0.89 (0.01) | 0.01 (0.00) | 1.52 (0.02) | 0.87 (0.01) | -5437 (7.36) | -5373 (7.31) | 128.51 (2.45) | 19.45 (0.05) | 684.37 (4.44)
Σ_{AR(1)} | 4 | 74.20 (0.69) | 0.02 (0.00) | 2.11 (0.02) | 0.88 (0.01) | 0.01 (0.00) | 1.52 (0.03) | 0.84 (0.02) | -8175 (7.32) | -8141 (7.31) | 65.55 (1.80) | 19.44 (0.05) | 683.82 (4.46)
C_{1} | 2 | 71.96 (0.29) | 0.02 (0.00) | 2.01 (0.01) | 0.90 (0.00) | 0.01 (0.00) | 1.54 (0.02) | 0.88 (0.01) | -4088 (7.17) | -4021 (7.16) | 133.41 (2.15) | 0.01 (0.00) | 0.13 (0.02)
C_{1} | 4 | 71.96 (0.44) | 0.02 (0.00) | 2.03 (0.01) | 0.89 (0.01) | 0.01 (0.00) | 1.57 (0.03) | 0.86 (0.02) | -6822 (7.16) | -6788 (7.16) | 69.07 (1.57) | 0.01 (0.00) | 0.13 (0.02)
Σ_{NP} | 2 | 72.16 (0.29) | 0.02 (0.00) | 2.01 (0.01) | 0.89 (0.00) | 0.01 (0.00) | 1.54 (0.02) | 0.87 (0.01) | -3967 (6.87) | -3912 (6.89) | 109.79 (1.66) | 0.53 (0.00) | 1.05 (0.01)
Σ_{NP} | 4 | 71.64 (0.49) | 0.02 (0.00) | 2.01 (0.01) | 0.89 (0.01) | 0.01 (0.00) | 1.57 (0.03) | 0.84 (0.02) | -6713 (6.89) | -6684 (6.93) | 59.92 (1.27) | 0.53 (0.00) | 1.04 (0.01)
True | | 72.00 | 0.02 | 2.00 | 0.90 | 0.01 | 1.50 | 0.90 | | | | |
Averaged QTL position, mean curve parameters, log-likelihood values, maximum log-likelihood ratios (maxLR), entropy and quadratic losses and their standard errors (given in parentheses) for two QTL genotypes in a backcross population based on 100 simulation replicates (C_{1} with n = 400, increased irradiance and temperature levels, and σ^{2} = 1, 2). Subscripts 1 and 2 refer to QTL genotypes 1 and 2; H_{0} and H_{1} are the log-likelihood values under the null and alternative hypotheses.
Covariance | σ^{2} | QTL location | ${\widehat{\alpha}}_{1}$ | ${\widehat{P}}_{m1}(20)$ | ${\widehat{\theta}}_{1}$ | ${\widehat{\alpha}}_{2}$ | ${\widehat{P}}_{m2}(20)$ | ${\widehat{\theta}}_{2}$ | H_{0} | H_{1} | maxLR | L_{E} | L_{Q}
---|---|---|---|---|---|---|---|---|---|---|---|---|---
Σ_{AR(1)} | 1 | 72.16 (0.36) | 0.02 (0.00) | 2.04 (0.01) | 0.90 (0.00) | 0.01 (0.00) | 1.48 (0.01) | 0.88 (0.01) | -1278 (14.01) | -1063 (14.15) | 430.01 (4.78) | 223 (0.45) | 64090 (261.88)
Σ_{AR(1)} | 2 | 78.44 (0.84) | 0.02 (0.00) | 2.15 (0.02) | 0.91 (0.00) | 0.01 (0.00) | 1.48 (0.02) | 0.86 (0.01) | -6992 (14.08) | -6876 (14.16) | 231.86 (3.62) | 222 (0.44) | 63923 (257.89)
C_{1} | 1 | 71.76 (0.18) | 0.02 (0.00) | 2.01 (0.00) | 0.90 (0.00) | 0.01 (0.00) | 1.51 (0.01) | 0.89 (0.00) | 4913 (11.04) | 5068 (11.10) | 309.86 (3.17) | 0.01 (0.00) | 0.31 (0.04)
C_{1} | 2 | 71.76 (0.24) | 0.02 (0.00) | 2.01 (0.01) | 0.90 (0.00) | 0.01 (0.00) | 1.52 (0.01) | 0.88 (0.01) | -821.08 (11.10) | -743.76 (11.12) | 154.64 (2.22) | 0.01 (0.00) | 0.31 (0.04)
Σ_{NP} | 1 | 71.73 (0.18) | 0.02 (0.00) | 2.01 (0.01) | 0.90 (0.00) | 0.01 (0.00) | 1.51 (0.01) | 0.89 (0.00) | 5431 (11.22) | 5537 (11.11) | 212.64 (2.20) | 2.34 (0.01) | 4.55 (0.03)
Σ_{NP} | 2 | 72.13 (0.34) | 0.02 (0.00) | 2.01 (0.01) | 0.90 (0.00) | 0.01 (0.00) | 1.49 (0.01) | 0.89 (0.01) | -336 (10.44) | -273 (10.42) | 127.37 (1.72) | 2.37 (0.01) | 4.53 (0.03)
True | | 72.00 | 0.02 | 2.00 | 0.90 | 0.01 | 1.50 | 0.90 | | | | |
Discussion
In this paper, we studied the covariance model in functional mapping of photosynthetic rate as a reaction norm to irradiance and temperature as environmental signals. In the presence of interaction between the two signals simulated by nonseparable covariance structures, our analysis showed that Σ_{NP} is a more reliable estimator than Σ_{AR(1)}, particularly in QTL location estimation. The advantage of Σ_{NP} over Σ_{AR(1)} is greater when the variance of the reaction norm process and the number of signal levels increase.
This vector has no natural ordering like in longitudinal data. However, our simulation results still suggest that Σ _{ NP } can be directly applied to observations that have no variable ordering such as (23). The process by which Σ _{ NP } was obtained in Yap et al. [14] was based on non-mixture type of longitudinal covariance estimators. This process is flexible and can potentially accommodate other estimators that can handle unordered data or are invariant to variable permutations. See for example the sparse permutation invariant covariance estimator (SPICE) proposed by Rothman et al. [32].
In the presence of interactions, nonseparable covariances can possibly be used in place of Σ_{NP}, but they should closely reflect the structure of the data. Unfortunately, as with any parametric model, this is not often the case. In fact, it is not even known whether the data exhibit interactions or not. Before deciding on what model to use, one might utilize tests for separability [33, 34]. If separable models are appropriate, then there are many options. Otherwise, it is difficult to choose from a number of complex nonseparable covariances because there are as yet no general guidelines that can help one decide which model to use. The covariance C_{3} that was used in the simulations had an easy to interpret interaction parameter γ ∈ [0, 1]. However, despite an interaction "strength" of γ = 0.6, the separable model, Σ_{AR(1)}, estimated the data generated by C_{3} quite well. Thus, the added complexity of a nonseparable model over a separable one may not be worth the trade-off. Another option is to use separable approximations to nonseparable covariances [35]. The nonseparable covariances that we considered were assumed to be stationary and isotropic. These two assumptions may not always hold for real data. Although not specifically addressed here, using Σ_{NP} may work for data that do not satisfy these assumptions.
Finally, we only considered two environmental signals with interactions: irradiance and temperature. However, the reaction norm of photosynthetic rate is a very complex process because more environmental signals are at play than these two. Theoretically, the spatial domain of spatio-temporal nonseparable covariance models can be extended to more than one dimension, i.e., d > 1 in (10). For example, a two-dimensional spatial domain models an area on a flat surface while a three-dimensional domain models space. There are spatio-temporal models for these. However, this extension cannot be used to increase the number of signals in a reaction norm unless the signals have the same unit of measurement or one assumes separability or no interaction among the signals. For example, carbon dioxide concentration cannot be added as a signal, in addition to irradiance and temperature, when modeling photosynthetic rate as a reaction norm in the functional mapping setting because it does not have the same unit as irradiance or temperature. Thus, it is difficult to simulate data from more than two signals with interactions. However, Σ_{NP} can theoretically handle covariances associated with more than two signals that may involve interactions. The computer code for the model will be available from http://statgen.psu.edu.
Declarations
Acknowledgements
This work is partially supported by NSF grant IOS-0923975, the Changjiang Scholarship Award and "One-thousand Person Plan" Award at Beijing Forestry University.
Authors’ Affiliations
References
1. Via S, Gomulkiewicz R, de Jong G, Scheiner SM, et al: Adaptive phenotypic plasticity: Consensus and controversy. Trends in Ecology and Evolution. 1995, 10: 212-217. 10.1016/S0169-5347(00)89061-8.
2. Scheiner SM: Genetics and evolution of phenotypic plasticity. Annual Review of Ecology and Systematics. 1993, 24: 35-68. 10.1146/annurev.es.24.110193.000343.
3. Schlichting CD, Smith H: Phenotypic plasticity: Linking molecular mechanisms with evolutionary outcomes. Evolutionary Ecology. 2002, 16: 189-201. 10.1023/A:1019624425971.
4. West-Eberhard MJ: Developmental Plasticity and Evolution. Oxford University Press, New York; 2003.
5. Wu RL: The detection of plasticity genes in heterogeneous environments. Evolution. 1998, 52: 967-977. 10.2307/2411229.
6. Wu RL, Grissom JE, McKeand SE, O'Malley DM: Phenotypic plasticity of fine root growth increases plant productivity in pine seedlings. BMC Ecology. 2004, 4: 14. 10.1186/1472-6785-4-14.
7. de Jong G: Evolution of phenotypic plasticity: Patterns of plasticity and the emergence of ecotypes. New Phytologist. 2005, 166: 101-117. 10.1111/j.1469-8137.2005.01322.x.
8. Kingsolver JG, Izem R, Ragland GJ: Plasticity of size and growth in fluctuating thermal environments: comparing reaction norms and performance curves. Integrative and Comparative Biology. 2004, 44: 450-460. 10.1093/icb/44.6.450.
9. Angilletta MJ, Sears MW: Evolution of thermal reaction norms for growth rate and body size in ectotherms: an introduction to the symposium. Integrative and Comparative Biology. 2004, 44: 401-402. 10.1093/icb/44.6.401.
10. Yap JS, Wang CG, Wu RL: A simulation approach for functional mapping of quantitative trait loci that regulate thermal performance curves. PLoS ONE. 2007, 2 (6): e554. 10.1371/journal.pone.0000554.
11. Stratton D: Reaction norm functions and QTL-environment interactions for flowering time in Arabidopsis thaliana. Heredity. 1998, 81: 144-155. 10.1046/j.1365-2540.1998.00369.x.
12. Kirkpatrick M, Heckman N: A quantitative genetic model for growth, shape, reaction norms, and other infinite-dimensional characters. Journal of Mathematical Biology. 1989, 27: 429-450. 10.1007/BF00290638.
13. Wu J, Zeng Y, Huang J, Hou W, Zhu J, Wu RL: Functional mapping of reaction norms to multiple environmental signals. Genetical Research. 2007, 89: 27-38. 10.1017/S0016672307008622.
14. Yap JS, Fan J, Wu RL: Nonparametric covariance estimation in functional mapping of quantitative trait loci. Biometrics. 2009, 65: 1068-1077. 10.1111/j.1541-0420.2009.01222.x.
15. Pourahmadi M: Joint mean-covariance models with applications to longitudinal data: Unconstrained parameterisation. Biometrika. 1999, 86 (3): 677-690. 10.1093/biomet/86.3.677.
16. Huang J, Liu N, Pourahmadi M, Liu L: Covariance selection and estimation via penalised normal likelihood. Biometrika. 2006, 93: 85-98. 10.1093/biomet/93.1.85.
17. Meng X-L, Rubin D: Maximum likelihood estimation via the ECM algorithm: A general framework. Biometrika. 1993, 80: 267-278. 10.1093/biomet/80.2.267.
18. Cressie N, Huang H-C: Classes of nonseparable, spatio-temporal stationary covariance functions. Journal of the American Statistical Association. 1999, 94: 1330-1340. 10.2307/2669946.
19. Gneiting T, Genton M, Guttorp P: Geostatistical space-time models, stationarity, separability and full symmetry. Statistical Methods for Spatio-temporal Systems (Monographs on Statistics and Applied Probability). Edited by: Finkenstadt B, Held L, Isham V. Chapman & Hall/CRC; 2006.
20. Wolf JB: The geometry of phenotypic evolution in developmental hyperspace. Proceedings of the National Academy of Sciences of the USA. 2002, 99: 15849-15851. 10.1073/pnas.012686699.
21. Wu RL, Ma C-X, Casella G: Statistical Genetics of Quantitative Traits: Linkage, Maps, and QTL. Springer-Verlag, New York; 2007.
22. Thornley JHM, Johnson IR: Plant and Crop Modelling: A Mathematical Approach to Plant and Crop Physiology. Clarendon Press, Oxford; 1990.
23. Nelder J, Mead R: A simplex method for function minimization. Computer Journal. 1965, 7: 308-313.
24. Doerge RW, Churchill GA: Permutation tests for multiple loci affecting a quantitative character. Genetics. 1996, 142: 285-294.
25. Ma C, Casella G, Wu RL: Functional mapping of quantitative trait loci underlying the character process: A theoretical framework. Genetics. 2002, 161: 1751-1762.
26. Wu RL, Ma C-X, Lin M, Casella G: A general framework for analyzing the genetic architecture of developmental characteristics. Genetics. 2004, 166: 1541-1551. 10.1534/genetics.166.3.1541.
27. Gneiting T: Nonseparable, stationary covariance functions for space-time data. Journal of the American Statistical Association. 2002, 97: 590-600. 10.1198/016214502760047113.
28. Bochner S: Harmonic Analysis and the Theory of Probability. University of California Press, Berkeley and Los Angeles; 1955.
29. Wu WB, Pourahmadi M: Nonparametric estimation of large covariance matrices of longitudinal data. Biometrika. 2003, 90: 831-844. 10.1093/biomet/90.4.831.
30. Huang J, Liu L, Liu N: Estimation of large covariance matrices of longitudinal data with basis function approximations. Journal of Computational and Graphical Statistics. 2007, 16: 189-209. 10.1198/106186007X181452.
31. Levina E, Rothman A, Zhu J: Sparse estimation of large covariance matrices via a nested lasso penalty. Annals of Applied Statistics. 2008, 2: 245-263. 10.1214/07-AOAS139.
32. Rothman A, Bickel P, Levina E, Zhu J: Sparse permutation invariant covariance estimation. Electronic Journal of Statistics. 2008, 2: 494-515. 10.1214/08-EJS176.
33. Mitchell MW, Genton MG, Gumpertz ML: Testing for separability of space-time covariances. Environmetrics. 2005, 16: 819-831. 10.1002/env.737.
34. Fuentes M: Testing separability of spatial-temporal covariance functions. Journal of Statistical Planning and Inference. 2005, 136: 447-466. 10.1016/j.jspi.2004.07.004.
35. Genton M: Separable approximations of space-time covariance matrices. Environmetrics. 2007, 18: 681-695. 10.1002/env.854.
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.