Multiallelic epistatic model for an out-bred cross and mapping algorithm of interactive quantitative trait loci

Tong, Chunfa; Zhang, Bo; Wang, Zhong; Xu, Meng; Pang, Xiaoming; Si, Jingna; Huang, Minren; Wu, Rongling

doi:10.1186/1471-2229-11-148

Methodology article
Open access
Published: 31 October 2011

BMC Plant Biology volume 11, Article number: 148 (2011)


Abstract

Background

Genetic mapping has proven to be powerful for studying the genetic architecture of complex traits by characterizing a network of the underlying interacting quantitative trait loci (QTLs). Current statistical models for genetic mapping are mostly founded on biallelic epistasis between QTLs and therefore cannot analyze multiallelic QTLs and their interactions, which are widespread in outcrossing populations.

Results

Here we have formulated a general framework to model and define the epistasis between multiallelic QTLs. Based on this framework, we have derived a statistical algorithm for the estimation and test of multiallelic epistasis between different QTLs in a full-sib family of outcrossing species. We used this algorithm to scan the genome for multiallelic epistasis affecting a rooting ability trait in an outbred cross derived from two heterozygous poplar trees. The results from simulation studies indicate that the positions and effects of multiallelic QTLs can be estimated well with a modest sample size and heritability.

Conclusions

The model and algorithm developed provide a useful tool for better characterizing the genetic control of complex traits in a heterozygous family derived from outcrossing species, such as forest trees, and thus fill a gap that occurs in genetic mapping of this group of important but underrepresented species.

Background

Approaches for quantitative trait locus (QTL) mapping were developed originally for experimental crosses, such as the backcross, doubled haploid, RILs or F₂, derived from inbred lines [1–3]. Because of the homozygosity of inbred lines, the Mendelian (co)segregation of all markers, each with two alternative alleles, can be observed directly in such crosses. In practice, there is also a group of species of great economic and environmental importance, the outcrossing species such as forest trees, for which traditional QTL mapping approaches cannot be appropriately used. For these species, it is difficult or impossible to generate inbred lines due to long generation intervals and high heterozygosity [4], although experimental hybrids have been commercially used in practical breeding programs.

For a given outbred line, some markers may be heterozygous, whereas others may be homozygous over the genome. All markers may, or may not, have the same allele system between any two outbred lines used for a cross. Also, for a pair of heterozygous loci, their allelic configuration along two homologous chromosomes (i.e., linkage phase) cannot be observed from the segregation pattern of genotypes in the cross [5, 6]. Unfortunately, a consistent number of alleles across different markers and known linkage phases are prerequisites for the statistical mapping approaches described for the backcross or F₂. Grattapaglia and Sederoff [7] proposed a so-called pseudo-testcross strategy for linkage mapping in a controlled cross between two outbred parents. This strategy is powerful for the linkage analysis of those testcross markers that are heterozygous in one parent and null in the other, although it fails to consider many other marker cross types, such as intercross markers and dominant markers, that occur in an outbred cross. Maliepaard et al. [8] derived numerous formulas for estimating the linkage between different types of markers by correctly determining the linkage phase of markers. A general model has been developed for simultaneous estimation of the linkage and linkage phase for any marker cross type in outcrossing populations [9, 10]. Stam [11] wrote powerful software for integrating genetic linkage maps using different types of markers.

Statistical methods for QTL mapping in a full-sib family of outcrossing species have not received adequate attention. Lin et al. [12] developed a model that takes into account uncertainties about the number of alleles across the genome. Wu et al. [13] used this model to reanalyze full-sib family data for poplar trees [14], leading to the detection of new QTLs for biomass traits that were not discovered by traditional approaches. With increasing recognition of the role of epistasis in controlling and maintaining quantitative variation [15], it is crucial to extend Lin et al.'s model to map the epistasis of QTLs and thereby obtain a detailed and comprehensive perspective on the genetic architecture of a quantitative trait. However, the well-established theory and models for epistasis are mostly based on biallelic genes [16], and their estimation and tests are made for pedigrees derived from inbred lines [17]. Until now, no models and algorithms have been available for characterizing the epistasis of multiallelic QTLs in an outcrossing population.

In this article, we extend the theory for biallelic epistasis to model the epistasis between different QTLs, each with multiple alleles. The multiallelic epistatic theory is then implemented in a statistical model for QTL mapping based on a mixture model. We have derived a closed form for the estimation of the main and interactive effects of multiallelic QTLs within the EM framework. Our model allows geneticists to test the effects of individual genetic components on trait variation. The estimating model has been investigated through simulation studies and validated by an example of QTL mapping for poplar trees [18]. The algorithm has been incorporated into a newly developed software package, 3FunMap, designed to map QTLs in a full-sib family [19].

Quantitative Genetic Model

Additive-dominance Model

Randomly select two heterozygous lines as parents P₁ and P₂ to produce a full-sib family, in which a QTL will form four genotypes if the two lines have completely different allele systems. Let μ_uv be the value of a QTL genotype inheriting allele u (u = 1, 2) from parent P₁ and allele v (v = 3, 4) from parent P₂. Based on quantitative genetic theory, this genotypic value can be partitioned into the additive and dominant effects as follows:

μ_{u v} = μ + α_{u} + β_{v} + γ_{u v},

(1)

where μ is the overall mean, α_u and β_v are the allelic (additive) effects of alleles u and v, respectively, and γ_uv is the interaction (dominant) effect at the QTL. Considering all possible alleles and allele combinations between the two parents, there are a total of four additive effects (α₁ and α₂ from parent P₁ and β₃ and β₄ from parent P₂) and four dominant effects (γ₁₃, γ₁₄, γ₂₃ and γ₂₄). But these additive and dominant effects are not independent and, therefore, are not all estimable. After reparameterization, there are two independent additive effects, α = α₁ = -α₂ and β = β₃ = -β₄, and one dominant effect, γ = γ₁₃ = -γ₁₄ = -γ₂₃ = γ₂₄, to be estimated.

Let u = (μ_uv)_{4 × 1} and a = (μ, α, β, γ)^T, which can be connected by a design matrix D. We have

u = D a,

where

D = [\begin{matrix} 1 & 1 & 1 & 1 \\ 1 & 1 & - 1 & - 1 \\ 1 & - 1 & 1 & - 1 \\ 1 & - 1 & - 1 & 1 \end{matrix}] .

The expression of a can be obtained from the expression of u by

a = D^{- 1} u .

(2)
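As a small numerical illustration of equation (2), the sketch below builds the single-QTL design matrix with NumPy and recovers the effect vector from a set of genotypic values. The genotypic values in `u` are hypothetical numbers chosen only for illustration; they do not come from the poplar data.

```python
import numpy as np

# Rows: QTL genotypes 13, 14, 23, 24; columns: (mu, alpha, beta, gamma).
D = np.array([
    [1,  1,  1,  1],
    [1,  1, -1, -1],
    [1, -1,  1, -1],
    [1, -1, -1,  1],
], dtype=float)

# The columns of D are mutually orthogonal, so D^T D = 4 I and D^{-1} = D^T / 4.
D_inv = np.linalg.inv(D)

# Hypothetical genotypic values (mu_13, mu_14, mu_23, mu_24), for illustration only.
u = np.array([12.0, 9.0, 8.0, 11.0])

# Equation (2): a = D^{-1} u gives (mu, alpha, beta, gamma).
a = D_inv @ u
print(dict(zip(["mu", "alpha", "beta", "gamma"], a)))
```

Because D is orthogonal up to scaling, each effect is simply a signed average of the four genotypic values, which is why the effects can be read off as contrasts.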

Additive-dominance-epistatic Model

If there are two segregating QTL in the full-sib family, the epistatic effects due to their nonallelic interactions should be considered. The theory for epistasis in an inbred family [16] can be readily extended to specify different epistatic components for outbred crosses. Consider two epistatic multiallelic QTL, each of which has four different genotypes, 13, 14, 23, and 24, in the outbred progeny. Let $μ_{u_{1} v_{1} ∕ u_{2} v_{2}}$ be the genotypic value for QTL genotype u₁v₁/u₂v₂ for u₁,u₂ = 1,2 and v₁,v₂ = 3,4 and $u = (μ_{u_{1} v_{1} ∕ u_{2} v_{2}})$ be the corresponding mean vector. The two-QTL genotypic value is partitioned into different components as follows:

\begin{align} μ_{u_{1} v_{1} ∕ u_{2} v_{2}} = {} & μ + α_{1} + β_{1} + γ_{1} + α_{2} + β_{2} + γ_{2} \\ & + I_{α α} + I_{α β} + I_{β α} + I_{β β} + J_{α γ} + J_{β γ} + K_{γ α} + K_{γ β} + L_{γ γ} \end{align}

(3)

where

(1) μ is the overall mean;
(2) α₁ is the additive effect due to the substitution from allele 1 to 2 at the first QTL;
(3) β₁ is the additive effect due to the substitution from allele 3 to 4 at the first QTL;
(4) γ₁ is the dominant effect due to the interaction between alleles from different parents at the first QTL;
(5) α₂ is the additive effect due to the substitution from allele 1 to 2 at the second QTL;
(6) β₂ is the additive effect due to the substitution from allele 3 to 4 at the second QTL;
(7) γ₂ is the dominant effect due to the interaction between alleles from different parents at the second QTL;
(8) I_αα is the additive × additive epistatic effect due to the interaction between the substitutions from allele 1 to 2 at the first and second QTLs;
(9) I_αβ is the additive × additive epistatic effect due to the interaction between the substitutions from allele 1 to 2 at the first QTL and from allele 3 to 4 at the second QTL;
(10) I_βα is the additive × additive epistatic effect due to the interaction between the substitutions from allele 3 to 4 at the first QTL and from allele 1 to 2 at the second QTL;
(11) I_ββ is the additive × additive epistatic effect due to the interaction between the substitutions from allele 3 to 4 at the first and second QTLs;
(12) J_αγ is the additive × dominant epistatic effect due to the interaction between the substitution from allele 1 to 2 at the first QTL and the dominant effect at the second QTL;
(13) J_βγ is the additive × dominant epistatic effect due to the interaction between the substitution from allele 3 to 4 at the first QTL and the dominant effect at the second QTL;
(14) K_γα is the dominant × additive epistatic effect due to the interaction between the dominant effect at the first QTL and the substitution from allele 1 to 2 at the second QTL;
(15) K_γβ is the dominant × additive epistatic effect due to the interaction between the dominant effect at the first QTL and the substitution from allele 3 to 4 at the second QTL;
(16) L_γγ is the dominant × dominant epistatic effect due to the interaction between the dominant effects at the first and second QTLs.

Genetic effect parameters for two interacting QTL are arrayed in a = (μ, α₁, β₁, γ₁, α₂, β₂, γ₂, I_αα, I_αβ, I_βα, I_ββ, J_αγ, J_βγ, K_γα, K_γβ, L_γγ)^T. We relate the genotypic value vector and genetic effect vector by

u = D a,

where design matrix

D = [\begin{matrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 & - 1 & - 1 & 1 & - 1 & 1 & - 1 & - 1 & - 1 & 1 & - 1 & - 1 \\ 1 & 1 & 1 & 1 & - 1 & 1 & - 1 & - 1 & 1 & - 1 & 1 & - 1 & - 1 & - 1 & 1 & - 1 \\ 1 & 1 & 1 & 1 & - 1 & - 1 & 1 & - 1 & - 1 & - 1 & - 1 & 1 & 1 & - 1 & - 1 & 1 \\ 1 & 1 & - 1 & - 1 & 1 & 1 & 1 & 1 & 1 & - 1 & - 1 & 1 & - 1 & - 1 & - 1 & - 1 \\ 1 & 1 & - 1 & - 1 & 1 & - 1 & - 1 & 1 & - 1 & - 1 & 1 & - 1 & 1 & - 1 & 1 & 1 \\ 1 & 1 & - 1 & - 1 & - 1 & 1 & - 1 & - 1 & 1 & 1 & - 1 & - 1 & 1 & 1 & - 1 & 1 \\ 1 & 1 & - 1 & - 1 & - 1 & - 1 & 1 & - 1 & - 1 & 1 & 1 & 1 & - 1 & 1 & 1 & - 1 \\ 1 & - 1 & 1 & - 1 & 1 & 1 & 1 & - 1 & - 1 & 1 & 1 & - 1 & 1 & - 1 & - 1 & - 1 \\ 1 & - 1 & 1 & - 1 & 1 & - 1 & - 1 & - 1 & 1 & 1 & - 1 & 1 & - 1 & - 1 & 1 & 1 \\ 1 & - 1 & 1 & - 1 & - 1 & 1 & - 1 & 1 & - 1 & - 1 & 1 & 1 & - 1 & 1 & - 1 & 1 \\ 1 & - 1 & 1 & - 1 & - 1 & - 1 & 1 & 1 & 1 & - 1 & - 1 & - 1 & 1 & 1 & 1 & - 1 \\ 1 & - 1 & - 1 & 1 & 1 & 1 & 1 & - 1 & - 1 & - 1 & - 1 & - 1 & - 1 & 1 & 1 & 1 \\ 1 & - 1 & - 1 & 1 & 1 & - 1 & - 1 & - 1 & 1 & - 1 & 1 & 1 & 1 & 1 & - 1 & - 1 \\ 1 & - 1 & - 1 & 1 & - 1 & 1 & - 1 & 1 & - 1 & 1 & - 1 & 1 & 1 & - 1 & 1 & - 1 \\ 1 & - 1 & - 1 & 1 & - 1 & - 1 & 1 & 1 & 1 & 1 & 1 & - 1 & - 1 & - 1 & - 1 & 1 \end{matrix}] .

Thus, the genetic effect vector can be expressed, in terms of the genotypic value vector, as

a = D^{- 1} u .

(4)
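The 16 × 16 design matrix need not be typed by hand. The sketch below rebuilds it from the single-locus codes, assuming (as the printed matrix suggests) that rows are ordered 13/13, 13/14, 13/23, 13/24, 14/13, ..., 24/24 and that each epistatic column is the product of the corresponding single-locus columns. The names `CODES` and `two_qtl_design` are illustrative and are not part of 3FunMap.

```python
import numpy as np
from itertools import product

# Single-locus codes (alpha, beta, gamma) for genotypes 13, 14, 23, 24.
CODES = {"13": (1, 1, 1), "14": (1, -1, -1), "23": (-1, 1, -1), "24": (-1, -1, 1)}
GENOTYPES = ["13", "14", "23", "24"]

def two_qtl_design():
    """Return the 16 x 16 matrix D with columns ordered as in the vector a."""
    rows = []
    for g1, g2 in product(GENOTYPES, GENOTYPES):
        a1, b1, c1 = CODES[g1]   # first QTL: alpha1, beta1, gamma1 codes
        a2, b2, c2 = CODES[g2]   # second QTL: alpha2, beta2, gamma2 codes
        rows.append([
            1, a1, b1, c1, a2, b2, c2,       # mean and main effects
            a1 * a2, a1 * b2, b1 * a2, b1 * b2,  # I_aa, I_ab, I_ba, I_bb
            a1 * c2, b1 * c2,                # J_ag, J_bg
            c1 * a2, c1 * b2,                # K_ga, K_gb
            c1 * c2,                         # L_gg
        ])
    return np.array(rows, dtype=float)

D2 = two_qtl_design()
# Columns are mutually orthogonal, so the inverse in equation (4) is D^T / 16.
D2_inv = D2.T / 16.0
```

Since every column is a product of ±1 codes, D^T D = 16 I, and each genetic effect is again a signed average of the 16 genotypic values.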

If we have alleles 1 = 3 and 2 = 4 for an outbred family, Equations 1 and 3 will be reduced to traditional biallelic additive-dominant and biallelic additive-dominant-epistatic genetic models, respectively [20].

Statistical Model

Likelihood

Suppose there is a full-sib family of size n derived from two outbred lines. Consider two interacting QTLs for a quantitative trait. Let u₁v₁ and u₂v₂ denote a general genotype at QTL 1 and 2, respectively, where u₁ and u₂ (u₁,u₂ = 1,2) are the alleles inherited from parent P₁ and v₁ and v₂ (v₁,v₂ = 3,4) are the alleles inherited from parent P₂. The linear model of the trait value (y_i) for individual i affected by the two QTLs is written as

y_{i} = \sum_{u_{1} = 1}^{2} \sum_{v_{1} = 3}^{4} \sum_{u_{2} = 1}^{2} \sum_{v_{2} = 3}^{4} ξ_{i u_{1} v_{1} ∕ u_{2} v_{2}} μ_{u_{1} v_{1} ∕ u_{2} v_{2}} + e_{i},

(5)

where $ξ_{i u_{1} v_{1} ∕ u_{2} v_{2}}$ is the indicator variable for QTL genotypes, defined as 1 if individual i carries genotype u₁v₁/u₂v₂ and 0 otherwise, and e_i is the residual error, normally distributed with mean 0 and variance σ². The probability with which individual i carries QTL genotype u₁v₁/u₂v₂ can be inferred from its marker genotype, with this conditional probability expressed as $ω_{u_{1} v_{1} ∕ u_{2} v_{2} | i}$ [20].

The likelihood of the putative QTLs given the trait value (y) and marker information (M) is given by

L (Θ | y, M) = \prod_{i = 1}^{n} \sum_{u_{1} = 1}^{2} \sum_{v_{1} = 3}^{4} \sum_{u_{2} = 1}^{2} \sum_{v_{2} = 3}^{4} ω_{u_{1} v_{1} ∕ u_{2} v_{2} | i} f_{u_{1} v_{1} ∕ u_{2} v_{2}} (y_{i}),

(6)

where Θ is the vector of unknown parameters, which includes the QTL position expressed through the conditional probabilities $(ω_{u_{1} v_{1} ∕ u_{2} v_{2} | i})$, the QTL genotypic values $(μ_{u_{1} v_{1} ∕ u_{2} v_{2}})$ and the residual variance (σ²). The first group of parameters, denoted by Θ_p, is contained in the mixture proportions of the above model, whereas the latter two, denoted by Θ_q, are quantitative genetic parameters. The normal density $f_{u_{1} v_{1} ∕ u_{2} v_{2}} (y_{i})$ has mean $μ_{u_{1} v_{1} ∕ u_{2} v_{2}}$ and variance σ².
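The mixture likelihood of equation (6) is usually evaluated on the log scale. The following is a minimal sketch, assuming the conditional probabilities are stored in an n × 16 array `omega` (one column per two-QTL genotype) and the genotypic values in a length-16 vector `mu`; these array layouts and function names are assumptions made for illustration.

```python
import numpy as np

def normal_pdf(y, mean, var):
    """Density of N(mean, var) evaluated at y (broadcasts over arrays)."""
    return np.exp(-(y - mean) ** 2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)

def log_likelihood(y, omega, mu, sigma2):
    """Log of equation (6).

    y      : (n,) trait values
    omega  : (n, 16) conditional QTL-genotype probabilities given markers
    mu     : (16,) genotypic values
    sigma2 : residual variance
    """
    dens = normal_pdf(y[:, None], mu[None, :], sigma2)   # (n, 16) densities
    mixture = np.sum(omega * dens, axis=1)               # inner sum of (6)
    return float(np.sum(np.log(mixture)))                # log of the product over i
```

For very small densities a log-sum-exp formulation would be numerically safer; the direct form is kept here because it mirrors equation (6) term by term.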

EM Algorithm

A standard EM algorithm is used to obtain the estimates of the unknown parameter vector. Differentiating the logarithm of the likelihood in equation (6) with respect to the two groups of unknown parameters (Θ_p, Θ_q), we have

\begin{gathered} \frac{\partial}{\partial Θ} \log L (Θ | y, M) \\ = \sum_{i = 1}^{n} \sum_{u_{1} = 1}^{2} \sum_{v_{1} = 3}^{4} \sum_{u_{2} = 1}^{2} \sum_{v_{2} = 3}^{4} \frac{f_{u_{1} v_{1} ∕ u_{2} v_{2}} (y_{i}) \frac{\partial}{\partial Θ_{p}} ω_{u_{1} v_{1} ∕ u_{2} v_{2} | i} + ω_{u_{1} v_{1} ∕ u_{2} v_{2} | i} \frac{\partial}{\partial Θ_{q}} f_{u_{1} v_{1} ∕ u_{2} v_{2}} (y_{i})}{\sum_{u'_{1} = 1}^{2} \sum_{v'_{1} = 3}^{4} \sum_{u'_{2} = 1}^{2} \sum_{v'_{2} = 3}^{4} ω_{u'_{1} v'_{1} ∕ u'_{2} v'_{2} | i} f_{u'_{1} v'_{1} ∕ u'_{2} v'_{2}} (y_{i})} \\ = \sum_{i = 1}^{n} \sum_{u_{1} = 1}^{2} \sum_{v_{1} = 3}^{4} \sum_{u_{2} = 1}^{2} \sum_{v_{2} = 3}^{4} [\frac{ω_{u_{1} v_{1} ∕ u_{2} v_{2} | i} f_{u_{1} v_{1} ∕ u_{2} v_{2}} (y_{i}) \frac{1}{ω_{u_{1} v_{1} ∕ u_{2} v_{2} | i}} \frac{\partial}{\partial Θ_{p}} ω_{u_{1} v_{1} ∕ u_{2} v_{2} | i}}{\sum_{u'_{1} = 1}^{2} \sum_{v'_{1} = 3}^{4} \sum_{u'_{2} = 1}^{2} \sum_{v'_{2} = 3}^{4} ω_{u'_{1} v'_{1} ∕ u'_{2} v'_{2} | i} f_{u'_{1} v'_{1} ∕ u'_{2} v'_{2}} (y_{i})} \\ + \frac{ω_{u_{1} v_{1} ∕ u_{2} v_{2} | i} f_{u_{1} v_{1} ∕ u_{2} v_{2}} (y_{i}) \frac{\partial}{\partial Θ_{q}} \log f_{u_{1} v_{1} ∕ u_{2} v_{2}} (y_{i})}{\sum_{u'_{1} = 1}^{2} \sum_{v'_{1} = 3}^{4} \sum_{u'_{2} = 1}^{2} \sum_{v'_{2} = 3}^{4} ω_{u'_{1} v'_{1} ∕ u'_{2} v'_{2} | i} f_{u'_{1} v'_{1} ∕ u'_{2} v'_{2}} (y_{i})}] \\ = \sum_{i = 1}^{n} \sum_{u_{1} = 1}^{2} \sum_{v_{1} = 3}^{4} \sum_{u_{2} = 1}^{2} \sum_{v_{2} = 3}^{4} \prod_{u_{1} v_{1} ∕ u_{2} v_{2} | i} [\frac{1}{ω_{u_{1} v_{1} ∕ u_{2} v_{2} | i}} \frac{\partial}{\partial Θ_{p}} ω_{u_{1} v_{1} ∕ u_{2} v_{2} | i} + \frac{\partial}{\partial Θ_{q}} \log f_{u_{1} v_{1} ∕ u_{2} v_{2}} (y_{i})], \end{gathered}

where we define

\prod_{u_{1} v_{1} ∕ u_{2} v_{2} | i} = \frac{ω_{u_{1} v_{1} ∕ u_{2} v_{2} | i} f_{u_{1} v_{1} ∕ u_{2} v_{2}} (y_{i})}{\sum_{u'_{1} = 1}^{2} \sum_{v'_{1} = 3}^{4} \sum_{u'_{2} = 1}^{2} \sum_{v'_{2} = 3}^{4} ω_{u'_{1} v'_{1} ∕ u'_{2} v'_{2} | i} f_{u'_{1} v'_{1} ∕ u'_{2} v'_{2}} (y_{i})}

(7)

which could be thought of as a posterior probability that individual i has a QTL genotype u₁v₁/u₂v₂.

In the E step, calculate the posterior probabilities of QTL genotypes given the marker genotype and trait value of individual i by equation (7). In the M step, obtain the maximum likelihood estimates (MLEs) of the unknown parameters by solving $\frac{\partial}{\partial Θ} \log L (Θ | y, M) = 0$. The closed forms for estimating the genotypic values and residual variance are derived as

\begin{align} {\hat{μ}}_{u_{1} v_{1} ∕ u_{2} v_{2}} & = \frac{\sum_{i = 1}^{n} \prod_{u_{1} v_{1} ∕ u_{2} v_{2} | i} y_{i}}{\sum_{i = 1}^{n} \prod_{u_{1} v_{1} ∕ u_{2} v_{2} | i}} \\ {\hat{σ}}^{2} & = \frac{1}{n} \sum_{i = 1}^{n} \sum_{u_{1} = 1}^{2} \sum_{v_{1} = 3}^{4} \sum_{u_{2} = 1}^{2} \sum_{v_{2} = 3}^{4} {(y_{i} - {\hat{μ}}_{u_{1} v_{1} ∕ u_{2} v_{2}})}^{2} \prod_{u_{1} v_{1} ∕ u_{2} v_{2} | i} . \end{align}

(8)

Given initial values for the parameters, the E and M steps are iterated until the estimates are stable. The stable values are the MLEs of the unknown parameters. Note that the QTL position within a marker interval can be estimated by treating the position as fixed at each point of a grid. Using such a grid search, we obtain the MLE of the QTL position from the peak of the profile of the log-likelihood ratio test statistic across a chromosome.
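The E and M steps at a fixed pair of QTL positions can be sketched as follows. This is a minimal illustration under stated assumptions, not the 3FunMap implementation: `omega` is taken to be an n × 16 array of conditional genotype probabilities already computed from the markers for the positions under test, the posterior probabilities are initialized to `omega`, and convergence is judged by the change in log-likelihood.

```python
import numpy as np

def em_two_qtl(y, omega, max_iter=500, tol=1e-8):
    """EM estimates of the 16 genotypic values and the residual variance.

    y     : (n,) trait values
    omega : (n, 16) conditional QTL-genotype probabilities (fixed positions)
    Returns (mu, sigma2, loglik).
    """
    y = np.asarray(y, dtype=float)
    omega = np.asarray(omega, dtype=float)
    n = y.size
    post = omega.copy()          # initial posterior probabilities (assumption)
    loglik_old = -np.inf
    for _ in range(max_iter):
        # M step, equation (8): posterior-weighted means and pooled variance.
        weight = post.sum(axis=0)
        mu = (post * y[:, None]).sum(axis=0) / np.maximum(weight, 1e-300)
        resid2 = (y[:, None] - mu[None, :]) ** 2
        sigma2 = float((post * resid2).sum() / n)
        # E step, equation (7): posterior probability of each genotype.
        dens = np.exp(-resid2 / (2.0 * sigma2)) / np.sqrt(2.0 * np.pi * sigma2)
        joint = omega * dens
        mixture = joint.sum(axis=1)
        post = joint / mixture[:, None]
        # Log-likelihood at the current (mu, sigma2), used as stopping rule.
        loglik = float(np.log(mixture).sum())
        if abs(loglik - loglik_old) < tol:
            break
        loglik_old = loglik
    return mu, sigma2, loglik
```

In a genome scan, this function would be called once per grid point (pair of positions), with `omega` recomputed for each point, and the position estimate taken where the resulting likelihood ratio peaks.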

Hypothesis Tests

After the parameters are estimated, a number of hypothesis tests can be made. The existence of a QTL can be tested by formulating the null hypothesis expressed as

\begin{gathered} H_{0} : μ_{u_{1} v_{1} ∕ u_{2} v_{2}} \equiv μ, \text{ for } u_{1}, u_{2} = 1, 2 \text{ and } v_{1}, v_{2} = 3, 4 \\ H_{1} : \text{at least one of the equalities above does not hold} . \end{gathered}

(9)

The likelihoods under the reduced (H₀) and full (H₁) models are calculated, and their log-likelihood ratio (LR) is then computed as

LR = - 2 \ln [\frac{L_{0} ({\tilde{Θ}}_{p}, {\tilde{Θ}}_{q} | y)}{L_{1} ({\hat{Θ}}_{p}, {\hat{Θ}}_{q} | y, M)}],

(10)

where the tildes and hats denote the MLEs under H₀ and H₁, respectively. The critical threshold for declaring the existence of a QTL can be empirically determined from permutation tests [21].
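The LR statistic of equation (10) and a permutation threshold can be sketched as below. Under H₀ the mixture collapses to a single normal distribution, so its maximized log-likelihood has a closed form. The callable `scan_max_lr` is hypothetical: it stands for whatever routine reruns the full scan on a permuted trait vector and returns the maximum LR.

```python
import numpy as np

def null_loglik(y):
    """Maximized log-likelihood under H0 (one normal; MLE variance uses divisor n)."""
    y = np.asarray(y, dtype=float)
    n = y.size
    s2 = np.mean((y - y.mean()) ** 2)
    return float(-0.5 * n * (np.log(2.0 * np.pi * s2) + 1.0))

def lr_statistic(loglik_h0, loglik_h1):
    """Equation (10) written on the log scale: LR = -2 (log L0 - log L1)."""
    return -2.0 * (loglik_h0 - loglik_h1)

def permutation_threshold(y, scan_max_lr, n_perm=1000, level=0.05, seed=0):
    """Empirical (1 - level) quantile of the maximum LR over permuted traits.

    scan_max_lr : hypothetical callable mapping a trait vector to the maximum
                  LR found by the scan (markers stay fixed, traits are shuffled).
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    stats = [scan_max_lr(rng.permutation(y)) for _ in range(n_perm)]
    return float(np.quantile(stats, 1.0 - level))
```

Shuffling the trait values breaks any association with the markers while preserving the marker structure, which is why the resulting quantile serves as an empirical threshold.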

Hypothesis tests for the different genetic effects, including the additive (α₁, β₁, α₂, β₂), dominant (γ₁, γ₂), additive × additive (I_αα, I_αβ, I_βα, I_ββ), additive × dominant (J_αγ, J_βγ), dominant × additive (K_γα, K_γβ) and dominant × dominant (L_γγ) epistatic effects, can be formulated, each with the null hypothesis that the corresponding effect equals zero.

Under each null hypothesis, the genotypic values should be constrained by

\begin{gathered} μ_{13 ∕ 13} + μ_{13 ∕ 14} + μ_{13 ∕ 23} + μ_{13 ∕ 24} + μ_{14 ∕ 13} + μ_{14 ∕ 14} + μ_{14 ∕ 23} + μ_{14 ∕ 24} \\ = μ_{23 ∕ 13} + μ_{23 ∕ 14} + μ_{23 ∕ 23} + μ_{23 ∕ 24} + μ_{24 ∕ 13} + μ_{24 ∕ 14} + μ_{24 ∕ 23} + μ_{24 ∕ 24} \end{gathered}

(11)

for H₀ : α₁ = 0,

\begin{gathered} μ_{13 ∕ 13} + μ_{13 ∕ 14} + μ_{13 ∕ 23} + μ_{13 ∕ 24} + μ_{23 ∕ 13} + μ_{23 ∕ 14} + μ_{23 ∕ 23} + μ_{23 ∕ 24} \\ = μ_{14 ∕ 13} + μ_{14 ∕ 14} + μ_{14 ∕ 23} + μ_{14 ∕ 24} + μ_{24 ∕ 13} + μ_{24 ∕ 14} + μ_{24 ∕ 23} + μ_{24 ∕ 24} \end{gathered}

(12)

for H₀ : β₁ = 0,

\begin{gathered} μ_{13 ∕ 13} + μ_{13 ∕ 14} + μ_{14 ∕ 13} + μ_{14 ∕ 14} + μ_{23 ∕ 13} + μ_{23 ∕ 14} + μ_{24 ∕ 13} + μ_{24 ∕ 14} \\ = μ_{13 ∕ 23} + μ_{13 ∕ 24} + μ_{14 ∕ 23} + μ_{14 ∕ 24} + μ_{23 ∕ 23} + μ_{23 ∕ 24} + μ_{24 ∕ 23} + μ_{24 ∕ 24} \end{gathered}

(13)

for H₀ : α₂ = 0,

\begin{gathered} μ_{13 ∕ 13} + μ_{13 ∕ 23} + μ_{14 ∕ 13} + μ_{14 ∕ 23} + μ_{23 ∕ 13} + μ_{23 ∕ 23} + μ_{24 ∕ 13} + μ_{24 ∕ 23} \\ = μ_{13 ∕ 14} + μ_{13 ∕ 24} + μ_{14 ∕ 14} + μ_{14 ∕ 24} + μ_{23 ∕ 14} + μ_{23 ∕ 24} + μ_{24 ∕ 14} + μ_{24 ∕ 24}, \end{gathered}

(14)

for H₀ : β₂ = 0,

\begin{gathered} μ_{13 ∕ 13} + μ_{13 ∕ 14} + μ_{13 ∕ 23} + μ_{13 ∕ 24} + μ_{24 ∕ 13} + μ_{24 ∕ 14} + μ_{24 ∕ 23} + μ_{24 ∕ 24} \\ = μ_{14 ∕ 13} + μ_{14 ∕ 14} + μ_{14 ∕ 23} + μ_{14 ∕ 24} + μ_{23 ∕ 13} + μ_{23 ∕ 14} + μ_{23 ∕ 23} + μ_{23 ∕ 24}, \end{gathered}

(15)

for H₀ : γ₁ = 0,

\begin{gathered} μ_{13 ∕ 13} + μ_{13 ∕ 24} + μ_{14 ∕ 13} + μ_{14 ∕ 24} + μ_{23 ∕ 13} + μ_{23 ∕ 24} + μ_{24 ∕ 13} + μ_{24 ∕ 24} \\ = μ_{13 ∕ 14} + μ_{13 ∕ 23} + μ_{14 ∕ 14} + μ_{14 ∕ 23} + μ_{23 ∕ 14} + μ_{23 ∕ 23} + μ_{24 ∕ 14} + μ_{24 ∕ 23}, \end{gathered}

(16)

for H₀ : γ₂ = 0,

\begin{gathered} μ_{13 ∕ 13} + μ_{13 ∕ 14} + μ_{14 ∕ 13} + μ_{14 ∕ 14} + μ_{23 ∕ 23} + μ_{23 ∕ 24} + μ_{24 ∕ 23} + μ_{24 ∕ 24} \\ = μ_{13 ∕ 23} + μ_{13 ∕ 24} + μ_{14 ∕ 23} + μ_{14 ∕ 24} + μ_{23 ∕ 13} + μ_{23 ∕ 14} + μ_{24 ∕ 13} + μ_{24 ∕ 14}, \end{gathered}

(17)

for H₀ : I_αα= 0,

\begin{gathered} μ_{13 ∕ 13} + μ_{13 ∕ 23} + μ_{14 ∕ 13} + μ_{14 ∕ 23} + μ_{23 ∕ 14} + μ_{23 ∕ 24} + μ_{24 ∕ 14} + μ_{24 ∕ 24} \\ = μ_{13 ∕ 14} + μ_{13 ∕ 24} + μ_{14 ∕ 14} + μ_{14 ∕ 24} + μ_{23 ∕ 13} + μ_{23 ∕ 23} + μ_{24 ∕ 13} + μ_{24 ∕ 23}, \end{gathered}

(18)

for H₀ : I_αβ= 0,

\begin{gathered} μ_{13 ∕ 13} + μ_{13 ∕ 14} + μ_{14 ∕ 23} + μ_{14 ∕ 24} + μ_{23 ∕ 13} + μ_{23 ∕ 14} + μ_{24 ∕ 23} + μ_{24 ∕ 24} \\ = μ_{13 ∕ 23} + μ_{13 ∕ 24} + μ_{14 ∕ 13} + μ_{14 ∕ 14} + μ_{23 ∕ 23} + μ_{23 ∕ 24} + μ_{24 ∕ 13} + μ_{24 ∕ 14}, \end{gathered}

(19)

for H₀ : I_βα= 0,

\begin{gathered} μ_{13 ∕ 13} + μ_{13 ∕ 23} + μ_{14 ∕ 14} + μ_{14 ∕ 24} + μ_{23 ∕ 13} + μ_{23 ∕ 23} + μ_{24 ∕ 14} + μ_{24 ∕ 24} \\ = μ_{13 ∕ 14} + μ_{13 ∕ 24} + μ_{14 ∕ 13} + μ_{14 ∕ 23} + μ_{23 ∕ 14} + μ_{23 ∕ 24} + μ_{24 ∕ 13} + μ_{24 ∕ 23}, \end{gathered}

(20)

for H₀ : I_ββ= 0,

\begin{gathered} μ_{13 ∕ 13} + μ_{13 ∕ 24} + μ_{14 ∕ 13} + μ_{14 ∕ 24} + μ_{23 ∕ 14} + μ_{23 ∕ 23} + μ_{24 ∕ 14} + μ_{24 ∕ 23} \\ = μ_{13 ∕ 14} + μ_{13 ∕ 23} + μ_{14 ∕ 14} + μ_{14 ∕ 23} + μ_{23 ∕ 13} + μ_{23 ∕ 24} + μ_{24 ∕ 13} + μ_{24 ∕ 24}, \end{gathered}

(21)

for H₀ : J_αγ= 0,

\begin{gathered} μ_{13 ∕ 13} + μ_{13 ∕ 24} + μ_{14 ∕ 14} + μ_{14 ∕ 23} + μ_{23 ∕ 13} + μ_{23 ∕ 24} + μ_{24 ∕ 14} + μ_{24 ∕ 23} \\ = μ_{13 ∕ 14} + μ_{13 ∕ 23} + μ_{14 ∕ 13} + μ_{14 ∕ 24} + μ_{23 ∕ 14} + μ_{23 ∕ 23} + μ_{24 ∕ 13} + μ_{24 ∕ 24}, \end{gathered}

(22)

for H₀ : J_βγ= 0,

\begin{gathered} μ_{13 ∕ 13} + μ_{13 ∕ 14} + μ_{14 ∕ 23} + μ_{14 ∕ 24} + μ_{23 ∕ 23} + μ_{23 ∕ 24} + μ_{24 ∕ 13} + μ_{24 ∕ 14} \\ = μ_{13 ∕ 23} + μ_{13 ∕ 24} + μ_{14 ∕ 13} + μ_{14 ∕ 14} + μ_{23 ∕ 13} + μ_{23 ∕ 14} + μ_{24 ∕ 23} + μ_{24 ∕ 24}, \end{gathered}

(23)

for H₀ : K_γα= 0,

\begin{gathered} μ_{13 ∕ 13} + μ_{13 ∕ 23} + μ_{14 ∕ 14} + μ_{14 ∕ 24} + μ_{23 ∕ 14} + μ_{23 ∕ 24} + μ_{24 ∕ 13} + μ_{24 ∕ 23} \\ = μ_{13 ∕ 14} + μ_{13 ∕ 24} + μ_{14 ∕ 13} + μ_{14 ∕ 23} + μ_{23 ∕ 13} + μ_{23 ∕ 23} + μ_{24 ∕ 14} + μ_{24 ∕ 24}, \end{gathered}

(24)

for H₀ : K_γβ= 0, and

\begin{gathered} μ_{13 ∕ 13} + μ_{13 ∕ 24} + μ_{14 ∕ 14} + μ_{14 ∕ 23} + μ_{23 ∕ 14} + μ_{23 ∕ 23} + μ_{24 ∕ 13} + μ_{24 ∕ 24} \\ = μ_{13 ∕ 14} + μ_{13 ∕ 23} + μ_{14 ∕ 13} + μ_{14 ∕ 24} + μ_{23 ∕ 13} + μ_{23 ∕ 24} + μ_{24 ∕ 14} + μ_{24 ∕ 23}, \end{gathered}

(25)

for H₀ : L_γγ = 0, respectively. Each of these constraints is implemented within the EM algorithm described above, which leads to the MLEs of the genotypic values that satisfy equations (11) - (25), respectively. The critical thresholds for each of the tests (11) - (25) can be determined from simulation studies.
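Each of the constraints (11) - (25) states that one column of the design matrix D is orthogonal to the vector of genotypic values, so the two sides of any constraint can be generated rather than written out. The sketch below does this, and also shows one standard way to impose such a linear constraint in the M step (a posterior-weighted least-squares projection). The projection formula is an assumption made for illustration; the text does not spell out how the constrained MLEs are computed. The effect names in `EFFECTS` are ad hoc labels.

```python
import numpy as np
from itertools import product

# Single-locus codes (alpha, beta, gamma) for genotypes 13, 14, 23, 24.
CODES = {"13": (1, 1, 1), "14": (1, -1, -1), "23": (-1, 1, -1), "24": (-1, -1, 1)}
LABELS = [f"{g1}/{g2}" for g1, g2 in product(CODES, CODES)]
EFFECTS = ["mu", "alpha1", "beta1", "gamma1", "alpha2", "beta2", "gamma2",
           "I_aa", "I_ab", "I_ba", "I_bb", "J_ag", "J_bg", "K_ga", "K_gb", "L_gg"]

def design_row(g1, g2):
    a1, b1, c1 = CODES[g1]
    a2, b2, c2 = CODES[g2]
    return [1, a1, b1, c1, a2, b2, c2, a1 * a2, a1 * b2, b1 * a2, b1 * b2,
            a1 * c2, b1 * c2, c1 * a2, c1 * b2, c1 * c2]

D = np.array([design_row(g1, g2) for g1, g2 in product(CODES, CODES)], dtype=float)

def constraint_sides(effect):
    """Genotypes on the left (+1) and right (-1) side of the constraint for `effect`."""
    c = D[:, EFFECTS.index(effect)]
    left = [lab for lab, s in zip(LABELS, c) if s > 0]
    right = [lab for lab, s in zip(LABELS, c) if s < 0]
    return left, right

def constrained_means(mu_hat, weights, effect):
    """Project unconstrained means onto {mu : c' mu = 0}, weighting genotype k
    by weights[k] (the summed posterior probabilities). Assumed approach."""
    c = D[:, EFFECTS.index(effect)]
    lam = (c @ mu_hat) / np.sum(c ** 2 / weights)
    return mu_hat - lam * c / weights

left, right = constraint_sides("J_bg")
print("H0: J_bg = 0 ->", " + ".join(left), "=", " + ".join(right))
```

Generating the constraints this way also serves as a check on the typeset equations, since every genotype must appear exactly once across the two sides.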

Results

A Worked Example

We use a real example from a forest tree to illustrate our multiallelic epistatic QTL mapping method in an outbred population. The material was an interspecific F₁ hybrid population between Populus deltoides (P₁) and P. euramericana (P₂). A total of 86 individuals were selected for QTL mapping. A genetic linkage map was constructed using 74 SSR markers with segregating genotypes 12 × 34; the map covers 822.35 cM of the genome and contains 14 linkage groups. The total number of roots per cutting (TNR) was measured and showed large variation in the hybrid population during the later developmental stage of adventitious rooting in water culture.

Through a systematic search over these linkage groups, the multiallelic epistatic model identifies six significant pairs of QTLs from different groups for TNR at the 5% significance level (Figure 1). The group × group-wide LR threshold for asserting that a pair of interacting QTLs exists was determined from 1000 permutation tests. Linkage group 2 has multiple regions that contain QTLs, which are located between markers L2_G_3592 and L2_O_10, markers L2_P_422 and L2_P_667, markers L2_P_667 and L2_G_876, and markers L2_O_286 and L2_O_222. These QTLs form five epistatic combinations by interacting with each other or with those on linkage groups 4, 7, 12 and 14 (Table 1). The sixth pair comes from linkage groups 6 and 12.

Table 1 Parameter estimates of interacting QTLs for root numbers in a full-sib family of poplars


Table 1 gives the estimates of genetic effect parameters for the six pairs of interacting QTLs. At QTLs on linkage group 2, parent P. euramericana tends to contribute unfavorable alleles to root number, as seen by many negative β values, although this parent shows a better rooting capacity than parent P. deltoides. At these QTLs, parent P. deltoides generally contributes a small-effect allele to root number, as seen by small α values. At the QTL on linkage group 6, this parent triggers a large positive additive effect. It is interesting to find that there are pronounced interactions between alleles from these two parents, as seen by large γ values, suggesting the importance of dominance in rooting capacity. In many cases, additive × additive epistatic effects are important, as indicated by many large I values. Our model can further discern which kind of additive × additive epistasis contributes. For example, the additive × additive epistasis between QTLs from linkage group 2 is due to the interaction between alleles from parent P. euramericana, while for the QTL pair from linkage groups 2 and 14 it is due to the interaction between alleles from parent P. deltoides. The pattern of how the QTLs interact with each other in terms of additive × dominant, dominant × additive, and dominant × dominant epistasis can also be identified (Table 1).

Monte Carlo Simulation

We performed simulation studies to investigate the statistical properties of the multiallelic epistatic model. We simulated full-sib families of sample sizes 400, 800 and 2000 derived from two outcrossing parents. Two QTLs were assumed at different locations of a 100 cM-long linkage group with 6 evenly spaced markers. Phenotypic values of a quantitative trait for each individual were simulated as the genotypic values at these QTLs plus normally distributed errors (scaled to give heritabilities of 0.1 and 0.4). Genotypic values are expressed in terms of genetic actions and interactions, with true values tabulated in Table 2.
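The scaling of the residual variance to a target heritability can be sketched as follows, using h² = V_G / (V_G + V_E), so that V_E = V_G (1 − h²) / h². This is a simplified illustration: it draws the 16 two-QTL genotypes with equal probability and ignores linkage between the two QTLs and the markers, which the simulation described above does model. The genotypic values used in the usage lines are arbitrary, not the true values of Table 2.

```python
import numpy as np

def residual_variance(var_g, h2):
    """Residual variance giving broad-sense heritability h2 for genetic variance var_g."""
    return var_g * (1.0 - h2) / h2

def simulate_trait(genotypic_values, n, h2, seed=0):
    """Simulate n phenotypes as genotypic value plus normal error.

    Assumption: the genotypes are equally frequent and sampled independently,
    so the genetic variance is the plain variance of `genotypic_values`.
    Returns (genotype index per individual, phenotypes).
    """
    rng = np.random.default_rng(seed)
    values = np.asarray(genotypic_values, dtype=float)
    g_idx = rng.integers(0, values.size, size=n)
    var_e = residual_variance(np.var(values), h2)
    y = values[g_idx] + rng.normal(0.0, np.sqrt(var_e), size=n)
    return g_idx, y

# Usage with arbitrary genotypic values (illustration only).
values = np.arange(16.0)
g_idx, y = simulate_trait(values, n=400, h2=0.4)
```

Lower heritability inflates V_E sharply (V_E is 9 V_G at h² = 0.1 but only 1.5 V_G at h² = 0.4), which is consistent with the larger sample sizes needed at low heritability.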

Table 2 Parameter estimates and their standard errors of the multiallelic epistatic model for an outbred cross based on 1000 repeat simulations


It was found that the QTL positions can be estimated well using our model (Table 2). The additive effects at individual QTLs and the additive × additive epistatic effects can be reasonably estimated even with a modest sample size and a modest heritability. The other genetic effect parameters, especially dominant × dominant epistatic effects, need a large sample size to be reasonably estimated, especially when the heritability is low. Because of the large number of parameters involved, the outcrossing design requires much larger sample sizes than backcross or F₂ designs.

Discussion

The past two decades have seen tremendous interest in developing statistical models for QTL mapping of complex traits, inspired by Lander and Botstein's (1989) pioneering interval mapping [2, 3, 17, 22–25]. However, model development for QTL mapping in outbred populations, a group of species of great environmental and economic importance [26], has not received adequate attention. Only a few publications are available on QTL mapping in outcrossing species [12, 13]. In this article, we present a quantitative genetic model for studying the epistasis of multiallelic QTLs and a computational algorithm for estimating and testing epistatic interactions.

The central issue of QTL mapping for outcrossing populations is how to model genetic actions and interactions between multiple alleles at different QTLs. Traditional quantitative genetic models have been developed for biallelic genetic effects [16], and their extension to multiallelic cases has not been clearly explored. This study is a first attempt to characterize epistatic interactions between multiallelic QTLs that pervade outcrossing populations. We partition additive effects at each QTL into two subcomponents based on the different parental origins of alleles. Similarly, we partition the additive × additive epistasis into four different subcomponents, the additive × dominant epistasis into two subcomponents, and the dominant × additive epistasis into two subcomponents based on the interactions of alleles of different parental origins. These subcomponents have unique biological meanings because they are derived from distinct parents. In practice, hybridization is made between two genetically distant parents; thus an understanding of each of these subcomponents helps to study the genetic basis of heterosis.

We tested the new multiallelic epistasis model through simulation studies. In general, because of the number of parameters involved, a larger sample size is required to obtain reasonably precise estimates for QTL mapping in outcrossing populations. In our experience, increasing the heritability of traits through precise phenotyping can improve parameter estimation and model power more than enlarging the experiment. We recommend that more effort be devoted to field management that improves the quality of phenotype measurements than to increasing experimental size. By analyzing a real data set from a poplar genetic study, the new model has been validated. It is interesting to find that interactions between alleles from different poplar species contribute substantially to rooting capacity from cuttings, more than the genetic effects of alleles that operate alone. This result may help to understand the role of dominance in mediating heterosis.

Conclusions

We have developed a statistical model for mapping interactive QTLs in a full-sib family of outcrossing species. By capitalizing on traditional quantitative genetic theory, we define epistatic components due to interactions between two outcrossing multiallelic QTLs. An algorithmic procedure was derived to estimate all types of outcrossing epistasis and test their significance in controlling a quantitative trait. Our model provides a useful tool for studying the genetic architecture of complex traits for outcrossing species, such as forest trees, and fills a gap that occurs in genetic mapping of this group of important but underrepresented species.

References

1. Lander ES, Botstein D: Mapping Mendelian factors underlying quantitative traits using RFLP linkage maps. Genetics. 1989, 121: 185-199.
2. Zeng ZB: Precision mapping of quantitative trait loci. Genetics. 1994, 136: 1457-1468.
3. Lynch M, Walsh B: Genetics and Analysis of Quantitative Traits. Sinauer Associates, Sunderland, MA; 1998.
4. Wu RL, Zeng ZB, McKeand SE, O'Malley DM: The case for molecular mapping in forest tree breeding. Plant Breed Rev. 2000, 19: 41-68.
5. Ritter E, Gebhardt C, Salamini F: Estimation of recombination frequencies and construction of RFLP linkage maps in plants from crosses between heterozygous parents. Genetics. 1990, 125: 645-654.
6. Ritter E, Salamini F: The calculation of recombination frequencies in crosses of allogamous plant species with applications to linkage mapping. Genet Res. 1996, 67: 55-65. 10.1017/S0016672300033474.
7. Grattapaglia D, Sederoff RR: Genetic linkage maps of Eucalyptus grandis and Eucalyptus urophylla using a pseudo-testcross: mapping strategy and RAPD markers. Genetics. 1994, 137: 1121-1137.
8. Maliepaard C, Jansen J, van Ooijen JW: Linkage analysis in a full-sib family of an outbreeding plant species: overview and consequences for applications. Genet Res. 1997, 70: 237-250. 10.1017/S0016672397003005.
9. Wu RL, Ma CX, Painter I, Zeng ZB: Simultaneous maximum likelihood estimation of linkage and linkage phases in outcrossing species. Theor Pop Biol. 2002, 61: 349-363. 10.1006/tpbi.2002.1577.
10. Lu Q, Cui YH, Wu RL: A multilocus likelihood approach to joint modeling of linkage, parental diplotype and gene order in a full-sib family. BMC Genet. 2004, 5: 20.
11. Stam P: Construction of integrated genetic linkage maps by means of a new computer package: JoinMap. Plant J. 1993, 3: 739-744. 10.1111/j.1365-313X.1993.00739.x.
12. Lin M, Lou XY, Chang M, Wu RL: A general statistical framework for mapping quantitative trait loci in non-model systems: Issue for characterizing linkage phases. Genetics. 2003, 165: 901-913.
13. Wu S, Yang J, Huang YJ, Li Y, Yin T, Wullschleger SD, Tuskan GA, Wu RL: An improved approach for mapping quantitative trait loci in a pseudo-testcross design: Revisiting a poplar genome study. Bioinformat Biol Insights. 2010, 4: 1-8.
14. Wullschleger SD, Yin TM, DiFazio SP, Tschaplinski TJ, Gunter LE, Davis MF, Tuskan GA: Phenotypic variation in growth and biomass distribution for two advanced-generation pedigrees of hybrid poplar (Populus spp.). Can J For Res. 2005, 35: 1779-1789.
15. Whitlock MC, Phillips PC, Moore FBG, Tonsor SJ: Multiple fitness peaks and epistasis. Ann Rev Ecol Syst. 1995, 26: 601-629. 10.1146/annurev.es.26.110195.003125.
16. Mather K, Jinks JL: Biometrical Genetics. 3rd edition. Chapman & Hall, London; 1982.
17. Kao CH, Zeng ZB: Modeling epistasis of quantitative trait loci using Cockerham's model. Genetics. 2002, 160: 1243-1261.
18. Zhang B, Tong CF, Yin TM, Zhang XY, Zhuge Q, Huang MR, Wang MX, Wu RL: Detection of quantitative trait loci influencing growth trajectories of adventitious roots in Populus using functional mapping. Tree Genet Genom. 2009, 5: 539-552. 10.1007/s11295-009-0207-z.
19. Mather K, Jinks JL: Biometrical Genetics. 3rd edition. Chapman & Hall, London; 1982.
20. Wu RL, Ma CX, Casella G: Statistical Genetics of Quantitative Traits: Linkage, Maps and QTL. Springer-Verlag, New York; 2007.
21. Churchill GA, Doerge RW: Empirical threshold values for quantitative trait mapping. Genetics. 1994, 138: 963-971.
22. Yi NJ, Xu SZ, Allison DB: Bayesian model choice and search strategies for mapping interacting quantitative trait loci. Genetics. 2003, 165: 867-883.
23. Broman KW: Mapping quantitative trait loci in the case of a spike in the phenotype distribution. Genetics. 2003, 163: 1169-1175.
24. Zou F, Nie L, Wright FA, Sen PK: A robust QTL mapping procedure. J Stat Plan Infer. 2009, 139: 978-989. 10.1016/j.jspi.2008.06.009.
25. Cheng JY, Tzeng SJ: Parametric and semiparametric methods for mapping quantitative trait loci. Computat Stat Data Analy. 2009, 53: 1843-1849. 10.1016/j.csda.2008.08.026.
26. Bradshaw HD, Stettler RF: Molecular genetics of growth and development in Populus. IV. Mapping QTLs with large effects on growth, form, and phenology traits in a forest tree. Genetics. 1995, 139: 963-973.

Acknowledgements

This work is partially supported by NSF/IOS-0923975, Changjiang Scholars Award, and "Thousand-person Plan" Award.

Author information

Authors and Affiliations

The Key Laboratory of Forest Genetics and Gene Engineering, Nanjing Forestry University, Nanjing, Jiangsu, 210037, China
Chunfa Tong, Bo Zhang, Meng Xu & Minren Huang
Center for Statistical Genetics, The Pennsylvania State University, Hershey, PA, 17033, USA
Chunfa Tong, Zhong Wang & Rongling Wu
Center for Computational Biology, National Engineering Laboratory for Tree Breeding, Key Laboratory of Genetics and Breeding in Forest Trees and Ornamental Plants, Beijing Forestry University, Beijing, 100083, China
Zhong Wang, Xiaoming Pang, Jingna Si & Rongling Wu

Corresponding author

Correspondence to Rongling Wu.

Additional information

Authors' contributions

CT derived the model and performed computer simulation and data analysis. BZ and MX collected the data from poplar hybrids. ZW and JS participated in simulation studies. XP participated in model design and result interpretation. MH conceived of the experiment. RW developed the model and algorithm, coordinated simulation and data analysis, and wrote the paper. All authors have read and approved the final manuscript.

Rights and permissions

Open Access This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Tong, C., Zhang, B., Wang, Z. et al. Multiallelic epistatic model for an out-bred cross and mapping algorithm of interactive quantitative trait loci. BMC Plant Biol 11, 148 (2011). https://doi.org/10.1186/1471-2229-11-148

Received: 18 May 2011
Accepted: 31 October 2011
Published: 31 October 2011
DOI: https://doi.org/10.1186/1471-2229-11-148
