High-density genetic map construction and QTLs analysis of grain yield-related traits in Sesame (Sesamum indicum L.) based on RAD-Seq techonology

Background Sesame (Sesamum indicum L., 2n = 26) is an important oilseed crop with an estimated genome size of 369 Mb. The genetic basis, including the number and locations of quantitative trait loci (QTLs) of sesame grain yield and quality remain poorly understood, due in part to the lack of reliable markers and genetic maps. Here we report on the construction of a hitherto most high-density genetic map of sesame using the restriction-site associated DNA sequencing (RAD-seq) combined with 89 PCR markers, and the identification of grain yield-related QTLs using a recombinant inbred line (RIL) population. Result In total, 3,769 single-nucleotide polymorphism (SNP) markers were identified from RAD-seq, and 89 polymorphic PCR markers were identified including 44 expressed sequence tag-simple sequence repeats (EST-SSRs), 10 genomic-SSRs and 35 Insertion-Deletion markers (InDels). The final map included 1,230 markers distributed on 14 linkage groups (LGs) and was 844.46 cM in length with an average of 0.69 cM between adjacent markers. Using this map and RIL population, we detected 13 QTLs on 7 LGs and 17 QTLs on 10 LGs for seven grain yield-related traits by the multiple interval mapping (MIM) and the mixed linear composite interval mapping (MCIM), respectively. Three major QTLs had been identified using MIM with R2 > 10.0% or MCIM with ha2 > 5.0%. Two co-localized QTL groups were identified that partially explained the correlations among five yield-related traits. Conclusion Three thousand eight hundred and four pairs of new DNA markers including SNPs and InDels were developed by RAD-seq, and a so far most high-density genetic map was constructed based on these markers in combination with SSR markers. Several grain yield-related QTLs had been identified using this population and genetic map. We report here the first QTL mapping of yield-related traits with a high-density genetic map using a RIL population in sesame. Results of this study solidified the basis for studying important agricultural traits and implementing marker-assisted selection (MAS) toward genetic improvement in sesame. Electronic supplementary material The online version of this article (doi:10.1186/s12870-014-0274-7) contains supplementary material, which is available to authorized users.


Background
Sesame (Sesamum indicum L.) is an important and ancient oilseed crop [1]. It is a diploid species (2n = 26) with an estimated genome size of 369 Mb [2]. Sesame seed has the highest oil contents compared with rapeseed, peanut, soybean and other oilcrops [3]. It is also rich in proteins, vitamins and specific antioxidants such as sesamin and sesamolin [4,5], making it one of the best choices for health foods. As the market demand of sesame seeds is rapidly growing, it becomes one of the most important goals to stably improve grain yield of sesame by genetic approaches. Grain yield of sesame per plant is considered to be composed of three components, i.e. the number of capsules per plant, the number of grains per capsule and the grain weight. Some other factors, including plant height, length of capsules (floral) and axis height of the first capsule were found to strongly associated with grain yield of sesame [6]. Since the grain yield-related traits are inherited quantitatively and governed by multiple genes sensitive to the environment, QTL-mapping is needed to dissect the genetics of these traits [7]. The high-density genetic map had been proved to be a very effective and important approach for QTLs detection in rice [8][9][10][11] and other crops [12][13][14]. Unfortunately, there are no yield-related QTLs or genes have been reported in sesame due in part to the lack of reliable DNA markers and genetic maps constructed based on permanent populations.
The first genetic linkage map of sesame was constructed using an F 2 population derived from the intervariety cross of 'COI1134' (white seed coat) and 'RXBS' (black seed coat) [15]. This map was 936.72 cM in genetic length with an average marker distance of 4.93 cM. It contained 220 markers, including 8 expressed sequence tag-simple sequence repeats (EST-SSRs), 25 amplified fragment length polymorphism (AFLPs) and 187 Random Selective Amplification of Microsatellite Polymorphic Loci (RSAMPLs), that are distributed on 30 linkage groups, which is more than 2 folds the number of chromosomes of the haploid sesame genome. Later, 14 more genic-SSRs developed from RNA-seq were integrated onto this map [16]. More recently, this map was improved substantially by placement of more markers using an enlarged F 2 population [17]. This reduced the number of LGs to 14, only one LG more than the haploid chromosome number of sesame. The genetic length of this new map was 1,216 cM, and the marker density was 1.86 cM per marker interval. Four QTLs controlling seed coat color with a heritability ranging from 59.33% to 69.89% were detected in F 3 populations.
The emergence of massively-parallel, next-generation sequencing (NGS) platforms with continually reducing costs offers unprecedented opportunities for genomewide marker development and genotyping by sequencing (GBS). Several NGS methods are combined with restriction enzyme digestion to reduce the complexity of the target genomes, making the sequencing load and cost significantly declined [18], while still capable of discovering thousands of single-nucleotide polymorphisms (SNPs) or insertion-deletions (InDels) markers [19][20][21]. The restriction-site associated DNA sequencing (RADseq) was one of the NGS methods that sequencing only the DNA flanking specific restriction enzyme sites to produce a reduced representation of genome, which ligated an adapter containing multiplex identifiers (MIDs) in the reduced-representation libraries (RRLs) [22][23][24][25][26][27]. In these ways, several high-density genetic maps have been constructed in eggplant [28], ryegrass [13], barley [14], grape [27] and even sesame [29]. Recently, a high-density genetic map of sesame was constructed based on an F 2 population using the specific length amplified fragment sequencing (SLAF-seq) technology, which is an enhanced RRL sequencing strategy for de novo SNP discovery from large populations [21,29]. This map comprises 1,233 SLAF markers that are distributed on 15 linkage groups (LGs), and is 1,474.87 cM in length with average marker spacing of 1.20 cM. Collectively, all the three published sesame genetic maps are not ideal for quantitative traits mapping as they are all on the basis of a temporary population (F 2 ) that renders repeated phenotyping unfeasible [30]. Moreover, these maps are not comparable as they lack common markers.
In this study, we identified three thousand seven hundred and sixty-nine pairs of SNP markers through RAD-seq of two sesame varieties 'Zhongzhi 14' and 'Miaoqianzhima'. These markers combined with 1,195 previously reported EST-SSR or genomic-SSR and 79 InDel markers [31], were used to construct a high-density genetic map of sesame using a recombinant inbred line (RIL) population. We further present the identification of grain yield-related QTLs based on these novel genomic resources.

RAD sequencing, SNPs and InDels discovery
A total of 62.57 Gb high-quality sequence data containing 312,829,823 pair-end reads was obtained. The read number for the 224 RILs ranged from 598,119 to 3,483,606 with an average of 1,644,718. For the two parents, 3,030,776 reads were from the female parent and 3,881,579 reads were from male parent. After, the number of RAD-tags identified from the male and female parents was 231,000 and 207,000, respectively. The average coverage for individual tag was 16.80-fold in the male parent and 14.64-fold in the female parent. The number of comparable RAD-tags between the two parents was 47,247. However, only 3,769 SNP had been identified for two parents of the RIL population. Most of these SNPs were transition type SNPs with Y(T/C) and R(G/A) types accounting for 30.43% and 30.78%, respectively (Additional file 1). Besides SNPs, 97 InDels (≥2 bp) were identified with 79 successfully designed for further PCR verification and population genotype analysis [31].
Combined with previously published sesame SSRs, a total of 1061 EST-SSRs, 134 genomic-SSRs and 79 InDels were surveyed on the genomic DNA of the two parents. Eighty-nine of these PCR markers detected polymorphism including 44 EST-SSRs, 10 genomic-SSRs and 35 InDels. The efficiencies of EST-SSRs, genomic-SSRs, InDels and SNPs markers in detecting polymorphism between parents varied from 5.0% with EST-SSRs to 46.7% with InDels. All of these polymorphic SSR and InDel markers detected codominant loci.

Genetic mapping
Before genetic mapping of these markers, 656 SNP markers and 1 InDel marker that had more than 40% missing data in the RIL population were excluded. Another 1,786 SNPs, 15 InDels, 24 EST-SSRs and 4 genomic-SSRs were also excluded for their excessively distorted pattern with segregation ratios of the minor allele frequency less than 0.29. Therefore, a final set of 1,327 SNPs, 19 InDels and 26 SSRs, which mostly inherited in a codominant manner, were used for genetic map construction ( Table 1) (Table 2). There were 16 gaps more than 10 cM distributed on 9 LGs, excluding LG2, LG8, LG9, LG10 and LG14, with the largest gap of 22.54 cM located on LG6. Most of these gaps were located near the end of the linkage groups (Figure 1), which was considered a reflection of high levels of recombination at distal regions of chromosomes [39,40]. Furthermore, the distributions of SSR, InDel and SNP markers toward different LGs are random, with less than 10% SSR or InDel markers each LGs.
One thousand one hundred and fifteen mapped markers segregated in the expected 1:1 ratio in the population. However, segregation of 115 mapped markers, including 4 SSRs, 2 InDels and 109 SNPs, were significantly deviated from this ratio (P <0.05) ( Table 2). Seventy-seven (61.1%) segregation distorted markers exhibited skewed genotypic frequencies toward 'Zhongzhi 14' , while 49 (38.9%) toward 'Miaoqianzhima'. Most of these markers have no effect on the calculation of map distance, except SBN1614, SBN3567 and GSSR074. Compared to mapped SNP markers and InDel markers, the mapped SSR markers had the highest percentage of skewed markers at 17.4%. These segregation distortion markers were distributed on 13 LGs, excepting LG14. The largest LG4 with 227 mapped markers had the most segregation distortion markers. The frequency of segregation distortion marker on LG12 was much higher than for other LGs at 39.4%. Four regions of segregation distortion (SDR) were detected on four LGs, including LG2, LG4, LG6 and LG12 ( Table 2). Most of these SDRs distributed near the end of their LGs, with 3 to 5 skewed markers each and accounting for 14.3% of the total skewed markers in the map. Most skewed markers in four SDRs were SNP type, with one EST-SSR marker (ZM1197) and one InDel marker (SBI035) in SDR-LG4. All the markers in SDR-LG2, SDR-LG6, and SDR-LG12 exhibited skewed

Phenotypic analysis
In all experiments, seven yield-related traits showed significant differences between the mapping parental lines. All traits showed a continuous distribution and transgressive segregation in the RIL population ( Figure 2), indicating governed by multiple genes. The near-normal curve distribution of PH, FCH, CAL, GN and TGW suggested a polygene mode of the genetic control; but CL and CN showed a bimodal distribution, suggesting the involvement of major effect genes. Analysis of variance (ANOVA) showed that the between-line variations of all traits in each trial were significant at P = 0.001. The broad-sense heritability of the seven traits ranged from 29.8% (FCH) to as high as 95.7% (CN) ( Table 3). The heritabilities of each trait are in line with their corresponding distributions.
Trial-wide correlation coefficients of all seven traits were significant at the level of P =0.01 (Additional file 3). Correlation of CL among different environments (years or locations) were strong with the coefficients above 0.80, while much weaker correlation for CAL were noted with the coefficients ranging from 0.27 to 0.35. Across the three environments where phenotypic data were available (2012WC, 2012FY and 2013YL), significant positive correlations were observed between PH and FCH (P ≤0.01), PH and CAL (P ≤0.01), PH and TGW (P ≤0.05), FCH and TGW (P ≤0.05), even CL and GN (P ≤0.01), while significant negative correlation were observed between CN and TGW (P ≤0.05) ( Table 4). More interestingly, GN and TGW were positively correlated in 2012FY (P ≤0.01), but negatively correlated in 2013YL (P ≤0.01).

QTL analysis
A total of 13 yield-related QTLs were found on 7 linkage groups using the multiple interval mapping (MIM) methods. A range of one to three QTLs were detected for individual traits (Table 5). Six QTLs were detectable in more than one trial, including Qph-12, Qtgw-11, Qgn-1, Qgn-6, Qgn-12 and Qcl-12, while others were repeatable by two softwares. Most of them showed positive additive effects by the alleles of Zhongzhi 14 except Qgn-12 and Qcl-12. Six major-effect QTLs were detected with the phenotypic effect (R 2 ) more than 10%, including one QTL, Qcl-12, showing R 2 ranged from 52.2% to 75.6%.
QTL mapping was also performed with QTLNetwork 2.0 under the mixed linear composite interval mapping  (MCIM) algorithm to dissect the main additive effects (a), the additive-additive epistatic effects (aa) and the additive-environmental interaction effects (ae) in multitrials. A total of 17 QTLs were detected on 10 linkage groups ( Table 3). All of them had significant a effects, and Qgn-6 also had significant ae effects at P ≤0.05 in 2013FY. All of them showed significant additive effect at P ≤0.001, and explained 1.70-45.39% of the phenotype variation with four major QTLs larger than 5.0%. Two QTLs for first capsule height, Qfch-4 and Qfch-12, were also detected with significant aa effect explained 1.59% of the phenotypic variation (Table 3).

Construction of a high-density genetic map in sesame
In this study, only 44 (5.0%) EST-SSRs and 10 (9.3%) genomic-SSRs were found polymorphic in the mapping population and thus were useful for genetic map construction. This rate of polymorphism is much lower than in many previous reports in sesame [16,32,34], indicating a narrower genetic dissimilarity between the parents. However, thanks to the high-throughput RAD-Seq technology, we were able to discover more than 3000 SNPs plus dozens of InDels from~40 k comparable RAD-tags. The rate of SNPs was 7.98% across the genome, which was higher than 5.12% reported by Zhang et al. [29]. The observation that most SNPs belong to the Y(T/C) (30.43%) and R(G/A) (30.78%) types are consistent with Positive and negative values indicated additive effect, additive × environment interaction effect (ae) or epistatic interaction additive effect (aa) by the alleles of Zhongzhi 14 and Miaoqianzhima, respectively; b Contibution ratio of QTL additive effect, additive × environment interaction effect (ae) or epistatic interaction additive effect (aa); *, **, *** Significant at 0.05, 0.01, 0.001 probability levels, respectively; c The broad-sense heritability (H 2 ) was calculated with the formula H 2 = σ g 2 /(σ g 2 + σ e the situations previously reported in sesame [29] and other species including even human [41]. Furthermore, the mapping population in this study was the first reported and the largest permanent mapping population in sesame. Compared to other published genetic maps in sesame, the map constructed in this paper had the highest marker density, the similar number of linkage groups compare to Sesamum indicum L. chromosomes (2n = 26), fewer distortion markers, fewer and smaller gaps [15,17,29]. Furthermore, 2,442 (64.8%) SNP markers and 44 (49.4%) polymorphic PCR markers that excessively missed or distorted were excluded for map construction in this study, while more than 65.4% markers were discarded for their unexpected segregation patterns that reported by Zhang et al. [29]. There were also 115 (9.35%) markers that showed significant segregation distortion (P <0.05) were mapped onto our map, while 205 (16.63%) [29] and 79 (10.91%) [17] on other two genetic maps in sesame. Four SDRs were detected on 4 LGs of our map, while 18 SDRs on 11 LGs of SLAF map [29]. Most of them distributed near the end of LGs, and may be involved in gametic, zygotic or other selections [42,43]. The map size reported here is 844.46 cM, which is significantly shorter than previously published maps of 1,216 and 1,474 cM. This might be due to the discarded linkage groups with less than 20 markers and the fewer segregation distortion markers and SDRs in our map. More importantly, several PCR markers on our map will be very useful information for the comparison of maps, genes or QTLs reported in sesame. Therefore, the high-density genetic map constructed in this study combined the advantages of two older maps in sesame, and will be an ideal map for QTL/gene mapping, comparative genomics analysis, map-based cloning and so on. However, it should be pointed out that the utility as a general tool for the research community has limitations for the genetic map presented is mainly based on SNP between only two sesame varieties and the SNP flanking sequence is only 85 bp.

Identification of grain yield-related QTLs using high-density genetic map in sesame
As grain yield is a complex quantitative trait controlled by multiple genes and sensitive to environments, it is imperative to phenotype yield-related traits repeatedly for reliable QTL mapping. Here the availability of a permanent segregating population (the RIL) makes it feasible for repeated phenotyping both over time and location. Since significantly (P = 0.01) correlations were found for each trait among different environments, the field LGs and 17 QTLs on 10 LGs had been detected using MIM and MCIM method, respectively. These were the first reported grain yield-related QTLs in sesame, and all of them were detectable in more than one trial or by two algorithms. The genetic control of seven yield-related traits was mostly comprised of few major QTLs plus several minor QTLs. Three major QTLs had been detected using MIM with R 2 > 10.0% or MCIM with h a 2 > 5.0%. Ten minor QTLs had been identified for seven yield-related traits using both MIM and MCIM. On the other hand, we found a QTL (Qgn-6) showed significant ae effect, and one pair of QTLs for FCH with significant aa effect. Several ae or aa effect of yield-related QTLs also had been reported in wheat [44], soybean [45], oilseed rape [46], and so on. These QTLs with a, ae or aa effect will be very important common and special information for yield improvement in sesame.
Furthermore, significantly correlations were found among some of the yield-related traits, which are indicative of closely linked or pleiotropic genetic factors controlling these traits. This was then verified by co-localization of several QTLs for these traits. The co-localization of Qph-12 and Qfch-12, all from the Zhongzhi 14 alleles, were in line with the significant positive correlation between PH and FCH. The positive correlation was found between FCH and TGW, but negative correlation between  CN and TGW or CN and FCH. Correspondingly, Qfch-11 and Qtgw-11 with positive additive effect from Zhongzhi 14 alleles, and Qcn-11 with negative additive effect from Miaoqianzhima alleles, were closely located on LG11. Nevertheless, not all correlations can be explained by QTL co-localization, such as CL and GN, PH and CN. These contradictions could be due to the effect of undetected QTLs or reasons other than pleiotropy or linkage.

Future perspectives and challenges in sesame breeding
Improvement of yield is one of the most important targets for sesame breeding; however, it is a timeconsuming and tedious project because multiple complex and environment-sensitive components are involved in this process. The identification of yield-related QTLs in this study has laid a preliminary foundation for marker assisted selection (MAS) toward the yield traits in sesame. Even though, for some minor QTLs with low LOD scores, further validation is necessary before utilizing them in breeding. On the other hand, the epistatic interaction and the co-location of yield-related QTLs may be beneficial or problematic for pyramiding of desired loci, depending on their patterns. The positive aa effects of Qfch-4 and Qfch-12 indicate that the integration of both QTLs will be beneficial to the improvement of FCH in this study. The closely located Qtgw-11 and Qcn-11 showed significant additive effect on TGW and CN, but the favorable alleles are carried by different parent lines. Thus, there are still a lot of efforts to make to precisely dissect the linked or epistatic QTLs, or screen for germplasm with independent favorable allelic variations, to facilitate breeding. In this study, we found that most QTLs showing positive additive effects are from the alleles of Zhongzhi 14, an excellent commercial cultivar with several high-yield characters. However, two identified QTLs for GN and CN contributed by Miaoqianzhima. It means that introduction of these two QTLs using the alleles of Miaoqianzhima will further improve the GN and CN of Zhongzhi 14. Furthermore, we have found 'the superior line' predicted using QTLNetwork 2.0 with significantly increased genotype effect for GN value than two parents [47] (data not showed). So there will be very great breeding potential for the improvement of grain number per capsule with this RIL population. This genotyped RIL population combined with high-density genetic map will also serve as an effective study system for characterizing serious of important agricultural traits, such as yield, oil or protein content in grain, stress tolerance, and so on.

Conclusions
This report presents by far the first QTL mapping work of yield-related traits in sesame using a RIL population, in addition to the construction of a high density genetic map. We developed 3,769 SNPs markers by RAD tag sequencing, and constructed a so far most high-density genetic map of 14 LGs in combination with SSR and InDel markers. Using this RIL population and genetic map, several grain yield-related QTLs had been detected in more than one trials or by both MIM and MCIM method, including three major effect QTLs with R 2 > 10.0% or h a 2 > 5.0%. Three QTLs with significant ae or aa effect had also been identified using MCIM algorithm. Several co-localized QTLs were identified that partially explained the correlations among seven yieldrelated traits. The high-density genetic map and yieldrelated QTLs in the current study solidified the basis for studying important agricultural traits, map-based cloning of grain yield-related genes and implementing MAS toward genetic improvement in sesame.

Plant materials and field trials
The mapping population used in this study consists of 224 F 8:9 recombinant inbred lines derived from singleseed descent from a cross between 'Miaoqianzhima' and 'Zhongzhi 14' , both are white seed-coated. The male parent 'Zhongzhi 14' is a commercial cultivar grown widely in China while the female parent 'Miaoqianzhima' is a landrace accession originating from Anhui province in China. The two varieties are distinct in many morphological traits, including plant height, growth habit, capsule shape, leaf shape and color, as well as resistances to multiple diseases.
Five field trials were set in five environments during the year 2012 to 2013 at normal planting season (from June to September), two in Wuchang (2012WC, 2013WC), two in Fuyang (2012FY, 2013FY), and one in Yangluo (2013YL). Wuchang (30°52'N, 114°32'E) and Yangluo (30°73'N, 114°62'E), which are~38.6 km apart, both are located in the summer-sown sesame zone of the middle Yangtze Valley, while Fuyang (32°93'N, 115°81'E) in the summer-sown sesame zone of the Huang Huai basin. The aforementioned two zones take up more than 50% of China's sesame-grown area. All trials were in a randomized complete blocks design, with three replicates each environment. Each plot had two 2.0-m rows spaced 0.4 m apart. At the two-euphylla stage, the plants were thinned and only thirteen evenly distributed plants in each row were retained for further analyses.

Traits evaluation
In each plot or genotype, only six uniform plants were used for trait evaluation. Plants at the two ends of each row were not selected to avoid edge effects. Traits evaluated include plant height (PH, cm), first capsule height (FCH, cm), capsule axis length (CAL, cm), capsule number per plant (CN), capsule length (CL, mm), grain number per capsule (GN) and thousand grain weight (TGW, g). CAL was measured as the length of axis from the lowest capsule to the top one. CL and GN were measured as the mean values of 18 uniform capsules from six plants. The half of TGW was measured as the mean weight of three independent samples of 500 grains. Other traits were measured as the mean values of 6 plants. All of them were measured just before the harvest stage.

Genomic DNA extraction and PCR
Genomic DNA was extracted from young leaves using the DNA extraction kit (TIANGEN Co. Ltd, Beijing). One thousand two hundred and seventy-four PCR markers, including 134 genomic-SSRs, 1,061 EST-SSRs and 79 InDels were used for genetic map construction (Table 1) [31]. Polymerase chain reactions (PCR) for SSRs and InDels were performed in 10 μl reactions, containing 10 ng DNA, 2 pmol of each primers, 2 nmol dNTPs, 15 nmol MgCl 2 , 0.2 U Taq DNA polymerase (Thermo Fisher Scientific, America) and 1 × PCR buffer supplied together with the enzyme. The PCR cycles were 94°C 3 min, 36 cycles of 94°C 20 s, 55°C~60°C (depending on the primers) 30 s, 72°C 40 s, and a 5 min at 72°C for final extension. PCR products were separated in 8% nondenaturing polyacrylamide gels (Acr:Bis =19:1 or 29:1) on a constant voltage of 180 V for 2~3 h, and were visualized by silver staining [48].

RAD sequencing, InDel and SNP markers development
Restriction-site Associated DNA (RAD) approach combined with Illumina DNA sequencing was used for rapid and effective discovery of InDel and SNP markers. RAD library construction, sample indexing and pooling followed Baird et al. [49]. The restriction enzyme EcoR I was used to cut the DNA of two parents and RIL population [50]. 22 multiplexed sequencing libraries were constructed, in which each DNA sample was assigned a unique nucleotide MID for barcoding. Single-end (101 bp) sequencing was performed using Illumina NGS platform HiSeq2000 in a total throughput of 22 lanes.
Raw sequence reads without MID barcode sequences were trimmed to 85 nucleotides from the 3' end to ensure more than 90% of the nucleotides have a quality value above Q30 (equals 0.1% sequencing error) and more than 99% above Q20 (equals 1% sequencing error). Reads of low quality, including reads with <85 bp after trimming or with ambiguous barcodes, were discarded. For InDels and SNPs calling, the trimmed reads were clustered into RAD-tags based on sequence similarity using Stacks under default parameters [51]. Clustered RAD-tags with very high read depth (>500) were excluded [51]. Sequences of RAD-tags were blasted between the two parental plants. InDels (≥2 bp) or SNPs were identified in alignment results, and regarded as true polymorphisms when each allele was observed at least three times. InDel markers were developed for PCR analysis by gaps in alignment results with another protocol [31]. The resultant sequence reads containing SNPs were compared among RIL plants. Only SNPs that were consistently discovered in parents and the progenies were retained [50]. The genotypes of SNP or PCR markers of 224 RILs were used for genetic map construction.

Linkage mapping
The marker segregation ratios were examined using the chi-square test. The poorly performing markers were removed before map construction, which excessively missed with more than 40% missing data in the RIL population or excessively distorted with segregation ratios more than of the minor allele frequency less than 0.29 [13]. A region with at least three adjacent loci showing significant segregation distortion (P <0.05) was defined as a segregation distorted region (SDR) [52]. The genetic linkage map was constructed using JoinMap 4 (Kyazma, Wageningen, Netherlands). Linkage groups were determined using a minimum LOD value of 5.0 and a maximum recombination of 45%. The regression mapping algorithm was used under the LOD threshold of 3.0 to determine the orders of markers in each linkage group. The linkage groups harboring less than 20 markers were discarded. A ripple was performed after addition of each locus, with the goodness-of-fit jump threshold for removal loci =5.0 and third round = Yes. The Kosambi mapping function was used to translate recombination frequencies into map distances. The final marker order of each linkage group was verified by the software program RECORD [53]. The linkage map was graphically visualized with MapChart 2.2 [54].

QTL analysis
The mean phenotypic data of three replicates (blocks) in different trials (environments) from all 224 lines (genotypes) were analyzed for frequency distributions, standard errors, pearsons correlation coefficients and ANOVA using SAS Statistics package [55]. The broad-sense heritability (H 2 ) was calculated with the formula H 2 = σ g 2 /(σ g 2 + σ e 2 /r), where σ g 2 represents the genetic variance, σ e 2 is the residual variance, and r is the number of replicates per genotype.
QTLs were detected for each of the seven traits using the MIM method implemented in Windows QTL Cartographer 2.5 [56] and MCIM in QTLNetwork 2.0 [57]. In Windows QTL Cartographer 2.5, a Composite interval mapping (CIM) analysis was run at first using Model 6 for one trait in one trial independently, with the forward and backward stepwise regression under a step size of 1 cM and a window size of 10 cM. The LOD significance thresholds (P <0.05) were determined by running 1,000 permutations tests [14]. The MIM was subsequently used to more precisely locate the QTLs. The QTL peaks identified in CIM were used as the initial model for the MIM and progressively refined the model using Bayesian Information Criteria (BIC-M0). QTL effects including their percentage of phenotypic variance (total R 2 ) were estimated with the final model fitted in MIM, and the R 2 for individual QTL was estimated using CIM. The boundaries of the confidence interval of the QTLs were estimated with the positions where the LOD value drop-off was equal to 1 [58]. QTLNetwork 2.0 was also used to identify QTL epistasis and QTL-environment (QE) interactions of one trait in several trials with three replicates together, which employed the genome scan parameters of a 10 cM testing window, 1 cM walk speed and 10 cM filtration window. Two-dimensional (2D) genome scans were carried out to search for multiple interacting QTLs. A genomewide threshold value of the F-statistic (α = 0.01) for declaring the presence of a QTL was estimated by 1,000 random permutations. A Monte Carlo Markov Chain method with Gibbs sample size of 20,000 was used to estimate QTL effects [59]. The sum of individual phenotypic variance explained by each QTL was calculated as the total phenotypic variance explained by all QTL for each trait.