Fine analysis of a genomic region involved in resistance to Mediterranean corn borer

Background: Sesamia nonagrioides Lefèbvre (Mediterranean corn borer, MCB) is the main pest of maize in the Mediterranean area. QTL for MCB stalk tunneling and grain yield under high MCB infestation were located at bins 8.03–8.05 (4–21 cM and 10–30 cM, respectively) in a previous analysis of the EP42 x EP39 RIL mapping population. The objective of the present work was to study those QTL at higher resolution, validating them and estimating their locations and effects with higher precision. To achieve this objective, we developed a set of 38 heterogeneous inbred families (HIFs) that were near-homozygous across the genome except in the region under study. The HIFs were evaluated in multiple environments under artificial infestation with MCB and genotyped with SNPs.
Results: The QTL for grain yield under high infestation was confirmed with higher precision and improved reliability at 112.6–116.9 Mb. In contrast, the location of the QTL for stalk tunneling was not validated, probably because some genomic regions were fixed during the development of the HIFs. Our study confirmed that the co-localization of the QTL for stalk tunneling and grain yield in the previous study was due to linked genes, not to pleiotropic effects. Therefore, the QTL for grain yield can be used to improve grain yield without undesirable effects on stalk tunneling.
Conclusions: HIF analysis is useful for validating QTL and for conducting deeper studies of traits related to corn borer resistance.


Background
The area planted with maize worldwide exceeds 184.8 million hectares, with a total annual production of 1037.7 million metric tons in 2014 [1]. Corn borer is the generic name for different species of Lepidoptera that feed on maize, producing tunnels in the stalks. Corn borers are found on all continents, for example Ostrinia nubilalis Hübner (European corn borer, ECB) in America and Europe, Ostrinia furnacalis Guenée in Asia, and Sesamia calamistis Hampson in Africa. Some studies have reported yield losses of up to 30% caused by corn borers [2].
ECB is the main corn borer in central Europe, while Sesamia nonagrioides Lefèbvre (Mediterranean corn borer, MCB) is one of the most important pests of maize in Southern Europe, particularly in Spain [3,4]. ECB and MCB usually have two or more generations per year. The first generation feeds on the leaves of young plants, while the larvae of the later generations feed on the stems and ears of plants that have completed (or are close to completing) their vegetative growth. The second generation causes the main damage, and we focus on resistance to this generation.
In studies of maize resistance to corn borers, the damage and the level of resistance are commonly measured as the length of the tunnels produced by larvae in the stem. The genetic basis of ECB and MCB resistance measured as tunnel length is polygenic [5,6], and the heritability of this trait has varied widely between experiments, from 0.5 to 0.8 [7][8][9][10][11][12][13][14].
At the molecular level, several experiments with RILs have been carried out to detect QTL related to resistance to ECB and MCB. About 10-15 QTL related to ECB resistance were detected per experiment, explaining approximately 50 and 60% of the phenotypic and genotypic variance, respectively [8,9]. In a QTL experiment with three connected populations and a relatively high number of RILs (521) and markers (2411), the number of QTL related to ECB resistance (10) and the proportion of phenotypic variance explained by the QTL (37%) were still low [11]. The number of QTL related to MCB resistance detected per experiment was low (1-3), and the genotypic variance explained by the QTL was also low (usually between 20 and 30%) [10,12,13,15]. In addition, in several studies QTL detected for tunnel length co-localized with QTL for other agronomic traits such as plant height [12,13], days to flowering [9,11] or grain yield [16]. Co-localization can be due either to different linked genes, one for each trait, or to a single gene with a pleiotropic effect on both traits. These previous studies could not discriminate between linkage and pleiotropy, although that knowledge is relevant for the potential application of the QTL in breeding: a gene with pleiotropic and opposing effects on two traits makes their simultaneous improvement impossible, whereas two linked genes allow it, because recombination can separate them.
The significant QTL detected with standard biparental populations should be verified in additional experiments before continuing with deeper studies of gene discovery and characterization. In biparental mapping populations the effects of multiple segregating QTL can be confounded, which can lead to reduced power of QTL detection or overestimation of the effects [17]. Near-isogenic lines (NILs) [18] are effective genetic stocks for studying phenotypic effects attributable to a QTL, since the genetic background that commonly influences phenotypic assessments of quantitative traits is standardized [19]. Tuinstra and collaborators proposed a quicker method to develop NILs by identifying inbred lines that are highly homozygous except for a region that segregates for the trait of interest [20]. These types of NILs were called heterogeneous inbred families (HIFs) [20]. The method can be straightforwardly applied to RILs to validate a QTL previously detected in those RILs. HIF analysis has been used to validate QTL related to plant height and yield [21], leaf number [22], number of vascular bundles [23], and kernel traits [24] in maize. HIF analysis could be particularly useful for validating QTL related to insect resistance, because the precision of QTL mapping for pest resistance traits is low owing to the intrinsic characteristics of those traits, which depend on both plant and insect variation. HIF analyses have already been used successfully to validate QTL related to disease resistance, for example, resistance to Northern Leaf Blight [25] and dwarf disease [26] in maize. However, although numerous insect resistance QTL have been mapped in maize with standard biparental populations, no QTL for insect resistance have been verified with NILs or HIFs, and some authors have pointed out the need for more precise mapping of traits related to insect resistance in maize [27].
In the analysis of a RIL population derived from EP42 x EP39 we detected a region spanning bins 8.03 to 8.05 where a QTL for stalk tunnel length co-localized with a QTL for grain yield under high infestation and a QTL for flowering [15]. The QTL for stalk tunnel length was located between markers umc1984 and umc1858 (79-111 Mb), while the QTL for grain yield and flowering were located between umc1858 and bnlg1812 (111-136 Mb) [15]. The objective of this research was to validate, and to estimate with higher precision, the locations and effects of the QTL for stalk tunnel length, grain yield under high infestation, and flowering previously detected in that RIL population [15]. To this end, we developed and genetically analyzed a set of HIFs, which provide higher mapping precision than RIL mapping populations.

Results
The genetic analysis of HIFs allows a fine mapping of a specific region previously detected in standard QTL because the genetic background outside the target region is expected to be highly homogenous in the HIFs. We indeed obtained a high level of homogeneity in the genetic background of our set of HIFs which is in contrast with the heterogeneity that the HIFs maintained in the target region where the QTL were located in the previous study (8.03-8.05) (Fig. 1). Thus, the percentage of polymorphic loci ranged from 1 to 4% in all chromosomes except in chromosome 8 which had 19% of polymorphic loci. In the target region where the QTL were located in the previous study the percentage of polymorphic loci was higher: about 50% in bins 8.03 and 8.05 and about 80% in bin 8.04.
In summary, there was a region from 24 Mb to 139 Mb of chromosome 8 with 84% polymorphic loci, except for two smaller sub-regions, from 45 Mb to 69 Mb and from 122 Mb to 129 Mb, with reduced polymorphism (10%).
After discarding the SNPs with missing data, there were 73,316 SNPs genotyped in the 38 HIFs. The percentage of polymorphic loci in the whole genome was 0.05%, while the percentage increased to 2% in chromosome 8.
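The percentage of polymorphic loci reported above can be illustrated with a minimal sketch. It assumes, purely for illustration, that a locus counts as polymorphic when the families do not all share the same genotype call; the genotype encoding (`"AA"`, `"BB"`) and the function name `percent_polymorphic` are hypothetical and are not taken from the study's pipeline.

```python
from collections import defaultdict


def percent_polymorphic(genotypes):
    """Return the percentage of polymorphic loci per chromosome.

    genotypes maps (chromosome, position) to a list of genotype calls,
    one call per family. The encoding is a hypothetical example.
    A locus is treated as polymorphic when more than one distinct call
    is observed among the families (an assumption of this sketch).
    """
    total = defaultdict(int)
    polymorphic = defaultdict(int)
    for (chrom, _pos), calls in genotypes.items():
        total[chrom] += 1
        if len(set(calls)) > 1:
            polymorphic[chrom] += 1
    return {chrom: 100.0 * polymorphic[chrom] / total[chrom] for chrom in total}


# Toy data: two loci on chromosome 8 (one segregating), two fixed loci on chromosome 1.
toy = {
    (8, 100): ["AA", "BB", "AA"],
    (8, 200): ["AA", "AA", "AA"],
    (1, 50): ["BB", "BB", "BB"],
    (1, 60): ["BB", "BB", "BB"],
}
result = percent_polymorphic(toy)
```

With real data the same tally would be run over all SNPs retained after filtering for missing calls.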

Linkage mapping
In the linkage mapping analysis of the HIFs, we found QTL for grain yield, stalk tunneling, and silking at which the allele from EP42 conferred higher yield, longer galleries, and earlier silking, consistent with the original EP42 x EP39 mapping experiment (Table 1, Fig. 2).

Haplotype analysis and identification of causative genes
The haplotype analysis showed that there were two haplotype blocks in the region under study (Fig. 3). The QTL for grain yield and the QTL for silking were in block 2, overlapping with the QTL for plant height, which was also located in block 2, whereas the QTL for stalk tunneling was in block 1.
Thus, the stalk tunneling and grain yield QTL were in different blocks, so recombination between them is possible.
The comparison of haplotypes was made exclusively for the grain yield QTL because its location and effect were clearly validated and the homogenization of the genetic background was effective, resulting in a high proportion of the variance being explained. The yield of the lines with the EP42 haplotype in the region where the QTL for yield was detected (from 112.6 to 117.7 Mb) did not overlap with the yield of the HIFs with the EP39 haplotype, with the exception of HIF_2 (Table 2). Thus, the mendelization of this QTL was almost achieved with the development of the HIFs, in spite of the moderate effect of the QTL. Two HIFs were recombinant in the region, which provides valuable information. HIF_40 had the EP39 haplotype except for two SNPs at 116.9 Mb, where it carried the EP42 alleles; this HIF had a high yield, similar to that of the HIFs with the EP42 haplotype in the entire region (from 112.6 to 117.7 Mb). In contrast, HIF_37 had the EP42 haplotype except for the two SNPs at 116.9 Mb, where it was heterozygous; this HIF had a low yield, similar to that of the HIFs with the EP39 haplotype in the entire region (from 112.6 to 117.7 Mb). Thus, a change in the alleles at 116.9 Mb had a great impact on the yield of HIF_40 and HIF_37, which suggests that the QTL for grain yield under high infestation could be located around this position (113.9-117.7 Mb). This region contains 72 genes, 33 of them with a function recognized by the PlantRegMap platform (Table 3).
Grain yield is the result of multiple processes throughout the life of the plant, and potentially any gene could have an effect on this complex trait. Therefore, it is not possible to reduce the number of candidate genes in the region of the QTL based on their known functions. Nevertheless, the number of candidate genes for the yield QTL has been reduced from thousands in the previous analysis of the biparental population to fewer than one hundred in the analysis of the HIFs. This relatively small number of candidate genes is amenable to differential expression analysis to further limit the number of candidates.

Table footnotes:
a Bin locations were designated by an X.Y code, where X was the linkage group containing the bin and Y was the location of the bin within the linkage group [53]
b DS was the estimation for the complete data set; ES was the average value for the 1000 estimation sets; TS was the average value of the 1000 validation sets in cross validation; the bias was calculated as the difference between the ES and TS estimations divided by the ES estimation
c Additive effect of the QTL estimated as half the difference between the genotypic values of the two homozygotes. A positive estimate means that EP42 carried the allele with the higher value
d Detection frequency of the QTL in the cross-validation test
e Proportion of phenotypic variance explained by each QTL

Fig. 2 Genetic map of a 38-HIF population derived from the cross EP39 × EP42 on which the QTL found for different traits have been located. We used 17 SNP markers at bins 8.03-8.04. The black numbers below the chromosome indicate the position in bp of each SNP marker, while the white numbers on the chromosome indicate the bin number. The 95% confidence intervals are indicated by the length of the QTL bar

Discussion
New genotyping techniques such as GBS allow genotyping with a higher density of markers than alternative techniques such as SSRs. In the genotyping of the EP42 x EP39 RIL population only 6 SSR markers were located on chromosome 8 [15], while 17 polymorphic SNPs were genotyped in the target region of chromosome 8 in the HIFs. This greatly improved coverage increases the precision of QTL mapping in the present experiment compared to the first experiment.

Linkage mapping
The position of the QTL for grain yield in the present work was between the markers that flanked the QTL in the EP42 x EP39 RIL mapping population. However, the flanking markers in the HIF analysis delimited a shorter region, between 113 and 117 Mb, for the grain yield QTL. The additive value estimated in the analysis of the HIFs was similar to, although slightly higher than, the value estimated in the analysis of the EP42 x EP39 RIL population (0.3 vs 0.2 Mg ha⁻¹). CV was used to validate the estimation of the position and effect of the QTL. The average values of the additive effect estimated from the estimation and test sets in the CV were similar to the value estimated from the whole data set (0.21-0.31), which indicates that the estimated values are consistent. In addition, the QTL was detected in 87% of the CV runs, which also indicates that the QTL is reliable. The proportion of CV runs in which the QTL was detected in the EP42 x EP39 RIL population was much lower (40%), indicating that the homogenization of the genetic background in the HIFs was effective in increasing the precision of QTL detection. The fixation of most of the QTL outside the target region in the HIFs also led to an increase in the proportion of phenotypic variance explained by the QTL (from 10.7 to 34.9%). Thus, the isogenization was effective in isolating the effect of the QTL, in spite of its moderate effect and the moderate heritability of grain yield. Huo and collaborators found a similar increase in the proportion of phenotypic variance explained by QTL after homogenization of the genetic background [28], but for a trait of high heritability, kernel number.

Figure caption: Local linkage disequilibrium in Haploview, measured as r² between pairs of SNPs, and haplotype blocks for a genomic region located at bins 8.03-8.04 and studied by HIF analysis. Blocks in linkage disequilibrium at 50 and 60% of r² [52]
The location of a QTL for silking close to the QTL for yield was also confirmed in the analysis of the HIFs. In contrast to the QTL for yield, the reliability and the percentage of variance explained by the silking QTL were lower in the HIFs than in the EP42 x EP39 RIL population. In the EP42 x EP39 RIL population the flowering QTL had a large effect, explaining 30% of the phenotypic variance, in agreement with other studies that detected a QTL of large effect for silking in the same region [29][30][31][32][33]. This large effect could be due to the combined effect of several flowering genes located near each other, such as ZCNC8 at 124 Mb [34] and Zm-Rap2.7 at 134 Mb [35]. ZCNC8 is located near the QTL for flowering detected in the HIFs, but in a region that was unintentionally fixed during the development of the HIFs, which could explain the reduced effect detected in the HIFs compared to the RILs.
Unlike the QTL for yield and the QTL for flowering, the QTL for stalk tunneling was located at different positions in the analyses of the RILs and the HIFs. In the analysis of the RILs a QTL for stalk tunneling was located between 79 and 111 Mb, while in the analysis of the HIFs it was located between 28 and 36 Mb. Either the analysis of the RILs was unable to detect any effect from 28 to 36 Mb, or it placed the effect outside that region because of poor marker coverage there. On the other hand, the analysis of the HIFs could have failed to detect any effect from 79 to 111 Mb owing to the fixation of genomic regions during the development of the HIFs. There may have been direct fixation of genes related to stalk tunneling or, alternatively, the reduction in the estimated effect of the QTL for flowering could have affected the detection of the QTL for stalk tunneling. Krakowsky and collaborators also failed to detect some QTL for stalk tunneling after adjusting for flowering [9]. These results are consistent with the relationship between time to flowering and stalk damage by corn borers observed at the phenotypic [36] and molecular level [8,13].
We identified a QTL for plant height between 108 and 113 Mb which was not detected in the analysis of the EP42 x EP39 RIL population. This QTL does not seem to be a false positive, because it explained almost 30% of the phenotypic variance and the additive effects estimated using the whole data set, the estimation sets, and the test sets were similar (5 cm), indicating that the bias in the estimates was not large. Furthermore, the QTL was detected in 95% of the CV runs, indicating that the location of the QTL was reliable. Differences between original and validation studies in QTL experiments for disease resistance can be attributed to QTL x environment interaction, high experimental error, overestimation of the effects, and lack of statistical power [37,38]. Those reasons do not seem to apply to our QTL for plant height, because the QTL for grain yield was consistently found in the HIFs in spite of its low effect in the EP42 x EP39 RIL population and the moderate heritability and large interaction with environment of the trait. Alternatively, the failure to detect the QTL in the EP42 x EP39 RIL population could be due to the presence of two QTL with counteracting effects linked in repulsion, so that their combined effect is null [15]. One of them could have been fixed during the development of the HIFs, allowing the detection of the other.

Haplotype analysis and identification of causative genes
Schulz and collaborators have reported significant negative genetic correlations between tunnel length and grain yield [7], which implies that an undesirable reduction in grain yield could accompany the improvement of resistance. This undesirable indirect response to selection for resistance has indeed occurred in several selection programs for corn borer resistance [39][40][41][42]. At the molecular level, some QTL for stalk tunneling were localized in the same regions as QTL for grain yield, due to linked genes or to genes with pleiotropic and opposing effects on both traits [16], which hampers the use of those QTL in breeding. Knowing whether the co-localization of QTL is due to linked genes or to one gene with pleiotropic effects is critical for the use of QTL in breeding. If the co-localization is due to linked genes, then the simultaneous improvement of both traits is possible, but it is not if both QTL are due to the same gene with pleiotropic effects. In the analysis of the EP42 x EP39 RILs [15] we found a QTL for yield and a QTL for stalk tunneling in the same region, with the allele that increased yield having a negative effect on resistance. However, in the HIFs we detected only the QTL for yield in that region, not the QTL for stalk tunneling, which indicates that the gene responsible for the QTL for yield does not have a pleiotropic effect on resistance. Thus, the QTL could be used to improve grain yield without undesirable indirect effects on stalk tunneling.

Conclusions
The HIF analysis was effective for validating the QTL for grain yield under high infestation, which was detected with higher precision and improved reliability. On the other hand, the location of the stalk tunneling QTL was not confirmed, probably due to the fixation of genes related to stalk tunneling or flowering during the development of the HIFs. The HIF analysis allowed the detection of a QTL for plant height that had not been detected previously, probably because of the confounded effects of multiple segregating QTL. We conclude that HIF analysis is useful for validating QTL and conducting deeper studies of traits associated with high experimental error and moderate heritability, such as those related to corn borer resistance.

Plant materials
We used the HIF method to develop the NIL population under study [20]. A RIL that was heterozygous for three markers (umc1984, umc1858 and bnlg1812) located in the region 8.03-8.05, where the QTL for stalk tunneling, grain yield, and flowering were previously detected [15], and that had the highest level of homozygosity elsewhere in the genome compared to other families, was selected from the 188 F5 RILs derived from EP39 x EP42. That RIL was named LR-23. The selected family LR-23 was self-pollinated twice to increase the level of homozygosity outside the 8.03-8.05 region. A single F7 plant from LR-23, which remained heterozygous in the target region (8.03-8.05), was self-pollinated. Seeds from this plant were sown and crosses among approximately 67 plants were made, resulting in 38 HIFs (HIF_1, HIF_2, etc.) with enough seed for subsequent evaluations. A scheme of the development process of the HIFs is shown in Fig. 4.

Experimental design
The 38 HIFs were sown at Pontevedra, Spain (42°24'N, 8°38'W, and 20 m above sea level) in three different years and cultivated under standard methods. The 38 HIFs were evaluated along with the parental inbreds EP42 and EP39 and the family LR-23 using a 6 × 7 lattice design with three replications per year. The trials were hand planted and each experimental plot consisted of one row, spaced 0.8 m apart, with 15 two-kernel hills spaced 0.21 m apart. Plots were overplanted and thinned, obtaining a final density of approximately 60,000 plants ha⁻¹. The evaluations were performed under artificial infestation with MCB eggs obtained at the Misión Biológica de Galicia by rearing the insect as previously described [43,44], with some modifications. Before flowering, five plants from each plot were infested with ~40 MCB eggs placed between the stem and the sheath of a basal leaf. We collected the following data: days to silking, measured as the days from planting to the day when 50% of plants in the plot showed silks; plant height, measured in five representative plants in the plot as the average length in centimeters from the ground to the top; grain yield, estimated on a plot basis as Mg ha⁻¹ at 140 g H₂O kg⁻¹; stalk tunnel length, measured as the average length in centimeters of the stem tunnels made by corn borers on the five infested plants.

Fig. 4 Scheme for developing 38 HIFs from an F7 line (LR-23), which was obtained from the cross EP42 × EP39
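The plot geometry and the moisture-adjusted yield described above can be checked with a short arithmetic sketch. It assumes one plant per hill after thinning (not stated explicitly, but consistent with the reported density of approximately 60,000 plants per hectare) and uses the standard dry-matter conversion to a reference moisture; the function names and the example harvest values are hypothetical.

```python
ROW_SPACING_M = 0.8       # distance between rows (from the text)
HILL_SPACING_M = 0.21     # distance between hills within a row (from the text)
HILLS_PER_PLOT = 15       # hills per one-row plot (from the text)
STD_MOISTURE_G_KG = 140   # reference grain moisture, g H2O per kg (from the text)
M2_PER_HA = 10_000


def plot_area_m2():
    """Ground area occupied by one single-row plot."""
    return ROW_SPACING_M * HILL_SPACING_M * HILLS_PER_PLOT


def plants_per_ha(plants_per_hill=1):
    """Plant density; one plant per hill is an assumption of this sketch."""
    return plants_per_hill * M2_PER_HA / (ROW_SPACING_M * HILL_SPACING_M)


def grain_yield_mg_ha(plot_grain_kg, moisture_g_kg):
    """Convert harvested plot grain weight to Mg/ha at the reference moisture."""
    dry_matter_kg = plot_grain_kg * (1000 - moisture_g_kg) / 1000
    at_std_moisture_kg = dry_matter_kg * 1000 / (1000 - STD_MOISTURE_G_KG)
    kg_per_ha = at_std_moisture_kg / plot_area_m2() * M2_PER_HA
    return kg_per_ha / 1000  # kg -> Mg


area = plot_area_m2()                    # about 2.52 m2
density = plants_per_ha()                # about 59,524 plants per hectare
example = grain_yield_mg_ha(2.0, 200)    # hypothetical harvest: 2 kg at 200 g/kg
```

The density of roughly 59,500 plants per hectare obtained here agrees with the approximate figure given in the text.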

Genotyping
The 38 HIFs derived from LR-23 and the two parents were genotyped by GBS at the Cornell University Institute of Biotechnology. Twenty-two polymorphic SNPs in the region 8.03-8.05 with less than 2.5% missing data were used to validate the QTL.

Statistical analysis
The phenotypic data were analyzed using the mixed model procedure (PROC MIXED) of SAS [45] considering replications and blocks within replications as random effects and families as fixed effects. A best linear unbiased estimator (BLUE) was obtained to estimate each line mean phenotypic value for both individual and combined data.

Linkage mapping
As a first approach to validating the QTL, we analyzed the HIFs using composite interval mapping with the software PlabMQTL [46], as we did in the analysis of the RILs in the previous study [15]. A LOD threshold of 1.2 was determined by permutation tests, ensuring an experiment-wise error rate of p < 0.30. A five-fold cross validation (CV) approach was employed to obtain unbiased predictors of QTL parameters such as the additive effect (α) [47]. For each trait, CV was performed on the whole data set (DS) of entry BLUEs across environments. A total of 30 entries were used as the estimation set (ES) for calibration and 8 entries were used as the test set (TS) for validation. One thousand CV runs were performed in order to determine the detection frequency and the shrinkage of the estimated effects of the QTL detected in the original data set [48]. The magnitude of the bias in the estimate of the additive effect α_i of each individual QTL was calculated as the difference between the average estimates obtained in ES and in TS, divided by the estimate in ES.
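The cross-validation bookkeeping described above can be sketched as follows. This is a minimal illustration, not the PlabMQTL implementation: the fold construction, the function names, and the numeric example are assumptions. With 38 entries, a five-fold split gives test sets of 7 or 8 entries, close to the 30/8 partition reported in the text.

```python
import random


def five_fold_splits(entries, seed=0):
    """Yield (estimation_set, test_set) pairs for one five-fold CV run."""
    rng = random.Random(seed)
    shuffled = list(entries)
    rng.shuffle(shuffled)
    folds = [shuffled[i::5] for i in range(5)]  # five roughly equal folds
    for k in range(5):
        test_set = folds[k]
        estimation_set = [e for j, fold in enumerate(folds) if j != k for e in fold]
        yield estimation_set, test_set


def additive_effect(mean_ep42_homozygote, mean_ep39_homozygote):
    """Half the difference between the two homozygous genotypic values.

    A positive value means that EP42 carries the allele with the higher value.
    """
    return (mean_ep42_homozygote - mean_ep39_homozygote) / 2


def relative_bias(mean_effect_es, mean_effect_ts):
    """Bias = (ES - TS) / ES, using average estimates over CV runs."""
    return (mean_effect_es - mean_effect_ts) / mean_effect_es


splits = list(five_fold_splits([f"HIF_{i}" for i in range(1, 39)]))
test_sizes = sorted(len(test) for _, test in splits)  # [7, 7, 8, 8, 8]

# Hypothetical average additive effects (Mg/ha) in estimation and test sets.
bias = relative_bias(0.30, 0.24)  # about 0.2, i.e. 20% shrinkage
```

Repeating the split with different seeds would produce the many CV runs over which QTL detection frequency and average effects are computed.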

Haplotype analysis and identification of causative genes
Local linkage disequilibrium, measured as r² between pairs of SNPs, and common haplotype patterns in the region under study were assessed in Haploview 4.2 [49]. The uniformity of the genetic background of the HIFs allows the direct comparison of haplotypes to map the QTL [50]. Thus, for the QTL that were validated by the linkage mapping analysis, we identified the parental haplotypes (EP42 and EP39) and the recombinant haplotypes in the region of the QTL. We compared the phenotypic values of the HIFs with parental haplotypes and the HIFs with recombinant haplotypes to fine map the QTL. The filtered predicted gene set from the annotated B73 reference maize genome (v3) [51] was used to characterize candidate genes within the validated QTL.
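For intuition about the r² measure of linkage disequilibrium, the following sketch computes it for a pair of biallelic SNPs scored on homozygous lines. It is a simplification under stated assumptions: alleles are coded 0 (EP39) or 1 (EP42), heterozygous calls are excluded beforehand, and haplotype frequencies are counted directly rather than inferred as a dedicated LD tool would do. The function name is hypothetical.

```python
def r_squared(snp_a, snp_b):
    """Squared allelic correlation r2 = D^2 / (pA(1-pA) pB(1-pB)).

    snp_a and snp_b are equal-length lists of 0/1 allele codes,
    one value per homozygous line (an assumption of this sketch).
    """
    if len(snp_a) != len(snp_b) or not snp_a:
        raise ValueError("SNP vectors must be non-empty and of equal length")
    n = len(snp_a)
    p_a = sum(snp_a) / n
    p_b = sum(snp_b) / n
    # Frequency of lines carrying allele 1 at both loci
    p_ab = sum(1 for a, b in zip(snp_a, snp_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r2 is undefined when a SNP is monomorphic")
    return d * d / denom


complete_ld = r_squared([0, 0, 1, 1], [0, 0, 1, 1])   # identical patterns
no_ld = r_squared([0, 0, 1, 1], [0, 1, 0, 1])         # independent patterns
```

Pairs of SNPs with high r² tend to fall in the same haplotype block, which is why QTL placed in different blocks can in principle be separated by recombination.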