Identification of reliable QTLs and designed QTL breeding for grain shape and milling quality in the reciprocal introgression lines in rice

Milling quality (MQ) and grain shape (GS) of rice ( Oryza sativa L.) are correlated traits, both determine farmers’ final profit. More than one population under multiple environments may provide valuable information for breeding selection on these MQ-GS correlations. However, suitable analytical methods for reciprocal introgression lines with linkage map for this kind of correlation remains unclear. In this study, our major tasks were (1) to provide a set of reciprocal introgression lines (composed of two BC 2 RIL populations) suitable for mapping by linkage mapping using markers/ bins with physical positions; (2) to test the mapping effects of different methods by using MQ-GS correlation dissection as sample case; (3) to perform genetic and breeding simulation on pyramiding favorite alleles of QTLs for representative MQ-GS traits. Finally, with four analysis methods and data collected under five environments, we identified about 28.4 loci on average for MQ-GS traits. Notably, 52.3% of these loci were commonly detected by different methods and eight loci were novel. There were also nine regions harboring loci for different MQ-GS traits which may be underlying the MQ-GS correlations. Background independent (BI) loci were also found for each MQ and GS trait. All these information may provide useful resources for rice molecular breeding.


Introduction
Rice (Oryza sativa L.) is a primary food crop, feeding nearly half of the world's population.Profit for rice farmers largely depends on not only the grain yield but also the head rice rate during grain milling [1,2].Grain shape and milling quality are key determinants of the head rice rate.Grain shape is often characterized by grain length (GL) and grain width (GW), which are all typical complex quantitative traits [3].Milling quality is often characterized by three component traits, namely brown rice rate (BR), milled rice rate (MR) and head rice rate (HR), which are also typical complex quantitative traits.
Compared to piles of works on grain shape, genetic studies focusing on milling quality in rice are still lagging behind.Initial results are often based on single-population under single environment [4,5].Later, populations derived from Oryza glaberrima [6] and rufipogon [7] were also adopted.Recently, works for GWAS by sequenced germplasm [8,9] as well as QTL validation work by fine mapping [10] also provided valuable results.For complex traits, multiple populations and/or multiple environments data may be largely beneficial to genetic dissection.Genetic control of milling quality and grain shape was identified with single population under multiple seasons [11,12] and with two RILs populations under two environments [13].Data of two different populations were evaluated on QTL for milling quality and other traits [14].
Trait correlations or genetic overlaps are important especially for breeding selection.Recently, correlations between milling quality and grain yield were promoted [15].However, grain yield and milling quality are both post-harvest traits.Comparatively, grain shape would be an index more suitable for field selection before harvest.Correlations between milling quality and grain shape traits were previously noted by phenotypic correlation with cultivars [16].Notably, QTL mapping with single set of introgression lines has provided some different views [17][18][19].Genetic studies focused on the relationship between these two traits especially with multiple populations under multiple environments, which may provide more information for utilization of QTLs for the genetic improvement of two groups of traits.
Meanwhile, consideration of a large number of QTLs is another challenge for breeders.Many factors limited the applications of detected QTLs, such as QTL by environment interactions and stability of QTL, different genetic backgrounds among populations, the linkage between detected QTL, false positives and false negatives, and lack of appropriate tools [20].Breeding simulation study for target genotype provides a potential way to find the most appropriate crossing and pyramid desired alleles at various loci based on dependable QTL mapping results [20,21].
In this study, we focused on QTL mapping of milling quality and grain shape using a set of reciprocal introgression lines developed by advanced backcrossing (BC 2 RIL) and a set of recombinant inbred lines (RIL) from a same cross.Our major tasks were (1) to construct a linkage map using a RIL population whose parents were the same as the two BC 2 RIL populations; (2) to conduct QTL mapping of the five traits in the two BC 2 RIL populations based on the re-constructed linkage map.Four different mapping methods were also used for comparison of detected QTLs; (3) to perform genetic and breeding simulation on pyramiding favorite alleles of QTLs for two representative traits, GL and HR.

Genetic materials and genotyping
In our previous study, the development of a set of reciprocal introgression lines (BC 2 RILs) derived from Minghui 63 (MH63) and 02428 were described [22][23][24].The BC 2 RILs consist of 424 lines, including 226 in MH63 background (MH63-ILs) and 198 in 02428 background (02428-ILs).All of them were genotyped by resequencing, and were used to construct a high-resolution map of 4568 bins.A set of RIL population consisting of 245 lines derived from the same parents was also adopted for linkage map construction.
Field experiments were conducted using a randomized complete design and followed the normal cultivating arrangement in each environment.Reciprocal ILs and their parents were planted in a three-row plot with ten individuals in each row with two replications.The planting density was 20 cm between rows and 20 cm between plants in each row.All field management followed local farmers' practices.At maturity, eight individuals in the middle of each plot were harvested and bulked.After harvest, seeds were naturally dried and stored at room temperature for at least three months.
Milling quality traits were evaluated according to the National Rice Gain Quality Assessment Standard of China (GB/T17891-1999).For BR, about 30 g well-filled grains from each IL were weighed and then dehulled with an experimental dehuller (Taizhou Food and Oil Machinery Factory, JLGJ-45, Taizhou, China).Then the brown rice was weighed.The BR was expressed as (total brown rice weight / total paddy rice weight) × 100%.For MR, about 20 g brown rice was weighed and milled for 60 s with an experimental miller (Taizhou Food and Oil Machinery Factory, JMNJ-3, Taizhou, China) until the whiteness of the sample reached the China national standard of GB1354-2018 grade-I milled rice.MR was defined as (total milled rice weight / total brown rice weight) × BR × 100%.Similarly, HR was defined as (total head rice weight / total milled rice weight) × MR × 100%.All lines were measured twice, and the average value was adopted as trait values for mapping.

Phenotypic data analysis
Analysis of variance (ANOVA) for phenotypic data and correlation analysis among traits were performed using the AOV function in QTL IciMapping V4.2 [25].Heritability in broad sense on the genotypic mean level in each BC 2 RIL population was calculated for each environment and across environments by the two equations, respectively.
Best linear unbiased prediction (BLUP) values across multiple environments were calculated for each trait using the following model.where y ijk was the observed phenotypic value of individual l of the genotype i in block k at environment j; μ was the mean value of the population; g i was the random effect of the i th genotype; e j was the random effect of the j th environment; r/e k(j) was the random effect of the k th block in the j th environment; ge ij was the random effect of the genotype by environment interaction; ε ijkl was the random residual effects.

Linkage map construction using the RIL population
It's notable that in the process of the BC 2 RILs construction, the recurrent parents (RPs), i.e.MH63 and 02428 were adopted as controls for phenotypic selection.According to previous experiences in reciprocal introgression line population construction [26], populations from interspecific cross in indica background tend to be more variated than that in japonica background and not for each environment, and y ijkl = µ + g i + e j + r/e k(j) + ge ij + ε ijkl easy to stabilize.Also, the phenotypic selection based on comparing to RP will largely affect the donor allelic frequency.Thus, we picked more lines with similar plant type and flowering time as MH63 during the selection of MH63-ILs, but selected more 02428-ILs largely varied from 02428 in agronomic traits except for the flowering time.Since this kind of artificial selection was conducted during the development of both BC 2 RIL populations, the RIL population was used for linkage map construction.Chromosome number and order of markers were anchored according to the physical map.MAP function in QTL IciMapping was adopted for estimating the genetic distance between markers.Recombination frequency was converted into map distance by the Kosambi mapping function.R package LinkageMapView was used for visualization of the linkage map [27].

QTL mapping for grain shape and milling quality traits in the two BC 2 RIL populations
The algorithm of inclusive composite interval mapping (ICIM) for the BC 2 RIL population implemented by the BIP function in QTL IciMapping [28] was used for QTL mapping of the five traits in different environments, i.e.GL, GW, BR, MR and HR.QTL mapping was conducted in each environment separately, as well as the BLUP values across environments.The REG method proposed by Alamin et al. [29] was also adopted for QTL mapping in each environment as a comparison, which only output position and LOD score of each scanning position.
As the two populations can also be viewed as CSSLs, we also used the RSTEP-LRT-ADD method in the CSL function of QTL IciMapping [30], which conducted QTL mapping in each environment separately as well as the means across environments.ICIM algorithm in MET function was adopted for QTL by [31].The LOD threshold was set at 2.5 for all the first three methods, and at 4 for MET as all environments were analyzed at the same time in MET.The scanning step was 0.5 cM.The two probabilities for entering and removing variables were set at 0.001 and 0.002, respectively.Comparison of QTL mapping results among different environments, among the four mapping methods, and among the five traits was then conducted.QTLs in different populations were considered to be common if the genetic positions were close enough.In other words, distance in linkage map was less than 20 cM in terms of QTL positions.In individual populations, QTLs were considered to be stable if they were identified in at least two the environments for at least one population.Twenty cM was also adopted as the minimum distance to identify pleiotropy QTLs among traits.Locus detected throughout two populations (genetic backgrounds) was important and regarded as background independent locus (BI locus).A tool named shinyCircos was used for the visualization of QTL positions on the linkage map [32].

Breeding simulation on pyramiding favorite alleles
Blib is a simulation platform for modelling, simulating, and predicting the genetic and breeding processes of different diploid species [33].Our target of breeding design was to improve HR and GL.QTL for the two traits detected by all the four methods were considered, and those QTLs exhibiting consistent effect directions across environments were kept.For pleiotropy QTLs, if their effects for the two traits were opposite, HR was adopted as the primary index trait for determining the target genotype.The individuals mostly close to the target genotype were selected as parents for developing simulated progenies.Three types of crosses were considered, i.e. single cross, top cross and double cross.For each cross, 1000 DH lines were generated by Blib, and proportion of target genotypes in these lines was counted.

Phenotypic evaluation
The descriptive statistics of grain shape and milling quality traits, i.e., GL, GW, BR, MR and HR of the two BC 2 RIL populations in different environments were shown in Table S1.The average values of the two parents, i.e., MH63 and 02428, in each environment were from 9.12 to 10.02 and 6.38 to 7.27 cm for GL, 2.59 to 2.98 and 3.30 to 3.56 cm for GW, 75.02 to 79.91 and 73.13 to 80.88% for BR, 64.06 to 69.01 and 57.68 to 68.48% for MR, 36.94 to 57.94 and 29.59 to 64.21% for HR.The average values of progenies in populations MH63-ILs and 02428-ILs in each environment were from 6.71 to 9.93 and 6.86 to 7.84 cm for GL, 2.41 to 3.02 and 3.20 to 3.48 cm for GW, 72.06 to 78.90% and 75.83 to 79.85% for BR, 62.01 to 72.62 and 61.27 to 67.67% for MR, 35.41 to 54.41 and 47.55 to 60.35% for HR.Obvious variations were observed in phenotypic data of parents and progenies in both populations, and so was the heritability among different environments (Table S1).
Table 1 shows the variance components and heritability in broad sense of the five traits across environments.Heritability of GL and GW was considerably high, i.e., over 0.90, and heritability of BR, MR and HR was a little lower, i.e., around 0.50 to 0.71.Pearson's correlation coefficient between the traits across environments is shown in Table 2.A significant test indicated that GL was negatively correlated with GW, and BR, MR and HR were positively correlated in both populations.HR was negatively correlated with GL, and BR was positively correlated with GW.Other correlations were insignificant or inconsistent in the two genetic backgrounds.

Linkage map constructed in the RIL population
The constructed linkage map is shown in Fig. 1, and general information on the linkage map is provided in Table 3.The whole genome spanned 1572.31cM in length, consisting of 12 chromosomes.The number of markers was 4833 and the number of unique map positions (denoted as bin) was 2448.Chromosome 5 was the longest one with the length of 200.12 cM, while chromosome 9 was the shortest with the length at 90.50 cM.The largest gap was also observed on chromosome 5 with the length at 18.44 cM.Chromosome 3 had the largest number of markers, i.e., 319 markers, and chromosome 8 had the smallest number of markers, i.e. 153 markers.The average distance between markers was 0.33 cM in the whole genome, and the average distance between bins was 0.65 cM.S4).
As we mentioned, the correlation between GL and HR was also significant, and so was the correlation between GW and BR.Common QTLs between GL and HR and those between GW and BR are shown in Fig. 5.There were 17, 26, 19 and 17 common QTLs between GL and HR from BIP, REG, CSL and MET, respectively.And there were 20, 29, 11 and 17 common QTLs between GW and BR from BIP, REG, CSL and MET, respectively.It can be concluded that higher correlation between traits leaded to more common QTLs for different traits, no matter which mapping method was used.Common QTLs were not only observed between traits belong to the same group, but also between traits from different groups.

Design of the target genotype based on the identified QTLs
Nineteen QTLs for GL and HR were detected by all the four methods and had consistent directions of additive effect across different environments.Eleven of them affected both GL and HR, five of which increased or decreased both traits, and the other five had different directions of effects for the two traits.For QTLs only affecting GL, the alleles increasing GL were set as the favorable one; for QTLs only affecting HR, the alleles increasing HR were set as the favorable one; for QTLs affecting both GL and HR, the alleles increasing HR were set as the favorable one.The target genotype at the 19 QTLs is given in Table 7. Four individuals from the two populations were picked up as parents in breeding design, i.e.MH63-ILs-166, 02428-ILs-185, MH63-ILs-65 and MH63-ILs-168, which had only 4, 5, 5 and 5 different QTL genotypes compared with the target genotype.Their genotypes were also given in Table 7.The predicted values of the selected lines with these QTLs under certain environments were also provided (Table S6).For the four selected individuals, we performed two single crosses, six top crosses and six double crosses to generate the target genotype, respectively, by using Blib.These crosses and the corresponding percentage of the designed genotype in the DH population were also provided in Table 8.The percentage of the designed genotype for SC1 was 0.0014, higher than that for SC2.Percentage for TC2 was 0.0022, which was the highest among the six top crosses.Percentage for DC2 was 0.0018, which was the highest among the six double crosses.TC2 and DC2 had the common three parents, but different cross design led to different genetic structures in the two populations, as well as different percentages of the designed genotype.

More reasonable mapping results based on linkage map
In previous mapping work with BC 2 RIL [23,24], we adopted physical map rather than linkage map for QTL identification.However, according to the underlying principles in methodology for popular QTL mapping packages, such as IciMapping [25], linkage map would be preferred.We compared our mapping results based on new map with same mapping method of BIP (Table S2) to our previous report [24].It's notable that some key loci, e.g.GL-3 is detected in more locations with larger LOD values.Despite this, for all the other grain shape loci, more stable detection with higher significance were found.For example, the mean LOD value was 12.3 (Table S2) now vs 9.1 in previous report [24] for GL.The same is true (11.7 vs 10.4) for GW.This set of BC 2 RIL lines with maps and seeds are now publicly available, which may provide useful materials and information for more effective genetic dissection of complex traits.This population would be open and accessible through following URL (https:// rfgb.rmbre eding.cn/ downl oad/ publi cData Downl oad) provided by the Functional Genomic Breeding (FGB) platform [34].This kind of population has previously already proven its value in genetic analyses for appearance quality [24] and may be useful for other quality traits, e.g.mineral concentration [35].

Comparison of QTLs detected in the two populations
As shown in Tables S2, S3, S4, and S5, there were some QTLs detected in both populations for the five traits by all the three QTL mapping methods.Stable QTLs for the two populations could be observed in Fig. 2.
For QTLs detected by BIP and CSL, reliability of these QTLs was higher than those stable in only one population, and followed by the unstable QTLs.As we presented before, directions of additive effects of these QTLs were mostly consistent for the two populations.However, values of additive effects varied significantly across populations and environments.Generally, if a QTL for the two populations was detected in the same environment, similar or larger effects was observed for MH63-ILs than that for 02428-ILs.For example, qGL.BIP-10.2 was detected in six environments for MH63-ILs and five environments for 02428-ILs (Table 4 and  S2).In environments SY, SZ, XZ, and BLUP, qGL.BIP-10.2 was detected for both populations.Its additive effect was larger in MH63-ILs than that in 02428-ILs except in SZ.The larger QTL effects may be related to the stricter selection on MH63-ILs.Further study is needed to prove this inference.

QTLs regions with multiple effects
As we mentioned above, some common QTLs were detected within and also between GS and MQ trait groups.As an example, a total of nine regions harboring these QTLs are located within an interval shorter than 20 cM (Table S7).Of these regions, three were located on chromosome 5, with an average size of 17.9 cM corresponding to about 1.6 Mb; two on chromosome 3 with an average size of 3.8 cM (2.4 Mb); other four with averaged size of 4.2 cM (0.5 Mb) on chromosomes 1, 2, 10, and 11 respectively.Especially, qGL.BIP-3.4which contains both the major effect QTL for grain length, GS3 [36] and qHR.BIP-3.2for head rice rate.It is evident that slender These QTLs may provide useful information in the future breeding of rice.

Application of detected QTLs in rice breeding
A large amount of QTL mapping studies have been conducted for various traits in rice in the last two decades, and how to utilize so many detected QTLs or genes becomes a challenge to the breeders.In this study, the simulation platform Blib has been adopted to design the target genotypes using identified QTLs.Two significantly related traits, GL and HR, were selected as examples to demonstrate the design process, which was quite important in plant breeding.According to the simulation results, the target genotype with larger HR and relatively longer GL could be obtained when 1000 DHs were generated from the cross between or among the selected individuals in the two mapping populations, no matter single cross, top cross, or double cross was adopted.
The two mapping populations in this study are generated by consecutive backcrossing, which are close to CSSL populations in two directions, the MH63-ILs using MH63 as the recurrent parent, and the 02428-ILs using 02428 as the recurrent parent.The individuals of MH63-ILs had higher genotypic similarity with the parent MH63, of which 91.30% were same as parent MH63, 7.08% were same as parent 02428, and 1.62% were missing.Genotypes of individuals in 02428-ILs were closer to parent 02428, among which 19.88% was same as parent MH63, 76.55% was the same as parent 02428, and 3.57% was missing.But more abundant genetic variations can be found in the two populations than the two parents.When the target genotype is defined in simulation study, it is easy to select suitable parents for crossing, which are in genotype at limited number of QTL.In addition, these parents are complementary, i.e. at least one of the parents was homogenous with target genotype.The development of mapping populations and simulation tools and procedures for the target genotype provides a potential way of utilizing of the detected QTLs.

Conclusions
A set of reciprocal introgression lines with a genetic map was provided for genetic dissection of complex traits.The procedure of selecting more suitable analytical methods for this kind of population were also presented by using MQ-GS trait correlations / genetic overlaps as example.Besides eight new loci, nine lociharboring regions for different traits and background independent loci were reported.Background independent (BI) loci were also found for each MQ and GS traits.All these information together with the simulation on breeding design may provide useful guidelines for rice molecular breeding.

Fig. 3
Fig. 3 Common QTLs among different QTL mapping methods

Fig. 4
Fig. 4 Common QTLs between grain shape traits and among milling quality traits

Table 2
Pearson's correlation coefficient between the five traits across environments in the two populations Since these QTLs were detected by QTL by environment interaction analysis instead of by single-environmental analysis, it is not suitable to determine if they were stably detected or not.To simplify the comparison of mapping results with the other two methods, we also regarded these QTLs as stable QTLs in this study.For GL, 14 QTLs were BI loci, i.e. qGL.qGL.

Table 3
Information on the linkage map constructed from the RIL population

Table 4
Stable QTLs detected by BIP GW grain width, BR brown rice rate, MR milled rice rate, HR head rice rate a Population b Environment c Position d Percentage of phenotypic variance explained e Additive effect B: Best linear unbiased prediction (BLUP)

Table 8
Percentage of the designed genotype in doubled haploid population derived from a single cross, top cross or double cross

Single cross (SC), Top cross (TC) or Double cross (DC)
4ableS2).For GL, a total of 60 reported QTLs in previous studies were detected in this study, of which 47, 42, 36, and 43 were identified by BIP, REG, CSL and MET, respectively.Twenty reported GL QTLs were detected by all the four mapping methods.For GW, 54 QTLs reported QTLs were detected, of which 32, 40, 27, and 34 were identified by BIP, REG, CSL and MET, respectively.Eighteen reported GW QTLs were detected by all the four mapping methods.For BR, 33 reported QTLs were detected, among which 16, 27, 11, and 9 were identified by BIP, REG, CSL and MET, respectively.Three reported BR QTLs were detected by all the four mapping methods.For MR, 31 reported QTLs were detected, of which 13, 29, 14, and 9 were identified by BIP, REG, CSL and MET, respectively.Five reported MR QTLs were detected by all four mapping methods.For HR, 36 reported QTLs were detected, of which 16, 29, 16, and 16 were identified by BIP, REG, CSL and MET, respectively.Six reported HR QTLs were detected by all four mapping methods.Comparison results showed the reliability of detected QTLs and the relative consistency of the four mapping methods in this study.There were eight novel QTLs detected by this study that were not reported in previous studies, and were stable in all the four mapping methods, i.e. qGL.BIP-5.4,qGL.BIP-8.2,qGL.BIP-10.1,qGW.BIP-1.4,qGW.BIP-3.1, qGW.BIP-12.3,qMR.BIP-2.1 and qHR.BIP-2.1.