- Research
- Open access
QTL mapping and candidate gene mining of seed size and seed weight in castor plant (Ricinus communis L.)
BMC Plant Biology volume 24, Article number: 885 (2024)
Abstract
Background
Castor (Ricinus communis L., 2n = 2x = 20) is an important industrial crop because its oil is essential to the global specialty chemical industry. Seed size and seed weight are fundamental determinants of castor yield, but little is known about their genetic basis. In this study, QTL analysis and candidate gene mining for castor seed size and seed weight were conducted using composite interval mapping (CIM), inclusive composite interval mapping (ICIM) and a marker enrichment strategy in 4 populations (F2, BC1, S1-1 and S1-2) derived from 2 accessions with significant phenotypic differences.
Results
In the primary QTL mapping, 2 novel QTL clusters were detected in marker intervals RCM520-RCM76 and RCM915-RCM950. To verify their accuracy and narrow their intervals, QTL remapping was carried out in populations F2 and BC1. In the F2 population, 44 and 30 QTLs underlying seed size and seed weight were detected using the CIM and ICIM-ADD methods respectively; 4–9 and 3–5 QTLs were identified for each trait, with a phenotypic variation explained ranging from 37.92 to 115.81% and from 32.86 to 45.98% respectively. The remapping results in the BC1 population were consistent with those in the F2 population. Importantly, 3 QTL clusters (QTL-cluster1, QTL-cluster2 and QTL-cluster3) were found in marker intervals RCM74-RCM76 (37.1 kb), RCM930-RCM950 (259.8 kb) and RCM918-RCM920 (172.9 kb) respectively. All of them were detected again: the first in the S1-2 population, and the latter two in both the S1-1 and S1-2 populations. Finally, 6 candidate genes (LOC8266555, LOC8281168, LOC8281151, LOC8259066, LOC8258591 and LOC8270077) were screened within these QTL clusters; they were differentially expressed in multiple seed tissues of both parents, suggesting a potential role in regulating seed size and seed weight.
Conclusion
The above results not only provide new insights into the genetic structure of seed size and seed weight in castor, but also lay the foundation for the functional identification of these candidate genes.
Introduction
Castor (Ricinus communis L.) is an important industrial crop whose seeds contain 46–55% oil [1]. Castor oil contains more than 85% ricinoleic acid, a hydroxy fatty acid with unique chemical properties, which makes it widely used in specialty chemicals, biodiesel, medicine and other applications [2, 3]. With rapid economic development, global demand for castor oil is steadily increasing [2]. Castor cultivation is the only commercial source of castor oil; however, the castor planting area has been decreasing for decades because of slow variety improvement and lagging genetic research. Breeding high-yielding cultivars is therefore urgently needed for the development of the castor industry.
Seed size and seed weight are key components of seed yield in flowering plants [4], but little is known about their genetic mechanisms in castor. To date, 29 QTLs underlying seed size and seed weight have been identified in castor, from which 15 genes were predicted [5, 6]. In addition, a few candidate genes have been screened out by multi-omics analysis and quantitative real-time polymerase chain reaction (qRT-PCR) [7, 8]. Nonetheless, few of these results have been applied in castor breeding.
A mature seed consists of the seed coat, endosperm and embryo, and final seed size and seed weight result from their coordinated growth. Various pathways are involved in this process, such as the ubiquitin-proteasome pathway, the G-protein signaling pathway, the mitogen-activated protein kinase (MAPK) pathway, the phytohormone signaling pathway, and the HAIKU (IKU) pathway, which is regulated by abscisic acid, brassinosteroids and cytokinin, as are many transcriptional regulators [4, 9,10,11]. Three points are crucial. Firstly, seed size and seed weight are mainly determined by cell expansion and/or division during organogenesis. In these processes, some genes negatively regulate seed size and seed weight by reducing cell number or cell size, such as GRAIN LENGTH AND AWN 1 (GLA1) [12], Squamosa promoter-binding-like protein 4 (SPL4) [13], Mitogen-activated protein kinase kinase 3 (MKK3) [14], SUPERNUMERARY BRACT (SNB) [15] and Novel Seed Size (NSS) [16], while others, such as OsAGO17 [17], miR156 and miR529 [18], positively control them by increasing cell size. Secondly, disruption of the endosperm cellularization stage affects seed size and seed weight; e.g., TERMINAL FLOWER1 (TFL1) [19] negatively mediates them by triggering early endosperm cellularization. Thirdly, variants of individual parental alleles can decrease seed size and seed weight, such as the maternally defective Vacuolar Invertase2 (VIN2) [20] and the parentally expressed dosage-effect defective1 (ded1) [21].
With the rapid development of sequencing technology and falling costs, high-throughput sequencing combined with bulked segregant analysis (BSA) has been widely used in gene mining, and a series of mapping methods have been built on it, such as QTL-seq, MutMap, BSR-seq, GradedPool-seq and QTG-seq [22]. These methods adapt better to QTL mapping across species and traits by improving the population signal-to-noise ratio and the data processing algorithms, so that QTLs can be located rapidly, but they generally cannot mine candidate genes directly. Fortunately, QTL mapping (e.g. QTL-seq) combined with transcriptome analysis has been used to mine genes quickly. Genes successfully identified with this strategy include those controlling seed iron and zinc content in soybean [23], capsaicinoid biosynthesis in pepper [24], the high-temperature stress response in tomato [25] and heading type in Chinese cabbage [26]. Marker enrichment has also proved to be a promising strategy for gene mining [22]. The confidence interval of a reliable QTL can be narrowed by enriching it with newly developed markers of different types [27,28,29]. Similarly, high-quality mapping within a trustworthy QTL can be used to mine the target gene [30].
In this study, QTL mapping and candidate gene mining of seed size and seed weight were performed in populations F2, BC1, S1-1 and S1-2 using composite interval mapping (CIM), inclusive composite interval mapping (ICIM) and a marker enrichment strategy, with the aim of facilitating high-yield breeding in castor.
Materials and methods
Materials
9048 is the female parent of Zibi 5, a main cultivar in China, and has a tall and compact plant architecture, stout stalks, large spikes and a medium seed setting rate. 16–201, collected from Southern China by the castor research group of Guangdong Ocean University in 2016, is a monoecious line with agronomic traits almost complementary to those of 9048 (i.e., dwarf plant type, thin and tough stalks, multiple scattered branches and a high seed setting rate, but small spikes and capsules). Using 9048 (P1) and 16–201 (P2) as parents makes it easy to obtain mapping populations with large phenotypic differences, from which superior materials can also be selected. Populations F1, F2, BC1 (F1 backcrossed with P2) and S1, derived from these 2 accessions with significant differences in seed size and seed weight (Fig. 1, Fig. S1), were used in this study (Fig. 2). The S1 population was created by mixing seeds harvested from F2 individuals that had pollinated freely within the population under isolated conditions; 8 seeds were taken from each individual (Fig. 2). In Sep. 2020, populations P1, P2, F1, F2 and BC1 were planted at the experimental base of Guangdong Ocean University in Huguang, Zhanjiang, Guangdong, China, with population sizes of 25, 25, 25, 282 and 250 respectively. In Sep. 2021, populations P1, P2, F1 and S1 were planted in Mazhang, Zhanjiang, Guangdong, China, with population sizes of 22, 22, 22 and 1612 respectively. During the planting season, the weather in Zhanjiang was relatively mild, with average daily temperatures between 17°C and 23°C and average total precipitation of 39 mm (http://weather.zuzuche.com/c307_winter.html). Plant and row spacing were both 1 m. Cultivation management followed that used for high-yield fields.
S1-1 population was created by selecting individuals with the largest capsule (116 individuals) and the smallest capsule (124 individuals) respectively in primary spike from S1 population; S1-2 population was formed by 210 individuals randomly selected from S1 population (Fig. 2).
Phenotypic investigation
Seeds were harvested separately from the primary spike and the primary branch spike of each individual (Fig. S2). Five seeds were randomly chosen from the primary spike and the 3 dimensions of each were measured with a digital caliper; the averages over the 5 seeds were used as the seed length of primary spike (PSSL), seed width of primary spike (PSSW) and seed thickness of primary spike (PSST), respectively. The same procedure was followed for the primary branch spike to obtain the seed length of primary branch spike (PBSSL), seed width of primary branch spike (PBSSW) and seed thickness of primary branch spike (PBSST). The weight of 100 seeds randomly selected from the primary spike and from the primary branch spike was recorded as the hundred-seed weight of primary spike (PSHSW) and the hundred-seed weight of primary branch spike (PBSHSW) respectively, with 3 replicates. Descriptive statistics and Student’s t tests were computed with SPSS 25 and Excel 2010.
DNA extraction, genotyping and genetic map construction
Genomic DNA was extracted using the modified CTAB method [31]. A total of 566 SSR primer pairs, uniformly distributed across the castor genome and giving clear and stable bands, were selected from 1750 SSR primer pairs developed by the castor research group of Guangdong Ocean University [32, 33] and used in this study. Segregating populations were genotyped by polymerase chain reaction (PCR). A 10 µl PCR reaction system was used, comprising 0.5 µl of each SSR primer, 4 µl of 2×PCR Mix, 4 µl of ddH2O and 1 µl of DNA at 30 ng µl−1. PCR reactions and the visualization of PCR products followed the procedure described by Yeboah et al. [34].
Polymorphic primer screening and genetic map construction were carried out according to the procedures described by Huang et al. [35].
QTL analysis
Single-locus QTLs were mapped with the CIM method in WinQTLCart v2.5 and the ICIM-ADD method in QTLIcimapping v4.2, with a LOD threshold of 2.0. Epistatic QTLs were identified with the ICIM-EPI method in QTLIcimapping v4.2, with a LOD threshold of 5.0. Confidence intervals for all QTLs were determined at the 95% level. Adjacent QTLs with overlapping confidence intervals or with positions within 5 cM of each other were regarded as the same QTL [23, 36]. QTLs conferring the same trait that were detected simultaneously in different populations were defined as stable QTLs, those shared by more than 1 trait as co-located QTLs, and those with a phenotypic variation explained (PVE) over 10% as main-effect QTLs. A QTL name starts with “q”, followed in turn by the trait abbreviation, the chromosome serial number and the QTL serial number on that chromosome [37]. To distinguish QTLs identified in different populations, the capital letters “F” and “B” were prefixed to QTLs detected in populations F2 and BC1 respectively, whereas stable QTLs carry no prefix [36]. Co-located QTLs were named as “q + Co-locatedQTL + chromosome serial number + QTL serial number”.
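The naming and classification rules above can be sketched in a few lines of code. This is only an illustration of the stated rules, not the procedure implemented in WinQTLCart or QTLIcimapping; the `QTL` record, its field names and the example values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QTL:
    """Hypothetical record for one detected QTL (field names are assumptions)."""
    trait: str       # trait abbreviation, e.g. "PSHSW"
    chrom: int       # chromosome (linkage group) serial number
    serial: int      # QTL serial number on that chromosome
    position: float  # peak position in cM
    ci_left: float   # left bound of the 95% confidence interval, cM
    ci_right: float  # right bound of the 95% confidence interval, cM
    pve: float       # phenotypic variation explained, in percent


def qtl_name(qtl: QTL, population: Optional[str] = None) -> str:
    """Build "q" + trait + chromosome + "." + serial, prefixed by F (F2) or B (BC1).

    Stable QTLs are named without a population, so they get no prefix.
    """
    prefix = {"F2": "F", "BC1": "B"}.get(population, "")
    return f"{prefix}q{qtl.trait}{qtl.chrom}.{qtl.serial}"


def same_qtl(a: QTL, b: QTL, max_gap_cm: float = 5.0) -> bool:
    """Treat two QTLs as one if their CIs overlap or their peaks lie within 5 cM."""
    if a.chrom != b.chrom:
        return False
    overlap = a.ci_left <= b.ci_right and b.ci_left <= a.ci_right
    close = abs(a.position - b.position) <= max_gap_cm
    return overlap or close


def is_main_effect(qtl: QTL, threshold: float = 10.0) -> bool:
    """A main-effect QTL has a PVE over 10%."""
    return qtl.pve > threshold


# Hypothetical example values, for illustration only.
a = QTL("PSHSW", 3, 1, position=10.0, ci_left=8.0, ci_right=12.0, pve=15.0)
b = QTL("PSHSW", 3, 1, position=13.0, ci_left=12.5, ci_right=14.0, pve=8.0)
print(qtl_name(a, "F2"))   # FqPSHSW3.1
print(same_qtl(a, b))      # True: the peaks are 3 cM apart
print(is_main_effect(b))   # False: a PVE of 8% is not over 10%
```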
Verification of QTL cluster
Based on the results of the primary QTL mapping and the QTL remapping, novel SSR markers were developed and added within the flanking marker intervals of the QTL clusters. The QTLs conferring PSHSW, PSSL, PSSW and PSST were then verified within these intervals in populations S1-1 and S1-2, using the method described in the “QTL analysis” section [28, 29]. The genetic maps of populations S1-1 and S1-2 were constructed with default parameters under the F2 population model, since their gene and genotype frequencies were identical to those of the F2 population.
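The statement about gene and genotype frequencies can be made explicit with a short Hardy-Weinberg calculation. This is a sketch under idealized assumptions that the text does not spell out: both parents are homozygous at the locus, the F2 plants mate at random, and there is no selection or segregation distortion (so the phenotypic selection used to form S1-1 is ignored). The symbols p, q and f(·) are introduced here for illustration only.

```latex
\begin{align*}
\text{F}_2:\quad & f(AA) = \tfrac{1}{4},\quad f(Aa) = \tfrac{1}{2},\quad f(aa) = \tfrac{1}{4} \\
& p = f(AA) + \tfrac{1}{2} f(Aa) = \tfrac{1}{2},\qquad q = 1 - p = \tfrac{1}{2} \\
\text{S}_1:\quad & f'(AA) = p^2 = \tfrac{1}{4},\quad f'(Aa) = 2pq = \tfrac{1}{2},\quad f'(aa) = q^2 = \tfrac{1}{4}
\end{align*}
```

Under these assumptions a codominant marker segregates 1:2:1 in both generations, which is the ratio the F2 population model expects.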
Candidate gene prediction and expression analysis
All open reading frames (ORFs) covered by the flanking marker intervals of the QTL clusters were retrieved by browsing the castor genome ASM1957865v1 (BioProject: PRJNA589181) with IGB v9.1.8 [38]. The retrieved ORFs were functionally annotated with the Kobas 3.0 online tool (http://bioinfo.org/kobas). Because the identified QTL clusters contained few ORFs, candidate genes were selected directly from descriptions in the available literature. In brief, a retrieved ORF was selected as a candidate gene if its translated protein (according to the gene annotation) had been reported to regulate seed size or seed weight, directly or indirectly, in other plants.
The expression patterns of the candidate genes were further tested by qRT-PCR. A total of 10 samples were collected: the seed at 14 days after pollination (DAP), seed coat and endosperm at 30 DAP, and seed coat and endosperm at 45 DAP from 16–201 (small seed), and the corresponding seed at 14 DAP, seed coat and endosperm at 38 DAP, and seed coat and endosperm at 55 DAP from 9048 (large seed), representing the early, middle and late stages of seed development respectively [7, 39]. Total RNA was extracted using the RNAprep Pure Plant Plus Kit (Cat No. DP441, TIANGEN, Beijing, China) following the manufacturer’s protocol. cDNA was synthesized from 100 ng of total RNA using the PrimeScript™ RT reagent kit (Cat No. RR037A, TAKARA, Japan). Primers for each gene were designed with Primer Premier 5.0 (Supplementary Table S1). qRT-PCR was performed on an Applied Biosystems StepOnePlus fluorescence quantitative PCR system (ViiA7, USA) using Hieff™ qPCR SYBR Green Master Mix (No Rox) (Cat No. 11201ES08, YEASEN, Shanghai, China). A two-step procedure was used, with the following thermal cycling conditions: 95°C for 5 min, followed by 40 cycles of 95°C for 10 s and 60°C for 30 s. The internal reference gene 16S rRNA (XM_048375175.1) was amplified in parallel with each candidate gene in 3 technical replicates. The relative expression levels of the genes were calculated using the 2^−ΔΔCt method [40].
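For readers unfamiliar with the 2^−ΔΔCt calculation, the following minimal sketch shows the arithmetic. The Ct values are invented for illustration, the function names are hypothetical, and the choice of calibrator sample is an assumption, since the text does not state which sample was used as the calibrator.

```python
from statistics import mean


def delta_ct(target_cts, reference_cts):
    """ΔCt = mean Ct of the target gene minus mean Ct of the reference gene.

    Each argument is a list of Ct values from technical replicates.
    """
    return mean(target_cts) - mean(reference_cts)


def relative_expression(sample_dct, calibrator_dct):
    """Relative expression = 2^-(ΔCt_sample - ΔCt_calibrator)."""
    ddct = sample_dct - calibrator_dct
    return 2.0 ** (-ddct)


# Hypothetical Ct values from 3 technical replicates (not data from this study).
sample_dct = delta_ct([22.0, 22.0, 22.0], [18.0, 18.0, 18.0])      # ΔCt = 4.0
calibrator_dct = delta_ct([24.0, 24.0, 24.0], [18.0, 18.0, 18.0])  # ΔCt = 6.0

fold = relative_expression(sample_dct, calibrator_dct)
print(fold)  # 4.0: ΔΔCt = -2, so the target is 4-fold higher in the sample
```

A fold value of 1.0 means equal expression in sample and calibrator; values above 1.0 mean higher expression in the sample.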
Results
Phenotype analysis
The seed size and seed weight of P1 (9048) were significantly higher than those of P2 (16–201) (p < 0.01) on both the primary spike and the branch spike (Fig. 1, Table 1, Fig. S1). In the F2 population, all 8 traits (i.e., PSHSW, PBSHSW, PSSL, PSSW, PSST, PBSSL, PBSSW and PBSST) showed slightly skewed, continuous distributions (skewness between 0.58 and 0.81) with a flattened shape (kurtosis between −0.05 and 0.76), and their ranges were 12.20–52.23 g, 4.19–42.50 g, 9.51–17.20 mm, 6.31–9.84 mm, 4.71–7.27 mm, 9.12–15.41 mm, 5.97–10.29 mm and 4.23–6.88 mm respectively. Compared with the F2 population, the traits in the BC1 population showed more clearly skewed, continuous distributions (skewness between 1.56 and 2.46) with a sharper shape (kurtosis between 3.67 and 8.97), and their ranges were 6.90–65.22 g, 10.03–42.85 g, 9.41–17.17 mm, 6.05–10.63 mm, 4.57–7.64 mm, 9.03–15.41 mm, 5.93–10.29 mm and 4.45–6.88 mm respectively (Fig. 3, Table 1). These results imply the existence of major genes controlling seed size and seed weight [41]. A significant positive correlation (p < 0.01) existed among the 8 traits in populations F2 and BC1, with correlation coefficients above 0.69 (Fig. 3i).
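As a reminder of what the skewness and kurtosis values measure, the sketch below computes both from a list of trait values using the simple moment-based definitions. This is an assumption-laden illustration: statistical packages such as SPSS typically report bias-corrected estimators, so their output can differ slightly from these formulas, and the example data are made up.

```python
def skewness_and_excess_kurtosis(values):
    """Return (skewness, excess kurtosis) from central moments.

    skewness        = m3 / m2**1.5  (positive: long tail toward large values)
    excess kurtosis = m4 / m2**2 - 3  (positive: sharper peak than a normal curve)
    """
    n = len(values)
    avg = sum(values) / n
    m2 = sum((v - avg) ** 2 for v in values) / n
    m3 = sum((v - avg) ** 3 for v in values) / n
    m4 = sum((v - avg) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


# Made-up data: a symmetric sample and a sample with a long right tail.
symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
right_tailed = [1.0, 1.0, 1.0, 1.0, 10.0]

print(skewness_and_excess_kurtosis(symmetric))     # skewness is 0 for symmetric data
print(skewness_and_excess_kurtosis(right_tailed))  # skewness is positive
```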
Genetic map construction
To limit costs at the initial stage, 340 primer pairs were randomly selected from the 566 chosen primer pairs for the primary QTL mapping. The genetic maps constructed with populations F2 and BC1 contained 47 and 19 polymorphic SSR markers in 9 and 4 linkage groups, with an average marker interval of 11.63 and 14.24 cM and a LOD value of 6.0 for both, and covered 546.75 and 270.59 cM of the genome respectively (Fig. S3).
Since novel QTLs conferring seed size and seed weight were found in the primary QTL mapping, all 566 primer pairs were used for QTL remapping. The resulting genetic maps for populations F2 and BC1 contained 63 and 30 polymorphic markers in 10 and 5 linkage groups, with an average marker interval of 10.21 and 10.31 cM and a LOD value of 8.0 and 9.0, and covered 643.36 and 309.22 cM of the genome respectively.
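The reported average marker intervals are reproduced by dividing each map length by its marker count, as the short check below shows. The map lengths and marker counts are taken from the text; the dictionary layout and function name are illustrative choices.

```python
def average_marker_interval(map_length_cm, n_markers):
    """Average marker interval in cM, taken here as map length / marker count."""
    return map_length_cm / n_markers


# (map length in cM, number of polymorphic markers), as reported in the text.
maps = {
    ("primary", "F2"): (546.75, 47),
    ("primary", "BC1"): (270.59, 19),
    ("remapping", "F2"): (643.36, 63),
    ("remapping", "BC1"): (309.22, 30),
}

for (stage, population), (length, markers) in maps.items():
    interval = average_marker_interval(length, markers)
    print(f"{stage:9s} {population:3s} {interval:.2f} cM")
```

Adding the remaining primer pairs raised the marker count in both populations and shortened the average interval, most clearly in BC1 (from 14.24 to 10.31 cM).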
QTL primary mapping
In the primary mapping, 1–5 QTLs underlying each trait were detected in populations F2 and BC1, with a PVE of 6.65–50.16%. The PVE of a single QTL ranged from 0.01 to 50.16% (Fig. S4, Supplementary Tables S2, S3 and S4). Importantly, 2 QTL clusters were identified in marker intervals RCM520-RCM76 (18,260 kb) and RCM915-RCM950 (1,004 kb) respectively, each containing 8 QTLs (including 7 main-effect QTLs) (Fig. S5).
Remapping with methods CIM and ICIM-ADD
After marker enrichment, a total of 44 and 15 QTLs conferring seed size and seed weight were detected by the CIM method in populations F2 and BC1 respectively (Fig. 4, Tables 2 and 3). They were distributed on linkage groups 3, 5, 6, 9 and 10, with a PVE ranging from 37.92 to 115.81%. Among them, FqPSHSW3.3, FqPBSHSW3.2 and FqPBSSL3.3 were regarded as ghost QTLs of qPSHSW3.2, qPBSHSW3.3 and qPBSSL3.2 respectively, mainly because each pair was located in adjacent confidence intervals with the same mode of action and similar effect values, which pushed the PVE above 100%. Because the latter 3 were stable QTLs, the former 3 were eliminated.
In the F2 population, 5, 5, 4, 9, 5, 6, 4 and 6 QTLs conferring PSHSW, PBSHSW, PSSL, PSSW, PSST, PBSSL, PBSSW and PBSST respectively were identified, with a PVE of 5.48–59.28%, 4.90–77.97%, 2.03–25.83%, 0.15–28.13%, 3.94–28.02%, 2.22–47.11%, 1.84–22.55% and 2.72–25.94% respectively. Unexpectedly, 3, 2, 3, 1, 3, 4, 2 and 3 main-effect QTLs underlying these traits were found on linkage group 3, where they tended to cluster. In the BC1 population, 2, 1, 2, 3, 2, 2, 1 and 2 QTLs conferring PSHSW, PBSHSW, PSSL, PSSW, PSST, PBSSL, PBSSW and PBSST respectively were detected, with a PVE of 5.04–12.94%, 14.82%, 4.85–18.28%, 5.08–11.12%, 15.22–16.07%, 6.75–21.72%, 15.33% and 3.52–18.74% respectively. Similarly, at least 1 main-effect QTL conferring each trait was located on linkage group 3.
Thirteen stable QTLs were simultaneously found in populations F2 and BC1, including qPSHSW3.1, qPSHSW3.2, qPBSHSW3.3, qPSSL3.1, qPSSL3.2, qPSSW3.1, qPSSW3.2, qPSSW3.3, qPSST3.3, qPBSSL3.1, qPBSSL3.2, qPBSSW3.2 and qPBSST3.2. They each shared the same action mode, close effect value and common flanking markers between populations F2 and BC1.
After marker enrichment, a total of 30 and 12 QTLs were detected by the ICIM-ADD method in populations F2 and BC1 respectively (Fig. 4, Tables 3 and 4). They were distributed on linkage groups 3, 5, 6 and 10, with a PVE ranging from 12.84 to 45.98%. The vast majority of these QTLs were detected by both the CIM and ICIM-ADD methods.
In the F2 population, 5, 4, 4, 3, 4, 3 and 3 QTLs conferring PSHSW, PBSHSW, PSSL, PSSW, PSST, PBSSL, PBSSW and PBSST respectively were detected, with a PVE of 3.45–16.37%, 2.40–17.35%, 2.86–21.05%, 2.64–22.06%, 4.35–22.70%, 4.84–19.03%, 2.84–16.98% and 4.76–18.54% respectively. Four QTLs (FqPSSL6.1, FqPSSW5.1, FqPBSSW5.1 and FqPBSSW3.3) were detected only with the ICIM-ADD method. In the BC1 population, 1, 1, 2, 1, 1, 2, 2 and 2 QTLs conferring PSHSW, PBSHSW, PSSL, PSSW, PSST, PBSSL, PBSSW and PBSST respectively were identified, with a PVE of 15.08%, 16.31%, 3.43–20.78%, 12.84%, 15.86%, 4.54–25.71%, 3.37–15.87% and 3.82–20.83% respectively. Only BqPBSSW6.1 was detected solely with the ICIM-ADD method.
Seven stable QTLs (qPSHSW3.1, qPSHSW3.2, qPSSL3.1, qPSSL3.2, qPSSW3.1, qPBSSL3.1 and qPBSSL3.2) were simultaneously found in populations F2 and BC1, with similar effects and common flanking markers, which was consistent with the results identified with the CIM method.
Co-located QTL
Ten co-located QTLs shared by 2–8 traits were found in populations F2 and BC1, namely qCo-locatedQTL3.1, qCo-locatedQTL3.2, qCo-locatedQTL3.3, qCo-locatedQTL3.4, qCo-locatedQTL3.5, qCo-locatedQTL5.1, qCo-locatedQTL6.1, qCo-locatedQTL6.2, qCo-locatedQTL6.3 and qCo-locatedQTL6.4 (Fig. 4, Supplementary Table S5). This indicates that pleiotropy or close linkage between genes was common among the 8 traits, which is the genetic basis of the significant correlation between seed size and seed weight.
QTL clusters
After all the QTLs were mapped to the castor genome ASM1957865v1, 10 QTLs (including 9 main-effect QTLs), 16 QTLs (including 13 main-effect QTLs) and 11 QTLs (including 8 main-effect QTLs) were found to be located within 3 marker intervals on chromosome 3, i.e., RCM74-RCM76 (37.1 kb), RCM931-RCM950 (155.3 kb) and RCM917-RCM920 (189.3 kb) (Fig. S6), named as QTL-cluster1, QTL-cluster2 and QTL-cluster3 respectively. Significantly, the favorable alleles of all QTLs making up the 3 QTL clusters were provided by 9048 (Tables 2 and 4).
Epistatic QTL analysis
In the F2 population, 3 pairs of epistatic QTLs were detected (Fig. S7, Supplementary Table S6). Of these, 2 pairs conferred PSHSW and 1 pair conferred PBSHSW, with a PVE of 6.46–18.16% and 23.49% respectively. No epistatic QTL underlying seed size was detected.
In the BC1 population, 10 pairs of epistatic QTLs were detected: 2 pairs each underlying PSHSW, PBSHSW, PSST, PBSSL and PBSSW, with a PVE of 0.55–0.74%, 0.82%, 0.84–0.94%, 0.75–0.84% and 1.22–1.36% respectively. No epistatic QTL underlying PSSL, PSSW or PBSST was identified.
Epistasis was an important genetic component of seed weight but had little effect on seed size (Fig. S7).
Verification of QTL clusters
QTL-cluster1, QTL-cluster2 and QTL-cluster3 were all detected again in the S1 population (Fig. 5, Supplementary Tables S7 and S8). QTL-cluster1 was identified in the S1-2 population, while QTL-cluster2 and QTL-cluster3 were identified in both the S1-1 and S1-2 populations. The marker interval of QTL-cluster1 remained RCM74-RCM76 (37.1 kb). The marker interval of QTL-cluster2 changed from RCM931-RCM950 (155.3 kb) to RCM930-RCM950 (259.8 kb), mainly to ensure that the causal genes were not missed, and that of QTL-cluster3 changed from RCM917-RCM920 (189.3 kb) to RCM918-RCM920 (172.9 kb). Despite these changes, the existence of the main-effect QTL clusters was confirmed.
Prediction and expression of candidate gene
Three, 37 and 18 ORFs were found within QTL-cluster1, QTL-cluster2 and QTL-cluster3 respectively (Supplementary Table S9). Of these, 6 were annotated as possible candidate genes. LOC8266555 in QTL-cluster1 and LOC8281151 in QTL-cluster2 were annotated as preprotein translocase subunit SCY1, chloroplastic (SCY1) and protein DEEPER ROOTING 1 (DRO1) respectively, which were predicted to regulate seed filling [42,43,44,45]; LOC8258591 and LOC8270077 in QTL-cluster3 were annotated as LRR receptor-like serine-/threonine-protein kinase ERECTA (ER) and SCF E3 ubiquitin ligase complex F-box protein grrA (grrA) respectively, which were predicted to control seed size [4, 46]; LOC8259066 and LOC8281168 in QTL-cluster2 were annotated as RING-box protein 1a (Rbx1) and auxin response factor 19 (ARF19) respectively, which were predicted to mediate normal development of the embryo [47, 48] and to control seed size and seed weight [49] respectively.
The 6 candidate genes were differentially expressed in multiple seed tissues between the parents (Fig. 6). LOC8266555, LOC8281168, LOC8281151, LOC8259066, LOC8258591 and LOC8270077 showed the highest expression in seed coat (at middle stage), seed (at early stage), seed coat (at late stage), endosperm (at middle stage), seed (at early stage) and endosperm (at middle stage) respectively; the relative expression in the high-expressing parent was 1.01–1.87, 1.15–19.90, 1.16–13.84, 1.08–2.10, 1.01–15.05 and 1.09–1.49 times that in the low-expressing parent respectively. In most cases, each candidate gene was differentially expressed across seed tissues or developmental stages.
Discussion
Genetic structure of seed size and seed weight
Castor varieties are still bred mainly by traditional methods because tightly linked genetic markers and information on the genetic mechanism remain scarce, although dozens of QTLs conferring seed size and seed weight have been reported [5, 6]. In this study, 5–10 QTLs underlying each trait were detected (Tables 2, 3 and 4). Among them, FqPSSW10.1, FqPSSW10.2, FqPSSW10.3 and FqPBSSL9.1 overlapped with QTLs detected by Yu et al. [5] and Yang et al. [6], and the others were novel QTLs conferring seed size and seed weight in castor. Importantly, 2–4 main-effect QTLs with stable and tightly linked flanking markers were identified for each trait, which can help accelerate castor breeding through molecular marker-assisted selection. Additionally, PSHSW showed a multi-peaked continuous distribution and PBSHSW a skewed continuous distribution in both the F2 and BC1 populations (Fig. 3a and e); seed size (PSSL, PSSW, PSST, PBSSL, PBSSW and PBSST) generally showed a multi-peaked continuous distribution in the F2 population but a mostly skewed continuous distribution in the BC1 population (Fig. 3b-d and f-h). These results imply that all 8 traits are jointly controlled by major genes and minor-effect polygenes [41], which is consistent with the QTL mapping results.
Epistatic effects on seed weight (PSHSW and PBSHSW) accounted for about 30% of the total contribution rate and were an important genetic component, whereas epistatic effects on seed size accounted for little or none of the total contribution rate and had little impact on its inheritance (Fig. S7). All traits showed a positively skewed distribution in the BC1 population (Fig. 3), with the population means biased toward 16–201 (Table 1), which was caused by the negative partial dominant inheritance of seed size and seed weight (Tables 2 and 4).
Main-effect QTL conferring seed size and seed weight
Main-effect QTLs mapped simultaneously by different methods are preferable because they are more reliable and easier to manipulate genetically [23]. In this study, 26 (F2) and 11 (BC1) QTLs were detected by both the CIM and ICIM-ADD methods with similar single-locus effects (Fig. 4, Tables 2, 3 and 4). Most of them not only had a contribution rate over 10% but were also stable QTLs identified in both the F2 and BC1 populations. They were clustered in 3 marker intervals (i.e., RCM74-RCM76, RCM930-RCM950 and RCM918-RCM920) lying adjacent to each other on chromosome 3, with confidence interval lengths of 37.1–259.8 kb (Fig. 5). This clustering is mainly attributable to the significant correlation between seed size and seed weight (Fig. 3i) [50]. In fact, these 3 QTL clusters contained all main-effect QTLs except BqPSST6.1, explained about half of the total phenotypic variation for each trait, and were found in 3–4 populations (Fig. 5). These results suggest a potential role for the 3 QTL clusters in regulating seed size and seed weight in castor and indicate that they are suitable targets for genetic engineering in breeding.
Marker enrichment facilitates candidate gene mining
A low-cost, easy-to-operate and effective strategy is the key to mining genes. In this study, 2 QTL clusters in marker intervals RCM520-RCM76 and RCM915-RCM950 were selected based on the primary QTL mapping; SSR marker enrichment was then performed within their flanking marker intervals, which identified 3 main-effect QTL clusters with confidence interval lengths of no more than 259.8 kb (Fig. 5). These 3 main-effect QTL clusters were verified in populations S1-1 and S1-2. Similarly, enriching markers within main-effect QTLs narrowed their confidence intervals to below 220 kb in sesame [28] and peanut [29]. The most important step in narrowing the confidence interval of a main-effect QTL is to insert markers closer to the target gene into the constructed genetic map. This can be achieved by increasing the mapping population size [4] and/or developing new markers (e.g. SNP, SSR, InDel) [28, 29]. The former aims to break the tight linkage between gene and marker, while the latter aims to increase the number of linked markers. These results show that primary QTL mapping combined with marker enrichment is a viable and cost-effective candidate gene mining strategy for big-branched crops such as castor.
Predicted candidate genes and their expression pattern
A total of 6 candidate genes were predicted, namely LOC8266555 (SCY1), LOC8281168 (Rbx1), LOC8281151 (DRO1), LOC8259066 (ARF19), LOC8258591 (ER) and LOC8270077 (grrA). LOC8266555 controls normal greening during embryogenesis [44], LOC8258591 is involved in the ubiquitin-proteasome pathway [4], LOC8259066 and LOC8281168 are involved in the auxin pathway [47,48,49], LOC8281151 mediates root growth angle [42, 45] and LOC8270077 enhances seed coat cell proliferation [46]. These pathways have been reported to regulate seed development. The temporal and spatial expression of the 6 candidate genes differed significantly between the parents, especially for LOC8281168, LOC8281151 and LOC8258591, whose expression in the high-expressing parent was over 10 times that in the low-expressing parent (Fig. 6).
The development of the seed coat and endosperm is dynamic, continuous and asynchronous in castor [8]. The genes controlling them are abundantly expressed in the endosperm at the middle stage of seed development (seed-filling stage) [7, 39]. Interestingly, most of the candidate genes (LOC8281168, LOC8259066, LOC8258591 and LOC8270077) were abundantly expressed in the endosperm of 9048, whereas LOC8266555, LOC8281151, LOC8258591 and LOC8270077 were abundantly expressed in the seed coat of 16–201 (Fig. 6). Notably, 9048 has a higher seed weight while 16–201 has a harder seed coat. We therefore hypothesize that favorable alleles from 9048 increase seed weight (i.e., enhance the seed-filling rate to make the seed plump), while alleles from 16–201 cause excessive seed coat lignin content, which makes the seed coat harder [8]. Aggregating enough favorable alleles (e.g., LOC8281168, LOC8259066, LOC8258591 and LOC8270077 from 9048) controlling seed size and seed weight may help maintain a high seed setting rate when seed number and/or seed size is increased, which in turn improves yield.
Conclusion
Dozens of QTLs and 6 candidate genes conferring castor seed size and seed weight were identified in 4 segregating populations (F2, BC1, S1-1 and S1-2). The results suggest that LOC8266555, LOC8281168, LOC8281151, LOC8259066, LOC8258591 and LOC8270077 may be key genes controlling seed development in castor. Transgenic experiments are therefore needed to test whether these 6 candidate genes affect seed development and to explore their functional mechanisms, which will greatly benefit the breeding of superior cultivars or accessions.
Availability of data and materials
The data that support the findings of this study are available from the corresponding author Xuegui Yin on reasonable request.
Abbreviations
- QTL: Quantitative trait locus
- CIM: Composite interval mapping
- ICIM: Inclusive composite interval mapping
- PVE: Phenotypic variation explained
- PCR: Polymerase chain reaction
- qRT-PCR: Quantitative real-time polymerase chain reaction
- ORFs: Open reading frames
- PSHSW: Hundred-seed weight of primary spike
- PBSHSW: Hundred-seed weight of primary branch spike
- PSSL: Seed length of primary spike
- PSSW: Seed width of primary spike
- PSST: Seed thickness of primary spike
- PBSSL: Seed length of primary branch spike
- PBSSW: Seed width of primary branch spike
- PBSST: Seed thickness of primary branch spike
- LG: Linkage group
- Add.: Additive effect
- Dom.: Dominance effect
- CI: Confidence interval
- MI: Marker interval
References
Ogunniyi DS. Castor oil: a vital industrial raw material. Bioresour Technol. 2006;97(9):1086–91.
Carrino L, Visconti D, Fiorentino N, Fagnano M. Biofuel production with castor bean: a win-win strategy for marginal land. Agronomy. 2020;10(11):1690.
Osorio-González CS, Gómez-Falcon N, Sandoval-Salas F, Saini R, Brar SK, Ramírez AA. Production of biodiesel from castor oil: a review. Energies. 2020;13(10):2467.
Li N, Xu R, Li Y. Molecular networks of seed size control in plants. Annu Rev Plant Biol. 2019;70(1):435–63.
Yu A, Li F, Xu W, Wang Z, Sun C, Han B, et al. Application of a high-resolution genetic map for chromosome-scale genome assembly and fine QTLs mapping of seed size and weight traits in castor bean. Sci Rep. 2019;9(1):11911–50.
Yang J, Wang Z, Qiao L, Wang Y, Zhao Y, Zhang H, et al. QTL mapping of seed size traits based on a high-density genetic map in castor. Acta Agron Sin. 2023;49:719–30. (in Chinese).
Yu A, Li F, Liu A. Comparative proteomic and transcriptomic analyses provide new insight into the formation of seed size in castor bean. BMC Plant Biol. 2020;20(1):48.
Yu A, Wang Z, Zhang Y, Li F, Liu A. Global gene expression of seed coat tissues reveals a potential mechanism of regulating seed size formation in castor bean. Int J Mol Sci. 2019;20(6):1282.
Liu Z, Mei E, Tian X, He M, Tang J, Xu M, et al. OsMKKK70 regulates grain size and leaf angle in rice through the OsMKK4-OsMAPK6‐OsWRKY53 signaling pathway. J Integr Plant Biol. 2021;63(12):2043–57.
Xiao W, Hu S, Zou X, Cai R, Liao R, Lin X, et al. Lectin receptor-like kinase LecRK-VIII.2 is a missing link in MAPK signaling-mediated yield control. Plant Physiol. 2021;187(1):303–20.
Zhan P, Wei X, Xiao Z, Wang X, Ma S, Lin S, et al. GW10, a member of P450 subfamily regulates grain size and grain number in rice. Theor Appl Genet. 2021;134(12):3941–50.
Wang T, Zou T, He Z, Yuan G, Luo T, Zhu J, et al. GRAIN LENGTH AND AWN 1 negatively regulates grain size in rice: GLA1 mediates grain size. J Integr Plant Biol. 2019;61(10):1036–42.
Hu J, Huang L, Chen G, Liu H, Zhang Y, Zhang R, et al. The elite alleles of OsSPL4 regulate grain size and increase grain yield in rice. Rice. 2021;14(1):90.
Pan YH, Gao LJ, Liang YT, Zhao Y, Liang HF, Chen WW, et al. OrMKK3 influences morphology and grain size in rice. J Plant Biol. 2021;66:1–14.
Jiang L, Ma X, Zhao S, Tang Y, Liu F, Gu P, et al. The APETALA2-Like transcription factor SUPERNUMERARY BRACT controls rice seed shattering and seed size. Plant Cell. 2019;31(1):17–36.
Zhang M, Dong R, Huang P, Lu M, Feng X, Fu Y, et al. Novel Seed Size: a novel seed-developing gene in Glycine max. Int J Mol Sci. 2023;24(4):4189.
Zhong J, He W, Peng Z, Zhang H, Li F, Yao J. A putative AGO protein, OsAGO17, positively regulates grain size and grain weight through OsmiR397b in rice. Plant Biotechnol J. 2020;18(4):916–28.
Li Y, He Y, Qin T, Guo X, Xu K, Xu C, et al. Functional conservation and divergence of miR156 and miR529 during rice development. Crop J. 2022;11(3):692–703.
Zhang B, Li C, Li Y, Yu H. Mobile TERMINAL FLOWER1 determines seed size in Arabidopsis. Nat Plants. 2020;6(9):1146–57.
Lee D, Lee S, Rahman MM, Kim Y, Zhang D, Jeon J. The role of rice Vacuolar Invertase2 in seed size control. Mol Cells. 2019;42(10):711–20.
Dai D, Mudunkothge JS, Galli M, Char SN, Davenport R, Zhou X, et al. Paternal imprinting of dosage-effect defective1 contributes to seed weight xenia in maize. Nat Commun. 2022;13(1):5366.
Li Z, Xu Y. Bulk segregation analysis in the NGS era: a review of its teenage years. Plant J. 2022;109(6):1355–74.
Wang H, Jia J, Cai Z, Duan M, Jiang Z, Xia Q, et al. Identification of quantitative trait loci (QTLs) and candidate genes of seed iron and zinc content in soybean [Glycine max (L.) Merr]. BMC Genomics. 2022;23(1):146.
Park M, Lee J, Han K, Jang S, Han J, Lim J, et al. A major QTL and candidate genes for capsaicinoid biosynthesis in the pericarp of Capsicum chinense revealed using QTL-seq and RNA-seq. Theor Appl Genet. 2019;132(2):515–29.
Wen J, Jiang F, Weng Y, Sun M, Shi X, Zhou Y, et al. Identification of heat-tolerance QTLs and high-temperature stress-responsive genes through conventional QTL mapping, QTL-seq and RNA-seq in tomato. BMC Plant Biol. 2019;19(1):398.
Gu A, Meng C, Chen Y, Wei L, Dong H, Lu Y, et al. Coupling Seq-BSA and RNA-Seq analyses reveal the molecular pathway and genes associated with heading type in Chinese cabbage. Front Genet. 2017;8:176.
Li WL, Bai QH, Zhan WM, Ma CY, Wang SY, Feng YY, et al. Fine mapping and candidate gene analysis of qhkw5-3, a major QTL for kernel weight in maize. Theor Appl Genet. 2019;132(9):2579–89.
Liu H, Zhou F, Zhou T, Yang Y, Zhao Y. Fine mapping of a novel male-sterile mutant showing wrinkled-leaf in sesame by BSA-Seq technology. Ind Crop Prod. 2020;156:112862.
Wang Z, Yan L, Chen Y, Wang X, Huai D, Kang Y, et al. Detection of a major QTL and development of KASP markers for seed weight by combining QTL-seq, QTL-mapping and RNA-seq in peanut. Theor Appl Genet. 2022;135(5):1779–95.
Liu C, Yao X, Li G, Huang L, Wu X, Xie Z. Identification of major loci and candidate genes for anthocyanin biosynthesis in Broccoli using QTL-Seq. Hortic. 2021;7(8):246.
Agyenim-Boateng KG, Lu J, Shi Y, Zhang D, Yin X. SRAP analysis of the genetic diversity of wild castor (Ricinus communis L.) in South China. PLoS ONE. 2019;14(7):e219667.
Liu S, Yin X, Lu J, Liu C, Bi C, Zhu H, et al. The first genetic linkage map of Ricinus communis L. based on genome-SSR markers. Ind Crop Prod. 2016;89:103–8.
Rabinowicz PD, Chan AP, Crabtree J, Zhao Q, Lorenzi H, Orvis J, et al. Draft genome sequence of the oilseed species Ricinus communis. Nat Biotechnol. 2010;28(9):951–6.
Yeboah A, Lu J, Ting Y, Karikari B, Gu S, Xie Y, et al. Genome-wide association study identifies loci, beneficial alleles, and candidate genes for cadmium tolerance in castor (Ricinus communis L.). Ind Crop Prod. 2021;171:113842.
Huang G, Yin X, Lu J, Zhang L, Lin D, Xie Y, et al. Dynamic QTL mapping revealed primarily the genetic structure of photosynthetic traits in castor (Ricinus communis L.). Sci Rep. 2023;13(1):14071.
Wu J, Mao L, Tao J, Wang X, Zhang H, Xin M, et al. Dynamic quantitative trait loci mapping for plant height in recombinant inbred line population of upland cotton. Front Plant Sci. 2022;13:914140.
McCouch SR, Chen X, Panaud O, Temnykh S, Xu Y, Cho YG, et al. Microsatellite marker development, mapping and applications in rice genetics and breeding. Plant Mol Biol. 1997;35(1/2):89–99.
Lu J, Pan C, Fan W, Liu W, Zhao H, Li D, et al. A chromosome-level genome assembly of wild castor provides new insights into its adaptive evolution in tropical desert. Genomics Proteom Bioinf. 2022;20(1):42–59.
Zhang Y, Mulpuri S, Liu A. Photosynthetic capacity of the capsule wall and its contribution to carbon fixation and seed yield in castor (Ricinus communis L.). Acta Physiol Plant. 2016;38(10):1–12.
Livak KJ, Schmittgen TD. Analysis of relative gene expression data using real-time quantitative PCR and the 2^(−ΔΔCT) method. Methods. 2001;25(4):402–8.
Gai J, Zhang Y, Wang J. Genetic system of quantitative traits in plants. Beijing: Science Press; 2003. p. 72–88. (in Chinese).
Arai-Sanoh Y, Takai T, Yoshinaga S, Nakano H, Kojima M, Sakakibara H, et al. Deep rooting conferred by DEEPER ROOTING 1 enhances rice yield in paddy fields. Sci Rep. 2014;4(1):5563.
Kitomi Y, Hanzawa E, Kuya N, Inoue H, Hara N, Kawai S, et al. Root angle modifications by the DRO1 homolog improve rice yields in saline paddy fields. Proc Natl Acad Sci. 2020;117(35):21242–50.
Skalitzky CA, Martin JR, Harwood JH, Beirne JJ, Adamczyk BJ, Heck GR, et al. Plastids contain a second sec translocase system with essential functions. Plant Physiol. 2011;155(1):354–69.
Uga Y, Sugimoto K, Ogawa S, Rane J, Ishitani M, Hara N, et al. Control of root system architecture by DEEPER ROOTING 1 increases rice yield under drought conditions. Nat Genet. 2013;45(9):1097–102.
Wu X, Cai X, Zhang B, Wu S, Wang R, Li N, et al. ERECTA regulates seed size independently of its intracellular domain via MAPK-DA1-UBP15 signaling. Plant Cell. 2022;34(10):3773–89.
Gray WM, Hellmann H, Dharmasiri S, Estelle M. Role of the Arabidopsis RING-H2 protein RBX1 in RUB modification and SCF function. Plant Cell. 2002;14(8):2137–44.
Schwechheimer C, Serino G, Deng X. Multiple ubiquitin ligase-mediated processes require COP9 signalosome and AXR1 function. Plant Cell. 2002;14(10):2553–63.
Sun Y, Wang C, Wang N, Jiang X, Mao H, Zhu C, et al. Manipulation of Auxin Response Factor 19 affects seed size in the woody perennial Jatropha curcas. Sci Rep. 2017;7(1):40844.
Raman H, Raman R, McVittie B, Borg L, Diffey S, Singh Yadav A, et al. Genetic and physiological bases for variation in water use efficiency in canola. Food Energy Secur. 2020;9(4):e237.
Acknowledgements
We would like to express our sincere appreciation to all the teachers and students in our research group, and to the instrument platform, for their invaluable assistance.
Funding
This study was supported by the National Natural Science Foundation of China (31271759); the Guangdong Provincial Science and Technology Projects (2013b060400024, 2014a020208116 and 2016a020208015), China; and the Project of Enhancing School with Innovation of Guangdong Ocean University (GDOU2013050206), China.
Author information
Contributions
G.H., J.L. and X.Y. designed the research program; G.H. and L.Z. performed most of the experiments; C.L., X.Z., H.L. and J.Z. performed part of the experiments; G.H. wrote the manuscript; G.H., J.L. and X.Y. revised the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Huang, G., Lu, J., Yin, X. et al. QTL mapping and candidate gene mining of seed size and seed weight in castor plant (Ricinus communis L.). BMC Plant Biol 24, 885 (2024). https://doi.org/10.1186/s12870-024-05611-6