Explore the gene network regulating the composition of fatty acids in cottonseed
BMC Plant Biology volume 21, Article number: 177 (2021)
Cottonseed is one of the major sources of vegetable oil. Analysis of the dynamic changes of fatty acid components and the genes regulating the composition of fatty acids of cottonseed oil is of great significance for understanding the biological processes underlying biosynthesis of fatty acids and for genetic improving the oil nutritional qualities.
In this study, we investigated the dynamic relationship of 13 fatty acid components at 12 developmental time points of cottonseed (Gossypium hirsutum L.) and generated cottonseed transcriptome of the 12 time points. At 5–15 day post anthesis (DPA), the contents of polyunsaturated linolenic acid (C18:3n-3) and saturated stearic acid (C18:0) were higher, while linoleic acid (C18:2n-6) was mainly synthesized after 15 DPA. Using 5 DPA as a reference, 15,647 non-redundant differentially expressed genes were identified in 10–60 DPA cottonseed. Co-expression gene network analysis identified six modules containing 3275 genes significantly associated with middle-late seed developmental stages and enriched with genes related to the linoleic acid metabolic pathway and α-linolenic acid metabolism. Genes (Gh_D03G0588 and Gh_A02G1788) encoding stearoyl-ACP desaturase were identified as hub genes and significantly up-regulated at 25 DPA. They seemed to play a decisive role in determining the ratio of saturated fatty acids to unsaturated fatty acids. FAD2 genes (Gh_A13G1850 and Gh_D13G2238) were highly expressed at 25–50 DPA, eventually leading to the high content of C18:2n-6 in cottonseed. The content of C18:3n-3 was significantly decreased from 5 DPA (7.44%) to 25 DPA (0.11%) and correlated with the expression characteristics of Gh_A09G0848 and Gh_D09G0870.
These results contribute to our understanding on the relationship between the accumulation pattern of fatty acid components and the expression characteristics of key genes involved in fatty acid biosynthesis during the entire period of cottonseed development.
The quality and nutritional value of plant oils are largely defined by their fatty acid compositions. Because of the low freezing point of saturated fatty acids (SFAs), vegetable oils with a high content of SFAs will coagulate, turbid, semi solidified or even solidified at a lower temperature. From health perspective, SFAs may increase the risk of cardiovascular diseases [1,2,3,4]. Polyunsaturated fatty acids (PUFAs), e. g. linoleic acid (C18:2n-6) and linolenic acid (C18:3n-3), are beneficial to human health, but they are easily oxidized and become rancid when exposed to oxygen under general storage conditions [4,5,6,7]. Chemical hydrogenation of PUFAs can improve the oxidation stability of oil, but unfortunately, it produces harmful trans fatty acids as by-products [8, 9]. Monounsaturated fatty acid (MUFA) oleic acid (C18:1n-9) is stable and considered to be beneficial to human health [3, 8,9,10,11,12]. Edible oil with a high proportion of C18:1n-9 represents “healthy oil”, such as olive oil, due to low content of SFAs and no need of chemical hydrogenation .
Cottonseed kernel contains 25–35% oil and is one of the major sources of edible vegetable oil. Its nutritional components and quality have thus attracted much attention. The major fatty acid components of cottonseed oil are linoleic acid (C18:2n-6), palmitic acid (C16:0), oleic acid (C18:1n-9) and stearic acid (C18:0). Generally, the relative content of C18:1n-9 in cottonseed oil is ~ 15%. Because of its nutritional benefit, increasing the relative content of C18:1n-9 is one of the major targets of genetic improvement of plant oils. In higher plants, C18:1n-9 is converted from C18:1n-9-ACP by one of the acyl-ACP thioesterases, FATA, and is then used as substrate to synthesize C18:2n-6, the most abundant fatty acid component of cottonseed oil, by fatty acid desaturase 2 (FAD2) . Genetically modified plants with high oleic acid (C18:1n-9) content have been developed in several crops, e.g. soybean [15, 16], rapeseed [17, 18], cotton [3, 10, 19] and jatropha , by decreasing FAD2 expression. Changes in fatty acid composition improved the nutritional value but may have unexpected side effects, such as reduced total oil content or seed vigor [3, 8], probably due to the natural compositions of various fatty acids in seeds as a result of natural selection being necessary for seed germination and seedling growth. Composition of seed fatty acids is also closely related to the self-protection and physiological defense mechanisms of offspring [3, 21]. Therefore in-depth analysis of the gene network regulating fatty acid components of cottonseed oil is of great significance for their genetic modification to meet the needs of both human health and the fitness of cotton plants.
In this study, we first investigated the dynamic changes of fatty acid components in the developing seeds of an Upland cotton (Gossypium hirsutum L.) cultivar Xinluzao 33 and generated transcriptomes of developing seeds at multiple developmental time points. We then analyzed the expression profiles of the key genes involved in fatty acid biosynthesis and performed gene network analysis using differentially expressed genes (DEGs). The results deepened our understanding of the genes associated with the fatty acid metabolic network and the temporal regulation of cottonseed fatty acid components, and provided gene expression information that may be helpful for identification of targets for improvement of cottonseed oil quality.
Dynamic profiles of fatty acids during cottonseed development
To investigate the dynamic changes of fatty acid components during cottonseed development, the relative contents of fatty acid components were determined by GC-MS. In total, 13 fatty acid components at 12 developmental time points, separated into three stages, i.e. early (5–15 DPA), middle (20–35 DPA) and late (40–60 DPA), were determined. In the mature cotton seeds, the main fatty acid components were polyunsaturated linoleic acid (C18:2n-6, 57.62%), monounsaturated oleic acid (C18:1n-9, 15.59%), saturated palmitic acid (C16:0, 22.36%) and stearic acid (C18:0, 2.31%), all other components accounted for less than 1% of the total oil content. The relative content of C18:2n-6 gradually increased from 5 to 15 DPA, significantly increased after 15 DPA, reached a higher level at 25 DPA, and tended to be stable after that (Table 1). The highest relative content of C18:1n-9 was observed at 5 DPA (23.42%), then decreased significantly in 10 to 15 DPA seeds and increased again from 15 to 20 DPA seeds, and remained relatively constant afterwards until maturity. Saturated C16:0 was the second highest fatty acid component in the mature cottonseeds with its maximum relative percentage (44.23%) observed at 15 DPA and a significant reduction from 15 (28.35%) to 20 DPA (23.45%). C18:0 had a similar dynamic change profile as C16:0, with its maximum relative content (13.75%) observed at 15 DPA, then rapidly decreased to 5.29% at 20 DPA and finally to 2.31% in the mature seeds. Obviously, it seems that C16:0 and C18:0 synchronously accumulated mainly during the early (5–15 DPA) and middle (20–35 DPA) stages of seed development. It should be noted that the highest relative content (7.44%) of linolenic acid C18:3n-3 was observed in 5 DPA seeds, decreased to 3.12% at 10 DPA and further down to 0.55% at 20 DPA, and was one of the least in the mature seeds (Table 1). There was a significantly negative correlation between the relative content of C18:2n-6 and C18:3n-3. During development of cottonseeds, the relative content of myristic acid (C14:0), palmitoleic acid (C16:1), arachidic acid (C20:0), behenic acid (C22:0) and lignoceric acid (C24:0) was no more than ~ 2% with a relatively high percentage observed in the early stage. With the development of cottonseeds, the content of these fatty acids gradually decreased from middle to late (20–60 DPA) stages and remained at a very low level (< 1%) in mature seeds. The relative content of heptadecanoic acid (C17:0) was constantly low and even not detected at 20–25 DPA. Both trans-9-octadecenoic acid (C18:1n-9 t) and all-trans-9,12-octadecadienoic acid (C18:2n-6tt) were only detectable at 5–15 DPA (Table 1).
To understand how the dynamic changes in fatty acid components during cottonseed development are regulated, we generated seed transcriptome profiles using RNA-Sequencing (RNA-seq). RNA-seq libraries were generated from 12 different developmental time points (5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55 and 60 DPA) and sequenced. A total of ~ 599 million raw reads were generated. The average Q30 of the reads from all samples was 94.04% and the read GC content was 44.66%. Raw reads were filtered to remove low quality ones and a total of 576,668,718 clean reads were finally used in alignment. Overall, from 93.35 to 96.38% of the clean reads could be aligned to the reference genome with 87.80–90.09% of them being uniquely aligned (Table S3). These results suggested that the RNA-seq data were high quality and suitable for further analyses.
We identified genes differentially expressed at 10–60 DPA seeds using 5 DPA seeds as a control. From 10 DPA to 20 DPA, the number of DEGs increased from 396 to 2451. The number of DEGs at 25, 30 and 35 DPA three middle developmental time points was similar and started to increase at 40 DPA with a significant increase observed at 55 and 60 DPA (Fig. S2A). Regarding the number of up- and down-regulated DEGs, two transition points were observed at 10–15 DPA and 40–45 DPA. There were more down-regulated DEGs than up-regulated ones at 10 DPA and 45–60 DPA, but more up-regulated DEGs than down-regulated ones at 15–40 DPA (Fig. S2A). The period from 10 to 30 DPA is critical for oil accumulation. We further compared DEGs at 10, 20 and 30 DPA three time points. In the comparison group of 20 DPA vs 10 DPA, 30 DPA vs 20 DPA and 30 DPA vs 10 DPA, 1268, 1607, and 2517 DEGs were up-regulated, and 430, 1030 and 1302 DEGs were down-regulated, respectively. Among these DEGs, 501, 667 and 1413 were unique to each comparison, respectively, and 113 genes showed differential expression in all three comparisons (Fig. S2B).
Expression profiles of the genes involved in biosynthesis of fatty acids during cottonseed development
The de novo biosynthesis of fatty acids takes place mainly in the plastid. We first investigated the expression dynamics of the major genes involved in fatty acid biosynthesis (Fig. 1). The first step of fatty acid biosynthesis is the formation of malonyl coenzyme by acetyl-CoA carboxylase (ACC), which is a key rate-limiting enzyme in fatty acid biosynthesis  and has a great impact on the final level of seed fatty acid content. The cotton genome contains two pairs of homoeologous ACC genes (Gh_A01G1574 and Gh_D01G1885, Gh_A05G2294 and Gh_D05G2554). All four genes had a higher expression level in the early to middle developmental stages than in the late stages but the two pair homoeologs had a different profile, with Gh_A01G1574 and Gh_D01G1885 being mainly expressed in 5–15 DPA seeds while Gh_A05G2294 and Gh_D05G2554 being mainly expressed in 20–25 DPA seeds. Malonyl-CoA is catalyzed by fatty acid synthase (FAS) to produce 16:0 carbon palmitoyl-ACP via the recurring reactions. The elongation of the 16-carbon palmitoyl-ACP to the 18-carbon stearoyl-ACP is catalyzed by β-ketoacyl-ACP synthase II . Cotton has four pairs of homoeologous genes encoding β-ketoacyl-ACP synthase II (KASII). They are Gh_A13G1675 and Gh_D13G2493, Gh_A08G2201 and Gh_D08G2566, Gh_A06G0711 and Gh_D06G0821, and Gh_A09G0720 and Gh_D09G2463. Gh_A13G1675 and Gh_D13G2493 were mainly expressed in 5–15 DPA seeds, Gh_A08G2201 and Gh_D08G2566 were mainly expressed in 20–25 DPA seeds, and Gh_A06G0711 and Gh_D06G0821 had their highest expression levels observed at 30 DPA. These results suggest that Gh_A13G1675 and Gh_D13G2493 might be the major genes contributing to the conversion from C16:0-ACP to C18:0-ACP in cottonseeds. C18:0-ACP is desaturated by Δ9-stearoyl-ACP desaturase (SAD) to form monounsaturated C18:1n-9-ACP. Twelve SAD genes (Gh_A02G0894, Gh_D02G1097, Gh_A02G1788, Gh_D03G0588, Gh_A05G2865, Gh_D05G3162, Gh_A10G1286, Gh_D10G1197, Gh_A05G2312, Gh_D05G2574, Gh_A12G0128 and Gh_D05G3167) were identified in the cotton genome. Most had a higher expression level in 5–25 DPA seeds than in 30–60 DPA seeds, with the exception of two pairs of homoeologs (Gh_A02G1788 and Gh_D03G0588, Gh_A05G2312 and Gh_D05G2574), which had high expression levels at 20–25 DPA (Fig. 1).
In plants, the acyl group from acyl-ACP can be hydrolyzed by two different classes of acyl-ACP thioesterases (FATs), FATA and FATB, to release free fatty acids. During cottonseed development, one pair of homoeologous FATA gene (Gh_A08G1740 and Gh_D08G2090) was highly expressed, particularly in 5–25 DPA seeds, and three of the four pairs of homoeologous FATB (Gh_A08G2137 and Gh_D08G2504, Gh_A06G0514 and Gh_D06G0571, Gh_A13G1750 and Gh_D13G2098, and Gh_A07G0026 and Gh_D07G0034) were also mainly expressed in 5–25 DPA seeds. The specificity of FATs is the major determinant of the chain length and the level of saturated fatty acids (SFAs) found in most plant tissues . It had been shown that FATB displayed high catalytic efficiencies towards 16:0-ACP, whereas FATA had a high activity with oleoyl-ACP and a low activity with 16:0-ACP (~ 75-fold lower than FATB) . At the early developmental stage of cottonseed, the expression levels of KASII and FATA were high, resulted in a higher accumulation of C18:0 in 0–15 DPA seeds. However, C18:0-ACP is also the substrate for synthesis of C18:1n-9-ACP. Meanwhile, the expression level of FATBs was significantly higher than that of FATAs at the same developmental stage during cottonseed development, particularly in 15–25 DPA seeds, consequently, the accumulation of C16:0 was much greater than that of C18:0 in cottonseed, and C16:0 being the major SFA in mature cottonseeds.
The major polyunsaturated fatty acid (PUFA) in cottonseed is synthesized by fatty acid desaturase 2 (FAD2, C18:1n-9 to C18:2n-6 desaturation) and polyunsaturated linolenic acid (C18:3n-3) is desaturated from C18:2n-6 by fatty acid desaturase 3 (FAD3). These reactions take place in the endoplasmic reticulum (ER). Two pairs of homoeologous GhFAD2 (Gh_A13G1850 and Gh_D13G2238, Gh_A11G2814 and Gh_D11G3169) were expressed in developing cottonseeds with different expression patterns. Gh_A13G1850 and Gh_D13G2238, named as GhFAD2–1 , were mainly expressed in 25–50 DPA seeds, while Gh_A11G2814 and Gh_D11G3169 were mainly expressed in 5–20 DPA seeds. The relative content of C18:2n-6 increased significantly from 15 to 25 DPA and remained high afterwards, overlapping with the period of high expression of Gh_A13G1850 and Gh_D13G2238, indicating that these two homoeologs are the major contributors of C18:2n-6 biosynthesis, consistent with previous results [26, 27]. GhFAD3 are encoded by two pairs of homoeologous genes (Gh_A09G0848 and Gh_D09G0870, Gh_A07G0946 and Gh_D07G1026). These two pairs also had a different expression pattern in developing cottonseed. Gh_A09G0848 and Gh_D09G0870 had a relatively high expression level in 5 to 15 DPA seeds, particular in 5 DPA. After 20 DPA, their expression level decreased significantly and was almost undetectable in mature cottonseed. In contrast, expression of Gh_A07G0946 and Gh_D07G1026 was observed from 5 to 50 DPA, with the highest expression level observed in 40 DPA seeds. C18:3n-3 was apparently accumulated in the early stage of seed development and reached its peak (7.44%) at 5 DPA. The relative content of C18:3n-3 was significantly decreased from 1.16% at 15 DPA to 0.11% at 25 DPA and remained very low afterwards. These results indicate that the content of C18:3n-3 in cottonseed was determined by Gh_A09G0848 and Gh_D09G0870 rather than Gh_A07G0946 and Gh_D07G1026.
Confirmation of DEGs by qRT-PCR analysis
To validate the RNA-seq results and identified DEGs, 12 genes related to fatty acid synthesis, including 6 genes related to fatty acid desaturation, were selected for confirmation by qRT-PCR. The relative expression levels and expression tendency of the selected genes determined by qRT-PCR were highly consistent with that of RNA-seq (Fig. 2), indicating highly reliable RNA-seq results.
Gene ontology analysis of DEGs
To get insights into the biological function of the DEGs, we performed Gene Ontology (GO) analysis using the genes differentially expressed between 5 DPA and other time points. In total, 15,647 non-redundant DEGs were identified and they were enriched with 308 GO terms (Table S4). The relative content of each fatty acid component changed significantly from 5 to 30 DPA and tended to be relatively stable after 30 DPA. The 11,645 non-redundant DEGs identified in 10–30 DPA were enriched with 178 GO terms. Individually, 20, 21, 77, 112 and 111 GO terms were enriched at 10, 15, 20, 25 and 30 DPA, respectively, with 8, 1, 24, 26 and 29 GO terms unique to each time point (Fig. 3a). The number of enriched GO items was significantly fewer at 10 and 15 DPA than at other time points (Fig. 3a, b). Nine commonly enriched GO terms were found in DEGs from 10 to 30 DPA and were related to fatty acid biosynthesis and redox-related process (Fig. 3d). In addition to these GO terms enriched in all time points, 1, 4 and 37 commonly enriched GO terms were found between 15 and 20 DPA, between 20 and 25 DPA, and between 25 and 30 DPA, respectively (Fig. 3a), and 26 GO items were commonly enriched in 20, 25 and 30 DPA three time points (Fig. 3c, Table S5). In 35–60 DPA, 34 GO terms were significantly enriched. In addition, fatty acid synthase activity, fatty acid metabolism and fatty acid biosynthesis processes were the commonly enriched GO items in 35–60 DPA (Table S6).
Through GO analysis of DEGs among different groups, it was found that, in terms of biological processes, these DEGs were mainly related to fatty acids, lipids and other metabolic processes; regarding cell components, they were involved in cell parts and membranes; and their main molecular functions were binding, splitting and catalytic activity. These results suggest that fatty acid synthesis and metabolism-related pathways are mainly enriched in the early stages of cottonseed development, while genes related to cellular component and molecular functions were enriched in the middle development stages.
KEGG pathways of DEGs
Pathway analysis was further performed to determine the functions of DEGs. A total of 122 enriched KEGG pathways were found for the 15,647 non-redundant DEGs from 10 to 60 DPA (Table S7). Many significantly changed pathways were related to metabolism of fatty acid elongation, biosynthesis of unsaturated fatty acids, and α-linolenic acid metabolism, which are associated with the synthesis of fatty acid components (Table S8). Generally, C18:3n-3 was mainly synthesized at the early stage (around 5 DPA) of seed development. The enriched KEGG pathways at 10 DPA included those involved in metabolism of phenylalanine, alanine, aspartate and glutamate, and biosynthesis of phenylpropanoid, monoterpenoid, glucosinolate and cutin, suberine and wax. At 20 DPA, pathways related to biosynthesis of flavonoid, diterpenoid, and unsaturated fatty acids were enriched (Fig. 4a, Table S8). FAD2 (Gh_D13G2238 and Gh_A13G1850) and FAD3 (Gh_A09G0848 and Gh_D09G0870) were components of these pathways (Fig. S3B). The pathways enriched at 25 DPA were also related to biosynthesis of unsaturated fatty acids and elongation of fatty acids (Fig. 4b, Table S8). Among the 13 DEGs in the fatty acid extension pathway, 12 encodes 3-ketoyl-CoA synthetase (Fig. S3C). The enriched pathways at 30 DPA were similar to that at 25 DPA. We found a remarkable feature that the elongation of fatty acid pathway was enriched in 35, 40, 45, and 50 DPA cottonseed, while the fatty acid degradation pathway was enriched in 60 DPA (Table S9), which might explain why the relative content of the fatty acids tended to be stabilized at the late development stages.
Gene network analysis with WGCNA
WGCNA analysis was performed to obtain a comprehensive understanding the gene network regulating fatty acid synthesis during cottonseed development. The 15,647 non-redundant DEGs identified between 5 DPA and other time points were used in the analysis and a total of 10 gene modules associated with the specific expression profiles of different samples were identified (Fig. 5a). Among these 10 modules, 6 (yellow, brown, green, pink, magenta and blue) were significantly associated with middle and/or late stages of cottonseed development, and no gene module was significantly associated with the early developmental stage (5–15 DPA) although the turquoise module was marginally significant at 5 DPA (Fig. 5b). The yellow module with 861 genes was highly associated with 20 DPA. The brown (870 genes) module was significantly associated with 25 DPA. The green (400 genes) and pink (74 genes) modules were significantly associated with 30 DPA and 40 DPA, respectively. There were 33 genes in the magenta module that was significantly associated with 50 DPA, and 1037 genes in the blue module that was significantly associated with 55 DPA (Fig. 5b). KEGG analysis was performed on 3275 DEGs from these six modules. The modules of yellow, brown, pink and magenta were all enriched with genes involved in the α-linolenic acid metabolism. The relative content of C18:3n-3 was relatively high in 5 DPA seeds (7.44%) and reduced significantly to 0.55% in 20 DPA seeds (Table 1), consistent with significant down-regulation of GhFAD3 (Gh_A09G0848 and Gh_D09G0870) at 20 DPA (Fig. 1) and C18:3n-3 being a precursor substance for production of jasmonic acid and oxyacids as genes of the α-linolenic acid metabolism pathway involved in biosynthesis of jasmonic acid and oxyacids were up-regulated at 20 DPA (Table S10). In addition, at 25 DPA, the brown module was also related to fatty acid metabolism and biosynthetic pathways; both the pink module and the blue module were enriched with genes involved in the linoleic acid metabolic pathway (Table S10).
Co-expression gene networks and hub genes associated with fatty acid synthesis
According to the changes of the relative content of fatty acids during cottonseed development and the results of gene network analysis, the yellow and brown modules were chosen for co-expression analysis to identify the hub genes of the modules. Based on the criteria of eigengene-based connectivity (KME) value ≥0.89 and edge weight value ≥0.5, 29 genes were found to be co-expressed in the brown module. Gh_D03G0588 and Gh_A02G1788 encoding SAD were identified to be the hub genes of the brown module (Fig. 6). At 25 DPA, Gh_D03G0588 and Gh_A02G1788 were significantly up-regulated (Fig. 6, Table S11). At this time point, the relative content of C18:0 decreased significantly, indicating that most C18:0-ACP might has been used in synthesis of C18:1n-9-ACP due to significantly increased expression level of SAD genes. However, the relative content of C18:1n-9 did not increase significantly, indicating rapid desaturation and conversion of C18:1n-9-ACP to C18:1n-9 and finally C18:2n-6, the major unsaturated fatty acid in cottonseed (Table 1).
Cotton is not only the most important fiber crop, but also an important oil crop. Our results showed that the content of SFAs in cottonseed oil is relatively high (over 20% of C16:0). It means that long-term consumption of cottonseed oil might increase the incidence of cardiovascular diseases. PUFAs that are easy to be oxidized and unstable under the conditions of light and high temperature are also relatively high (up to 60%), while the content of C18:1n-9, a fatty acid being stable and beneficial to human health, is only 15.6% in cottonseed oil, far lower than 75.7% in olive oil , 55.9–72.0% in rapeseed oil , 28.0% in sunflower oil , 41.5% in sesame oil , 42.9% in palm oil and 29.1% in soybean oil . Therefore, it has great practical value to enhance effectively the content of C18:1n-9 in cottonseed to improve the nutritional quality of cottonseed oil.
Understanding plant lipid metabolism is the basis of improving oil quality and increasing oil yield. The biochemical pathways of storage oil production in plants have been well documented. However, the factors regulating fatty acid synthesis and controlling total oil content in oilseed crops remain to be elucidated [30, 31]. According to our study, C16:0, C18:0 and C18:3n-3 were mainly synthesized at 5–15 DPA during the development of cottonseed. Synthesis of C18:2n-6 happened mainly in the middle and late stages of seed development. These fatty acid components showed an obvious correlation with each other during oil accumulation. In general, increase of C18:2n-6 was accompanied by decrease of other fatty acid components, particularly C18:3n-3, which decreased from 7.44% at 5 DPA to 0.02% at 60 DPA. Compared with their highest content at 15 DPA, C16:0 and C18:0 decreased by 50% and nearly 6 times in mature cottonseed, respectively. C18:1n-9 and C18:2n-6 accounted for more than 70% of all fatty acid components in mature cottonseed and were the main fatty acid components determining the total oil content of cottonseed.
The range of C18:1n-9 content in cottonseed is rather narrow among cotton germplasm resources , and it is difficult to breed high C18:1n-9 cotton varieties by conventional breeding methods . Understanding the expression characteristics and functions of the key genes in fatty acid synthesis is the prerequisite for improving cottonseed oil quality. Expression of genes encoding key enzymes for fatty acid synthesis have been shown to be tissue and developmental stage specific . We found that up-regulated and down-regulated DEGs were mainly observed at 15–40 DPA and 45–60 DPA, respectively (Fig. S2A), consistent with the results reported in a previous transcriptome sequencing experiment . The relative content of most fatty acid components started to decrease after 30 DPA and C18:2n-6, the most abundant component, stabilized after 25 DPA (Table 1). In agree with this observation, down-regulated DEGs in the 30 DPA vs 20 DPA comparison were enriched with GO terms related to fatty acid biosynthesis (Table S12). Similar results have also been reported in soybean . Through the analysis of GO enrichment and KEGG pathways of DEGs, it was found that fatty acid biosynthesis, fatty acid metabolism, and cutin and wax biosynthesis are involved in the early development of cottonseed. It has been proposed that GhFAD2 expression increases to rapidly accumulate large amount of C18:2n-6, when the content of C16:0 and C18:0 exceeds a certain threshold . We found that the expression level of GhFAD2 increased significantly from 20 DPA (Fig. 1). Accordingly, the content of C18:2n-6 increased significantly from 20 DPA and reached its maximum at 40 DPA (Table 1). When the amount of substrate was lower than the threshold, the expression level of GhFAD2 was down-regulated and tended to be relatively stable. And then, the very long chain fatty acid components and their derivatives, such as wax constituents, were synthesized in the seeds. Through the analysis of gene expression profile during cottonseed development, GhFAD3 (Gh_A09G0848 and Gh_D09G0870) was highly expressed at 5 DPA and decreased significantly at 20 DPA. After a further reduction at 25 DPA, GhFAD3 maintained a very low expression level from 30 to 60 DPA. The expression profiles of GhFAD2 and GhFAD3 indicate these two enzymes play an important role in regulating the content of fatty acid components during cottonseed development.
Cotton has a complex allotetraploid genome and gene replication can lead to functional redundancy and diversification. It has been demonstrated that expression of ~ 30% of the cotton homoeologs are significantly A- or D-biased during fiber development . We found that DEGs during the rapid oil accumulation period (10–30 DPA) are biased towards the D subgenome (Table S13). This was consistent with the result that high expression levels of fatty acid synthesis genes were found in the Dt homeologs . Based on the analyses of WGCNA and the co-expression gene networks, the SAD genes (Gh_A02G1788 and Gh_D03G0588) encoding the soluble desaturase enzyme that converts saturated C18:0 into unsaturated C18:1n-9  were identified as the candidate hub genes highly associated with unsaturated fatty acid content at 25 DPA. SAD is highly active in developing cottonseeds, resulting in only 2–3% of C18:0 in the seed oil at maturity, but the C18:0 content can be significantly increased by silencing expression of SAD genes . It is worth noting that the highest expression level of SAD (Gh_A02G1788 and Gh_D03G0588) at 25 DPA was consistent with their preferential expression in developing ovules, especially at 25 DPA [39, 40]. The continuous decrease of C18:0 content from 15 to 25 DPA was accompanied by the increase of C18:1n-9 content, which is consistent with the result that SAD plays a decisive role in determining the ratio of saturated fatty acids to unsaturated fatty acids . Moreover, the activity of SAD genes and consequently the C18:1n-9 content play important roles in senescence regulation, fungal infection resistance, and mechanical damage reactions [39, 41,42,43,44].
Because of its low specific gravity and high energy content, fat is the most economical and effective storage form of nutrients. During cottonseed development, the relative content of saturated C16:0 is always above 20%, indicating the importance of C16:0 in seed development and physiological activity. C16:0 is also a precursor for subsequent biosynthesis of other fatty acids. PUFAs are not only essential structural components of cells, but also play important roles in seed germination, physiological activity and resistance to adverse environment . The content of PUFAs increased gradually during the period of seed maturation and the C18:2n-6 content was over 50% in mature cottonseed. C18:3n-3 is also an important substrate for biosynthesis of many other lipids and Jasmonic acid (JA) through the octadecanoid pathway. Although the content of C18:3n-3 was dramatically reduced from 20 DPA, it was thought that the threshold requirement for C18:3n-3 to synthesize JA is very low in cotton based on the observation of cotton anthers with only 1–2% of C18:3n-3 remaining fertile . A high level of C18:2n-6 contributes to seed germination and early seedling growth under low temperature and salt conditions . The highest content of PUFAs in cottonseed was C18:2n-6, not C18:3n-3. The ratio of C18:2n-6 to C18:3n-3 during cottonseed development was shown in Fig. S4. This accumulation pattern of fatty acid components in cottonseed is not accidental, but a result of nature selection closely related to the mechanisms of plants’ self-defense and protection of their offspring.
Investigating the correlation between the content of various fatty acid components and the expression pattern of key genes responsible for fatty acid biosynthesis is a necessary step to understand the regulation of fatty acid metabolism and is also an effective way to explore the physiological significance of the key genes and the strategy for improving cottonseed oil quality. In this study, we have gained certain insights into the relationship between the dynamic changes of the expression levels of fatty acid biosynthesis genes and the accumulation of the main fatty acids in cottonseed, although the biological function of most DEGs is yet to be further investigated in future.
Materials and methods
Cotton (G. hirsutum L. cv Xinluzao 33) plants were grown in the field in Shihezi, China. Flowers were tagged on the day of blooming. Bolls were collected at 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, and 60 day post anthesis (DPA), respectively. No permissions were needed to study and collect these samples. Seeds were separated from bolls and stored in refrigerator at − 80 °C until further use. 5–60 DPA de-husked seeds were used in measurement of fatty acid content and transcriptome analysis.
Fatty acid analysis
The content of fatty acid components of each sample was analyzed by external standard method. Thirty-seven fatty acid methyl esters (Sigma-Aldrich) with known concentration were diluted by 2, 5, 10, 20, 40 times with hexane (Table S1), respectively, and used to establish the quantitative standard curve of C14:0, C16:0, C18:0, C18:1n-9, C18:2n-6, and C18:3n-3 (Fig. S1). Seeds were ground into powder with liquid nitrogen, and then were freeze-dried using a vacuum freeze dryer. For each analysis, 1 g freeze-dried sample was transferred into glass tube, and oil was extracted using Soxtherm apparatus (Gerhadt). Then, 5 ml 0.4 M KOH-methanol and 5 ml hexane were added and mixed. The solution was transferred into vials and shaken for 30 min at 40 °C. Subsequently, the upper hexane layer was used in GC-MS analysis.
The qualitative and quantitative analyses were performed using GCMS-QP2020 (Shimadzu, Japan) at an electron ionization of 70 eV with an HP-88 column (100 m × 0.2 mm) and film thickness of 0.2 μm according to Liu et al. (2019) . The details were as follows: after split/splitless injection at 250 °C and incubation in oven for 2 min at 40 °C, the temperature was increased to 240 °C at a rate of 4 °C/min and then held at 240 °C for 15 min. Helium was used as the carrier gas with a constant flow rate of 2 mL/min. A volume of 1.0 μL was injected and the injection mode was split with a ratio of 10:1. The ion source and interface temperature was set to 200 °C and 250 °C, respectively. The mass scan range was m/z 40–500 and the solvent delay time was set to 13 min. The quantification was carried out according to the response value of quantitative ions and the established standard curve. Each test was repeated three times, and the content of each fatty acid component was calculated as the percentage of total measured fatty acids.
RNA isolation, library preparation and sequencing
Total RNA was extracted using Trizol reagent (Invitrogen) following the manufacturer’s protocol. RNA purity was checked using the NanoPhotometer® spectrophotometer (IMPLEN). RNA concentration was measured using Qubit® RNA Assay Kit in Qubit® 2.0 Flurometer (Life Technologies). RNA integrity was assessed using the RNA Nano 6000 Assay Kit with the Bioanalyzer 2100 system (Agilent). A total amount of 3 μg RNA per sample was used as input material for library preparation using NEBNext® Ultra™ RNA Library Prep Kit for Illumina® (NEB). The libraries were sequenced on an Illumina Hiseq platform. Using Hisat2 v2.0.5, paired-end clean reads were aligned to the TM-1 reference genome . FeatureCounts v1.5.0-p3 was used to count the read numbers mapped to each gene. FPKM of each gene was calculated based on the length of the gene and the number of reads mapped to the gene.
Differential expression, gene function annotation and enrichment analysis
Differential expression analysis was performed using the DESeq R package (1.18.0) through negative binomial distribution and calculation of false discovery rate . The adjusted p-value < 0.05 and (|log2 (fold change) | ≥2 were used as the criteria for identification of DEGs among samples. Gene Ontology (GO) enrichment analysis of DEGs was implemented by the GO seq R package, in which gene length bias was corrected. GO terms with a corrected p-value less than 0.05 were considered to be significantly enriched. KOBAS software was used to test the statistical enrichment of DEGs in KEGG pathways.
Quantitative RT-PCR validation of DEGs
Gene specific primers were designed using cDNA sequences of the target genes with the NCBI Primer-BLAST program (Table S2). qRT-PCR was carried out as described by Cheng et al. (2020) . Three independent biological experiments were performed for samples of each time point. The relative expression levels were calculated with cotton ubiquitin gene (GhUBI, XM_012634824) as a reference according to Cheng et al. (2020) .
Constructing co-expression networks of DEGs
The Weighted Correlation Network Analysis (WGCNA) package was used to construct the co-expression networks among DEGs associated with fatty acid synthesis in seeds from different developmental stages. The hub genes were screened on the basis of the module KME values and high-weight values. The correlation networks were drawn using Cytoscape 3.6.1.
Availability of data and materials
All Gene ID and annotation files could be obtained from CottonGen (https://www.cottongen.org). All other data generated or analyzed during this study are included in this manuscript.
Saturated fatty acids
Polyunsaturated fatty acids
Monounsaturated fatty acid
Differentially expressed genes
Quantitative real-time polymerase chain reaction
Kyoto Encyclopedia of Genes and Genomes
Weighted Correlation Network Analysis
- K ME :
Fragments per kilobase of transcript per million mapped reads
Baum SJ, Kris-Etherton PM, Willett WC, Lichtenstein AH, Rudel LL, Maki KC, et al. Fatty acids in cardiovascular health and disease: a comprehensive update. J Clin Lipidol. 2012;6(3):216–34. https://doi.org/10.1016/j.jacl.2012.04.077.
Orsavova J, Misurcova L, Ambrozova JV, Vicha R, Mlcek J. Fatty acids composition of vegetable oils and its contribution to dietary energy intake and dependence of cardiovascular mortality on dietary intake of fatty acids. Int J Mol Sci. 2015;16(6):12871–90. https://doi.org/10.3390/ijms160612871.
Liu F, Zhao YP, Zhu HG, Zhu QH, Sun J. Simultaneous silencing of GhFAD2-1 and GhFATB enhances the quality of cottonseed oil with high oleic acid. J Plant Physiol. 2017;215:132–9. https://doi.org/10.1016/j.jplph.2017.06.001.
Cox C, Mann J, Sutherland W, Chisholm A, Skeaff M. Effects of coconut oil, butter, and safflower oil on lipids and lipoproteins in persons with moderately elevated cholesterol levels. J Lipid Res. 1995;36(8):1787–95. https://doi.org/10.1016/S0022-2275(20)41497-X.
Carvalho AG, Silva KA, Silva LO, Costa AM, Akil E, Coelho MA, et al. Jussara berry (Euterpe edulis M.) oil-in-water emulsions are highly stable: the role of natural antioxidants in the fruit oil. J Sci Food Agric. 2019;99(1):90–9. https://doi.org/10.1002/jsfa.9147.
Jabeur H, Zribi A, Abdelhedi R, Bouaziz M. Effect of olive storage conditions on Chemlali olive oil quality and the effective role of fatty acids alkyl esters in checking olive oils authenticity. Food Chem. 2015;169:289–96. https://doi.org/10.1016/j.foodchem.2014.07.118.
Zhou Y, Zhao W, Lai Y, Zhang B, Zhang D. Edible plant oil: global status, health issues, and perspectives. Front Plant Sci. 2020;11:1315. https://doi.org/10.3389/fpls.2020.01315.
Horn PJ, Sturtevant D, Chapman KD. Modified oleic cottonseeds show altered content, composition and tissue-specific distribution of triacylglycerol molecular species. Biochimie. 2014;96:28–36. https://doi.org/10.1016/j.biochi.2013.08.010.
Wilson RF. The role of genomics and biotechnology in achieving global food security for high-oleic vegetable oil. J Oleo Sci. 2012;61(7):357–67. https://doi.org/10.5650/jos.61.357.
Liu Q, Singh SP, Green AG. High-stearic and high-oleic cottonseed oils produced by hairpin RNA-mediated post-transcriptional gene silencing. Plant Physiol. 2002;129(4):1732–43. https://doi.org/10.1104/pp.001933.
Thomas SS, Cha YS, Kim KA. Effect of vegetable oils with different fatty acid composition on high-fat diet-induced obesity and colon inflammation. Nutr Res Pract. 2020;14(5):425–37. https://doi.org/10.4162/nrp.2020.14.5.425.
Kinney AJ, Cahoon EB, Hitz WD. Manipulating desaturase activities in transgenic crop plants. Biochem Soc Trans. 2002;30(Pt 6):1099–103. https://doi.org/10.1042/bst0301099.
Schwingshackl L, Hoffmann G. Monounsaturated fatty acids, olive oil and health status: a systematic review and meta-analysis of cohort studies. Lipids Health Dis. 2014;13(1):154. https://doi.org/10.1186/1476-511X-13-154.
Shanklin J, Cahoon EB. Desaturation and related modifications of fatty acids. Annu Rev Plant Physiol Plant Mol Biol. 1998;49(1):611–41. https://doi.org/10.1146/annurev.arplant.49.1.611.
Buhr T, Sato S, Ebrahim F, Xing A, Zhou Y, Mathiesen M, et al. Staswick P; tom Clemente. Ribozyme termination of RNA transcripts down-regulate seed fatty acid genes in transgenic soybean. Plant J. 2002;30(2):155–63. https://doi.org/10.1046/j.1365-313X.2002.01283.x.
Pham AT, Shannon JG, Bilyeu KD. Combinations of mutant FAD2 and FAD3 genes to produce high oleic acid and low linolenic acid soybean oil. Theor Appl Genet. 2012;125(3):503–15. https://doi.org/10.1007/s00122-012-1849-z.
Peng Q, Hu Y, Wei R, Zhang Y, Guan C, Ruan Y, et al. Simultaneous silencing of FAD2 and FAE1 genes affects both oleic acid and erucic acid contents in Brassica napus seeds. Plant Cell Rep. 2010;29(4):317–25. https://doi.org/10.1007/s00299-010-0823-y.
Miller JF, Zimmerman DC, Vick BA. Genetic control of high oleic acid content in sunflower oil. Crop Sci. 1987;27(5):923–6. https://doi.org/10.2135/cropsci1987.0011183X002700050019x.
Chapman KD, Austin-Brown S, Sparace SA, Kinney AJ, Ripp KG, Pirtle IL, et al. Transgenic cotton plants with increased seed oleic acid content. J Am Oil Chem Soc. 2001;78(9):941–7. https://doi.org/10.1007/s11746-001-0368-y.
Qu J, Mao HZ, Chen W, Gao SQ, Bai YN, Sun YW, et al. Development of marker-free transgenic Jatropha plants with increased levels of seed oleic acid. Biotechnol Biofuels. 2012;5(1):10. https://doi.org/10.1186/1754-6834-5-10.
He M, Ding NZ. Plant unsaturated fatty acids: multiple roles in stress response. Front Plant Sci. 2020;11:562785. https://doi.org/10.3389/fpls.2020.562785.
Zhao Y, Wang Y, Huang Y, Cui Y, Hua J. Gene network of oil accumulation reveals expression profiles in developing embryos and fatty acid composition in upland cotton. J Plant Physiol. 2018;228:101–12. https://doi.org/10.1016/j.jplph.2018.06.002.
Konishi T, Shinohara K, Yamada K, Sasaki Y. Acetyl-CoA carboxylase in higher plants: most plants other than gramineae have both the prokaryotic and the eukaryotic forms of this enzyme. Plant Cell Physiol. 1996;37(2):117–22. https://doi.org/10.1093/oxfordjournals.pcp.a028920.
Pidkowich MS, Nguyen HT, Heilmann I, Ischebeck T, Shanklin J. Modulating seed beta-ketoacyl-acyl carrier protein synthase II level converts the composition of a temperate seed oil to that of a palm-like tropical oil. Proc Natl Acad Sci U S A. 2007;104(11):4742–7. https://doi.org/10.1073/pnas.0611141104.
Salas JJ, Ohlrogge JB. Characterization of substrate specificity of plant FatA and FatB acyl-ACP thioesterases. Arch Biochem Biophys. 2002;403(1):25–34. https://doi.org/10.1016/S0003-9861(02)00017-6.
Liu F, Ma L, Wang Y, Li Y, Zhang X, Xue F, et al. GhFAD2-3 is required for anther development in Gossypium hirsutum. BMC Plant Biol. 2019;19(1):393. https://doi.org/10.1186/s12870-019-2010-9.
Zhang T, Hu Y, Jiang W, Fang L, Guan X, Chen J, et al. Sequencing of allotetraploid cotton (Gossypium hirsutum -L. acc. TM-1) provides a resource for fiber improvement. Nat Biotechnol. 2015;33(5):531–7. https://doi.org/10.1038/nbt.3207.
Alves AQ, da Silva VA Jr, Góes AJS, Silva MS, de Oliveira GG, Bastos IVGA, de Castro Neto AG, Alves AJ. The fatty acid composition of vegetable oils and their potential use in wound care. Adv Skin Wound Care. 2019;32(8):1–8. https://doi.org/10.1097/01.ASW.0000557832.86268.64.
Pospišil M, Škevin D, Mustapić Z, et al. Fatty acid composition in oil of recent rapeseed hybrids and 00-cultivars. Agric Conspec Sci. 2007;72(3):187–93.
Yang S, Miao L, He J, Zhang K, Li Y, Gai J. Dynamic transcriptome changes related to oil accumulation in developing soybean seeds. Int J Mol Sci. 2019;20(9):2202. https://doi.org/10.3390/ijms20092202.
He M, Qin CX, Wang X, Ding NZ. Plant unsaturated fatty acids: biosynthesis and regulation. Front Plant Sci. 2020;11:390. https://doi.org/10.3389/fpls.2020.00390.
Dowd MK, Boykin DL, Meredith WRJr.; Campbell BT, Bourland FM, Gannaway JR, Glass KM, Zhang J. Fatty acid profile of cottonseed genotypes from the national cotton variety trials. J Cotton Sci. 2010;14:64–73.
Sturtevant D, Horn P, Kennedy C, Hinze L, Percy R, Chapman K. Lipid metabolites in seeds of diverse Gossypium accessions: molecular identification of a high oleic mutant allele. Planta. 2017;245(3):595–610. https://doi.org/10.1007/s00425-016-2630-3.
Banilas G, Karampelias M, Makariti I, Kourti A, Hatzopoulos P. The olive DGAT2 gene is developmentally regulated and shares overlapping but distinct expression patterns with DGAT1. J Exp Bot. 2011;62(2):521–32. https://doi.org/10.1093/jxb/erq286.
Chen H, Wang FW, Dong YY, Wang N, Sun YP, Li XY, et al. Sequence mining and transcript profiling to explore differentially expressed genes associated with lipid biosynthesis during soybean seed development. BMC Plant Biol. 2012;12(1):122. https://doi.org/10.1186/1471-2229-12-122.
Hovav R, Udall JA, Chaudhary B, Rapp R, Flagel L, Wendel JF. Partitioned expression of duplicated genes during development and evolution of a single cell in a polyploid plant. Proc Natl Acad Sci U S A. 2008;105(16):6191–5. https://doi.org/10.1073/pnas.0711569105.
Hovav R, Faigenboim-Doron A, Kadmon N, Hu G, Zhang X, Gallagher JP, et al. A Transcriptome profile for developing seed of Polyploid cotton. Plant Genome. 2015;8:1.
Ohlrogge JB, Jaworski JG. Regulation of fatty acid synthesis. Annu Rev Plant Physiol. 1997;48(1):109–36. https://doi.org/10.1146/annurev.arplant.48.1.109.
Shang X, Cheng C, Ding J, Guo W. Identification of candidate genes from the SAD gene family in cotton for determination of cottonseed oil composition. Mol Gen Genomics. 2017;292(1):173–86. https://doi.org/10.1007/s00438-016-1265-1.
Stewart JM, Oosterhuis DM, Heitholt JJ, Mauney, JR. Physiology of cotton. New York: Springer; 2010.
Kachroo A, Fu DQ, Havens W, Navarre D, Kachroo P, Ghabrial SA. An oleic acid–mediated pathway induces constitutive defense signaling and enhanced resistance to multiple pathogens in soybean. Mol Plant-Microbe Interact. 2008;21(5):564–75. https://doi.org/10.1094/MPMI-21-5-0564.
Jiang CJ, Shimono M, Maeda S, Inoue H, Mori M, Hasegawa M, et al. Suppression of the rice fatty-acid desaturase gene OsSSI2 enhances resistance to blast and leaf blight diseases in rice. Mol Plant-Microbe Interact. 2009;22(7):820–9. https://doi.org/10.1094/MPMI-22-7-0820.
Kachroo A, Kachroo P. Fatty acid-derived signals in plant defense. Annu Rev Phytopathol. 2009;47(1):153–76. https://doi.org/10.1146/annurev-phyto-080508-081820.
Zhang Y, Maximova SN, Guiltinan MJ. Characterization of a stearoyl-acyl carrier protein desaturase gene family from chocolate tree, Theobroma cacao L. Front Plant Sci. 2015;6:239.
Zhang J, Liu H, Sun J, Li B, Zhu Q, Chen S, et al. Arabidopsis fatty acid desaturase FAD2 is required for salt tolerance during seed germination and early seedling growth. PLoS One. 2012;7(1):e30355. https://doi.org/10.1371/journal.pone.0030355.
Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B Methodol. 1995;57:289–300.
Cheng XQ, Zhang XY, Xue F, Zhu SH, Li YJ, Zhu QH, et al. Characterization and transcriptome analysis of a dominant genic male sterile cotton mutant. BMC Plant Biol. 2020;20(1):312. https://doi.org/10.1186/s12870-020-02522-0.
This work was supported by National Natural Science Foundation of China [31960369, 31660353]. The funding bodies had no role in the design of the study and collection, analysis, and interpretation of data and in writing the manuscript.
The authors report no declarations of interest.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
The quantitative standard curve and correlation coefficient of major fatty acid methyl esters in cottonseed determined by GC-MS. Fig. S2. The number and distribution of differentially expressed genes at 5–60 DPA. A, The number of DEGs that were up-regulated or down-regulated at 5–60 DPA. B, Venn diagram showing DEGs overlapping in different cottonseed development stages or unique to each developmental stage. Fig. S3. Heatmap comparison DEGs associated with cottonseed developmental stages. A,Venn diagram showed the different KEGGs in five stages. B and C showed the DEGs associated to the functional categories at 5, 10, 15, 20, 25 and 30 DPA respectively. B shows DEGs related to the biosynthesis of unsaturated fatty acids. Red represents high expression, and green represents low expression. Each row represents a DEG.S5, S10, S15, S20, S25and S30 represent 5, 10, 15, 20, 25and 30 DPA respectively. Fig. S4. The ratio of C18:2n-6 to C18:3n-3 during cottonseed development
Composition, retention times and fragment ions of the standard mixture of 37-component fatty acid methyl esters by GC-MS. Table S2. Primers used in qRT-PCR analysis.Table S3. Statistics of RNA-seq reads and mapping results.
Analysis of GO enrichment for 15,647 non-redundant DEGs
Analysis of GO enrichment from 20DPA to 30DPA
Analysis of GO enrichment from 35DPA to 60DPA
Analysis of KEGG enrichment for 15,647 common DEGs
Analysis of KEGG enrichment of the DEGs
Analysis of KEGG enrichment at the 35–60 DPA
Key KEGG pathways genes enriched in six module
Candidate hub genes in the brown module
Analysis of GO enrichment in down-regulated DEGs of the 30 DPA vs 20 DPA
Number of DEGs exhibiting homeolog expression bias during cotton seed development
About this article
Cite this article
Ma, L., Cheng, X., Wang, C. et al. Explore the gene network regulating the composition of fatty acids in cottonseed. BMC Plant Biol 21, 177 (2021). https://doi.org/10.1186/s12870-021-02952-4