
Transcriptomic and metabolomic profiling of flavonoid biosynthesis provides novel insights into petals coloration in Asian cotton (Gossypium arboreum L.)



Asian cotton (Gossypium arboreum L.), a valuable cotton germplasm resource with insect resistance and stress tolerance, possesses a broad spectrum of phenotypic variation related to pigmentation. Flower color affects insect pollination and the ornamental value of plants. Studying the flower color of Asian cotton varieties can improve the rate of hybridization and thus enrich the diversity of germplasm resources. It is also relevant to the development of the horticultural industry. Unfortunately, there is a clear lack of studies on the intricate mechanisms of cotton flower-color differentiation. Here, we report an integrative transcriptome and metabolome analysis of flower color variation in three Gossypium arboreum cultivars.


A total of 215 differentially accumulated metabolites (DAMs) were identified, including 83 differentially accumulated flavonoids (DAFs). Colorless kaempferol was more abundant in white flowers, while gossypetin-fer showed specificity in white flowers. Quercetin and gossypetin were the main contributors to yellow petal formation. Pelargonidin 3-O-beta-D-glucoside and cyanidin-3-O-(6''-Malonylglucoside) showed high accumulation levels in purple petals. Quercetin and gossypetin pigments also promoted purple flower coloration. Moreover, 8178 differentially expressed genes (DEGs) were identified by RNA sequencing. The correlation results between total anthocyanins and DEGs were explored, indicating that 10 key structural genes and 29 transcription factors promoted anthocyanin biosynthesis and could be candidates for anthocyanin accumulation. Ultimately, we constructed co-expression networks of key DAFs and DEGs and demonstrated the interactions between specific metabolites and transcripts in different color flowers.


This study provides new insights into the regulatory mechanisms of cotton flower color and lays a potential foundation for generating cotton varieties with flowers that are highly attractive to pollinators.



Asian cotton (Gossypium arboreum L.) is one of the significant sources of natural fiber, initially introduced to China from the Indian subcontinent during the twelfth century [1]. It possesses resilience against biological and environmental stresses and is also a natural source of genetic variation for fiber-related traits [2,3,4,5]. Asian cotton has a broad spectrum of phenotypic variation related to pigmentation. The petals are often yellow, yellow–red, purple, or white, with or without purplish-red basal spots [6]. There is a long history of research into the color of cotton flowers [7]. The flower color of cotton has also become a tool for genetic and taxonomic studies of cotton [8,9,10]. Brightly colored petals attract insects like bees and butterflies [11], increasing pollen dispersal, increasing heterosis, and enriching germplasm resources. It is well known that hybridization produces abundant variation. Hybrid cotton has higher fiber yield than inbred cotton [12, 13]. However, current cotton hybrid breeding is limited by the low pollination rate of natural hybridization [12, 14]. Insect pollination is an effective route to cotton hybrid breeding, and flower color is an important phenotypic trait affecting insect pollination [15, 16]. Generating cotton varieties with flowers that are highly attractive to pollinators is therefore an ideal genetic improvement strategy. Moreover, colorful flowers are a significant addition to the horticultural industry [17]. Furthermore, Asian cotton, as a diploid closely related to the At genome of upland cotton, is a high-quality gene source. Its genetic mechanisms are easier to characterize than those of the allotetraploid, which has a complex genome. Therefore, exploiting cotton flower color has far-reaching implications for the diversity of cotton germplasm resources.

Previously published reports suggested varying levels of co-accumulation of secondary metabolites responsible for plant pigmentation, such as betalains, flavonoids, and carotenoids [18,19,20,21,22]. In general, flavonoids in plants are classified into six groups: chalcones, flavones, flavonols, isoflavonoids, anthocyanins, and flavanols [23]. Among them, flavones and flavonols, which are usually yellow or colorless, are one of the sources of color in fruits and flowers. Quercetin and gossypetin are known flavonoids involved in synthesizing yellow pigments in plants [24,25,26]. Flavonols like kaempferol, quercetin, and myricetin have important medical properties such as free radical scavenging and antioxidant activity [27,28,29]. Anthocyanins are a type of flavonoid that can be found in a variety of plants as a natural water-soluble pigment. They are responsible for the color development of flowers, leaves, and fruits [30,31,32,33,34,35,36,37]. Phenylpropanoid biosynthesis, flavonoid metabolism, and anthocyanin metabolism are three major stages of flavonoid biosynthesis in plants. Flavonoid biosynthesis is aided by the structural genes PAL, C4H, 4CL, CHI, CHS, F3H, DFR, ANS, UFGT, and 3GT [38]. The F3H gene was first cloned from Artemisia annua, and in vitro enzyme activity analysis showed that it could convert pinealin to dihydrokaempferol [39]. In Cheng's study, three DFR genes (DFR1, DFR2, DFR3) were cloned from Ginkgo biloba; DFR1 was found to convert dihydroquercetin into colorless anthocyanins, while DFR2 could convert dihydrokaempferol into white leucopelargonidin [40]. Transcription factors are another type of regulatory gene involved in the phenylpropanoid-flavonoid synthesis pathway. MYB, bHLH, MADS-box, and WD40 are the main transcription factor families identified in previous studies as regulators of the flavonoid synthesis pathway [41, 42], with MYB, bHLH, and WD40 forming a ternary complex to regulate the expression of structural genes [43]. The bHLH transcription factor and the R2R3-MYB protein activate anthocyanin biosynthetic genes in Petunia and most other dicotyledons [44]. Gonzales et al. [45] found that Arabidopsis MYB75, MYB90, MYB113 and MYB114 regulate the expression of the Arabidopsis anthocyanin biosynthesis genes F3'H, DFR, ANS and UFGT.

Advancements in omics have made it possible to integrate multi-omics data to better understand the regulatory mechanisms behind particular traits [46]. Combining transcriptomic and metabolomic methods, Zhang et al. [47] studied the peeling process of winter jujube. They found that a large amount of pigment is deposited in the cell wall, revealing the metabolic pathways and key genes that control the biosynthesis of lignin during the peeling process of winter jujube. Wang et al. [48] emphasized that the red fading of 'Red Bartlett' pears is closely related to a decrease in anthocyanin synthesis, an increase in degradation, and inhibition of anthocyanin transport. Another study on Tunisian soft-seed pomegranate [49] identified 51 phenolic compounds, most of which were significantly more abundant in red–purple pomegranate arils than in light red arils. Similarly, by combining transcriptomic and metabolomic approaches, researchers identified VuMYB90-1, VuMYB90-2, VuMYB90-3, VuCPC, VuMYB4, bHLH, and WD40 proteins that affect the accumulation of anthocyanins and flavonoids by regulating the expression of structural genes [50].

Hybrid cotton varieties produce higher yields than inbred varieties. However, inefficient and costly cross-breeding is a pressing problem in cotton production today. Developing cotton varieties with highly attractive flowers for pollinators is an effective approach to reducing costs and increasing pollination efficiency. This study aimed to decipher petal color variation in Asian cotton utilizing a metabolomics platform coupled with transcriptomics. Our results will provide a genetic basis for flower color variation in Asian cotton, contributing to developing new high-yield hybrid varieties and further enriching cotton germplasm resources.


Phenotypes of petals of different Asian cotton varieties

To comprehend the genetic and metabolite regulatory networks of different flower color varieties, we chose three representative varieties with different petal colors for our study, Shixiya 1, GA0146, and GA0149. On the day of anthesis, Shixiya 1 has a white corolla and dark red petal basal spots and is named W_Flo. The wild-type GA0146 and its mutant material GA0149 were biologically similar except for the difference in corolla color (Fig. 1A), with GA0146 having a yellow corolla and GA0149 having a purple corolla, named Y_Flo and P_Flo, respectively. We measured the total anthocyanin content of petals for each variety, depicting significant differences in anthocyanin accumulation in petals (Fig. 1B). W_Flo had the lowest anthocyanin content (24.25 mmol/g), while Y_Flo (50.08 mmol/g) and P_Flo (192.53 mmol/g) had significantly higher anthocyanin content.

Fig. 1

Comparison of phenotype and total anthocyanin content of white, yellow and purple Asiatic cotton (Gossypium arboreum L.) petals. A Color observation of petals from ‘Shixiya 1’ (W_Flo), ‘GA0146’ (Y_Flo), and ‘GA0149’ (P_Flo) Asiatic cotton. B Total anthocyanin content of three G. arboreum petals. Letters displayed above each rectangle indicate significant differences between samples (p ≤ 0.05)

Overview of the metabolomic data

We determined metabolic profiles using LC–MS/MS to comprehend the differential accumulation of flavonoids and their impact on flower color regulation. Metabolite quantification was subjected to principal component analysis (PCA) (Fig. 2A). The first two components explained 65.61% of the variation, with 43.52% for PC1 and 22.09% for PC2. We quantified 569 metabolites from 12 major metabolite classes, including flavonoids (122, 21.44%), lipids (78, 13.71%), amino acids and derivatives (76, 13.36%), phenolic acids (60, 10.54%), organic acids (51, 8.96%), nucleotides and derivatives (51, 8.96%), alkaloids (36, 6.33%), tannins (15, 2.64%), lignans and coumarins (9, 1.58%), steroids (1, 0.18%), terpenoids (1, 0.18%), and others (69, 12.13%) (Additional file 1: Figure S1). The accumulation pattern of the identified metabolites in the three samples is presented as a heatmap (Fig. 2B), representing the differential metabolic landscape of the different petal colors. Moreover, replicates of each sample grouped together in both PCA and cluster analysis, supporting the quality and reproducibility of the datasets.
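The class percentages above follow directly from the reported counts. A minimal Python sketch, using only the counts stated in the text (the names `class_counts` and `class_share` are ours, for illustration, and are not part of the authors' pipeline), recomputes each class's share of the 569 quantified metabolites:

```python
# Counts per metabolite class, as reported in the text.
class_counts = {
    "flavonoids": 122,
    "lipids": 78,
    "amino acids and derivatives": 76,
    "phenolic acids": 60,
    "organic acids": 51,
    "nucleotides and derivatives": 51,
    "alkaloids": 36,
    "tannins": 15,
    "lignans and coumarins": 9,
    "steroids": 1,
    "terpenoids": 1,
    "others": 69,
}


def class_share(counts):
    """Return each class's share of the total, in percent (2 decimals)."""
    n = sum(counts.values())
    return {name: round(100 * c / n, 2) for name, c in counts.items()}


total = sum(class_counts.values())
shares = class_share(class_counts)

print(f"total metabolites: {total}")
for name, pct in shares.items():
    print(f"{name}: {class_counts[name]} ({pct}%)")
```

The same check applied to the PCA figures confirms that the two reported components (43.52% and 22.09%) sum to the stated 65.61%.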

Fig. 2

Preliminary analysis of metabolomics data. Comparison of metabolite composition and content in petals. A Principal component analysis (PCA) of the metabolite quantification in petals of three colors (W_Flo, Y_Flo, P_Flo). B Cluster analysis of all metabolites from samples of W_Flo, Y_Flo and P_Flo. In the heat map, different colors represent the degree of accumulation of each metabolite, with red to blue indicating high to low. C Detailed classification of flavonoid metabolites

Among the identified metabolites, we focused on the accumulation pattern of flavonoids. We characterized 134 metabolites, including 122 flavonoids and 12 proanthocyanidins (Additional file 2: Table S1). The identified flavonoids could be classified into subclasses, including chalcones, sinensetin, dihydroflavone, anthocyanins, flavonols, flavonoids, isoflavones, and proanthocyanidins (Fig. 2C). A heatmap depicting their differential accumulation is presented in Additional file 3: Figure S2. Flavonols accounted for a large proportion (61). Moreover, five anthocyanins were identified: cyanidin-3-O-(6''-Malonylglucoside), pelargonidin 3-O-beta-D-glucoside, cyanidin-3-galactoside chloride, cyanidin 3-glucoside, and cyanidin-O-syringic acid.

We summarized and mapped the phenylpropanoid-flavonoid biosynthetic pathway (Fig. 3). Fifty-four metabolites were identified in the pathway, with their content varying across samples. Interestingly, most kaempferols were relatively high in W_Flo; quercetin and most of its derivatives accumulated substantially in Y_Flo; gossypetin species were abundant in Y_Flo and P_Flo; and pelargonidin accumulated significantly in purple petals. With the exception of cyanidin-O-syringic acid, the cyanidins (three-quarters of them) were highly accumulated in purple petals. The content of catechin derivatives gradually decreased, for example gallocatechin-gallocatechin-catechin (from 175,570 in white to 71,507 in purple). Most of the remaining flavonoid metabolites were high in yellow and purple petals. Therefore, we speculated that kaempferols, quercetin, gossypetin, cyanidin and pelargonidin are key factors in regulating petal color in Asian cotton. Pelargonidin and cyanidin may cause the purple corolla. Gossypetin and quercetin are significant yellow pigments involved in the deposition of yellow petal pigments, while kaempferol is an essential metabolite regulating the formation of white petals.

Fig. 3

Heat map of the phenylpropanoid-flavonoid synthesis pathway in petals of three colors: W_Flo, Y_Flo and P_Flo. The color of each rectangle represents the average amount of flavonoid metabolites in the two biological replicates. Blue rectangles indicate low and red rectangles indicate high content. PAL: phenylalanine ammonia-lyase; C4H: cinnamate 4-hydroxylase; 4CL: 4-coumarate-CoA ligase; CHS: chalcone synthase; CHI: chalcone isomerase; DFR: dihydroflavonol 4-reductase; FLS: flavonol synthase; F3H: flavanone 3-hydroxylase; F3′H: flavonoid 3′-hydroxylase; F3′5′H: flavonoid 3′,5′-hydroxylase; FNS: flavone synthase; ANS: anthocyanidin synthase; UFGT: UDP-glucose: flavonoid 3-O-glucosyltransferase

Metabolic differences among the three colors petals of G. arboreum

Comparisons of the metabolic profiles of the three samples identified 140, 177, and 76 DAMs in W_Flo vs. Y_Flo, W_Flo vs. P_Flo, and Y_Flo vs. P_Flo, respectively (Fig. 4A; Additional file 4: Table S2). As shown in Fig. 4A, 215 metabolites were differentially accumulated in at least one of the comparison groups, with up-regulated DAMs ranging from 27 to 72 and down-regulated DAMs ranging from 49 to 109. In addition, 18 DAMs were differentially accumulated in all three comparison groups. These conserved DAMs fall into six categories (amino acids and derivatives, phenolic acids, nucleotides and derivatives, flavonoids, organic acids, and others), with flavonoids being the most abundant class (Fig. 4B). The identified DAMs were mapped to KEGG pathways for enrichment analysis. Annotation of the DAMs revealed that they are related to anthocyanin biosynthesis, phenylpropanoid biosynthesis, flavone and flavonol biosynthesis, flavonoid biosynthesis, and isoflavonoid biosynthesis (Fig. 4C). This indicates that the metabolites responsible for the different colors of Asian cotton are mainly flavonoids.
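The Venn-style counts above (DAMs found in at least one comparison versus DAMs shared by all three) correspond to the union and the intersection of the three per-comparison DAM sets. A minimal sketch of that bookkeeping follows; the metabolite IDs `m1` to `m6` are made-up placeholders, not entries from Table S2, and `summarize_dams` is a hypothetical helper name:

```python
def summarize_dams(dam_sets):
    """Given {comparison name: set of DAM IDs}, return (union, intersection)."""
    sets = list(dam_sets.values())
    union = set().union(*sets)          # DAMs in at least one comparison
    shared = set.intersection(*sets)    # DAMs present in every comparison
    return union, shared


# Hypothetical toy input; the real sets would hold 140, 177 and 76 IDs.
dams = {
    "W_Flo_vs_Y_Flo": {"m1", "m2", "m3"},
    "W_Flo_vs_P_Flo": {"m1", "m2", "m4", "m5"},
    "Y_Flo_vs_P_Flo": {"m1", "m5", "m6"},
}

union, shared = summarize_dams(dams)
print(f"DAMs in at least one comparison: {len(union)}")
print(f"DAMs shared by all comparisons: {sorted(shared)}")
```

Applied to the real ID lists, the size of `union` would correspond to the 215 DAMs and the size of `shared` to the 18 conserved DAMs reported here.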

Fig. 4

Differentially accumulated metabolites (DAMs) analysis of metabolome. A The number of differentially accumulated metabolites between three kinds of petals. B Venn diagram showing the overlapping and unique DAMs in comparison groups. C The top 25 terms of KEGG enrichment of DAMs. The vertical axis represents the enriched KEGG pathways. The horizontal axis represents the name of comparison groups. The significantly enriched pathways are labeled by red text

We also screened for differentially accumulated flavonoids (DAFs) (Additional file 5: Table S3) and found 66, 76 and 21 DAFs in the three comparison groups (W_Flo vs. Y_Flo, W_Flo vs. P_Flo, and Y_Flo vs. P_Flo), respectively. A total of 6 overlapping DAFs were identified across the three comparison groups, including two anthocyanins (pelargonidin 3-O-beta-D-glucoside and cyanidin-3-O-(6''-Malonylglucoside)). Therefore, we speculated that differential accumulation of these anthocyanins may cause a change in cotton flower color. Moreover, in W_Flo vs. Y_Flo and W_Flo vs. P_Flo, as opposed to Y_Flo vs. P_Flo, the enrichment of phenylpropanoid biosynthesis, flavone and flavonol biosynthesis, flavonoid biosynthesis, isoflavonoid biosynthesis and biosynthesis of secondary metabolites was more significant. The results suggested that differential accumulation of metabolites in the phenylpropanoid-flavonoid pathway may cause yellow to purple flower color variation. In comparisons of purple flowers with white and yellow flowers, the DAMs were mainly enriched in isoquinoline alkaloid biosynthesis, phenylalanine metabolism, glutathione metabolism, glucosinolate biosynthesis and cyanoamino acid metabolism pathways, which may give purple petals a higher capacity than white and yellow petals to respond to external stresses such as pests, oxidation, drought and temperature [51,52,53].

Overview of the transcriptome data and identification of DEGs

The metabolome is interpreted as the end product of genetic pathways with genes as basic regulators. To identify the genes involved in flower color changes, we further sequenced the transcriptome of Asian cotton petals using RNA-Seq technology to explore the regulatory mechanism of the flavonoid compounds causing flower color changes. We took the petals of W_Flo, Y_Flo and P_Flo on the day of flowering and constructed nine cDNA libraries with three biological replicates in each group. The sequencing yielded a total of 205.55 million reads, and after filtering out the low-quality reads, we obtained a total of 61.39 Gb clean data. The Q30 of each cDNA library was above 92% (Table S4) (Additional file 6: Table S4). We identified 40,960 genes and quantified the expression of these genes in petal tissues. PCA was performed using their FPKM values (Fig. 5A). The replicates in each group clustered together, validating the credibility of transcriptome data sets for further downstream analysis. Moreover, the expression profile based on FPKM values has been presented in Fig. 5B. To verify the accuracy of the transcriptome data, we selected 6 genes from the flavonoid biosynthetic pathway for qRT-PCR expression analysis. The significant positive correlation between RNA-Seq and qRT-PCR data is shown in Fig. S3 (Additional file 7: Figure S3). The relative gene expression level of qRT-PCR was consistent with the FPKM value of the RNA-Seq data, indicating that the RNA-Seq data is credible and accurate.

Fig. 5

Statistical analysis of transcriptome data from three cotton petal samples. A PCA score plot. B cluster analysis. The color indicates the relative levels of genes from low (blue) to high (red)

Differentially expressed genes between samples with different flower colors were identified using the thresholds |Log2 Fold Change| ≥ 1 and FDR < 0.05 (Fig. 6). There were 4,777 DEGs in W_Flo vs. Y_Flo, of which 2,233 genes were up-regulated and 2,544 genes were down-regulated; in W_Flo vs. P_Flo, there were 6,244 DEGs, including 2,152 genes up-regulated and 4,103 genes down-regulated; the Y_Flo vs. P_Flo group had the fewest DEGs (3,249), with only 842 genes up-regulated and 2,407 genes down-regulated (Fig. 6A). Based on the number of DEGs, we hypothesized that the change from white petals to yellow or purple requires more changes in gene expression than the change from yellow to purple. This is consistent with the peach flesh changes studied by Hong Ying et al. [54]. Furthermore, we identified 816 DEGs conserved among the three groups (Fig. 6B), suggesting their involvement in flower color regulation.
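The DEG criteria stated above can be written as a simple filter. The sketch below applies only the two reported thresholds (|Log2 Fold Change| ≥ 1 and FDR < 0.05); the record type `GeneResult`, the function `call_degs`, and the example values are hypothetical, and the differential expression software actually used is not specified in this section. Which sign counts as "up-regulated" depends on which sample is treated as the reference.

```python
from dataclasses import dataclass


@dataclass
class GeneResult:
    gene_id: str
    log2fc: float  # log2 fold change between the two groups
    fdr: float     # multiple-testing-adjusted p-value


def call_degs(results, lfc_cutoff=1.0, fdr_cutoff=0.05):
    """Split results into up- and down-regulated DEGs using the thresholds."""
    up, down = [], []
    for r in results:
        if abs(r.log2fc) >= lfc_cutoff and r.fdr < fdr_cutoff:
            (up if r.log2fc > 0 else down).append(r.gene_id)
    return up, down


# Hypothetical values, for illustration only.
results = [
    GeneResult("geneA", 2.3, 0.001),   # passes both thresholds, up
    GeneResult("geneB", -1.0, 0.04),   # |log2FC| exactly 1 passes, down
    GeneResult("geneC", 0.8, 0.001),   # fold change too small
    GeneResult("geneD", -3.1, 0.05),   # FDR is not strictly below 0.05
]

up, down = call_degs(results)
print("up-regulated:", up)
print("down-regulated:", down)
```

With this kind of filter, the up- and down-regulated counts for a comparison should sum to its DEG total, which is a quick consistency check on the figures reported for each group.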

Fig. 6

Multivariate statistical analysis of transcriptome data in three Asiatic cotton petals samples. A The number of differentially expressed genes in each comparison group. B Venn diagram of DEGs in petals of different colors. C KEGG enrichment of DEGs in each comparison group. The y-axis indicates the KEGG pathway, and the x-axis indicates the name of each group. Red characters indicate flavonoid synthesis-related pathways and green words show other key metabolic pathways

All 8,178 DEGs (detected in at least one comparison group) were further annotated using KEGG metabolic pathways (Fig. 6C). Among the top 25 enriched pathways, DEGs were significantly involved in metabolism, carbohydrate metabolism, pentose and glucuronate interconversions, biosynthesis of other secondary metabolites, photosynthesis and plant-pathogen interaction, while a few important genes were also enriched in phenylpropanoid biosynthesis, carotenoid biosynthesis, flavonoid biosynthesis, and flavone and flavonol biosynthesis (Fig. 6C). Meanwhile, the DEGs in purple flowers compared to white and yellow flowers were mainly enriched in processes such as photosynthesis and energy metabolism. These results suggest that changes in gene expression lead to significant changes in metabolic activity, in line with the metabolome results. To clarify the cellular components, molecular functions, and biological processes associated with the DEGs, we mapped all 8,178 DEGs to the GO database for enrichment analysis (Additional file 8: Figure S4). As shown in Figure S4, about 79.13% of the genes had catalytic activity (32.01%, 2,618), transferase activity (12.07%, 987), small molecule binding (11.98%, 980), nucleotide binding (11.53%, 943) or nucleoside phosphate binding (11.53%, 943). In the cellular component category, about 16.36% of genes were annotated to cellular anatomical entity and 8.52% to membrane, and more than 22.65% of genes were enriched in biological processes such as macromolecule modification, cellular protein modification process and protein modification process.

Flavonoid biosynthesis and differential expression of regulatory genes

Combining the KEGG enrichment results and gene function annotation, we screened important DEGs in the phenylpropanoid-flavonoid synthesis pathway, including the flavonoid biosynthesis (ko00941), phenylpropanoid biosynthesis (ko00940), isoflavonoid biosynthesis (ko00943) and flavone and flavonol biosynthesis (ko00944) pathways, with the number of DEGs per pathway ranging from 2 to 66. Twenty-four key DEGs in the flavonoid synthesis pathway were potentially associated with color variation, including 4CL (Ga05G0055, Ga05G1511, Ga08G1805, Ga01G2437), CHI (Ga13G0234, Ga04G1997), CHS (Ga09G0006, Ga10G1445, Ga10G1446, Ga05G3486), DFR (Ga05G2037, Ga06G0096), F3'H (Ga11G2145), FL (Ga05G2477, Ga04G1879), LAR (Ga12G1133), PAL (Ga02G1655, Ga04G0847, Ga09G1700, Ga11G0075), ANR (Ga05G1789), ANT17 (Ga08G2083), and UFGT (Ga11G2554, Ga02G0536). The expression profiles of these genes varied significantly among the different-colored flowers (Fig. 7A). To explore the relationship between DEGs and total anthocyanin accumulation and to identify key genes regulating anthocyanin synthesis, we calculated Pearson correlation coefficients (Table 1). The results showed that 13 DEGs promoted anthocyanin synthesis, and 10 DEGs suppressed anthocyanin synthesis. Among them, the expression levels of Ga05G2037 (DFR), Ga02G1655 (PAL), Ga05G3486 (CHS), Ga04G0847 (PAL), Ga06G0096 (DFR), Ga10G1446 (CHS), Ga02G0536 (UFGT), Ga13G0234 (CHI), Ga10G1445 (CHS) and Ga08G2083 (ANT17) were significantly and positively correlated with the total anthocyanin content (R2 ≥ 0.95), suggesting that these 10 genes play an important role in anthocyanin accumulation.
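The gene-to-anthocyanin correlation described above can be illustrated with a small Pearson calculation. This is a sketch under stated assumptions: the anthocyanin values are the three group means reported earlier in the text, whereas the FPKM values and the gene labels `gene_up` and `gene_down` are invented for illustration, and the authors' analysis may have used replicate-level data rather than group means.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Total anthocyanin content for W_Flo, Y_Flo, P_Flo, as reported in the text.
anthocyanin = [24.25, 50.08, 192.53]

# Hypothetical FPKM values for two genes (illustrative only).
fpkm = {
    "gene_up": [5.0, 12.0, 60.0],     # rises with anthocyanin content
    "gene_down": [80.0, 55.0, 10.0],  # falls as anthocyanin content rises
}

for gene, values in fpkm.items():
    r = pearson_r(values, anthocyanin)
    # r carries the direction of the relationship; r squared only its strength.
    print(f"{gene}: r = {r:.3f}, r^2 = {r * r:.3f}")
```

Because squaring discards the sign, a threshold on R2 alone cannot separate candidate promoters from candidate repressors; the sign of r is what distinguishes the two groups in a screen of this kind.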

Fig. 7

Heatmap showing the expression levels of the differentially expressed structural genes (A) and TFs (B) associated with color (FPKM value). Orange indicates high expression and blue indicates low expression

Table 1 Differential expression of structural genes in each comparison group

Transcription factors are major players in regulating the expression of structural genes and, in turn, metabolite synthesis. In this study, we identified and classified 734 TFs, mainly from the MYB, AP2/ERF, NF-Y A/B/C, WRKY, and bHLH families (Additional file 9: Figure S5). The MYB, bZIP, WRKY, and bHLH families play crucial roles in the flavonoid and anthocyanin biosynthetic pathways [45]. These TFs were also differentially expressed in petals of different colors, with the differences mainly concentrated in W_Flo vs. P_Flo (144). In brief, both DEGs and TFs were more abundant in white flowers than in purple flowers, explaining why pelargonidin and cyanidin accumulation was highest in P_Flo. In addition, the correlation coefficients between TFs and total anthocyanin content indicated a total of 29 positively regulated TFs (R2 > 0.9) (Table 2), among which MYB was predominant and might act as a promoter of anthocyanin accumulation. Four negatively regulated (R2 < 0.9) TFs, including PIF, bHLH, MYB, and ATHB, may be repressors of anthocyanin synthesis. The positively regulated transcription factors were expressed at the highest level in P_Flo, while the negative regulators were expressed at the highest level in W_Flo (Fig. 7B).

Table 2 Differential expression of known TFs in each comparison group

Regulatory networks between metabolites and genes associated with flower color

In addition to anthocyanins, quercetin and gossypetin were also considered equally vital substances resulting in yellow petals prior to the characterization of carotenoids [24, 25]. Moreover, quercetin and kaempferol were found to accumulate in high amounts in white Primula vulgaris [26]. Therefore, to understand the regulatory relationships between anthocyanins, quercetin, gossypetin, and kaempferol in pigmentation, we performed correlation analysis between key anthocyanins (2), gossypetin (5), quercetin (10), and kaempferol (5) (Table 3), with DEGs in the phenylpropanoid-flavonoid synthesis pathway to construct a network (Table S5) (Additional file 10: Table S5).

Table 3 List of the key DAFs regulating flower coloration in the 3 flower types and their Log2FC values

The regulatory network revealed genes associated with key metabolites and depicted compounds relevant to petal color (Fig. 8). Kaempferol was significantly associated with the white corolla, in agreement with Li's study [26]. The colorless kaempferol showed high accumulation levels in W_Flo. Hmcp002029 (kaempferol-O-Pentoside-O-hexoside), pme0321 (kaempferol-7-O-rhamnoside), and Lmmn003398 (kaempferol acetyl-glucoside) showed the most significant fold changes among the three comparison groups. The expression patterns of two genes, PAL (Ga11G0075) and 4CL (Ga05G0055), showed a significant positive correlation with Hmcp002029 accumulation. Moreover, MYBs, FL (Ga05G2477) and F3H (Ga11G2145) jointly negatively regulate the synthesis of pme0321. The genes significantly associated with Lmmn003398 were mainly TFs (MYB and WRKY). The metabolite Rfg0001-der20 (gossypetin-fer) was specific to W_Flo, with a Log2 fold change of up to 13.92. Transcriptome data showed that MYB (Ga07G1423, Ga12G1064, Ga11G2161, Ga02G1643, Ga03G2432, Ga12G1767) and WRKY (Ga04G1859) were highly correlated with Rfg0001-der20. In addition, Ga05G0055 (4CL1) and Ga04G1879 (FLS) were also positively correlated with Rfg0001-der20 (R2 > 0.95). We speculate that these genes positively regulate the synthesis of gossypetin-fer. Previous studies have reported that TFs are essential in the biosynthesis of colorless kaempferol-associated flavonols and cooperate with structural genes to control flower color. Therefore, we hypothesize that MYBs and WRKY are central genes co-expressed or regulated with 4CL, FLS, and PAL, and are thus responsible for forming the white corolla.

For Y_Flo (Fig. 8), the results showed a significantly higher accumulation of quercetin and gossypetin compared to white flowers, especially pme3130 (Log2 FC = 10.12), Lmmp003266 (Log2 FC = 9.65), Lmmp003271 (Log2 FC = 7.02), and pma0214 (Log2 FC = 6.55). Interestingly, Ga05G0055 (4CL1) and Ga04G1879 (FLS) were negatively correlated with Lmmp003271, pme3130, and pma0214, suggesting that high expression of these two genes suppressed the synthesis of gossypetin 7-glucoside, quercetin 4'-O-glucoside, and methylquercetin O-hexoside. Although Lmmp003266 accumulated only in trace amounts in Y_Flo, it was significantly higher in yellow flowers than in white and purple flowers. Some members of the bHLH, MYB, and WRKY families, such as MYB12, MYB30, MYB306, MYB1236, MYB44, MYB61, bHLH14, and WRKY2, had their lowest expression in Y_Flo, where the content of quercetin or gossypetin was highest, suggesting that these pigments, which are responsible for yellow petals, are negatively regulated by these TFs.

The co-expression network (Fig. 8) showed that multiple metabolites contributed to the purple petals, with all four kinds of metabolites present in the purple flowers. The contents of pma3392 and pmb0542 gradually increased across the three groups, with the purple petals having the highest abundance, indicating that they are essential for purple petals. Pma3392 accumulation not only relied on the positive regulation of CHS (Ga10G1445, Ga10G1446, Ga05G3486), DFR (Ga06G0096, Ga05G2037), FL (Ga05G2477), and MYB (Ga09G0340, Ga05G2658, Ga02G0929, Ga09G0647, Ga08G2734) but was also negatively regulated by bHLH (Ga08G1465). Pmb0542 was mainly regulated by MYB: MYB4 (Ga03G2139) positively regulated the accumulation of pmb0542, while MYB16 (Ga04G1608) negatively regulated its synthesis.

Fig. 8

Co-expression networks of differentially expressed genes and metabolites associated with pigmentation. Metabolites including kaempferol, quercetin, gossypiin, and anthocyanins, are shown in milky white, yellow, and pink, respectively. DEGs involve phenylpropanoid biosynthesis, flavonoid biosynthesis, isoflavonoid biosynthesis, flavone and flavonol biosynthesis and transcription factors. The size of the diamond and circle represents the genes FPKM and the metabolite content, respectively. The milk-white diamonds are kaempferol; the yellow diamonds mean quercetin or gossypiin, and the red diamonds are anthocyanins. The names of the metabolites are as follows: 1. Hmcp002029; 2. Lmmp002 l 30; 3. mws 1043-N; 4. Lmmn003398; 5. pme0321; 6. Lmmp002529; 7. Rfg0001-der20; 8. Rfg0376-der08; 9. pma0214; 10. Lmmp003266; 11. Rfg0001-der07; 12. pme3369; 13. pme3130; 14. pme2954; 15. Lmyp004052; 16. Mws1139-N; 17. Lmmp003271; 18. Rfg0380-der04; 19. mws0855; 20. mws0045; 21. pme3392; 22. pmb0542. The solid line indicates that genes and metabolites are positively regulated, while the dotted line indicates negative regulation

In general, the purple petals appeared to be formed by the differential accumulation of anthocyanins and quercetin, and to be influenced by several other metabolites. The yellow petals seemed to be most influenced by gossypetin and quercetin. On the other hand, the white petals resulted from the weak accumulation of Rfg0001-der20 and the high kaempferol content.


The flower color in Asian cotton ranges from white to yellow to purple. Brightly colored petals attract insects, spreading pollen and enriching Asian cotton's germplasm resources. Asian cotton can also be used as an ornamental plant owing to its attractive flowers. Moreover, flavonoids are involved in the formation and development of flowers, fruits, and seeds in plants and have other functions such as antioxidant activity, UV protection, and protection against biotic and abiotic stresses. The study of flavonoids in cotton petals could provide a new route to anthocyanin extraction and thus increase the added value of Asian cotton. To date, the metabolome of flower petals in Asian cotton has not been reported. In this study, we systematically applied metabolomic and transcriptomic approaches to identify the major flavonoid metabolites affecting Asian cotton petals and screened key genes associated with petal color formation, providing important information for the enrichment of cotton germplasm resources.

The important role of anthocyanins in the color formation of Asian cotton petals

Anthocyanins are the primary factor influencing the coloration of different plant organs [34,35,36,37, 55, 56], and several studies have demonstrated their essential role in color formation. For instance, Xue et al. [34] reported an association between anthocyanins and the red seed coat of peanut. Qiu et al. [57] found that the total anthocyanin content of purple passion fruit was significantly higher than that of yellow fruit, raising the potential value of passion fruit as a functional food. In purple wheat [58], which has special health benefits, seed color deepens and total anthocyanin content increases over the course of seed development. White, yellow, blue and pink Primula vulgaris [26] showed a gradual increase in total anthocyanin content as the color deepened. Similarly, by phenotyping and comparing different varieties of Asian cotton, we found that the total anthocyanin content of white, yellow, and purple petals differed significantly. Purple petals, which are rich in anthocyanins, could therefore be used as a natural source of pigments for food and enhance the added value of Asian cotton.

The potential metabolites involved in flower coloration

Determining the types and contents of secondary metabolites is essential for the study and exploitation of a species. Flavonoids are vital secondary metabolites in plants [59, 60] and fulfill a variety of physiological functions during growth and development: they change the color of plant organs [61], defend against biotic attack such as pathogenic bacteria [62], and mitigate damage from abiotic factors such as temperature, drought, and salinity [55, 63,64,65]. We therefore performed a qualitative and quantitative analysis of secondary metabolites in Asian cotton petals. A total of 134 flavonoids were identified by metabolomic analysis. Forty-eight metabolites (including 32 DAFs) belonging to the flavonoids, flavanols and anthocyanins were enriched in well-known metabolic pathways. Comparing the flavonoid compounds of W_Flo, Y_Flo, and P_Flo, we found a dramatic shift in metabolite content in the phenylpropanoid-flavonoid biosynthetic pathway as petal color deepened, with a gradual decrease in catechin derivatives such as gallocatechin. We therefore speculate that catechins are necessary precursors for anthocyanin synthesis, consistent with the results of Wang et al. [58] on purple wheat seeds. Other flavonoid metabolites showed diverse trends, indicating that petal coloration in Asian cotton is a complex and dynamic process regulated by multiple factors.

More than 20 anthocyanidins have been identified so far, among which cyanidin, pelargonidin, peonidin, petunidin, delphinidin and malvidin are the most common in plants [66]. In this study, two key anthocyanidin types were identified: cyanidin and pelargonidin. Cyanidin produces red to purple hues, while pelargonidin contributes orange and red [67]. Pelargonidin 3-O-beta-D-glucoside and cyanidin-3-O-(6''-malonylglucoside) were differentially accumulated among the three petal colors. Xue et al. [68] emphasized the role of cyanidin and pelargonidin in darkening strawberry flower color. In P. vulgaris, where carotenoids were not found, quercetin and gossypetin were considered the major yellow pigments [24, 25]. By analyzing the phenylpropanoid-flavonoid biosynthetic pathway, we identified 12 quercetin and 8 gossypetin derivatives whose content differed significantly among petal colors. Quercetin 4'-O-glucoside, methylquercetin O-hexoside, gossypetin 7-glucoside, and quercetin glu-malonyl-pen were specific to the yellow group, and we predict that they are the main metabolites responsible for yellow petals. Anthocyanin components were also detected in W_Flo, possibly because of the spots at the petal base. However, the white petals contained higher levels of kaempferol derivatives than the other two groups: kaempferol acetyl-glucoside and kaempferol 3-O-galactoside had the highest content, and kaempferol-O-pentoside-O-hexoside had the highest fold change, which may be an important factor in the pigmentation of white petals.

The key DEGs responsible for the phenylpropanoid-flavonoid biosynthesis pathway

Transcriptome analysis is important for identifying genes responsible for a stage-specific trait. Across the comparison groups, only a few genes were progressively up-regulated as petal color deepened, suggesting that color deepening involves only a few key genes. Functional enrichment of the DEGs indicated that changes in gene expression were accompanied by significant changes in metabolic activities. Some key secondary metabolites in plants have defensive and chemosensory functions against pests and diseases [69, 70]. We also found that anthocyanin biosynthesis is linked to the phenylpropanoid-flavonoid synthesis pathway, carbohydrate metabolism, pentose and glucuronate interconversions, and plant-pathogen interaction. Sugars, carbohydrates, and plant hormones play crucial roles in anthocyanin biosynthesis [58]. By analyzing the transcriptome, we screened important DEGs in the phenylpropanoid-flavonoid synthesis pathway, including 4CL, CHI, CHS, DFR, F3'H, FLS, LAR, PAL, and UFGT; most of these structural genes were highly expressed in purple petals [71]. We further selected 10 structural genes whose expression was strongly correlated with total anthocyanin content, suggesting critical roles of DFR, PAL, CHS, UFGT and CHI in anthocyanin accumulation [72]. We also identified MYBs that may act as promoters of anthocyanin accumulation, as well as Ga11G2616 (PIF), Ga08G1465 (bHLH), Ga08G1561 (MYB) and Ga02G0399 (ATHB), which may act as repressors of anthocyanin synthesis [73,74,75]. These results suggest that these 10 structural genes and 29 TFs are candidate genes for regulating petal color traits.

Co-expression networks of key metabolites and genes

Combined transcriptome and metabolome analysis is an important tool for identifying the genes responsible for different petal colors [48, 76, 77]. Correlation analysis between the metabolome and transcriptome showed that the expression levels of some DEGs were closely related to the accumulation of flavonoid metabolites. We therefore constructed an interaction network containing 161 DEGs involved in the phenylpropanoid-flavonoid synthesis pathway and 29 flavonoids (7 kaempferol, 10 quercetin, 5 gossypetin, and 2 anthocyanins). Overall, purple petals were regulated by multiple metabolites, with kaempferol, quercetin, gossypetin and anthocyanins all present in the purple petals. Pelargonidin 3-O-beta-D-glucoside and cyanidin-3-O-(6''-malonylglucoside) accumulated to high levels in P_Flo and are essential for the formation of purple petals. Combined with the correlation analysis, CHS and DFR were considered key structural genes of the flavonoid biosynthesis pathway in purple petals [78,79,80]. CHS is the first rate-limiting enzyme of flavonoid biosynthesis and affects downstream secondary metabolites, while DFR is a key enzyme in the anthocyanin synthesis pathway that controls the accumulation of colorless leucoanthocyanidins in plants [19, 81]. These results suggest that the expression of flavonoid biosynthetic genes contributes to anthocyanin accumulation during purple petal formation. Our findings indicate that white petals result from weak accumulation of gossypetin-fer and high levels of kaempferol-7-O-glucoside, kaempferol 3-O-galactoside, and kaempferol acetyl-glucoside. The accumulation of these kaempferol compounds was associated with MYBs and bHLHs co-expressed with 4CL and PAL, which is consistent with the study of P. vulgaris [26]. Yellow petals are regulated mainly by gossypetin derivatives, most of which are negatively regulated by 4CL, FLS, MYB, bHLH, and WRKY.
Surprisingly, the co-expression network of Lmmp003271 (gossypetin 7-glucoside) indicated that Ga05G0055 (4CL1) negatively regulated its accumulation, whereas the same gene positively regulated the synthesis of Rfg0001-der20 (gossypetin-fer). This implies that low expression of Ga05G0055 increases the accumulation of gossypetin 7-glucoside and inhibits the synthesis of gossypetin-fer.


In the present study, we combined transcriptomic and metabolomic analyses to elucidate the mechanism underlying color differences among Asian cotton varieties on the day of flowering. A total of 122 flavonoid metabolites (including 5 anthocyanins) were identified in the three groups of samples. We summarized the important metabolites and genes in the phenylpropanoid-flavonoid synthesis pathway, which are key factors closely associated with petal color. We preliminarily screened for DEGs significantly associated with the phenylpropanoid-flavonoid synthesis pathway and, more importantly, identified transcription factors that positively or negatively regulate anthocyanin content. In addition, 10 DEGs and 29 TFs played vital roles in anthocyanin accumulation. Finally, color-associated co-expression networks were constructed by joint analysis to screen metabolites and candidate genes associated with different petal colors. Through comparative and integrative analyses, we identified the metabolites and major genes that regulate Asian cotton petal color, providing a basic framework for further studies on cotton flower color. However, further studies on these candidate metabolites and genes are needed to clarify the molecular mechanisms.

Materials and methods

Plant material

Three varieties of Asian cotton with different colored petals, Shixiya1 (white), GA0146 (yellow) and GA0149 (purple), were planted in three rows at the experimental base of the Institute of Cotton, Chinese Academy of Agricultural Sciences, Anyang, Henan Province (36°07′ N, 114°50′ E). Five fully open flowers were collected from each row on the same day for phenotypic observation, and their petals were used for total anthocyanin and RNA extraction. All samples were frozen in liquid nitrogen immediately after sampling and stored at −80 °C.

Measurement of total anthocyanin content

Total anthocyanins were extracted following the method of Fu et al. [82] for Camellia sinensis petals, with optimized extraction conditions. Asian cotton petals were ground to powder in liquid nitrogen. About 0.65 g of dry powder was added to 20 ml of 95% ethanol containing 0.1 mol L−1 HCl and heated in a water bath at 60 °C for two hours. The absorbance of the extracts was then measured with a BioTek microplate reader (Gene Company Limited, USA) at 530 nm, 620 nm and 650 nm. The total anthocyanin content was calculated as A = (A530 − A620) − 0.1 × (A650 − A620) and anthocyanin content (mmol/g FW) = A × V × 1,000/(489.72 × m), where V is the volume of the extract and m is the weight of the dried petal powder. The same 95% ethanol solution containing 0.1 mol L−1 HCl was used as the blank control. Three replicates were analyzed for each sample.
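The calculation described above can be sketched in a few lines of Python. This is only an illustration, assuming that the denominator "489.72 m" means 489.72 × m and that the first absorbance term is the A530 reading used in the formula; the function name and the example readings are hypothetical and are not data from this study.

```python
def total_anthocyanin(a530, a620, a650, volume_ml, mass_g):
    """Anthocyanin content from three absorbance readings.

    Implements the two formulas quoted in the text:
        A = (A530 - A620) - 0.1 * (A650 - A620)
        content = A * V * 1000 / (489.72 * m)
    """
    if mass_g <= 0:
        raise ValueError("mass_g must be positive")
    corrected = (a530 - a620) - 0.1 * (a650 - a620)
    return corrected * volume_ml * 1000 / (489.72 * mass_g)


# Hypothetical readings for one replicate (illustrative values only).
content = total_anthocyanin(a530=0.80, a620=0.10, a650=0.05,
                            volume_ml=20, mass_g=0.65)
print(round(content, 2))
```

In practice the function would be applied to each of the three replicates per sample and the results averaged.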

Metabolite identification and quantification

Metabolite extraction, identification and quantification were carried out at Wuhan Metware Biotechnology Co., Ltd following the company's standard procedures [83, 84]. Cryo-preserved samples were weighed and extracted with 1.0 ml of 70% methanol at 4 °C. Extracts were analyzed by liquid chromatography-tandem mass spectrometry (LC–MS/MS; UPLC, Shim-pack UFLC SHIMADZU CBM30A system; MS, Applied Biosystems 6500 QTRAP). All metabolites were identified and quantified using Metware's metabolite database and public metabolite databases. Differentially accumulated metabolites (DAMs) between samples were identified using orthogonal partial least squares discriminant analysis (OPLS-DA). Metabolites with |Log2 Fold change| ≥ 1 and VIP (variable importance in projection) ≥ 1 were defined as DAMs.
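The DAM criterion above is a simple two-threshold filter. A minimal sketch follows, assuming the log2 fold change and VIP score of each metabolite have already been computed by the OPLS-DA step; the record layout and metabolite identifiers are hypothetical.

```python
def is_dam(log2_fold_change, vip, fc_cutoff=1.0, vip_cutoff=1.0):
    """Return True when a metabolite meets both DAM thresholds."""
    return abs(log2_fold_change) >= fc_cutoff and vip >= vip_cutoff


# Hypothetical records for one pairwise comparison (not data from the study).
metabolites = [
    {"id": "met_A", "log2fc": 2.3, "vip": 1.4},   # passes both thresholds
    {"id": "met_B", "log2fc": -1.6, "vip": 1.1},  # strongly decreased, passes
    {"id": "met_C", "log2fc": 0.4, "vip": 1.8},   # fold change too small
    {"id": "met_D", "log2fc": 3.0, "vip": 0.7},   # VIP too low
]

dams = [m["id"] for m in metabolites if is_dam(m["log2fc"], m["vip"])]
print(dams)  # ['met_A', 'met_B']
```

Taking the absolute value of the fold change keeps both increased and decreased metabolites.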

Transcriptome sequencing and analysis of Asian cotton flower

Total RNA was extracted from the samples (Shixiya1, GA0146 and GA0149) with an RNA extraction kit (TIANGEN, Beijing, China). RNA quality and concentration were assessed by agarose gel electrophoresis and with a NanoDrop 2000 spectrophotometer. RNA quality testing, library construction and sequencing for each sample were done at Biomarker Biotechnology. The cDNA libraries were sequenced on an Illumina NovaSeq 6000 platform to generate paired-end reads. Low-quality reads and reads containing adapters or poly-N were removed before downstream analysis, and the resulting high-quality clean reads were used for transcriptome analysis. The clean reads were mapped to the Asian cotton reference genome (CRI-v1.0) [85] using Hisat2 to obtain unigenes. Gene expression values were calculated for all genes as fragments per kilobase of exon model per million mapped fragments (FPKM). Differentially expressed genes (DEGs) were identified with the R package DESeq2; genes with FDR < 0.05 and |Log2 Fold change| ≥ 1 were considered DEGs. The Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases were then used to annotate the DEGs from each pairwise comparison using TBtools software [86]. Taking q-value < 0.05 as the criterion, twenty-five significant pathways were selected. Heat maps were generated using the OmicStudio tools.
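Two of the numerical steps above, the FPKM normalization and the DEG thresholds, can be illustrated with a short Python sketch. It uses the standard definition of FPKM and assumes that the fold change and FDR come from a differential expression tool such as DESeq2, which itself works on raw counts. The function names and example numbers are hypothetical.

```python
def fpkm(fragments, gene_length_bp, total_mapped_fragments):
    """Fragments per kilobase of exon model per million mapped fragments."""
    if gene_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # fragments / (length in kb) / (library size in millions)
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)


def is_deg(log2_fold_change, fdr, fc_cutoff=1.0, fdr_cutoff=0.05):
    """Apply the DEG thresholds stated in the text."""
    return fdr < fdr_cutoff and abs(log2_fold_change) >= fc_cutoff


# Hypothetical gene: 500 fragments, 2 kb of exons, 20 million mapped fragments.
print(fpkm(fragments=500, gene_length_bp=2000, total_mapped_fragments=20_000_000))  # 12.5
print(is_deg(log2_fold_change=-1.8, fdr=0.003))  # True
```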

Considering the important role of phenylpropanoid-flavonoid pathway structural genes and TFs in anthocyanin synthesis, the related structural genes and TFs were identified in all samples. All annotated TFs were retrieved from the transcriptome annotation file in combination with the KEGG enrichment results, and the related structural genes were filtered from the ko00940, ko00941, ko00943 and ko00944 pathways. Pearson correlation coefficients (denoted R2) between key genes in the phenylpropanoid-flavonoid pathway and the total anthocyanin content of each sample were calculated using the R package Hmisc [87]. TFs with |R2| > 0.9 and structural genes with |R2| > 0.95 were retained for subsequent analysis.
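The correlation-based screening can be sketched as follows. The study used the R package Hmisc; the pure-Python version below is a stand-in that only illustrates the idea of computing a Pearson coefficient per gene against total anthocyanin content and keeping genes above a cutoff. Gene names and values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def select_candidates(expression, anthocyanin, cutoff):
    """Keep genes whose |r| with anthocyanin content exceeds the cutoff."""
    selected = {}
    for gene, values in expression.items():
        r = pearson_r(values, anthocyanin)
        if abs(r) > cutoff:
            selected[gene] = r
    return selected


# Hypothetical per-sample values (white, yellow, purple); illustrative only.
anthocyanin = [0.5, 2.0, 9.0]
expression = {
    "gene_a": [3.0, 10.0, 42.0],
    "gene_b": [40.0, 12.0, 2.0],
    "gene_c": [5.0, 9.0, 6.0],
}
for gene, r in select_candidates(expression, anthocyanin, cutoff=0.95).items():
    print(gene, round(r, 3))
```

The sign of the coefficient separates candidate promoters (positive) from candidate repressors (negative). The cutoff would be 0.95 for structural genes and 0.9 for TFs, as stated above.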

Integrative analysis of key metabolites and genes

To analyze the interactions between genes and metabolites related to flower coloration, we constructed a co-expression network between key DAMs and important DEGs. First, we calculated the mean content of each key DAM and the mean expression of each important DEG across the three biological replicates of each sample. Then, the R package Hmisc was used to calculate the correlation coefficients between metabolites and genes. DAM and DEG pairs among the three colored cottons were selected when |R2| > 0.8. Finally, co-expression networks were constructed from these correlation coefficients to reveal the interactions between petal color metabolites and genes, and were visualized using Cytoscape software (version 3.8.2).
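The network construction steps above amount to building an edge table from gene-metabolite correlations. The sketch below is a simplified pure-Python illustration, not the authors' Hmisc workflow: it takes group means, keeps pairs with |r| > 0.8, and labels each edge by sign. All identifiers and values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def build_edges(gene_means, metabolite_means, cutoff=0.8):
    """Return (gene, metabolite, r, sign) for every pair with |r| > cutoff."""
    edges = []
    for gene, g_values in gene_means.items():
        for met, m_values in metabolite_means.items():
            r = pearson_r(g_values, m_values)
            if abs(r) > cutoff:
                sign = "positive" if r > 0 else "negative"
                edges.append((gene, met, round(r, 3), sign))
    return edges


# Hypothetical group means ordered as (white, yellow, purple).
gene_means = {"gene_1": [1.0, 2.0, 3.0]}
metabolite_means = {
    "met_1": [2.0, 4.0, 6.0],  # rises with gene_1
    "met_2": [6.0, 4.0, 2.0],  # falls as gene_1 rises
    "met_3": [1.0, 3.0, 2.0],  # weak relationship, filtered out
}

for edge in build_edges(gene_means, metabolite_means):
    print("\t".join(str(field) for field in edge))
```

A tab-separated edge table of this shape could then be loaded into a network viewer such as Cytoscape, with the sign column mapped to solid or dotted lines.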

Quantitative real-time PCR validation

Total RNA was reverse-transcribed in a 20 μL reaction mixture using the HiScript® II Q RT SuperMix for qPCR (+ gDNA wiper) kit (Vazyme, China). The PCR product was examined by 1.2% agarose gel (Tsingke, China) electrophoresis; cDNA whose band size and brightness met the requirements was considered of good quality and used for qRT-PCR. Each 20 μL reaction contained 10 μL of ChamQ™ SYBR® qPCR Master Mix (High ROX Premixed), 1.0 μL of 10 mM forward and reverse primers, 7.0 μL of ddH2O and 2.0 μL of fivefold-diluted cDNA template. The cotton Histone3 gene was used as the internal reference. qRT-PCR was performed on the ABI Prism 7500 Fast system. The primers used are listed in Additional file 11: Table S6. The relative expression of genes was calculated by the 2^−ΔΔCT method [88].
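The relative expression calculation can be written out as a short Python sketch. It follows the standard 2^−ΔΔCT arithmetic, with the reference gene (Histone3 in this study) used to normalize each sample; the function name and the Ct values are hypothetical.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Relative expression by the 2^-ddCt method.

    dCt  = Ct(target) - Ct(reference gene), computed per sample
    ddCt = dCt(sample of interest) - dCt(control sample)
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    return 2 ** -(delta_sample - delta_control)


# Hypothetical Ct values: the target amplifies 3 cycles earlier in the sample
# of interest than in the control, at equal reference-gene Ct.
print(relative_expression(22.0, 18.0, 25.0, 18.0))  # 8.0
```

A result above 1 indicates higher expression in the sample of interest than in the control, and a result below 1 indicates lower expression.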

Availability of data and materials

The datasets generated and analyzed in the current study are available from the corresponding author on reasonable request. All data generated or analyzed during this study are included in this published article and its supplementary information files. The raw RNA-seq data are freely available at NCBI project PRJNA854799.


4CL: 4-Coumarate-CoA ligase

ANS: Anthocyanidin synthase

C4H: Cinnamate 4-hydroxylase

CHI: Chalcone isomerase

CHS: Chalcone synthase

DAF: Differentially accumulated flavonoid metabolite

DAM: Differentially accumulated metabolite

DEG: Differentially expressed gene

DFR: Dihydroflavonol 4-reductase

F3′5′H: Flavonoid 3′5′-hydroxylase

F3H: Flavonoid 3-hydroxylase

F3′H: Flavonoid 3′-hydroxylase

FLS: Flavonol synthase

FNS: Flavonoid synthase

FPKM: Fragments per kilobase of exon model per million mapped fragments

GO: Gene Ontology

KEGG: Kyoto Encyclopedia of Genes and Genomes

LAR: Leucoanthocyanidin reductase

PAL: Phenylalanine ammonia-lyase

PCA: Principal component analysis

TF: Transcription factor

UFGT: UDP-glucose: flavonoid 3-O-glucosyltransferase


  1. Feng L, Chen Y, Xu M, Yang Y, Yue H, Su Q, Zhou C, Feng G, Ai N, Wang N, Zhou B. Genome-wide introgression and quantitative trait locus mapping reveals the potential of Asian cotton (Gossypium arboreum) in improving upland cotton (Gossypium hirsutum). Front Plant Sci. 2021;12:719371.


  2. Mehetre S, Aher A, Gawande V, Patil V, Mokate A. Induced polyploidy in Gossypium: a tool to overcome interspecific incompatibility of cultivated tetraploid and diploid cottons. Curr Sci. 2003;84(12):1510–2.


  3. Ma XX, Zhou BL, Lü YH, Guo WZ, Zhang TZ. Simple sequence repeat genetic linkage maps of a-genome diploid cotton (Gossypium arboreum). J Exp Bot. 2008;50(4):491–502.


  4. Riazuddin S, Husnain T. Gossypium arboreum GHSP26 enhances drought tolerance in Gossypium hirsutum. Biotechnol Prog. 2010;26(1):21–5.

  5. Erpelding JE, Stetina SR. Genetic characterization of reniform nematode resistance for Gossypium arboreum accession PI 417895. Plant Breed. 2018;137(1):81–8.


  6. Huang Z. Chapter 3 Cotton germplasm resources: Section 1 Cultivation of cotton species: G. arboreum L. In: cotton varieties and their genealogy in China. Edited by Huang Z. China, China agriculture press; 1996.

  7. Neelakantam K, Seshadri TR. Pigments of cotton flowers. Proc Indian Acad Sci. 1937;5(4):357–64.


  8. Tan J, Wang M, Tu L, Nie Y, Lin Y, Zhang X. The flavonoid pathway regulates the petal colors of cotton flower. PLoS ONE. 2013;8(8):e72364.


  9. Parks CR, Ezell WL, Williams DE, Dreyer DL. The application of flavonoid distribution to taxonomic problems in the genus Gossypium. Bulletin of the Torrey Botanical Club. 1975;102(6):350–61.

  10. Liang Z, Jiang R, Zhong W. New red flower germplasm lines of cotton selected from hybrid of Gossypium hirsutum × G. bickii. Sci China C Life Sci. 1997;40(3):284–92.

  11. Tanaka Y, Sasaki N, Ohmiya A. Biosynthesis of plant pigments: anthocyanins, betalains and carotenoids. Plant J. 2008;54(4):733–49.


  12. Shahzad K, Qi T, Guo L, Tang H, Zhang X, Wang H, Qiao X, Zhang M, Zhang B, Feng J. Adaptability and stability comparisons of inbred and hybrid cotton in yield and fiber quality traits. Agronomy. 2019;9(9):516.


  13. Xing C, Guo L, Li W, Wu J, Yang D, Qi T, Ma X, Zhang X. Ten-year achievements and future development of cotton heterosis utilization. J Cotton Sci. 2017;29(S1):28–36.


  14. Abid MA, Wei Y, Meng Z, Wang Y, Ye Y, Wang Y, He H, Zhou Q, Li Y, Wang P, Li X, Yan L, Malik W, Guo S, Chu C, Zhang R, Liang C. Increasing floral visitation and hybrid seed production mediated by beauty mark in Gossypium hirsutum. Plant Biotechnol J. 2022;20(7):1274–84.


  15. Erpelding JE. Genetic characterization of the red colored corolla phenotype for Gossypium arboreum accession PI 529731. Plant Breed. 2021;140(1):142–9.


  16. Zhang Y, Butelli E, Martin C. Engineering anthocyanin biosynthesis in plants. Curr Opin Plant Biol. 2014;19:81–90.


  17. Tanaka Y, Katsumoto Y, Brugliera F, Mason J. Genetic engineering in floriculture. Plant Cell Tiss Organ Cult. 2005;80(1):1–24.


  18. Tanaka Y, Tsuda S, Kusumi T. Metabolic engineering to modify flower color. Plant Cell Physiol. 1998;39(11):1119–26.


  19. Grotewold E. The genetics and biochemistry of floral pigments. Annu Rev Plant Biol. 2006;57:761–80.


  20. Freyre R, Griesbach RJ. Inheritance of flower color in Anagallis monelli L. HortScience. 2004;39(6):1220–3.


  21. Yamamizo C, Hirashima M, Kishimoto S, Ohmiya A. Carotenoid composition in the yellow and pale green petals of Primula species. Bull Natl Inst Flor Sci. 2011;11:67–72.


  22. Harborne JB. 8 - Animal pheromones and defence substances. In: Introduction to ecological biochemistry (Fourth Edition). Edited by Harborne JB. San Diego: Academic Press; 1993. p. 211–42.

  23. Ververidis F, Trantas E, Douglas C, Vollmer G, Kretzschmar G, Panopoulos N. Biotechnology of flavonoids and other phenylpropanoid-derived natural products. Part II: Reconstruction of multienzyme pathways in plants and microbes. Biotechnol J. 2007;2(10):1235–49.


  24. Harborne J. Plant polyphenols-XV: Flavonols as yellow flower pigments. Phytochemistry. 1965;4(5):647–57.


  25. Harborne J. Comparative biochemistry of the flavonoids-VII: Correlations between flavonoid pigmentation and systematics in the family Primulaceae. Phytochemistry. 1968;7(8):1215–30.


  26. Li L, Ye J, Li H, Shi Q. Characterization of metabolites and transcripts involved in flower pigmentation in Primula vulgaris. Front Plant Sci. 2020;11:572517.


  27. Cushman JC, Bohnert HJ. Genomic approaches to plant stress tolerance. Curr Opin Plant Biol. 2000;3(2):117–24.


  28. Wang H, Hu T, Huang J, Lu X, Huang B, Zheng Y. The expression of Millettia pinnata chalcone isomerase in Saccharomyces cerevisiae salt-sensitive mutants enhances salt-tolerance. Int J Mol Sci. 2013;14(5):8775–86.


  29. Nakabayashi R, Yonekura-Sakakibara K, Urano K, Suzuki M, Yamada Y, Nishizawa T, Matsuda F, Kojima M, Sakakibara H, Shinozaki K. Enhancement of oxidative and drought tolerance in Arabidopsis by overaccumulation of antioxidant flavonoids. Plant J. 2014;77(3):367–79.


  30. Santos-Buelga C, Mateus N, De Freitas V. Anthocyanins. Plant pigments and beyond. J Agric Food Chem. 2014;62(29):6879–84.


  31. Fan X, Fan B, Wang Y, Yang W. Anthocyanin accumulation enhanced in Lc-transgenic cotton under light and increased resistance to bollworm. Plant Biotechnol Rep. 2016;10:1–11.


  32. Liang J, He J. Protective role of anthocyanins in plants under low nitrogen stress. Biochem Biophys Res Commun. 2018;498(4):946–53.


  33. Xu Z, Rothstein SJ. ROS-Induced anthocyanin production provides feedback protection by scavenging ROS and maintaining photosynthetic capacity in Arabidopsis. Plant Signal Behav. 2018;13(3):e1451708.


  34. Xue Q, Zhang X, Yang H, Li H, Lv Y, Zhang K, Liu Y, Liu F, Wan Y. Transcriptome and metabolome analysis unveil anthocyanin metabolism in pink and red testa of peanut (Arachis hypogaea L.). Int J Genomics. 2021;2021:5883901.


  35. Zhou Z, Gao H, Ming J, Ding Z, Lin X, Zhan R. Combined transcriptome and metabolome analysis of pitaya fruit unveiled the mechanisms underlying peel and pulp color formation. BMC Genomics. 2020;21(1):734.


  36. Liu Y, Lv J, Liu Z, Wang J, Yang B, Chen W, Ou L, Dai X, Zhang Z, Zou X. Integrative analysis of metabolome and transcriptome reveals the mechanism of color formation in pepper fruit (Capsicum annuum L.). Food Chem. 2020;306:125629.


  37. Leng F, Cao J, Ge Z, Wang Y, Zhao C, Wang S, Li X, Zhang Y, Sun C. Transcriptomic analysis of root restriction effects on phenolic metabolites during grape berry development and ripening. J Agric Food Chem. 2020;68(34):9090–9.


  38. Hichri I, Barrieu F, Bogs J, Kappel C, Delrot S, Lauvergeat V. Recent advances in the transcriptional regulation of the flavonoid biosynthetic pathway. J Exp Bot. 2011;62(8):2465–83.


  39. Xiong S, Tian N, Long J, Chen Y, Qin Y, Feng J, Xiao W, Liu S. Molecular cloning and characterization of a flavanone 3-Hydroxylase gene from Artemisia annua L. Plant Physiol Biochem. 2016;105:29–36.


  40. Cheng H, Li L, Cheng S, Cao F, Feng X, Yuan H, Wu C, Permyakov EA. Molecular cloning and characterization of three genes encoding dihydroflavonol-4-reductase from Ginkgo biloba in anthocyanin biosynthetic pathway. PLoS ONE. 2013;8(8):e72017.


  41. Jian W, Cao H, Yuan S, Liu Y, Lu J, Lu W, Li N, Wang J, Zou J, Tang N. SlMYB75, an MYB-type transcription factor, promotes anthocyanin accumulation and enhances volatile aroma production in tomato fruits. Hortic Res-England. 2019;6(1):1–15.


  42. Lu W, Chen J, Ren X, Yuan J, Han X, Mao L, Ying T, Luo Z. One novel strawberry MADS-box transcription factor FaMADS1a acts as a negative regulator in fruit ripening. Sci Hortic. 2018;227:124–31.


  43. Feng C, Ding D, Feng C, Kang M. The identification of an R2R3-MYB transcription factor involved in regulating anthocyanin biosynthesis in Primulina swinglei flowers. Gene. 2020;752:144788.


  44. Goff SA, Cone KC, Chandler VL. Functional analysis of the transcriptional activator encoded by the maize B gene: evidence for a direct functional interaction between two classes of regulatory proteins. Genes Dev. 1992;6(5):864–75.


  45. Gonzalez A, Zhao M, Leavitt JM, Lloyd AM. Regulation of the anthocyanin biosynthetic pathway by the TTG1/bHLH/Myb transcriptional complex in Arabidopsis seedlings. Plant J. 2008;53(5):814–27.


  46. Cho K, Cho KS, Sohn HB, Ha IJ, Hong SY, Lee H, Kim YM, Nam MH. Network analysis of the metabolome and transcriptome reveals novel regulation of potato pigmentation. J Exp Bot. 2016;67(5):1519–33.


  47. Zhang Q, Wang L, Wang Z, Zhang R, Liu P, Liu M, Liu Z, Zhao Z, Wang L, Chen X, Xu H. The regulation of cell wall lignification and lignin biosynthesis during pigmentation of winter jujube. Hortic Res-England. 2021;8(1):238.


  48. Wang Z, Du H, Zhai R, Song L, Ma F, Xu L. Transcriptome analysis reveals candidate genes related to color fading of “red bartlett” (Pyrus communis L). Front Plant Sci. 2017;8:455.


  49. Yuan L, Niu H, Yun Y, Tian J, Lao F, Liao X, Gao Z, Ren D, Zhou L. Analysis of coloration characteristics of Tunisian soft-seed pomegranate arils based on transcriptome and metabolome. Food Chem. 2022;370:131270.


  50. Li Y, Chen Q, Xie X, Cai Y, Li J, Feng Y, Zhang Y. Integrated metabolomics and transcriptomics analyses reveal the molecular mechanisms underlying the accumulation of anthocyanins and other flavonoids in cowpea pod (Vigna unguiculata L.). J Agric Food Chem. 2020;68(34):9260–75.


  51. Sánchez-Pujante PJ, Borja-Martínez M, Pedreo MA, Almagro L. Biosynthesis and bioactivity of glucosinolates and their production in plant in vitro cultures. Planta. 2017;246(1):19–32.


  52. Winter K. Diversity of CAM plant photosynthesis (crassulacean acid metabolism): a tribute to Barry Osmond. Funct Plant Biol. 2021;48(7):III–IX.


  53. Winter K, Holtum JA. Environment or development? Lifetime net CO2 exchange and control of the expression of Crassulacean acid metabolism in Mesembryanthemum crystallinum. Plant Physiol. 2007;143(1):98–107.


  54. Ying H, Shi J, Zhang S, Pingcuo G, Wang S, Zhao F, Cui Y, Zeng X. Transcriptomic and metabolomic profiling provide novel insights into fruit development and flesh coloration in Prunus mira Koehne, a special wild peach species. BMC Plant Biol. 2019;19(1):463.


  55. Guo J, Han W, Wang MH. Ultraviolet and environmental stresses involved in the induction and regulation of anthocyanin biosynthesis: A review. Afr J Biotechnol. 2009;725(25):4966–72.


  56. Zhang Y, Chen G, Dong T, Pan Y, Zhao Z, Tian S, Hu Z. Anthocyanin accumulation and transcriptional regulation of anthocyanin biosynthesis in purple bok choy (Brassica rapa var chinensis). J Agric Food Chem. 2014;62(51):12366–76.


  57. Qiu W, Su W, Cai Z, Dong L, Li C, Xin M, Fang W, Liu Y, Wang X, Huang Z, Ren H, Wu Z. Combined analysis of transcriptome and metabolome reveals the potential mechanism of coloration and fruit quality in yellow and purple Passiflora edulis Sims. J Agric Food Chem. 2020;68(43):12096–106.


  58. Wang F, Ji G, Xu Z, Feng B, Zhou Q, Fan X, Wang T. Metabolomics and transcriptomics provide insights into anthocyanin biosynthesis in the developing grains of purple wheat (Triticum aestivum L.). J Agric Food Chem. 2021;69(38):11171–84.


  59. Georgiev V, Ananga A, Tsolova V. Recent advances and uses of grape flavonoids as nutraceuticals. Nutrients. 2014;6(1):391–415.


  60. Liang Z, Wu B, Fan P, Yang C, Duan W, Zheng X, Liu C, Li S. Anthocyanin composition and content in grape berry skin in Vitis germplasm. Food Chem. 2008;111(4):837–44.


  61. Schiestl FP, Johnson SD. Pollinator-mediated evolution of floral signals. Trends Ecol Evol. 2013;28(5):307–15.


  62. Piasecka A, Jedrzejczak-Rey N, Bednarek P. Secondary metabolites in plant innate immunity: conserved function of divergent chemicals. New Phytol. 2015;206(3):948–64.


  63. Havsteen BH. The biochemistry and medical significance of the flavonoids. Pharmacol Ther. 2002;96(2–3):67–202.


  64. Agati G, Azzarello E, Pollastri S, Tattini M. Flavonoids as antioxidants in plants: Location and functional significance. Plant Sci. 2012;196:67–76.


  65. Falcone Ferreyra ML, Rius SP, Casati P. Flavonoids: biosynthesis, biological functions, and biotechnological applications. Front Plant Sci. 2012;3:222.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  66. De Rosso VV, Mercadante AZ. The high ascorbic acid content is the main cause of the low stability of anthocyanin extracts from acerola. Food Chem. 2007;103(3):935–43.

    Article  CAS  Google Scholar 

  67. Khoo HE, Azlan A, Tang ST, Lim SM. Anthocyanidins and anthocyanins: colored pigments as food, pharmaceutical ingredients, and the potential health benefits. Food Nutr Res. 2017;61(1):1361779.

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  68. Xue L, Wang J, Zhao J, Zheng Y, Wang H-F, Wu X, Xian C, Lei J-J, Zhong C-F, Zhang Y-T. Study on cyanidin metabolism in petals of pink-flowered strawberry based on transcriptome sequencing and metabolite analysis. BMC Plant Biol. 2019;19(1):1–16.

    Article  Google Scholar 

  69. Maag D, Erb M, Köllner TG, Gershenzon J. Defensive weapons and defense signals in plants: some metabolites serve both roles. BioEssays. 2015;37(2):167–74.

    Article  PubMed  Google Scholar 

  70. Wang F, Dong YX, Tang XZ, Tu TL, Zhao B, Sui N, Fu DL, Zhang XS. Comparative transcriptome analysis revealing the effect of light on anthocyanin biosynthesis in purple grains of wheat. J Integr Plant Biol. 2018;66(13):3465–76.

    CAS  Google Scholar 

  71. Saito K, Yonekura-Sakakibara K, Nakabayashi R, Higashi Y, Yamazaki M, Tohge T, Fernie AR. The flavonoid biosynthetic pathway in Arabidopsis: structural and genetic diversity. Plant Physiol Biochem. 2013;72:21–34.

    Article  CAS  PubMed  Google Scholar 

  72. Li H, Lv Q, Ma C, Qu J, Cai F, Deng J, Huang J, Ran P, Shi T, Chen Q. Metabolite profiling and transcriptome analyses provide insights into the flavonoid biosynthesis in the developing seed of tartary buckwheat (Fagopyrum tataricum). J Agric Food Chem. 2019;67(40):11262–76.

    Article  CAS  PubMed  Google Scholar 

  73. Shoeva OY, Gordeeva EI, Khlestkina EK. The regulation of anthocyanin synthesis in the wheat pericarp. Molecules. 2014;19(12):20266–79.

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  74. Jiang W, Liu T, Nan W, Jeewani DC, Niu Y, Li C, Wang Y, Shi X, Wang C, Wang J, Li Y, Gao X, Wang Z. Two transcription factors TaPpm1 and TaPpb1 co-regulate anthocyanin biosynthesis in purple pericarps of wheat. J Exp Bot. 2018;69(10):2555–67.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  75. Zong Y, Xi X, Li S, Chen W, Zhang B, Liu D, Liu B, Wang D, Zhang H. Allelic variation and transcriptional isoforms of wheat TaMYC1 gene regulating anthocyanin synthesis in pericarp. Front Plant Sci. 2017;8:1645.

    Article  PubMed  PubMed Central  Google Scholar 

  76. Lou Q, Liu Y, Qi Y, Jiao S, Tian F, Jiang L, Wang Y. Transcriptome sequencing and metabolite analysis reveals the role of delphinidin metabolism in flower color in grape hyacinth. J Exp Bot. 2014;65(12):3157–64.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  77. Matus JT. Transcriptomic and metabolomic networks in the grape berry illustrate that it takes more than flavonoids to fight against ultraviolet radiation. Front Plant Sci. 2016;7:1337.

    Article  PubMed  PubMed Central  Google Scholar 

  78. Dixon RA, Steele CL. Flavonoids and isoflavonoids - a gold mine for metabolic engineering. Trends Plant Sci. 1999;4(10):394–400.

    Article  CAS  PubMed  Google Scholar 

  79. Cao K, Ding T, Mao D, Zhu G, Fang W, Chen C, Wang X, Wang L. Transcriptome analysis reveals novel genes involved in anthocyanin biosynthesis in the flesh of peach. Plant Physiol Biochem. 2018;123:94–102.

    Article  CAS  PubMed  Google Scholar 

  80. Ravaglia D, Espley RV, Henry-Kirk RA, Andreotti C, Ziosi V, Hellens RP, Costa G, Allan AC. Transcriptional regulation of flavonoid biosynthesis in nectarine (Prunus persica) by a set of R2R3 MYB transcription factors. BMC Plant Biol. 2013;13:68.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  81. Zhao D, Tao J. Recent advances on the development and regulation of flower color in ornamental plants. Front Plant Sci. 2015;6:261.

    Article  PubMed  PubMed Central  Google Scholar 

  82. Fu M, Yang X, Zheng J, Wang L, Yang X, Tu Y, Ye J, Zhang W, Liao Y, Cheng S, Xu F. Unraveling the regulatory mechanism of color diversity in Camellia japonica petals by integrative transcriptome and metabolome analysis. Front Plant Sci. 2021;12:685136.

    Article  PubMed  PubMed Central  Google Scholar 

  83. Wang Y, Zeng X, Xu Q, Mei X, Yuan H, Jiabu D, Sang Z, Nyima T. Metabolite profiling in two contrasting Tibetan hulless barley cultivars revealed the core salt-responsive metabolome and key salt-tolerance biomarkers. Aob Plants. 2019;11(2):plz021.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  84. Cao H, Ji Y, Li S, Lu L, Tian M, Yang W, Li H. Extensive metabolic profiles of leaves and stems from the medicinal plant Dendrobium officinale Kimura et Migo. Metabolites. 2019;9(10):215.

    Article  CAS  PubMed Central  Google Scholar 

  85. Du X, Huang G, He S, Yang Z, Sun G, Ma X, Li N, Zhang X, Sun J, Liu M. Resequencing of 243 diploid cotton accessions based on an updated A genome identifies the genetic basis of key agronomic traits. Nat Genet. 2018;50(6):796–802.

    Article  CAS  PubMed  Google Scholar 

  86. Chen C, Chen H, Zhang Y, Thomas HR, Frank MH, He Y, Xia R. TBtools: an integrative toolkit developed for interactive analyses of big biological data. Mol Plant. 2020;13(8):1194–202.

    Article  CAS  PubMed  Google Scholar 

  87. R Core Team. R: A language and environment for statistical computing. 2011;1:12–21.

  88. Schmittgen TD, Livak KJ. Analyzing real-time PCR data by the comparative CT method. Nat Protoc. 2008;3(6):1101–8.

    Article  CAS  PubMed  Google Scholar 


We thank all colleagues in our lab for providing experimental materials and technical assistance. The National Mid-term Genebank for Cotton at the Institute of Cotton Research of the Chinese Academy of Agricultural Sciences (ICR, CAAS) provided the cotton seeds. We are grateful for their kind help.


This work was supported by grants from the Foundation for Innovative Research Groups of the National Natural Science Foundation of China (Grant No. 31621005) and the Central Public-interest Scientific Institution Basal Research Fund (1610162022001).

Author information

Authors and Affiliations



A.X., X.D. and S.H. conceived and designed the experiment. A.X. conducted the experiments, analyzed the data, and wrote the manuscript. S.H. and M.N. reviewed and revised the manuscript, and all the other authors revised it. All authors reviewed and approved the final manuscript.

Corresponding authors

Correspondence to Shoupu He or Xiongming Du.

Ethics declarations

Ethics approval and consent to participate

The collection of plant materials used in our study complied with institutional and national guidelines. Field studies were conducted in accordance with local legislation.

Consent for publication

Not applicable.

Competing interests

The authors declare no conflicts of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1: Figure S1.

Classification and statistics of all metabolites obtained.

Additional file 2: Table S1.

The classification and quantification results of all flavonoid metabolites.

Additional file 3: Figure S2.

The heatmap analysis of all flavonoid metabolites by TBtools. A: chalcones; B: sinensetin; C: dihydroflavone; D: dihydroflavonol; E: anthocyanins; F: flavonoid; G: flavonols; H: flavanols; I: isoflavones; J: proanthocyanidins; I: flavonoid; II: tannins.

Additional file 4: Table S2.

All differentially expressed genes in each comparison group.

Additional file 5: Table S3.

Differentially accumulated flavonoid metabolites in each comparison group.

Additional file 6: Table S4.

Overview of the transcriptome sequencing dataset.

Additional file 7: Figure S3.

qRT-PCR validation of gene expression level in the transcriptome.

Additional file 8: Figure S4.

The enrichment results of all differentially expressed genes.

Additional file 9: Figure S5.

The identification and classification results of all TFs.

Additional file 10: Table S5.

The critical color-related metabolites and genes in co-expression networks.

Additional file 11: Table S6.

The primers for the qRT-PCR used in this study.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Xing, A., Wang, X., Nazir, M.F. et al. Transcriptomic and metabolomic profiling of flavonoid biosynthesis provides novel insights into petals coloration in Asian cotton (Gossypium arboreum L.). BMC Plant Biol 22, 416 (2022).
