Ethephon-induced sucrose production occurred only in the low-sugar clone
The analysis of sucrose content data showed highly significant (P < 0.001) genotype and treatment effects but there was no genotype by treatment interaction (Fig. 1). The top-performing high-sugar clone (HS) ROC22 produced almost 60% more sucrose (103 mg/g FW) than the low-sugar clone (LS) GT86–887 (40 mg/g FW) in the control treatment (Fig. 1). External application of ethephon increased sugar content by about 20% in GT86–887 (P < 0.001) as compared to control on day five after treatment, but it failed to elicit any difference in sucrose content in ROC22 and medium-sugar clone (MS) YT71–210 during that period (Fig. 1). The variation for sugar content between treatments did not change on day seven, the last point of measurements, in all the genotypes.
Transcriptome analysis and identification of unigenes
To better understand the molecular changes associated with the natural variation for sucrose accumulation occurring in sugarcane, and how that process is affected by the externally supplied ethephon, we analysed transcriptome of young developing parts of sugarcane stem (developing/maturing phytomers) from ethephon-treated and control plants from all three genotypes using Illumina paired-end HiSeq platform. Samples from two biological repeats for each treatment from each genotype were prepared and sequenced (Additional file 11: Table S1).
The reproducibility of the sequencing data was evaluated by Pearson’s correlation coefficients and it was calculated by log10 (RPKM+ 1). All biological replicates were strongly correlated (R2 ≥ 0.90) with each other (Additional file 1: Figure S1). After quality assessment and data clearance, a total of more than 3 billion (G) reads combined from all samples were retained as high-quality reads and used in subsequent analyses (Additional file 11: Table S1).
The transcriptome assembled from the RNA-seq data was used as reference. The assembly of the clean reads was carried out using Trinity [27] and they were analyzed (Additional file 12: Table S2 and Additional file 13: Tables S3). A total of 337,456 transcripts with an average length of 930 bp and N50 of 1488 were assembled by de novo assembly. From this assembly 163,054 UniGenes with an average length of 731 bp and N50 of 1198 bp were identified (Fig. 2 and Additional file 2: Figure S2).
The unigenes were annotated in databases including NR, NT, KO, SwissProt, PFAM, GO, and KOG. Among them, 86,944 unigenes could be annotated through at least one database used, whereas 8385 of them were annotated in all databases (Additional file 14: Table S4). About 32% of the unigenes were annotated by Gene Ontology (GO) analysis, and were divided into three functional categories: those associated with biological processes, cellular components and molecular functions (Additional file 3: Figure S3A). For the “biological process” group, the most frequently associated terms were growth, development, metabolism, regulation, signailing, localisation and tissue organisation processes. For the “cellular component” group, the dominant classes were related to cell compartments, macromolecules, membranes, organelles and cellular structural parts. Catalitycal activity, binding, transport, macromolecular processes, molecular transducers and transcription factor were the major classes in the “molecular function” grouping. About 10% of the unigenes were annotated by KAAS to obtain the corresponding KO numbers, which were used to annotate them in the KEGG pathway (Additional file 3: Figure S3B). Based on the KO results, the genes were separated into 5 clusters in which the first three bins were involved in signal transduction, translation and carbohydrate metabolism, with 1782, 1886 and 1774 genes, respectively. Using KOG, 12% of the identified unigenes were annotated (Additional file 3: Figure S3C), and they were divided into 26 groups, with general function prediction only, translation and ribosomal biogenesis, post-translational modifications and protein turnover, signal transduction and energy production and conversion being the dominant ones.
We further analyzed the specific distribution of unigenes (67147) annotated with NR database based on BLASTx, which revealed that the transcripts from sugarcane share high sequence similarity with that of Sorghum bicolor (41.5%), Zea mays (20.1%), Setaria italica (8.8%), Oryza sativa (5.3%), and Oncorhynchus mykiss (5.3%) (Additional file 3: Figure S3D).
Genotype had a dominant effect on differential gene expression than ethylene treatment
Among the annotated genes, approximately 70,000 genes were expressed in each of the samples, and 38,692 genes were co-expressed in all of the samples (Additional file 1: Figure S1A). To gain more insights into the molecular changes associated with the intrinsic genotypic and ethylene-induced variation for sucrose accumulation, transcriptomes of three genotypes with and without ethephon treatment were screened for differentially expressed genes (DEGs) by pairwise comparisons.
There were 24,938 genes differentially expressed in all of the samples combined. We performed hierarchical clustering of these DEGs using the Euclidean distance method associated with complete-linkage. The expression patterns of six clusters, subcluster1–subcluster6, were plotted (Additional file 4: Figure S4). In LS clone, 4889 genes were down-regulated and 11,806 genes were up-regulated irrespective of the treatment (Additional file 4: Figure S4 subcluster1 and 2), among which 1154 genes were up-regulated more than five-fold (Additional file 4: Figure S4 subcluster5). In both LS and HS genotypes, 2182 genes were down-regulated and 5177 genes were up-regulated (Additional file 4: Figure S4. Subcluster 3 and 4). One cluster involving 450 genes were up-regulated in all the samples except the MS genotypes under ethylene treatment (Additional file 4: Figure S4 subcluster 6).
Pair-wise comparisons of the three genotypes without ethylene treatment, MS_CK vs. LS_CK, HS_CK vs. MS_CK and HS_CK vs. LS_CK, revealed 12,488, 7003 and 10,859 DEGs, respectively, with 748 DEGs being common to all (Fig. 3a). In the same three genotypes following ethylene treatment, the comparison of transcriptomes of MS_T vs LS_T, HS_T vs MS_T and HS_T vs LS_T revealed as many as 8779, 9992 and 9580 DEGs, respectively (Fig. 3b). Transcriptome comparison of the same genotype with and without ethylene treatment, MS_T vs. MS_CK, LS_T vs. LS_CK and HS_T vs. HS_CK, identified 636, 2918 and 96 DEGs, respectively (Fig. 3c). The number of up-regulated DEGs was larger than that of down-regulated ones in LS_T vs. LS_CK, HS_T vs. MS_T, HS_CK vs. MS_CK and HS_T vs. HS_CK comparisons, but the opposite was true for MS_T vs. MS_CK, MS_T vs. LS_T, MS_CK vs. LS_CK, HS_T vs. LS_T and HS_CK vs. LS_CK (Fig. 3d). Overall the data on gene expressions shows genotype-dependent ethylene response, similar to sugar production. The most dramatic expression changes were found in the comparison between HS_CK and LS_CK, which indicate the occurrence of significant gene level variation for sugar production among different sugarcane genotypes. Further, the heatmap on gene expression profiles (Fig. 3e), revealed that more significant gene expression differences exist between genotypes than between treatments.
Sucrose and starch metabolism genes were expressed more abundantly in high-sugar than in low-sugar genotype, but they were more responsive to ethylene treatment in low-sugar genotype
Classification of DEGs based on GO annotations helped to group them into different functional categories. This was followed by pairwise comparisons of samples between genotypes grown under the same condition, and between samples from the same genotype grown under different conditions. The over-represented GO terms of DEGs were in three GO categories (cellular component, molecular function, and biological processes) (Additional file 5: Figure S5). In the molecular function category, most GO terms such as catalytic activity, transcription factors, transport, binding, etc., were enriched significantly in most pairwise comparisons except for HS_T vs. HS_CK and MS_T vs. MS_CK. Also, the terms of “Kinase activity”, “Phosphotransferase activity” and “Energy metabolism” were enriched significantly in this group (Additional file 5: Figure S5A). In the biological process category, GO terms of” Phosphorylation”, “Protein modification” and “Stress responses “were enriched significantly in most comparisons (Additional file 5: Figure S5B). In the cellular component category, no GO term was enriched significantly in comparisons between different genotypes. However, some terms such as apoplast, cell periphery and nucleoplasm were enriched in samples from ethylene treatment which suggested the effect of ethylene on the cellular components and activities in apoplasm (Additional file 5: Figure S5C).
KEGG analysis was also performed to explore the pathways in which DEGs are involved (Additional file 6: Figure S6). The DEGs were assigned to photosynthesis, plant hormone signal transduction, plant-pathogen interaction, starch and sugar metabolism, stress responses, lipid metabolism, apoptosis and amino-sugar and nucleotide-sugar metabolism. The most co-expression genes found among the different pairwise comparisons were involved in plant hormone signal transduction and plant-pathogen interaction. According to the GO and KEGG analysis, DEGs were further clustered between different genotypes irrespective of the treatment condition.
Evidence published so far indicate the existence of a complex mechanism to control sugar metabolism and accumulation in sugarcane. To further understand the gene regulatory dynamics of sucrose production in sugarcane, DEGs involved in carbohydrate metabolism were selected for pairwise comparisons (Fig. 4a). The results showed that under both water and ethylene treatments, the expression of unigenes involved in starch/sucrose metabolic pathway, such as c91663_g1 (sucrose α-glucosidase, INV), c93760_g1 (sucrose α-glucosidase, INVs), c84505_g3 (sucrose α-glucosidase, INV), c87273_g1 (polygalacturonase) and c95560_g1 (6-phosphofructokinase), and cell wall modifying genes c71803_g1 (pectinesterase) and c90370_g1 (pectinesterase) was higher in HS than that in MS and LS. The transcript of c101377_g1 (starch synthase) in HS and MS was higher than that of LS. Some other unigenes including c83285_g1, c101169_g3 (cellulose synthase), c86041_g2 (1,3-beta-D-glucan synthase), c86791_g4 (phloem sucrose loading), c105310_g2 (1,3-beta-D-glucan synthase), c42182_g1 (1,4-alpha-glucan branching enzyme) were highly expressed in HS and MS. These highly expressed starch/sucrose metabolism related genes suggest the occurrence of high starch/sucrose metabolic strength in HS compared to MS and LS. Also, there were other unigenes highly expressed in MS and LS, which include c103501_g1 (starch synthase), c99737_g1 (sucrose alpha-glucosidase), c83266_g1 (pectinesterase), and c95985_g1 (phosphotransferase) with function mostly in starch and sucrose metabolism and cell wall modification.
We compared the differentially expressed starch and sucrose metabolic pathway (KEGG) in HS and LS. Many genes involved in the starch and sucrose metabolism were changed. The amylase (EC3.2.1.1 and EC3.2.1.2), starch synthase (EC2.4.1.31) and invertase (EC3.2.1.26) were more intensely expressed in HS while the expression of hexokinase (EC2.7.1.1), glucan-branching enzyme (EC2.4.1.18) and glycogen-branching enzyme (TreXYZ) was lower (Additional file 7: Figure S7). Also, the genes involved in Photosystem I, Photosystem II and related protein complexes were differentially expressed in HS and LS (Additional file 8: Figure S8).
Sugar transport is an essential process for sugar accumulation in sugarcane [28,29,30]. We analyzed the expression of sugar transporter genes. Cytosolic and cell wall invertases are supposed to function in sugar transport through symplast and apoplast [31,32,33,34]. From the transcriptome data, the cytosolic acid invertase c93030_g1 (annotated as CINV2) and cell wall invertases c102611_g1 (CWINV3) and c102611_g2 (CWINV3) were all expressed in both HS and LS, and the expression of CWINV3 was significantly high in HS and enhanced by ethylene treatment in LS (Fig. 4b). Meanwhile, the cellular pH and activity of ATPases affect sugar transport through the membrane. Expression of c101111_g2 (ATPase2), c79387_g1 (ATPase), c87456_g3 (AATP1) and c87964_g1 (ACA2) were significantly induced by ethylene treatment (Fig. 4c), which may be part of the mechanism by which ethylene increases sucrose accumulation in sugarcane.
The sucrose biosynthesis in the source organ (leaf) also affects its accumulation in sugarcane stem. Starch phosphorylase (SPase) and sucrose phosphate synthase (SPS) are the two key enzymes in sucrose synthesis. SPS catalyzes the binding of UDPG to F6P to form sucrose phosphate and ethylene strongly stimulated SPS transcript accumulation in banana [35]. The expressions of c72504_g1 (SPS1F), c91176_g1 (SPP2), C72030_g1 (SPS1F) displayed no significant difference in HS and LS, while they were all influenced by ethylene treatment in LS (Fig. 4d, Additional file 7: Figure S7). SuSy catalyses sucrose biosynthesis or degradation depended on the availability of UDP-Glucose [36, 37]. Relatively higher expression of c83314_g1 (SUS3) and c96004_g2 (SUS4) were recorded in LS than in HS and their expression was induced by ethylene treatment (Fig. 4d). We also analyzed the impact of ethylene treatment on expression of photosynthetic genes and found that only two proteins in Photosystem II were affected by ethylene (Additional file 8: Figure S8). Activity of these genes, however, did not display obvious correlation with the sugar content in the test plants.
Ethylene caused differential expression of many transcription factors related to abiotic stress, carbohydrate metabolism, hormone metabolism and signaling, and epigenetic modifiers
To further understand the regulatory network of carbohydrate metabolism as affected by ethylene treatment, expression patterns of transcription factors, regulatory proteins as well as the epigenetic factors were analyzed. Genes with the similar expression pattern in at least two pairwise comparisons were selected for further studies. Expression profiles of the genes annotated as transcription factor are shown in Fig. 5a. The expression of transcription factors (TFs) such as c67755_g1, c95935_g1, and c78352_g3 (sequence-specific DNA binding TFs, sugarcane symporter TFs, Zinc ion binding, etc.) was higher in HS than in MS and LS irrespective of the treatment, and they were only slightly up-regulated by ethylene treatment in MS. Similarly, expression of transcription factors c100278_g1, c85125_g2, c93121_g3, c105226_g1, c95494_g4, c99023_g1 and c101031_g1 was higher in HS than in MS and LS, but they were down-regulated following ethylene treatment in MS. Transcription factors c81582_g1, c93409_g3 and c77130-g1 also showed higher expression in HS and MS than LS whereas an opposite trend was observed for transcription factors c94993_g1, c102624_g1, c103585_g1, c97708_g1, c98618_g1, c86855_g1, c83546_g1, c101324_g3, c104225_g1 (TFs associated with protein catabolism, sequence-specific DNA binding, WRKY DNA binding domain, etc.) irrespective of the growing condition. Transcription factors c101112_g1, c93996_g1, c101169_g3, c105527_g1, c62764_g1 and c83361_g1 (TFs associated with ATP binding, protein phosphorylation, cell cycle transcriptional regulation, ERFs, etc.) expressed low in HS and MS than LS, and they were down-regulated by ethylene.
Furthermore, the genes involved in histone modification, protein phosphorylation and DNA methylation modification with similar expression profiles were shown in Fig. 5b and c. Most of the genes annotated were shown to be involved in histone methylation or RNA-directed DNA methylation pathways. Several methyltransferases (e.g. c94499_g2, c86342_g1, c102490_g1 and c83430_g5) were expressed highly in HS or HS and MS than that in LS. The expression of some of them (e.g. c89405_g1) was further induced by ethylene treatment. In contract some methyltransferases such as c102613_g1, c85289_g1 and c67261_g2 had lower expression in HS than in MS and LS, while c91981_g1, c99770_g1, c85603_g1, c92526_g2, c81202_g1, c76393_g1, c68777_g2, c104953_g2 and c79589_g3 showed higher expression in LS than in HS and MS. Similarly, as shown in Fig. 5c, another DNA modification enzymes N-acetyltransferase and histone acetyltransferases (e.g. c82698_g3, c121832_g1, c86464_g1 and c99980_g1 showed differential expression in all three test clones and in response to ethylene treatment with no clear trend in relation to sucrose accumulation.
The DEGs were further analyzed with MapMan, which classified the DEGs into 23 categories. Among them, the most enriched groups were those associated with RNA, regulation of transcription, ERFs/WRKYs, NAC domain transcription factors and hormone metabolism and signaling (Fig. 6a). Among the differentially expressed transcription factors, unigenes c101031_g1, c97708_g1 and c81582_g1 were annotated (NR database) as heat shock factor (HSF), WRKY and ethylene responsive factor (ERF), respectively. By comparing the expression of ERFs in the HS and LS varieties before or after the ethylene treatment, we found that most ERFs were expressed more abundantly in HS than in LS under normal condition, but the relative increase in their activity in response to ethylene treatment was much pronounced in LS varieties than in HS clone (Fig. 6b).
Validation of candidate gene expression and identification of genetic elements associated with sucrose accumulation
According to the results of gene annotation and the gene expression, the genes involved in starch metabolism and sugar transport (c91663_g1, c95560_g1, c71803_g1, c83266_g1, c93760_g1, c42181_g1), transcription regulation factors (c81582_g1, c93121_g3, c77130_g1, c101169_g1 and c83361_g1), and the epigenetic modification (c99770_g1) were further analyzed by qRT-PCR (Additional file 9: Figure S9). The expression patterns of these genes in different genotypes were similar to the transcriptome results. Transcription regulation factors c81582_g1, c93121_g3, and c77130_g1 showed relatively higher level of activity in HS clone ROC22 compared with that in YT71–210 (MS) and GT86–887 (LS). Only the transcript of ERF protein, c81582_g1 increased in response to ethylene treatment in all the three genotypes suggesting its likely involvement in ethylene-mediated sugar accumulation or related growth processes. Starch metabolism and sugar transport-related genes c91663_g1, c95560_g1, c71803_g1 were expressed remarkably high in HS clones and their activity was greatly inhibited by applied ethylene without any impact on sugar content.
The expression levels of c83266_g1, a gene involved in glucose metabolism, a transcription factor c101169_gl and an epigenetic modifier c99770_g1 with protein-glutamate methylesterase were decreased with an increase in sugar content in 3 genotypes in the control. The transcript level of those genes, however, decreased remarkably in all three genotypes following ethylene treatment except for c101169_gl in the medium sugar clone YT71–210. These genes may be acting as negative regulators of sugar accumulation. The high proportion of c91663_g1, c95560_g1 and c71803_g1 transcripts, annotated to be involved in starch metabolism and sugar transport, in ROC22 than the other two genotypes in the control may also account for the observed genotypic difference in sugar content.