Transcriptome analysis of heat stress and drought stress in pearl millet based on Pacbio full-length transcriptome sequencing

Heat and drought are serious threats for crop growth and development. As the sixth largest cereal crop in the world, pearl millet can not only be used for food and forage but also as a source of bioenergy. Pearl millet is highly tolerant to heat and drought. Given this, it is considered an ideal crop to study plant stress tolerance and can be used to identify heat-resistant genes. In this study, we used Pacbio sequencing data as a reference sequence to analyze the Illumina data of pearl millet that had been subjected to heat and drought stress for 48 h. By summarizing previous studies, we found 26,299 new genes and 63,090 new transcripts, and the number of gene annotations increased by 20.18%. We identified 2792 transcription factors and 1223 transcriptional regulators. There were 318 TFs and 149 TRs differentially expressed under heat stress, and 315 TFs and 128 TRs were differentially expressed under drought stress. We used RNA sequencing to identify 6920 genes and 6484 genes differentially expressed under heat stress and drought stress, respectively. Through Pacbio sequencing, we have identified more new genes and new transcripts. On the other hand, comparing the differentially expressed genes under heat tolerance with the DEGs under drought stress, we found that even in the same pathway, pearl millet responds with a different protein.

reduced by approximately 3.6 billion hectares (25% of the world's highlands).
Pearl millet (Pennisetum glaucum (L.) R. Br.) ranks 6th behind the world's most important cereal crops, such as wheat, rice, maize, sorghum and barley [3], and it is grown on an area of 31 million hectare (ICRISAT 2016) in the world. Pearl millet is an ideal plant for bioethanol production and can be used as a sustainable and alternative energy source, due to its high concentration of easily extractable fermentable sugar.
Pearl millet is highly tolerant to heat because its root branchings and flowering speed increase along with the growth of new leaves at certain high temperature levels [4]. The relative growth rate and net assimilation rate (NAR) significantly increase in pearl millet and slightly decrease in maize subjected to high temperature (38/ 27°C) [5]. Previous reports distinguish pearl millet as a crop that is highly resistant toward drought. Under a certain degree of drought stress, pearl millet has a higher resistance to drought than other species of millet (finger millet, Job's tears, barnyard millet, common millet, and foxtail millet) and maintains an unchanged morphology, particularly with respect to the leaf area and shoot fresh and dry weight [6,7]. Therefore, pearl millet is an ideal material to study the mechanism of heat and drought resistance of cereal crops. Studies on the root activity [4], seed germination rate [8] and seedling growth rate of pearl millet under heat stress [9] have previously been reported. Some researchers have cloned and investigated the HSP (sHSP [10], hsp70 [11] and HSP90 [12]) genes in pearl millet, but the presence of only a few reports do not provide sufficient information on the sequencing of pearl millet under heat stress. In addition, studies on the sequencing of pearl millet under drought stress are still limited [13,14].
The genome of pearl millet was reported in 2017 [15], but short reads cannot be mapped to the genes due to incomplete genome annotations. Single-molecule real-time sequencing technology also known as the next-generation sequencing, such as PacBio sequencing, allows the direct production of full-length transcripts, making it ideal for transcript recovery and isoform detection of wellsequenced and/or incomplete genomic sequences [16,17], but its disadvantage is low throughput [18]. Secondgeneration sequencing using the Illumina platform is an effective way to quantify the gene expression and highquality reads. However, due to the short length of the reads produced by the second generation of sequencing, computational assembly is required [18,19]. To avoid these problems, we combined the two sequencing techniques to study the similarities and differences of the molecular mechanism of the response of pearl millet under heat stress and drought stress. First, we used Illumina sequencing data to correct the raw data of the full-length transcriptome, and second, we used the corrected fulllength transcriptome data as a reference sequence to analyze the short sequencing data.
In this study, we obtained the following results: a. We identified 63,090 new transcripts and 26,299 new genes; b. Compared with previous studies, the annotation rate of pearl millet genes increased by 20.18%; c. The heat shock protein under HSPs still function after 48 h of heat stress; d. The plants can still be regulated by the abscisic acid (ABA) pathway under 48 h drought stress; e. Even with the same mechanism of response, plants tend to select different protein species when they face different stresses. This study demonstrated the changes of pearl millet under 48 h heat stress and drought stress at the molecular level, which provided a new concept to study the tolerance of pearl millet. In addition, the new genes that were discovered provide more resources for breeding projects.

Results
A total of 132,488 non-redundant transcripts and 64,878 genes were identified by full-field transcriptome modeling Acquisition of the full-length transcriptome data was based on the third-generation sequencing platform of PacBio Sequel. By filtering the raw data, we removed the connector and original offline data that was less than 50 bp in length to obtain 6,842,837 Subreads based on 17.89 G ( Table 1). The average length of the Subreads was 2615 bp. A total of 354,139 circular consensus (CCS) were obtained through conditional screening (full passes of 1 and quality of 0.80). Finally, 306,369 full length non-chimeric reads (Flnc) were obtained with complete 5′-primers, 3′-primers and poly-A tails in which an average of the Flnc read length was 2897.
The third-generation sequencing technology, represented by PacBio, has the advantage of long read lengths, but the technique has a high single base error rate. To reduce the high error rate, the Illumina data were used for correction. After correcting the data, 132,588 consensus, 3302 N50, and 2006 N90 were obtained. CD-HIT software removed redundant and similar sequences that resulted in 132,488 non-redundant transcripts and 64, 878 genes.
A total of 40,159 (61.90%) genes were annotated as GO terms. The results of the GO enrichment analysis showed that the genes were primarily enriched in metabolic process, cellular process, single−organism process, cell, cell part, membrane, binding, catalytic activity, transporter activity in biological process (BP), molecular function (MF), and cellular component (CC) (Fig. 1c).
A total of 59,981 (92.45%) genes were annotated in the KEGG database (Fig. 1d). The KEGG pathway analysis revealed that 3204 (4.94%) genes were clustered in the carbohydrate metabolism pathway, 2960 (4.56%) genes in the signal transduction pathway, and 2262 (3.49%) in energy metabolism.
We annotated 39,024 (60.15%) genes in the KOG database. A total of 7332 (11.30%) genes were annotated in general function prediction only, while 5256 (8.10%) were in signal transduction mechanisms and 3997 (6.16%) in post-translational modification, protein turnover and chaperones.
A total of 2792 TFs, 1223 TRs, 694 lncRNA, and 1124 alternative splicing events were identified in pearl millet By predicting 132,488 non-redundant transcripts using iTAK software, 2792 genes were predicted to be TFs, and 1223 genes were predicted to be TRs (Fig. 2a). A total of 272 transcripts were predicted to be MYBrelated TFs. MYB-associated transcription factors are important telomere-binding proteins that have the effect of maintaining the integrity of the chromosomal structure and regulating gene transcription, and 235 transcripts were predicted as other TFs. A total of 196 transcription factors were considered C3H transcription factors, which are involved in abiotic stress responses in plants [20]. There are 190 transcripts that were anchored as C2H2 TFs, primarily related to plant growth and development and the response to environmental stresses [21]. In addition, CNCI, CPC, Pfam-scan and PLEK tools were used to predict long non-coding RNA (LncRNA) (Fig. 2b). Based on known annotations, 4266 transcripts were detected as long non-coding proteins through CNCI. Consistent with the NCBI database, 5389 transcripts were predicted as long non-coding proteins by CPC (e-value 1 e-10 ). Translation and identification of transcripts via Pfam-scan (default parameters of -E 0.001 --domE 0.001) showed that 20,233 transcripts were long non-coding proteins. There were 4963 non-coding transcripts belonging to long protein by screening with PLEK (default parameters of -minlength 200), while 694 transcripts were simultaneously screened by CNCI, CPC, Pfam-scan and PLEK.
A total of 1124 alternative splicing events were identified by SUPPA (Fig. 2c). There were 499 genes belonging  to the Retained Intron Type, 300 genes as Alternative 3′ Splice Sites, and 224 genes were related to Alternative 5′ Splice Sites.

Photosynthetic proteins may be involved in resistance to heat stress
There were a total of 6920 differentially expressed genes (DEGs), of which 3555 and 3365 genes were up-and down-regulated, respectively.
GO enrichment analysis displayed that "transmembrane transport" (114 up-regulation and 252 down-regulation), "transport" (308 up-regulation and 444 down-regulation) and "establishment of localization" (308 up-regulation and 444 down-regulation) were the most enriched terms of the biological process. "photosystem II oxygen evolving complex" (5 up-regulation and 7 down-regulation) and "photosystem II" (5 up-regulation and 10 down-regulation) were the most enriched terms of the cellular component (Additional file 2). "transporter activity" (141 up-regulation and 248 down-regulation) and "monooxygenase activity" (104 up-regulation and 10 down-regulation) were the most enriched terms of the cellular component. We found that "RNA splicing", "anatomical structure morphogenesis", "regulation of translation", "spliceosomal complex", "translation regulator activity", "zinc ion binding" and "nucleic acid binding" were the most enriched GO terms among the differentially expressed up-regulated genes. "single−organism localization", "single−organism transport", "transmembrane transport", "intrinsic component of membrane", "integral component of membrane", "membrane part", "transmembrane transporter activity", "transporter activity" and "diacylglycerol O−acyltransferase activity" were the most enriched GO terms among the differentially expressed down-regulated genes (Fig. 3a).
The KEGG enrichment analysis of 6920 DEGs showed that "Photosynthesis -antenna proteins", "Circadian rhythm-plant" and "Starch and sucrose metabolism" were the most enriched pathways. A KEGG enrichment analysis of 3555 up-regulated DEGs revealed that "Spliceosome", "Valine, leucine and isoleucine degradation" and "Valine, leucine and isoleucine degradation" were the most enriched terms, and a KEGG enrichment analysis of 3365 down-regulated differentially expressed genes indicated that "Photosynthesis -antenna proteins", "Glycerophospholipid metabolism" and "Circadian rhythm -plant" were the most enriched pathways (Fig. 3b).

Glycerophospholipid metabolic pathway may play an important role in the drought tolerance of pearl millet
By comparing the gene expression levels under drought treatment with the control conditions, a total of 6484 DEGs (P < 0.05) were shown to be up-or downregulated between samples collected at 48 h. There were 3041 up-regulated and 3443 down-regulated DEGs.
The results of the KEGG enrichment analysis indicated that the DEGs were primarily enriched in "Photosynthesis -antenna proteins", "Glycerophospholipid metabolism" (Additional file 3) and "Glycolysis/Gluconeogenesis" pathway. The up-regulated DEGs were primarily enriched in "Glycerophospholipid metabolism", "Valine, leucine and isoleucine degradation" and "Arginine and proline metabolism" pathway. The down-regulated DEGs were primarily enriched in "Carbon fixation in photosynthetic organisms", "Photosynthesis -antenna proteins" and "Porphyrin and chlorophyll metabolism" pathways ( Fig. 5b).
A total of 1881 genes were involved in both heat stress and drought stress responses A total of 1881 DEGs were simultaneously found in both heat stress and drought stress (Fig. 5a). GO enrichment analysis results show that the most enriched GO terms were "metabolic process", "cellular process", "membrane", "membrane part", "catalytic activity", and "binding" (Fig. 5a). The KEGG enrichment analysis showed that "Photosynthesis -antenna proteins", "Alanine, aspartate and glutamate metabolism", "Brassinosteroid biosynthesis", and "Terpenoid backbone biosynthesis" were the most enriched pathways (Fig. 5a). Among the 1881 DEGs, 500 genes were up-regulated under both heat stress and drought stress (Fig. 5b). Through GO enrichment analysis, it was found that the 500 DEGs were primarily enriched in "cellular process", "metabolic process", "membrane", "cell", "binding", and "catalytic activity" GO terms (Fig. 5b). A KEGG enrichment analysis found that the main enrichments were "Estrogen signaling pathway", "Antigen processing and presentation", and "NOD-like receptor signaling pathway" pathway (Fig. 5b). A total of 1077 DEGs were simultaneously down-regulated under heat stress and drought stress (Fig. 5c). GO enrichment analysis showed that the main enrichment was in "metabolic process", "cellular process", "membrane", "membrane part", "catalytic activity", and "binding" GO terms (Fig. 5c). A KEGG enrichment analysis found that the main enrichment was in "photosynthesis -antenna protein", "alanine, aspartic acid and glutamate metabolism", and "nitrogen metabolism" (Fid 5c). A total of 122 DEGs were up-regulated under heat stress but down-regulated under drought stress. GO enrichment analysis of the 122 DEGs showed that "metabolic process", "cellular process", "membrane", "membrane part", "binding", "catalytic activity" were the most enriched GO terms (Fig. 6a). The results of the KEGG enrichment analysis indicated that 122 differentially expressed genes were primarily enriched in the "Thiamine metabolism", "Cyanoamino acid metabolism", "Starch and sucrose metabolism", and "Phenylpropanoid biosynthesis" pathways (Fig. 6b). A total of 182 DEGs were down-regulated under heat stress but were upregulated under drought stress. GO enrichment analysis of the 182 DEGs revealed that the main enrichment was at the "single-organism process", "metabolic process", "membrane", "membrane part", "catalytic activity", and "binding" GO terms (Fig. 6c). The results of a KEGG enrichment analysis indicate that the main enrichment is in the "Glycerophospholipid metabolism", "Monoterpenoid biosynthesis", and "Phosphonate and phosphinate metabolism" pathways (Fig. 6d).

Discussion
A total of 63,090 new transcripts, 26,299 new genes were identified, and annotations increased by 20.18% Pearl millet is a versatile grain that can be used as grain, hay and green fodder. It is distributed in arid and semiarid regions, such as India and sub-Saharan Africa [22]. In recent years, there have been many reports on the sequencing of corn [23,24], rice [25] and wheat [26], but only a few that examined pearl millet. Some researchers constructed a genetic linkage map of 640 cM using GBS sequencing (genotyping-by-sequencing) to identify the genes associated with Striga and other agronomic traits [27]. In April 2016, researchers performed an RNA-Seq analysis of pearl millet planted in two climate gradients to explore their adaptability to climate [28]. In May 2016, researchers constructed the highest density genetic linkage map using GBS sequencing. The total average density of SNP markers in the map was 23.23 SNP/cM, and the total length was 716.7 cM [29]. In June 2016,  [31]. To the best of our knowledge, there have been no reports of the full-length transcriptome of the pearl millet in previous studies. In this study, we performed singlemolecule long-reading sequencing of total RNA from 12 samples at four stages (three-leaf stage, five-leaf stage, flowering stage and heading stage) of pearl millet. In this study, we identified 132,488 transcripts (average length 3102 bp, N50 3302 bp) containing 63,090 more transcripts than the previously reported 69,998 (average length 725 bp, N501014 bp) transcripts in 2017 in same crop (Table 1). A total of 64,878 genes were identified, of which 26,299 were newly discovered. In this study, 27, 190 (41.91%) genes were simultaneously annotated in seven databases (Nt, Nr, Pfam, KOG, Swiss-prot, KEGG, and GO), and 62,436 (96.24%) genes were annotated in at least one database. We annotated 59,981 genes by KEGG, which is 38,526 more genes (21,455 genes) than had been previously reported. We annotated 51,764 genes with SwissProt, 30,238 more genes than the 21, 455 that had been previously identified. This information provides more resources for the identification and uncovering of important genes in pearl millet. We found that 1507 DEGs were up-regulated or down-regulated under heat stress and drought stress. We suggest that researchers study these genes in more detail, which are not only related to heat resistance but also to drought resistance.

Expression of classical heat-related genes in pearl millet
Previously published literature revealed that pearl millet is highly resistant to heat stress, but no information regarding RNA sequencing exists. We performed RNA sequencing on heat-stressed pearl millet and discovered DEGs associated with reactive oxygen species (ROS) produced by exposure to heat stress. As a starting signal of plants, ROS prompts local tissues to respond to local abiotic stress stimulation, but the expression of the ROS genes are temporary [32,33]. We identified nine DEGs associated with ROS production, two DEGs encoding amine oxidases (AOC), and the remaining seven DEGs encoding polyamine oxidases (PAO). With the exception of two genes that were up-regulated, the others were down-regulated (Additional file 4). These results indicated that a variety of genes involved in ROS production were inhibited after 48 h of heat stress. This is consistent with previous studies of the expression of genes involved in ROS production that have not been continuously expressed under stress [14]. ROS scavenging enzymes play an important role in protecting plants from temperature stress and are essential for the detoxification of ROS produced under stress conditions [34]. We found that 10 DEGs encoded ROS scavenging enzymes, five of which were up-regulated (Additional file 4), while the other five genes were down-regulated. These data suggest that when ROS is reduced, ROS clearance enzymes will decrease, and this effect is not sustained. The accumulation of heat shock proteins (HSP) under the control of heat stress transcription factors (HSFS) plays an important role in the heat stress response (HSR) and subsequently produces heat tolerance in plants and other organisms [35,36]. In this study, we found that 30 DEGs were associated with HSPs; three DEGs encoded HSP100; 14 DEGs encoded HSP90; 12 DEGs encoded HSP70, and one DEG encoded sHSP, and all were upregulated DEGs (Additional file 4). We hypothesize that the heat shock proteins still function even if the plants are subjected to sustained heat stress.

Expression of typical drought-related genes in pearl millet
Under drought stress, the ABA content of plants increases, which regulates the opening and closing of pores, reduces water loss and maintains the moisture Fig. 7 Analysis of DEGs specific to heat stress or drought stress. a GO analysis of DEGs specific to heat stress. b KEGG analysis of DEGs specific to heat stress. c GO analysis of DEGs specific to drought stress. d KEGG analysis of DEGs specific to drought stress content of plants [37,38]. Under drought stress, we found five DEGs encoding 9-cis-epoxycarotenoid dioxygenases (NCED), 1 DEG encoding zeaxanthin epoxidase (ABA1), and two DEGs encoding shortchain dehydrogenases/reductases (ABA2), four of which were up-regulated DEGs, and the other four were down-regulated DEGs (Additional file 5). We also discovered a DEG encoding PYL8, which is a receptor for ABA. The gene was down-regulated under drought stress (Additional file 5). This indicates that under 48 h of drought stress, the ABA pathway still functions in the drought stress response. Previous studies have found that slow anion channel correlation (SLAC) regulates the opening and closing of stomata, which is necessary for stomatal closure [39]. After 48 h of drought stress, a SLAC-encoding DEG was identified, which indicates that even after prolonged drought stress, pearl millet reduces water evaporation by regulating stomatal closure. (Additional file 5). The Asr gene family is classified as a new group of late embryogenic abundant (LEA) genes and is involved in adaptation to drought stress [40]. Under stress condition, the expression of the Asr gene was up-regulated. We found two Asr genes with different levels of expression, which were up-regulated under drought stress and are consistent with previous reports. (Additional file 5).
In the same molecular mechanism, the pearl millet selects different proteins to respond to different stresses The results showed that 5039 DEGs and 4603 DEGs were regulated under heat stress and drought stress, respectively, indicating that plants have different molecular mechanisms for heat and drought stress [41]. In addition, even if the same mechanism exists under both types of stresses, the types of proteins involved, the number of proteins, and the choice of related genes are different, such as ROS clearance mechanisms. Under drought stress, we found 14 DEGs associated with ROS scavenging enzymes, two DEGs encoding SOD, 11 DEGs encoding APX, and one DEG encoding GPX, six of which were up-regulated DEGs, while the other eight DEGs were down-regulated. The gene encoding CAT was significantly expressed 48 h after heat stress, but the expression of the gene was not significant under drought stress. The number of DEGs encoding APX proteins was higher under drought stress than heat stress. The number and type of ROS scavenging enzymes expressed under drought stress were different than those expressed from heat stress, indicating that ROS scavenging enzymes are specific to the type of stress (Table 2). A total of 1881 DEGs were collectively expressed under heat stress and drought stress, indicating that these 1881 DEGs play important roles under both types of stresses.
Among the 1881 DEGs, 1577 DEGs with the same pattern of expression (both up-regulated or down-regulated under heat stress and drought stress) are essential to combat heat stress and drought stress, such as HSPs. HSPs are essential molecular chaperones in eukaryotic cells that play an important role in the folding and activation of proteins involved in signal transduction and regulation of the cell cycle. Under drought stress, we found that 26 DEGs encode HSP, of which 11 DEGs encode HSP70, 14 encode HSP90, and one encodes sHSP, all of which up-regulate DEGs (Table 3). Among them, 12 DEGs encoding HSP existed in both heat stress and drought stress. There were 304 DEGs with different expression patterns under heat stress and drought stress, indicating that different stresses have different levels of regulation of the same gene.

Conclusions
We combined two sequencing technologies to study the similarities and differences of the molecular mechanism of millet under heat and drought stress. In this study, 63,090 new transcripts and 26,299 new genes were identified, and the functional annotation of genes was improved by 20.18%. Under heat and drought stress, 6920 and 6484 genes were differentially expressed, respectively, and 1881 differentially expressed genes were present in both types of stresses. This information provides additional resources to identify and unearth important genes in pearl millet. In addition, we found that under the same mechanism of response, plants have different protein choices when faced with different types of stresses. This lays the foundation to study the heat and drought resistance mechanism of pearl millet.

Plant materials, cultivation and treatment
The pearl millet variety Tifleaf 3 used in this study was provided by Sichuan Agricultural University and is preserved in the Department of Grassland Science, College of Animal Science and Technology, Sichuan Agricultural University, Ya'an, Sichuan Province, China. Pearl millet seeds were grown in pots (10*15 cm) containing quartz sand and placed in a growth chamber. The pots were exposed to 14 h of light at 26°C and 10 h of darkness at 22°C for 13 days with 50% Hoagland's nutrient solution (1 mM MgSO 4 , 1 mM KH 2 PO 4 , 1 mM NH 4 NO 3 , 0.5 mM CaCl 2 , 0.1 mM FeNa-EDTA, 25 mM NaCl, 0.1 mM H 3 BO 3 , 0.1 mM Na 2 SiO 3 , 1.5 μM CuSO 4 , 50 μM KCl, 10 μMMnSO 4 , 0.075 μMNa 2 MoO 4 and 2 μM ZnSO 4 ). Thirteen-day-old plants were divided into three groups (control, heat treatment group and drought treatment group). The culture conditions for the control group were 14 h of light at 26°C and 10 h of darkness at 22°C with 50% of Hoagland's nutrient solution. The plants in the heat treatment group were exposed to 14 h of light at 40°C and 10 h of darkness at 35°C with 50% of Hoagland's nutrient solution. The plants in the drought treatment group were exposed to 14 h of light at 26°C and 10 h of darkness at 22°C with 20% PEG (polyethylene glycol 6000) dissolved in 50% of Hoagland's nutrient solution. All the treatment time is 48 h (2 days).

RNA preparation for Iso-Seq
Leaf and root samples of the treated plants were collected separately at the three-leaf and five-leaf stages. When the pearl millet was in the heading stage, the spikelet, leaves, stems and roots were collected. While at the flowering stage, the spikelet, leaves, stems and roots were collected, and these samples were immediately stored at − 80°C. The RNA was extracted using the RNeasy Plant Mini Kit [42], and the RNA quality was analysed using RNA gel electrophoresis. Equal amounts of RNA from 12 samples (1 μg per sample) were pooled together to form total RNA, and then the SMRT library was prepared using 3 μg total RNA.

PacBio Iso-Seq library preparation and sequencing
The Iso-Seq library was prepared using the Isoform Sequencing protocol (Iso-Seq) with the Clontech SMAR-Ter PCR cDNA Synthesis Kit and the BluePippin Size Selection System protocol as described by Pacific Biosciences (P/N100-377-100-05 and P/N100-377-100-04) with some modifications. First, 3 μl of total RNA solution was added to deionized water containing a single primer and incubated at 72°C for 3 min (3′ SMART primer IIA from the Clontech SMARTer kit 5′-AAG-CAGTGGTATCAACGCAGAGTACTNN-3′). Next, the SMARTER II A oligonucleotide (from the Clontech SMARTer kit 5′-AAGCAGTGGTATCAACGCAGAG TACXXXXX-3′), 5X first-strand buffer, DTT, dNTP mix, RNase inhibitor and SMARTScribe reverse transcriptase were added to the mixture and incubated at 72°C for 1 h. Finally, the reaction was terminated at 70°C. After 23 PCR cycles, the length of the PCR product was screened by the BluePippin Size Selection System and divided into 1-2 Kb, 2-3 Kb and 3-10 Kb fragments. After size screening, the cDNA was subjected to 12 cycles of PCR reactions. The PCR amplification  Since the frequency of nucleotide indels and mismatches in the Iso-Seq reads were much higher than those in the shorter high-throughput sequencing, LoR-DEC software was used to correct additional nucleotide errors in consensus reads based on Illumina RNA-Seq data. The redundancy in the corrected readings was obtained using software CD-HIT (−c 0.95 -T 6 -G 0 -aL 0.00 -aS 0.99) to obtain the final transcript for subsequent analysis.

CDS prediction
Protein coding sequences from cDNA were identified by the ANGEL pipeline (a long-read implementation of ANGLE). We used closely related species to ensure that the protein sequences were ANGEL-trained and then performed an ANGEL prediction for a given sequence.

TF analysis and Lnc-RNA analysis
Transcription factors were predicted by iTAK software. We predicted lncRNA using four software: a) The CNCI (Coding-Non-Coding-Index, parameters as default parameters) was an effective software to distinguish the protein-encoding and non-coding sequences by profiles adjoining nucleotide triplets without relying on known annotations. b) The CPC (Coding Potential Calculator) was used to assess the ORF extent and quality of transcripts and search the sequenced base eukaryote protein database from NCBI (e value "1 e-10 ") to identify coding transcripts and non-coding transcripts. c) All the transcripts were translated, and then Pfam Scan (−E 0.001 --domE 0.001) was used to determine if they have a domain of a known protein family. d) Predicting Lnc-RNA with PLEK SVM classifier (−minlength 200). The PLEK SVM classifier identifies transcripts of encoded proteins by optimizing the K-mer approach, which eliminates the need for reference genomes and annotations. All of the four software programs described above identified noncoding transcripts, which were identified as Lnc-RNA.

RNA preparation for RNA-Seq
Three treatments were conducted simultaneously. After 48 h of treatment, we randomly selected the leaves of 16 seedlings and collected them in a cryogenic vials and immediately stored at − 80°C (Additional file 1). Three biological replicates were set for each treatment. The RNA was extracted from the samples using the RNeasy Plant Mini Kit, and the RNA quality was checked using RNA gel electrophoresis.

RNA-Seq library preparation and sequencing
The purity of RNA was detected using a NanoDrop spectrophotometer (California, USA), and the concentration of RNA was determined by a Qubit RNA assay kit in a Qubit 2.0 fluorometer system (California, USA). The library was constructed using the NEBNext® UltraTM Directional RNA Library Prep Kit for Illumina® (California, USA). The NEBNext®Poly (A) mRNA Magnetic Isolation Module was used to enrich the mRNA, and Fragmentation Buffer was added to break the mRNA into short segments. A strand of cDNA was synthesized with random hexamer primers. The second strand of cDNA was synthesized by adding buffer, dNTPs and DNA polymerase I. The double strand cDNA was purified by AMPure XP beads. The purified cDNA was repaired at the end; a tail was added and sequenced, and the fragment size was selected by AMPure XP beads. Finally, the final cDNA library was obtained by PCR enrichment. Qubit 2.0 was used for preliminary quantification; Agilent 2100 was used to detect the inserted fragments of the library, and finally Illumina