Full-length transcriptome sequencing reveals a low-temperature-tolerance mechanism in Medicago falcata roots

Low temperature is one of the main environmental factors that limits crop growth, development and production. Medicago falcata is an economically and ecologically important legume that is closely related to alfalfa and exhibits better tolerance to low temperature than alfalfa. Understanding the low-temperature-tolerance mechanism of M. falcata is important for the genetic improvement of alfalfa. In this study, we explored the transcriptomic changes in low-temperature-treated M. falcata roots by combining SMRT and NGS technologies. A total of 115,153 nonredundant sequences were obtained, and 8,849 AS events, 73,149 SSRs and 4,189 LncRNAs were predicted. A total of 111,587 genes from SMRT were annotated, and 11,369 DEGs were identified in this paper that are involved in plant hormone signal transduction, protein processing in endoplasmic reticulum, carbon metabolism, glycolysis/gluconeogenesis, starch and sucrose metabolism, and endocytosis pathways. We characterized 1,538 TF genes into 45 TF gene families, and the most abundant TF family was WRKY, followed by ERF, MYB, bHLH and NAC. A total of 134 genes were differentially coexpressed at all five temperature points, including 101 upregulated genes and 33 downregulated genes. PB40804, PB75011, PB110405 and PB108808 were found to play crucial roles in the tolerance of M. falcata to low temperature. The WGCNA results showed that the MEbrown module was significantly correlated with low-temperature stress in M. falcata. Electrolyte leakage was correlated with most genetic modules and corroborated that electrolyte leakage can be used as direct stress markers to reflect cell membrane damage from low-temperature stress in physiological assays. The consistency between the qRT-PCR results and RNA-Seq analyses confirm the validity of the RNA-Seq data and the analysis of the

regulation of low-temperature stress in the transcriptome.

Conclusions
The full-length transcripts generated in this study provided a full characterization of the gene transcription of M. falcata and are useful for mining new low-temperature stressrelated genes specific to M. falcata. These new findings facilitate the understanding of low-temperature-tolerance mechanisms in M. falcata.

Background
Low temperature is one of the main environmental factors that limits plant growth, development and geographical distribution [1]. Low-temperature stress, consisting of chilling stress (<10 °C) and freezing stress (<0 °C), can reduce crop productivity to some extent [2]. The effects of low temperature on plants depend on developmental stage and exposure time. Under low-temperature stress, young tissues and organs are more seriously damaged than old tissues and organs, and the reproductive stage is more sensitive to low temperature than the vegetative stage [3]. Exposure to low temperature causes physiological and molecular changes in plants, such as the inhibition of photosynthetic activity, reduced water uptake, oxidative stress via an increase in reactive oxygen species (ROS) accumulation, an increase in intracellular pH and osmotic pressure, and functional abnormalities in chloroplasts, mitochondria and other organelles, which lead to metabolic disorders in plants [4][5][6]. Furthermore, low temperature temporarily inhibits sucrose synthesis, and the rearrangement of the membrane causes changes in the stability and mobility of proteins and a shift in redox homeostasis, which decreases enzyme activities and alters metabolism homeostasis and gene transcription [7,8] Plants have developed many mechanisms and pathways that enable them to minimize the negative effect of low temperature and to grow and reproduce successfully [9]. The osmolytes induced by low-temperature stress, including proline, soluble sugar, and cold-induced stress proteins (dehydrins and LEA proteins), can improve the cell osmotic potential, protect the stability of biological membranes, alleviate oxidative damage limitation, and even act as signals to regulate the expression of stress-related genes [10,11]. The overexpression of SlNAM1, a typical NAC gene, improves low-temperature tolerance in transgenic tobacco by improving osmolytes and reducing the H 2 O 2 and superoxide anion radical (O 2 .-) contents under low temperature, which contribute to alleviate the oxidative damage of the cell membrane after low-temperature stress [12].
Low temperature induces the production of Ca 2+ , which can be sensed by corresponding receptors, among which lipid Ca 2+ channels may be the primary cryogenic signal receptors, and can then activate calcium response protein kinase (CPKs CIPKs, and CRLK1) and MAPK cascade responses, which regulate cold-responsive (COR) gene expression [13,14]. The overexpression of COLD1 (jap) significantly enhances chilling tolerance by interacting with the G-protein alpha subunit to activate the Ca 2+ channel to sense low temperature and to accelerate G-protein GTPase activity [15]. Low-temperature stress rapidly induces the expression of many transcription factors (TFs), including the AP2 domain proteins CBFs, which then activate the expression of numerous downstream COR genes [16][17][18]. The expression of the CBF gene is controlled by upstream TFs, such as the bHLH TF ICE1. ICE1 is subjected to sumoylation and polyubiquitylation and subsequent proteasomal degradation, mediated by the SUMO E3 ligase SIZ1 and ubiquitin E3 ligase HOS1, respectively [13,16].
Single-molecule long-read sequencing (SMRT), which was developed by PacBio Biosciences RSII, provides a third-generation sequencing platform. Because of its long reads, it is widely used in genome sequencing and has eliminated many restrictions in sequencing by generating full-length or long sequences [19][20][21][22]. Next-generation sequencing (NGS)

Physiological responses of M. falcata under low-temperature stress
We examined the electrolyte leakage; malondialdehyde (MDA) content; superoxide dismutase (SOD), catalase (CAT) and peroxidase (POD) activities; and superoxide anion radical (O 2 .-), soluble protein, reduced glutathione (GSH), proline and soluble sugar contents to investigate the physiological changes in M. falcata roots exposed to lowtemperature stress for 2 h (Fig. 1). Under low-temperature stress, the electrolyte leakage increased gradually with decreasing temperature (Fig. 1A). The highest electrolyte leakage was observed at -15 °C, which was 4.18 times higher than that in the control environment. The MDA content decreased gradually and peaked at -10 °C (relative to the control) and slightly decreased at -15 °C (Fig. 1B). No obvious difference was observed in SOD activity, and this value increased after low-temperature treatment and decreased from 4 °C to -15 °C (Fig. 1C). The changes in CAT and POD activities were completely contrasting under different temperatures. CAT activity first increased, and POD activity first decreased (Fig. 1D -E). The content of O 2 .increased significantly after lowtemperature treatment and reached a peak at -10 °C (Fig. 1F). The content of soluble protein decreased first after low-temperature treatment and increased after 4 °C and then decreased significantly after 0 °C (Fig. 1G). The GSH content increased significantly after low-temperature treatment and reached a peak at -5 °C (Fig. 1H). The proline content decreased slightly in the low-temperature environment (Fig. 1I). The highest soluble sugar content was measured at -10 °C (Fig. 1J).  Figure S1).  Figure S2).

Annotation and expression of transcripts under low-temperature stress
To evaluate gene expression levels in response to low-temperature stress, we mapped all clean data back onto the assembled transcriptome, and the readcount for each gene was obtained from the mapping results by using RNA-Seq by Expectation Maximization (RSEM) software (Additional file 7: Table S5) [46]. The mapped readcount for each gene was then converted into the expected number of fragments per kilobase of transcript per million mapped reads (FPKM) (Additional file 8: Table S6). FPKM values can eliminate the impact of transcript length and sequencing differences on computational expression. The boxplot diagram of FPKM values indicated that gene expression levels were not evenly distributed in the different experimental environments ( Fig. 2A). The Pearson correlation coefficient was used to evaluate the correlations of each biological sample, with r 2 values close to 1 indicating a strong correlation between two replicate samples (Fig. 2B). Then, all sequences were used for further DEG analysis after excluding the abnormal samples.

Analysis of DEGs in response to low-temperature stress
In total, 11,369 DEGs displaying up-or downregulation between samples (fold change≥2 and false discovery rate (FDR)<0.01) collected at any pair of temperature points were identified by comparing gene expression levels under low-temperature stress (Additional file 9: Table S7). Clustering patterns of DEGs under low-temperature stress were determined by hierarchical cluster analysis of all DEGs (Fig. 3A). All DEGs with the same or similar expression levels were clustered, and a set of genes was quickly activated during the early stage of low-temperature stress (4 °C), and other genes were activated under a freezing temperature (-10 °C). The 11,369 DEGs identified were grouped into six subclusters by K-means coexpression cluster analysis (Fig. 3B). The expression level of genes in subcluster 1 (1,271 genes) began to increase after the temperatures dropped to -5 °C and reached a maximum at -10 °C; then, the expression levels decreased rapidly.

Identification of putative TFs
TFs play an important role in cell function and development and directly regulate gene expression through interactions with themselves and other proteins to participate in plant stress regulations, including low-temperature-related processes. In this study, 1,538 TF genes were differentially expressed between different temperature points and were classified into 45 TF gene families according to PlantTFDB (http://planttfdb.cbi.pku.edu.cn/) (Additional file 10: Table S8). The most abundant TF family was the WRKY (186 genes) family, followed by the ERF (165 genes), MYB (143 genes), bHLH (131 genes) and NAC (111 genes) families.

Comparison of coexpressed genes between the low-temperature-treated and control samples
Under low-temperature stress, a total of 8,683 DEGs were identified by a comparison of each temperature point to the control environment. Interestingly, the number of upregulated DEGs (5,876 genes) was much higher than the number of downregulated DEGs (2,807 genes), and 134 genes were differentially coexpressed at all five temperature points (Additional file 11: Table S9). As shown in Fig. 4A, there were 101 upregulated genes and 33 downregulated genes in all five comparisons. These 134 coexpressed genes were assigned into the GO categories of biological process, cellular component and molecular function (Fig. 4B). In the biological process category, "metabolic process", "cellular process", "single-organism process" and "biological regulation" were the most enriched terms. In the cellular component category, "cell", "cellular component" and "organelle" were the most enriched. In the molecular function category, "binding" was the most enriched term, followed by "catalytic activity". COG functional classification of the 134 coexpressed genes showed that most of the genes were enriched in "General function prediction only", "Transcription", "Replication, recombination and repair", "Signal transduction mechanisms" and "Carbohydrate transport and metabolism" (Fig. 4C). KEGG enrichment showed that most of the coexpressed genes were enriched in the "Circadian rhythm -plant", "Cysteine and methionine metabolism", "Arginine and proline metabolism", "Phenylpropanoid biosynthesis", "Galactose metabolism", "Plant-pathogen interaction" and "Biosynthesis of amino acids" pathways ( Fig. 4D).

low-temperature stress
WGCNA was performed to obtain a better understanding of which genes within these complex signaling networks were the most connected hubs. The number of genes in the module was clustered according to the expression levels, and genes with a high clustering degree were allocated to the same models. The 8,683 DEGs identified by comparing lowtemperature-treated and control samples were clustered based on topological overlap, and then the gene modules were obtained from the dynamic tree cut. Finally, 12 gene modules were identified after merging modules with similar expression patterns (Fig. 5A). The magenta modules contained the most genes, 1,092 genes, and the violet nodules contained the fewest genes, 70 genes (Additional file 14: Table S10). The gray module was not a true module but a place to categorize the remaining genes that were not well correlated with any one of the significant colored modules. The kME (module eigengenebased connectivity) was calculated for each gene to every module, and 449 genes were found to act as a hub in more than one module.
All 12 genetic modules with a module characteristic value P < 0.05 were used to find the modules that were highly correlated with samples and physiological indicators (Fig. 5B). MEbrown module genes were involved in 23 major categories, including "General function prediction only", "Signal transduction mechanisms", "Transcription" and "Replication, recombination and repair" (Additional file 13: Figure S3A). GO analysis results showed that the MEbrown module genes were mainly involved in 13 biological processes, such as "protein phosphorylation", "regulation of transcription, DNA-templated", and "oxidationreduction process", were distributed to eight cellular component terms, such as "integral component of membrane", "plasmodesma", "chloroplast stroma" and "nucleus", and consisted of 14 molecular functions, which included ATP binding, protein serine/threonine kinase activity, and cation binding (Additional file 13: Figure S3B). KEGG enrichment showed that most of the genes in the MEbrown module were enriched in "Starch and sucrose metabolism", "Plant-pathogen interaction", "Circadian rhythm -plant", "Protein processing in endoplasmic reticulum", "Galactose metabolism" and "Plant hormone signal transduction" (Additional file 13: Figure S3C).

Confirmation of RNA-Seq sequencing data by qRT-PCR analysis
The DEGs associated with low-temperature stress were selected for qRT-PCR assays to confirm the SMRT sequencing data. Eighteen genes were selected randomly from 134 DEGs coexpressed at all five temperature points. We found that the fold-changes in the expression calculated by sequencing data did not exactly match the expression values detected by qRT-PCR analysis, but the expression profiles were basically consistent for all 18 genes (Additional file 14: Figure S4). These analyses confirmed the reliability of the gene expression values generated from SMRT sequencing data.

Discussion
The approach of combining NGS and SMRT has become increasingly popular in researching plant responses to adverse environments for providing high-quality and increasingly complete assemblies at the transcriptome level [26-28]. SMRT can generate full-length or long sequences, and the high error rate can be overcome using short and high-accuracy NGS reads [19,22]. In this study, we combined SMRT and RNA-Seq methods to analyze the transcriptome assembly for M. falcata roots under low-temperature stress and identify key functional and regulatory genes involved in low-temperature tolerance. Ultimately, we obtained 115,153 nonredundant sequences, and the average ROI was long enough to represent the full-length transcripts (Table 1). We also predicted a total of 8,849 AS events, 73,149 SSRs and 4,189 LncRNAs.

Changes in physiological indicators
Our results indicated the complexity of physiological changes in M. falcata in response to low-temperature stress. Under low-temperature stress, the roots displayed relatively more extensive changes in membrane, antioxidants and osmolytes. An increase in the contents of MDA, proline, soluble sugar and electrolyte leakage has been demonstrated in coldtreated wheat and wild tomato [25,47]. During low-temperature domestication, M. falcata accumulates more sucrose and proline and high sucrose phosphate synthase (SPs) and sucrose synthase activity [48]. In this paper, the increase in MDA, electrolyte leakage, proline and soluble sugar contents was observed in low-temperature-treated M. falcata roots, suggesting that osmolytes might protect plant cell membranes, increase membrane stabilization, and balance osmotic pressure during low-temperature-induced dehydration in M. falcata; additionally, electrolyte leakage can be used as a direct stress marker to reflect cell membrane damage by low-temperature stress (Fig. 1). Low-temperature stress induces the activities of cell apoptosis factors, and it has been proven that the injury in M.
falcata under low temperature is related to proteins in cells [49]. The enzymatic antioxidant system is a protective mechanism employed to eliminate or reduce ROS and increase a plant's capacity for stress tolerance under low-temperature stress [50,51]. The changes in CAT, POD, and GSH may play a key role in the detoxification of ROS induced by low temperature in M. falcata roots.

Gene expression of M. falcata roots in response to low-temperature stress
In this study, a total of 11,369 DEGs were identified as responsive to low-temperature stress at all temperature points, of which 68.8% were induced and 31.2% were repressed under low-temperature stress. All DEGs were grouped into six subclusters (Fig. 3B), and an enrichment analysis was conducted with the KEGG pathway. In the organismal systems category, the most enriched pathway was "Plant-pathogen interaction", indicating a basic plant immunological response in M. falcata roots under low-temperature stress [52]. and plays a key role in alleviating ROS [54,55].

Genes encoding TFs in response to low-temperature stress
TFs play an important role in cell function and development and directly regulate gene expression through interactions with themselves and proteins to participate in plant stress regulatory processes, including low-temperature-related processes. Many TFs, including bHLH, bZIP, MYB, C2H2, ERF, NAC and WRKY, that confer tolerance to low temperature to plants have been identified using transcriptomic approaches [56]. In this study, 1,538 TF genes were differentially expressed between different temperature points and classified into 45 TF gene families. The most abundant TF family was WRKY, followed by ERF, MYB, bHLH and NAC. Our results are consistent with previous reports on TFs in plant low-temperature stress and suggest that the WRKY family plays a critical role in M. falcata low-temperature tolerance.

Identification of genes responsible for the response to low temperature
Plants have developed many mechanisms and pathways that enable them to minimize the negative effect of low temperature. Global analysis of stress-responsive genes facilitates the understanding of the plant response to low-temperature stress. In this paper, a total PB40804 was also annotated as "Amino acid transport and metabolism" in COG. The abundance of amino acid transporters is correlated with multitude fundamental roles in plant growth and development, and low-temperature stress could decrease the amino acid concentrations and alter their composition [59,60]. "Circadian rhythm -plant" was the most enriched pathway in KEGG, and we found that PB110405, the gene with the most GO terms in the biological process category, was enriched in this pathway. PB110405 was annotated with 9 GO terms in the biological process category, including "regulation of transcription, DNA-templated", "temperature compensation of the circadian clock", "response to hydrogen peroxide", "starch metabolic process" and "response to cold". The "Circadian rhythm-plant" pathway suggested that ambient temperatures in M. falcata were substantially influenced by low temperature. PB108808, a putative ortholog of MYB-related TF LHY in Arabidopsis, is a gene that is induced by low temperature and that indicates the presence of interplay between circadian rhythm and the response to low temperature in M.
falcata. The following pathway was "Arginine and proline metabolism". Arginine and proline metabolism is one of the central pathways for the biosynthesis of the amino acids arginine and proline [27]. Proline accumulation is a well-known measure for alleviating abiotic stress in plants [61]. Combined with the changes in proline contents under lowtemperature stress, our results indicated that osmotic regulatory substances, protective protein molecules in M. falcata, play important roles in the response to low-temperature stress, and the composition of aromatic compounds may change under low-temperature stress.

Identification of genetic modules corresponding to low-temperature stress
WGCNA, known as gene coexpression network analysis, is a systematic biological method that can be applied to the study of biological processes with multiple sources [62]. It has been proven that WGCNA can be an efficient data mining method, specifically for screening genes related to traits and for conducting modular classification to obtain coexpression modules with high biological significance [63]. In this paper, 8,683 DEGs were identified by comparing low-temperature-treated and control samples and were clustered into 12 gene modules after merging modules with similar expression patterns (Fig. 5). It was found that the MEbrown module was significantly correlated with low-temperature stress in M. falcata. GO enrichment analysis of MEbrown showed that regulatory pathways with biological significance could be obtained in this module. For example, we enriched "cold-response pathways", "response to stress" and "intracellular signal transduction" terms in GO annotation. COG classification illustrated that the MEbrown module was enriched in many DEGs in general function prediction only and the signal transduction mechanism. KEGG pathway enrichments revealed that there were many DEGs involved in starch and sucrose metabolism. We also found that electrolyte leakage was correlated with more genetic modules than other physiological indicators, which corroborated our finding in physiological assays that electrolyte leakage can be used as a direct stress marker to reflect cell membrane damage from low-temperature stress.

Conclusions
Taken together, we explored the transcriptomic changes in low-temperature-treated M.

Physiological assays of low-temperature-treated M. falcata roots
Leaf cell membrane damage was determined as electrolyte leakage [65]. MDA was determined using a modified thiobarbituric acid (TBA) method [66]. The activity of SOD was measured by the nitroblue tetrazolium (NBT) method [67]. The activity of CAT was measured following the method of Maehly and Chance [68]. The activity of POD was measured according to the method of Zaharieva et al. [69]. The superoxide anion radical (O 2 .-) contents were determined following Elstner's method [70]. The soluble protein content was determined according to the Bradford method [71]. The content of reduced glutathione (GSH) was fluorometrically estimated [72]. The proline content was determined by the ninhydrin method [73]. The soluble sugar content was determined following the method of Dreywood [74].
All assays described above were repeated four times on four biological replicates. The data shown as the mean ± SD were subjected to ANOVA to determine significant differences. The least significant differences (LSD) of means were determined using Duncan's test at the level of significance defined as α=0.05.

RNA isolation, library preparation and sequencing
Total RNA from each sample was isolated using the TRIzol Reagent (Invitrogen, USA) according to the manufacturer's protocol, and genomic DNA was removed via digestion with DNase I (TaKaRa, Japan). Then, the purity, concentration and nucleic acid absorption peak were measured with a Nanodrop ND-1000 spectrophotometer (NanoDrop, USA). RNA integrity was detected by an Agilent 2100 Bioanalyzer (Agilent, USA). Genomic DNA contamination was detected by electrophoresis.
The library was prepared after the samples passed the quality tests. For Illumina cDNA library preparation, 20 μg of total RNA from each pool was enriched with magnetic beads with oligo (dT) and randomly interrupted by adding fragmentation buffer. Then, first-strand cDNA was synthesized with random hexamers by using mRNA as a template.
Second-strand cDNA was synthesized after adding buffer, dNTPs, RNase H and DNA polymerase I. The cDNA was purified with AMPure XP beads. The purified double-stranded cDNA was subjected to end repair, the addition of a poly-A tail and ligation with the sequencing linker, and the fragment size was selected using AMPure XP beads. Finally, the cDNA library was prepared by PCR enrichment.
For PacBio Iso-Seq library preparation, the cDNA was synthesized by using the SMARTer™ PCR cDNA Synthesis Kit (TaKaRa, Japan). cDNA libraries of different sizes were generated using BluePippin. Then, the screened cDNA was amplified by PCR, end-repaired, connected to the SMRT dumbbell type connector, and exonuclease digested. Finally, the library was prepared after a secondary screening using BluePippin. A total of eight SMRT cells were used for the three libraries at three size ranges: 1-2 kb, 2-3 kb, and 3-6 kb.
After the accurate quantification of libraries using Qubit 2.0 and the library sizes were qualified using Agilent 2100, the libraries were sequenced using PacBio RS II with 8 SMRT cells and an Illumina HiSeq 2500 platform in the Biomarker Institute (Biomarker, China).

Quality filtering and transcriptome assembly
Raw reads were processed into error-corrected ROIs using the ToFu pipeline with full passes>=0, and the accuracy of the sequence was greater than 0.75 (https://github.com/PacificBiosciences/cDNA_primer/wiki/Understanding-PacBiotranscriptome-data#readexplained). High-quality clean data were obtained by removing reads containing connectors, low-quality reads (including those with N removal ratio greater than 10%, and reads where the number of bases with mass value Q≤10 accounted for more than 50% of the reads). Next, FLNC transcripts were determined by searching for the poly-A tail signal and the 5' and 3' cDNA primers in ROIs. Iterative Clustering for Error Correction (ICE) was used to obtain consensus isoforms, and full-length consensus sequences from ICE were polished using SMRT Analysis (v2.3.0). Full-length transcripts with postcorrection accuracy above 99% were generated for further analysis. Redundant reads were removed from the Iso-Seq high-quality full-length transcripts using CD-HIT (identity > 0.99). The resulting transcript sequence was directly used for subsequent analysis of AS events, SSRs and LncRNAs. The second-generation data were used to quantify and differentially analyze the new coding sequence (CDS).

Identification of AS events, SSRs, LncRNAs and CDSs
We used Iso-SeqTM data directly to run all-vs-all BLAST with high-identity settings; BLAST alignments that met all criteria were considered products of candidate AS events: There were two high-scoring segment pairs (HSPs) in the alignment; the two HSPs had the same forward/reverse direction, within the same alignment, one sequence was continuous, or with a small "Overlap" size (smaller than 5 bp); the other one was distinct to show an "AS Gap", and the continuous sequence mostly completely aligned to the distinct sequence.
Transcripts with lengths greater than 200 nt and with more than two exons were selected as LncRNA candidates and further screened using CPC/CNCI/CPAT/Pfam, which have the power to distinguish the protein-coding genes from the noncoding genes. TransDecoder (https://github.com/TransDecoder/TransDecoder/releases) was used to identify CDS regions within transcript sequences.

Gene function annotation
To acquire the most comprehensive annotation, all full-length transcripts from SMRT were aligned with the NCBI NR, SwissProt, GO, COG, KOG, Pfam, and KEGG databases using BLAST software (version 2.2.26) [45]. GO enrichment analysis was implemented by the GOseq R package-based on Wallenius noncentral hypergeometric distribution [75]. KOBAS software was used to test the statistical enrichment of DEGs in KEGG pathways [76].

Quantification of gene expression levels
The gene expression level of each sample was identified by RSEM [46]. Clean data were mapped back onto the assembled transcriptome, and readcount for each gene was obtained from the mapping results. Bowtie2 software was used to compare the clean data from Illumina sequencing to the SMRT sequencing data. Quantification of gene expression levels was estimated by FPKM considering the effect of the sequence depth and gene length on the fragments.

Identification of DEGs
Differential expression analysis was performed using the DESeq R package (1.10.1) to identify DEGs between the low-temperature-treated and control samples and samples collected at different temperature points [77]. DESeq provides statistical analyses for determining differential expression in digital gene expression data using a model based on the negative binomial distribution. In the detection of DEGs, fold change≥2 and FDR<0.01 were taken as the screening criteria. The fold change represents the ratio of expression amount between two samples (groups). The FDR was obtained by correcting the p-value of difference significance. Because the differences in the transcriptome sequencing expression analysis is the transcribed expression values of a large number of independent statistical hypothesis tests, false positives are a concern; thus, in the process of analyzing DEGs, the recognized Benjamini-Hochberg correction method of hypothesis testing with the original significance p values for correction and, eventually, the FDR were used as key indicators of screening DEGs.

Identification of putative TFs
All DEGs were BLAST searched to a plant TF database (PlantTFDB4.0, http://planttfdb.cbi.pku.edu.cn) to identify putative TFs. TF information was annotated based on the comparison results.

Coexpression network analysis with WGCNA
Coexpression networks were constructed using the WGCNA package in R from all DEGs [62]. The modules were obtained using the automatic network construction function blockwise modules with default settings. The eigengene value was calculated for each module and used to test the association with each physiological index. The total connectivity and intramodular connectivity (function soft connectivity), kME (for modular membership, also known as eigengene-based connectivity), and kME-P values were calculated for the DEGs.

Validation of RNA-Seq data by qRT-PCR
RNA samples isolated above were used as templates and reverse transcribed with the HiScript II Q Select RT SuperMix for qPCR (gDNA eraser) kit (Vazyme, China). Primers used in this study were designed using Primer 5 with RefSeq and listed in Additional file 15:

Ethics approval and consent to participate
Not applicable.

Consent for publication
Not applicable.

Availability of data and material
All relevant supplementary data are provided within this manuscript as Additional files.