Transcriptome profiles revealed molecular mechanisms of alternating temperatures in breaking the epicotyl morphophysiological dormancy of Polygonatum sibiricum seeds

To adapt to seasonal climate changes in natural environments, Polygonatum sibiricum seeds have a long period of epicotyl morphophysiological dormancy (MPD), which limits their use in large-scale propagation. Controlled consecutive warm and cold temperature treatments have been shown to break and shorten this dormancy, promoting growth of the underdeveloped embryo, radicle emergence, and shoot emergence. To uncover the molecular basis of seed dormancy release and seedling establishment, a SMRT full-length sequencing analysis and an Illumina sequencing-based comparison of P. sibiricum seed transcriptomes were combined to investigate transcriptional changes during warm and cold stratifications. A total of 87,251 unigenes, including 46,255 complete sequences, were obtained and 77,148 unigenes (88.42%) were annotated. Gene expression analyses at four stratification stages identified a total of 27,059 differentially expressed genes (DEGs) in six pairwise comparisons and revealed that more genes were differentially expressed at the Corm stage than at the other stages, especially relative to Str_S and Eme. The expression of 475 hormone metabolism genes and 510 hormone signaling genes was modulated during P. sibiricum seed dormancy release and seedling emergence. A total of 1018 transcription factors and 519 transcription regulators were differentially expressed during stratification and germination, especially at the Corm and Str_S stages. Of 1246 DEGs with known roles in seed dormancy or germination, 378, 790, and 199 were associated with P. sibiricum morphological dormancy (MD) release (Corm vs Seed), epicotyl dormancy release (Str_S vs Corm), and seedling establishment after MPD release (Eme vs Str_S), respectively.
A comparison with dormancy- and germination-related genes in Arabidopsis thaliana seeds revealed that genes related to multiple plant hormones, chromatin modifiers and remodelers, DNA methylation, mRNA degradation, endosperm weakening, and cell wall structures coordinately mediate P. sibiricum seed germination, epicotyl dormancy release, and seedling establishment. These results provided the first insights into molecular regulation of P. sibiricum seed epicotyl morphophysiological dormancy release and seedling emergence. They may form the foundation of future studies regarding gene interaction and the specific roles of individual tissues (endosperm, newly-formed corm) in P. sibiricum bulk seed dormancy.

Keywords: Polygonatum sibiricum Red, Epicotyl morphophysiological dormancy, Temperature stratification, SMRT (single-molecule real-time) sequencing, Full-length transcriptome, RNA sequencing, Gene expression, Hormone, Transcription factor, Seed germination-related gene

Background
Polygonatum sibiricum Red (Huangjing in Chinese) is an edible perennial lily species with medicinal properties. The rhizome of this plant, together with those of P. kingianum Coll. et Hemsl and P. cyrtonema Hua, has been used as a traditional Chinese medicine for nourishing Qi and Yin and for enhancing spleen, lung, and kidney functions for approximately 1600 years [1][2][3][4]. Modern pharmacological studies have shown that polygonati rhizoma can improve immunity as well as lower blood sugar and lipid levels. Additionally, its anti-viral and anti-tumorigenic properties have been confirmed. Thus, polygonati rhizoma may be useful for developing novel drugs and health products relevant for treating age-related diseases, hypolipidemia, atherosclerosis, osteoporosis, liver diseases, diabetes mellitus, lung diseases, coughs, fatigue, and insomnia [3][4][5][6][7][8]. P. sibiricum is one of three Polygonatum species used as a source of polygonati rhizoma listed in the Pharmacopoeia of the People's Republic of China. As polygonati rhizoma is both edible and medicinal, the market demand for it has increased substantially, with more than 4000 tons produced annually. For many years, most of the polygonati rhizoma on the market was derived from wild plants, which has depleted the limited wild resources because of the long growth period needed to produce harvestable rhizomes (3-4 years for rhizome propagation and 5-6 years for seed propagation). To satisfy the increasing market demand, ensure the sustainable production and supply of polygonati rhizoma, and protect and preserve wild resources, Polygonatum plants are now widely cultivated in China. P.
cyrtonema Hua is mainly cultivated in the Yangtze River basin and in the southern region, whereas P. kingianum Coll. et Hemsl and P. sibiricum Red are primarily cultivated in Yunnan province and northern China, respectively. P. sibiricum can be propagated from its rhizomes or seeds. Because of the low efficiency of rhizome propagation and the considerable time needed for rhizome growth (3-4 years), seed propagation may be the superior option for the large-scale cultivation of P. sibiricum plants. However, P. sibiricum seeds at maturation have morphophysiological dormancy (MPD) and require a long dormancy period of about 15 months under natural conditions to complete the morphological and physiological dormancy-related processes before seedling emergence. Seed structure, endogenous inhibitors, and the underdeveloped embryo at maturation were found to influence P. sibiricum seed dormancy and germination [9]. Methods for effectively stimulating the germination of P. sibiricum dormant seeds, including soaking, exogenous hormone treatments, and temperature stratifications, were also evaluated. P. sibiricum seeds exposed to 25°C germinated in 30-60 days and formed corm tissue, after which a low-temperature (4°C) treatment of about 60 days was needed to break the epicotyl dormancy before seedling emergence [10][11][12]. Similar to the germination process of MPD seeds of Lilium dahuricum [13], Lilium polyphyllum [14], and Arisaema dracontium, especially in cold regions [15,16], P. sibiricum seeds underwent corm formation and plumule development, and required nutrients transported from endosperm reserves into the new corm tissue and a cold stratification prior to seedling establishment [12,17]. This type of seed dormancy and germination is quite different from the corresponding processes of many other MPD seeds such as Paris polyphylla [18], Panax quinquefolius [19], and Paeonia suffruticosa Andr [20].
In dormant seeds of Paris polyphylla, Panax quinquefolius and Paeonia suffruticosa, the embryo differentiates into a visible radicle, plantule, hypocotyl or epicotyl, and/or cotyledons inside the seed before germination. In contrast, the immature club-shaped embryo in P. sibiricum seeds elongates under suitable warm and moist conditions and pushes the radicle, hypocotyl, and plantule primordium out of the endosperm through the hilum; the visible plumule then differentiates and develops on the protuberant hypocotyl (defined as the "corm") and stops growing until a cold stratification is applied to release epicotyl dormancy [12,17] (Fig. 1). Currently, the molecular basis of P. sibiricum seed dormancy release and germination, especially in terms of corm formation and epicotyl dormancy, remains unclear. Because genome and transcriptome sequence information is lacking, in this study we applied a single-molecule real-time sequencing (SMRT-seq) strategy to analyze a pooled total RNA sample derived from six stages to generate a complete and full-length P. sibiricum transcriptome during seed germination and seedling emergence. The transcript isoforms served as the reference sequences for the functional annotation of P. sibiricum genes and in the subsequent comparative transcriptome study. Additionally, we conducted an Illumina short-read sequencing-based comparison of the transcriptomic profiles in four key stages of P. sibiricum seeds during dormancy release. The potential genes related to the MPD release of P. sibiricum mature seeds following a warm stratification to develop the corm and the subsequent epicotyl dormancy release by a cold stratification were separately identified. These results provided the first insights into the molecular regulation of P. sibiricum seed MPD release, germination, and seedling emergence.

Results
SMRT-seq analysis of P. sibiricum transcriptome during seed dormancy, germination, and seedling emergence
To obtain a sequenced P. sibiricum transcriptome during seed dormancy, germination, and seedling emergence, the PacBio RSII platform was used to perform a SMRT-seq analysis of a pooled RNA sample from six different seed developmental stages [dormant mature seed (Seed), early germinating seed during a warm stratification (Ger-S), germinated seed with a corm during a warm stratification (Corm), early stage (about 4 weeks) of a cold stratification (Str), late stage (about 8 weeks) of a cold stratification (Str_S), and seedling emergence during a warm stratification (Eme)]. A total of 4,789,895 subreads (11.36 Gb) were generated, with an average length of 2373 bp and an N50 of 3166 bp (Table 1). A total of 292,791 circular consensus sequences (CCS) were obtained using the SMRTlink 5.1 software and further classified into 49,909 non-full-length reads and 239,376 full-length reads, of which 230,162 were full-length non-chimeric (Flnc) reads. An isoform-level clustering analysis yielded 132,557 consensus reads, with an average length of 3076 bp, an N50 of 3633 bp, and an N90 of 2098 bp after correcting errors using RNA sequencing (RNA-seq) data. Redundant sequences among the 132,557 consensus reads were eliminated using the CD-HIT software, ultimately resulting in 87,251 unigenes. Approximately 82.45% of all unigenes (71,936) had only one transcript and 9.39% of unigenes had two transcripts (Table S1). The length distributions of the subreads, Flnc reads, and consensus sequences are presented in Fig. 2A-C. Additionally, 46,407 unigenes (53.19% of the total) were longer than 3 kbp (Fig. 2D). Using the ANGLE pipeline, 84,177 unigenes were predicted as protein-coding sequences, of which 46,255 were identified as full-length sequences (i.e., a complete coding sequence as well as 5′ and 3′ untranslated regions). The length distribution of the predicted protein-coding sequences is provided in Figure S1.

Fig. 1 A, five main stages of P. sibiricum seeds during stratification: dormant mature seed before a warm stratification (Seed); seed germination with the embryo extruding from the hilum (Ger-S) and cormlet formation (Corm) at 25°C stratification; the radicle and lateral roots form and the corm continues to grow at 4°C stratification (Str_S); and seedling emergence with a leaf after transferring to 25°C (Eme). B, four major stages of embryo development during germination at 25°C: i) undeveloped club-shaped embryo in a dormant seed at maturation; ii) immature embryo elongates and is extruded from the hilum, indicating the dormant P. sibiricum seed has germinated; iii) hypocotyl increases in size and a cormlet forms on the radicle; and iv) corm continues to grow and differentiate, which is accompanied by the emergence of a plumule. Em: embryo; Ra: radicle; Co: corm; Pl: plumule
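The N50 and N90 statistics quoted above summarize a read-length distribution: Nx is the length L such that reads of length L or longer contain at least x% of all sequenced bases. The following is a minimal sketch of that definition, not the pipeline used in the study; the function name `nx` and the toy lengths are illustrative assumptions.

```python
def nx(lengths, percent):
    """Return the Nx statistic (e.g. percent=50 for N50) of sequence lengths.

    Nx is the length L such that sequences of length >= L together
    contain at least `percent` % of the total bases.
    """
    if not lengths:
        raise ValueError("no lengths given")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= total * percent:
            return length


# Toy read lengths in bp (hypothetical, not data from the study).
toy_lengths = [2, 3, 4, 5, 6]
toy_n50 = nx(toy_lengths, 50)
toy_n90 = nx(toy_lengths, 90)
```

Because N90 requires covering more of the total bases, it is never larger than N50, which matches the reported values (N50 of 3633 bp, N90 of 2098 bp).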
Functional annotation of P. sibiricum transcriptome
A total of 87,251 full-length unigenes were functionally annotated based on BLAST searches of the NCBI non-redundant protein (Nr), NCBI non-redundant nucleotide (Nt), Swiss-Prot, Protein family (Pfam), Clusters of Orthologous Groups (KOG/COG) of proteins, Gene Ontology (GO), and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. Using an E-value cutoff ≤1e-5 and the top hit for the searches, a total of 77,148 unigenes (88.42%) had significant sequence matches in at least one of the databases (Table S2) (Table S3). Among 402 Viridiplantae species with sequence matches, the three species with the most matches to P. sibiricum unigenes were Asparagus officinalis (39,017, 52.13%), Elaeis guineensis (7561, 10.10%), and Phoenix dactylifera (6226, 8.31%), all of which are monocots. There are currently only 543 genes and 1732 protein sequences of Polygonatum species available in the NCBI databases. Our SMRT-seq analysis resulted in 544 P. sibiricum unigenes assigned to the Nr sequences of the following eight Polygonatum species: P. biflorum (1), P. cyrtonema (175), P. involucratum (1), P. multiflorum (143), P. pubescens (7), P. roseum (64), P. sibiricum (73), and P. verticillatum (79) (Table S4). These 544 unigenes are 34.7-100% homologous to known sequences from Polygonatum species.

Note: 5′ primer reads refers to the reads with the 5′ primer; 3′ primer reads refers to the reads with the 3′ primer; poly-A reads refers to the reads with poly-A; full-length non-chimeric (Flnc) reads refers to non-chimeric reads with the 5′ primer, 3′ primer, and poly-A

The KOG database was built with orthologous proteins encoded in the Arabidopsis thaliana (Arabidopsis) genome and the genomes of six other non-Viridiplantae species (http://www.ncbi.nlm.nih.gov/COG/). The BLASTx search of the KOG database matched 48,662 P. sibiricum unigenes with 7587 Arabidopsis orthologous proteins as well as 2244 P. sibiricum unigenes with 1279 orthologous proteins from non-Viridiplantae species. These 50,906 P. sibiricum unigenes were classified into 3209 orthologous protein clusters, and their distribution in 25 categories is presented in Figure S2. The largest KOG category was "general functional prediction only" (13,408, 15.37%), followed by "post-translational modification, protein turnover, chaperones" (5295, 6.07%) and "signal transduction mechanisms" (4766, 5.46%). The GO functional annotation resulted in 41,807 unigenes classified into 54 subcategories (Table S5) of the three main functional categories (Figure S3). Specifically, 26,629 unigenes were assigned to 25 biological processes, with "metabolic process," "cellular process," and "single-organism process" revealed as the largest subcategories.
Among 73,525 P. sibiricum unigenes annotated using the KEGG database, 34,513 were assigned KO identifiers, of which 22,381 unigenes were further mapped to 341 KEGG pathways (third level) (Table S6). A total of 9323 unigenes were assigned to the metabolic pathways, including "carbohydrate metabolism" (3240), "energy metabolism" (1969), "lipid metabolism" (1506), and "amino acid metabolism" (1934). Additionally, 4225 unigenes were assigned to cellular processes, of which 2871 were associated with "transport and catabolism" and 1238 were related to "cell growth and death." Among the environmental information processing modules, 3862 unigenes were predicted to affect signal transduction pathways, including the "plant hormone signal transduction pathway" (699) and "MAPK signaling pathway-plant" (656), suggesting they may be useful for studying the regulatory effects of hormones on P. sibiricum seed germination.
Differential expression of P. sibiricum genes during seed stratification
To investigate the gene expression dynamics and patterns in P. sibiricum seeds during warm and cold stratifications and seedling establishment, four stages (Seed, Corm, Str_S, and Eme; Fig. 1) were analyzed by RNA-seq using the Illumina PE150 platform, with three biological replicates for each stage. Each sample yielded 46-102 million clean reads, of which 53.97-76.34% were mapped to the SMRT reference transcripts using the RSEM software (Table S7). Gene transcription levels were assessed using FPKM values (i.e., the expected number of fragments per kilobase of transcript sequence per million mapped fragments), which adjust for transcript length and sequencing depth. The data indicated that 30.84-40.01% of the genes in the 12 samples had FPKM values < 0.1, whereas 32.71-39.86% of the genes had FPKM values > 1 (Figure S4A). Unigenes with FPKM values > 0.3 were considered to be expressed. Accordingly, 62,984 genes were expressed in at least one of the 12 samples. The FPKM density distribution of the genes indicated that the overall gene expression of the Corm stage differed from that of the other three stages (Seed, Str_S, and Eme) (Figure S4B). The FPKM-based PCA confirmed that different P. sibiricum seed developmental stages were well separated, even though the three replicates for the Seed sample were more poorly clustered than the replicates for the other three stages (Figure S5A). A Pearson correlation analysis also revealed that the correlation between Seed2 and the Seed1 and Seed3 replicates (R² = 0.68 and 0.66) (Figure S5B) was lower than the correlation among replicates for the other three stages. Therefore, only Seed1 and Seed3 were used for analyzing differential gene expression. Pairwise comparisons revealed that P. sibiricum seeds at the Corm stage had substantially more differentially expressed genes (DEGs), especially compared with the seeds at the Str_S and Eme stages (Fig. 3A).
Using an adjusted p value < 0.05 and |log2(fold-change)| ≥ 1 as the criteria, 7248, 17,619, and 19,319 DEGs were respectively detected in comparisons between the Corm stage and the Seed, Str_S, and Eme stages. Only 3183 and 1170 unigenes were differentially expressed between the Eme and Seed stages and between the Str_S and Seed stages, respectively. These results imply that imbibed P. sibiricum seeds with undeveloped embryos had physiological activities that were similar to those of P. sibiricum cold-stratified seeds and seedlings after the dormancy constraints in the Corm stage seeds were eliminated or alleviated by cold stratification. Of the 27,059 DEGs, 46-5411 were specific to a single pairwise comparison. Additionally, 8825 and 6428 DEGs were detected in two and three comparisons, respectively (Fig. 3B).
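The DEG criteria stated above (adjusted p value < 0.05 and |log2(fold-change)| ≥ 1) and the tally of how many comparisons each DEG appears in can be sketched as follows. This is an illustration of the thresholding step only, under the assumption that fold-changes and adjusted p values have already been produced by a differential expression tool; the function names, gene IDs, and values are hypothetical.

```python
from collections import Counter

PADJ_CUTOFF = 0.05     # adjusted p value threshold from the text
LOG2FC_CUTOFF = 1.0    # |log2(fold-change)| threshold from the text


def call_degs(results):
    """Map gene -> 'up' or 'down' for genes passing both thresholds.

    `results` maps gene ID -> (log2 fold-change, adjusted p value).
    Genes whose adjusted p value is None (not tested) are skipped.
    """
    calls = {}
    for gene, (log2fc, padj) in results.items():
        if padj is None:
            continue
        if padj < PADJ_CUTOFF and abs(log2fc) >= LOG2FC_CUTOFF:
            calls[gene] = "up" if log2fc > 0 else "down"
    return calls


def comparisons_per_gene(deg_sets):
    """Count how many genes are DEGs in exactly k pairwise comparisons.

    `deg_sets` maps comparison name -> set of DEG IDs.
    Returns a Counter mapping k -> number of genes.
    """
    per_gene = Counter()
    for genes in deg_sets.values():
        per_gene.update(genes)
    return Counter(per_gene.values())


# Hypothetical toy results for two comparisons (not data from the study).
corm_vs_seed = {
    "geneA": (2.3, 0.001),   # passes both thresholds, up-regulated
    "geneB": (-1.0, 0.04),   # passes both thresholds, down-regulated
    "geneC": (0.8, 0.0001),  # fold-change too small
    "geneD": (3.0, 0.2),     # not significant
}
strs_vs_corm = {
    "geneA": (-1.5, 0.01),
    "geneC": (1.2, 0.03),
}

deg_sets = {
    "Corm_vs_Seed": set(call_degs(corm_vs_seed)),
    "Str_S_vs_Corm": set(call_degs(strs_vs_corm)),
}
overlap_counts = comparisons_per_gene(deg_sets)
```

In this toy case `overlap_counts[1]` counts genes specific to one comparison and `overlap_counts[2]` counts genes shared by both, which is the kind of summary reported for Fig. 3B.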
A hierarchical clustering analysis (Fig. 4A) classified 27,059 DEGs into eight main subclusters. Compared

The KEGG pathway enrichment analyses of the P. sibiricum DEGs in the Corm vs Seed, Str_S vs Corm, Eme vs Corm, and Eme vs Str_S comparisons involved the top 20 pathways and −log10(adjusted p value) of "all DEGs" sets in each comparison (Fig. 5). The enriched pathways among the top 20 KEGG pathways differed greatly among the Str_S vs Corm, Eme vs Corm, and Eme vs Str_S comparisons. Only two KEGG pathways, "DNA replication" and "Spliceosome," were enriched for the "all DEGs" set in the Corm vs Seed comparison (Fig. 5A). Many enriched KEGG pathways were identified for the "all DEGs" set and for the "down-regulated" gene set in the Str_S vs Corm comparison (Fig. 5B) or for the "up-regulated" gene set in the Eme vs Str_S comparison (Fig. 5D). Thirteen KEGG pathways, including "starch and sucrose metabolism," were enriched for the "down-regulated" gene set, whereas three KEGG pathways, including "plant hormone signal transduction," were enriched for the "up-regulated" gene set in the Str_S vs Corm comparison (Fig. 5B). Additionally, 15 KEGG pathways were significantly enriched for the "up-regulated" gene set, but only three enriched KEGG pathways were identified for the "down-regulated" gene set in the Eme vs Str_S comparison (Fig. 5D). The enriched KEGG pathways in the Eme vs Corm comparison, including "starch and sucrose metabolism" and "steroid biosynthesis," were mainly associated with the "down-regulated" gene set (Fig. 5C).
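This section does not state which statistical test underlies the pathway enrichment. A common choice, assumed here purely for illustration, is a one-sided hypergeometric test per pathway followed by Benjamini-Hochberg adjustment, with −log10(adjusted p value) used for plotting. The sketch below uses only the Python standard library; all function names and numbers are hypothetical.

```python
from math import comb, log10


def hypergeom_sf(k, population, annotated, drawn):
    """P(X >= k) when `drawn` genes are sampled without replacement from
    `population` genes, of which `annotated` belong to the pathway."""
    upper = min(annotated, drawn)
    total = comb(population, drawn)
    hits = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(k, upper + 1)
    )
    return hits / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical example: 10 background genes, 3 in a pathway, 4 DEGs,
# all 3 pathway genes among the DEGs.
p_pathway = hypergeom_sf(3, population=10, annotated=3, drawn=4)

# Adjust a small set of hypothetical raw p values and convert for plotting.
raw_p = [0.01, 0.04, 0.03]
adj_p = benjamini_hochberg(raw_p)
neg_log10_adj = [-log10(p) for p in adj_p]
```

Tools used in practice may apply a different test or a different background gene set, so values from this sketch should not be expected to reproduce Fig. 5.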
Expression of hormone metabolism and signaling genes during P. sibiricum seed dormancy and germination
A BLASTx search of the Arabidopsis protein database identified 1189 P. sibiricum unigenes possibly involved in the metabolism of ABA, GA, auxin, BR, and other hormones or the upstream pathways (Table S8). Additionally, 1285 putative hormone signaling genes were identified based on the KEGG annotations and published relevant information for Arabidopsis (Table S8). The expression levels of 475 hormone metabolism genes and 510 hormone signaling genes changed during P. sibiricum seed dormancy release and seedling emergence (Table S8, Figure S6). Figures 6 and 7 present the expression levels of the selected DEGs involved in the biosynthesis, degradation, and signaling of ABA (51), GA (30), CK (30), auxin (60), BR (39), JA (54), and ethylene (37) in four samples. The expression levels of many ABA signaling genes (13 of 34) were up-regulated in the Corm, Str_S, and Eme stages (Fig. 6A), reflecting the importance of ABA signaling for seed germination and dormancy release. The CYP707A1 gene, which is involved in ABA degradation, was more highly expressed in the Corm (c9540), Str_S (c60708), and Eme (c29776) stages than in the Seed stage. The expression of a GA3ox gene (c4067) involved in GA biosynthesis was up-regulated in the Seed and Corm stages, especially compared with that in the Eme stage, whereas the expression of a GA2ox gene (c8451) involved in GA degradation was up-regulated in the Corm and Eme stages (Fig. 6B).
Fifteen ABA- and GA-related genes were selected for a quantitative real-time polymerase chain reaction (qRT-PCR) analysis to verify the accuracy of the RNA-seq-based expression levels during the P. sibiricum seed dormancy release process (Figures S7 and S8). We analyzed four seed dormancy release stages, three involving a stratification at 25°C (Seed, Corm, and Eme) and one low-temperature stage (Str_S). The FPKM values indicated that two ABA biosynthesis-related genes, ZEP (zeaxanthin epoxidase) and AAO3 (abscisic-aldehyde oxidase 3), and one ABA degradation-related gene (CYP707A) had consistent expression trends in the four samples. More specifically, ZEP and CYP707A were expressed at low levels in the Corm stage, in contrast to the relatively high expression level at the low-temperature stage (Str_S). These results suggest that exposure to low temperatures is important for breaking P. sibiricum epicotyl dormancy. The ABA signaling genes, including ABI5 [basic leucine zipper (bZIP) transcription factor (TF)], ABF (ABA-responsive element-binding factor), PYL4 (polyketide cyclase/dehydrase and lipid transport superfamily protein), PYL8, and PP2C (protein phosphatase 2C), were also analyzed by qRT-PCR. The qRT-PCR results were largely consistent with the RNA-seq data. Two GA synthesis genes [GA3ox and GAMT2 (gibberellic acid methyltransferase 2)], five GA signaling genes [GASA3 (gibberellin-regulated protein 3), GASA6, GASA14, GID1, and GID1-like (gibberellin receptor)], and CIGR2 (chitin-inducible gibberellin-responsive protein 2) also showed consistent qRT-PCR and FPKM values (Figure S8). The GA3ox expression level was up-regulated at the Corm stage, whereas GAMT2 expression was up-regulated at the Seed and Eme stages. The GASA3, GASA6, and GASA14 genes were differentially expressed. Moreover, GID1 and CIGR2 were expressed at low levels during the Seed and Corm stages, but were highly expressed at the Str_S and Eme stages.
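The text reports that qRT-PCR and FPKM values were consistent but does not say how agreement was measured. One simple way to quantify it, shown here as an assumption rather than the authors' procedure, is the Pearson correlation between log-transformed values across the four stages; the same function applies to the replicate correlations mentioned earlier. The pseudocount of 1 and all values below are hypothetical.

```python
from math import log2, sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for constant input")
    return sxy / sqrt(sxx * syy)


def log2_plus_one(values):
    """log2(x + 1) transform; the pseudocount of 1 is an assumption."""
    return [log2(v + 1.0) for v in values]


# Hypothetical values for one gene across Seed, Corm, Str_S, and Eme.
fpkm = [1.0, 3.0, 7.0, 15.0]
qpcr_relative = [0.0, 1.0, 3.0, 7.0]

r = pearson_r(log2_plus_one(fpkm), log2_plus_one(qpcr_relative))
r_squared = r * r
```

An r close to 1 indicates that both methods rank the stages similarly for that gene; with only four stages per gene, such a coefficient is a rough consistency check rather than a formal test.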

Expression of transcription factors during P. sibiricum seed dormancy and germination
Transcription factors are critical for seed development and germination [21][22][23]. In this study, we annotated our transcriptome using the iTAK software and the associated database, ultimately identifying 2605 TFs from 67 TF families as well as 1552 transcriptional regulators (TRs) from 25 TR families (Tables S9, S10). Among them, 1018 TFs (57 TF families) and 519 TRs (21 TR families) were differentially expressed during P. sibiricum seed stratification and germination. A hierarchical clustering analysis performed using the FPKM values of the TF and TR DEGs produced 10 subclusters (Fig. 8 and Table S9). These 10 subclusters were classified into two main groups based on the relative expression levels in the Corm samples: Subclusters 1-6 (542 TFs/273 TRs) had decreased expression in Corm, whereas Subclusters 7-10 (476 TFs/246 TRs) had higher expression in Corm.
Some Arabidopsis TF/TR genes are known to play roles in seed dormancy and germination (Table S11). The possible roles of the differentially expressed TFs and TRs are listed in Table S11 and are described in more detail below.

Discussion
Polygonatum sibiricum is a well-known traditional Chinese medicinal plant throughout east Asian countries. Studies regarding the cultivation and seed biology of this plant species have been conducted, but its seed germination characteristics remain unclear [17][27][28][29][30]. Cheng et al. and Zhu et al. reported that GA3 and a low-temperature treatment (0°C for 120 days) may enhance P. sibiricum seed germination [28,30]. However, our research [12] suggested that warm stratification followed by a low-temperature stratification is a more appropriate strategy for inducing P. sibiricum seed germination, which accords with the characteristics of seeds with deep simple epicotyl MPD [31]. Under natural conditions, P. sibiricum mature seeds are dispersed in mid-fall and germinate under suitable warm conditions during the following spring and summer, with only the radicle and corm emerging. The shoot emerges in the spring after the second cold winter, indicating that P. sibiricum seeds exhibit deep morphophysiological epicotyl dormancy. However, P. sibiricum seeds can develop into seedlings within 6 months following successive exposures to warm/cold conditions [12]. Polygonatum kingianum, another important source of polygonati rhizoma, is mainly grown in Yunnan province, China. Its mature seed has an underdeveloped embryo and seed germination characteristics similar to those of P. sibiricum [32]. However, the two species differ in their seedling establishment processes after corm formation. No P. sibiricum seedling emerged under prolonged warm conditions after corm formation, indicating that a cold treatment is essential for the epicotyl dormancy release of the P. sibiricum corm and its leaf emergence. In contrast, P. kingianum germinated seeds had about 50% seedling emergence when incubated at 25°C [33]. In view of this, an additional low-temperature treatment seems unnecessary for P. kingianum seedling emergence after corm formation, although an appropriate chilling period can enhance the emergence rate of P. kingianum seedlings and accelerate their emergence [32,33]. These results indicated that Polygonatum species growing in different climatic regions have distinct seed germination properties and regulatory mechanisms.

Note: Genes were named based on annotated Arabidopsis homologs. Full information on the genes regulating Arabidopsis seed dormancy and germination is given in Table S11
To understand the molecular mechanisms of corm formation under warm temperature, cold stratification to break the epicotyl dormancy, and seedling establishment during P. sibiricum seed germination, we performed gene expression analyses of four stages (Seed, Corm, Str_S, and Eme) using transcriptome sequencing techniques. During the MD release, P. sibiricum seeds stratified at warm temperature undergo several important morphological changes including embryo growth, radicle emergence and elongation, and formation of a plumule-containing cormlet. Our DEG analyses revealed that there were more DEGs in the Corm vs Str_S and Corm vs Eme comparisons than in the other comparisons, implying that corm development is a key stage for the transcriptional regulation of P. sibiricum seed germination and seedling emergence. It also indicated that a specific cold stratification treatment period breaks the epicotyl dormancy via the expression-level changes of many genes in the developed corm. Fewer DEGs between the mature seed and the corm were identified during MD release. This may have been because our sampling time-points with the bulk seed did not cover all the important developmental events occurring in embryo differentiation and endosperm weakening during MD release, given that the genes determining plant development and growth are usually spatially and temporally expressed [34,35]. Wang et al. [32] found that 7 TFs including DAG2, Dof5.7, bZIP60, MYB111, MYB55, MYB46, and REM1 possibly regulated the expression of 17 hub genes that were altered among three different dormant statuses of P. kingianum corm. However, we only detected three homologs of AtbZIP60 in our experimental condition and found that they (c18106, c17463 and c8065) were all decreased during cold stratification and elevated in Corm and Eme under warm temperature (Table S9). bZIP60 TF has been found to modulate the unfolded protein response (UPR) in plants and could enhance heat stress tolerance [36]. 
The upregulation of bZIP60 transcripts in the P. sibiricum corm and seedling under warm temperature may suggest a role in warm temperature tolerance.
Studies on plant species with MPD seeds [37][38][39][40] revealed that their seed dormancy and germination are controlled jointly by endogenous hormones and environmental conditions including temperature, soil or seed moisture, light, smoke, and nutrient availability, which is similar to Arabidopsis and cereal crops with physiologically dormant seeds [41][42][43][44][45]. In addition to ABA and GA, which are two major hormones that respectively induce and break seed dormancy in most plants, other phytohormones, such as cytokinins, jasmonic acid, strigolactones, brassinosteroids, ethylene, salicylic acid, and auxin, may also regulate seed dormancy and germination in a plant species-dependent manner. Their contents in dormant and germinating seeds are partly regulated by the expression of genes related to hormone metabolism and signaling [40,46]. In this study, we identified 475 putative hormone metabolism-related unigenes and 510 putative hormone signaling genes that were differentially expressed during P. sibiricum seed dormancy and seedling and shoot emergence (Figs. 6 and 7, Figure S6). Although ABA is considered to be crucial for inducing and maintaining seed dormancy, most P. sibiricum DEGs regulating ABA biosynthesis and degradation, such as NPQ1, ABA2, and CYP707A1, were more highly expressed in the Corm, Str_S, and Eme stages than in the Seed stage. ABI3 and ABI5 are two ABA signaling-related genes that positively control seed dormancy [43]. ABI5 is highly expressed in dormant seeds of several plant species [19,46]. Their homologous genes in P. sibiricum were expressed at higher levels in the Seed and Corm stages than in the cold-stratified germinated seeds and seedlings, implying that ABA may have accumulated during the Seed and Corm stages under warm conditions, leading to dormancy at these stages. In P. cyrtonema, which has similar seed dormancy/germination characteristics to P.
sibiricum, the ABA level in seeds stored for a long period in wet sand is reportedly higher under warm conditions than under cold conditions [40]. It was also found that P. kingianum germinated seeds at the corm stage had a higher ABA content than cold-stratified and non-dormant germinated seeds, which was consistent with the higher expression of ABA synthesis-related transcripts [33]. Fluridone is an inhibitor of ABA biosynthesis and promotes seed dormancy release in a manner similar to cold stratification [47,48]. In our recent pre-experiments, we observed that fluridone-treated P. sibiricum seeds germinate at a higher rate than untreated seeds (data not shown). Hence, a fluridone treatment of P. sibiricum MD seeds and the corm after MD release may be useful for elucidating the effects of ABA on P. sibiricum seed MD release and the induction of epicotyl physiological dormancy under warm conditions as well as epicotyl dormancy under cold conditions. Consistent with the molecular regulation of PD seed germination in Arabidopsis [43], many genes related to chromatin modifiers and remodelers, DNA methylation, mRNA degradation, and cell wall structures were also differentially regulated during P. sibiricum seed germination, epicotyl dormancy release, and seedling establishment (Table 2, Fig. 9, Table S11). During the germination of P. polyphylla, P. quinquefolius, and P. suffruticosa seeds, the cotyledons remain inside the seed coat/endosperm after MD release and cannot be pulled outside until the endosperm is sufficiently weakened to eliminate the mechanical resistance from the surrounding tissues (testa and endosperm). In contrast, the plumule of P. sibiricum develops outside of the hard and compact endosperm after MD release. However, the plumule does not immediately differentiate and elongate to push the shoot out, and this epicotyl physiological dormancy may be correlated with inhibitors in the corm [12] and the slow mobilization and transport of endospermic reserves.
Genes related to the metabolism of seed storage reserves, such as CathB3, MAN2, MAN7, PROTEOLYSIS6, and ANNAT2, as well as cell wall loosening genes (EXPA4 and EXPA8) and polygalacturonase genes (e.g. c19523, c4968), were highly expressed at the Corm stage, and their expression then decreased considerably during the cold stratification and seedling emergence. The β-mannanase and polygalacturonase activities in P. sibiricum seeds increase as the endosperm weakens and seed germination proceeds during a warm stratification [49].

Conclusions
In summary, we analyzed seed samples at four key stratification stages to explore the molecular mechanism regulating seed germination and dormancy release. A full-length transcriptome database was established, after which the expression patterns of some dormancy-related DEGs at the Corm, Str_S, and other stages were analyzed. The results of this study have helped to further characterize the P. sibiricum seed dormancy trait and may form the basis of future related investigations. Specifically, we built databases comprising unigenes involved in the P. sibiricum seed germination process and phytohormone-related unigenes associated with seed dormancy (Table S8 and Table S11). These databases may enable researchers to further elucidate the molecular mechanism underlying seed dormancy and germination.

Plant materials
Polygonatum sibiricum Red. plants originated from Qishan County, Shanxi province, China, and were kindly provided by a local P. sibiricum grower. They were identified by Dr. Jianjun Qi at the Institute of Medicinal Plant Development. These plants were then cultivated and conserved in a shaded field at the Institute of Medicinal Plant Development, Beijing, China. The seed voucher specimen (HGJG0012) was deposited at the national medicinal plant gene bank of the Institute of Medicinal Plant Development, Beijing, China. Normal field management was applied, according to our institutional field plantation guidelines. Yellow or black ripe berries were collected in early October 2016, fermented at room temperature for several days to soften the pulp outside the seeds, and then rubbed in a fine nylon mesh bag to obtain clean seeds. Fully filled, healthy mature seeds were soaked in tap water for 1 day and then stratified in moist sand in a plastic box (10 × 15 cm), which was covered with a lid to slow evaporation. Three sequential temperature stratifications were used to induce P. sibiricum seed germination and seedling emergence: 1) a warm stratification at 25°C for 4-6 weeks to promote radicle extrusion and cormlet formation; 2) a cold stratification of the germinated seeds with corms at 4°C for 8 weeks; 3) a transfer of the cold-stratified seeds to 25°C to induce seedling emergence. Warm and cold stratifications were conducted in a temperature-controlled incubator and in a refrigerator, respectively. In this study, the following P. sibiricum samples were collected separately: mature seeds soaked for 1 day at room temperature before the 25°C stratification (Seed); seeds at the early germination stage during a warm stratification, with an extruded radicle (Ger-S); germinated seeds with a corm during a warm stratification (Corm); seeds at the early stage (about 4 weeks) of a cold stratification (Str); seeds at the late stage (about 8 weeks) of a cold stratification (Str_S); and seeds at the seedling emergence stage during a warm stratification (Eme). The samples were frozen in liquid nitrogen and then stored at −80°C for later use.

Iso-seq library construction and PACBIO SMRT sequencing
To represent the transcriptome information from the different seed stages as completely as possible, we sequenced full-length transcripts of the genes expressed in seeds at six stages (Seed, Ger-S, Corm, Str, Str_S, and Eme) using the PacBio technique.
The workflow used to generate the Iso-seq and RNA-seq data is shown in Figure S9. Total RNA was extracted from the collected P. sibiricum seeds (Seed, Ger-S, Corm, Str, Str_S, and Eme) using the TRIzol reagent (Invitrogen, USA) according to the manufacturer's instructions. RNA samples with RIN values ≥7.8 were mixed equally to form an RNA pool for constructing the Iso-Seq library and the PacBio full-length cDNA sequencing. Poly-(A) RNA was isolated from the RNA pool using the oligo-(dT) magnetic bead-binding method and the Poly-(A) Purist™ Kit (Invitrogen, USA). The isolated poly-(A) RNA was eluted with RNase-free water. The mRNA (1 μg) was used as the template to synthesize cDNA with the Clontech SMARTer cDNA synthesis kit. After the PCR amplification, quality control, and purification steps were completed, the BluePippin Size Selection System was used to produce three fractions containing fragments 1-2, 2-3, and 3-6 kb long. The cDNA products were then used to construct SMRTBell Template libraries using the SMRTBell Template Prep Kit. The concentration and quality of the cDNA library were determined using the Qubit 2.0 fluorometer and the Agilent 2100 Bioanalyzer, respectively. All operations during the library construction followed the protocols of the kits listed above. Finally, three SMRT cells were sequenced on the PacBio RS platform (Pacific Biosciences, Menlo Park, CA, USA) at Beijing Novogene Scientific Co., Ltd. (Beijing, China).
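The pooling step above can be illustrated with a short calculation. The following is a minimal sketch, assuming that "mixed equally" means equal RNA mass per sample and that samples below the RIN cutoff are left out of the pool; the function name, concentrations, RIN values, and per-sample mass are hypothetical and are not taken from the study.

```python
RIN_THRESHOLD = 7.8  # quality cutoff stated in the text


def pooling_volumes(samples, mass_per_sample_ng):
    """Return the volume (in µL) of each passing sample for an equal-mass pool.

    samples: dict mapping stage name -> (concentration in ng/µL, RIN value).
    Samples below the RIN threshold are skipped; this handling of failing
    samples is an assumption, not something the methods describe.
    """
    volumes = {}
    for stage, (conc_ng_per_ul, rin) in samples.items():
        if rin < RIN_THRESHOLD:
            continue
        # volume = mass / concentration
        volumes[stage] = mass_per_sample_ng / conc_ng_per_ul
    return volumes


# Hypothetical concentrations and RIN values, for illustration only.
example = {
    "Seed": (500.0, 8.1),
    "Ger-S": (250.0, 7.9),
    "Corm": (400.0, 8.4),
    "Str": (200.0, 7.8),
    "Str_S": (1000.0, 9.0),
    "Eme": (800.0, 8.6),
}
volumes = pooling_volumes(example, mass_per_sample_ng=1000.0)
for stage, vol in volumes.items():
    print(f"{stage}: {vol:.2f} µL")
```

Pooling by mass rather than by volume keeps each stage equally represented even when RNA yields differ between stages.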

RNA-seq library construction and sequencing
Samples collected at the Seed, Corm, Str_S, and Eme stages (three biological replicates each) underwent an RNA-seq analysis. The poly-(A) mRNA was enriched from the total RNA using oligo-(dT) magnetic beads. Following the enrichment, the mRNA was fragmented into small pieces in fragmentation buffer. These fragments served as templates for the first-strand cDNA synthesis using Superscript™ III reverse transcriptase and random hexamer (N6) primers. The RNA templates were removed, after which the second cDNA strand was synthesized using dNTPs, DNA polymerase I, and RNase H. The resulting short cDNA fragments were purified with AMPure XP beads. After the end-repair and A-tailing steps, the short cDNA fragments were ligated with the Illumina paired-end adapters and purified with AMPure XP beads. Next, PCR was used to selectively enrich DNA fragments with adapters at both ends and prepare the final cDNA library, according to the kit's protocols. The concentrations of the cDNA libraries were determined using the Qubit 2.0 fluorometer (Life Technologies, Carlsbad, CA, USA) and their quality was evaluated using the Agilent 2100 Bioanalyzer. Finally, the 12 constructed libraries were sequenced from both ends using the Illumina HiSeq™ 2500 system (Illumina, San Diego, CA, USA) at Beijing Novogene Science Co., Ltd. (Beijing, China).
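The sampling design above (four stages, three biological replicates) determines both the number of libraries and the number of pairwise stage comparisons. The sketch below enumerates them; the library naming scheme is hypothetical, and the "later vs earlier" label order simply follows the comparison names used in this paper (e.g. Corm vs Seed).

```python
from itertools import combinations

STAGES = ["Seed", "Corm", "Str_S", "Eme"]  # stages sequenced by RNA-seq
N_REPLICATES = 3                           # biological replicates per stage

# One library per stage and replicate (names are illustrative only).
libraries = [
    f"{stage}_rep{rep}"
    for stage in STAGES
    for rep in range(1, N_REPLICATES + 1)
]

# Every unordered pair of stages, labeled "later vs earlier".
comparisons = [
    f"{later} vs {earlier}"
    for earlier, later in combinations(STAGES, 2)
]

print(len(libraries), "libraries")
print(len(comparisons), "pairwise comparisons:", comparisons)
```

Four stages give 4 × 3 = 12 libraries and C(4, 2) = 6 comparisons, matching the 12 libraries sequenced and the six pairwise comparisons in which DEGs were identified.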
Additional nucleotide errors in the polished Flnc consensus sequence were corrected based on the Illumina RNA-seq data with the LoRDEC software. Redundant isoforms in the corrected consensus reads were removed with CD-HIT (-c 0.95 -T 6 -G 0 -aL 0.00 -aS 0.99) to obtain the final non-redundant reference transcripts for the subsequent analyses.
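For readers who want to reproduce the redundancy-removal step, the reported CD-HIT parameters can be assembled into a command line as sketched below. The text names only "CD-HIT", so the use of the nucleotide program `cd-hit-est`, the file names, and the helper function are assumptions; `-i` and `-o` are the standard CD-HIT input and output options. In CD-HIT, `-c` is the sequence identity threshold, `-T` the number of threads, `-G 0` selects local rather than global sequence identity, and `-aL` and `-aS` are the minimum alignment coverage of the longer and shorter sequence, respectively.

```python
import shlex


def cd_hit_command(input_fasta, output_fasta, program="cd-hit-est"):
    """Build the argument list for the clustering step.

    The parameter values are those reported in the text; the program name
    and the file paths are assumptions made for this illustration.
    """
    return [
        program,
        "-i", input_fasta,    # corrected consensus reads (FASTA)
        "-o", output_fasta,   # non-redundant representative sequences
        "-c", "0.95",         # sequence identity threshold
        "-T", "6",            # number of threads
        "-G", "0",            # use local sequence identity
        "-aL", "0.00",        # min. alignment coverage of the longer sequence
        "-aS", "0.99",        # min. alignment coverage of the shorter sequence
    ]


cmd = cd_hit_command("corrected_flnc.fasta", "nonredundant.fasta")
print(shlex.join(cmd))
# To run it (requires CD-HIT to be installed):
# import subprocess; subprocess.run(cmd, check=True)
```

With `-aS 0.99` and `-aL 0.00`, a shorter transcript is merged into a cluster only if nearly its whole length aligns to the representative, while no coverage requirement is placed on the longer sequence.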

Gene functional annotation
Non-redundant Flnc transcripts were annotated based on searches of the following seven databases: Nr (NCBI non-redundant protein sequences), Nt (NCBI non-redundant nucleotide sequences), Pfam (Protein family), KOG/COG (Clusters of Orthologous Groups of proteins), Swiss-Prot (a manually annotated and reviewed protein sequence database), KEGG (Kyoto Encyclopedia of Genes and Genomes), and GO (Gene Ontology). The programs used for the functional annotation included hmmscan (version: 3.1b2) for the Pfam database analysis, blast+ (version: 2.6.0+) for the Nt database analysis, and diamond blastx (version: 0.8.36) for the Nr, KOG/COG, Swiss-Prot, KEGG, and GO database analyses. The E-value cutoff for all seven database analyses was set at ≤1e-5. The top hit of each search was used for the functional annotation. The open reading frame of each FL transcript was predicted using the ANGLE pipeline.
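The top-hit rule with the E-value cutoff can be sketched as a small filter over tabular search output. This is a minimal illustration, assuming the standard 12-column tabular format (query, subject, identity, length, mismatches, gap opens, query start/end, subject start/end, E-value, bit score) and assuming that "top hit" means the hit with the highest bit score; the function name and the example records are hypothetical.

```python
E_VALUE_CUTOFF = 1e-5  # cutoff stated in the text


def top_hits(tabular_lines, cutoff=E_VALUE_CUTOFF):
    """Return {query: (subject, evalue, bitscore)} for the best hit per query.

    Hits with an E-value above the cutoff are discarded. Ranking hits by
    bit score is an assumption about how the top hit was chosen.
    """
    best = {}
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > cutoff:
            continue
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


# Hypothetical records, for illustration only.
example_lines = [
    "tx1\tprotA\t90.0\t100\t10\t0\t1\t100\t1\t100\t1e-30\t200.0",
    "tx1\tprotB\t95.0\t100\t5\t0\t1\t100\t1\t100\t1e-40\t250.0",
    "tx2\tprotC\t40.0\t50\t30\t0\t1\t50\t1\t50\t1e-3\t35.0",
]
hits = top_hits(example_lines)
print(hits)
```

In this example, `tx1` is annotated with its higher-scoring hit and `tx2` stays unannotated because its only hit exceeds the cutoff.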
Hormone metabolism and signaling genes were annotated based on AraCyc (v17.1) and the KEGG database as well as the relevant published literature regarding Arabidopsis. Transcription factors and regulators were identified and classified using the iTAK program (version 1.7a) (https://github.com/kentnf/iTAK) and the associated database.