De novo assembly and comparative analysis of the transcriptome of embryogenic callus formation in bread wheat (Triticum aestivum L.)

Background During asexual reproduction the embryogenic callus can differentiate into a new plantlet, offering great potential for fostering in vitro culture efficiency in plants. The immature embryos (IMEs) of wheat (Triticum aestivum L.) are more easily able to generate embryogenic callus than mature embryos (MEs). To understand the molecular process of embryogenic callus formation in wheat, de novo transcriptome sequencing was used to generate transcriptome sequences from calli derived from IMEs and MEs after 3d, 6d, or 15d of culture (DC). Results In total, 155 million high quality paired-end reads were obtained from the 6 cDNA libraries. Our de novo assembly generated 142,221 unigenes, of which 59,976 (42.17%) were annotated with a significant Blastx against nr, Pfam, Swissprot, KOG, KEGG, GO and COG/KOG databases. Comparative transcriptome analysis indicated that a total of 5194 differentially expressed genes (DEGs) were identified in the comparisons of IME vs. ME at the three stages, including 3181, 2085 and 1468 DEGs at 3, 6 and 15 DC, respectively. Of them, 283 overlapped in all the three comparisons. Furthermore, 4731 DEGs were identified in the comparisons between stages in IMEs and MEs. Functional analysis revealed that 271transcription factor (TF) genes (10 overlapped in all 3 comparisons of IME vs. ME) and 346 somatic embryogenesis related genes (SSEGs; 35 overlapped in all 3 comparisons of IME vs. ME) were differentially expressed in at least one comparison of IME vs. ME. In addition, of the 283 overlapped DEGs in the 3 comparisons of IME vs. ME, excluding the SSEGs and TFs, 39 possessed a higher rate of involvement in biological processes relating to response to stimuli, in multi-organism processes, reproductive processes and reproduction. Furthermore, 7 were simultaneously differentially expressed in the 2 comparisons between the stages in IMEs, but not MEs, suggesting that they may be related to embryogenic callus formation. The expression levels of genes, which were validated by qRT-PCR, showed a high correlation with the RNA-seq value. Conclusions This study provides new insights into the role of the transcriptome in embryogenic callus formation in wheat, and will serve as a valuable resource for further studies addressing embryogenic callus formation in plants. Electronic supplementary material The online version of this article (10.1186/s12870-017-1204-2) contains supplementary material, which is available to authorized users.


Background
Efficient plant regeneration from in vitro cultured cells and tissues is a prerequisite requirement for the successful application of plant genetic engineering, which has been incorporated into the improvement and breeding of many crops [1]. However, the regeneration efficiency of a great number of plant species is still low under in vitro conditions. Embryogenic callus formation offers great potential for fostering in vitro culture efficiency whereby somatic cells are induced to differentiate embryogenic cells and form somatic embryos that develop into new plants [2,3]. The ability to initiate embryogenic cultures is controlled by the intrinsic and external factors of the explants, including the genotype, the developmental stage, and the different tissue of the explants, as well as the factors associated with the medium as perceived by the complementary sensors in cells, including stress, phytohormones, and the artificial nutritional environment in the culture process [4][5][6]. In plants, the tissue culture response (TCR), regenerative property, regeneration response, and somatic embryogenesis are all traits that are used to study tissue culture.
The bread wheat (Triticum aestivum L.) is one of the most important human food crops. Considerable effort and progress has been made in tissue culture system optimization over the last few decades, however tissue culture efficiency is still low and lags behind that of other plants such as rice and maize [24,25]. A number of studies on wheat embryogenic callus formationassociated genes have been performed using Northern and semi qRT-PCR analyses [26,27]. Considerable data have been accumulated regarding the influence upon TCR, but with a limited knowledge of the genes involved therein.
Next-generation sequencing (NGS) technology has high sensitivity and includes both low-and high-level gene expression [28][29][30], and has been increasingly applied in numerous plants to elucidate development [31], senescence [32,33], effects of different factors on plant growth [32,33], and determine transcriptome changes in response to various abiotic stresses such as cold [34], salt [35], drought [36], water [37], and has also been used in wheat transcriptome studies regarding response to stress or disease [38,39]. In terms of in vitro culturing in plants, NGS has been successfully used in cotton [40], Picea balfouriana [41], camphor tree [42], Maize [43], and in Ramie [44] for transcriptome analysis. However, few reports have been found regarding the transcriptome in wheat in vitro culture at the genomic scale. Thus, the study of gene expression patterns and functioning during callus formation could provide a molecular basis for embryogenic callus formation in wheat. However, the large size and polyploid complexity of the wheat genome [45], and the difficulties of tissue culture, have been obstacles to the study of the genes involved in wheat embryogenic callus formation. Using Illumina deep sequencing, we compared the transcriptome expression changes between IMEs and MEs for in vitro culture at 3d, 6d, and 15d, and between developmental stages in IMEs and MEs, combined with the functional annotation of the DEGs, especially of those that are potentially involved in embryogenic callus formation. This will provide novel insights into wheat embryogenic callus formation and the transcriptome data presented here will serve as a valuable resource for future studies of the genes and gene regions involved.

Plant material and tissue culture
The bread wheat cultivar Zhoumai 18 was grown in the Zhengzhou experimental field (longitude E 113.65°, latitude N 34.76°) of the Henan Agricultural University of China during the 2011-2012 cropping season. Harvested seeds were planted in the same experimental field in October 5 2012, after which they were transplanted into a greenhouse in January 15 2013 under the following controlled environmental conditions: 75% relative humidity, 26/20°C day/night temperature, 12−/12-h light/ dark photoperiod, and a light intensity of 10,000 lx. Anthesis was recorded in February of 2013 when 50% of the plants had reached the flowering stage.
Immature seeds from the main spikes were harvested 14 days after pollination. The mature seeds were soaked in running tap water for 4 h. Immature and mature seeds were surface-sterilized for 30 s in 75% (v/v) ethanol, followed by 6 min immersion in 0.1% (m/v) mercuric chloride solution with agitation, and then rinsed 4 times with sterilized distilled water. The IMEs and MEs were extracted from the sterilized seeds on a clean bench and explants were placed with the scutellum upwards in sterile Petri dishes containing solid Murashige and Skoog (MS) medium (MS basal salts, Gamborg's B5 vitamins, and 2 mg L −1 2, 4-Dichlorophenoxyacetic acid). They were then grown in a growth chamber at 22-24°C in the dark. The MEs and IMEs were cultured in 3 biological replicates, each replicate consisting of 15 plates and each plate containing 10 embryos. The embryos were harvested at 3 DC, 6 DC, and 15 DC from 5 plates in each of the 3 replicates, after which they were snap-frozen in liquid nitrogen and stored at −80°C until RNA extraction.

RNA extraction, RNA-seq library construction and Illumina sequencing
Total RNA was extracted from the IME and ME callus in the 3 biological replicates using TRIzol reagent (Invitrogen, Carlsbad, CA, USA) according to the manufacturer's instructions, and then characterized on a 1% agarose gel. The purity of the RNA was examined using a NanoDrop 2000 spectrophotometer (IMPLEN, CA, USA), and the integrity was examined using an Agilent 2100 Bioanalyzer (Agilent Technologies, CA, USA). The RNA samples of the 3 biological replicates were mixed in equal amounts and used for the construction of libraries by the Biomarker Biotechnology Corporation (Beijing, China). A total of 3 μg RNA for each sample was used as input material for cDNA library construction. The mRNA was enriched and purified with oligo (dT)rich magnetic beads and then broken into short fragments. Taking these cleaved mRNA fragments as templates, first strand cDNA was synthesized and primed by oligo-dT and the second strand cDNA was synthesized using random primers. The resulting cDNAs were then subjected to end-repair and phosphorylation using T4 DNA polymerase and Klenow DNA polymerase.
Following this, an "A" base was inserted as an overhang at the 3′ ends of the repaired cDNA fragments and Illumina paired-end solexa adaptors were subsequently ligated to these cDNA fragments to distinguish the different sequencing samples. To select a size range of templates for downstream enrichment, the products of the ligation reaction were purified and visualized on a 2% agarose gel. Next, PCR amplification was performed to enrich the purified cDNA template.
RNA sequencing was performed on an Illumina HiSeq (TM) 2000 (San Diego, CA, USA). The RNA-seq read data were deposited in the NCBI Sequence Read Archive (SRP093588).

De novo assembly and functional annotation
Clean, high quality reads were obtained after filtering the adaptor sequences and reads with ambiguous "N" bases and with a base quality less than Q30 (and more than 10% base quality less than Q20). The clean reads were assembled into contigs using the Trinity method (Grabherr MG, et al., 2011) using an optimized k-mer length (25-mer). Subsequently, the contigs were assembled into transcripts according to the paired-end information, after which they were clustered based on sequence similarity. Finally, the longest transcript in each cluster was regarded as a unigene, and the singletons were combined together as total unigenes. All unigenes were mapped to the reference sequences (wheat unigenes from NCBI) and allowed no more than one nucleotide mismatch. For annotation, all unigenes were used for the blast search and annotation against a series of public databases using BLASTx (E-value-5 ≤ 10), including the NCBI non-redundant (Nr) protein databases (http://www.ncbi.nlm.nih.gov), the Swiss-Prot protein database (http://www.expasy.ch/sprot), the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database (http://www.genome.jp/kegg), the Cluster of Orthologous Groups (COG) database (http://www.ncbi.nlm.nih.gov/ COG), and the pfam database. Functional classification by gene ontologies (GO) of all unigenes was performed using Blast2GO software.

Identification of differentially expressed genes
To quantify the gene expression level, the number of mapped clean reads was estimated using RSEM [46] for each sample, and the normalized expression values RPKM (reads per kilo base of exon model per million mapped reads) of each unigene in the 6 libraries were used as the value of gene expression levels [47]. To determine significant differences in gene expression, we used threshold criteria based on the FDR statistical method and compared the normalized expression values RPKM using a threshold value of P ≤ 0.001 and |log2 Ratio| ≥ 1 based on the FDR < 0.05.

Analysis of transcription factors
Putative transcription factors were identified by BLASTx against the Plant Transcription Factor Database (PlnTFDB) 3.0 [48] using the bread wheat Transcription Factor (http://planttfdb.cbi.pku.edu.cn/index.php?sp=Tae), Arabidopsis Transcription Factor (http://planttfdb.cbi.pku.edu.cn/index.php?sp=Ath) and the alignment results of orthologous in the NCBI. Subsequently, the differentially expressed TFs were picked out from the DEGs in the IME vs. ME comparison, and between the stages in IMEs and MEs.

Unigene quantification by real-time PCR.
A total of 32 unigenes were selected for expression profile validation by qRT-PCR. Reverse transcription (RT) reactions were performed in 3 independent biological replicates with RNA that was individually extracted from 3 independent biological samples of six types of callus, performed using a PrimeScript® RT reagent Kit with gDNA Eraser (Perfect Real Time; Takara, China). The first genomic DNA elimination reaction was conducted in a final volume of 10 μL including 2.0 μL of 5× gDNA Eraser Buffer, 1.0 μL of gDNA Eraser, 2.0 μL of total RNA, and 5 μL of RNase Free dH2O. Reactions were incubated at 42°C for 2 min, and then maintained at 4°C. The RT reactions (10.0 μL) were then used for the SYBR® Green qPCR assay in a 20-μL reaction mixture that included 4.0 μL of 5 × PrimeScript® Buffer 2 (for real time), 1.0 μL of PrimeScript® RT Enzyme Mix I, 1.0 μL of RT Primer Mix, and 4.0 μL of RNase Free dH2O. The reactions were incubated at 37°C for 15 min, followed by 85°C for 5 s, after which they were maintained at 4°C.
Real-time PCR was performed on a Bio-Rad CFX96TM Real-Time System with SYBR® Premix Ex Taq II™ (Takara, China). Each reaction included 2 μL of product from the diluted RT reactions (cDNA solution), 1.0 μL of each primer (Additional file 1), 12.5 μL of SYBR® Premix Ex TaqTM II (2×), and 8.5 μL of RNase free water. All qRT-PCR reactions were incubated in a 96-well plate at 95°C for 30 s, followed by 40 cycles at 95°C for 5 s, 60°C for 30 s, and 72°C for 30 s. The actin gene (GenBank: AB181991) was used as the endogenous control. All reactions were run in triplicate. The specificity of each primer pair was verified by agarose gel electrophoresis and melting curve analysis. The relative expression levels of genes were calculated using the 2 -ΔΔCt method [49]. The gene sample in the ME library along with the CT value was selected as the calibrator, in which the expression level was set as 1.0. The relative expression levels of the same genes in the corresponding IME library were then normalized by comparison.

Illumina sequencing and de novo assembly
With regards to the transcriptional analysis, the calli from IMEs and MEs cultured in vitro at 3d, 6d and 15d were selected, resulting in a total of 6 cDNA libraries (IME3, IME6, IME15 and ME3, ME6, ME15) that were constructed and sequenced using the Illumina HiSeq™ 2000 platform. Following data cleaning and quality checking, 155,192,839 reads (31.34 G) with Q30 values ≥80.00% were obtained from the six libraries. Among the clean reads, 100% had quality scores at Q20 level ( Table 1).
The high-quality reads were then merged and de novo assembled using the Trinity platform software, resulting in the generation of 331,201 transcripts with an average length of 1009.69 bp and a N50 length of 1639 bp. The total length of all transcripts was approximately 334 Mb. These transcripts were further subjected to cluster and assembly analyses, resulting in 142,221 unigenes with an average length of 657 bp and an N50 value of 1001 bp, of which 23,875 unigenes were more than 1kbin length ( Table 2, Additional file 2). Open Reading Frames (ORFs) were analyzed by Getorf from the EMBOSS package and 141,393 unigenes (99.42%) had ORFs with a start codon (Table 3).
Metabolic pathway analysis was conducted using the KEGG orthology (KO) categories system. Consequently, 14,727 (10.36%) unigenes were assigned to 128 pathways and divided into 5 classes. In the primary pathway hierarchy, "metabolism" and "genetic information processing" were dominant ( Fig. 4; Additional file 4c).

Investigation of differentially expressed genes and functional categorization
The expression levels of each unigene in the6 libraries were normalized and calculated using RPKM [47]; Additional file 5). Differentially expressed genes (DEGs) analysis in the comparisons of IME vs. ME at the 3 corresponding stages resulted in a total of 5194 DEGs in all three comparisons, with 3181, 2085 and 1468 DEGs at 3, 6 and 15 DC, respectively ( Fig. 5a; Additional file 6a-c). With regards to the comparisons between developmental stages 4731 DEGs were identified ( Fig. 5a; Additional file 7a-d), of which 2937 overlapped with the 5194 DEGs in the IME vs. ME comparisons (Additional file 7e).
In the IME vs. ME comparisons, 283 DEGs overlapped in all 3 comparisons, of which 266 were annotated in the 7 databases mentioned above ( Fig. 5b; Additional file 6d-e). The degree of DEG overlap between the comparisons during wheat embryogenic callus formation isindicated in Fig. 5b. Functional classification indicated that 2579, 1878, and 1271 DEGs could be annotated in at least one of the 7 databases at 3, 6, and 15 DC, respectively, while 1873, 1433, and 919 DEGs could be assigned to at least one GO term at 3, 6, and 15 DC, respectively (Table 5; Additional file 8a-c).
In the comparisons between stages in IMEs and MEs, 1302, 1450, 1525 and 1274 DEGs could be annotated in at least one of the 7 databases in the IME6 vs. IME3, IME15 vs. IME6, ME6 vs. ME3, and ME15 vs. ME6 comparisons, respectively. Those assigned to GO terms included 1916, 1035, 1063, and 897 across the 4 comparisons, respectively (Table 5; Additional file 8d-g). In the biological process category of GO assignment, metabolic processes, cellular process and single-organism process were the 3 top groups in each comparison (Fig. 6a). With respect to the molecular function category, the assignments were mainly attributed to catalysis and binding (Fig. 6b).
Of the KEGG pathway annotation of DEGs in the IME vs. ME comparisons, 473 unigenes were assigned to 97 KEGG pathways in IME3 vs. ME3, while 335 unigenes were assigned to 89 KEGG pathways in IME6 vs. ME6, and 202 unigenes were assigned to 82 KEGG pathways in the IME15 vs. ME15 comparison (Additional file 9a-c).
Pathway enrichment analysis revealed that the most representative unigenes included phenylpropanoid biosynthesis, plant-pathogen interaction, starch and sucrose metabolism, and plant hormone signal transduction in the IME3 vs. ME3 comparison (Additional file 9 a). In the IME6 vs. ME6 comparison, the 4 most represented unigenes included phenylpropanoid biosynthesis, phenylalanine metabolism, starch and sucrose metabolism, and plant hormone signal transduction (Additional file 9b). In the IME15 vs. ME15 comparison, phenylpropanoid biosynthesis and phenylalanine metabolism were the prevailing categories (Additional file 9c).

Differently expressed TFs
TFs possess important functions in embryogenic callus formation. All 1787 TFs were identified in this study (1748 from Triticum aestivum L., 23 from Arabidopsis thaliana L., and 16 from public databases; Additional file 10a). The DEGs were classified into TF families and the results indicated that there were 271 significantly differentially expressed TFs in IME vs. ME comparisons ( Fig. 7a; Additional file 10b). In the IME3 vs. ME3 comparison, 133 out of the 191 TFs showed up-regulation, while 58 showed down-regulation (Fig. 7e). The prevailing family was the Apetala2 (AP2) /Ethylene Responsive Factor (ERF) family with 36 (30 upregulated) unigenes, followed by the MYB family with 23 unigenes (16 up-regulated), the bHLH family with 18 unigenes (11 up-regulated), the WRKY family with 15 unigenes (14 up-regulated), and the NAC family with 10 unigenes (6 up-regulated; Fig. 7b).
Comparisons between stages in the IMEs and MEs libraries provided us with 194 significantly differentially expressed TFs in total, with 87, 79, 56, and 60 in IME6  Table 6). Five of the 10 TFs were selected for expression level validation by qRT-PCR and the results indicated that they were all highly associated with the RNA-Seq value (Fig. 11).

Somatic embryogenesis-related genes in wheat callus
With respect to embryogenic callus formation, except for TFs, there were 346 DEGs identified in at least one of IME vs. ME comparisons including stress related genes or homologues to the previously annotated genes that were potentially somatic embryogenesis related genes, such as calcium-dependent protein kinase (CDPK)/calmodulin (CAM) [50,51], germin-like protein (GLP) [52,53], glutathione s-transferase (GST) [54], late embryogenesis abundant (LEA) proteins [55,56], zinc finger [57,58], non-specific lipid-transfer protein (LTP) [59], heat-shock protein (HSP) [60], indole acetic acid induced protein (Aux/IAA), and other aux related genes [4] (Fig. 8a). Of the 346 stress and somatic embryogenesis related genes (SSEGs), 35 overlapped in the 3 comparisons of IME vs. ME ( Table 6). Five of the 35 SSEGs were selected for expression level validation by qRT-PCR and the results indicated that they were all highly correlated with the RNA-Seq value (Fig. 11). Auxin genes play an important role in plant cell dedifferentiation and redifferentiation [4,61]. In this study, 44 DEGs (38 from the 346 SSEGs and 6 from the 271 TFs) were related to auxin, i.e., auxin-responsive protein (10), auxin efflux carrier component (2), auxin-induced protein (9), auxin-repressed protein (2), auxin transporter-like protein (2), auxin response factor (ARF, 4), small auxin-up RNAs (SAURs) family protein (7), GH3 auxin-responsive promoter (6), indole-3-acetaldehyde oxidase (1), and other auxin-related proteins were identified in IME vs. ME comparisons (Additional file 11a). The expression of 346 SSEGs in the 6 libraries is shown in Fig. 8b.  Interestingly, 217 of them overlapped with the comparisons between stages in IME and/or ME (Additional file 11b). The KEGG pathways associated with plant hormone signal transduction in IME vs. ME comparisons, with regards to auxin related genes, included GH3 and SAURs which were down-regulated at 15d, as well as GH3 which was up-regulated at 3d and 6d. SAUR was mix regulated (up and down) at 3d and 6d, and AUX1 was down-regulated at 3d and up-regulated at 6d. In the comparisons between stages in IMEs, ARF was downregulated in IME6 vs. IME3 and in IME15 vs. IME6, while GH3 was down-regulated in the IME15 vs. IME6 comparison, and most of the auxin related genes are down-regulated during embryogenic callus formation from 3d of callus initiation to stage 6d of visible embryogenic callus, and to the somatic embryo generation at 15d in IME (Additional file 12). This implies that the majority of the auxin related genes are regulated in an intricate program in embryogenic callus formation.

Expression and GO analysis of the common DEGs in IME vs. ME comparisons
Of the 5194 DEGs in IMEs vs. MEs, 283 were simultaneously differentially expressed in all the three comparisons. Of the 283 overlapping DEGs, 35 overlapped in the above mentioned 346 SSEGs, while 10 overlapped in the above-mentioned 271 TFs. The expressions of the 283 DEGs were clustered into 6 groups in the IME libraries (Fig. 9a).
GO function analysis of the 283 overlapped DEGs indicated that 208 unigenes received at least one GO term and were classified into 3 major functional categories:   Fig. 9b). The DEGs related to processes of response to stimulus (52 unigenes), multi-organism process (11 unigenes), reproductive process (11 unigenes), and reproduction (5 unigenes) possessed a significantly higher percentage in the 283 overlapped DEGs than in all unigenes (Fig. 9b). The 52 unigenes related to response to stimulus process included 18 of the 35 SSEGs and 2 of the 10 TFs, suggesting that somatic embryogenic formation was intimately related to the response to stimulus process. Therefore, the remaining 32 unigenes may also be involved in somatic embryogenic formation. With regards to multi-organism processes, reproductive processes and reproduction in biological processes, 11, 11 and 5 unigenes were further merged into 16 unigenes due to the involvement of some genes in multiple functions. Three of the merged 16 unigenes belonged to 35 SSEGs and the 10 TFs, and the remaining 13 genes may also be involved in somatic embryogenic formation. Of the 13 unigenes, 5 (comp115979_c0, comp122548_c0, comp124996_c0, comp126585_c2, comp131969_c0) were simultaneously involved in all 3 processes. Furthermore, the 32 and 13 unigenes possibly involved in somatic embryogenic formation were merged into 39 unigenes by the bio-process terms of the GO analysis ( Table 6). Four of the 39 bioprocesses-involved unigenes (BPGs) were selected to perform expression level analysis by qRT-PCR (Fig. 10).
The results indicated that their expression levels were highly associated with the RNA-Seq value (Fig. 11). Of all 283 overlapping DEGs in all 3 comparisons of IME vs. ME, 12 containing 5 annotated genes (1 from the 35 SSEGs, 2 from the 10 TFs, and 2 from the 39 BPGs mentioned above) and 7 unknown genes (Table 7) showed significant differences in the different stages of IMEs, but did not show any significant difference in the different stages of MEs, indicating that the 7 unknown genes might be related to embryogenic callus formation. The qRT-PCR results validated that the 7 unknown genes and one (comp126788_c0) of the 5 annotated genes showed highly consistent expression levels with RNA-Seq (Fig. 10). Therefore, expression comparative analysis and function analysis showed that the above-mentioned 10 IFs, 35 SSEGs, 39 BPGs and 7 unknown DEGs might contribute to embryogenic callus formation.

Discussion
Bread wheat is one of the major human food crops and has been subject to the application of biotechnology for crop improvement. High plant regeneration efficiency forms the basis for biotechnology by genetic transformation. Numerous studies have aimed to improve regeneration efficiency in wheat by optimizing the culture conditions, the explant physiological traits, and the genotypes. Unfortunately, the regeneration efficiency remains poor. Since embryogenic calli can be harnessed to generate plants with improved tissue culture efficiency, illustration of the molecular mechanism of embryogenic callus formation is helpful for manipulation of the regeneration process.
The expression of the somatic embryogenesis-related genes and signaling regulation has been researched in plants by comparative analysis between the somatic     [4,42,62] or during somatic embryogenesis [43] using NGS. In the present study, we obtained the transcriptome of the embryogenic calli derived from IMEs and the non-embryogenic calli derived from MEs following culturing for 3d, 6d, and 15d. The de novo assembly generated 142,221 unigenes and the comparisons of IME vs. ME at 3 stages provided us with 5194 DEGs; more DEGs were identified in the comparisons at 3 DC and 6 DC than at 15 DC. Additionally, 4731 DEGs were identified in the comparisons between the stages in IMEs and MEs, suggesting that the regulatory pathway exists in embryogenic callus formation in bread wheat. Previous studies have identified genes that are potentially involved in somatic embryogenesis, including some genes that are related to stress response [4][5][6]43] and auxin processes [40]. Several studies have revealed that stress plays a key role in somatic embryogenesis [63]. In the 3 comparisons of IME vs. ME, 346 SSEGs of DEGs were identified including HSP, GLPs, LEA, CDPK/CAM, GST, and LTP, which are potentially involved in embryogenic callus formation. Additionally, 35 of them overlapped in all 3 comparisons of IME vs. ME.
LTP is related to embryogenic callus development through the regulation of the embryogenic callus formation pathway [59], and Ca 2+ has been suggested to play an intermediary role during plant embryogenesis [50,51]. GST was previously shown to be induced by various stresses involved in the SE through complex interactions with other transcriptional regulators and auxin metabolism [54,64]. Germins and GLPs are thought to play a significant role in stress and in somatic embryogenesis in wheat and Germins had oxalate oxidase activity [52,53], 6 differently expressed genes related to GLPs (4) and oxalate oxidase (2) were identified in IME vs. ME comparisons (Additional file 11 b-c), while LEA, ABA and ethylene are stress-related substances for the acquisition of embryogenic competence by plant cells [55,56]. Many genes thatrespond to various abiotic stresses have previously been found to be induced by auxin [4]. For example, Hsps in carrot somatic cells was found to be an auxin-responsive gene during somatic embryo development [60].
Auxin as a plant growth hormoneis considered to be essential for the initiation of cell division and differentiation [65] before it can express embryogenic competence [66], as well as in the initiation of embryogenesis from somatic tissues [67]. Auxin accumulation was previously detected in developing somatic embryos in Arabidopsis and carrot [68,69], and the endogenous auxin concentration reached a peak at the embryogenic callus stage during somatic embryogenesis [40]. In sunflower, an endogenous auxin pulse was believed to be involved in the induction of somatic embryogenesis [70]. Exogenous Fig. 8 Analyses of differentially expressed somatic embryogenesis related genes during wheat callus development. a Expression of 346 differentially expressed somatic embryogenesis related genes. b The distribution of 346 somatic embryogenesis related genes in comparisons of IME vs. ME at the 3 stages. Numbers represent the percentage of genes out of the differentially expressed genes in each comparison auxin application and endogenous auxin content were both determining factors during the induction phase. The level of free and conjugated IAA was previously found to be highly regulated with embryonic callus formation in wheat [26]. In this study, KEGG analysis of DEGs in IME vs. ME comparison demonstrated that many genes were involved in plant hormone signal transduction with down-regulation during the process of embryogenic callus formation (Fig. 8a). Research of SE identification and characterization indicated that stressrelated genes and proteins were associated with SE in stress-induced acquisition [71]. Thus the 346 DEGs of SSEGs, especially the 35 overlapping DEGs, might be involved in embryogenic callus formation.
TFs are important factors involved in plant development as well as in regeneration process. A series of studies on SE development revealed that complex transcription regulation networks existed in cell differentiation, maintaining embryogenic competency, embryogenic patterning, meristem maintenance, as well as roles in stress and hormonemediated signaling [40,44]. In this study, we identified 271 differentially expressed TFs in the comparisons of IME vs. Fig. 9 Analyses of 283 common differentially expressed genes in IME vs. ME at three stages during embryogenic callus formation in wheat. a Cluster analysis of 283 common DEGs in IME vs. ME at three stages based on K-means method. 283 common DEGs were divided into 6 distinct temporal expression profiles in IME. b GO Functional categorization of 283 common differentially expressed genes based on gene ontology ME, 10 of which were simultaneously differentially expressed in all 3 comparisons. Of the 271 TFs, bHLH, AP2/ERF, b-ZIP, and WRKY were highly represented.
Zinc finger family proteins have been proven to be involved in stress, the regulation of plant cell death and callus differentiation [57,58]. BohLOL1, encoding an Fig. 10 Relative mRNA levels of 32 DEGs in wheat callus were determined by quantitative RT-PCR analyses. Genes were normalized to the actin gene (GenBank: AB181991). Experiments were repeated in triplicate. Error bars represent one standard deviation (SD). Relative expression level of genes in IME were presented as fold-change (mean ± SD) compared according to ME of 1.0 LSD1-like zinc finger protein in bamboo and participating in growth regulation and response to biotic stress, was up-regulated by auxins, cytokinins, and hydrogen peroxide in in vitro culture [72]. PEI1, encoding a protein containing a Cys3-His zinc finger domain, is an embryo-specific transcription factor that plays an important role during Arabidopsis embryogenesis [73]. The over expression of OsLSD1, a rice zinc finger protein, accelerated callus differentiation in transformed rice tissues [58]. In the present study, the up-regulated unigenes that encode the zinc finger family protein in the IME vs. ME comparisons were 15/18, 8/18, and 0/3 at 3, 6, and 15 DC, respectively (Additional file 8). The ratio of the up-regulated DEGs was higher at the phase of embryogenic callus initiation than at the development and completed phase, indicating that they have a regulatory role in wheat embryogenic callus formation.
A member of the AP2/ERF domain family BABY BOOM (BBM) was identified as a marker of SE in cell cultures of Brassica napus (B. napus L.).The ectopic expression of BBM can lead to somatic embryogenesis in Arabidopsis and B. napus [21,74], while over-expression can induce embryo formation and enhance regeneration ability in tobacco [75], sweet pepper [76], and in cacao [77]. The A. thaliana EMBRYOMAKER (EMK), another member of the AP2/ERF domain family, plays a redundant role in the maintenance of embryonic cell identity. Ectopic expression of AtEMK can promote the initiation of somatic embryos from cotyledons [78,79] revealed that AP2/ERF expression was related to the dramatic regulation of somatic embryogenesis in Hevea. In the present study, there were 20 AP2/ERF genes differentially expressed in the comparisons of IME vs. ME at 3 stages (Additional file 8), with 2 genes (comp126788_c0, comp125594_c0) being differentially expressed in IME vs. ME and in the comparisons between stages in IME, but not in comparison between stages in ME. Thus these AP2/ERF genes may be related to embryogenic callus formation in wheat.
Members of bHLH are involved in growth, developmental, and abiotic stress responses [80], axillary meristem formation [81]. They also participate in epidermal cell type specification in Arabidopsis [82], and are involved in brassinosteroid and abscisic acid signaling in rice and Arabidopsis [83]. BIM1, a bHLH protein controlling Arabidopsis embryonic patterning and interacting with the auxin and BR signaling pathways [84], which interacts with other proteins and TFs such as the key regulator MYB families that involved in stress response and regulated embryogenesis pathways [85]. In this study, the ratio of up-regulated unigenes of bHLH in IME vs. ME comparisons was 12/19, 3/18, and 2/11 at 3, 6, and 15d, respectively, suggesting that a complex regulation exists in embryogenic callus formation in wheat. Additionally, the majority of MYB genes was up-regulated in embryogenic initiation and showed a relatively low expression level during somatic embryo development, indicating that the members of these 2 families play important roles in embryogenic callus formation.
The WRKY TFs have been shown to be involved in the response to biotic and/or abiotic stresses [86], and were found to be up-regulated during SE in plants. The expression of the member genes in this family was shown to be associated with embryogenic callus formation [87]. In this study, the DEGs at 3 DC possessed the highest numbers of WRKY than other stages, with the ratio of up-regulated unigenes in IME vs. ME comparisons being 14/15, 2/6, and 5/5 at 3, 6, and 15d, respectively. Fig. 11 Correlation analysis of log2 fold change values obtained from RNA-Seq and qRT-PCR for 32 DEGs in the comparisons of IME vs. ME. RNA-Seq fold change refers to the ratios of RPKM values of IME to ME for selected genes, while Q-PCR fold change is the relative quantity of IME normalized to expression level of ME B3 domain transcription factors in Arabidopisis (LEC2, FUS3 and ABI3) encode regulatory proteins involved in embryogenesis and the induction of somatic embryo development [16]. LEC1 (LEAFY COTYLEDON1), a CCAAT box-binding factor, along with LEC2 and FUS3, are essential for plant embryo development. For instance, LEC2 promotes embryogenic induction in the somatic tissues of Arabidopsis [88]. Furthermore, cacaoTcLEC2, a functional ortholog of AtLEC2, is involved in similar genetic regulatory networks during cacao somatic embryogenesis [89]. In this study, 4, 1, and 1 B3 TFs were differentially expressed in IME vs. ME at 3, 6 and 15d, with all of them being up-regulated. Three of them were differentially expressed in the comparison between stages in IME and/or ME.
The MADS-domain transcriptional regulator AGA-MOUS-LIKE15 (AGL15) had been reported to enhance somatic embryo development in Arabidopsis and soybean when ectopically expressed [90]. Soybean orthologs of the Arabidopsis MADS box genes, AGAMOUS-Like15 (GmAGL15) and GmAGL18, constitutively expressed increased embryogenic competence of explants. ABI3 and FUSCA3 were found to be directly up-regulated by GmAGL15 [22]. WUSCHEL controls meristem function by direct regulation of cytokinin-inducible response regulators, which was highly expressed during early somatic embryo development [91]. In the present study, MADS and wux genes were differentially expressed at 3d and 6d. Histone H3.3 participates in the epigenetic transmission of active chromatin states in animal and Histone H3 showed appreciable levels in the somatic embryos at all stages of somatic embryo development in Alfalfa [92,93], 15 Histone DEG was identified with 3 of H3 in IME vs. ME comparisons (Additional file 11c).
Somatic embryogenesis is a complex process. TF with auxin and stress treatment were essential for the acquisition of embryogenic competence by carrot somatic cells [5]. In this study, many stress factors including oxidative, heat and salt stress, were differentially expressed in the comparisons. Thus, the Aux, TF, and stress related genes regulated the embryogenic callus formation. Except for SSEGs and TFs, there were 39 genes that were present in all 3 comparisons of IME vs. ME and were involved in GO biological process of response to stimulus, in multiorganism process, reproductive process and reproduction, function as expansin, cellulose synthase, mobile element transfer protein, multicopper oxidase, ATPase family associated with various cellular activities (AAA), glutaredoxin, ethylene insensitive 3, senescence-associated protein, universal stress protein family and pathogenesis-related protein/dehydrase and lipid transport, and were perhaps related toembryogenic callus formation.
The 7 unknown DEGs presented in three of the IME vs. ME comparisons and in the comparisons between stages in IMEs but not in MEs, function as xyloglucan endotransglucosylase/hydrolase (XTH) protein 22, xyloglucan endo-transglycosylase (XET) C-terminus, root cap, cortical cell-delineating protein, U-box domain/E3 ubiquitinprotein ligase A. thaliana, glycerol-3-phosphate acyltransferase 2 A. thaliana, gag-polypeptide of LTR copia-type/ reverse transcriptase Integrase core domain (Table 6). XTH, a cell wall-modifying enzyme, can act as cell wallloosening enzymes [94], and may possess XET or endohydrolase activities [95], indicating their special roles in embryogenic callus formation.

Conclusion
In summary, the combination of tissue culture of wheat embryos and RNA-seq approaches by global gene expression patterns during embryogenic callus formation were performed to investigate the regulatory genes in embryogenic callus formation. Expression and function analysis indicated that the DEGs of 271 TFs (10 overlapped in all 3 comparisons of IME vs. ME), 346 SSEGs (35 overlapped in all 3 comparisons of IME vs. ME), 39 BPGs, and 7 unknown unigenes that overlapped in IME vs. ME and in comparisons between stages in IMEs, but not in any one of comparison in ME, might play an important role in embryogenic callus formation in wheat. The present comparative profiling provided new insights into the molecular mechanisms for the regulation of embryogenic callus formation processes. However, as molecular markers are required to follow specific events during embryogenic callus formation, further experiments are required to evaluate the interaction between stress and auxin signaling during embryogenic callus development.