Genome-wide identification and expression profiling of SET DOMAIN GROUP family in Dendrobium catenatum

Background Dendrobium catenatum, a precious Chinese herbal medicine, is an epiphytic orchid that grows on tree trunks and cliffs and often faces diverse environmental stresses. SET DOMAIN GROUP (SDG) proteins act as histone lysine methyltransferases, which are involved in pleiotropic developmental events and stress responses through modifying chromatin structure and regulating gene transcription, but their roles in D. catenatum are unknown.

Results In this study, we identified 44 SDG proteins from the D. catenatum genome. Subsequently, comprehensive analyses of gene structure, protein domain organization, and phylogenetic relationship were performed to evaluate these D. catenatum SDG (DcSDG) proteins, along with the well-investigated homologs from the model plants Arabidopsis thaliana and Oryza sativa as well as the 42 newly characterized SDG proteins from a closely related orchid, Phalaenopsis equestris. We showed that DcSDG proteins can be grouped into eight distinct classes (I~VII and M), mostly consistent with the previous description. Based on the catalytic substrates of the reported SDG members, mainly in Arabidopsis, Class I (E(z)-like) is predicted to account for the deposition of H3K27me2/3, Class II (Ash-like) for H3K36me, Class III (Trx/ATX-like) for H3K4me2/3, Class M (ATXR3/7) for H3K4me, Class IV (Su(var)-like) for H3K27me1, Class V (Suv-like) for H3K9me, and Class VI (S-ET) and Class VII (RBCMT) for methylation of both histone and non-histone proteins. RNA-seq derived expression profiling showed that DcSDG genes generally displayed broad but distinct expression patterns in different tissues and organs. Finally, examination of environmental stresses showed that the expression of DcASHR3, DcSUVR3, DcATXR4, DcATXR5b, and DcSDG49 is closely associated with drought-recovery treatment, the expression of DcSUVH5a, DcATXR5a, and DcSUVR14a is significantly influenced by low temperature, and 61% of DcSDG genes respond to heat shock.

Conclusions This study systematically identifies and classifies SDG genes in the orchid D. catenatum, indicates their functional divergence during evolution, and reveals their broad roles in developmental programs and stress responses. These results provide constructive clues for further functional investigation and epigenetic mechanism dissection of SET-containing proteins in orchids.

decreased tolerance to dehydration stress in atx1 plants [35]. ATX1 modulates dehydration stress signaling in both abscisic acid (ABA)-dependent and -independent pathways. In the ABA-dependent pathway, dehydration stress induces ATX1 binding to the NCED3 locus, which encodes the rate-limiting enzyme in ABA biosynthesis. Subsequently, the deposition of the H3K4me3 mark and the recruitment of RNA polymerase II are increased, leading to enhanced NCED3 expression and ABA production [36]. Dehydration stress causes dynamic and specific changes in global histone H3K4me1/2/3 patterns in Arabidopsis, especially H3K4me3 marks with broad distribution profiles on the nucleosomes of stress-induced genes [37]. Similarly, drought stress triggers massive changes in H3K4me3 enrichment at numerous loci in rice seedlings (3927 genes with increased and 910 genes with decreased deposition), which correlate positively with their transcript changes in response to drought stress [38]. However, when Arabidopsis is exposed to cold temperatures, H3K27me3 deposition gradually decreases in the chromatin of two cold-responsive genes, COR15A and ATGOLS3 [39].
Dendrobium catenatum (also known as Dendrobium officinale), which belongs to the Orchidaceae family, is a rare and precious Chinese medicinal herb. The stem of D. catenatum is the major medicinal part used for relieving upset stomach, promoting body fluid production, and nourishing Yin and antipyresis in traditional remedy and health care [40]. Furthermore, the plant stem contains bioactive extracts with anticancer, hepatoprotective, hypolipidemic, antifatigue, antioxidant, anticonstipation, hypoglycemic, gastric ulcer-protective, antihypertensive, and immunoenhancing effects, as confirmed by modern pharmacology [41]. However, given the long-standing extensive demand and over-exploitation, D. catenatum came close to extinction and was once defined as an endangered medicinal plant in China [42]. In the past 20 years, D. catenatum has been successfully cultivated and has become an important economic crop for health care. Unfortunately, environmental stresses, such as drought, cold, and high temperature, severely restrict its growth, resulting in heavy yield loss [43]. Hence, it is necessary to screen and identify candidate genes conferring resistance to different environmental stresses for D. catenatum molecular breeding.
To obtain a detailed understanding of the SDG family in medicinal orchid plants, we identified SDG members throughout the genome of D. catenatum, and subsequently performed comprehensive assessments of the phylogenetic relationship, gene structure, domain organization, gene expression profiling, and response to environmental stresses. Our results provide insights into the evolution and function of SDG genes in medicinal orchid plants.

Identification of SDG Proteins in the D. catenatum Genome
To obtain all the members of SDG proteins in D. catenatum, we performed a BLASTP search using known Arabidopsis and rice SDG proteins as queries against the D. catenatum genome (INSDC: JSDN00000000.2). First, we checked the SDG genes of Arabidopsis and rice in the Superfamily 1.75 database (http://supfam.org/SUPERFAMILY/). We discovered 49 genes in Arabidopsis thaliana, corresponding to those reported in the literature (Additional file 1) [10,17]. On the other hand, 46 genes were identified in Oryza sativa (Additional file 1), three (Os01g65730/OsSET44, Os01g74500/OsSET45, Os06g03676/OsSET46) more than the 43 reported genes [21]. Reciprocal BLAST was carried out to confirm that the hits from D. catenatum and its close relative Phalaenopsis equestris belong to the SDG family. Finally, we obtained 44 SDG genes in D. catenatum (Table 1 and Fig. 1) and 42 in P. equestris (Additional file 2), and they were named after their Arabidopsis homologs.
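The reciprocal confirmation step can be illustrated with a minimal sketch. It assumes that BLASTP results have already been parsed into dictionaries of (subject ID, bit score) pairs; the function names, the identifiers, and the "best reverse hit must be a known SDG" rule are illustrative assumptions, not the exact criteria used in this study.

```python
from typing import Dict, List, Optional, Set, Tuple

Hit = Tuple[str, float]  # (subject ID, bit score)


def best_hit(hits: List[Hit]) -> Optional[str]:
    """Return the subject ID with the highest bit score, or None if there are no hits."""
    if not hits:
        return None
    return max(hits, key=lambda hit: hit[1])[0]


def reciprocal_confirmed(
    forward: Dict[str, List[Hit]],
    reverse: Dict[str, List[Hit]],
    known_sdg: Set[str],
) -> Set[str]:
    """Keep candidates whose best hit in the reverse search is a known SDG protein.

    forward: known SDG query -> hits in the target proteome
    reverse: target candidate -> hits in the reference proteome
    """
    candidates = {subject for hits in forward.values() for subject, _ in hits}
    return {c for c in candidates if best_hit(reverse.get(c, [])) in known_sdg}


# Toy data with hypothetical identifiers (not taken from the study).
forward = {"At_SDG_query": [("Dc_candidate_1", 900.0), ("Dc_candidate_2", 60.0)]}
reverse = {
    "Dc_candidate_1": [("At_SDG_query", 880.0)],
    "Dc_candidate_2": [("At_unrelated", 200.0), ("At_SDG_query", 55.0)],
}
known = {"At_SDG_query"}

confirmed = reciprocal_confirmed(forward, reverse, known)
print(sorted(confirmed))
```

In this toy case the second candidate is discarded because its best reverse hit is not an SDG protein, which is the kind of false positive a reciprocal search is meant to remove.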
To characterize and classify the SDG family in D. catenatum, we used SDG proteins from the dicot model plant A. thaliana, the monocot model plant O. sativa, and the close relative P. equestris as references for phylogenetic analysis. The results showed that the 181 SDG proteins from the above four species could be clustered into eight classes (I~VII and M), mostly corresponding to the classification criteria in Arabidopsis [10]. However, the ARABIDOPSIS TRITHORAX RELATED 3 (ATXR3) branch, previously classified into Class III [10], was separated from the other branches and positioned near the ATXR7-like proteins of Class IV in this study (Fig. 1). Considering their similar substrate specificities, we combined the ATXR3 branch and the neighboring ATXR7-like proteins and categorized them under Class M, as supported by the further phylogenetic analysis among Classes III, IV, and M (Fig. 4). Given the very limited reports on Classes VI and VII, which feature potential functions in non-histone and histone methylation, we mainly focused on the roles of Classes I~V and M, whose histone methylation specificity is well investigated, in this study.
Class I: E(z)-Like (H3K27me2/3)
Class I contains two E(z) homologs in each of the monocot plants D. catenatum, P. equestris, and rice, and three well-characterized homologs, namely, CURLY LEAF (CLF), SWINGER (SWN), and MEDEA (MEA), that represent three distinct clades in the dicot plant Arabidopsis (Fig. 2). The genes in this class contain 15~16 introns, which are considerably longer in the two orchid species than in Arabidopsis and rice. This result suggests that overall intron length is positively correlated with the corresponding genome size. A similar phenomenon related to intron/exon proportion was also observed in the members of the other classes, as mentioned later. The three Arabidopsis E(z) proteins act as the catalytic subunits of the evolutionarily conserved Polycomb Repressive Complex 2 (PRC2), which is involved in the deposition of the H3K27me3 repressive mark on target gene loci [44]. CLF (the dominant H3K27me3 writer) and SWN act redundantly in vegetative and reproductive development, whereas MEA functions exclusively in the suppression of central cell proliferation and in endosperm development [45][46][47]. Rice E(z) homologs SDG711 and SDG718 participate in mediating accurate photoperiod control of flowering time [48]. Clades I-1 (CLF-like) and I-2 (SWN-like) each contain one ortholog in the examined species, but Clade I-3 (MEA-like) is confined to Arabidopsis. Plant E(z)-like proteins generally harbor a highly conserved domain organization at the C-terminal region, which includes a SANT domain, the cysteine-rich CXC domain, and the signature SET domain, except for DcCLF and PeSWN, which lack the SANT domain, and DcSWN, which possesses an additional SANT domain at the N-terminus.
Class II: Ash-Like (H3K36me)
Class II can be further divided into five clades, each of which consists of a single member per plant species, except for two members of Clade II-1 in Arabidopsis and D. catenatum (Fig. 3). Clades II-1 to II-4 exist in all the examined species, but Clade II-5 is only found in rice and contains a single member, SDG707, with unknown function. Class II proteins generally share three conserved domains: Associated with SET (AWS), SET, and PostSET [10,49].
The loss of SDG724 leads to late flowering [52]. Notably, DcASHH3a/3b in D. catenatum lack the AWS domain, unlike PeASHH3 from the close relative P. equestris and the ASHH3 orthologs in Arabidopsis and rice, and their functional divergence during speciation merits investigation.
Clade II-2 (ASHR3-like) members are characterized by an additional PHD domain near the N-terminus, except for rice SDG736. ASHR3/SDG4 participates in regulating pollen tube growth and stamen development, and its overexpression leads to growth arrest and male sterility [53,54]. ASHR3 harbors catalytic activity toward H3K36me1 and possibly H3K36me2, and is involved in regulating cell division competence in the root meristem [55].
Clade II-3 (ASHH1-like) members display uniform protein length and a highly conserved AWS-SET-PostSET domain combination at the N-terminus. Arabidopsis ASHH1/SDG26 knockout leads to a late-flowering phenotype through decreasing H3K4me3 and H3K36me3 levels at the SOC1 locus [56,57]. Similarly, the knockdown of the rice ortholog SDG708 causes a late-flowering phenotype and a genome-wide decrease in H3K36me1/2/3 levels during early growth stages [58]. DcASHH1 in D. catenatum is therefore predicted to have a similar function.
Clade II-4 (ASHH2-like) proteins are considerably longer than the others, and are characterized by an additional CW domain near the N-terminal triple domain combination. Arabidopsis ASHH2/SDG8 acts as the major H3K36me2/3 writer [57,59], and its knockout leads to pleiotropic phenotypes in vegetative and reproductive stages [60]. Consistently, the knockdown of the rice ortholog SDG725 causes wide-ranging defects, including dwarfism, erect leaves, and small seeds [32]. In terms of protein architecture, the ASHH2 ortholog in D. catenatum or P. equestris is more similar to Arabidopsis SDG8 than to rice SDG725.
Class III: Trx/ATX-Like (H3K4me2/3)
Class III consists of five members, which can be further divided into three clades in each examined plant species (Fig. 4). Class III proteins are characterized by tandem PHD domains in the middle region and a SET-PostSET domain combination at the N-terminus. Moreover, several clades contain additional distinct domains, such as the PWWP domain specific to Clade III-1/2, and the FYRN-FYRC domain combination specific to Clade III-1.
Clade III-1 (ATX1-like) contains two members in Arabidopsis, and one in each of the three other species. [63][64][65], suggesting that ATX1-like proteins demonstrate conserved biochemical and molecular functions during evolution. However, ATX1-like proteins produce specific phenotypes in distinct species due to the differences in developmental context. Thus, DcATX1 and PeATX1 in orchids may play important roles in flowering time control.
Clade III-2 (ATX3-like) includes three members in each tested species. Arabidopsis ATX3/4/5 are clustered together and separated from the monocot orthologs. The orthologs from D. catenatum and P. equestris are consistently clustered together, concordant with their close relationship. In Arabidopsis, ATX3/4/5 share a common evolutionary origin and function redundantly in establishing genome-wide H3K4me2/3 profiles. Furthermore, the atx3 atx4 atx5 triple mutant displays dwarfism and reduced fertility [66]. In rice, the ATX3-like proteins SDG721 and SDG705 function redundantly in modulating H3K4 methylation levels. The loss of both genes results in semi-dwarfism [67]. Considering the dwarf phenotype of ATX3-like mutants in Arabidopsis and rice, the homologs in D. catenatum and P. equestris might be involved in regulating plant architecture.
Clade III-3 is specific to the examined monocots and contains one copy per species. D. catenatum DcATX3d and P. equestris PeATX3d are characterized by an additional Jas domain at the C-terminus, in contrast with the rice ortholog OsSET37/SDG732. Further survey of this clade will provide insights into the evolution of the SDG family in monocots.
Class M: ATXR3/7 (H3K4me)
Class M comprises two clades, namely, Clade M-1 (ATXR7-like) and M-2 (ATXR3-like). Each clade contains one copy per plant species (Fig. 4). ATXR7-like proteins usually lack extra domains, except for PeATXR7 with a C-terminal GYF domain. Arabidopsis ATXR7/SDG25 acts as the writer of H3K4 monomethylation (H3K4me1), and its knockout results in early flowering [59,68]. ATXR3-like proteins, also present as one copy in each species, are characterized by the presence of a DUF4339 domain in the middle region, except for OsSET27/SDG701. Arabidopsis ATXR3/SDG2 is the major H3K4me3 writer, whose depletion leads to pleiotropic developmental defects [28,69,70]. D. catenatum DcATXR3 and P. equestris PeATXR3 feature a protein architecture more similar to Arabidopsis ATXR3/SDG2 than to rice SDG701. This finding suggests that ATXR3-like proteins in orchids may retain their ancestral role, whereas the rice ortholog may have functionally diverged, owing to the loss of a specific domain and partial sequence.
Class IV: Su(var)-like (H3K27me1)
Class IV can be divided into two clades, Clade IV-I (ATXR5-like) and IV-II (ATXR6-like), which are characterized by an N-terminal PHD domain in addition to the defined SET domain, except for PeATXR5 (Fig. 4). In Arabidopsis, ATXR5 and ATXR6 show largely overlapping functions, and the depletion of both results in global H3K27me1 reduction and heterochromatin decondensation [71,72]. ATXR5/6 are involved in maintaining DNA replication [73] and repressing the expression of transposable elements [74]. The overexpression of either ATXR5 or ATXR6 causes male sterility [75]. ATXR5 and ATXR6 probably also perform separate roles, because ATXR5 shows dual localization in plastids and the nucleus, whereas ATXR6 localizes solely to the nucleus [75].
Class V: Suv-Like (H3K9me)
Class V contains 15 members in Arabidopsis, 14 each in rice and D. catenatum, and 13 in P. equestris. These members can be further divided into two subclasses, SUVH and SUVR, which include Clades V-1 to V-3 and V-4 to V-6, respectively (Fig. 5). Class V proteins are usually characterized by PreSET-SET-PostSET or PreSET-SET domain combinations. SUVH proteins often contain an additional signature SET- and RING-ASSOCIATED (SRA) domain, whereas SUVR proteins in Clades V-4 and V-5 often include an additional WIYLD domain and tandem ZnF_C2H2 domains, respectively. SUVH genes usually lack introns, except for the members of the SUVH4 branch and two members (PeSUVH45 and SDG727) of the SUVH5 branch, whereas SUVR genes contain variable numbers of introns. In general, Class V members are responsible for methylation of histone H3 lysine 9 (H3K9me), in which H3K9 dimethylation (H3K9me2) is the critical mark for gene silencing and DNA methylation, and are involved in heterochromatin formation and reprogramming of gene expression [76].

SUVH subclass
In Clade V-1, the five members SUVH1/3/7/8/10 in Arabidopsis cluster together and are distinct from the five homologs in rice and the two homologs in each of D. catenatum and P. equestris. This result indicates that duplication of these clade members occurred after the divergence between dicots and monocots. However, each of the two members in D. catenatum pairs with its ortholog in P. equestris, indicating that their gene duplication occurred before the split of Dendrobium and Phalaenopsis. Arabidopsis SUVH1/SDG32 performs a distinct anti-silencing function to promote the expression of DNA methylation-targeted genes. SUVH1 knockout has no effect on H3K9me2 levels but reduces H3K4me3 levels [77]. Furthermore, SUVH1 binds to highly methylated genomic loci targeted by RNA-directed DNA methylation (RdDM). However, the rice SUVH1-like protein SDG728 retains its classical function of mediating H3K9 methylation and participates in retrotransposon repression [78]. D. catenatum and P. equestris each include two SUVH1-like proteins, far fewer than the five members in Arabidopsis and rice. Thus, the function of SUVH1 homologs and the evolutionary mechanisms in orchids require further investigation.
Clade V-2 comprises two members for each examined species, and these members lack PostSET domains, in contrast with those in Clade V-1. In Arabidopsis, SUVH2 and SUVH9, as sister paralogs, show overlapping functions in RdDM and heterochromatic gene silencing [79,80]. SUVH2 overexpression leads to ectopic heterochromatization accompanied by significant developmental defects, such as extreme dwarfism [79,81]. SUVH2 and SUVH9 may lack histone methyltransferase activity [82,83]. However, the simultaneous loss of SUVH2 and SUVH9 leads to a marked decrease in H3K9me2 levels at RdDM loci [80,84]. SUVH2 and SUVH9 can bind to methylated DNA and facilitate the recruitment of Pol V to RdDM loci [82,84]. Considering the highly similar domain organization among SUVH2-like proteins in the examined species, their function is likely to be evolutionarily conserved.

SUVR subclass
Clade V-4 contains three members each in Arabidopsis and P. equestris, one in rice, and four in D. catenatum. In Arabidopsis, SUVR2 mediates transcriptional silencing in both RdDM-dependent and -independent manners [92]. SUVR4 participates in the epigenetic defense mechanism by introducing H3K9me3 marks to repress potentially harmful transposon activity [93]. SUVR4 specifically converts H3K9me1 into H3K9me3 at transposons and pseudogenes within the euchromatin [93,94], but SUVR1 and SUVR2 show no detectable histone methyltransferase activity in vitro [92,95]. In this study, D. catenatum DcSUVR4, P. equestris PeSUVR4, and rice SDG712 were grouped together with Arabidopsis SUVR4 but not with SUVR1/2, implying that they possess ubiquitin-binding and HKMTase activities, except for PeSUVR4, which has a markedly shorter sequence and lacks the WIYLD and PreSET domains.
Clade V-5 contains one member in each tested species, and is characterized by an additional tandem ZnF_C2H2 domain, except for SDG706. In Arabidopsis, SUVR5 lacks the SRA domain but recognizes specific DNA sequences through its zinc finger motifs and establishes the heterochromatic state through H3K9me2 deposition in a DNA methylation-independent manner [96]. The knockout of SUVR5 leads to delayed flowering, and no further enhanced phenotype occurs in the quintuple suvr1 suvr2 suvr3 suvr4 suvr5 mutants [96,97]. This finding suggests that SUVR5 is a dominant developmental regulator in the SUVR subclass.
Clade V-6 members exist in one copy in each species, and the encoded proteins are notably shorter than those of the other clades in this class. Arabidopsis SUVR3 contains an additional AWS domain close to the SET-PostSET domain combination, and DcSUVR3 contains an intact PreSET-SET-PostSET domain combination. However, SUVR3 orthologs in rice and P. equestris only contain a PreSET domain, suggesting that the genes in Clade V-6 may be under weaker selective pressure and have become increasingly divergent during evolution. The functions of the genes in this clade remain uncharacterized thus far.

Tissue and organ expression profiles of DcSDG genes
To investigate the potential roles of DcSDGs during growth and development in D. catenatum, we examined the expression profiles of DcSDGs by reanalyzing the RNA-seq data from different plant tissues and organs, including leaf, root, green root tip, white part of root, stem, flower bud, sepal, labellum (lip), pollinia, and gynostemium (column) [98]. The expression profiles of six duplicated DcSDG gene pairs were further compared (Additional file 4). In general, one copy of each pair showed higher expression levels than the other in all tissues, except for the DcSUVH5a/5b pair, suggesting that one paralog might perform a dominant function during plant growth and development. DcASHH3a/3b, DcATX3b/3c, DcATXR5a/5b, and DcSUVR14b/14c exhibited similar expression patterns, whereas the DcSUVH5a/5b pair displayed differential expression profiles in the examined tissues and organs. These results indicate that distinct duplicated gene pairs might undergo different evolutionary pressures and diverge at varying periods.
Expression levels of DcSDGs in response to environmental stresses
D. catenatum is an epiphytic orchid that grows on tree trunks and cliffs and often experiences diverse environmental stresses, such as drought, cold, and high temperature. To detect the responses of DcSDG genes to drought stress, the expression profiles of DcSDGs were assessed by analyzing the RNA-seq data from leaves under different drought treatments [99] (Fig. 7 and Additional file 5). In brief, the seedlings were irrigated on the 1st day, kept unwatered from the 2nd day to the 7th day, and rewatered on the 8th day. Leaves were sampled at both 06:30 and 18:30 on the 2nd (DR5 and DR8), 7th (DR6 and DR10), and 9th (DR7 and DR15) days, respectively, and at 18:30 on the 8th day (DR11). The results showed that one week of drought stress notably repressed the expression of DcCLF, DcASHR3, DcSUVR3, and DcSUVR14c, but clearly induced the expression of DcATXR5b, DcATXR4, and DcSDG49 at both dawn and dusk sampling times. Subsequently, rewatering restored the expression levels of DcASHR3, DcSUVR3, DcATXR5b, DcATXR4, and DcSDG49.
To explore the possible roles of SDG proteins in response to cold stress, we evaluated the expression levels of DcSDGs by analyzing the raw RNA-seq reads from the leaves of D. catenatum seedlings treated at 20 ℃ (control) and 0 ℃ for 20 h, respectively [43] (Fig. 8 and Additional file 6). The data revealed that 32% of DcSDG genes (14) showed a transcriptional change in response to cold stress. To further understand the roles of DcSDG proteins in response to high temperature (35 ℃) stress, the expression profiles of SDG genes in the leaves of D. catenatum seedlings were examined by quantitative reverse-transcription polymerase chain reaction (RT-qPCR) (Fig. 9). The results show diverse expression patterns of DcSDG genes during heat shock treatment. At 3 h after treatment (HAT), the number of upregulated genes (10) was slightly higher than that of downregulated genes (7). At 6 HAT, more DcSDG genes were induced (15 upregulated genes versus 10 downregulated genes). At 12 HAT, the number of upregulated genes (27) was evidently higher than that of downregulated genes (3). Of the genes examined upon exposure to heat shock, three Class II genes (DcASHH3a/3b and DcATX3a), five Class V genes (DcSUVH2a/2b, DcSUVR14b/14c, and DcSUVR3), two Class VI genes (DcATXR1 and DcASHR1), two Class VII genes (DcSDG45/48), and one Class M gene (DcATXR3) differed significantly from the corresponding control at one or more time points (P<0.05, Fig. 9).
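The proportions quoted above are simple tallies over the 44 DcSDG genes. The following minimal sketch shows how per-gene calls could be counted and converted to a percentage; the helper names and the toy gene labels are hypothetical, and only the totals 14 and 44 come from the text.

```python
from collections import Counter
from typing import Dict


def tally_calls(calls: Dict[str, str]) -> Counter:
    """Count how many genes were called 'up', 'down', or 'none'."""
    return Counter(calls.values())


def percent_responsive(n_responsive: int, n_total: int) -> float:
    """Share of responsive genes as a percentage of the family size."""
    return 100.0 * n_responsive / n_total


# Hypothetical calls for a handful of placeholder genes (not study data).
toy_calls = {"gene_a": "up", "gene_b": "up", "gene_c": "down", "gene_d": "none"}
counts = tally_calls(toy_calls)
print(counts["up"], counts["down"], counts["none"])

# 14 cold-responsive genes out of 44 DcSDG genes, as reported in the text.
print(round(percent_responsive(14, 44)))
```

The same tally applied at each heat shock time point gives the upregulated and downregulated counts reported for 3, 6, and 12 HAT.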

Discussion
Characterization and classification of SDG proteins in D. catenatum
D. catenatum shows extensive application value in the food service, pharmaceutical, cosmetics, health product, and ornamental horticulture industries in China. The recent successful genome sequencing of D.
To date, understanding of Classes VI and VII remains very limited. Class VI includes six clades (Additional file 7), whose members feature a long S-ET domain interrupted by a Zf-MYND domain and have not been functionally characterized in plants. Among the mammalian ASHR1 homologs, SET And MYND Domain Containing 3 (SMYD3) may methylate histones H3K4 and H4K5, whereas SMYD2 can dimethylate H3K36 and repress gene transcription [101][102][103]. SMYD proteins also possess non-histone substrates, such as p53 and estrogen receptor α for SMYD2 and VEGFR1 and MAP3K2 for SMYD3, in the nucleus and cytoplasm [104][105][106][107]. Class VII consists of nine clades (Additional file 8), whose members are  [113]. (3) Specific SDG proteins can interact with other types of epigenetic regulators. SUVH4/5/6 interacts with the histone deacetylase HDA6 to silence a subset of transposons through histone H3K9 methylation and H3 deacetylation [114]. Further in-depth molecular dissection of SDG members in D. catenatum, with its specific developmental and living modes, will enrich understanding of the action mechanisms of the SDG gene family.

Histone methylation established by SDG proteins is widely involved in responses to environmental stresses and pathogen challenges [27,33,[35][36][37][38]. Studies have reported that drought stress can cause global changes in histone H3K4 methylation patterns in Arabidopsis and rice [37,38]. The Class III member ATX1 is implicated in the drought stress response via both ABA-dependent and -independent pathways in Arabidopsis [35,36]. Here we observed that the dynamic expression changes of DcASHR3 (Class II), DcSUVR3 (V), DcATXR5b (IV), DcATXR4 (VI), and DcSDG49 (VII) were closely associated with drought-rewatering treatment, indicating that methylations of H3K36 and H3K9 are also involved in the drought response. H3K36me3 and H3K27me3 have been shown to play antagonistic roles in the cold-induced epigenetic switch at the Arabidopsis FLOWERING LOCUS C (FLC) locus [115]. In D. catenatum, we identified six significant cold-responsive DcSDG genes, including DcASHH1 (Class II), DcATX3b/3d (III), DcATXR5a/5b (IV), DcSUVH5a/5b and DcSUVR14a/3 (V) (Fig. 8). The results indicate that diverse histone methylation marks, deposited by specific DcSDG proteins, play roles during cold treatment. It will be intriguing to further investigate their direct targets by high-throughput ChIP-seq combined with high-quality commercial antibodies against various histone methylation marks. Recently, Huang et al. [24] thoroughly identified cotton SDG genes and noted that the expression of most of these genes decreased under high-temperature conditions. In this study, 61% of DcSDG genes responded to heat shock, but the number of upregulated genes was considerably higher than that of downregulated ones. This finding may be related to the particular epiphytic lifestyle and crassulacean acid metabolism pathway of D. catenatum.

Conclusion
In this study, we identified 44 SDG proteins in D. catenatum and 42 in its close relative P. equestris, and added three further SDG members (Os01g65730/OsSET44, Os01g74500/OsSET45, and Os06g03676/OsSET46) to the previously reported rice SDG gene family (43 members). Based on the phylogenetic relationship and substrate specificity, these genes were divided into eight classes by using well-characterized Arabidopsis SDG members as references. In addition, we analyzed the expression profiles of D. catenatum SDG genes in different tissues and organs and their responses to diverse environmental stresses. Our findings provide comprehensive information on the classification and expression profiles of D. catenatum SDG genes, and will lay the foundation for the functional characterization of the SDG gene family in orchids.

Methods
Identification of SDG gene family in D. catenatum and P. equestris
and SRR3210636) and 0°C cold treatment for 20 h (SRR3210613, SRR3210621 and SRR3210626) were obtained from NCBI, as provided by Wu et al. [43]. Reads of all the samples were aligned to the NCBI Dendrobium reference genome using the HISAT package [118]. The mapped reads of each sample were assembled using StringTie [119]. Then, the transcriptomes from all samples were merged to reconstruct a comprehensive transcriptome using Perl scripts. After the final transcriptome was generated, StringTie and edgeR were used to estimate the expression levels of all transcripts. StringTie was used to quantify mRNA expression levels as FPKM. DcSDG genes were selected, and differentially expressed genes were defined as those with log2 (fold change) >1 or <-1 and with statistical significance (p value < 0.05) using an R package. The heatmap was generated using TIGR MultiExperiment Viewer (MeV4.9) software [120].
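The quantification and filtering criteria described above can be sketched as follows. This is an illustration only: the study used StringTie and edgeR, whereas the helper functions below are hypothetical, and the pseudocount in the fold-change calculation is an assumption added to avoid division by zero. The FPKM formula is the standard definition (fragments per kilobase of transcript per million mapped fragments).

```python
import math


def fpkm(fragments: float, transcript_length_bp: int, total_mapped_fragments: int) -> float:
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


def log2_fold_change(treated: float, control: float, pseudocount: float = 1.0) -> float:
    """log2 ratio of treated to control expression; the pseudocount is an assumption."""
    return math.log2((treated + pseudocount) / (control + pseudocount))


def is_differential(log2_fc: float, p_value: float,
                    lfc_cutoff: float = 1.0, p_cutoff: float = 0.05) -> bool:
    """Apply the stated thresholds: log2(fold change) > 1 or < -1, and p value < 0.05."""
    return abs(log2_fc) > lfc_cutoff and p_value < p_cutoff


# Toy numbers (not study data): 500 fragments on a 2 kb transcript, 10 million mapped fragments.
print(fpkm(500, 2000, 10_000_000))

lfc = log2_fold_change(treated=7.0, control=1.0)
print(lfc, is_differential(lfc, p_value=0.01))
```

Note that the thresholds are strict inequalities, so a gene with a log2 fold change of exactly 1 would not be called differentially expressed under this reading of the criteria.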

Consent to publish
Not applicable.

Availability of data and materials
All data generated or analyzed during this study are included in this published article and its Additional files. The datasets generated and analyzed during the current study are available from the corresponding author on reasonable request.

Competing interests
The authors declare that they have no competing interests.

Expression of DcSDGs in response to high temperature stress through RT-qPCR assay. The actin gene of D. catenatum was used as an internal reference. The data are representative of three independent experiments. Error bars indicate SD, and asterisks mark genes whose expression differs significantly between the heat shock (35℃) and the control (20℃) according to Student's t test (*, P<0.05; **, P<0.01).