Genome-wide identi cation and characterization of SnRK family genes in Brassica napus

Background Sucrose non-fermenting 1 related protein kinases (SnRK) play crucial roles in responding to biotic and abiotic stresses through activating protein phosphorylation pathways. However, little information of SnRK genes was available in Brassica napus, one of important oil crops. Recently, the released sequences of the reference genome of B.napus provide a good chance to perform genome-wide identification and characterization of BnSnRK gene family in the rapeseed. Results Totally 114 SnRK genes distributed on 19 chromosomes were identified in the genome of B.napus and classified into three subfamilies on the basis of phylogenetic analysis and the domain types. According to gene structure and motif composition analysis, the BnSnRK sequences showed obvious divergence among three subfamilies. Gene duplication and synteny between the genomes of the rapeseed and Arabidopsis were also analyzed to provide insights into the evolutionary characteristics of BnSnRK family genes. Cis-element analysis revealed that BnSnRKs may response to diverse environmental stresses. Moreover, the expression patterns of BnSnRKs in various tissues and under diverse abiotic stresses were distinct difference. Besides, Single Nucleotide Polymorphisms (SNP) distribution analysis suggests the function disparity of BnSnRK family genes in different genotypes of the rapeseed. Conclusion We examined genomic structures, evolution features, expression patterns and SNP distribution of 114 BnSnRKs. The results provide valuable information for functional characterization of BnSnRK genes in future studies.


Introduction
Plants develop various molecular defense mechanisms to cope with abiotic stresses, including salinity, drought and cold stresses. Gene expression regulation and protein modi cation are two important ways for plants to deal with these stresses (Bohnert et al., 2006). The processes of phosphorylation and dephosphorylation mediated by protein kinase play crucial roles in protein modi cation (Hunter et al., 1995). Among the reported protein kinase genes, sucrose non-fermenting 1 (SNF1)-related protein kinases (SnRKs) are involved in different physiological processes (Hrabak et al., 2003).
In plants, SnRKs could be divided into three subfamilies: SnRK1, SnRK2 and SnRK3 based on their sequence similarity and gene structures (Hrabak et al., 2003;Halford and Hardie, 1998). In detail, the SnRK1 subfamily contain a highly conserved N-terminal protein kinase (Pkinase) domain, which are the homologous genes of SNF1 in yeasts and AMPKs in mammals (Celenza et al., 1986). The other subfamilies SnRK2/3 are unique in plants, both are more diverse in plants than the SnRK1 subfamily members. The member of SnRK2 harbors a conserved Pkinase domain and a C-terminal variable adjusting conserved domain (Kulik et al., 2011). In addition, SnRK3 known as CIPK (CBL-interacting protein kinases), also contain conserved N-terminal protein kinase domains and NAF domains, PPI domains in C-terminal ( The SnRK1 family genes involve in the response of plant cells to starvation and metabolic stress. SnRK1 kinases are the catalytic subunits of heterotrimeric complexes that interact with two other subunits (Hardie et al., 1998). In Solanum tuberosum, SnRK1 was proved to be involved in induction of sucrose synthase expression and played an important role in carbohydrate metabolism regulation (Purcell et al. 1998). Besides,ow energy stress (e.g. darkness and hypoxia) could trigger SnRK1α nuclear translocation and further induce SnRK1 target genes to refresh cellular energy for plant growth (Ramon et al., 2019). Furthermore, considerable evidences indicated that SnRK1 genes were hubs in a network of various signaling pathways including cell cycle regulation, pathogen responses and meristem development (Halford et al., 2009).
On the other hand, the SnRK2 genes play important roles in responding to abiotic stresses in plants, especially for osmotic and salt stress. For instance, SnRK2.10 phosphorylate several target genes including dehydrins ERD10 and ERD14 to deal with osmotic stress in Arabidopsis thaliana (Maszkowska et al., 2019). SnRK2.1 positivly regulate salt tolerance in Nicotiana tabacum (Diédhiou et al., 2008;Zhang et al., 2014). In A. thaliana, SnRK2 ubfamily genes could be classi ed into three groups: group 1 contains the ABA-independent kinases, group 2 includes genes responding to drought stress, and group 3 kinases are strongly stimulated by ABA (Kulik et al., 2011;Kawa et al, 2020). The current researches on ABAdependent group 3 kinases are the most extensive. For example, AtSnRK2.6 (OST1), one of the SnRK2 family gene, plays a core role in the ABA signaling pathway in stomatal guard cells, and OST1 protein stability can be regulated via E3-ubiquitin ligase HOS15 to reduce ABA signal sensitivity in Arabidopsis SnRK3 kinases known as CIPKs (CBL(calcium sensor calcineurin B-like proteins)-interacting protein kinases), perform vital functions in resistance to various stresses in plants (Hrabak et al., 2003;Kim et al., 2003). For example, SOS (salt overly sensitive) system was the rst discoverd CBL-CIPK pathway in A.thaliana. In detail, the calcium signal produced by salt stress was sensed by SOS3 (AtCBL4) on cell membrane, and then SOS3 combined with SOS2 (AtCIPK24) forming complex to phosphorylate SOS1 (Na + /H + antiporter) to remove excess Na + out of root cells (Guo et al., 2002;Liu et al., 2000). Besides, MdCIPK13 and MdCIPK22 enhanced salt and drought tolerance through targeting sucrose transporter MdSUT2.2 for phosphorylation in apple (Ma et al., 2019a;Ma et al., 2019b). Overexpression of BnCBL1-BnCIPK6 in B.napus could enhance high salinity tolerance and low potassium tolerance (Chen et al., 2012). In conclusion, increasing evidences emphasized the importance of SnRKs function in nutrition utilization and stress response, and nally researchers could improve plants resistance to stresses by genetic manipulation using these genes.
Brassica napus is an important oil crop in the world. Recently, the genomes of Darmor-bzh (winter ecotype), Zhongshuang 11 and NY7 (seni-winter ecotype) were successfully sequenced and assembled (Chalhoub et al., 2014;Sun et al., 2017;Zou et al., 2019); however, the systematic analysis of BnSnRK gene family has been not well reported. In this study, 114 SnRK gene members were identi ed in the B.napus genome. We systematically analyzed their phylogenetic relationships, protein motifs, gene structures, chromosome distributions and cis-elements in promoter regions. Moreover, the expression pro les of BnSnRKs in diverse tissues as well as under abiotic stresses were determined. In addition, SNPs of each BnSnRK gene were systematically identi ed in a worldwide collection with 300 core germplasm accessions. These results will provide useful information for further investigation of molecular mechanisms of BnSnRK genes for abiotic stress tolerance and molecular breeding in B.napus. The ClustalW with default parameters was used for multiple sequence alignment of 114 BnSnRK nonredundent amino acid sequences (Saitou and Nei, 1987;Kumar et al., 2016). Based on the alignments, a phylogenetic tree was constructed using MEGA 7.0 by the Neighbor-Joining (NJ) method (Kumar et al., 2016), with the following parameters: poisson model, pairwise deletion and 1000 bootstrap replications. Unrooted NJ tree of all SnRK protein sequences from A.thaliana and B.napus was also constructed using MEGA 7.0.

Identi cation of motif compositions and gene structures
To identify conserved motifs of BnSnRK proteins, the Multiple Expectation Maximization for Motif Elicitation (MEME) online program (Bailey et al., 2009) (http://meme.sdsc.edu/meme/itro.html) was performed with the following parameters: number of repetition = any, maximum number of motifs = 10; and optimum motif length = 6 to 100 residues. The exon-intron structures of BnSnRK family genes were analyzed using the Gene Structure Display Server online program (GSDS: http://gsds.cbi.pku.edu.ch) (Hu et al., 2015).

Chromosomal location and gene duplication
The chromosomal positions of all BnSnRK genes were mapped to 19 chromosomes of the rapeseed genome according to the physical location information from the NY7 genome database using MapChart version 3.0 and Circos (Voorrips, 2002;Krzywinski et al., 2009). To identify gene duplication, all B.napus gene sequences were aligned using BLASTP, with an e-value of 1e-10. MCScanX with default values was used to classify the duplication patterns of the SnRK into segmental, tandem duplications . The de nition of tandem duplication is a chromosomal region within 200 kb containing two or more genes (Holub et al., 2001). To exhibit synteny relationships of the orthologous SnRK genes between B.napus and A.thaliana, the syntenic analysis maps were constructed using python script written by ourselves. Non-synonymous (Ka) and synonymous (Ks) substitution of each duplicated BnSnRK gene were calculated using KaKs_Calculator 2.0 (Wang et al., 2010). Divergence time was estimated using the formula T = Ks/2R, where R is 1.5×10−8 synonymous substitutions per site per year (Wang et al., 2010).

Cis-elements in promoter regions of BnSnRKs
Upstream sequences (1500 bp) from the start codon of each BnSnRK gene was extracted from the genome sequence of NY7, and then used for cis-element distributions in promoter regions using PlantCARE software (http://bioinformatics.psb.ugent.be/webtools/plantcare/html/) (Lescot et al., 2002).

Expression patterns of BnSnRKs
Transcriptome data of 12 different B.napus tissues were obtained from a previous study by Sun et al.

SNP distribution analysis of BnSnRKs in B.napus accessions
The SNPs in the coding regions of BnSnRK genes were extracted from the total SNPs of 300 B.napus accessions, which were determined by the genome re-sequencing in our previous studies Xuan et al., 2020). High-quality SNPs with MAF greater than 5% and missing rate less than 50% were used for further analysis. All high-quality SNPs were mapped to the "Darmor-bzh" genome (B.napus v4.1 genome, http://www.genoscope.cns.fr/brassicanapus/data/).

Results
Identi cation and phylogenetic analysis of SnRK genes in B.napus.
A total of 114 proteins with Ser/Thr kinase domain were identi ed as the members of SnRK family in the B.napus genome (Table S2). The longest amino acid sequence of each protein was selected for further analyses. The information of gene names, ID, chromosomal locations, amino acid numbers, molecular weights (MW), isoelectric points (pI) and domains were listed in Table S1 and S2. The amino acid length of 114 BnSnRKs protein is ranged from 190 to 1241, correspondingly the melocular weight ranges from 22.0 to 140.6 kDa. The coding regions and sequences of each gene were listed in Table S3.
To analyze evolutionary relationships of SnRK genes in B.napus and A.thaliana, an unrooted phylogenetic tree was constructed using the full-length amino acid sequences of all SnRKs. Totally, 39 SnRKs from A.thaliana and 114 SnRKs from B.napus were identi ed and used in the study (Figure 1). It was reported that 39 AtSnRKs could be clustered into three groups (Hrabak et al., 2003). In this study, based on the phylogenetic analysis, 114 BnSnRKs were also class ed into three groups. In details, 10 proteins are in BnSnRK1 subfamily with Pkinase (PF00069 of Pfam), UBA (PF00627) and KA1 (PF02149) domains, excepting BnSnRK1.3, while 31 proteins contain pkinase domain in BnSnRK2 subfamily with high similarity to AtSnRK2 subfamily, and 73 proteins with Pkinase and NAF (PF03822) domains are in SnRK3 subfamily, respectively ( Figure 1).

Gene structure and protein motif analysis of BnSnRK genes
The web server GSDS (Gene Structure Display Server) analysis was done to determine gene structures of BnSnRKs. As shown in Figure 2, one to fteen exons were found in these BnSnRK genes. The BnSnRK1 subfamily genes have more than 6 exons, and 6 to 13 exons for the BnSnRK2 subfamily genes. While it is various for the BnSnRK3 subfamily with ranged from 1 to 15 exons. Furthermore, there are two subgroups in the BnSnRK3 family. The genes in the SnRK3 subgroup 1 contained more than 8 exons, while the genes in the subgroup 2 contain less than 4 exons.
Using BLAST and MCScanX methods, 106 segmental duplication events were identi ed (Figure 4 and Table S5). Among these events, 104 events were happened between diverse chromosomes, while only 2 duplication events were detected within same chromosome. The result suggested that part of BnSnRK genes were possibly generated by gene duplication, and the segmental duplication events played vital roles in the expansion of SnRK genes in the B.napus genome. We also analyzed the occurrence of the tandem duplication events. Here, 25  .72) were found located less than 200 kb. However, the identities of these gene pairs were less than 70%, indicating they were not tandem duplication events.
Futhermore, the synteny of SnRK gene pairs between B.napus genome and A.thaliana genome was performed. The result showed that 65 SnRK genes of B.napus exhibiting syntenic relationship with AtSnRK genes ( Figure 5 and Table S6), suggesting that these genes might have contributed to the evolution of BnSnRK gene family. To test the evolutionary constraints acting, Ks values, Ka values, Ka/Ks ratios and divergence time of paralogous and orthologous on SnRK family genes were calculated (Table  S5, Table S6). The majority of segmental duplicated BnSnRK gene pairs had Ka/Ks ratios less than 1, and the mean value of BnSnRK3 gene pairs (Ka/Ks = 0.30) was lower than BnSnRK1 (Ka/Ks = 0.41) and BnSnRK2 (Ka/Ks = 0.75). In addition, the Ka/Ks ratios of most orthologous SnRK gene pairs were less than 1, and the mean value of SnRK2 gene pairs (Ka/Ks = 0.09) was much lower than that of SnRK1 (Ka/Ks = 0.75) and SnRK3 (Ka/Ks = 0.23). These results suggested that the BnSnRK gene family may have suffered robust purifying selective pressure during evolution.

Stress-related cis-elements in the promoters of BnSnRK genes
To understand the potential function and regulatory mechanisms of BnSnRK genes, cis-elements (1.5-kb upstream from ATG) were analyzed by using PlantCARE. Totally 104 of 114 BnSnRKs were identi ed with cis-elements including DRE, ABRE and LTRE, involving in dehydration responses, ABA responses and lowtemperature responses ( Figure 6, Table S2). In detail, 88 BnSnRK genes (77.19%) had ABRE cis-elements, 33 BnSnRK genes (28.95%) carried DRE cis-elements, and 44 BnSnRK genes (38.60%) owned LTRE cis-elementsIt was also found that most genes had more than one kind of cis-element types. FBnSnRK1 (1.80) family was higher than BnSnRK2 (1.17) and BnSnRK3 (1.21) families (Table S2). In conclusion, the cis-elements analysis suggested that most BnSnRK genes could be responsed to diverse environmental stresses, and different subfamily genes may be regulated diversely.

Expression pro les of BnSnRKs in different tissues
The expression pro les of 114 BnSnRK genes were compared among 12 tissues of the rapeseed cultivar ZS11 (Figure 7 and Table S7). The expression pattern of BnSnRKs was divided into three groups. A total of 67 genes with high expression level in all examined tissues (log2-based values > 6) were classi ed into the group 1. For example, BnSnRK3.40 was highly expressed in all vegetative organs with average log2based values of 10.85. 21 BnSnRK genes belonging to group2 exhibited relatively low expression levels across the detected tissues (log2-based values > 2). The group3 included 26 BnSnRK genes, with lowest expression levels in all tissues (log2-based values < 2). Furthermore, 6 genes belonging to the group3 (BnSnRK2. 26 , Table S8). Multiple BnSnRK genes were signi cantly induced by diverse treatments. For example BnSnRK2.1 was extremely induced by all treatments. BnSnRK3.39 incerased expression levels responding to dehydration and ABA treatments. In contrast, several BnSnRKs did not respond to any abiotic stresses. for instance, BnSnRK2.12 and BnSnRK2.13 showed almost no expression changes during all treatments. Interestingly, many genes showed opposing expression pro les under diverse treatments. For instance, BnSnRK3.54 was highly up-regulated by dehydration, whereas was repressed by ABA and cold treatments.

SNPs analysis of BnSnRKs in a core germplasm of B.napus
The polymorphism of BnSnRK genes was determined using our previous resequencing data of a core germplasm resource containing 300 accessions Xuan et al., 2020). The SNPs with MAF more than 5% in BnSnRK genes were selected out (Table S9). It was shown that each BnSnRK genes contained an average of 11.76 SNPs. In detail, each BnSnRK1 subfamily genes included 14.00 SNPs, each BnSnRK2 and BnSnRK3 subfamily genes contained an average of 9.67 and 12.53 SNPs. Furthermore, the SNP density of each BnSnRK genes within the same subfamily was also different. For instance, there was no SNP identi ed in BnSnRK1.1, whereas there were 62 SNPs in BnSnRK1.7.
Moreover, a detailed SNP distribution analysis of BnSnRK2.10 and BnSnRK3.39 was conducted because the two genes signi cantly changed their expression level under drought stress (Figure 8). It was found that there were 9 SNP loci in the 1500 bp promoter region , 19 SNPs in the exon/intron region and 1 SNP in the 3'UTR region of BnSnRK2.10; while we identi ed 6 SNP loci in promoter region, 8 SNPs in exon/intron region and 1 SNP in the 3'UTR region of BnSnRK3.39. Combined with phenotypic data of the core germplasm resource under drought stress (data not shown), it was found that SNP16, SNP26, SNP27 in BnSnRK2.10, and SNP4, SNP7, SNP15 in BnSnRK3.39 were signi cantly associated with their drought tolerance.

Discussion
In this study, 114 members of BnSnRK genes were identi ed in the B.napus genome, which were designated as BnSnRK1.1 to BnSnRK3.73 on the basis of their subfamily classi cation. Systematically investigation on the BnSnRK gene family were carried out, inculding phylogenetic relations, gene structures, protein motifs, chromosome distributions, gene duplication and cis-elements in the promoters. Furthermore, the deep analysis was done on the expression pattern and SNP determination of BnSnRK family genes using published data. This study provides a basic information for further functional characterization of SnRK genes to enhance plant adaptive capacity under abiotic stress. Various SnRK subfamily genes contain different conserved domains, but all genes retained a N-terminal protein kinase domain. For example, SnRK3 subfamily genes were reported interact with CBLs in a calcium-dependent manner because of the NAF domain permits. In addition, the NAF domain de nes a set of heterologous kinases involved in diverse signaling processes, as targets of CBL calcium sensor proteins (Albrecht et al., 2001). In this study, it was also found that different BnSnRKs subfamily genes shared the different type of conserved domains. It may suggest there is functional diversi cation of the BnSnRK gene family according to their domains.
In AtSnRK and BnSnRK gene families, different subfamily genes exhibited signi cant gene length and exon-intron structural divergences. In previous studies, genes with less introns were considered to have higher expression levels in plants (Chung et al., 2006;Jeffares et al., 2008). A compact gene structure with less introns allowed rapid gene activation and timely response to diverse environmental stresses (Jeffares et al., 2008). However, combined with transcriptome data used in this study, we did not detect that BnSnRK genes with less introns showed higher expression levels than the other BnSnRKs.
Accumulating evidence suggested that gene activities were generally correlated with disparities in promoter regions (Stephen et al., 2008). The cis-elements located in the promoter regions of genes played key roles in regulating gene expression during growth and environmental changes (Ali et al., 2006;Sarda et al., 1997). The promoter analysis revealed that BnSnRKs contained various types of cis-elements, such as DRE, ABRE and LTRE. Most of gene promoters contained at least one of these components, indicating that many of BnSnRKs were able to respond to diverse abiotic stresses. For instance, by combining gene expression pro les under 4h-cold stress, the expression level of BnSnRKs with LTRE and ABRE ciselements increased by an average of 2.71-fold, while BnSnRKs with no cis-elements only showed 1.47-fold increase. Therefore, the cis-elements analysis provide a clue for gene function study, especially for gene expression pattern under different stress.
Gene expression pro les provided imperative clues to map out gene functionality. Firstly, we used prepublished transcriptome data to investigate BnSnRK genes expression levels in diverse tissues and organs of B.napus (Sun et al., 2017). The analysis results revealed that the expression patterns of these genes were divided into three groups (Figure7). Meanwhile, We found BnSnRKs in group2 contained less cis-elements in promoter than group1 and group3. Speci cally, each gene in group1 contains an average of 2.84 ABRE, 0.40 DRE, 0.93 LTRE, and each gene in group3 contains 2.81 ABRE, 0.38 DRE, 0.30 LTRE, whereas each of group2 gene only has 2.00 ABRE, 0.19 DRE, 0.57 LTRE. These evidences suggested that BnSnRKs activities were correlated with disparities in promoter regions.
The roles function of some BnSnRKs reponding to diverse abiotic stresses were also derived. According to the drought stress data (table S8), the responsive gene BnSnRK2.10 was orthologous to AtSnRK2.3, which regulated ABA synthesis and signaling responding to drought in A.thaliana (Tan et al., 2018), indicating identical function of BnSnRK2.10 in responding to drought stress. BnSnRK2.24, exhibited extremely expression changes in drought, salt stresses and ABA induce, while its orthologs AtSnRK2.2, could also respond to osmotic stress and ABA induction in A.thaliana, indicating that BnSnRK2. 24  This study provides a comprehensive knowledge of SnRK gene family in B.napus. It gives an important implication for further understanding the biological functions of individual BnSnRK genes in B.napus. However, the study only provided preliminary characterization of BnSnRK genes and large functional validation work need to be done in further work to understanding the roles of BnSnRK family.

Conclusion
SnRK genes play important roles in signaling pathways including responses to biotic and abiotic stresses in plants. In this study, a comprehensive study of SnRK gene family in B.napus was performed. A total of 114 BnSnRK genes were characterized and divided into three subfamilies, which showed high similarity in gene structure and motif composition within the same subfamily. Phylogenetic comparison and synteny analysis of SnRK genes between A.thaliana and B.napus provide valuable clues for the evolutionary characteristics of the BnSnRK genes. Moreover, the cis-acting elements, gene expression and SNPs distrubution of BnSnRK family were also determined. These results provide an important information for further understanding biological functions of BnSnRK genes in B.napus.

Declarations
Acknowledgments This work was funded by the National Natural Science Foundation of China (31961143008, 31701411), the Science and Technology Program of Zhejiang Province of China (LGN20C130007), Jiangsu Collaborative Innovation Center for Modern Crop Production, and the 111 project for introduction of foreign experts (B17039). We thank Dr. Guoping Zhang for his insightful advising and contribution in manuscript revision.

Availability of data and materials
All data analyzed during this study are included in this article and its Additional les.   Phylogenetic relationships, architecture of conserved protein motifs and gene structure of the SnRK genes from B.napus. A Phylogenetic tree of 104 BnSnRK proteins. The unrooted neighbor-joining phylogenetic tree was constructed with MEGA7 using full-length amino acid sequences of 104 BnSnRK proteins, and the bootstrap test replicate was set as 1000 times. B Exon/intron organization of BnSnRK genes. Yellow boxes represent exons and black lines with same length represent introns. The upstream/downstream region of BnSnRK genes are indicated in green boxes. The length of exons can be inferred by the scale at the bottom. C The motif composition of BnSnRK proteins. The motifs, numbers 1-10, are displayed in different colored boxes. The sequence information for each motif is provided in Table   S4. The length of protein can be estimated using the scale at the bottom.
Chromosomal distribution of SnRK genes in Brassica napus. left number represent physical location on chromosomes of BnSnRKs.