Genome-wide identification and characterization of cucumber bHLH family genes and the functional characterization of CsbHLH041 in NaCl and ABA tolerance in Arabidopsis and cucumber

Background: The basic/helix-loop-helix (bHLH) transcription factor family exists in all three eukaryotic kingdoms, where its members are important participants in growth and development. To date, comprehensive genomic and functional analyses of bHLH genes have not been reported in cucumber (Cucumis sativus L.). Results: Here, a total of 142 bHLH genes were identified in cucumber and classified into 32 subfamilies according to conserved motifs, phylogenetic analysis and gene structures. Multiple sequence alignment showed that the sequences of CsbHLH proteins were highly conserved. The chromosomal distribution, synteny and gene duplications of these 142 CsbHLHs were further analysed. A cis-element analysis showed that many elements related to stress responsiveness and plant hormones were present in the promoter regions of CsbHLH genes. By comparing the phylogeny of cucumber and Arabidopsis bHLH proteins, we found that cucumber bHLH proteins clustered into different functional clades of Arabidopsis bHLH proteins. Expression analysis of selected CsbHLHs under abiotic stresses (NaCl, ABA and low-temperature treatments) identified five CsbHLH genes that responded to all three stresses. Tissue-specific expression profiles of these five genes were also analysed. In addition, 35S:CsbHLH041 enhanced tolerance to salt and ABA in transgenic Arabidopsis and in cucumber seedlings, suggesting that CsbHLH041 is an important regulator of the response to abiotic stresses. Lastly, the predicted functional interaction network among the CsbHLH proteins was analysed. Conclusion: This study provides a foundation for further research into the functions and regulatory mechanisms of CsbHLH proteins and identifies candidate genes for stress resistance in cucumber.


Background
Basic helix-loop-helix (bHLH) transcription factors (TFs) form one of the largest TF families and exist widely in all three eukaryotic kingdoms [1,2]. The bHLH TFs are named for their structural characteristics [3]: the bHLH domain consists of approximately 60 conserved amino acid residues. Functionally, the domain can be divided into two parts: the basic region and the HLH region [4]. The basic region is located at the N-terminus of the bHLH conserved domain and contains approximately 15 to 20 residues, which are involved in DNA binding [5,6]. The HLH region is located at the C-terminal end of the domain and is composed of two amphipathic α-helices, consisting mainly of hydrophobic residues, linked by a loop region of variable sequence and length. The HLH region is essential for the formation of homodimers or heterodimers among bHLH TFs [6,7].
In animals, bHLH transcription factors are divided into six main categories (A-F) containing 45 subgroups, based on evolutionary origin, sequence similarity, DNA-binding patterns and functional types [8,9]. In plants, the bHLH gene family has been divided into 15-26 groups [10], and into as many as 32 groups when atypical bHLH proteins are included [2]. In Arabidopsis, 167 bHLH proteins are divided into 21 subfamilies [2,11]; the 165 bHLH family members in rice are classified into 22 subfamilies [12]; and the 159 bHLH proteins in tomato are divided into 21 subfamilies [13]. Increasing numbers of bHLH proteins are being identified in plants, and functional studies of them are steadily accumulating.
In plants, bHLH genes are involved in processes such as metabolic regulation, growth and development, and responses to environmental signals. The first member of the bHLH family to be discovered was the maize R gene, which was shown to play a key role in anthocyanin synthesis [14]. Subsequently, an increasing number of bHLHs have been shown to be involved in a wider range of physiological pathways. For example, Phytochrome Interacting Factors (PIFs) have been reported to respond to light signals [15]; overexpression of PRE1 activates gibberellin-dependent responses in Arabidopsis thaliana [16]; AtGL3, AtEGL3 and AtTT8 have been demonstrated to be involved in anthocyanin and proanthocyanidin (PA) biosynthesis [17,18]; and AtGL3, AtEGL3 and AtMYC1 also regulate trichome formation and root hair patterning [19]. In addition, some bHLH TFs respond to a variety of abiotic stresses and improve plant tolerance to drought, salt and cold. In wheat, overexpression of bHLH39 increases tolerance to salt stress [20]. The bHLH TFs often function by forming homodimers or heterodimers with other proteins. For example, the MYC3 and MYC4 transcription factors can both interact with multiple JAZ proteins (such as JAZ1, JAZ4 and JAZ9) to jointly regulate the JA signalling pathway [21]. The MYB-bHLH-WD40 complexes are involved in different processes, such as the biosynthesis of anthocyanins and PAs, leaf trichome formation and root hair patterning [22]. In summary, plant bHLH proteins can form homologous or heterologous complexes with bHLH proteins or other proteins to extend their biological functions.
Cucumber (Cucumis sativus L.) is an economically important crop cultivated worldwide [23]. The functions of the AtbHLH family have been widely studied in Arabidopsis thaliana [2]. However, genome-wide information on members of the CsbHLH family has not been reported. In this study, we identified and characterized 142 bHLH family genes in cucumber. They were classified into 32 subgroups and were distributed across seven chromosomes. Their gene structures, conserved motifs, synteny, gene duplications and promoter cis-elements were also investigated. In addition, the expression levels of selected CsbHLH genes were measured by qRT-PCR to study their responses to low temperature (4°C), salt (NaCl) and ABA; all tested genes were stress-responsive. The protein interaction network among the CsbHLH proteins was predicted, which could help to clarify the possible functional mechanisms of CsbHLH proteins. Furthermore, overexpression of CsbHLH041 increased salt and ABA resistance compared with controls in cucumber and Arabidopsis. We hope that this work will provide useful resources for further studies on the functions and regulatory mechanisms of a potentially important CsbHLH protein, which plays a crucial role in the regulation of abiotic stress responses in cucumber.

Identification and analysis of cucumber bHLH genes
To identify CsbHLH family genes in cucumber, we used the BlastP programme to search the cucumber genome database, using 166 Arabidopsis bHLH proteins [2,10] and the consensus protein sequence of the bHLH domain, together with the Hidden Markov Model (HMM) profile PF00010, as queries. We obtained 164 putative members of the CsbHLH family. To confirm the reliability of the bHLH genes in the cucumber genome, we used Pfam (http://pfam.janelia.org/) and SMART (http://smart.embl-heidelberg.de/) [24] to search for the presence of the bHLH domain in the amino acid sequences of the 164 proteins. Only 142 proteins had the corresponding conserved bHLH domain; these were named CsbHLH1 to CsbHLH142 according to their sequence similarity and phylogenetic relationships with individual AtbHLH proteins. The specific information for the 142 typical bHLH genes, including gene ID, protein length, chromosomal location and gene length, is presented in Table 1. The lengths of the CsbHLH protein sequences varied from 84 residues (CsaV3_1G005290) to 960 residues (CsaV3_1G043790), and the isoelectric points (pI) varied from 4.57 (CsaV3_2G030090) to 11.79 (CsaV3_6G028530).
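The domain-confirmation step can be illustrated with a minimal Python sketch. It assumes that the Pfam/SMART scan results have already been parsed into a mapping from protein ID to the set of domain accessions found; the function name, the gene IDs and the second accession are hypothetical and serve only as an illustration, not as the pipeline actually used in this study.

```python
def filter_by_domain(candidates, domain_hits, accession="PF00010"):
    """Keep only candidates whose domain-scan hits include the given accession.

    candidates  : list of protein IDs returned by the BlastP/HMM searches
    domain_hits : dict mapping protein ID -> set of domain accessions
                  (hypothetical pre-parsed Pfam/SMART output)
    """
    return [p for p in candidates if accession in domain_hits.get(p, set())]


# Toy data with made-up IDs; PF00010 is the bHLH profile named in the text,
# "PFxxxxx" stands for any other, unrelated domain.
candidates = ["geneA", "geneB", "geneC"]
domain_hits = {
    "geneA": {"PF00010"},
    "geneB": {"PFxxxxx"},             # no bHLH domain: discarded
    "geneC": {"PF00010", "PFxxxxx"},  # bHLH plus another domain: kept
}
confirmed = filter_by_domain(candidates, domain_hits)
print(confirmed)
```

Applied to the 164 putative members, a filter of this kind would retain the proteins in which the bHLH domain is detected.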

Phylogenetic analysis, gene structure and conserved motif analysis of CsbHLH gene family
To confirm the structural characteristics of CsbHLH proteins, we performed multiple sequence alignment (MSA) of the 142 CsbHLH proteins. All 142 CsbHLH proteins contained the characteristic regions of the bHLH domain: one basic region, two helix regions and one loop region (Fig. 1). Conserved amino acids with a sequence identity greater than 50% in the bHLH domains are shown in light blue or purple (Fig. 1a). Sequence logos were produced from the bHLH domain sequences of the 142 CsbHLH proteins (Fig. 1b). The cucumber CsbHLH proteins contained 17 conserved amino acids in the bHLH domain, which are also present in the bHLH gene families of Arabidopsis and Moso bamboo [2,25]. As shown in Fig. 1b, the key amino acid residues Arg-10, Arg-11, Leu-21 and Leu-53 were highly conserved (92%, 87%, 96% and 90%, respectively) in the 142 CsbHLH proteins. Subsequently, a phylogenetic tree was constructed from the 142 CsbHLH proteins, which were divided into 32 subgroups (C1-C32) based on clades with more than 50% bootstrap support (Fig. 2a).
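Per-position conservation values such as those quoted for Arg-10 or Leu-21 can be derived from an alignment along the following lines. This is a sketch under stated assumptions: the alignment is a list of equal-length strings with "-" for gaps, gaps count against conservation, and the toy sequences are invented for illustration only.

```python
from collections import Counter


def column_conservation(alignment, column):
    """Return (most common residue, fraction of all sequences carrying it).

    Gap characters are ignored when choosing the residue, but the
    denominator is the total number of sequences, so gaps lower the score.
    """
    residues = [seq[column] for seq in alignment if seq[column] != "-"]
    if not residues:
        return None, 0.0
    residue, count = Counter(residues).most_common(1)[0]
    return residue, count / len(alignment)


# Toy alignment of four sequences and three columns (illustrative only).
alignment = ["RRL", "RKL", "R-L", "KRL"]
for col in range(3):
    print(col, column_conservation(alignment, col))
```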
We then analysed the gene structures of the CsbHLH genes to support the phylogenetic analysis. CsbHLHs in the same subgroup had similar numbers of exons and introns and, regardless of intron size, similar intron-exon structures (Fig. 2c).
To further investigate the specific motifs of CsbHLH proteins in the same subgroup, we used the MEME tool to identify 10 conserved motifs. The number of conserved motifs varied among the 142 CsbHLH proteins (Fig. 2b). Moreover, CsbHLH proteins of the same subgroup shared similar motif compositions. For instance, all proteins of subgroup 23 contained motifs 1, 2, 4 and 6, and motif 5 was identified in most CsbHLH proteins. We also found that certain motifs were absent from certain subgroups. For example, motif 4 was absent from all proteins of subgroups 1, 2 and 3 (Fig. 2b).
In general, the results of the conserved motif and gene structure analyses further confirmed the results of the phylogenetic analysis, indicating that proteins within the same subgroup may have similar functions.
To further illuminate the evolutionary history of the CsbHLH family, we constructed comparative syntenic maps of cucumber with tomato and with Arabidopsis (Fig. 3b). We found that the CsbHLH024, CsbHLH040 and CsbHLH054 genes were associated with more than two syntenic gene pairs between cucumber and tomato. Likewise, the CsbHLH020 and CsbHLH049 genes each corresponded to two syntenic gene pairs between cucumber and Arabidopsis, indicating that these bHLH genes may have played a key role in evolution. In addition, certain collinear pairs were present between cucumber and both Arabidopsis and tomato (such as CsbHLH132, CsbHLH135 and CsbHLH136) (Fig. 3b; Table S2), suggesting that these orthologous pairs might already have been present before the ancestral divergence. Meanwhile, some CsbHLH genes were not associated with syntenic gene pairs in Arabidopsis or tomato, indicating that they might be peculiar to cucumber.

Cis-elements in the promoters of CsbHLH genes in cucumber
Previous work [27] suggests that many bHLH genes respond to a variety of abiotic stresses. We extracted the 2-kb promoter regions of the CsbHLH genes to identify potential cis-elements (Table S3). A number of CsbHLH genes contained elements associated with plant hormones (such as auxin, abscisic acid and gibberellic acid) and stress responsiveness (such as drought inducibility and low temperature). Moreover, the promoter regions of some CsbHLH genes contained an MYB binding site involved in the regulation of flavonoid biosynthetic genes, which might be involved in flavonoid synthesis in cucumber (Fig. S3; Table S3). In addition, the promoter regions of CsbHLH genes contained G-Box and Box-4 elements related to light responsiveness. Overall, the cis-regulatory elements in CsbHLH promoters included light-responsive elements, growth- and development-responsive elements, and elements responsive to diverse stresses (Table S3).
To analyse whether CsbHLH genes sharing the same cis-elements are co-expressed, we constructed a co-expression network of CsbHLH genes based on correlations among cucumber bHLH genes in the available RNA-seq data from 10 cucumber tissues [26]. The co-expression network, containing 23 CsbHLH genes (nodes) and 191 correlations (edges), showed that each of these CsbHLH genes was co-expressed with multiple genes carrying the same cis-elements (Fig. S4; Table S3). This result indicates that the co-expression of genes may be related to shared cis-elements in their promoter regions.
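A co-expression network of this kind can be sketched as follows. The correlation measure and cut-off are not stated in the text, so the Pearson coefficient and the 0.9 threshold used here are assumptions, and the gene names and expression values are invented for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors.

    Assumes neither vector is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def coexpression_edges(expr, threshold=0.9):
    """Return (gene1, gene2, r) for every gene pair with |r| >= threshold.

    expr: dict mapping gene name -> list of expression values, one per tissue.
    The threshold is an assumed value, not taken from the study.
    """
    edges = []
    for g1, g2 in combinations(sorted(expr), 2):
        r = pearson(expr[g1], expr[g2])
        if abs(r) >= threshold:
            edges.append((g1, g2, r))
    return edges


# Toy expression profiles across four tissues (illustrative values).
expr = {
    "geneA": [1, 2, 3, 4],
    "geneB": [2, 4, 6, 8],  # perfectly correlated with geneA
    "geneC": [4, 1, 3, 2],  # weakly anti-correlated with both
}
edges = coexpression_edges(expr)
print(edges)
```

Each retained pair is one edge of the network; the genes that appear in at least one edge are its nodes.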

Function prediction of CsbHLHs based on phylogenetic analyses
Previous studies have identified and verified the functions of numerous bHLH proteins in Arabidopsis [28,29]. However, little is known about the biological functions of CsbHLHs in cucumber. In this study, we performed phylogenetic analyses of 166 AtbHLH and 142 CsbHLH proteins to determine the relationships between the bHLH proteins of cucumber and Arabidopsis, and thereby to make a preliminary exploration of the functions of CsbHLH proteins [2,10] (Fig. 4).
We divided the 308 bHLH proteins into 23 subfamilies and predicted the functions of CsbHLHs according to their functionally verified homologs in the same subfamily (Table S4). As shown in Table S4, most of the proteins of subfamilies 1, 2, 4, 10, 13, 14 and 18 responded to different biotic and abiotic stresses [30,31], such as drought [32], cold [33] and salt [34]. Some of the proteins in subfamilies 4 and 10 might be involved in regulating iron homeostasis [35]. The proteins of subfamilies 19 and 23 have been shown to regulate flower development [36], and the members of subfamilies 3, 8, 9, 16 and 21 might participate in the development of multiple plant organs [37,38,39]. Subfamily 17 contained PIFs, which are involved in light signal transduction and support normal plant growth and development [15]. The members of subfamily 5 regulate flavonoid biosynthesis and cell differentiation in the root epidermis [22]. The detailed possible functions of CsbHLHs are listed in Table S4.
Although the functions of all genes cannot be deciphered from evolutionary relationships alone, this analysis provides a useful preliminary basis for predicting CsbHLH functions.

Expression analysis of CsbHLH genes under different stress conditions and in different tissues
To identify which CsbHLH genes play important roles in abiotic stress responses, we selected 21, 20 and 25 CsbHLH genes whose promoters contain low-temperature-responsive, defense- and stress-responsive, and abscisic acid (ABA)-responsive cis-acting elements, respectively, and measured their transcriptional changes under low-temperature (4°C), salt (NaCl) and ABA treatments, respectively. As expected, all the selected CsbHLH genes responded to the respective stress treatments (Fig. 5). For example, all 20 selected CsbHLHs responded to salt stress; many of them were upregulated after 1 h of salt treatment, reached their highest expression level 3 h later, and then gradually declined. The expression of CsbHLH033, CsbHLH041 and CsbHLH082 peaked after only 1 h of NaCl treatment, whereas the expression of CsbHLH136 reached its maximum after 12 h. CsbHLH041 was the most strongly induced by salt stress (approximately 37-fold) (Fig. 5a). Under ABA treatment, the transcript levels of CsbHLH020, CsbHLH041 and CsbHLH064 were more than 10-fold higher than untreated levels (maximum induction: CsbHLH020, nearly 61-fold; CsbHLH041, nearly 55-fold; CsbHLH064, nearly 19-fold). In contrast, the expression levels of four CsbHLH genes (CsbHLH011, CsbHLH033, CsbHLH034 and CsbHLH077) were significantly downregulated under ABA treatment (Fig. 5b). The expression levels of 20 of the 21 selected CsbHLHs were upregulated at some time points after the 4°C treatment, while only CsbHLH032 was downregulated (Fig. 5c). We found that the CsbHLH020, CsbHLH064, CsbHLH086, CsbHLH093 and CsbHLH112 genes responded to all three abiotic stresses (Fig. 5).
The expression patterns of genes under different conditions are often related to their functions. Therefore, we used qRT-PCR to examine the expression patterns of the five abiotic stress-responsive genes CsbHLH020, CsbHLH064, CsbHLH086, CsbHLH093 and CsbHLH112 in different tissues. The five CsbHLH genes showed different tissue specificities (Fig. 5d). For instance, CsbHLH093 and CsbHLH112 had higher expression levels in ovaries and roots, but lower expression levels in tendrils and male flowers (Fig. 5d). In contrast, both CsbHLH064 and CsbHLH086 were highly expressed in tendrils and male flowers. The expression levels of CsbHLH020 in young leaves and roots were higher than those in other tissues (Fig. 5d). These results suggest that CsbHLH genes might play key roles in plant developmental and physiological processes.

CsbHLH041 enhanced tolerance to NaCl and ABA in transgenic Arabidopsis and cucumber
CsbHLH041 expression was significantly induced by salt and ABA in cucumber (Fig. 5a-b). Therefore, we used Agrobacterium-mediated transient transformation of cucumber cotyledons to examine the role of CsbHLH041 in tolerance to salt and ABA. After 0.5 h of 100 mM NaCl treatment, seedlings carrying the 35S empty vector wilted severely compared with seedlings overexpressing CsbHLH041, and the difference in wilting was more obvious after 3 h of NaCl treatment (Fig. 6a). After 12 h, the survival rate of the transgenic seedlings (24%) was markedly higher than that of the 35S empty vector seedlings (6%), showing that overexpression of CsbHLH041 resulted in significant salt resistance (Fig. 6c). After 6 h of ABA treatment, the transgenic seedlings were more vigorous than the 35S empty vector seedlings (Fig. 6b). With prolonged ABA treatment, the 35S cucumber seedlings showed visible symptoms of ABA-induced damage, such as drying, wilting and even death, with a survival rate of only 12%, whereas some CsbHLH041 transgenic plants remained green with expanded cotyledons, and their survival rate reached approximately 40% (Fig. 6b-c).
To clarify the possible factors underlying the enhanced NaCl and ABA resistance, we examined the enzymatic activities of the ROS scavenging system under NaCl and ABA treatments. Without NaCl or ABA treatment, the enzymatic activities of POD, SOD and CAT did not differ significantly between 35S and 35S:CsbHLH041 transgenic seedlings (Fig. 6d-f). In contrast, both NaCl and ABA treatments induced significantly higher scavenging enzyme activities in the CsbHLH041 transgenic plants than in the 35S empty vector seedlings (Fig. 6d-f).
To further explore the function of CsbHLH041 in resistance to abiotic stress, transgenic Arabidopsis plants overexpressing CsbHLH041 driven by the CaMV35S promoter were generated. Two independent homozygous lines with relatively high expression levels, CsbHLH041 OX1 and CsbHLH041 OX2, were selected for analysis (Fig. 7a), and their salt and ABA tolerance was assessed. There were no differences in seed germination between WT and CsbHLH041 transgenic Arabidopsis on 1/2 MS medium (control) (Fig. 7b). However, the germination rate of transgenic seeds was markedly higher than that of WT seeds on 1/2 MS medium containing 100 mM NaCl or 2 μM ABA (Fig. 7b-d).
Subsequently, 3-week-old seedlings of CsbHLH041 transgenic lines and wild-type (WT) plants were treated with 200 mM NaCl or 100 μM ABA. The leaves of WT plants turned severely yellow after 4 days of 200 mM NaCl or 100 μM ABA treatment, while CsbHLH041 transgenic lines continued to grow with green leaves (Fig. 7e-f). After 8 days, the difference in NaCl or ABA resistance between WT plants and CsbHLH041 transgenic lines was more obvious, suggesting that CsbHLH041 transgenic plants were more tolerant to salt and ABA stresses than WT.

The protein interaction network predictions for CsbHLH orthologs in Arabidopsis that were crucial for the abiotic stress response
Network interaction analysis has been demonstrated to be an effective method for analysing gene function [40]. We used STRING 10 to predict the protein interaction network among the 142 CsbHLH proteins (Fig. 8a). Numerous CsbHLH transcription factors interacted with multiple CsbHLHs, consistent with previous reports that the binding of bHLH proteins to specific DNA sequences depends on the homodimers or heterodimers they form [2]. As shown in Fig. 8a, 21 proteins were linked to more than four other bHLH proteins, suggesting that they may play important roles in regulating plant stress responses and growth; detailed information about these orthologs is shown in Table S6.
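Counting how many partners each protein has in a predicted network can be sketched as follows. The sketch assumes the network has been exported as a list of undirected protein pairs; the function name and protein labels are hypothetical and do not correspond to entries in Table S6.

```python
from collections import defaultdict


def highly_connected(edges, more_than=4):
    """Return the proteins linked to more than `more_than` distinct partners.

    edges: iterable of (protein_a, protein_b) pairs from a predicted network.
    Self-interactions are ignored and duplicate pairs are counted once.
    """
    partners = defaultdict(set)
    for a, b in edges:
        if a != b:
            partners[a].add(b)
            partners[b].add(a)
    return sorted(p for p, linked in partners.items() if len(linked) > more_than)


# Toy network: P1 has five partners, every other protein has at most two.
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"),
         ("P1", "P5"), ("P1", "P6"), ("P2", "P3")]
hubs = highly_connected(edges)
print(hubs)
```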
In our study, CsbHLH041 responded significantly to salt and ABA treatments and enhanced tolerance to NaCl and ABA in transgenic Arabidopsis and cucumber (Fig. 5a-b; Fig. 6; Fig. 7). The functions of bHLH proteins are mainly realized through the formation of heterodimers or homodimers with other transcription factors, which is essential for their binding to downstream target genes [2]. AT5G56960, the Arabidopsis homolog of CsbHLH041, was at the centre of the protein association network, indicating that it plays a major role in regulating different functional proteins (Fig. 8b; Table S6). For example, EP3 might play a role in both normal plant growth and disease resistance [41]. VSP1 and VSP2 are anti-insect proteins that respond to methyl jasmonate and wounding, and their defense function is correlated with their acid phosphatase activity [42,43]. The predicted gene association network provides a useful resource for subsequent research.

Characterization of the cucumber bHLH family
The basic helix-loop-helix (bHLH) transcription factor family is the second largest family in eukaryotes [10,44], and bHLH families have been studied extensively in various plants [2]. For example, 166 bHLH genes have been identified in Arabidopsis [2,10], 115 in Nelumbo nucifera [45], 188 in apple [40], 167 in rice [12] and 159 in tomato [13]. The bHLH TFs are involved in multiple biological processes in plants, especially in regulating defense against biotic and abiotic stresses [46]. However, very little is known about bHLHs in cucumber. In our study, 142 cucumber bHLH genes were identified and characterized. According to phylogenetic analyses, the 142 CsbHLHs were divided into 32 subgroups (Fig. 2a), and multiple sequence alignment indicated that the conserved bHLH domain was present in all 142 CsbHLH proteins (Fig. 1). For instance, the two amino acid residues Leu-21 and Leu-53, located in the helical regions that are essential for dimer formation, were relatively conserved. Moreover, the conserved motif analysis indicated that almost all 142 CsbHLH proteins contained motifs 1 and 2. The analyses of gene structure and motifs further supported the phylogenetic relationships among the 142 CsbHLH genes (Fig. 2b-c). In summary, these results showed that all 142 CsbHLHs had the characteristics of the bHLH family, confirming the reliability of the bHLH genes identified in cucumber.

Phylogenetic analysis and evolution of cucumber bHLH genes
In the model plant Arabidopsis, the bHLH gene family has been systematically analysed [2,11]. To explore the evolutionary relationships between the 142 CsbHLH proteins in cucumber and the 166 AtbHLH proteins in Arabidopsis, a phylogenetic tree was constructed from the 308 bHLH protein sequences, which clustered into 23 subfamilies (Fig. 4). Cucumber and Arabidopsis differ in anatomy and physiology, so some clades of the bHLH family may have expanded differently in the two species. As shown in Fig. 4 and Table S4, not all cucumber bHLH members were included in these 23 subfamilies, suggesting that the family has evolved differently in cucumber and Arabidopsis.
Studies have shown that gene duplication events play a crucial role in the rapid expansion and evolution of gene families [26]. In the cucumber genome, we identified 231 segmental duplication events and 1468 tandem duplication gene pairs (Table S1). Seven segmental duplication events and five tandem duplication gene pairs were found in the CsbHLH family (Fig. 3a). In general, the gene functions of a clade are highly conserved among different plant species, but this is not absolute. It is therefore important to identify true orthologs between plant species accurately, based on synteny analysis. The results showed that the cucumber genome had extensive synteny with the Arabidopsis and tomato genomes: 944 and 983 syntenic blocks were identified between cucumber and Arabidopsis and between cucumber and tomato, respectively (Table S5). Many CsbHLH genes showed a collinear relationship with tomato and Arabidopsis genes (Fig. 3b; Table S2).
Previous studies have shown that orthologous genes are usually distributed in the same clade and have similar functions. In our study, many CsbHLH proteins were grouped into functional clades of Arabidopsis, providing valuable information for studying the functions of CsbHLHs. CsMYC1 and CsbHLH042 were grouped into subfamily 5 along with AtGL3, AtEGL3, AtMYC1 and AtTT8, and were highly homologous to these proteins. In Arabidopsis, AtGL3, AtEGL3 and AtTT8 have been demonstrated to be key regulators of anthocyanin and PA biosynthesis [22]. Moreover, AtGL3, AtEGL3 and AtMYC1 were shown to regulate trichome formation and root hair patterning [19,47]. Therefore, CsMYC1 and CsbHLH042 may control trichome formation and PA biosynthesis in cucumber.

Cucumber bHLH genes may play important roles in abiotic stress tolerance
During plant responses to abiotic stress, bHLH TFs regulate the expression of stress-related genes and thus play an important role in stress responses. Many studies have shown that bHLH TFs can respond to a range of stresses. For example, in addition to being involved in the morphogenesis of stomata, the TFs INDUCER OF CBF EXPRESSION1 (ICE1) and ICE2 in Arabidopsis and their homologs in other species play key roles in the response to low-temperature stress [32,46]. RERJ1 is upregulated in response to physical damage and drought stress [48]. These examples indicate that bHLH TFs contribute to abiotic stress responses. However, little is known about the functions of the bHLH gene family in cucumber. To better understand the functions of the bHLH gene family in cucumber, we conducted a preliminary analysis from three aspects.
First, how the cis-elements in the promoters of bHLH genes respond to the environment affects their roles in stimulating and regulating gene expression. Cis-element analyses indicated that the promoters of CsbHLH genes contained a wide range of elements responsive to different stresses, such as the TCA-element, MBS and LTR (Fig. S3). The MYB binding site involved in drought inducibility was present in many CsbHLH gene promoters (Table S3), indicating that MYB TFs may regulate CsbHLH expression under drought stress. The TC-rich and ABRE elements, related to ABA-dependent or ABA-independent stress tolerance, also appeared in some CsbHLH gene promoters [49]. In general, based on the cis-acting elements in their promoters, these CsbHLH genes might play key roles in responding to various stresses in cucumber. Second, the functions of 50 CsbHLHs were predicted, and these were mainly related to stress responses and developmental processes (Table S4). Third, the regulatory networks for the 142 CsbHLH genes were predicted, suggesting that a number of genes could respond to stimuli (Table S6). For example, bHLH093 and ICE1 are involved in the ABA signalling pathway, which is crucial for abiotic stress responses in plants [49,50]. These results suggest that the bHLH gene family may also be involved in stress responses, metabolic regulation and plant development in cucumber, consistent with previous research [10,12]. Subsequently, we analysed and screened CsbHLH genes that might respond to stress, because improving the stress tolerance of cucumber is very important. According to cis-element analyses, the promoter regions of 60 CsbHLHs were rich in TC-rich cis-elements, suggesting that they may be involved in stress responses and defense (Fig. S3). Moreover, the promoters of 106 CsbHLHs contained the ABA-responsive element, and those of 41 CsbHLHs contained the LTR element, which responds to cold stress. The phylogenetic analyses between Arabidopsis and cucumber further showed that 25 CsbHLHs might respond to abiotic stresses, such as ABA, salt, cold and drought (Table S4). Through comprehensive analysis, we selected 21, 20 and 25 bHLH genes that were likely to respond to low temperature (4°C), salt (NaCl) and ABA, respectively. The selected CsbHLH genes all responded to the respective stress treatments (Fig. 5). CsbHLH041 was induced by salt and ABA (Fig. 5a-b), and 35S:CsbHLH041 transgenic Arabidopsis thaliana and transiently transformed cucumber cotyledons showed enhanced tolerance to salt and ABA (Fig. 6; Fig. 7). In general, these results provide a good reference for further functional studies of the CsbHLH gene family in cucumber.

Conclusions
Our study investigated the bHLH family genes of cucumber in detail. We also performed expression analyses of selected genes under different stress treatments and characterized the function of CsbHLH041 using transgenic approaches. This work provides insights into the functions and regulatory mechanisms of CsbHLH proteins in cucumber abiotic stress tolerance, growth and development.

Genome-wide identification of the CsbHLH genes in cucumber
To identify the CsbHLH gene family members in the cucumber genome database, 166 Arabidopsis bHLH proteins were used as query sequences in BlastP searches against the predicted cucumber proteins. In addition, the Hidden Markov Model (HMM) profile of the bHLH domain (PF00010) from the Pfam database (available online: http://pfam.janelia.org) was used as a query to search for bHLH genes. We further examined the bHLH domains of all candidate bHLH genes as described previously [24].

Phylogenetic analysis and multiple sequence alignment
The sequence logos for bHLHs were obtained by submitting the multiple sequence alignments to WebLogo (http://weblogo.berkeley.edu/logo.cgi) [51]. A phylogenetic tree was constructed from the aligned full-length predicted protein sequences of the 142 bHLH genes using MEGA7 (https://www.megasoftware.net/) [52]. The neighbour-joining (NJ) method was used with the following parameters: Poisson correction, pairwise deletion, and bootstrap (1,000 replicates; random seed). The phylogenetic tree was visualized using the EvolView tool (http://www.evolgenius.info). The CsbHLH genes were then classified according to their phylogenetic relationships with the corresponding Arabidopsis bHLH genes. Multiple sequence alignments were performed as described previously [26].
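The Poisson correction with pairwise deletion named above can be illustrated for a single pair of aligned sequences. In the standard formulation, the distance is d = -ln(1 - p), where p is the proportion of differing sites among the positions at which neither sequence has a gap. The sketch below is an illustration of that formula only, not a reimplementation of MEGA7, and the toy sequences are invented.

```python
from math import log


def poisson_distance(seq1, seq2, gap="-"):
    """Poisson-corrected amino acid distance with pairwise deletion.

    Sites where either aligned sequence has a gap are dropped for this pair
    only; p is the fraction of remaining sites that differ; d = -ln(1 - p).
    """
    sites = [(a, b) for a, b in zip(seq1, seq2) if a != gap and b != gap]
    if not sites:
        raise ValueError("no comparable sites between the two sequences")
    p = sum(a != b for a, b in sites) / len(sites)
    if p >= 1.0:
        raise ValueError("distance is undefined when every site differs")
    return -log(1.0 - p)


# Four comparable sites (the fifth is removed by pairwise deletion),
# one mismatch, so p = 0.25.
d = poisson_distance("ARND-", "ARNEC")
print(round(d, 4))
```

A full distance matrix built this way for all sequence pairs is the input that a neighbour-joining algorithm would then cluster.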

Gene duplication and chromosomal distribution
The gene duplication events were assessed as described previously [54]. According to the physical location information in the cucumber genome database, the 142 CsbHLH genes were mapped to cucumber chromosomes as described previously [26], and the syntenic analysis maps were generated using TBtools [26].

Analysis of the bHLH gene promoter in cucumber
We downloaded the entire cucumber genome sequence from the cucumber genome database (Chinese Long 9930) and extracted the 2-kb sequences upstream of the transcription start sites of the 142 CsbHLH genes. The cis-acting elements in the promoter regions of these genes were analysed using PlantCARE (http://bioinformatics.psb.ugent.be/webtools/plantcare/html/) [55].
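Extracting a fixed-length upstream region requires attention to strand, because "upstream" lies at lower coordinates for plus-strand genes and at higher coordinates for minus-strand genes. The following sketch assumes a chromosome sequence held as a Python string and a 1-based transcription start site coordinate; the function names are hypothetical, and the short toy sequence replaces a real chromosome.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def upstream_region(chrom_seq, tss, strand, length=2000):
    """Return up to `length` bases upstream of a transcription start site.

    tss is the 1-based coordinate of the first transcribed base. The result
    is always given 5'->3' in the orientation of the gene and is truncated
    at the chromosome ends.
    """
    if strand == "+":
        start = max(0, tss - 1 - length)
        return chrom_seq[start:tss - 1]
    if strand == "-":
        end = min(len(chrom_seq), tss + length)
        return reverse_complement(chrom_seq[tss:end])
    raise ValueError("strand must be '+' or '-'")


# Toy 16-bp "chromosome"; 4 bp are extracted instead of 2 kb.
chrom = "ACGTTGCAAGGCTTAA"
plus_promoter = upstream_region(chrom, tss=9, strand="+", length=4)
minus_promoter = upstream_region(chrom, tss=8, strand="-", length=4)
print(plus_promoter, minus_promoter)
```

With `length=2000`, the returned sequences correspond to the kind of 2-kb promoter regions that are submitted to a cis-element search.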

RNA extraction and qRT-PCR analysis
Total RNA was isolated from cucumber and Arabidopsis plants using an RNAprep pure Plant Kit (TianGen, Beijing, China), following the manufacturer's instructions. RNA was then reverse transcribed using the PrimeScript® 1st Strand cDNA Synthesis Kit (Takara, Japan). The qRT-PCR reactions were performed using the UltraSYBR Mixture (with ROX I; Cwbiotech) on the iCycler iQ5 system (BioRad, CA, USA). The results were normalized to those of the cucumber Actin gene. Three biological replicates were used for each analysis. The primers used in this study are provided in Table S7.
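The text states only that results were normalized to Actin; the quantification formula is not given. A common choice for this kind of normalization is the 2^(-ΔΔCt) method, and the sketch below assumes it purely for illustration. The function name and the Ct values are hypothetical.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change by the 2^(-ddCt) method (assumed here, not stated in the text).

    Each target Ct is first normalized to the reference gene (e.g. Actin) in
    the same sample; the treated sample is then compared with the control.
    """
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_treated - delta_control)


# Illustrative Ct values: the target crosses threshold 5 cycles earlier
# (relative to the reference) in the treated sample than in the control.
fold = relative_expression(22.0, 18.0, 27.0, 18.0)
print(fold)
```

Under this assumption, a fold change is computed per biological replicate and then averaged across the three replicates.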

Overexpression vector construction, Arabidopsis transformation and transient transformation in cucumber cotyledons
The full-length coding sequence of CsbHLH041 was recombined into the pCAMBIA1300 vector. The construct was transformed into Agrobacterium tumefaciens LBA4404, which was used for transformation of Arabidopsis plants and 8-d-old cucumber cotyledons [57]. The Arabidopsis ecotype Columbia (Col-0), propagated in our laboratory, was used. Homozygous T3 transgenic Arabidopsis lines were identified by hygromycin (300 mg/L) selection.

Abiotic stress tolerance assays and ABA sensitivity analysis
For Arabidopsis salt stress and ABA treatment, the seeds of CsbHLH041 T3-generation homozygous lines and Col-0 (WT) were sown in vermiculite soil in pots and cultured under normal conditions at 22 °C for 3 weeks. For salt treatment, the 3-week-old seedlings were watered with 200 mM NaCl solution every other day, and the growth of Col-0 (WT) and CsbHLH041 transgenic lines was observed every 4 days. For ABA treatment, the 3-week-old seedlings were watered with 100 μM ABA solution every other day, and phenotypes were evaluated every 4 days. To determine the seed germination rate in response to salt stress and ABA treatment, the seeds of Col-0 (WT) and transgenic lines were surface sterilized and sown on 1/2 MS medium supplemented with 100 mM NaCl or 2 μM ABA, under normal conditions at 22 °C in a growth chamber. The germination rate was scored on the 7th day after sowing on the plates.
To determine salt tolerance and ABA sensitivity in cotyledons of 8-d-old cucumber seedlings transiently infiltrated with 35S or 35S:CsbHLH041, seedlings with equivalent growth were selected and transferred to 6 L of Hoagland nutrient solution, where they were grown hydroponically for two days before salt and ABA treatment. They were then treated with NaCl or ABA at final concentrations in the medium of 100 mM and 100 μM, respectively. To ensure the reliability of the experiment, the cucumber seedlings transiently infiltrated with 35S and 35S:CsbHLH041 were cultured in the same hydroponic box. The changes in transgenic and control seedlings were observed at different time points.
Figures

Figure 4. Evolutionary tree analysis (circle tree) and subfamily classification of bHLH proteins in cucumber and Arabidopsis thaliana. The evolutionary tree was constructed using the neighbour-joining method with 1,000 bootstrap replicates. The evolutionary distances were computed using Poisson correction. The analysis involved 142 cucumber bHLH protein sequences and 166 Arabidopsis thaliana bHLH protein sequences. Red stars represent the CsbHLH proteins and blue represents the AtbHLH proteins.

Figure 5. Relative [...] and tendrils (T), respectively. The cucumber β-actin gene was used as an internal control, and three independent samples were used for these experiments. Error bars indicate standard errors (SE).