Genome-wide identification and characterisation of bHLH transcription factors in Artemisia annua
BMC Plant Biology volume 23, Article number: 63 (2023)
Artemisia annua (sweet wormwood) is the main source of the anti-malarial drug artemisinin, which is synthesised and stored in its trichomes. Members of the basic Helix-Loop-Helix (bHLH) family of transcription factors (TFs) have been implicated in artemisinin biosynthesis in A. annua and in trichome development in other plant species.
Here, we have systematically identified and characterised 226 putative bHLH TFs in A. annua. All of the proteins contain an HLH domain, and 213 of them also contain the basic motif that mediates DNA binding of HLH dimers. Of these, 22 also contain a Myc domain that permits dimerisation with other families of TFs; only two proteins lacking the basic motif contain a Myc domain. Highly conserved GO annotations reflected the transcriptional regulatory role of the identified TFs, and suggested conserved roles in biological processes such as iron homeostasis, and guard cell and endosperm development. Expression analysis revealed that three genes (AabHLH80, AabHLH96, and AaMyc-bHLH3) exhibited spatiotemporal expression patterns similar to genes encoding key enzymes in artemisinin synthesis.
This comprehensive analysis of bHLH TFs provides a new resource to direct further analysis into key molecular mechanisms underlying and regulating artemisinin biosynthesis and trichome development, as well as other biological processes, in the key medicinal plant A. annua.
The bHLH (basic Helix-Loop-Helix) proteins are one of the most important transcription factor (TF) families present in all eukaryotes, from red algae and yeasts to higher plants and animals. These proteins usually contain a highly conserved bHLH domain of 45–60 amino acids. The HLH region comprises two generally hydrophobic helices linked by a loop region, and is critical for homo- or hetero-dimerisation of HLH proteins into functional TFs. The basic motif, rich in basic amino acids (particularly arginine), mediates DNA recognition and binding to E-box or G-box hexanucleotide consensus sequences (5′-CANNTG-3′). Binding of bHLH TFs to E-box sequences has been shown to regulate gene expression in a wide range of biological processes, including cell differentiation and development, e.g., regulating flag leaf angle in rice; determining lateral root initiation in Arabidopsis thaliana; modulating multiple stress pathways [7, 8]; and controlling iron homeostasis and hormone signalling. HLH proteins lacking the basic motif can act as repressors by forming heterodimers to sequester bHLH proteins into inactive complexes unable to bind DNA.
Some bHLH TFs contain an additional N-terminal Myc domain. The Myc domain was first identified in oncogenes, and Myc-domain proteins promote proliferation and apoptosis and inhibit terminal differentiation in the genesis of an extraordinarily wide range of cancers. Human c-Myc, a nuclear protein, was shown to interact with the bHLH protein Max to promote transcriptional activity [14, 15]; and Myc-bHLH proteins, containing both Myc and bHLH domains, have also been reported [16, 17]. In plants, Myc-bHLH TFs contain an MYB interaction region (MIR), which can interact with an R2R3–MYB domain protein to affect transcription and downstream processes.
A. annua (Asteraceae) produces artemisinin, the powerful anti-malarial drug, mainly in its trichomes. The key enzymes involved in artemisinin biosynthesis include ADS (amorpha-4,11-diene synthase), DBR2 (artemisinic aldehyde delta-11(13) reductase), CYP71AV1 (cytochrome P450 monooxygenase), and ALDH1 (aldehyde dehydrogenase 1) [20,21,22]. Several bHLH TFs have been reported to be involved in artemisinin synthesis, e.g., AabHLH1 (named AaMyc-bHLH3 in this study); bHLH112 (AabHLH65), which acts indirectly via ERF1; and AaPIF3 (AabHLH20), whose overexpression promotes artemisinin production.
In the model plants Arabidopsis and rice, 162 and 167 bHLH TFs, respectively, have been identified. As the genome sequences of more species are published and bioinformatic technologies become more refined, bHLH TFs are being identified in a growing number of species, e.g., potato, apple, maize, and wheat. Here, we have identified 226 putative bHLH TFs from A. annua, and analysed the bHLH domain structures, phylogeny, and gene ontology (GO) annotations of the TFs. Examination of their protein–protein interaction (PPI) network identified key hub genes, and transcriptomic analyses identified candidate genes potentially involved in artemisinin biosynthesis and trichome development.
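The E-box consensus above (5′-CANNTG-3′) can be searched for with a short pattern match. The following is a minimal sketch, not part of the study's pipeline: the promoter fragment is made up for illustration, and the function name `find_eboxes` is hypothetical. The G-box is treated as the CACGTG special case of the E-box.

```python
import re

# E-box consensus from the text: 5'-CANNTG-3' (N = any base).
# The lookahead lets overlapping matches be reported as well.
EBOX = re.compile(r"(?=(CA[ACGT]{2}TG))")

def find_eboxes(seq):
    """Return (0-based position, hexamer) for every E-box match in seq."""
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in EBOX.finditer(seq)]

# Hypothetical promoter fragment, invented for this example.
promoter = "ttgaCACGTGaaCATATGcc"
for pos, hexamer in find_eboxes(promoter):
    # CACGTG is the G-box, a specific instance of the E-box consensus.
    kind = "G-box" if hexamer == "CACGTG" else "E-box"
    print(pos, hexamer, kind)
```

Only the forward strand is scanned here; because CANNTG is not always its own reverse complement at the NN positions, a fuller scan would also report the hexamer read from the opposite strand.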
Characterisation of bHLH TFs in A. annua
A total of 247 bHLH sequences were identified from the existing A. annua protein database using a Hidden Markov Model search for the PF00010 (HLH) domain. A subsequent BlastP search using the amino acid sequences of 88 bHLH TFs from Arabidopsis identified 59 sequences. After combining the two sets of results and removing repeated entries, 226 sequences were identified (Table 1; cDNA sequences in Supplemental Material 2, gDNA sequences in Supplemental Material 3 and protein sequences in Supplemental Material 4). The presence of HLH domains in these sequences was confirmed by HMMscan and the NCBI Conserved Domains tool.
Analysis of the conserved domains of AabHLH TFs
An alignment of the amino acid sequences of these 226 TFs was generated. Four conserved motifs are typically found in the bHLH domain, namely one basic motif, two helical motifs, and one loop that connected the two helices to form the helix-loop-helix (HLH) domain (Fig. 1A). The 9 aa basic motif of AabHLH TFs contained five highly conserved residues (His-1; Glu-5; Arg-6, 8, 9); the 14 aa helical motifs contained four (Leu-19, 22; Val-23; Pro-44) and seven (Ala-32, 38; Leu-35; Tyr-40; Ile-41; Lys-42; Leu-44) conserved residues in helix 1 and 2, respectively; while the 6 aa loop contained two conserved residues (Lys-28; Asp-30; Fig. 1A).
A predicted three-dimensional structure of the highest consensus sequence was generated, and confirmed the presence of two helices and an intervening loop (Fig. 1B). The predicted structure is compatible with the formation of homo- and hetero-dimers, consistent with the known requirement for bHLH TFs to form dimers to function and maintain stability (Fig. 1B).
The vast majority of AabHLH TFs (191/226) contained a basic motif and an HLH domain (AabHLH1–191), while eleven lacked the basic motif (AaHLH1–11). A further 24 TFs contained an additional Myc domain (PF00249), comprising three short repeated sequences upstream of the bHLH domain. Of these, 22 contained the basic motif (AaMyc-bHLH1–22), while the last two lacked the basic motif (AaMyc-HLH1–2; Table 1).
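The naming scheme described above follows from two yes/no properties of each protein: whether it has the basic motif and whether it has a Myc domain. A minimal sketch of that rule, using the class counts reported in the text (the function name `classify` is hypothetical, and an HLH domain is assumed to be present in every protein):

```python
from collections import Counter

def classify(has_basic, has_myc):
    """Name the class used in the text from two domain flags."""
    prefix = "AaMyc-" if has_myc else "Aa"
    return prefix + ("bHLH" if has_basic else "HLH")

# Counts taken from the text: 191 bHLH, 11 HLH, 22 Myc-bHLH, 2 Myc-HLH.
flags = ([(True, False)] * 191 + [(False, False)] * 11 +
         [(True, True)] * 22 + [(False, True)] * 2)

tally = Counter(classify(basic, myc) for basic, myc in flags)
print(tally)
print("total:", sum(tally.values()))
print("with basic motif:", tally["AabHLH"] + tally["AaMyc-bHLH"])
```

The two totals printed at the end correspond to the 226 TFs overall and the 213 that carry the basic motif.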
Phylogenetic analysis of AabHLH TFs
To classify the 226 bHLH TFs from A. annua and explore their evolutionary relationships with 88 Arabidopsis proteins, we constructed an unrooted phylogenetic tree based on their bHLH domains. The 314 TFs clustered into eleven subfamilies (Fig. 2). AaMyc-bHLH and AaMyc-HLH TFs were found in groups I, II, and X. AaHLH TFs mainly occurred in group VII, with a minor presence in groups V, X, and XI. AabHLH TFs were present in every group, while AtbHLH TFs were present in every group except VIII.
Gene ontology classification of AabHLH TFs
Although the sequences outside the bHLH domain are highly divergent, AabHLH TFs have highly conserved gene ontology (GO) annotations, especially with respect to Molecular Function (Table 2; Supplemental Material 1 Tables S1 and S2). Over 96% of AabHLH TFs (217) possess dimerization activity; 86 have DNA binding activity; > 48% are involved in transcription processes; and 12 affect iron ion homeostasis. Several AabHLH TFs play a role in endosperm development and guard cell differentiation (Table 2). While there are only 11 AaHLH TFs (4.9% of the total), they are distributed across the more conserved GO annotations, including GO:0006355, GO:0046983, GO:0003700, GO:0055072, GO:0006357 and GO:0006351 (Supplemental Material 1 Table S2), indicating that a feedback regulation mechanism between AaHLH and AabHLH TFs may exist in A. annua biological processes.
Protein–protein interaction network construction and hub gene identification
Protein interactions between the TFs were predicted with the STRING tool. A total of 227 nodes and 106 edges were identified in the protein–protein interaction (PPI) network; disconnected nodes in the network were hidden (Supplemental Material 1 Fig. S1). Nodes with higher degrees of connectivity tend to be more important for maintaining the stability of the entire network, so we focussed on identifying these hub genes. Cytoscape software was used to modify the PPI network.
AabHLH61 had the highest degree of connectivity (26), followed by AabHLH20, AaMyc-bHLH3, and AaMyc-bHLH1, all with a degree of connectivity of 18 (Table 3; Fig. 3). The top ten proteins by connectivity in the PPI network were considered to be encoded by hub genes (Table 3).
The expression patterns of these genes were explored by quantitative reverse transcription (qRT-)PCR in flower, root, stem, young leaf, old leaf, and seed tissues (Fig. 4). All of these genes exhibited markedly different expression patterns in the six tissues analysed, suggesting that these TFs have different functions in various biological processes. AaMyc-bHLH1 was highly expressed in young and old leaf, while AaMyc-bHLH3 was highly expressed in old leaf. Expression of AabHLH61 and AabHLH117 was highest in leaf tissues, as well as in seed for AabHLH61 and in stem for AabHLH117. Expression of AaMyc-bHLH9 and AabHLH100 also peaked in old leaf, although at lower levels. Of the remaining hub genes, AabHLH20 was highly expressed in old leaves and seeds; AabHLH106 in the stem; AabHLH111 in roots and stem; and AabHLH151 in seeds.
Differential expression of AabHLH TFs in various tissues
An existing RNA-sequencing (RNA-seq) database was used to further explore the expression patterns of the 226 AabHLH TFs at different growth stages in different tissues and organs (young leaf, old leaf, stem, root, epidermis, bud, seed, flower and trichome). Three obvious clusters (labelled α, β, and γ) of expression were detected (Fig. 5A; Supplemental Material 1 Fig. S2). Expression of genes encoding AabHLH TFs was highest in the α cluster, with most genes exhibiting mid- to high-expression levels; in the β cluster, gene expression was generally lower. Across all three clusters, however, different patterns of tissue-specific expression were observed, e.g., in β, genes were generally most highly expressed in root, bud, and flower. The expression levels of AabHLH TFs from the γ cluster were generally very low across all tissues (Fig. 5A).
Trichomes (small protrusions of epidermal origin on stem, leaf, bud, and flower surfaces) of A. annua are the sites for production and storage of artemisinin [33, 34]. Genes encoding key enzymes in artemisinin synthesis are also highly expressed in trichomes, e.g., ADS, DBR2, CYP71AV1, and ALDH1 (Fig. 5B). To define which AabHLH TFs might be involved in trichome formation and artemisinin synthesis, we identified AabHLH TF-encoding genes with relatively high expression in the trichome. The expression levels of AaMyc-bHLH1, AaMyc-bHLH3, AabHLH184, AabHLH80, AabHLH181, AabHLH88, and AabHLH96 in trichome were comparable to those encoding key artemisinin synthetic enzymes (Fig. 5B). Moreover, AabHLH80, AabHLH96, AabHLH181, AaMyc-bHLH1, and AaMyc-bHLH3 were also highly expressed in bud and young leaf (Fig. 5C), consistent with patterns exhibited by genes encoding artemisinin synthetic enzymes, suggesting that the encoded bHLH TFs may be involved in artemisinin synthesis.
Comprehensive genome-wide detection of AabHLH TFs
Our research identified 226 AabHLH TFs in A. annua (Table 1), which slightly exceeds the 205 found in a previous study, likely due to differences in screening methods. Multiple sequence alignments of full-length AabHLH TF sequences showed that almost all TFs contained the classic bHLH domain, which is similar to domains in maize, tomato, and barley. Some TFs lacked the N-terminal basic motif; these TFs cannot bind DNA, and so are likely to play a negative regulatory role. For example, PAR1–PRE1 and PAR1–PIF4 heterodimers in Arabidopsis form a complex HLH/bHLH network regulating cell elongation and plant development in response to light and hormones; bHLH TF GhFP2 and HLH TF GhACE1 antagonistically regulate fibre elongation in cotton; and antagonistic HLH/bHLH TFs mediate brassinosteroid regulation of cell elongation and plant development in rice and Arabidopsis.
GO annotation analysis
AaHLH TFs are annotated with 8 conserved GO terms (Table 2), particularly dimerisation, DNA-binding, and transcription processes, consistent with typical functions of bHLH TFs [39, 40]. This family of TFs has been reported to be involved in iron ion regulation in tomato and Arabidopsis; here, 12 A. annua bHLH TFs were annotated with a GO term implicating a role in iron homeostasis. Other roles for AabHLH TFs suggested by GO annotations, such as in endosperm development and guard cell differentiation, have been reported in other plants [43,44,45,46], indicating that the functions of bHLH TFs from different species are conserved.
Further, AaHLH TFs without basic motifs were distributed across the conserved GO annotations (Supplemental Material 1 Table S2), suggesting a potential role for these TFs in feedback mechanisms with AabHLH TFs across a broad range of biological processes.
Potential function of AabHLH TFs in artemisinin biosynthesis and trichome development
AaMyc-bHLH3 has been reported to bind to the E-box motif of ADS and CYP71AV1 to positively regulate artemisinin biosynthesis in A. annua (annotated as AabHLH1 in the original report). AaMyc-bHLH3 is generally more highly expressed than other AabHLH TFs in young leaf, bud and trichome (Fig. 5B). Genes encoding AaMyc-bHLH1, AaMyc-bHLH3, AabHLH80, AabHLH181, and AabHLH96 showed similar expression patterns, suggesting that they may also be involved in trichome development and artemisinin regulation (Fig. 5B, C). Furthermore, AaMyc-bHLH1 and AaMyc-bHLH3 are also hub genes, which further reflects the important role of both in the growth and development of A. annua.
In the well-studied model Arabidopsis thaliana, trichome initiation is regulated by two protein complexes. The first one, the activator–depletion multimer GL1/MYB23-GL3/eGL3-TTG1, forms a MYB-bHLH-WD40 complex that binds to the GLABRA2 (GL2) promoter to positively regulate trichome development. The second one, the activator–inhibitor multimer MYB-bHLH-TTG, negatively regulates trichome formation by replacing the activator GL1/MYB23 with the inactive TRY/CPC-GL3 in a complex with eGL3-TTG1 [47, 48]. Previous studies in A. annua have identified a MYB23 homologue, AaTAR2, which encodes an R2R3 MYB TF expressed mainly in young leaves. Inhibition or overexpression of AaTAR2 resulted in decreased or increased artemisinin content in glandular secretory trichomes (GSTs), respectively, as well as changes in GST morphology. Another gene encoding an R2R3 MYB TF, AaMIXTA1, is mainly expressed in the basal cells of GSTs; again, its overexpression or inhibition resulted in an increase or decrease in GST numbers and artemisinin content in transgenic plants, respectively. While these MYB TFs have been identified, no related bHLH TFs have been reported to regulate trichome initiation and development, as would be expected if the process is conserved with other plants. This new bHLH TF resource can be used to guide further research to uncover the molecular mechanisms underlying GST development in A. annua, and to identify specific bHLHs that may be involved in regulatory complexes.
In conclusion, this comprehensive analysis of bHLH TFs provides a new resource to direct further analysis into key molecular mechanisms underlying and regulating artemisinin biosynthesis and trichome development, as well as other biological processes, in the key medicinal plant A. annua.
Defining A. annua bHLH TF amino acid sequences
The A. annua genome, protein database, and annotation files were downloaded from NCBI (National Center for Biotechnology Information), ID: PRJNA416223. A local protein database was constructed with NCBI BLAST software (ncbi-blast-2.9.0+-win64). An HMM (Hidden Markov Model) profile of the conserved HLH domain PF00010 was downloaded from http://Pfam.xfam.org/; this file was used as the seed for Hmmer software to run an HMMsearch against the local protein database (E-value 0.01). In parallel, 88 bHLH protein sequences from Arabidopsis were acquired from the TAIR (The Arabidopsis Information Resource) database (https://www.arabidopsis.org/); these bHLH sequences were also used as queries in a local BlastP search against the A. annua protein database (E-value 0.0001). The resulting sequences were combined, and redundant sequences removed with CD-HIT (http://www.bioinformiscs.org/CD-HIT/). The remaining 226 sequences were analysed with HMMscan (https://www.ebi.ac.uk/Tools/hmmer/search/hmmscan), and bHLH domains were determined with NCBI Conserved Domains (https://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi). Proteins containing Myc domains were identified by the presence of a PF00249 domain.
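The combination of the two searches can be sketched as a set union over hit identifiers filtered by the E-value cut-offs given above. This is an illustrative sketch only: it assumes HMMER's `--tblout` table and BLAST's tabular `-outfmt 6` as the output formats (the text does not state which were used), the helper names and protein identifiers are hypothetical, and a set union removes repeated identifiers only, whereas CD-HIT clusters by sequence similarity.

```python
def hmmer_hits(tblout_text, max_evalue=0.01):
    """Target IDs from HMMER --tblout text.

    Column 1 is the target name and column 5 the full-sequence E-value.
    """
    ids = set()
    for line in tblout_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split()
        if float(fields[4]) <= max_evalue:
            ids.add(fields[0])
    return ids

def blast_hits(outfmt6_text, max_evalue=1e-4):
    """Subject IDs from BLAST tabular (-outfmt 6) text.

    Column 2 is the subject ID and column 11 the E-value.
    """
    ids = set()
    for line in outfmt6_text.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        if float(fields[10]) <= max_evalue:
            ids.add(fields[1])
    return ids

# Made-up results for three hypothetical proteins.
hmm_text = (
    "# target name  accession  query name  accession  E-value  score  bias\n"
    "protA - HLH PF00010.29 1e-20 70.1 0.1\n"
    "protB - HLH PF00010.29 0.5 10.2 0.0\n"
)
blast_text = (
    "queryX\tprotA\t55.0\t60\t27\t0\t1\t60\t1\t60\t1e-15\t80.0\n"
    "queryX\tprotC\t40.0\t50\t30\t0\t1\t50\t1\t50\t1e-6\t45.0\n"
)

combined = sorted(hmmer_hits(hmm_text) | blast_hits(blast_text))
print(combined)
```

In this toy input, protB fails the HMM cut-off and protA is found by both searches, so it is counted once in the combined set.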
Analysis of AabHLH domains
The AabHLH sequences were aligned with MEGA software 6.06. Conserved amino acids were identified and characterised with Weblogo (http://weblogo.berkeley.edu/), while Swiss-Model (https://swissmodel.expasy.org/) was used to predict the three-dimensional structure of the bHLH domain.
The neighbour-joining phylogenetic tree of bHLH domain sequences from Arabidopsis (88) and A. annua (226) was constructed using Clustal X2 with a bootstrap test of 1000 replicates. MEGA 6.06 was used to modify the phylogenetic tree.
GO analysis of AabHLH TFs
As A. annua is not included in the standard Gene Ontology (GO) Database for Annotation, Visualization, and Integrated Discovery (DAVID), we individually analysed 226 AabHLH TFs with InterPro (http://www.ebi.ac.uk/interpro/) to determine GO terms associated with each protein.
PPI network construction and hub gene identification
To evaluate potential PPI relationships, the 226 AabHLH TFs were mapped to the STRING database (Search Tool for the Retrieval of Interacting Genes, http://string-db.org/), and PPI pairs with a combined score ≥ 0.4 were extracted. The PPI network was visualised with Cytoscape software (www.cytoscape.org/). CytoHubba, a Cytoscape plugin, was used to calculate the degree of connectivity for each protein node. The top ten genes were selected as hub genes.
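The degree-based ranking described above can be illustrated with a few lines of code. This is a sketch under stated assumptions rather than a reproduction of CytoHubba: the edge list is a toy example with hypothetical protein names, scores are assumed to be on a 0 to 1 scale, and ties are broken alphabetically for reproducibility.

```python
def node_degrees(edges, min_score=0.4):
    """Count distinct partners per protein over edges with score >= min_score."""
    neighbours = {}
    for a, b, score in edges:
        if score >= min_score and a != b:
            # The network is undirected, so record the edge in both directions.
            neighbours.setdefault(a, set()).add(b)
            neighbours.setdefault(b, set()).add(a)
    return {node: len(partners) for node, partners in neighbours.items()}

def top_hubs(degrees, n=10):
    """Return the n highest-degree nodes as (name, degree) pairs."""
    return sorted(degrees.items(), key=lambda kv: (-kv[1], kv[0]))[:n]

# Toy interaction list (protein A, protein B, combined score); invented data.
edges = [
    ("P1", "P2", 0.9),
    ("P1", "P3", 0.5),
    ("P2", "P3", 0.3),  # below the 0.4 threshold, so ignored
    ("P1", "P4", 0.7),
]

degrees = node_degrees(edges)
print(top_hubs(degrees, n=2))
```

Proteins with no edge above the threshold never enter the dictionary, which mirrors hiding disconnected nodes in the network view.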
Gene expression analysis
The A. annua “Huhao 1” line used in this article is a high artemisinin producer and has been cultured at Naval Medical University for several years. The seeds of A. annua were stored at 4 °C and germinated on Murashige and Skoog (MS) medium with 3% sucrose and 0.7% agar; plants with 2 leaves were then transferred to soil (black soil: vermiculite: perlite about 10:10:1) and cultivated in a greenhouse with a relative humidity of 70% and a photoperiod of 16-h light (23 °C) /8-h dark (20 °C). Roots were obtained from 10-day-old plants. Stem, leaves and bud were collected from 4-month-old plants as previously described. Total RNA was isolated with the TRIZOL Reagent (TRANS) from nine tissues collected from three independent plants: young leaf; old leaf; stem; root; epidermis; mature seed; flower; and trichomes. cDNA was synthesised from 4 mg of total RNA with Hifair® III reverse transcriptase (Hifair® III 1st Strand cDNA Synthesis Kit; YEASEN) according to the manufacturer’s instructions.
Quantitative reverse transcription (qRT)-PCR was performed using a QuantStudio 3 (Thermo Fisher Scientific) with the PerfectStart® Green qPCR SuperMix (TRANS). Actin (EU531837) was used as an internal control. For qRT-PCR assays, cDNA was denatured at 94 °C for 30 s, followed by 45 cycles of 95 °C for 5 s, 54 °C for 15 s, and 72 °C for 10 s. Assays were performed in triplicate. Primers used for qRT-PCR are listed in Supplemental Material 1 Table S3.
Analysis of AabHLH gene expression across tissues and stages of development
A. annua transcriptomics data were downloaded from NCBI (PRJNA416223). mRNA sequences were extracted with TBtools software, Salmon software was used to build the index, and TPM (transcripts per million, normalised for gene length) values were calculated. Results were imported into MEV4.9.0 software to generate heat maps and perform hierarchical clustering.
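For reference, the TPM measure mentioned above divides each transcript's read count by its length and rescales the resulting rates so that they sum to one million. The sketch below shows the arithmetic on made-up numbers; it uses plain transcript length for simplicity, whereas Salmon works with an effective length, so values from this sketch would differ slightly from Salmon's output.

```python
def tpm(counts, lengths):
    """Transcripts per million from read counts and transcript lengths.

    counts and lengths must be in the same transcript order.
    """
    # Length-normalised rate for each transcript (reads per base).
    rates = [count / length for count, length in zip(counts, lengths)]
    total = sum(rates)
    # Rescale so that all values in the sample sum to one million.
    return [rate / total * 1e6 for rate in rates]

# Made-up counts and lengths (in bases) for three transcripts.
counts = [100, 200, 300]
lengths = [1000, 2000, 1000]
values = tpm(counts, lengths)
print([round(v) for v in values])
```

The first two transcripts receive the same TPM despite different read counts, because the second is twice as long; this is the length normalisation that makes values comparable between genes within a sample.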
Seeds access and culture
A. annua is a widely grown plant. The seeds of the “Huhao 1” cultivar line were obtained from Shanghai Jiao Tong University, deposited in our university seed bank and are freely accessible for research. The seeds were preserved, cultivated, and propagated at Naval Medical University (30° N 121° E) from April to November (the natural growing season) according to standard local practice. The seed deposit information is as follows: ID: Huhao 1. Contact person: Prof Hexin Tan, Department of Pharmacy, Naval Medical University, 325 Guohe Road, Shanghai 200433, China, Email: firstname.lastname@example.org.
Availability of data and materials
Genome sequences were from (https://www.ncbi.nlm.nih.gov/Traces/wgs/PKPP01?display=contigs&page=1); Local protein database was from (https://www.ncbi.nlm.nih.gov/Traces/wgs/PKPP01?display=proteins&page=1); A. annua transcriptomics data was downloaded from NCBI (PRJNA416223), for young leaf RNA-seq data from (https://trace.ncbi.nlm.nih.gov/Traces/?view=run_browser&acc=SRR6472941&display=download); old leaf (https://trace.ncbi.nlm.nih.gov/Traces/?view=run_browser&acc=SRR6472942&display=download); stem (https://trace.ncbi.nlm.nih.gov/Traces/?view=run_browser&acc=SRR6472943&display=download); root (https://trace.ncbi.nlm.nih.gov/Traces/?view=run_browser&acc=SRR6472944&display=download); epidermis (https://trace.ncbi.nlm.nih.gov/Traces/?view=run_browser&acc=SRR6472945&display=download); bud (https://trace.ncbi.nlm.nih.gov/Traces/?view=run_browser&acc=SRR6472946&display=download); seed (https://trace.ncbi.nlm.nih.gov/Traces/?view=run_browser&acc=SRR6472947&display=download); flower (https://trace.ncbi.nlm.nih.gov/Traces/?view=run_browser&acc=SRR6472948&display=download); trichome (https://trace.ncbi.nlm.nih.gov/Traces/?view=run_browser&acc=SRR6472949&display=download).
Transcriptomic data of A. annua at various developmental stages and in different tissues were obtained from a previous study. All other data generated or analysed during this study are included in this published article or its supplemental material.
qRT-PCR: Quantitative reverse transcription polymerase chain reaction
Murre C. Helix–loop–helix proteins and the advent of cellular diversity: 30 years of discovery. Genes Dev. 2019;33(1–2):6–25.
Pires N, Dolan L. Origin and diversification of basic-helix-loop-helix proteins in plants. Mol Biol Evol. 2009;27(4):862–74.
Carretero-Paulet L, Galstyan A, Roig-Villanova I, Martínez-García JF, Bilbao-Castro JR, Robertson DL. Genome-wide classification and evolutionary analysis of the bHLH family of transcription factors in arabidopsis, poplar, rice, moss, and algae. Plant Physiol. 2010;153(3):1398–412.
De Martin X, Sodaei R, Santpere G. Mechanisms of binding specificity among bHLH transcription factors. Int J Mol Sci. 2021;22(17):9150.
Olsen KM, Dong H, Zhao H, Li S, Han Z, Hu G, et al. Genome-wide association studies reveal that members of bHLH subfamily 16 share a conserved function in regulating flag leaf angle in rice (Oryza sativa). PLoS Genet. 2018;14(4):e1007323.
Zhang Y, Mitsuda N, Yoshizumi T, Horii Y, Oshima Y, Ohme-Takagi M, et al. Two types of bHLH transcription factor determine the competence of the pericycle for lateral root initiation. Nat Plants. 2021;7(5):633–43.
Samira R, Li B, Kliebenstein D, Li C, Davis E, Gillikin JW, et al. The bHLH transcription factor ILR3 modulates multiple stress responses in Arabidopsis. Plant Mol Biol. 2018;97(4–5):297–309.
Guo J, Sun B, He H, Zhang Y, Tian H, Wang B. Current understanding of bHLH transcription factors in plant abiotic stress tolerance. Int J Mol Sci. 2021;22(9):4921.
Carey-Fung O, O’Brien M, Beasley JT, Johnson AAT. A Model to incorporate the bHLH transcription factor OsIRO3 within the rice iron homeostasis regulatory network. Int J Mol Sci. 2022;23(3):1635.
Zheng K, Wang Y, Wang S. The non-DNA binding bHLH transcription factor Paclobutrazol Resistances are involved in the regulation of ABA and salt responses in Arabidopsis. Plant Physiol Biochem. 2019;139:239–45.
Hao Y, Oh E, Choi G, Liang Z, Wang Z-Y. Interactions between HLH and bHLH Factors Modulate Light-Regulated Plant Development. Mol Plant. 2012;5(3):688–97.
Grandori C, Cowley SM, James LP, Eisenman RN. The Myc/Max/Mad Network and the Transcriptional Control of Cell Behavior. Annu Rev Cell Dev Biol. 2000;16(1):653–99.
Eisenman RN. Deconstructing Myc. Genes Dev. 2001;15(16):2023–30.
Amati B, Dalton S, Brooks MW, Littlewood TD, Evan GI, Land H. Transcriptional activation by the human c-Myc oncoprotein in yeast requires interaction with Max. Nature. 1992;359(6394):423–6.
Blackwood EM, Eisenman RN. Max: a helix-loop-helix zipper protein that forms a sequence-specific DNA-binding complex with Myc. Science. 1991;251(4998):1211–7.
Alex R, Sözeri O, Meyer S, Dildrop R. Determination of the DNA sequence recognized by the bHLH-zip domain of the N-Myc protein. Nucleic Acids Res. 1992;20(9):2257–63.
Macek P, Cliff MJ, Embrey KJ, Holdgate GA, Nissink JWM, Panova S, et al. Myc phosphorylation in its basic helix–loop–helix region destabilizes transient α-helical structures, disrupting Max and DNA binding. J Biol Chem. 2018;293(24):9301–10.
Grotewold E, Sainz MB, Tagliani L, Hernandez JM, Bowen B, Chandler VL. Identification of the residues in the Myb domain of maize C1 that specify the interaction with the bHLH cofactor R. Proc Natl Acad Sci. 2000;97(25):13579–84.
Xiao L, Tan H, Zhang L. Artemisia annua glandular secretory trichomes: the biofactory of antimalarial agent artemisinin. Sci Bull. 2016;61(1):26–36.
Yu R, Wen W. Artemisinin biosynthesis and its regulatory enzymes: Progress and perspective. Pharmacogn Rev. 2011;5(10):189.
Lu X, Shen Q, Zhang L, Zhang F, Jiang W, Lv Z, et al. Promotion of artemisinin biosynthesis in transgenic Artemisia annua by overexpressing ADS, CYP71AV1 and CPR genes. Ind Crop Prod. 2013;49:380–5.
Teoh KH, Polichuk DR, Reed DW, Covello PS. Molecular cloning of an aldehyde dehydrogenase implicated in artemisinin biosynthesis in Artemisia annua. Botany. 2009;87(6):635–42.
Ji Y, Xiao J, Shen Y, Ma D, Li Z, Pu G, et al. Cloning and characterization of AabHLH1, a bHLH transcription factor that positively regulates artemisinin biosynthesis in Artemisia annua. Plant Cell Physiol. 2014;55(9):1592–604.
Xiang L, Jian D, Zhang F, Yang C, Bai G, Lan X, et al. The cold-induced transcription factor bHLH112 promotes artemisinin biosynthesis indirectly via ERF1 in Artemisia annua. J Exp Bot. 2019;70(18):4835–48.
Zhang Q, Wu N, Jian D, Jiang R, Yang C, Lan X, et al. Overexpression of AaPIF3 promotes artemisinin production in Artemisia annua. Ind Crop Prod. 2019;138:111476.
Bailey PC, Martin C, Toledo-Ortiz G, Quail PH, Huq E, Heim MA, et al. Update on the basic helix-loop-helix transcription factor gene family in Arabidopsis thaliana. Plant Cell. 2003;15(11):2497–502.
Li X, Duan X, Jiang H, Sun Y, Tang Y, Yuan Z, et al. Genome-wide analysis of basic/helix-loop-helix transcription factor family in rice and Arabidopsis. Plant Physiol. 2006;141(4):1167–84.
Wang R, Zhao P, Kong N, Lu R, Pei Y, Huang C, et al. Genome-wide identification and characterization of the potato bHLH transcription factor family. Genes. 2018;9(1):54.
Mao K, Dong Q, Li C, Liu C, Ma F. Genome wide identification and characterization of Apple bHLH transcription factors and expression analysis in response to drought and salt stress. Front Plant Sci. 2017;7(1):28.
Zhang T, Lv W, Zhang H, Ma L, Li P, Ge L, et al. Genome-wide analysis of the basic Helix-Loop-Helix (bHLH) transcription factor family in maize. BMC Plant Biol. 2018;18(1):1–14.
Wang L, Xiang L, Hong J, Xie Z, Li B. Genome-wide analysis of bHLH transcription factor family reveals their involvement in biotic and abiotic stress responses in wheat (Triticum aestivum L.). 3 Biotech. 2019;9(6):1–12.
Shen Q, Zhang L, Liao Z, Wang S, Yan T, Shi P, et al. The Genome of Artemisia annua provides insight into the evolution of Asteraceae family and Artemisinin Biosynthesis. Mol Plant. 2018;11(6):776–88.
Tan H, Xiao L, Gao S, Li Q, Chen J, Xiao Y, et al. TRICHOME AND ARTEMISININ REGULATOR 1 is required for trichome development and artemisinin biosynthesis in artemisia annua. Mol Plant. 2015;8(9):1396–411.
Zhou Z, Tan H, Li Q, Li Q, Wang Y, Bu Q, et al. TRICHOME AND ARTEMISININ REGULATOR 2 positively regulates trichome development and artemisinin biosynthesis in Artemisia annua. New Phytol. 2020;228(3):932–45.
Sun H, Fan H-J, Ling H-Q. Genome-wide identification and characterization of the bHLH gene family in tomato. BMC Genomics. 2015;16(1):1–12.
Ke Q, Tao W, Li T, Pan W, Chen X, Wu X, et al. Genome-wide identification, evolution and expression analysis of basic Helix-loop-helix (bHLH) gene family in Barley (Hordeum vulgare L.). Curr Genom. 2020;21(8):624–44.
Lu R, Li Y, Zhang J, Wang Y, Zhang J, Li Y, et al. The bHLH/HLH transcription factors GhFP2 and GhACE1 antagonistically regulate fiber elongation in cotton. Plant Physiol. 2022;189(2):628–43.
Zhang L-Y, Bai M-Y, Wu J, Zhu J-Y, Wang H, Zhang Z, et al. Antagonistic HLH/bHLH transcription factors mediate brassinosteroid regulation of cell elongation and plant development in rice and Arabidopsis. Plant Cell. 2009;21(12):3767–80.
López-Vidriero I, Godoy M, Grau J, Peñuelas M, Solano R, Franco-Zorrilla JM. DNA features beyond the transcription factor binding site specify target recognition by plant MYC2-related bHLH proteins. Plant Commun. 2021;2(6):100232.
Ma PCM, Rould MA, Weintraub H, Pabo CO. Crystal structure of MyoD bHLH domain-DNA complex: Perspectives on DNA recognition and implications for transcriptional activation. Cell. 1994;77(3):451–9.
Ling H-Q, Bauer P, Bereczky Z, Keller B, Ganal M. The tomato fer gene encoding a bHLH protein controls iron-uptake responses in roots. Proc Natl Acad Sci. 2002;99(21):13938–43.
Long TA, Tsukagoshi H, Busch W, Lahner B, Salt DE, Benfey PN. The bHLH Transcription factor POPEYE regulates response to iron deficiency in Arabidopsis roots. Plant Cell. 2010;22(7):2219–36.
Kondou Y, Nakazawa M, Kawashima M, Ichikawa T, Yoshizumi T, Suzuki K, et al. RETARDED GROWTH OF EMBRYO1, a new basic helix-loop-helix protein, expresses in endosperm to control embryo growth. Plant Physiol. 2008;147(4):1924–35.
Feng F, Qi W, Lv Y, Yan S, Xu L, Yang W, et al. OPAQUE11 Is a Central hub of the regulatory network for maize endosperm development and nutrient metabolism. Plant Cell. 2018;30(2):375–96.
Pillitteri LJ, Torii KU. Breaking the silence: three bHLH proteins direct cell-fate decisions during stomatal development. BioEssays. 2007;29(9):861–70.
Liu T, Ohashi-Ito K, Bergmann DC. Orthologs of Arabidopsis thaliana stomatal bHLH genes and regulation of stomatal development in grasses. Development. 2009;136(13):2265–76.
Zhao M, Morohashi K, Hatlestad G, Grotewold E, Lloyd A. The TTG1-bHLH-MYB complex controls trichome cell fate and patterning through direct targeting of regulatory loci. Development. 2008;135(11):1991–9.
Yang C, Ye Z. Trichomes as models for studying plant cell differentiation. Cell Mol Life Sci. 2012;70(11):1937–48.
Shi P, Fu X, Shen Q, Liu M, Pan Q, Tang Y, et al. The roles of AaMIXTA1 in regulating the initiation of glandular trichomes and cuticle biosynthesis in Artemisia annua. New Phytol. 2018;217(1):261–76.
Prakash A, Jeffryes M, Bateman A, Finn RD. The HMMER web server for protein sequence similarity search. Curr Protoc Bioinformatics. 2017;60(1):3–15.
Caspermeyer J. MEGA evolutionary software re-engineered to handle today’s big data demands. Mol Biol Evol. 2016;33(7):1887.
Larkin MA, Blackshields G, Brown NP, Chenna R, McGettigan PA, McWilliam H, et al. Clustal W and Clustal X version 2.0. Bioinformatics. 2007;23(21):2947–8.
Chen C, Chen H, Zhang Y, Thomas HR, Frank MH, He Y, et al. TBtools: an integrative toolkit developed for interactive analyses of big biological data. Mol Plant. 2020;13(8):1194–202.
Patro R, Duggal G, Love MI, Irizarry RA, Kingsford C. Salmon provides fast and bias-aware quantification of transcript expression. Nat Methods. 2017;14(4):417–9.
Howe EA, Sinha R, Schlauch D, Quackenbush J. RNA-Seq analysis in MeV. Bioinformatics. 2011;27(22):3209–10.
We thank Dr. Natalie Betts for assistance with manuscript revision.
This work was supported by grants from the National Natural Science Foundation of China (82122072) and the Natural Science Foundation of Shanghai (21ZR1477800).
Ethics approval and consent to participate
All procedures were conducted in accordance with the guidelines.
Consent for publication
The authors declare that they have no competing interests.
Table S1. All Gene Ontology (GO) annotations in which AabHLH TFs are involved. Table S2. All AabHLH TFs are divided into different categories. Fig. S1. Protein–protein interaction (PPI) network. Fig. S2. Expression of genes encoding total AabHLH TFs across A. annua tissues and stages of development. Table S3. qRT-PCR primers.
cDNA sequences of 226 bHLH TFs.
gDNA sequences of 226 bHLH TFs.
226 TFs protein sequences.
Cite this article
Chang, S., Li, Q., Huang, B. et al. Genome-wide identification and characterisation of bHLH transcription factors in Artemisia annua. BMC Plant Biol 23, 63 (2023). https://doi.org/10.1186/s12870-023-04063-8