Identification and chromosome mapping of the foxtail millet HAT gene family
All previously reported HAT proteins of rice and Arabidopsis (covering the GNAT, MYST, P300/CBP, and TAF1 families) were used in extensive searches of public and proprietary transcript and genomic databases. A total of 24 HATs were identified in foxtail millet from the Yugu1 genome after excluding redundant genes (Additional file 1). In addition, the position and direction of transcription of each gene were determined on the foxtail millet chromosome pseudomolecules available on Phytozome (v12.1) (https://phytozome-next.jgi.doe.gov/), as shown in Fig. 1. The 24 foxtail millet HAT genes were distributed across all nine chromosomes: eight on chromosome 2; three each on chromosomes 1, 4, and 5; two each on chromosomes 6 and 9; and one each on chromosomes 3, 7, and 8 (Additional file 1, Fig. 1).
Additionally, we analyzed the physical and chemical properties of all HAT family genes and their encoded proteins in foxtail millet, including the number of amino acids, molecular weight (Mw), isoelectric point (pI), and subcellular location. The sizes of the 24 predicted SiHAT proteins ranged from 425 aa (SiHAT9) to 5068 aa (SiHAT5), with molecular weights ranging from 38.08 kDa (SiHAT16) to 563.55 kDa (SiHAT5). The pI values ranged from 4.96 (SiHAT20) to 9.86 (SiHAT24). SiHAT21, SiHAT17, and SiHAT5 were determined to be neutral proteins (–0.5 < GRAVY value < 0.5), whereas the GRAVY values of the remaining proteins were < 0, indicating hydrophilic properties. Subcellular localization prediction showed that most of the HAT proteins were located in the nucleus, while two were localized to the mitochondria (SiHAT6 and SiHAT24), and two were cytoplasmic (SiHAT11 and SiHAT21). Only one protein (SiHAT5) was localized to the endoplasmic reticulum. More detailed information, including sequence, aliphatic index, instability index, and subcellular localization, is listed in Additional file 1. Prediction of the secondary structure of SiHAT proteins indicated that every member contained α-helix, extended strand, β-turn, and random coil structures. The random coil and α-helix structures were the main secondary components, accounting for 30–50% of the secondary structure, while β-turns accounted for only about 5% (Additional file 1).
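The GRAVY index mentioned above is the mean Kyte-Doolittle hydropathy score over all residues of a protein. The following is a minimal sketch of that calculation, not the tool used in this study (which is not named in this section). The function names `gravy` and `hydropathy_class` are hypothetical, and the ±0.5 band for "neutral" is taken from the threshold quoted in the text.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Return the grand average of hydropathy for a protein sequence.

    Non-standard characters (e.g. X or *) are ignored.
    """
    residues = [aa for aa in sequence.upper() if aa in KYTE_DOOLITTLE]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    return sum(KYTE_DOOLITTLE[aa] for aa in residues) / len(residues)


def hydropathy_class(value, band=0.5):
    """Label a GRAVY value; the +/-0.5 band is an assumption from the text."""
    if -band < value < band:
        return "neutral"
    return "hydrophilic" if value < 0 else "hydrophobic"


if __name__ == "__main__":
    # Toy peptide, not a SiHAT sequence.
    score = gravy("MKRSDE")
    print(round(score, 3), hydropathy_class(score))
```

A negative score indicates that hydrophilic residues dominate, which is the pattern reported for most SiHAT proteins.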
A chromosome region containing two or more genes within 200 kb is defined as a tandem duplication event [35]. Homology analysis of SiHATs showed that there were two tandem duplication events in the foxtail millet chromosome sequences: one involving SiHAT9 and SiHAT10 on chromosome 2 and the other involving SiHAT17 and SiHAT18 on chromosome 5 (Fig. 1).
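The positional part of this criterion can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used in the study: the function name `tandem_clusters` is hypothetical, distance is measured between gene start coordinates, neighbouring genes are chained (so a cluster of three or more genes may span more than 200 kb in total), and the homology check is left out. The gene names and coordinates in the usage example are made up.

```python
from collections import defaultdict


def tandem_clusters(genes, max_gap=200_000):
    """Group genes whose neighbouring start positions lie within max_gap bp.

    genes: iterable of (name, chromosome, start) tuples.
    Returns a list of (chromosome, [gene names]) with at least two genes each.
    """
    by_chrom = defaultdict(list)
    for name, chrom, start in genes:
        by_chrom[chrom].append((start, name))

    clusters = []
    for chrom in sorted(by_chrom):
        current = []        # genes in the cluster being built
        last_start = None   # start of the previous gene on this chromosome
        for start, name in sorted(by_chrom[chrom]):
            # A gap larger than max_gap closes the current cluster.
            if current and start - last_start > max_gap:
                if len(current) >= 2:
                    clusters.append((chrom, current))
                current = []
            current.append(name)
            last_start = start
        if len(current) >= 2:
            clusters.append((chrom, current))
    return clusters


if __name__ == "__main__":
    # Hypothetical genes and coordinates, for illustration only.
    example = [
        ("geneA", "chr2", 1_000_000),
        ("geneB", "chr2", 1_150_000),
        ("geneC", "chr2", 5_000_000),
        ("geneD", "chr5", 100_000),
        ("geneE", "chr5", 250_000),
        ("geneF", "chr5", 900_000),
    ]
    print(tandem_clusters(example))
```

In practice, candidate clusters found this way would still need to be confirmed as homologous before being counted as tandem duplicates.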
Phylogenetic analysis, motif composition, and structure analysis of SiHATs
Neighbor-joining phylogenetic analysis was performed, and the 24 SiHAT proteins were clustered into Groups I, II, III, and IV, with 12, 3, 5, and 4 members, respectively (Fig. 2a). Notably, most SiHAT proteins fell into sister pairs (SiHAT3 and SiHAT20, SiHAT9 and SiHAT22, SiHAT12 and SiHAT14), triplets (SiHAT8, SiHAT10, and SiHAT24) or quadruplets (SiHAT2, SiHAT15, SiHAT4, and SiHAT18) in the joint phylogenetic tree (Fig. 2a).
To obtain more insight into gene evolution, the exon-intron organization of SiHAT genes was investigated by aligning predicted coding sequences (CDS) against the corresponding genomic sequences using the online Gene Structure Display Server (GSDS). The number of introns in the SiHAT family ranged from 2 to 23. Overall, highly similar gene structures and domains were observed within each of the four HAT subfamilies. However, SiHAT18 and SiHAT14 lacked both the upstream and downstream untranslated regions (UTRs), and SiHAT24 lacked the upstream UTR. The other 21 genes had both upstream and downstream UTRs (Fig. 2b, Additional file 2). Notably, the closest members within the same subgroup had highly similar intron/exon structures (intron number and exon length; Fig. 2b).
To further study the characteristic regions of SiHAT proteins, the motifs of the 24 SiHAT proteins were analyzed using Multiple Expectation Maximization for Motif Elicitation (MEME). The results showed that the 12 SiHAT genes in Group I belonged to the GNAT family and possessed the bromodomain (Fig. 2c). The genes in Group II each contained the motifs of a different family: CBP (ZnF_TAZ), MYST (PLN00104), and TAF1 (Bromo_AAA). The quadruplet genes in Group III belonged to the CBP family and contained the typical HAT_KAT11, PHD_SF, zf-TAZ, and ZZ domains. The five genes in Group IV belonged to the GNAT, CBP, and TAF families. Several HAT proteins possessed unique conserved domains, such as ELP3 in SiHAT21, and the HAT1 chromodomain and Znf-C2H2 in the GNAT/MYST family. PHD (Plant Homeodomain), Znf-ZZ, and Znf-TAZ domains were observed in the CBP family. The proteins of these four subfamilies were also compared with typical GNAT/CBP/TAF/MYST conserved domains in Arabidopsis and O. sativa (Additional files 3, 4, 5 and 6). These conserved domains may allow SiHATs to interact with RNA Pol II during transcript elongation, bind the transactivation domains of transcription factors and acetylated histone lysine residues, and interact with co-factors (Additional file 2). Interestingly, the sister-pair genes also shared the same structure and conserved domains, indicating that they may have the same function in foxtail millet.
Phylogenetic relationship and collinearity analysis of HATs in Setaria italica, Oryza sativa, and Arabidopsis thaliana
To better understand the phylogeny of the foxtail millet HAT gene family, the SiHATs were subjected to synteny analysis with the HAT genes of two typical model plants: the dicot Arabidopsis thaliana and the monocot Oryza sativa. A total of 19 SiHAT genes were syntenic with genes in O. sativa; thus, a phylogenetic tree was constructed using the protein sequences of all 24 SiHATs, 19 OsHATs, and 12 AtHATs. These 55 HAT proteins were divided into four clades (Fig. 3a), which showed the phylogenetic relationship of HAT proteins between dicots and monocots. Apart from Clade II, which was unique to foxtail millet and contained 11 HAT gene family members, the clades included HAT proteins from all three species, suggesting that these genes existed before the divergence of monocots and dicots. Clade I was further subdivided into four classes, namely a, b, c, and d, with 4, 3, 5, and 4 members, respectively. Clade III was further divided into classes e, f, and g (Fig. 3a). Further, we analyzed the 11 SiHATs in Clade II and found that they all belonged to the GNAT family, possessed a Bromo domain, and had structures clearly different from those of the family members in other clades (Additional files 3 and 7).
The ratio of non-synonymous to synonymous (Ka/Ks) nucleotide substitutions was calculated to investigate the selection pressure on SiHATs [36]. The SiHAT genes of foxtail millet were subjected to one-to-one orthologous analysis with their homologous genes in Arabidopsis thaliana and Oryza sativa (Fig. 3b, Additional file 8). In total, 19 SiHAT genes displayed a syntenic relationship with genes in Oryza sativa, while only two homologous pairings were found with Arabidopsis thaliana. These results indicated that the foxtail millet and rice HAT genes were genetically similar. We also found that some SiHAT genes shared a common origin with their homologs in rice and Arabidopsis, respectively (Fig. 3b, Additional file 8).
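As a reminder of how Ka/Ks values are conventionally read, a ratio below 1 suggests purifying selection, a ratio near 1 suggests neutral evolution, and a ratio above 1 suggests positive selection. The sketch below encodes that convention only; it does not estimate Ka or Ks from sequences. The function name `selection_mode` and the `tolerance` band around 1 are assumptions made for illustration, and the input values are invented.

```python
def selection_mode(ka, ks, tolerance=0.05):
    """Return (Ka/Ks, label) using the conventional interpretation.

    tolerance is a hypothetical band around 1.0 treated as neutral.
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to compute Ka/Ks")
    ratio = ka / ks
    if ratio > 1 + tolerance:
        label = "positive selection"
    elif ratio < 1 - tolerance:
        label = "purifying selection"
    else:
        label = "neutral evolution"
    return ratio, label


if __name__ == "__main__":
    # Invented Ka and Ks values for one hypothetical orthologous pair.
    ratio, label = selection_mode(ka=0.12, ks=0.60)
    print(f"Ka/Ks = {ratio:.2f}: {label}")
```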
Cis-element analysis of SiHAT promoters
To further investigate the putative functions of SiHAT genes, the promoter regions 2000 bp upstream of the transcription initiation site of each SiHAT gene were searched against the plant promoter database PlantCARE. As shown in Fig. 4 and Additional file 9, three main categories of cis elements were found in the promoter sequences of SiHAT genes. The first category was involved in phytohormone responses, such as those to abscisic acid (ABA), methyl jasmonate (MeJA), auxin, and salicylic acid (SA). The second category was associated with stresses, such as anaerobic induction, drought inducibility, low-temperature responsiveness, pathogen infection, wound responsiveness, and salt inducibility. The last category was related to plant growth and development, such as zein metabolism regulation. Elements related to meristem, root, and endosperm expression were found in almost all gene promoters. Importantly, all 24 SiHAT genes contained light-responsive elements, while the MeJA-responsive elements (TGACG-motif and CGTCA-motif), the anoxic-specific inducibility element (GC-motif), and the abscisic acid responsive element (ABRE) were found in almost all gene promoters. Interestingly, the cis elements differed distinctly between sister-pair genes: GA- and MeJA-responsive elements were found in the promoter of SiHAT3, whereas ABA-responsive and defense and stress response elements were found in the promoter of SiHAT20. These results suggest that SiHATs may affect hormone signal responsiveness, stress adaptation, and development. No cytokinin-responsive elements were identified in these promoter regions.
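Extracting a 2000-bp upstream window depends on the strand of the gene: for a minus-strand gene the promoter lies at higher coordinates and must be reverse-complemented. The following is a minimal sketch of that step, assuming 1-based coordinates and a chromosome held as a plain string. The function name `promoter_sequence` is hypothetical and this is not the extraction procedure reported by the study.

```python
# Translation table used to complement DNA bases (case preserved).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def promoter_sequence(chrom_seq, tss, strand, length=2000):
    """Return up to `length` bp upstream of a transcription start site.

    chrom_seq: chromosome sequence as a string.
    tss: 1-based position of the transcription start site.
    strand: "+" or "-".
    The result is always given 5'->3' relative to the gene, and it is
    truncated if the window runs past a chromosome end.
    """
    if strand == "+":
        start = max(1, tss - length)   # 1-based, inclusive
        end = tss - 1
        return chrom_seq[start - 1:end]
    if strand == "-":
        start = tss + 1
        end = min(len(chrom_seq), tss + length)
        region = chrom_seq[start - 1:end]
        # Reverse complement so the sequence reads in gene orientation.
        return region.translate(_COMPLEMENT)[::-1]
    raise ValueError("strand must be '+' or '-'")


if __name__ == "__main__":
    # Toy 10-bp "chromosome" with a 3-bp window, for illustration only.
    toy = "ACGTTGCAAT"
    print(promoter_sequence(toy, tss=5, strand="+", length=3))
    print(promoter_sequence(toy, tss=5, strand="-", length=3))
```

The resulting sequences could then be submitted to a cis-element database such as PlantCARE.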
Spatial and temporal expression of SiHAT genes
To obtain insight into the expression patterns of SiHAT genes in various tissues, a heat map was generated using the gene expression data in the foxtail millet Exp database. The results showed complex specific and overlapping SiHAT expression in various tissues and organs. The expression level of the same gene varied among tissues and organs; for example, SiHAT17 was highly expressed in top leaves 2–3 days after the heading stage, whereas low or no expression signals were detected in panicles. On the other hand, the expression levels of different genes also differed notably within the same tissues and organs. For example, in the root at the filling stage, the expression of SiHAT3, SiHAT9, SiHAT13, and SiHAT22 from the GNAT gene family was significantly higher than that of other genes. Some genes were exclusively expressed in single tissues or organs; for example, SiHAT17 was expressed in top leaves 2–3 days after heading, and SiHAT16 expression was observed in immature ears. In addition, SiHAT3, SiHAT13, and SiHAT22 were highly expressed in all tissues at different developmental stages, whereas two genes, SiHAT12 and SiHAT15, exhibited almost no expression in any of the tested tissues (Fig. 5). These results demonstrated that the expression patterns of SiHATs differed among tissues and were associated with plant growth and development.
Expression analysis of SiHATs under stress
To determine whether the expression of SiHAT genes is regulated by abiotic and biotic stress, we first examined the effects of abiotic stresses, namely nitrate deficiency, phosphate deficiency, salt-alkali stress, and drought.
Under low nitrate conditions, the expression of most SiHAT genes was either slightly upregulated or slightly downregulated. Most genes were downregulated after 2 h, whereas the opposite trend was seen at 24 h. The expression of several genes (SiHAT17, SiHAT8, and SiHAT5) was clearly different between shoots and roots. SiHAT3 expression was continuously upregulated under low nitrate conditions in the shoot, but was only upregulated at 2 h in the root (Fig. 6). This suggests that SiHAT3 likely performs different control functions during nitrate absorption and transport.
Most SiHAT genes were not strongly upregulated under low phosphate conditions, while five SiHAT genes (SiHAT1, SiHAT6, SiHAT7, SiHAT19, and SiHAT21) were strongly upregulated in roots and weakly upregulated in shoots. In contrast, SiHAT17 was highly expressed in the shoot, and its expression pattern showed a sharp decline initially, then a gradual increase until returning to its original level at 24 h (Fig. 7).
Previous studies have reported that the expression of HAT genes was induced by drought [37]. Therefore, we analyzed the expression of the 24 SiHATs in our RNA-seq data. The results showed that except for SiHAT15, SiHAT3, SiHAT21, and SiHAT18, the SiHAT genes responded to drought stress with different expression patterns. Nine genes (SiHAT1, SiHAT5, SiHAT7, SiHAT8, SiHAT9, SiHAT10, SiHAT13, SiHAT19, and SiHAT22) were upregulated, while SiHAT17 and SiHAT15 were downregulated under drought conditions. SiHAT9 was highly expressed in drought-sensitive foxtail millet varieties. In response to circadian and drought treatments, SiHAT3, SiHAT9, SiHAT13, and SiHAT20 showed higher expression levels under dark conditions, whereas five other genes (SiHAT2, SiHAT8, SiHAT11, SiHAT23, and SiHAT24) showed lower expression (Fig. 8).
Under salt and alkali stress, SiHAT6 was slightly upregulated, while the other genes were significantly downregulated. The transcript levels of most genes were lower at the germination stage than at the two-leaf-one-heart stage. Only SiHAT6 and SiHAT19 showed high expression at the germination stage. Additionally, sensitive and resistant varieties showed differences in gene expression. At the T2 stage, the expression of nine genes (SiHAT2, SiHAT3, SiHAT4, SiHAT5, SiHAT8, SiHAT10, SiHAT14, SiHAT18, and SiHAT20) was higher in the salt-resistant variety than in the sensitive variety, while the expression of three genes (SiHAT13, SiHAT17, and SiHAT19) showed the opposite trend (Fig. 9). SiHAT12 and SiHAT15 were the only two genes that showed little or no expression under all conditions (Figs. 6, 7, 8 and 9).
SiHATs involved in Sclerospora graminicola infection
The oomycete S. graminicola (Sacc.) causes 20–30% of downy mildew cases in foxtail millet cultivated in China and results in the deterioration of yield and quality [37]. It is also prevalent in India, Japan, and Russia [37, 38]. We investigated the expression of SiHATs in our transcriptome sequencing data obtained after S. graminicola infection. Four SiHAT genes responded to the infection, and their expression patterns differed. At the three-leaf-one-heart stage, SiHAT16 and SiHAT24 expression was downregulated in the pathogen-resistant variety but upregulated in the sensitive variety, while SiHAT6 and SiHAT17 expression was upregulated in both the resistant and sensitive varieties. At the five-leaf-one-heart stage, SiHAT6 and SiHAT24 expression was upregulated after infection, and there was no difference between the sensitive and resistant varieties. At the seven-leaf-one-heart stage, the expression of all four genes was downregulated in the pathogen-resistant variety; however, SiHAT16 and SiHAT17 expression was upregulated in the sensitive variety. The other genes showed little or no expression (Fig. 10).