Identification of FtAP2/ERF genes in tartary buckwheat
All candidate AP2/ERF genes were retrieved from the tartary buckwheat genome using the two BLAST methods. Because the buckwheat genome was sequenced using a whole-genome shotgun strategy, some of these AP2/ERF genes may be redundant even though they are located on different scaffolds. After removing the redundant sequences and alternative forms of the same gene, 134 potential AP2/ERF proteins were identified and renamed based on their chromosomal location (Additional file 2: Table S1).
Gene characteristics included the coding sequence (CDS) length, protein molecular weight (MW), isoelectric point (pI) and subcellular localization. Of the 134 FtAP2/ERF proteins, FtPinG0003194700.01 was the smallest, with 71 amino acids (213 bp CDS), and FtPinG0003183000.01 was the largest, with 641 amino acids (1923 bp CDS) (Additional file 1). The MW of the proteins ranged from 8.09 to 71.06 kDa, and the pI ranged from 4.62 (FtPinG0005080800.01) to 12.01 (FtPinG0007485600.01). The predicted subcellular localization results showed that 111 FtAP2/ERF proteins were located in the nucleus, 13 in the chloroplast, 5 in the mitochondrion, 4 in the cytoplasm, and 1 in the plasma membrane (Additional file 2: Table S1).
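The length figures above are linked by a simple relation: a protein of n residues corresponds to 3n coding nucleotides when the stop codon is not counted. The following minimal Python sketch checks the two extremes and the localization tally against the totals reported in the text. The function name is hypothetical, and the assumption that the reported CDS lengths exclude the stop codon is ours.

```python
def cds_length_bp(n_residues, include_stop=False):
    """Coding-sequence length in bp for a protein of n_residues amino acids.

    Assumption: the reported CDS lengths exclude the stop codon.
    """
    codons = n_residues + (1 if include_stop else 0)
    return 3 * codons

# Protein lengths reported in the text (amino acids).
smallest_aa, largest_aa = 71, 641
print(cds_length_bp(smallest_aa), cds_length_bp(largest_aa))

# Predicted subcellular localization counts reported in the text.
localization = {
    "nucleus": 111,
    "chloroplast": 13,
    "mitochondrion": 5,
    "cytoplasm": 4,
    "plasma membrane": 1,
}
total_proteins = sum(localization.values())
print(total_proteins)
```

The tally confirms that the five localization classes account for all 134 proteins.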
Multiple sequence alignment, phylogenetic analysis, and classification of FtAP2/ERF genes
The phylogenetic relationships of FtAP2/ERF proteins were examined by multiple sequence alignment of the AP2 domain, comprising approximately 60–70 amino acids, and the B3 domain, consisting of 100–120 residues. The sequence alignment of all AP2/ERF proteins showed that the YRG (amino acids 7 to 9), LG (amino acids 59 and 60), AA (amino acids 68 and 69) and YD (amino acids 72 and 73) elements were highly conserved (Additional file 3: Figure S1). The WLG element (amino acids 58 to 60) was more highly conserved in the ERF family and RAV family than in the AP2 family. In the AP2 family, the WLG element was replaced by a YLG element at the same positions. These conserved amino acid profiles may contribute to the classification of AP2/ERF genes in other species. Comparing protein structures can help predict protein function [5].
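The positional conservation described above can be checked on an aligned AP2 domain with a few lines of Python. This is only a sketch: the element positions are taken from the text, while the function names and the toy sequences (filled with placeholder "X" residues) are hypothetical and do not represent real FtAP2/ERF proteins.

```python
# 1-based (start, end) alignment positions of the conserved elements,
# as reported in the text.
CONSERVED_ELEMENTS = {"YRG": (7, 9), "LG": (59, 60), "AA": (68, 69), "YD": (72, 73)}

def conserved_hits(aligned_domain):
    """Return {element: True/False} for each conserved element."""
    return {
        element: aligned_domain[start - 1:end] == element
        for element, (start, end) in CONSERVED_ELEMENTS.items()
    }

def residues_58_to_60(aligned_domain):
    """Residues at alignment positions 58-60 (WLG in ERF/RAV, YLG in AP2)."""
    return aligned_domain[57:60]

def make_toy_domain(residue_58):
    """Build a placeholder 73-residue domain carrying only the conserved elements."""
    seq = ["X"] * 73
    for element, (start, end) in CONSERVED_ELEMENTS.items():
        seq[start - 1:end] = list(element)
    seq[57] = residue_58
    return "".join(seq)

erf_like = make_toy_domain("W")  # hypothetical ERF/RAV-like domain
ap2_like = make_toy_domain("Y")  # hypothetical AP2-like domain
print(conserved_hits(erf_like))
print(residues_58_to_60(erf_like), residues_58_to_60(ap2_like))
```

On a real alignment, gap columns would shift the coordinates, so the positions would need to be mapped onto the alignment columns first.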
To explore the phylogenetic relationships of AP2/ERF proteins in tartary buckwheat, we constructed a phylogenetic tree using the neighbor-joining (NJ) method based on multiple sequence alignments of 166 A. thaliana AP2/ERF proteins and 134 tartary buckwheat AP2/ERF proteins. The phylogenetic distribution showed that AP2/ERF genes were divided into three major categories: AP2, ERF, and RAV (Fig. 1). Among the 134 candidate FtAP2/ERF genes, 15 genes containing two AP2/ERF domains were assigned to the AP2 family; 116 genes encoding proteins with a single AP2/ERF domain belonged to the ERF family; and only 3 genes encoding a single AP2/ERF domain and a B3 domain were assigned to the RAV family (Fig. 1). Interestingly, FtPinG0007082400 was also found to encode two AP2/ERF domains, but it was distinct from the AP2 family members and instead clustered in the ERF family.
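The domain-based assignment rule described above can be written as a small decision function. The sketch below is illustrative only: the function name is hypothetical, and, as the FtPinG0007082400 case shows, placement in the phylogenetic tree can differ from what domain counts alone would suggest.

```python
def classify_by_domains(n_ap2_domains, has_b3):
    """Assign a family from domain composition alone.

    Rule as described in the text: one AP2/ERF domain plus a B3 domain -> RAV,
    two AP2/ERF domains -> AP2, a single AP2/ERF domain -> ERF.
    """
    if n_ap2_domains == 1 and has_b3:
        return "RAV"
    if n_ap2_domains >= 2:
        return "AP2"
    if n_ap2_domains == 1:
        return "ERF"
    return "unclassified"

# Family sizes reported in the text.
family_sizes = {"AP2": 15, "ERF": 116, "RAV": 3}
print(classify_by_domains(2, False), classify_by_domains(1, False), classify_by_domains(1, True))
print(sum(family_sizes.values()))
```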
Gene structure and motif composition of the FtAP2/ERF gene family
By comparing the genomic DNA sequences of FtAP2/ERF genes, we obtained their intron and exon structures to further understand their structural composition (Fig. 2b). The number of exons in the tartary buckwheat AP2/ERF genes ranged from 1 to 9 (Fig. 2b). Except for FtPinG0007082400, which had four exons, the members of the AP2 subfamily contained more than 7 exons. Moreover, the number of exons was conserved in the AP2 family, although the exon positions varied. Most members of the RAV subfamily and ERF subfamily contained only one exon, and the AP2 domain was located in the exon region (Fig. 2b). In general, the closest members from the same subfamily had similar exon/intron structures in terms of intron number and exon length. Further analysis showed that FtAP2/ERF proteins contained, at most, two characteristic regions (Fig. 2b). The N-terminal region of all FtAP2/ERF proteins had a highly conserved AP2 region of approximately 60–70 amino acid residues corresponding to the DNA binding region, and the RAV subfamily also contained the B3 region composed of 100–120 amino acids. In general, many conserved motifs can be detected in transcription factor protein sequences, and these may be involved in activating gene expression as potential DNA binding sites.
The motifs of the 134 FtAP2/ERF proteins were analyzed using the online MEME software to further study their characteristic regions (Additional file 4: Table S2). According to the results of the MEME motif analysis, a schematic diagram was constructed to characterize the structure of FtAP2/ERF proteins. A total of 10 conserved motifs were found in the FtAP2/ERF proteins (Fig. 2c). Motif-1, Motif-2, Motif-3, Motif-4, and Motif-7 were found in the AP2 domain regions, of which Motif-1, Motif-2, Motif-3, and Motif-4 were detected in almost all AP2/ERF proteins. All ERF subfamily genes contained Motif-1, Motif-2, Motif-3, and Motif-4; Motif-8 was detected in 17 ERF genes, Motif-7 in 20 ERF genes, Motif-10 in 5 ERF genes, and Motif-9 in only 3 ERF genes. In the AP2 subfamily, 11 genes contained Motif-1 to Motif-6 and 3 genes contained Motif-1, Motif-2, Motif-3, Motif-4, and Motif-9. The similarity in motif composition within a subfamily indicated a conserved protein structure for that subfamily, and the functions of these conserved motifs remain to be elucidated. The conserved motif composition and gene structure within each subfamily were similar, supporting the reliability of the group classification in the phylogenetic tree.
Chromosomal distribution and gene duplication and synteny analysis of FtAP2/ERF genes
Chromosome mapping of FtAP2/ERF genes was performed using the latest tartary buckwheat genome database. A total of 134 AP2/ERF TFs were unevenly distributed on the eight tartary buckwheat chromosomes (Fig. 3). The largest numbers of AP2/ERF TFs were found on chromosomes 6 and 8 (23 and 21, respectively), while chromosome 4 had the smallest number (11 genes). We found only ERF subfamily members on chromosomes 5 and 7, an absence of AP2 subfamily members on chromosomes 4, 5 and 7, and three RAV subfamily members distributed on chromosomes 2, 4 and 6 (Fig. 3). Interestingly, some transcription factors with similar conserved sequences were located on the same chromosome. Similar patterns have been found in the A. thaliana [9], Vitis vinifera [39] and Chinese cabbage [32] genomes, and were thought to represent homologous fragments caused by ancestral polyploidy events.
In addition, we analyzed the duplication events of AP2/ERF genes in the tartary buckwheat genome, since gene duplication plays an important role in the emergence of novel functions and in gene family expansion. A chromosomal region of up to 200 kb containing two or more genes was defined as a tandem duplication event. Twelve FtAP2/ERF genes were clustered into six tandem duplication regions on tartary buckwheat linkage groups (LGs) 6 and 8 (Fig. 3). LG8 had four clusters, indicating a hot spot of FtAP2/ERF gene distribution. A pair of tandemly duplicated genes (FtPinG0004790900.1 and FtPinG0004790700.1) located on LG8 contained motifs different from those of the other clustered genes. In addition to tandem duplications, many pairs of segmental duplications were found in the tartary buckwheat chromosomes (Fig. 4). Analysis of homologous protein families is of great significance for establishing the kinship of species and predicting the function of new protein sequences. Many homologous genes were present on different chromosomes in tartary buckwheat, supporting the high conservation of the AP2/ERF gene family (Fig. 4). In brief, based on the above results, some FtAP2/ERF genes may have arisen through gene duplication, and these duplication events were a main driving force of FtAP2/ERF evolution.
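The 200 kb criterion for tandem duplication can be sketched as a simple window scan over genes sorted by position. The code below is a minimal illustration under stated assumptions: distances are measured between gene start coordinates, each cluster spans at most 200 kb from its first member, and the gene names and coordinates are invented for the example rather than taken from the tartary buckwheat annotation.

```python
from collections import defaultdict

WINDOW_BP = 200_000  # window size stated in the text

def tandem_clusters(genes, window=WINDOW_BP):
    """Group genes on the same chromosome that fall within one window.

    genes: iterable of (gene_id, chromosome, start_bp).
    Returns a list of clusters (lists of gene ids) with at least two members.
    """
    by_chrom = defaultdict(list)
    for gene_id, chrom, start in genes:
        by_chrom[chrom].append((start, gene_id))

    clusters = []
    for chrom in sorted(by_chrom):
        current = []  # (start, gene_id) pairs in the open window
        for start, gene_id in sorted(by_chrom[chrom]):
            # Close the window once a gene lies beyond it.
            if current and start - current[0][0] > window:
                if len(current) >= 2:
                    clusters.append([g for _, g in current])
                current = []
            current.append((start, gene_id))
        if len(current) >= 2:
            clusters.append([g for _, g in current])
    return clusters

# Hypothetical genes and coordinates, for illustration only.
toy_genes = [
    ("geneA", "LG8", 100_000),
    ("geneB", "LG8", 250_000),
    ("geneC", "LG8", 900_000),
    ("geneD", "LG6", 50_000),
    ("geneE", "LG6", 60_000),
]
print(tandem_clusters(toy_genes))
```

In practice a sequence similarity criterion is usually combined with the distance criterion, so that only homologous neighbors are counted as tandem duplicates.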
Evolutionary analysis of FtAP2/ERF genes and several different species
To deduce the evolutionary relationships of AP2/ERF genes, phylogenetic tree analysis was performed for seven dicotyledonous plants (A. thaliana, Beta vulgaris, Glycine max, Solanum lycopersicum, Vitis vinifera, Helianthus annuus and tartary buckwheat) and a monocotyledonous plant, Oryza sativa. The AP2/ERF family of tartary buckwheat contained three subfamilies: AP2, ERF and RAV. To explore the evolutionary relationship of each gene, a phylogenetic tree analysis was performed between each subfamily of tartary buckwheat and the members of the same subfamily in other plants. Simultaneously, the motifs of the corresponding member proteins were determined.
As shown in Fig. 5a, most members of the tartary buckwheat AP2 subfamily clustered with those of Beta vulgaris (5 members), followed by A. thaliana (4 members), Solanum lycopersicum (3 members), and Glycine max (2 members). A total of 10 conserved motifs were detected in the protein sequences of AP2 subfamily members in all plants (Fig. 5b). Almost all members contained Motif-1, Motif-2, Motif-4, Motif-5 and Motif-7. Additionally, closely related AP2 members from different plants had the same motif composition. Based on previous studies, we performed a syntenic analysis of AP2 genes in six dicotyledonous plants (tartary buckwheat, A. thaliana, Beta vulgaris, Glycine max, Solanum lycopersicum and Vitis vinifera) and a monocotyledonous plant, Oryza sativa, to infer the evolutionary origin of AP2 genes. The AP2 subfamily genes in tartary buckwheat were homologous to those of the reference plants, and the greatest syntenic conservation was observed with Glycine max (18 orthologous gene pairs distributed on LG1, LG4, LG6, LG7, LG8, LG9, LG12, LG14, LG16, LG17 and LG18), Vitis vinifera (10 orthologous gene pairs distributed on LG1, LG4, LG6, LG9, LG11 and LG18), and Solanum lycopersicum (9 orthologous gene pairs distributed on LG1, LG3, LG4, LG5, LG6 and LG11) (Fig. 6). In the syntenic analysis of AP2 genes of tartary buckwheat and Glycine max, FtPinG0009081200.01 was found to be associated with at least three syntenic gene pairs, suggesting that FtPinG0009081200.01 might play an important role in AP2 subfamily evolution (Additional file 5: Table S3).
We used the same phylogenetic tree method to analyze the clustering relationship between FtRAV genes and the RAV genes of other plants. The phylogenetic tree results showed that three FtRAV genes were closely related to RAV genes in Beta vulgaris (2 members) and Solanum lycopersicum (1 member) (Fig. 7a). The protein sequences of the RAV genes also showed 10 conserved motifs, and all the members contained Motif-1, Motif-2, Motif-3, Motif-4 and Motif-5 (Fig. 7b). Moreover, the members of the same branch of the phylogenetic tree had the same motif composition.
The ERF subfamily of tartary buckwheat contains many members. The phylogenetic tree constructed using the FtERF genes and ERF members from three dicotyledonous plants, A. thaliana, Helianthus annuus and Solanum lycopersicum, and a monocotyledonous plant, Oryza sativa, indicated that the FtERF proteins were divided into 15 groups (Fig. 8). We detected 10 motifs in the ERF subfamily of all plants. All the members, excluding group-h and group-l, contained Motif-1, Motif-2, Motif-3, and Motif-4, and the genes that clustered together contained similar motifs. Group-a and group-b specifically contained Motif-8, group-o specifically contained Motif-5, and group-g and group-h specifically contained Motif-6 (Fig. 8). The syntenic relationships between FtERF genes and ERF genes from other plants were clear. Ranked by the number of orthologous gene pairs, the order was Glycine max (198 orthologous gene pairs distributed throughout all LGs), Solanum lycopersicum (101 orthologous gene pairs distributed throughout all LGs), Vitis vinifera (73 orthologous gene pairs distributed throughout all LGs except LG3, LG14, and LG17), Beta vulgaris (59 orthologous gene pairs distributed throughout all LGs), Helianthus annuus (44 orthologous gene pairs distributed throughout all LGs except LG5, LG11, and LG13), A. thaliana (37 orthologous gene pairs distributed throughout all LGs), and Oryza sativa (12 orthologous gene pairs distributed on LG2, LG3, LG5, LG6, LG7, LG8, and LG9) (Fig. 9). Some FtERF genes, such as FtPinG0003005100.01 and FtPinG0002711700.01, were found to be associated with at least two pairs of homologous genes (especially between tartary buckwheat and Glycine max ERF genes), which suggested that these genes might have played an important role in the evolution of the ERF subfamily (Additional file 5: Table S3). The syntenic analysis provided evidence to support and validate the previous phylogenetic groupings and motif distribution.
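The species ranking above follows directly from the orthologous pair counts, as the short Python sketch below shows. The counts are those reported in the text; the variable names are hypothetical.

```python
# Number of FtERF orthologous gene pairs per reference species, from the text.
erf_ortholog_pairs = {
    "A. thaliana": 37,
    "Beta vulgaris": 59,
    "Glycine max": 198,
    "Helianthus annuus": 44,
    "Oryza sativa": 12,
    "Solanum lycopersicum": 101,
    "Vitis vinifera": 73,
}

# Rank species from most to fewest orthologous pairs.
ranking = sorted(erf_ortholog_pairs, key=erf_ortholog_pairs.get, reverse=True)
for species in ranking:
    print(f"{species}: {erf_ortholog_pairs[species]}")
```

The only monocotyledonous species in the comparison, Oryza sativa, has the fewest orthologous pairs.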
In general, these data indicated that the tartary buckwheat AP2/ERF gene family was highly conserved and that the tartary buckwheat AP2/ERF genes were more closely related to the Glycine max genes than to the A. thaliana genes. The AP2/ERF genes in different plants might have evolved from a common ancestor.
Expression patterns of FtAP2/ERF genes in different plant tissues
To investigate the physiological roles of FtAP2/ERF genes, qRT-PCR was performed to detect the tissue-specific expression of selected AP2/ERF genes. The transcript accumulation of 15 AP2 genes, 3 RAV genes and 43 ERF genes in root, stem, leaf, flower, and fruit was evaluated. The results showed that the transcript abundance of FtAP2/ERF genes varied greatly among tissues and organs, suggesting that the FtAP2/ERF genes had multiple functions in tartary buckwheat growth and development.
In the AP2 subfamily, eight genes (FtPinG0003989200.01/6430500.01/3951500.01/7082400.01/1028600.01/9489700.01/5545400.01/5986200.01) were expressed in all tissues, five genes (FtPinG0002177000.01/7214300.01/9081200.01/5947300.01/5986200.01) had the highest expression level in fruit, two genes (FtPinG0007082400.01/9489700.01) had the highest expression in flowers, and only three genes (FtPinG0007214300.01/5947300.01/5986200.01) were expressed at higher levels in reproductive organs than in other tissues (Fig. 10). Simultaneously, we studied the correlation of FtAP2 gene expression patterns across tartary buckwheat roots, stems, flowers, leaves and fruits (Additional file 6: Figure S2). Most of the FtAP2 genes were positively correlated, and all of the significantly correlated gene pairs (FtPinG0006430500.01 and FtPinG0007614100.01/1028600.01; FtPinG0003951500.01 and FtPinG0009081200.01; FtPinG0007214300.01 and FtPinG0005986200.01/5947300.01; FtPinG0003183000.01 and FtPinG0005545400.01/9489700.01/7614100.01/1028600.01) showed positive correlations (Additional file 6: Figure S2).
The ERF gene structures in tartary buckwheat and in A. thaliana, Glycine max, Solanum lycopersicum and Helianthus annuus were analyzed by phylogenetic tree analysis (Fig. 8). The gene structure within each group was similar, suggesting potentially similar functions. The functions of many ERF genes in A. thaliana have been identified, such as AT3G15210.1, which negatively regulates ethylene and abscisic acid (ABA) responses. Therefore, we selected a total of 43 genes, drawn from each group, with a close evolutionary relationship to A. thaliana ERF (AtERF) genes for tissue-specific expression analysis. Among the 43 selected ERF genes, 60.47% were expressed in all tissues, 67.44% had the highest expression level in root, eleven (FtPinG0009372200.01/2019600.01/874400.01/401700.01/4858800.01/2103400.01/5123400.01/9155900.01/7618600.01/8406100.01/7594200.01) had the highest expression in fruit, and three (FtPinG0008861700.01/0008126800.01/9640400.01) had the highest expression in flower (Fig. 11). The correlation analysis of ERF subfamily gene expression in different tissues revealed a strong correlation between the genes, and most of the genes were positively correlated (Additional file 7: Figure S3). The expression level of FtPinG0009372200.01 was highest in fruit and was significantly positively correlated with that of other genes (FtPinG0009155900.01, FtPinG0004858800.01, FtPinG0000874400.01, and FtPinG0002019600.01) with a high expression level in fruit. Similarly, FtPinG0001906300.01 was significantly positively correlated with other genes (FtPinG0006985400.01, FtPinG0007723200.01, and FtPinG0002878800.01, among others) that were highly expressed in root (Additional file 7: Figure S3).
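The correlation between the tissue expression profiles of two genes can be illustrated with a short Python sketch. The text does not state which coefficient was used, so the Pearson coefficient here is an assumption, and the expression values are invented for illustration. The last lines check that the reported percentages correspond to 26 and 29 of the 43 selected genes.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical relative expression in root, stem, leaf, flower, fruit.
fruit_gene_a = [1.0, 2.0, 3.0, 4.0, 10.0]
fruit_gene_b = [2.0, 4.0, 6.0, 8.0, 20.0]
root_gene = [10.0, 4.0, 3.0, 2.0, 1.0]

r_same = pearson(fruit_gene_a, fruit_gene_b)  # profiles with the same shape
r_opposite = pearson(fruit_gene_a, root_gene)  # fruit-high versus root-high
print(round(r_same, 3), r_opposite < 0)

# Percentages reported in the text, expressed as gene counts out of 43.
pct_all_tissues = round(100 * 26 / 43, 2)
pct_root_highest = round(100 * 29 / 43, 2)
print(pct_all_tissues, pct_root_highest)
```

With 29 genes peaking in root, 11 in fruit and 3 in flower, the three categories account for all 43 selected ERF genes.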
There were only three members of the RAV subfamily, among which FtPinG0005247000.01 had the highest expression level in roots and the lowest expression level in leaves, FtPinG0004903400.01 had the highest expression level in fruits and the lowest expression level in leaves, and FtPinG0007073300.01 had the highest expression level in roots and the lowest expression level in fruits (Fig. 12).
Differential expression of FtAP2/ERF genes during fruit development of tartary buckwheat
Ethylene is a simple but very important plant hormone. It is involved in seed germination, flowering, fruit maturation, organ senescence, and abscission, among other processes [40]. ERF family genes located downstream of the ethylene signaling pathway are important transcription factors that regulate the ethylene biosynthesis pathway. Therefore, we systematically studied the expression of FtAP2/ERF genes at different stages of fruit development (green fruit stage, discoloration stage and initial maturity stage) to find genes that potentially regulate fruit maturation and development.
The expression patterns of FtAP2/ERF genes in tartary buckwheat fruits differed among developmental stages (green fruit stage, discoloration stage and initial maturity stage). In the AP2 subfamily, all the genes were expressed in all three stages. As the fruit developed, the expression levels of eight genes (FtPinG0003989200.01/6430500.01/7082400.01/3183000.01/1028600.01/7614100.01/9380600.01/9489700.01) gradually decreased and those of six genes (FtPinG0002177000.01/3951500.01/7214300.01/9081200.01/5947300.01/5545400.01) gradually increased, but only one gene (FtPinG0005986200.01) was expressed at the highest level in the discoloration stage (Fig. 13). Concurrently, we studied the correlation of AP2 gene expression patterns with tartary buckwheat fruit development and the correlation among the genes during this process. As shown in Additional file 8: Figure S4, only FtPinG0003951500.01 was significantly positively correlated with fruit development, while several pairs of FtAP2 genes (FtPinG0002177000.01 and FtPinG0007214300.01; FtPinG0003989200.01 and FtPinG0007614100.01/6430500.01; FtPinG0006430500.01 and FtPinG0007614100.01; FtPinG0007082400.01 and FtPinG0009489700.01) were significantly positively correlated with each other.
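The three-stage expression patterns described above can be assigned with a simple comparison of the stage values. The sketch below is illustrative: the function name and the expression values are hypothetical, and real data would also need replicates and a significance test before a trend is called.

```python
def stage_trend(green, discoloration, maturity):
    """Label the expression pattern across the three fruit development stages."""
    if green > discoloration > maturity:
        return "decreasing"
    if green < discoloration < maturity:
        return "increasing"
    if discoloration > green and discoloration > maturity:
        return "peak at discoloration"
    if discoloration < green and discoloration < maturity:
        return "lowest at discoloration"
    return "other"

# Hypothetical relative expression values, for illustration only.
print(stage_trend(5.0, 3.0, 1.0))
print(stage_trend(1.0, 2.0, 6.0))
print(stage_trend(1.0, 4.0, 2.0))

# AP2 subfamily pattern counts reported in the text.
ap2_pattern_counts = {"decreasing": 8, "increasing": 6, "peak at discoloration": 1}
print(sum(ap2_pattern_counts.values()))
```

The three reported pattern classes together cover all 15 AP2 subfamily genes.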
By analyzing the expression of ERF subfamily genes during the three fruit developmental stages (green fruit stage, discoloration stage and initial maturity stage), we found that all the selected ERF genes were expressed during fruit development. Among the eleven genes with the highest expression level in fruit, only FtPinG0008406100.01 was expressed at the highest level during the discoloration stage; the expression of all the other genes gradually increased throughout fruit development (Fig. 14). During fruit development, the expression levels of FtPinG0008126800.01 and FtPinG0009640400.01, which were expressed at the highest level in flowers, gradually decreased, and those of most genes that were highly expressed in roots also decreased (Fig. 14). Based on the correlation analysis of ERF subfamily gene expression levels across the fruit development stages and among the genes during this process, we found that FtPinG0007618600.01 and FtPinG0005123400.01 were significantly positively correlated with fruit development, and that FtPinG0009372200.01 was significantly positively correlated with other genes (FtPinG0009155900.01, FtPinG0007594200.01 and FtPinG0000874400.01) with high expression levels in fruit (Additional file 9: Figure S5).
In the RAV subfamily, the expression levels of two genes (FtPinG0005247000.01 and FtPinG0007073300.01) gradually decreased across the three stages of fruit development (green fruit stage, discoloration stage and initial maturity stage), whereas FtPinG0004903400.01 was the only gene whose expression was lowest in the discoloration stage (Fig. 12).