MYB transcription factors in Peucedanum praeruptorum Dunn: the diverse roles of the R2R3-MYB subfamily in mediating coumarin biosynthesis
BMC Plant Biology volume 24, Article number: 1135 (2024)
Abstract
Background
The MYB superfamily (v-myb avian myeloblastosis viral oncogene homolog) plays a role in plant growth and development, environmental stress defense, and synthesis of secondary metabolites. Little is known about the regulatory function of MYB genes in Peucedanum praeruptorum Dunn, although many MYB family members, especially R2R3-MYB genes, have been extensively studied in model plants.
Results
A total of 157 R2R3-MYB transcription factors from P. praeruptorum were identified using bioinformatics analysis. Comprehensive analyses of chromosome location, microsynteny, gene structure, conserved motifs, phylogeny, and conserved domains were then performed. The 157 transcription factors ranged in length from 120 to 1,688 amino acids (molecular weight between 14.21 and 182.69 kDa). All proteins were hydrophilic. Subcellular localization predictions showed that 155 PpMYB proteins were localized in the nucleus, whereas PpMYB12 and PpMYB157 were localized in the chloroplasts and mitochondria, respectively. Ten conserved motifs were identified in the PpMYBs, all of which contained typical MYB domains. Transcriptome analysis identified 47,902 unigenes. Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis revealed 136 pathways, with 524 genes associated with the phenylpropanoid pathway. Analysis of differentially expressed genes (DEGs) before and after bolting showed that 11 genes were enriched in the phenylpropanoid pathway. The expression patterns of the transcription factor genes were further verified by qRT-PCR. Using high-performance liquid chromatography (HPLC), eight coumarins were quantified in root, stem, and leaf tissue samples of P. praeruptorum at different stages. Praeruptorin A was found in both roots and leaves before bolting, whereas praeruptorin B was mainly concentrated in the roots, and the content of both decreased in the roots and stems after bolting. Praeruptorin E content was highest in the leaves and increased with plant growth. Correlation analysis between transcription factors and coumarin content showed that the expression patterns of PpMYB3 and PpMYB103 in roots aligned with the accumulation trends of praeruptorin A, praeruptorin B, praeruptorin E, scopoletin, and isoscopoletin, all of which declined after bolting, suggesting that these genes may positively regulate coumarin biosynthesis.
Eleven distinct metabolites and 48 DEGs were identified. Correlation analysis revealed that the expression of all DEGs was significantly related to the accumulation of coumarin metabolites, indicating that these genes are involved in the regulation of coumarin biosynthesis.
Conclusions
R2R3-MYB transcription factors may be involved in the synthesis of coumarins. Our findings provide basic data and a rationale for future in-depth studies on the role of R2R3-MYB transcription factors in growth and in the regulation of coumarin synthesis.
Background
Peucedanum praeruptorum Dunn is a perennial, monocarpic (once-flowering) plant of the genus Peucedanum in the family Apiaceae. In traditional Chinese medicine, the dried roots of the plant are used to dispel wind-heat, reduce phlegm, and treat the syndrome of wind-heat invading the lungs [1]. P. praeruptorum contains a variety of bioactive components, including coumarins, polyphenols, flavonoids, and volatile oils, among which angular pyranocoumarins are the main medicinal components [2, 3]. In particular, praeruptorins A and B are important indicators of medicinal quality and have been included in the Chinese Pharmacopoeia [4]. Coumarins are secondary metabolites that primarily accumulate in the secretory canals of roots. During the periods before and after bolting, coumarins are synthesized, transported, and stored in specific tissues [5]. Wild P. praeruptorum requires several years to bolt; however, cultivated P. praeruptorum often bolts early, before the root is fully grown, because of genetic and environmental factors, which greatly shortens the time available for medicinal components to accumulate in the root [6]. Histochemical and microscopic analyses of the roots after bolting revealed a significant increase in the area of the secondary xylem, accompanied by severe lignification and a notable decrease in pyranocoumarin content, which considerably affected the yield and quality of medicinal materials [7]. Recent research has focused on coumarin biosynthesis. The biosynthetic pathways of some coumarins, such as umbelliferone, scopoletin, and xanthotoxin, have been described [8, 9]. Coumarin biosynthesis branches from the shikimic acid pathway; phenylalanine is converted by phenylalanine ammonia-lyase (PAL) to cinnamic acid, the precursor of coumarins [10].
Key genes involved in coumarin biosynthesis and regulation, including PAL, 4CL, COMT, COSY, and F6’H, have been identified by parallel analysis of the genome, transcriptome, and metabolome of Apiaceae species such as Angelica sinensis and P. praeruptorum [11, 12].
Transcription factor-based interventions are currently an effective means of regulating the synthesis of plant secondary metabolites [13]. The root-specific transcription factor AtMYB72 acts together with AtMYB10 to promote coumarin production. Under iron or phosphorus stress, AtMYB72 induces coumarin synthesis and secretion via BGLU42 [14, 15]. In addition to directly inhibiting 4CL to regulate phenylpropanoid accumulation, AtMYB4 suppresses the expression of the cinnamate 4-hydroxylase (C4H) gene, thereby enhancing the accumulation of sinapate malate, a UV-protective compound; this was the first reported instance of a MYB functioning as a transcriptional repressor [16]. Multiomics technology has become an effective tool for exploring plant secondary metabolism and gene function. Integrating genomic, transcriptomic, proteomic, and metabolomic datasets makes it possible to characterize the functions of MYB genes and their roles in complex biological processes. Depending on the number and location of repeats, AtMYB genes can be categorized into four subfamilies, namely R3-MYB, R2R3-MYB, R1R2R3-MYB, and atypical MYB, of which R2R3-MYB is the most common and abundant type [17]. The R2R3-MYB gene family has been identified not only in the model plant A. thaliana but also in Citrus reticulata [18], Scutellaria baicalensis [19], Nicotiana tabacum L. [20], Fragaria ananassa [21], and Trifolium repens [22], among others. MYB transcription factors can affect plant growth and development, metabolic processes, and stress responses, particularly the metabolic regulation of phenylpropanoid compounds [23]. MYB genes play a role in root epidermal cell differentiation, which primarily involves two core transcription factors: the R2R3-MYB transcription factor WEREWOLF (WER) and the R3-MYB transcription factor CAPRICE (CPC).
WER directly induces the expression of GLABRA2 (GL2), the key gene for root hair formation, and specifically activates CPC, whereas CPC inhibits the expression of WER and GL2 [24]. In addition, the transcription factor TT2 (AtMYB123) forms a complex with other proteins, such as TT8 (bHLH) and TTG1 (WD40), to stimulate the expression of the anthocyanidin reductase gene BANYULS, whose product participates in proanthocyanidin formation [25].
R2R3-MYB transcription factors also participate in regulating the biosynthesis of other secondary metabolites. SmMYB98 regulates the biosynthesis of tanshinone and salvianolic acid in Salvia miltiorrhiza hairy roots [26]. JcMYB1 participates in the synthesis of seed oil and alters the composition of fatty acids by regulating the expression of fatty acid and triglyceride biosynthesis genes in Jatropha curcas [27]. In addition, some R2R3-MYB transcription factors are involved in cell wall biosynthesis. AtMYB58, AtMYB63, and AtMYB85 can activate lignin synthesis in fibers, whereas AtMYB68 negatively regulates lignin deposition in roots [28,29,30]. Members of the MYB family are also involved in plant responses to biotic and abiotic stressors. More than 198 MYB members in A. thaliana respond differently to drought signals, and most of them are R2R3-type MYB genes [31]. For example, AtMYB60 helps plants adapt to drought by controlling stomatal opening and closing and root growth [32]. Under salt stress, AtMYB30 binds to the promoter of AOX1a and increases its activity in maintaining cellular redox homeostasis, which confers salt tolerance [33]. By combining genomic and transcriptomic data from several species, a linear relationship was discovered between the number of MYB family members and the ploidy of a species’ chromosomes. The response pathways to drought stress vary among species; however, closely related species exhibit striking similarities in their pathways [34].
Peucedanum praeruptorum Dunn contains a high amount of coumarins. Although the complex coumarin biosynthesis pathway has been elucidated, its transcriptional regulatory network remains unknown [35]. Studies on how the R2R3-MYB family regulates the molecular mechanisms of coumarin synthesis in P. praeruptorum are limited. The integration of metabolomics and transcriptomics provides an effective strategy for studying the mechanisms of coumarin biosynthesis. In this study, we performed a comprehensive bioinformatics analysis of the R2R3-MYB transcription factors identified in P. praeruptorum and screened 157 R2R3-MYB family genes, which were divided into 28 subfamilies and showed close homology with the R2R3-MYB proteins in A. thaliana. Using a transcriptomics database, the expression patterns of the R2R3-MYB genes in different tissues were analyzed, and the DEGs of the phenylpropanoid pathway were screened. The expression of tissue-specific R2R3-MYB genes and coumarin synthesis genes was analyzed in different growth periods of P. praeruptorum. This study provides a basic framework for elucidating the regulatory mechanisms of coumarin synthesis in P. praeruptorum.
Materials and methods
Sample collection and processing
The plants used in this study were cultivated at the Medicinal Botanical Garden of West Anhui University (Lu’an, China; 116°65′E, 31°24′N). All samples were collected between August 2023 and November 2023. The enzyme-free test tubes were precooled with liquid nitrogen, and the sampling instruments were sterilized and treated with an RNase-removing agent. Whole plants were washed with RNase-free reagent, divided into roots, stems, and leaves in enzyme-free test tubes, immediately frozen in liquid nitrogen, and stored at −80 °C. Three biological replicates were collected for each sample at each time point. BioMarker (Beijing, China) performed RNA extraction, library construction, and next-generation sequencing.
Identification of R2R3-MYB family genes in P. praeruptorum
Reference genome data for P. praeruptorum were downloaded from the NCBI Genome database (accession: PRJNA910498) and the FigShare database (https://doi.org/10.6084/m9.figshare.21743984.v1). A total of 126 R2R3-MYB transcription factors in A. thaliana were obtained from the literature [11], and their protein sequences were downloaded from the TAIR database (https://www.arabidopsis.org/). We extracted all candidate R2R3-MYB protein sequences from the P. praeruptorum genome database using a BLASTP search (e-value ≤ 1e−5, homology > 30%), with the A. thaliana MYB protein sequences as queries. The hidden Markov model (HMM) profile of the MYB domain (PF00249) from the Pfam database (http://pfam.xfam.org/) was used to search the protein sequences with the HMMER 3.0 program. To verify the presence of the MYB domain, all sequences were examined using NCBI CDD (https://www.ncbi.nlm.nih.gov/cdd/), InterPro (https://www.ebi.ac.uk/interpro/), and SMART (http://smart.embl-heidelberg.de/). Finally, false-positive, truncated, and redundant sequences were eliminated.
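The BLASTP thresholds described above can be illustrated with a short filter. This is a minimal sketch, not the authors' pipeline: it assumes BLAST tabular output (`-outfmt 6`), interprets "homology > 30%" as percent identity, and combines BLASTP and HMMER candidates by set union, which the text does not specify. All sequence IDs are invented for illustration.

```python
from typing import Iterable, Set


def filter_blastp_hits(lines: Iterable[str],
                       max_evalue: float = 1e-5,
                       min_identity: float = 30.0) -> Set[str]:
    """Return subject IDs that pass the e-value and identity thresholds.

    Assumes BLAST tabular output (-outfmt 6): column 2 is the subject ID,
    column 3 the percent identity, and column 11 the e-value.
    """
    kept = set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        subject = fields[1]
        identity = float(fields[2])
        evalue = float(fields[10])
        if evalue <= max_evalue and identity > min_identity:
            kept.add(subject)
    return kept


# Hypothetical toy hits (query, subject, identity, ..., e-value, bit score)
toy_hits = [
    "AT1G01\tPp_g1\t45.2\t110\t60\t0\t1\t110\t5\t114\t2e-30\t120",
    "AT1G01\tPp_g2\t28.0\t100\t72\t0\t1\t100\t1\t100\t1e-08\t50",
    "AT1G02\tPp_g3\t55.0\t90\t40\t0\t1\t90\t1\t90\t3e-04\t35",
]
blast_candidates = filter_blastp_hits(toy_hits)

# Hypothetical HMMER hits; the union is an assumption about how the two
# candidate lists are merged before domain verification.
hmm_candidates = {"Pp_g1", "Pp_g4"}
candidates = blast_candidates | hmm_candidates
print(sorted(candidates))
```

In this toy input, Pp_g2 fails the identity threshold and Pp_g3 fails the e-value threshold, so only Pp_g1 survives the BLASTP filter before the HMMER candidates are merged in.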
Physicochemical properties and subcellular localization of R2R3-MYB protein
The ProtParam tool of ExPASy (https://web.expasy.org/compute_pi/) was used to investigate the physicochemical properties of the protein sequences, including their isoelectric point (pI) and relative molecular weight (MW). The subcellular localization of PpMYBs was predicted using Cell-PLoc 2.0 (http://www.csbio.sjtu.edu.cn/bioinf/Cell-PLoc-2/).
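The hydrophilicity score reported in the Results (GRAVY, grand average of hydropathicity) is the mean Kyte-Doolittle hydropathy value over all residues of a protein; a negative value indicates a hydrophilic protein. The following is a minimal sketch of that calculation, assuming the standard Kyte-Doolittle scale; the function name and the toy sequences are illustrative and are not PpMYB sequences.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Grand average of hydropathicity: mean hydropathy over all residues."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


# Toy sequences (invented): a charged peptide and a hydrophobic peptide
charged = "KRKRDEDE"
hydrophobic = "AILVAILV"
print(gravy(charged) < 0, gravy(hydrophobic) > 0)
```

Residues outside the 20 standard amino acids raise a `KeyError` in this sketch; a production script would need to decide how to handle ambiguous symbols such as X.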
Chromosomal mapping and collinearity analysis of R2R3-MYB genes
The chromosome location and gene density file of R2R3-MYB genes were extracted from the gff annotation using TBtools software (v.2.119). A. thaliana, P. praeruptorum, and A. sinensis were selected, and their collinear relationship was analyzed using MCScanX software. The genome data and gff files of A. thaliana were downloaded from the TAIR database (https://www.arabidopsis.org), whereas the genome data and gff files of A. sinensis were downloaded from the public link (https://data.cyverse.org/dav-anon/iplant/home/licheng_caas/Angelica.sinensis_genome/).
Phylogenetic analysis of R2R3-MYB proteins
The selected PpMYBs and 126 R2R3-MYB sequences from A. thaliana were aligned using ClustalW. The neighbor-joining (NJ) method was used to construct a phylogenetic tree. PpMYB subfamily classification was performed based on A. thaliana MYB nomenclature and phylogenetic tree topology [36]. The iTOL web service (https://itol.embl.de/) was used to visualize the phylogenetic tree.
Analysis of conserved motif and structure of R2R3-MYB genes
MEME Suite (https://meme-suite.org/meme/doc/meme.html) was used to predict the R2R3-MYB protein motifs. The number of motifs to be predicted was set to 10, and other parameters were left at their default values. TBtools was used to visualize the resulting MEME XML files. The conserved protein domains of the R2R3-MYB family were obtained from the NCBI CDD database, and their structures were visualized using TBtools. The exon and intron structures of PpMYBs were extracted and visualized using TBtools. The conserved domain sequences of R2 and R3 were extracted from the multiple sequence alignment files and submitted to WebLogo (https://weblogo.berkeley.edu/logo.cgi) to draw the sequence logos.
Transcriptome analysis of P. praeruptorum in different tissues at different stages
To explore the differences in expression profiles among different tissues at different stages of P. praeruptorum, we performed transcriptome sequencing and analyzed the expression patterns of related genes in root, stem, and leaf tissue samples before and after bolting. Total RNA was extracted from plants using an RNA Prep Pure Plant Kit (Tiangen, Beijing, China). The purity and integrity of the RNA were determined using a NanoDrop 2000 spectrophotometer (Thermo Fisher Scientific, Wilmington, DE, USA) and the RNA Nano 6000 assay kit on the Agilent Bioanalyzer 2100 system (Agilent Technologies, CA, USA). Samples that passed quality control were subjected to sequencing. BioMarker (Beijing, China) constructed the cDNA libraries and performed transcriptome sequencing on the Illumina NovaSeq 6000 platform. The raw sequencing data were filtered to obtain clean, high-quality reads, which were aligned to the reference genome (https://figshare.com/articles/dataset/Peucedanum_praeruptorum_genome_assembly_and_gene_annotations/21743984/1) for subsequent analyses. The screening criteria for DEGs were set at |log2foldchange| ≥ 2 and false discovery rate < 0.01 based on the FPKM values of gene expression in each sample. To gain a deeper understanding of the functions and metabolic pathways of DEGs across different groups, we analyzed them using Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) to identify significantly enriched GO terms and the main metabolic pathways involved. The alternative splicing type for each sample was determined using the ASprofile software (v. 2.118).
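The DEG screening rule can be sketched as a simple classifier. This is an illustrative example under stated assumptions, not the authors' script: the thresholds are exposed as parameters (defaults follow the criteria stated in this section), and the gene names and values are invented.

```python
from collections import Counter


def classify_deg(log2fc: float, fdr: float,
                 min_abs_log2fc: float = 2.0,
                 max_fdr: float = 0.01) -> str:
    """Label a gene 'up', 'down', or 'ns' (not significant).

    A gene is a DEG when |log2 fold change| >= min_abs_log2fc
    and FDR < max_fdr.
    """
    if fdr >= max_fdr or abs(log2fc) < min_abs_log2fc:
        return "ns"
    return "up" if log2fc > 0 else "down"


# Hypothetical results table: gene -> (log2 fold change, FDR)
toy_results = {
    "gene_a": (2.5, 0.001),
    "gene_b": (-3.0, 0.005),
    "gene_c": (1.5, 0.001),   # fold change too small
    "gene_d": (4.0, 0.05),    # FDR too high
}
labels = {g: classify_deg(fc, fdr) for g, (fc, fdr) in toy_results.items()}
counts = Counter(labels.values())
print(labels, dict(counts))
```

Keeping the thresholds as arguments makes it easy to rerun the screen with a different fold-change cutoff and compare the resulting up- and downregulated counts.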
Expression patterns of genes related to coumarin synthesis in different tissues
The gene expression data for each tissue before and after bolting were extracted from the transcriptome data, collected and analyzed using Excel, and then used to draw a heat map. Based on the results of the transcriptome analysis, differentially expressed genes that may be related to coumarin synthesis were screened. Reverse transcription quantitative PCR (qRT-PCR) was used to verify the expression of the candidate genes. RNA was extracted using the RNA Prep Pure Plant Kit (Tiangen, Beijing, China). Reverse transcription was performed using the SynScript III RT SuperMix for qPCR reverse transcriptase kit (Qingke, Nanjing). The reverse-transcribed cDNA product was diluted six-fold for use as the qPCR template, and the reaction volume was 20 µL. BHQH00029600 (ACTIN) was selected as the reference gene, and the relative expression level of each gene was calculated relative to this reference. After testing in triplicate, GraphPad Prism v.9.0 was used to plot the expression of each gene. Primers were designed using Primer 5.0 (Additional file 7). Potential MYB genes were identified by transcriptome co-expression analysis. In brief, we constructed a hierarchical clustering heat map of all MYB family genes and coumarin biosynthetic genes. Expression profile analysis showed that some genes in the same tissue (such as roots) or in the same period (such as before bolting) had similar expression patterns. These genes clustered into one small subclade and were considered co-expressed. Pearson correlation analysis was used to further determine whether they were positively or negatively correlated.
Metabolomic analysis of P. praeruptorum in different tissues at different stages
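The Pearson correlation step can be sketched in plain Python. This is a minimal illustration, assuming each gene is represented by one expression value per sample group; the expression vectors below are invented and do not come from the study.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("vectors must have equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation undefined for a constant vector")
    return cov / (sd_x * sd_y)


# Hypothetical expression values across six sample groups
# (e.g. root, stem, leaf before and after bolting)
tf_expr = [12.0, 3.0, 2.5, 4.0, 2.0, 1.5]
enzyme_expr = [30.0, 8.0, 6.0, 11.0, 5.5, 4.0]   # rises and falls with tf_expr
opposite_expr = [1.0, 9.0, 9.5, 8.0, 10.0, 10.5]  # moves the other way

print(round(pearson_r(tf_expr, enzyme_expr), 3))
print(round(pearson_r(tf_expr, opposite_expr), 3))
```

A coefficient near +1 indicates positively correlated expression and a coefficient near −1 indicates negatively correlated expression; with only six sample groups, significance testing would still be needed before drawing conclusions.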
A 1,000 µL aliquot of extraction solution (methanol: acetonitrile: water = 2:2:1) was added to 50 mg of sample in a clean tube, which was vortexed for 30 s. The samples were ground with a steel ball for 10 min and ultrasonicated for 10 min in ice water. After being frozen for approximately 1 h, the extracts were centrifuged at 12,000 rpm for 15 min at 4 °C. The supernatants were transferred to EP tubes and dried in a vacuum concentrator. The dried metabolites were redissolved in 160 µL of extraction solution (acetonitrile: water = 1:1), vortexed for 30 s, and sonicated in ice water for 10 min before being centrifuged at 12,000 rpm for 15 min at 4 °C. The supernatants were filtered through a 0.22 μm filter for subsequent analyses. A Waters Acquity I-Class PLUS ultra-high-performance liquid chromatography system coupled to a Waters Xevo G2-XS QTOF high-resolution mass spectrometer (MS) was used for mass spectrometry analysis. The column was a Waters Acquity UPLC HSS T3 column (1.8 μm, 2.1 × 100 mm, USA). The conditions were as follows: Solvents A and B were 0.1% formic acid in water and 0.1% formic acid in acetonitrile, respectively. The injection volume was 1 µL. The column temperature was 40 °C, and the flow rate was 0.4 mL/min. The gradient elution profile was 2% B at 0.25 min; 98% B was maintained from 10 to 13 min, and then returned to 2% B. The MS was operated in both positive and negative modes with electrospray ionization at capillary voltages of 2.50 and 2.0 kV, respectively. The temperature of the capillary was 100 °C. The flow rates of the drying and auxiliary gases were 800 L/h and 50 L/h, respectively. The full MS scan range was 50 to 1,200, with a resolution of 60,000 for primary and 7,500 for secondary scans; the low collision energy was 2 V, and the high collision energy ranged from 10 to 40 V. Data were acquired in continuous scanning mode.
The raw data were imported into Progenesis v2.2 software (Waters, USA), and metabolites were annotated against KEGG, HMDB, METLIN, and other metabolic databases. PCA (principal component analysis) was used to identify SCMs (significantly changed metabolites), defined as metabolites that differed between groups with VIP > 1, P ≤ 0.05, and fold change ≥ 1. The KEGG database was used to identify enriched metabolic pathways.
Determination of coumarin content
Fresh P. praeruptorum samples were collected before and after bolting, and the fibrous roots were removed after thorough washing with water. The roots, stems, and leaves were separated into groups, dried at 50 °C, crushed, and passed through a No. 5 sieve (80 mesh). Methanol (25 mL) was added to the powdered sample (0.5 g), which was then sonicated for 30 min (250 W, 33 kHz). The samples were cooled and weighed again, the volume was made up with methanol, and the mixture was shaken. Ten milliliters of the subsequent filtrate was removed, evaporated to dryness, and then dissolved in methanol before being transferred to a 25 mL volumetric flask. Finally, the solution was filtered through a 0.45 μm filter into a brown injection vial. Different concentrations of xanthotoxin, scopoletin, bergapten, isoscopoletin, and P. praeruptorum coumarin II were mixed to prepare standard samples. The concentrations were 38 µg/mL, 45.2 µg/mL, 52 µg/mL, 24.8 µg/mL, 16.8 µg/mL, 62 µg/mL, 56.4 µg/mL, and 8 µg/mL. Aliquots of 0.1 mL, 0.2 mL, 0.5 mL, 1 mL, 2 mL, and 3 mL of the mixed reference solution were accurately measured into 5 mL volumetric flasks. Methanol was then added to volume, and the flasks were shaken. The resulting solutions of different concentrations were injected and measured. The standard curve was plotted with the injection concentration (µg/mL) as the horizontal coordinate and the peak area as the vertical coordinate to obtain the linear regression equation of each component. The conditions were as follows: Solvents A and B were water and methanol, respectively. The column was an Agilent ZORBAX RRHD Eclipse Plus 95 Å C18 column (1.8 μm, 2.1 × 100 mm).
The gradient elution profile was 5% A at 0 min to 40% A at 5 min; 5–7 min: 40–45%; 7–7.5 min: 45–45.5%; 7.5–8 min: 45.5–46%; 8–9 min: 46–46.5%; 9–10 min: 46.5–50.2%; 10–10.5 min: 50.2–50.3%; 10.5–11 min: 50.3–50.4%; 11–17 min: 50.4–55%; 17–22 min: 55–80%; 22–25 min: 80–100%; 25–27 min: 100%; 27–29 min: 100–5%; 29–30 min: 5%. The injection volume was 2 µL. The column temperature was 30 °C, and the flow rate was 0.25 mL/min. The UV spectra were measured at 321 nm.
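The standard-curve procedure (dilution of the mixed reference solution into 5 mL flasks, then linear regression of peak area on concentration) can be sketched as follows. This is a minimal illustration, not the authors' calculation: the stock concentration of 38 µg/mL is simply the first value listed above, and the peak areas are synthetic numbers generated from an arbitrary line.

```python
from typing import List, Sequence, Tuple


def dilution_series(stock_ug_per_ml: float,
                    volumes_ml: Sequence[float],
                    final_volume_ml: float = 5.0) -> List[float]:
    """Concentration (µg/mL) after diluting each aliquot to the final volume."""
    return [stock_ug_per_ml * v / final_volume_ml for v in volumes_ml]


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def concentration_from_area(area: float, slope: float, intercept: float) -> float:
    """Invert the standard curve to estimate concentration from a peak area."""
    return (area - intercept) / slope


# Aliquot volumes from the text; 38 µg/mL is one of the listed stock values
concentrations = dilution_series(38.0, [0.1, 0.2, 0.5, 1.0, 2.0, 3.0])

# Synthetic peak areas (invented): area = 10 * concentration + 2
peak_areas = [10.0 * c + 2.0 for c in concentrations]

slope, intercept = fit_line(concentrations, peak_areas)
unknown = concentration_from_area(102.0, slope, intercept)
print([round(c, 2) for c in concentrations], round(slope, 3), round(unknown, 3))
```

With real chromatograms the fitted line would not pass exactly through the points, so the coefficient of determination and the linear range should be checked before using the curve for quantification.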
Results
Identification, physicochemical properties, and subcellular localization prediction analysis of R2R3-MYB genes
A total of 157 R2R3-MYB genes were screened from the P. praeruptorum genome using an HMM search and BLASTP. Based on their positions on the chromosomes, these genes were named PpMYB1–PpMYB157 in order. The protein length of the PpMYBs varied greatly, from 120 to 1,688 amino acids, with an average length of 356 amino acids. The MW ranged from 14.21 to 182.69 kDa, with an average of 40.01 kDa. The pI ranged from 4.31 to 10.15; 94 proteins had a pI < 7 and were acidic, and 62 had a pI > 7 and were alkaline. The hydrophilicity score (GRAVY) of all PpMYB proteins was negative, indicating that the PpMYB proteins are hydrophilic, which is a feature of transcription factors. In addition, Cell-PLoc 2.0 was used to predict subcellular localization. The results showed that most PpMYB proteins were located in the nucleus (Additional file 8).
Chromosome mapping and collinearity analysis of R2R3-MYB genes
The positions of the PpMYBs on the chromosomes were mapped using the gff annotation file (Fig. 1A). The 157 PpMYBs were widely distributed across 11 chromosomes. Among these, 18 and 17 transcription factor genes were located on chromosomes 9 and 5, respectively. PpMYB156 and PpMYB157 were not anchored to any chromosome but were located on unplaced fragments. Previous studies have shown that gene duplication is a key driver of gene family expansion and is of great significance in plant genome evolution. To further understand the evolutionary relationships of the PpMYBs, intraspecific synteny analysis of the R2R3-MYB gene family was conducted (Fig. 1B). Red lines indicate duplication events of PpMYB genes within the chromosomes. Gene pairs arising from segmental duplication were identified on all 11 chromosomes, and 83 genes showed collinearity (Additional file 9). In addition, the syntenic relationships among P. praeruptorum, A. thaliana, and A. sinensis were examined (Fig. 1C); the R2R3-MYB family of P. praeruptorum showed greater homology with that of A. sinensis, suggesting that the two experienced similar evolutionary paths. Retention and duplication of R2R3-MYB genes in different species, together with whole-genome duplication, may explain why the R2R3-MYB family of P. praeruptorum has more members with a wider range of functions.
Chromosome localization of R2R3-MYB genes in P. praeruptorum and genome-wide synteny analysis. A: Chromosome localization of R2R3-MYB genes of P. praeruptorum; B: Collinear analysis and gene replication of R2R3-MYB gene family of P. praeruptorum; C: collinear analysis of P. praeruptorum, A. sinensis, and A. thaliana
Phylogenetic analysis of R2R3-MYB gene family
The functions of AtMYB proteins have been previously identified [37, 38]. Therefore, phylogenetic trees of P. praeruptorum and A. thaliana MYB proteins were constructed. Approximately 95% of PpMYBs clustered with A. thaliana proteins, indicating that the MYB transcription factors of P. praeruptorum and A. thaliana evolved in a very similar and conserved manner. The MYB transcription factors were divided into 28 subfamilies based on the classification of the R2R3-MYB family from A. thaliana [39] (Fig. 2A). New MYB members with different roles appeared during the evolution of the ancestral R2R3-MYB transcription factors. R2R3-MYB proteins from P. praeruptorum and A. thaliana were chosen randomly for multiple sequence alignment analysis to determine the similarity of their domains. The conserved domain features of R2 and R3 are shown in Fig. 2B. The MYB domain contains three regularly spaced tryptophan (W) residues that play important roles in binding to specific DNA sequences [36]. The R2 repeat comprises three highly conserved W residues, with 19 amino acid residues separating each adjacent pair, whereas the R3 repeat contains two highly conserved W residues separated by 18 amino acid residues. In addition to the conserved W residues, other amino acid residues were conserved, such as lysine (K), threonine (T), arginine (R), asparagine (N), glycine (G), and glutamic acid (E). These residues usually appear at the end of R2, between the second and third conserved W residues. This suggests that these conserved amino acid residues help maintain the helix-turn-helix (HTH) structure in the MYB domain [40].
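The tryptophan spacing described above can be expressed as a pair of regular expressions. This is a rough sketch that only encodes the stated spacing (three W residues separated by 19 residues in R2, two W residues separated by 18 residues in R3); it is not a substitute for the domain databases used in the study, and the test sequences are artificial.

```python
import re

# W, 19 residues of any kind, W, 19 residues, W (R2 repeat spacing)
R2_SPACING = re.compile(r"W.{19}W.{19}W")
# W, 18 residues of any kind, W (R3 repeat spacing)
R3_SPACING = re.compile(r"W.{18}W")


def has_r2_spacing(sequence: str) -> bool:
    """True if the sequence contains the W-x19-W-x19-W pattern."""
    return R2_SPACING.search(sequence.upper()) is not None


def has_r3_spacing(sequence: str) -> bool:
    """True if the sequence contains the W-x18-W pattern."""
    return R3_SPACING.search(sequence.upper()) is not None


# Artificial sequences built only to exercise the patterns
r2_like = "MG" + "W" + "A" * 19 + "W" + "K" * 19 + "W" + "LL"
r3_like = "MG" + "W" + "A" * 18 + "W" + "LL"

print(has_r2_spacing(r2_like), has_r2_spacing(r3_like))
print(has_r3_spacing(r3_like))
```

Because the wildcard accepts any residue, a match shows only that the spacing is compatible with the description; real MYB repeats are better recognized with profile-based tools.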
Analysis of conserved motif and gene structure of R2R3-MYB genes
Structural diversity plays a crucial role in the evolution of gene families. Analysis of exon and intron structures typically reveals differences in gene structure composition. The 157 PpMYBs varied considerably in the number of introns and exons; however, genes in the same subgroup had similar intron and exon structures (Fig. 3A). All C5 subfamily members had one exon, whereas the C16 subfamily members had three exons. The R2R3-MYB gene family of P. praeruptorum contains 10 conserved motifs. The MYB-binding domain was composed of motif 1, motif 2, motif 3, and motif 4 (Fig. 3B). Motif 10 was the least frequent, appearing in only six sequences that belong to the same branch of the phylogenetic tree. According to the sequence diagram, in members of the C1 (S14) subfamily, motif 4 of the R2 repeat was replaced by motif 9, indicating that some of the R2 repeats of P. praeruptorum differ from the others (Fig. 3B). TBtools software was used to visualize the conserved domains of the genes, and all PpMYBs contained the characteristic MYB DNA-binding and SANT domains (Fig. 3C).
Transcriptome sequencing and differential gene analysis
Total RNA was extracted from root, stem, and leaf samples, and the Illumina platform was used for high-throughput sequencing (Fig. 4A). After filtering the raw reads, a total of 108.83 GB of clean data were obtained with GC content > 42%. Each group had Q20 > 98.46% and Q30 > 95.62%, indicating excellent sequencing data quality. Based on the reference genome, the mapped reads were assembled using StringTie (v.2.2.0), and the mapping rate to the reference ranged from 79.89% to 91.84%. The raw data were uploaded to the NCBI SRA database under the accession number PRJNA1080496. DESeq2 (v.1.44.0) was used to identify DEGs, with |log2(fold change)| ≥ 1 and FDR (false discovery rate) ≤ 0.01 as the criteria for genes that differed significantly between the six groups (Fig. 4B). Compared with the unbolted plants, 3,079, 2,520, and 1,982 genes were differentially expressed in the roots, stems, and leaves, respectively. In the roots, 1,608 genes were upregulated, and 1,471 genes were downregulated. In the stems, 1,010 genes were upregulated, and 1,510 genes were downregulated. In the leaves, 601 genes were upregulated, and 1,381 genes were downregulated (Fig. 4C).
Collection, sequencing, and differential gene analysis of P. praeruptorum. A: Sample images. Nr, Ns, and Nl represent the root, stem, and leaf before bolting, respectively. Br, Bs, and Bl represent the root, stem, and leaf after bolting, respectively; B: Volcano Plot of differentially expressed genes; C: Differentially expressed genes among samples
Within the same period, genes were differentially expressed among tissues; before bolting, more genes were upregulated than downregulated in roots relative to stems or leaves. After bolting, upregulated genes outnumbered downregulated genes in the roots, whereas downregulated genes predominated in the stems and leaves. Genes upregulated in the root tissues may be involved in transport processes after metabolite synthesis.
GO classification, KEGG enrichment, and alternative splicing analysis of DEGs
The DEGs before and after bolting were compared and annotated against the GO database. The 30 GO terms with the lowest p-values were selected to draw the classification histogram (Fig. 5A). The three biological processes with the highest number of differential genes in each comparison group were cellular processes (GO:0009987), metabolic processes (GO:0008152), and biological regulation (GO:0065007), with 632 genes involved in regulating metabolic processes. In the cellular component category, cellular anatomical entities (GO:0110165) accounted for most annotations, with 436 differential genes enriched in the intracellular term (GO:0005622) and 78 in protein-containing complexes (GO:0032991). The top three annotated terms in the molecular function category were binding (GO:0005488), catalytic activity (GO:0003824), and transporter activity (GO:0005215). KEGG pathway analysis revealed that 938 DEGs were enriched in 118 pathways, with significant enrichment in the biosynthesis of unsaturated fatty acids, plant circadian rhythm, alpha-linolenic acid metabolism, and plant hormone signal transduction (Fig. 5B). KEGG analysis of DEGs from the corresponding tissues before and after bolting (Fig. 5C) indicated that the DEGs in the roots were significantly enriched in phenylpropanoid biosynthesis, glycolysis/gluconeogenesis, vitamin B6 metabolism, and protein processing in the endoplasmic reticulum. The phenylpropanoid biosynthesis pathway showed the greatest enrichment and may be associated with coumarin synthesis. DEGs in stems were significantly enriched in plant hormone signal transduction, the MAPK signaling pathway (plant), plant-pathogen interaction, and circadian rhythm (plant), whereas DEGs in leaves were significantly enriched in plant hormone signal transduction, biosynthesis of unsaturated fatty acids, fatty acid metabolism, and linolenic acid metabolism.
In comparisons among tissues within the same period, DEGs were mostly involved in plant-pathogen interactions, the MAPK signaling pathway, plant hormone signal transduction, and phenylpropanoid biosynthesis (Additional file 1A, 1C, 1E). After bolting, they were mostly involved in the biosynthesis of amino acids, carbon metabolism, and glycolysis/gluconeogenesis pathways (Additional file 1B, 1D, 1F). Alternative splicing produces diverse transcripts, protein structures, and functions. Analysis of the obtained transcripts revealed 958,025 alternative splicing events, divided into five types. The most common type was the alternative first exon (AF), with a total of 416,842 events; this was followed by the alternative last exon (AL), alternative 3’/5’ splice site (A3/5), and skipped exon (SE) types, with 405,179, 86,398, and 37,101 events, respectively; the least common type was the retained intron (RI), with 12,505 events (Additional file 2).
Expression profile of R2R3-MYB genes in different tissues at different development periods
Gene expression patterns provide important clues regarding gene function. To draw a heat map, the expression data of R2R3-MYB genes and coumarin synthesis pathway enzyme genes in each tissue before and after bolting were extracted from the transcriptome data (Fig. 6). There were significant differences in the expression patterns of R2R3-MYB genes in different tissues, which could be divided into three types. The first group was represented by PpMYB101 and PpMYB45, which were highly expressed in all tissues, both before and after bolting; the second group contained tissue-specific genes, such as PpMYB110, PpMYB15, PpMYB54, and PpMYB103, which were specifically expressed in root tissues, and PpMYB135, PpMYB26, and PpMYB58, which showed clear expression trends in the aboveground parts; the remainder showed low expression. In addition, some PpMYBs clustered with expression patterns similar to those of the key genes in the synthesis pathway. PpMYB3, PpMYB54, and PpMYB103 exhibited expression patterns similar to that of S8H-2 across different tissues and stages and clustered in the same branch, indicating their potential involvement in a common biological process. PT-6 and PpMYB68, as well as F6H and PpMYB6, also exhibited similar expression patterns.
RT-qPCR analysis of R2R3-MYB genes in different tissues at different development periods
To further verify the reliability of the transcriptome data, an RT-qPCR analysis was performed. Almost all the selected genes showed tissue-specific expression at different stages, which was consistent with the transcriptome data (Fig. 7). Among them, COSY-1 expression showed no significant change. Before bolting, the levels of F6H, S8H-2, PpMYB3, PpMYB54, PpMYB69, PpMYB103, PpMYB114, and PpMYB142 were higher in the roots than in the aboveground parts but decreased after bolting. In contrast, the levels of PpMYB15, PpMYB61, PpMYB68, and PpMYB91 increased in the roots after bolting, and PpMYB61 also showed significantly higher expression in the stems after bolting. Before bolting, C4H and BMT-2 were highly expressed in the aboveground parts; however, there was no significant change in their expression in the roots.
Metabolomics analysis and differential metabolite screening
In total, 10,254 peaks and 3,218 metabolites were identified in the 30 samples of P. praeruptorum, including lipids and lipid-like molecules, organic acids and derivatives, phenylpropanoids and polyketides, and organoheterocyclic compounds (Additional file 3). Of these, lipids and lipid-like molecules primarily consisted of prenol lipids and fatty acids; organic acids and their derivatives predominantly included carboxylic acids and their derivatives; phenylpropanoids and polyketides mainly comprised flavonoids, isoflavonoids, coumarins, and their derivatives, as well as cinnamic acids and their derivatives; and organoheterocyclic compounds mainly consisted of indole and its derivatives, pteridine derivatives, and pyridine and its derivatives.
PCA of the samples provides preliminary insight into the overall metabolic differences among sample groups and the variability within each group (Fig. 8A). Tissues from the same developmental period were clearly separated, demonstrating significant differences in metabolites among plant parts. When bolted and unbolted samples were compared, the leaf samples overlapped, indicating a similar compound accumulation pattern. The stem samples partially overlapped, indicating little difference in their metabolites. In contrast, the root samples were clearly separated, showing that the root compounds changed significantly after bolting. Metabolite variations differed among the distinct comparison groups (Fig. 8B, C). A total of 1,502 differential metabolites were identified by pairwise comparisons between groups, with flavonoids constituting the highest proportion, followed by coumarin and its derivatives, accounting for 44.21%. There were 32 distinct metabolites associated with the phenylpropanoid biosynthetic pathway, including coumarin and its derivatives, cinnamic acid and its derivatives, and flavonoids (Additional file 10). There were 601, 733, and 620 differential metabolites in the Nr vs. Br, Ns vs. Bs, and Nl vs. Bl groups, respectively. As the roots of P. praeruptorum are the medicinal part, we conducted an additional analysis of the differential metabolites between root tissues (Nr vs. Br). Among the 601 differential metabolites, 260 were upregulated and 341 were downregulated, indicating that more coumarin accumulated in the roots before bolting (Additional file 10).
Metabolomic analysis of P. praeruptorum in different tissues in different periods. A: Plot of principal component analysis of samples; B: Classification of all differential metabolites; C: Venn diagram of differential metabolites; D: Coumarin constituents; E-G: Distribution and analyses of DAMs; E: Nr vs. Br; F: Ns vs. Bs; G: Nl vs. Bl
Metabolic profiling of coumarin and its derivatives revealed that individual metabolites accumulated differently across tissues (Fig. 8D). Overall, praeruptorin A, praeruptorin B, and praeruptorin C contents were higher in unbolted plants. Bergapten, bergaptol, osthenol, and isoscopoletin are all downstream products of umbelliferone under the action of PT family proteins and had similar accumulation patterns. KEGG enrichment analysis of the DAMs from corresponding tissues before and after bolting showed that each group of DAMs was enriched in phenylpropanoid biosynthesis (ko00940). Between the two root tissue stages, DAMs were mainly concentrated in ABC transporters, biosynthesis of plant secondary metabolites, and phenylpropanoid biosynthesis, indicating abundant metabolic and secretory processes during plant growth (Fig. 8E-G).
Enriched pathways in metabolomics and transcriptomics
DEGs and DAMs were identified in the root tissue before and after bolting and mapped to the phenylpropanoid biosynthesis pathway. The DEGs and DAMs were located on the same branch; however, their trends were not identical (Fig. 9). This is likely because the production of coumarins by P. praeruptorum is regulated by several genes. These genes and metabolites work together to form a closely related regulatory network; therefore, relationships between genes and metabolites are not simply one-to-one. These relationships can be represented intuitively using a correlation network diagram. The correlation between DAMs and DEGs involved in phenylpropanoid biosynthesis was analyzed, with Pearson’s correlation coefficient > 0.80 and P < 0.05 selected as the thresholds for significant correlation. The results revealed that 48 genes were significantly associated with 11 metabolites, suggesting that these genes are involved in controlling coumarin production (Fig. 10).
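The correlation screen described above (Pearson's coefficient > 0.80 and P < 0.05) can be illustrated with a small, dependency-free sketch. The function names, gene and metabolite labels, and all numeric values below are hypothetical and are not the authors' data or code. The P value is computed under the assumption of bivariate normal data, for which the null density of r is proportional to (1 - r^2)^((n - 4)/2); this is equivalent to the usual t-test for a correlation coefficient, and whether the original analysis used this exact test is not stated in the text.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def two_sided_p(r, n, steps=20000):
    """Two-sided P value for r from n samples under the null of no correlation.

    Assumes bivariate normal data: the null density of r is proportional to
    (1 - r^2)^((n - 4) / 2). The tail area is integrated with the midpoint rule.
    """
    if n < 4:
        raise ValueError("need at least 4 samples")
    r = max(-1.0, min(1.0, r))  # guard against rounding slightly outside [-1, 1]
    k = (n - 4) / 2

    def integrate(a, b):
        h = (b - a) / steps
        return h * sum((1 - (a + (i + 0.5) * h) ** 2) ** k for i in range(steps))

    return integrate(abs(r), 1.0) / integrate(0.0, 1.0)


def significant_pairs(genes, metabolites, r_min=0.80, p_max=0.05):
    """Return (gene, metabolite, r, p) for pairs passing both thresholds."""
    pairs = []
    for g, gx in genes.items():
        for m, mv in metabolites.items():
            r = pearson_r(gx, mv)
            p = two_sided_p(r, len(gx))
            if r > r_min and p < p_max:
                pairs.append((g, m, r, p))
    return pairs


# Hypothetical expression and abundance values for six sample groups.
genes = {
    "geneA": [12.0, 9.5, 3.1, 2.8, 1.2, 0.9],
    "geneB": [1.0, 5.0, 2.0, 6.0, 3.0, 4.0],
}
metabolites = {"metX": [8.1, 6.9, 2.4, 2.0, 1.1, 0.7]}

hits = significant_pairs(genes, metabolites)
for g, m, r, p in hits:
    print(f"{g} - {m}: r = {r:.3f}, P = {p:.4f}")
```

With only six groups, the P < 0.05 condition is nearly as strict as the r > 0.80 cutoff (r = 0.80 with n = 6 gives P of about 0.056), so both thresholds matter at small sample sizes. The sketch keeps only positive correlations, following the threshold as worded in the text.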
Pathway diagram of phenylpropanoid biosynthesis. Boxes represent genes and circles represent metabolites; red/green backgrounds mark DEGs/DAMs, where red indicates upregulated genes/metabolites and green indicates downregulated genes/metabolites; items in the blue frame are genes/metabolites showing both upregulation and downregulation
Determination of coumarin in different tissues and periods
At different developmental stages, eight coumarins were identified in the roots, stems, and leaves of P. praeruptorum. In addition to praeruptorin A, praeruptorin B, and praeruptorin E, the compounds identified were xanthotoxin, scopoletin, isoscopoletin, peucedanocoumarin II, and bergapten. The calibration curve of each component showed good linearity, with R² exceeding 0.999 (Additional file 11; Additional file 4–6). The coumarin content in different tissues varied during the different stages of growth and development (Fig. 11). Praeruptorin B was predominantly found in the roots, whereas praeruptorin A was mainly found in the roots and leaves. Praeruptorin E, in contrast, was primarily found in the leaves. The coumarin concentration in the tissues fluctuated. After bolting, the content of praeruptorin A and praeruptorin E diminished markedly in the roots, whereas it increased in the leaves. The three praeruptorins are angular pyranocoumarins, which may explain the use of the root as a therapeutic element.
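The linearity criterion mentioned above (R² exceeding 0.999) is the coefficient of determination of a least-squares line fitted to standard concentration versus peak area. A minimal sketch is given below; the function name and the standard series are hypothetical and are not taken from the additional files.

```python
def linear_fit_r2(conc, area):
    """Least-squares line area = slope * conc + intercept, plus its R^2."""
    n = len(conc)
    mx, my = sum(conc) / n, sum(area) / n
    sxx = sum((x - mx) ** 2 for x in conc)
    sxy = sum((x - mx) * (y - my) for x, y in zip(conc, area))
    slope = sxy / sxx
    intercept = my - slope * mx
    # R^2 = 1 - residual sum of squares / total sum of squares
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(conc, area))
    ss_tot = sum((y - my) ** 2 for y in area)
    return slope, intercept, 1 - ss_res / ss_tot


# Hypothetical five-point standard series (concentration, peak area).
conc = [1.0, 2.0, 4.0, 8.0, 16.0]
area = [10.1, 19.8, 40.3, 79.9, 160.2]

slope, intercept, r2 = linear_fit_r2(conc, area)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}, R^2 = {r2:.5f}")
print("passes linearity criterion:", r2 > 0.999)
```

Once a curve passes the criterion, sample concentrations follow by inverting the fitted line, that is, concentration = (peak area - intercept) / slope.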
Discussion
Multiple roles of R2R3-MYB genes in growth and secondary metabolism
With the development of genomic and sequencing technologies, MYB transcription factors have been identified in many plants. Owing to gene duplication and evolution, there are significant differences in their number, structure, and function. R2R3-MYB transcription factors are the most common type. In Pistacia chinensis, 158 R2R3-MYB genes have been identified and divided into 32 subgroups; PcMYB113 may help produce anthocyanins during the fall coloring season [41]. ODORANT1, a member of the R2R3-MYB family in lily, regulates the production of benzenoid/phenylpropanoid compounds [42]. In tea plants, an R2R3-MYB inhibits the synthesis of phenylpropanoids and lignin and also affects plant growth and leaf polarity [43]. MYB genes have been identified in 437 flowering plant species from 22 major classes using whole-genome identification and phylogenetic analysis to investigate the evolutionary differences between flowering and non-flowering plants. The MYB genes thus appear to be adaptable to the environment and to have promoted phenotypic differentiation among species [44]. These findings suggest that R2R3-MYB genes regulate the synthesis of benzenoid/phenylpropanoid compounds either directly or indirectly. In this study, 157 members of the R2R3-MYB family were identified, all of which were hydrophilic proteins, with the majority located in the nucleus. For the WGD event analysis, P. praeruptorum had more syntenic blocks with A. sinensis, supporting its close evolutionary relationship with other Apiaceae species. In addition, whole genome duplication and segmental duplication may have accelerated the expansion of the R2R3-MYB genes. Microsynteny analysis indicated 64 gene pairs of R2R3-MYB members (Fig. 1).
Phylogenetic analysis showed that the motif composition of the R2R3-MYB proteins was consistent with the phylogenetic relationships between the sequences. Proteins within the same subgroup had similar motif compositions and numbers; however, there were large differences between the groups. PpMYB members that clustered in the same clade as A. thaliana proteins may share a common evolutionary origin and functional similarities (Figs. 2 and 3). In A. thaliana, the S3, S4, S5, S6, S7, and S12 subgroups usually play a role in regulating primary and secondary metabolism. AtMYB58 and AtMYB63 belong to S3 and activate the expression of key upstream enzymes in lignin biosynthesis, such as PAL and 4CL. They also participate in phenylpropanoid and lignin metabolism in A. thaliana [28]. Other studies have shown that the root-specific AtMYB72, with the assistance of its homolog AtMYB10, promotes coumarin production under iron or phosphorus stress. AtMYB72 and BGLU42 induce coumarin synthesis and secretion. This may be related to the regulation of coumarin biosynthesis in the phenylpropanoid biosynthesis pathway [14, 45]. Previous studies have shown that S4 subgroup members repress the phenylpropanoid biosynthesis pathway. Overexpression of AmMYB308 and AmMYB330 inhibits the expression of PAL and 4CL in the pathway, resulting in a reduction in downstream products [46]. ZmMYB31 and ZmMYB42, which are homologs of the A. thaliana S4 genes, significantly inhibit the expression of the key enzymes COMT and 4CL [47]. There are four PpMYB members in group C9 (S4), which are in a branch with AtMYB4 in the phylogenetic tree. AtMYB4 regulates the accumulation of phenylpropanoid compounds in plants by inhibiting the transcription of C4H, suggesting that PpMYB89 may be involved in the regulation of the coumarin metabolic pathway [16].
In the C5 (S22) group, PpMYB65, PpMYB143, and PpMYB148 clustered together with AtMYB44 and AtMYB73, suggesting that they may be involved in regulating auxin signaling and plant stress responses [48,49,50].
Transcriptional regulation of R2R3-MYB genes in coumarin biosynthesis pathway
After bolting, coumarin accumulation is insufficient, leading to a decline in the medicinal quality of P. praeruptorum. Coumarin synthesis is complex and involves several key enzymes and intermediate metabolites. The biosynthesis of coumarin components occurs primarily downstream of the phenylpropanoid pathway and has not yet been fully elucidated. As an important bridge between genetic information and proteins, the transcriptome can help explore important biological functions of genes [51]. Next-generation sequencing was used to annotate genes in six different samples of P. praeruptorum before and after bolting. KEGG pathway analysis of DEGs revealed that 938 DEGs were enriched in 118 pathways, which were significantly enriched in plant hormone signal transduction, plant circadian rhythm, alpha-linolenic acid metabolism, and biosynthesis of unsaturated fatty acids (Fig. 4). Half of these DEGs were upregulated after bolting and were involved in apoptosis and material transport after root lignification. Compared to the reference genome, 13,128 new genes were discovered, nearly half of which were not annotated in the database. Functional annotation revealed that the biological functions of these new transcripts were primarily related to membrane composition, ATP binding, and cellular components. KEGG annotation showed that most of the new genes were linked to pathways for starch and sucrose metabolism, protein processing in the endoplasmic reticulum, and interactions between plants and pathogens (Fig. 5). The results of the expression profiling showed that PpMYB genes were expressed in different patterns in the roots, stems, and leaves (Fig. 6). Most genes were either expressed at low levels or not expressed at all, and these genes may only be activated under specific circumstances. For instance, AtMYB41 in A. thaliana is not expressed in any tissue under normal conditions and is induced only by drought stress, ABA, and salt stress [52].
The first steps in the phenylpropanoid pathway involve PAL, C4H, 4CL, and C2’H. Previously, the genes encoding these enzymes were found to be expressed in different tissues and at different times, because their products are common precursors for the anthocyanin, lignin, and flavonoid pathways [53, 54]. The temporal and spatial expression patterns of the main enzymes in the coumarin synthesis pathway were also unique, with an overall decreasing trend, largely consistent with that of a previous study [55].
Identification of hub R2R3-MYB genes based on an integral transcriptomic and metabolomic approach
Analyzing the interactions between proteins can provide further insights into their associated regulatory pathways. One way to identify genes involved in coumarin synthesis is to analyze how PpMYB genes are coexpressed with key genes in the coumarin pathway. MYB, bHLH, and WD40 combine to form a ternary complex [56]. This complex regulates the flavonoid biosynthesis genes chalcone synthase (CHS) and flavonol synthase (FLS) and can act as a positive or negative regulator. A correlation network map was created between the coumarin synthesis genes and transcription factors. F6H is strongly associated with the MYB transcription factors. PpMYB3, PpMYB54, PpMYB103, and other genes showed expression patterns similar to that of S8H-2 and were grouped in the same branch, suggesting that they may regulate S8H-2 expression. PT-6 and PpMYB68, as well as F6H and PpMYB6, also showed similar expression patterns. In this study, 3,928 metabolites were detected in P. praeruptorum before and after bolting using metabolomic techniques. Of these, 1,954 were DAMs, and 32 were enriched in the phenylpropanoid pathway. In a previous study, metabolomic analysis of six cultivated types of P. praeruptorum identified 22 possible marker metabolites that were strongly linked to quality indicators [57]. Another metabolomic study detected metabolic differences between P. praeruptorum and A. decursiva and found that the differential compounds between the two were mainly enriched in the biosynthetic pathway of phenylpropanoid compounds [58]. Four coumarins, bergapten, praeruptorin A, praeruptorin B, and praeruptorin E, were detected in high amounts in bolted plants in previous studies [5]. Coumarin components in various tissues of P. praeruptorum were determined using high-performance liquid chromatography. The primary medicinal components were identified as angular pyranocoumarins, which is in accordance with the metabolomic profile and our previous study [5, 8] (Fig. 11).
Scopoletin, isoscopoletin, and xanthotoxin were transported to the apical parts after bolting, probably due to the large amounts of enzymes synthesized in the apical parts. In general, the patterns of accumulation and gene expression levels in each tissue showed various trends. This indicated that coumarin production was controlled by complex networks. Using transcriptomics, we can further understand the relationships between genes and metabolites in plants. Metabolomics and transcriptomics techniques were used to illustrate that the homologous protein triplets MYB99, MYB21-5, and MYB24-5 coordinated with each other in each branch of the phenylpropanoid pathway in A. thaliana mutants and affected coumarin synthesis [59]. Integration of genome, transcriptome, and metabolome analyses in Melilotus albus revealed a consistent correlation between scopolin content and the expression pattern of MabHLH11 [60]. Furthermore, co-expression of MaMYB4 with MabHLH11 resulted in enhanced scopolin accumulation. The molecular mechanism underlying the upregulation of scopoletin biosynthesis in M. albus through the interaction between MaMYB4 and MabHLH11 was elucidated, thereby improving our understanding of the scopoletin regulatory network.
Conclusions
Multi-omics based bioinformatics analysis revealed 157 R2R3-MYB transcription factors expressed in P. praeruptorum. The expression patterns of R2R3-MYB genes were different in different tissues, and half of the PpMYB genes demonstrated tissue-specific expression. The expression patterns of R2R3-MYB genes were specific to the tissues and enzymes that produce coumarin, allowing the identification of candidate genes that were co-expressed with MYB. Metabolomic analysis revealed 1,502 differential metabolites in the tissues of P. praeruptorum in the pre- and post-bolting periods, of which coumarin and its derivatives accounted for 44.21%. There was a significant correlation between the 11 DAMs and 48 DEGs that were mapped to the phenylpropanoid biosynthesis pathway. Using transcriptomic and metabolomic techniques, this study identified the coumarin content and composition in various tissues and stages, as well as the differential structural genes and transcription factors involved in coumarin synthesis. Our results provide an important basis for elucidating the regulatory mechanisms of coumarin synthesis in P. praeruptorum.
Data availability
The raw reads were deposited in BioProject under accession number PRJNA524246. The coding sequences, protein sequences, and annotations are available at https://www.ncbi.nlm.nih.gov/sra/PRJNA1080496. The raw data for mass spectrometry analysis have been deposited in the OMIX database (https://ngdc.cncb.ac.cn/omix/release/OMIX006783).
Abbreviations
- DEGs: differentially expressed genes
- pI: isoelectric point
- HMM: hidden Markov model
- MW: molecular weight
- NJ: neighbor-joining
- FPKM: Fragments Per Kilobase Million
- GO: Gene Ontology
- KEGG: Kyoto Encyclopedia of Genes and Genomes
- RT-qPCR: reverse transcription quantitative PCR
- MS: mass spectrometer
- PCA: Principal component analysis
- SCM: significantly changed metabolites
- DAM: differential accumulated metabolite
- HPLC: high-performance liquid chromatography
References
Song C, Li X, Jia B, Liu L, Ou J, Han B. De novo Transcriptome sequencing coupled with co-expression analysis reveal the transcriptional regulation of key genes involved in the formation of active ingredients in Peucedanum Praeruptorum Dunn under bolting period. Front Genet. 2021;12:683037.
Chen LL, Chu SS, Zhang L, Xie J, Dai M, Wu X, Peng HS. Tissue-specific metabolite profiling on the different parts of bolting and unbolting Peucedanum Praeruptorum Dunn (Qianhu) by laser microdissection combined with UPLC-Q/TOF-MS and HPLC-DAD. Molecules. 2019;24(7):1439.
Song Y, Jing W, Yan R, Wang Y. Research progress of the studies on the roots of Peucedanum Praeruptorum dunn (Peucedani radix). Pak J Pharm Sci. 2015;28(1):71–81.
Song Y, Jing W, Yang F, Shi Z, Yao M, Yan R, Wang Y. Simultaneously enantiospecific determination of (+)-trans-khellactone, (+/-)-praeruptorin A, (+/-)-praeruptorin B, (+)-praeruptorin E, and their metabolites, (+/-)-cis-khellactone, in rat plasma using online solid phase extraction-chiral LC-MS/MS. J Pharm Biomed Anal. 2014;88:269–77.
Xie J, Tang X, Xie C, Wang Y, Huang J, Jin J, Liu H, Zhong C, Zhou R, Ren G, et al. Comparative analysis of root anatomical structure, chemical components and differentially expressed genes between early bolting and unbolting in Peucedanum Praeruptorum Dunn. Genomics. 2023;115(2):110557.
Song C, Zhang W, Manzoor MA, Sabir IA, Pan H, Zhang L, Zhang Y. Differential involvement of PEBP genes in early flowering of Peucedanum Praeruptorum Dunn. Postharvest Biol Technol. 2024;212:112860.
Chu S, Chen L, Xie H, Xie J, Zhao Y, Tong Z, Xu R, Peng H. Comparative analysis and chemical profiling of different forms of Peucedani Radix. J Pharm Biomed Anal. 2020;189:113410.
Song C, Li X, Jia B, Liu L, Wei P, Manzoor MA, et al. Comparative transcriptomics unveil the crucial genes involved in Coumarin Biosynthesis in Peucedanum Praeruptorum Dunn. Front Plant Sci. 2022;13:1–14.
Li H, Chen L, Wang T, Xiong F. Synthesis of coumarin 3-aldehyde derivatives via Photocatalytic Cascade Radical Cyclization-Hydrolysis. ChemistrySelect. 2022;7(28):e202200822.
Zhao Y, Jian X, Wu J, Huang W, Huang C, Luo J, Kong L. Elucidation of the biosynthesis pathway and heterologous construction of a sustainable route for producing umbelliferone. J Biol Eng. 2019;13(1):44.
Han X, Li C, Sun S, Ji J, Nie B, Maker G, Ren Y, Wang L. The chromosome-level genome of female ginseng (Angelica Sinensis) provides insights into molecular mechanisms and evolution of coumarin biosynthesis. Plant J. 2022;112(5):1224–37.
Song C, Zhang Y, Manzoor MA, Wei P, Yi S, Chu S, Tong Z, Song X, Xu T, Wang F, et al. A chromosome-scale genome of Peucedanum praeruptorum provide insights into Apioideae evolution and medicinal ingredient biosynthesis. Int J Biol Macromol. 2024;255:128218.
Song C, Manzoor MA, Ren Y, Guo J, Zhang P, Zhang Y. Exogenous melatonin alleviates sodium chloride stress and increases vegetative growth in Lonicera japonica seedlings via gene regulation. BMC Plant Biol. 2024;24:790.
Palmer CM, Hindt MN, Schmidt H, Clemens S, Guerinot ML. MYB10 and MYB72 are required for growth under iron-limiting conditions. PLoS Genet. 2013;9(11):e1003953.
Robe K, Conejero G, Gao F, Lefebvre-Legendre L, Sylvestre-Gonon E, Rofidal V, Hem S, Rouhier N, Barberon M, Hecker A, et al. Coumarin accumulation and trafficking in Arabidopsis thaliana: a complex and dynamic process. New Phytol. 2021;229(4):2062–79.
Banerjee S, Agarwal P, Choudhury SR, Roy S. MYB4, a member of R2R3-subfamily of MYB transcription factor functions as a repressor of key genes involved in flavonoid biosynthesis and repair of UV-B induced DNA double strand breaks in Arabidopsis. Plant Physiol Biochem. 2024;211:108698.
Katiyar A, Smita S, Lenka SK, Rajwanshi R, Chinnusamy V, Bansal KC. Genome-wide classification and expression analysis of MYB transcription factor families in rice and Arabidopsis. BMC Genomics. 2012;13:544.
Liu C, Long J, Zhu K, Liu L, Yang W, Zhang H, Li L, Xu Q, Deng X. Characterization of a Citrus R2R3-MYB transcription factor that regulates the Flavonol and Hydroxycinnamic Acid Biosynthesis. Sci Rep. 2016;6(1):25352.
Wang W, Hu S, Zhang C, Yang J, Zhang T, Wang D, Cao X, Wang Z. Systematic analysis and functional characterization of R2R3-MYB genes in Scutellaria baicalensis Georgi. Int J Mol Sci. 2022;23(16):9342.
Yang J, Zhang B, Gu G, Yuan J, Shen S, Jin L, Lin Z, Lin J, Xie X. Genome-wide identification and expression analysis of the R2R3-MYB gene family in tobacco (Nicotiana tabacum L). BMC Genomics. 2022;23(1):432.
Liu J, Wang J, Wang M, Zhao J, Zheng Y, Zhang T, Xue L, Lei J. Genome-wide analysis of the R2R3-MYB Gene Family in Fragaria × ananassa and its function identification during anthocyanins Biosynthesis in Pink-flowered Strawberry. Front Plant Sci. 2021;12:702160.
Ma S, Yang Z, Wu F, Ma J, Fan J, Dong X, Hu R, Feng G, Li D, Wang X, et al. R2R3-MYB gene family: genome-wide identification provides insight to improve the content of proanthocyanidins in Trifolium repens. Gene. 2022;829:146523.
Jiang CK, Rao GY. Insights into the diversification and evolution of R2R3-MYB transcription factors in plants. Plant Physiol. 2020;183(2):637–55.
Wang B, Luo Q, Li Y, Yin L, Zhou N, Li X, Gan J, Dong A. Structural insights into target DNA recognition by R2R3-MYB transcription factors. Nucleic Acids Res. 2020;48(1):460–71.
Baudry A, Heim MA, Dubreucq B, Caboche M, Weisshaar B, Lepiniec L. TT2, TT8, and TTG1 synergistically specify the expression of BANYULS and proanthocyanidin biosynthesis in Arabidopsis thaliana. Plant J. 2004;39(3):366–80.
Hao X, Pu Z, Cao G, You D, Zhou Y, Deng C, Shi M, Nile SH, Wang Y, Zhou W, et al. Tanshinone and salvianolic acid biosynthesis are regulated by SmMYB98 in Salvia miltiorrhiza hairy roots. J Adv Res. 2020;23:1–12.
Khan K, Kumar V, Niranjan A, Shanware A, Sane VA. JcMYB1, a Jatropha R2R3MYB transcription factor gene, modulates lipid biosynthesis in transgenic plants. Plant Cell Physiol. 2019;60(2):462–75.
Zhou J, Lee C, Zhong R, Ye ZH. MYB58 and MYB63 are transcriptional activators of the lignin biosynthetic pathway during secondary cell wall formation in Arabidopsis. Plant Cell. 2009;21(1):248–66.
Miyamoto T, Tobimatsu Y, Umezawa T. MYB-mediated regulation of lignin biosynthesis in grasses. Curr Plant Biol. 2020;24:100174.
Zhang Q, Wang L, Wang Z, Zhang R, Liu P, Liu M, Liu Z, Zhao Z, Wang L, Chen X, et al. The regulation of cell wall lignification and lignin biosynthesis during pigmentation of winter jujube. Hortic Res. 2021;8(1):238.
Dai X, Xu Y, Ma Q, Xu W, Wang T, Xue Y, Chong K. Overexpression of an R1R2R3 MYB gene, OsMYB3R-2, increases tolerance to freezing, drought, and salt stress in transgenic Arabidopsis. Plant Physiol. 2007;143(4):1739–51.
Lee SB, Kim HU, Suh MC. MYB94 and MYB96 additively activate cuticular wax biosynthesis in Arabidopsis. Plant Cell Physiol. 2016;57(11):2300–11.
Gong Q, Li S, Zheng Y, Duan H, Xiao F, Zhuang Y, He J, Wu G, Zhao S, Zhou H, et al. SUMOylation of MYB30 enhances salt tolerance by elevating alternative respiration via transcriptionally upregulating AOX1a in Arabidopsis. Plant J. 2020;102(6):1157–71.
Li S, Chiu TY, Jin X, Cao D, Xu M, Zhu M, Zhou Q, Liu C, Zong Y, Wang S, et al. Integrating genomic and multiomic data for Angelica sinensis provides insights into the evolution and biosynthesis of pharmaceutically bioactive compounds. Commun Biol. 2023;6(1):1198.
Zhao Y, He Y, Han L, Zhang L, Xia Y, Yin F, Wang X, Zhao D, Xu S, Qiao F, et al. Two types of coumarins-specific enzymes complete the last missing steps in pyran- and furanocoumarins biosynthesis. Acta Pharm Sin B. 2024;14(2):869–80.
Saez A, Rodrigues A, Santiago J, Rubio S, Rodriguez PL. HAB1-SWI3B interaction reveals a link between abscisic acid signaling and putative SWI/SNF chromatin-remodeling complexes in Arabidopsis. Plant Cell. 2008;20(11):2972–88.
Matus JT, Aquea F, Arce-Johnson P. Analysis of the grape MYB R2R3 subfamily reveals expanded wine quality-related clades and conserved gene structure organization across Vitis and Arabidopsis genomes. BMC Plant Biol. 2008;8:83.
Oh JE, Kwon Y, Kim JH, Noh H, Hong S-W, Lee H. A dual role for MYB60 in stomatal regulation and root growth of Arabidopsis thaliana under drought stress. Plant Mol Biol. 2011;77(1):91–103.
Stracke R, Werber M, Weisshaar B. The R2R3-MYB gene family in Arabidopsis thaliana. Curr Opin Plant Biol. 2001;4(5):447–56.
Liu J, Osbourn A, Ma P. MYB transcription factors as regulators of Phenylpropanoid metabolism in plants. Mol Plant. 2015;8(5):689–708.
Song X, Yang Q, Liu Y, Li J, Chang X, Xian L, Zhang J. Genome-wide identification of Pistacia R2R3-MYB gene family and function characterization of PcMYB113 during autumn leaf coloration in Pistacia chinensis. Int J Biol Macromol. 2021;192:16–27.
Yoshida K, Oyama-Okubo N, Yamagishi M. An R2R3-MYB transcription factor ODORANT1 regulates fragrance biosynthesis in lilies (Lilium spp). Mol Breeding. 2018;38(12):144.
Li M, Li Y, Guo L, Gong N, Pang Y, Jiang W, Liu Y, Jiang X, Zhao L, Wang Y, et al. Functional characterization of tea (Camellia sinensis) MYB4a transcription factor using an Integrative Approach. Front Plant Sci. 2017;8:943.
Song C, Cao Y, Dai J, Li G, Manzoor MA, Chen C, Deng H. The multifaceted roles of MYC2 in plants: toward transcriptional reprogramming and stress tolerance by Jasmonate Signaling. Front Plant Sci. 2022;13:868874.
Stringlis IA, Yu K, Feussner K, de Jonge R, Van Bentum S, Van Verk MC, Berendsen RL, Bakker P, Feussner I, Pieterse CMJ. MYB72-dependent coumarin exudation shapes root microbiome assembly to promote plant health. Proc Natl Acad Sci U S A. 2018;115(22):E5213–22.
Tamagnone L, Merida A, Parr A, Mackay S, Culianez-Macia FA, Roberts K, Martin C. The AmMYB308 and AmMYB330 transcription factors from antirrhinum regulate phenylpropanoid and lignin biosynthesis in transgenic tobacco. Plant Cell. 1998;10(2):135–54.
Agarwal T, Grotewold E, Doseff AI, Gray J. MYB31/MYB42 syntelogs exhibit divergent regulation of Phenylpropanoid genes in Maize, Sorghum and Rice. Sci Rep. 2016;6:28502.
Shin R, Burch AY, Huppert KA, Tiwari SB, Murphy AS, Guilfoyle TJ, Schachtman DP. The Arabidopsis transcription factor MYB77 modulates auxin signal transduction. Plant Cell. 2007;19(8):2440–53.
Han X, Li M, Yuan Q, Lee S, Li C, et al. Advances in molecular biological research of Angelica Sinensis. Med Plant Biol. 2023;2:16. https://doi.org/10.48130/MPB-2023-001.
Lee D, Polisensky DH, Braam J. Genome-wide identification of touch- and darkness-regulated Arabidopsis genes: a focus on calmodulin-like and XTH genes. New Phytol. 2005;165(2):429–44.
Wei P, Li Y, Song C, Manzoor MA, Dai J, Yin Q, Zhang Y, Han B. Analysis of coumarin content and key enzyme genes expression involved in coumarin biosynthesis from Peucedanum Praeruptorum Dunn at different stages. Acta Physiol Plant. 2023;45(12):141.
Cominelli E, Sala T, Calvi D, Gusmaroli G, Tonelli C. Over-expression of the Arabidopsis AtMYB41 gene alters cell expansion and leaf surface permeability. Plant J. 2008;53(1):53–64.
Wang Y, Liao R, Pan H, Wang X, Wan X, Han B, et al. Comparative metabolic profiling of the mycelium and fermentation broth of Penicillium restrictum from Peucedanum praeruptorum rhizosphere. Environ Microbiol Rep. 2024;16:1–15.
Bai M, Jiang S, Chu S, Yu Y, Shan D, Liu C, Zong L, Liu Q, Liu N, Xu W, et al. The telomere-to-telomere (T2T) genome of Peucedanum praeruptorum Dunn provides insights into the genome evolution and coumarin biosynthesis. GigaScience. 2024;13:13.
Song C, Li X, Jia B, Liu L, Wei P, Manzoor MA, Wang F, Li BY, Wang G, Chen C, et al. Comparative transcriptomics unveil the crucial genes involved in Coumarin biosynthesis in Peucedanum Praeruptorum Dunn. Front Plant Sci. 2022;13:899819.
Wang B, Luo Q, Li Y, Du K, Wu Z, Li T, Shen WH, Huang CH, Gan J, Dong A. Structural insights into partner selection for MYB and bHLH transcription factor complexes. Nat Plants. 2022;8(9):1108–17.
Guo Y, Xu G, Luo S, Luo M, Yang D, Tan QS, Yang YD, Deng CF. Metabolomics of quality formation of different cultivars of Peucedanum praeruptorum. Zhongguo Zhong Yao Za Zhi. 2024;49(3):681–90.
Li Y, Zhou D, Wang M. Metabolomics analysis revealed the differential metabolites of Peucedani Radix and Peucedani Decursivi Radix. Special Wild Economic Anim Plant Res. 2024;46(04):72–81.
Battat M, Eitan A, Rogachev I, Hanhineva K, Fernie A, Tohge T, Beekwilder J, Aharoni A. A MYB triad controls primary and phenylpropanoid metabolites for pollen coat patterning. Plant Physiol. 2019;180(1):87–108.
Duan Z, Wang S, Zhang Z, Yan Q, Zhang C, Zhou P, Wu F, Zhang J. The MabHLH11 transcription factor interacting with MaMYB4 acts additively in increasing plant scopolin biosynthesis. Crop J. 2023;11(6):1675–85.
Acknowledgements
Not applicable.
Funding
This work was supported by the National Key R&D Program of China (2023YFC3503804), the Open Fund of Anhui Engineering Laboratory for Conservation and Sustainable Utilization of Traditional Chinese Medicine Resources (TCMRPSU-2022-04), and the Open Fund of the Anhui Dabieshan Academy of Traditional Chinese Medicine (TCMADM-2023-03).
Author information
Contributions
CS and BXH conceived and designed the study. RRL, JZY, YXL, and HYP analyzed the experimental data. RRL drafted the manuscript. CS, RRL, YYZ, and BXH revised the manuscript. RRL, JZY, YXL, and HYP collected the samples. CS and BXH acquired the funding. All authors read and approved the final manuscript.
Ethics declarations
Ethics approval and consent to participate
The samples were collected from plants growing in the Botanical Garden of West Anhui University. The voucher specimens were deposited in the herbarium of West Anhui University and identified by Prof. Bangxing Han. This study complies with relevant institutional, national, and international guidelines and legislation.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
12870_2024_5864_MOESM1_ESM.tif
Additional file 1. KEGG pathway enrichment analysis of differentially expressed genes in different tissues and at different stages. A-F: unbolted root, bolting root, unbolted stem, bolting stem, unbolted leaf, and bolting leaf.
12870_2024_5864_MOESM4_ESM.tif
Additional file 4. HPLC fingerprint of coumarin standards. The chromatogram shows isoscopoletin, scopoletin, xanthotoxin, bergapten, Peucedanumcoumarin II, praeruptorin A, praeruptorin B, and praeruptorin E.
12870_2024_5864_MOESM5_ESM.tif
Additional file 5. HPLC fingerprints of extracts from bolting P. praeruptorum. A: bolting root; B: bolting stem; C: bolting leaf.
12870_2024_5864_MOESM6_ESM.tif
Additional file 6. HPLC fingerprints of extracts from unbolted P. praeruptorum. A: unbolted root; B: unbolted stem; C: unbolted leaf.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Liao, R., Yao, J., Zhang, Y. et al. MYB transcription factors in Peucedanum Praeruptorum Dunn: the diverse roles of the R2R3-MYB subfamily in mediating coumarin biosynthesis. BMC Plant Biol 24, 1135 (2024). https://doi.org/10.1186/s12870-024-05864-1
DOI: https://doi.org/10.1186/s12870-024-05864-1










