Tc-MYBPA is an Arabidopsis TT2-like transcription factor and functions in the regulation of proanthocyanidin synthesis in Theobroma cacao
© Liu et al. 2015
Received: 28 January 2015
Accepted: 20 May 2015
Published: 25 June 2015
The flavan-3-ols catechin and epicatechin, and their polymerized oligomers, the proanthocyanidins (PAs, also called condensed tannins), accumulate to levels of up to 15 % of the total weight of dry seeds of Theobroma cacao L. These compounds have been associated with several health benefits in humans. They also play important roles in pest and disease defense throughout the plant. In Arabidopsis, the R2R3 type MYB transcription factor TT2 regulates the major genes leading to the synthesis of PA.
To explore the transcriptional regulation of the PA synthesis pathway in cacao, we isolated and characterized an R2R3 type MYB transcription factor, MYBPA, from cacao. We examined the spatial and temporal gene expression patterns of the Tc-MYBPA gene and found it to be developmentally expressed in a manner consistent with its involvement in PA and anthocyanin synthesis. Functional complementation of an Arabidopsis tt2 mutant with Tc-MYBPA suggested that it can functionally substitute for the Arabidopsis TT2 gene. Interestingly, in addition to PA accumulation in seeds of the Tc-MYBPA expressing plants, we also observed an obvious increase of anthocyanidin accumulation in hypocotyls. We observed that overexpression of the Tc-MYBPA gene resulted in increased expression of several key genes encoding the major structural enzymes of the PA and anthocyanidin pathway, including DFR (dihydroflavonol reductase), LDOX (leucoanthocyanidin dioxygenase) and BAN (ANR, anthocyanidin reductase).
We conclude that the Tc-MYBPA gene that encodes an R2R3 type MYB transcription factor is an Arabidopsis TT2 like transcription factor, and may be involved in the regulation of both anthocyanin and PA synthesis in cacao. This research may provide molecular tools for breeding of cacao varieties with improved disease resistance and enhanced flavonoid profiles for nutritional and pharmaceutical applications.
Keywords: Theobroma cacao, Proanthocyanidin, Transcription factor, TT2
The regulation of proanthocyanidin (PA) synthesis has been well characterized by the analysis of transparent testa (tt) mutants that fail to accumulate PAs in the seed coat [6, 9]. Three TT loci, TT2, TT8 and TTG1, which encode R2R3-MYB, bHLH and WD40 repeat proteins respectively, are necessary for proper temporal and spatial accumulation of PAs. The combinatorial interactions of different members from these three protein families determine the specificity of target gene activation [4, 6, 10, 11]. This interaction has been shown for several flavonoid synthesis regulators isolated from Arabidopsis [4, 6, 10, 11], Zea mays [12, 13] and Petunia hybrida [14–16]. The three proteins interact and form a ternary transcriptional protein complex to activate “late” PA-specific genes including DFR (dihydroflavonol reductase), LDOX (leucoanthocyanidin dioxygenase, also called ANS, anthocyanidin synthase) and BAN (ANR, anthocyanidin reductase) [10, 11, 17, 18]. Another three TT loci, TT16, TT1 and TTG2, which encode a MADS box protein, a zinc-finger protein and a WRKY transcription factor, respectively, are also important for PA synthesis. These proteins have been shown to regulate the expression of the BAN protein through a posttranscriptional mechanism and thus are involved in the differentiation of PA-accumulating cells.
The TT2 gene product (TT2) is a key regulator of PA synthesis and confers target gene specificity to the MYB-bHLH-WD40 complex. It is specifically expressed in PA-accumulating cells in Arabidopsis but can induce ectopic expression of the BAN gene when constitutively expressed in the presence of a functional TT8 protein . TT2 belongs to the large R2R3-MYB protein family that has 133 members in Arabidopsis. These proteins are typically involved in many aspects of plant secondary metabolism, plant cell identity and cell fate determination [19, 20]. Members of the R2R3-MYB protein family are characterized by the presence of two highly conserved head-to-tail MYB motifs in the N-terminal region, the R2 and R3 repeats, although their C-terminal regions are very divergent. Each of the R2R3 repeats consists of three α-helices ; helix 3 of each motif is involved in interaction with DNA and helix 1 of the R3 repeat is important for corresponding bHLH recognition.
In addition to Arabidopsis, TT2-like PA-specific R2R3-MYB transcription factors (TFs) have been characterized in grape (Vitis vinifera), Lotus (Lotus japonicus), poplar (Populus tremuloides), persimmon (Diospyros kaki), clover (Trifolium arvense) and Medicago (Medicago truncatula) [21–27]. In grape, two TT2-like MYB TFs (VvMYBPA1 and VvMYBPA2) have been identified [23, 24]. These TFs exhibit tissue-specific functions in inducing PA structural gene expression and synthesis: VvMYBPA1 is mainly expressed in seeds, and VvMYBPA2 is mainly expressed in the exocarp of young berries and in the leaves. Similar observations were reported in Lotus, in which three copies of TT2-like R2R3-MYB TFs were identified that differed in organ-specific expression and responsiveness to stress. Each of the TFs mentioned above is capable of activating the ANR promoter in transient reporter assays. In poplar, a MYB134 gene encoding a TT2-like TF was recently shown to be responsive to wounding, pathogen presence and UV-B irradiation, consistent with the biological roles of PAs in anti-herbivore, anti-pathogen and UV damage protection. Overexpression of MYB134 in poplar resulted in transcriptional activation of the genes encoding enzymes of the full PA biosynthesis pathway from PAL1 to ANR and LAR, but not FLS, which is specific to flavonol synthesis.
There are a variety of plant-based foods and beverages that serve as natural sources of flavonoids, including cacao, red wine, grape, apple and cranberries. Among those, cacao has an extraordinarily high amount of flavonoid, especially PAs , which make up about 10–14 % of dry weight in mature beans . The development of cacao and flavonoid (mainly anthocyanins) synthesis has been described previously . The development of cacao fruits can be divided into three phases . Following pollination and fertilization, the first phase of fruit development is initiated and fruit begins to expand slowly at a rate of about 30–40 cm3/ week . This phase lasts 6–7 weeks until the first division of the fertilized egg, which initiates the second phase of pod development. At the second phase, fruits expand more rapidly at a rate of about 110–130 cm3/week, and embryos enlarge but remain unpigmented till they reach the length of ovules at about 14–16 weeks after pollination [31, 33]. When the fruits are 14–16 weeks old, the pericarp begins to change color from green to orange (in Scavina 6), denoting onset of the third phase, ripening. Ripe pod color varies from bright red, purple, green, yellow and multi-colored patterns, dependent on genotype. During the third phase, the increase in the fruit external dimensions gradually slows and finally ceases. The seeds begin to solidify and their dry weight increases rapidly at a rate of about 20–40 mg/day. Seed length remains constant as they continue to accumulate anthocyanins and gradually darken until maturity at about 20 weeks after pollination [30–33].
This research describes the isolation and characterization of a cacao gene, Tc-MYBPA, which encodes an R2R3-MYB transcription factor involved in regulating the biosynthesis of cacao PAs. Constitutive expression of Tc-MYBPA in the Arabidopsis tt2 mutant not only successfully complemented its primary phenotype (a PA-deficient seed coat) but also resulted in increased anthocyanin accumulation in young seedlings, suggesting that Tc-MYBPA may regulate both the anthocyanin and PA pathways in cacao.
The Cacao Tc-MYBPA gene encodes an R2R3-MYB transcription factor
Four putative Tc-MYBPA cDNA sequences were identified in a collection of Theobroma cacao expressed sequence tags (ESTs) by querying the cacao ESTtik database (http://esttik.cirad.fr/) with the protein sequence of Arabidopsis TT2 (accession no. Q9FJA2). This cacao EST database contains 56 cDNA libraries constructed from different organs, two main genotypes and different stress conditions, and thus could be considered an exhaustive collection of cacao expressed genes. ESTs showing sequence similarity to the TT2 gene were assembled into a contig to recover full-length open reading frames (ORFs) by alignment with cDNAs of homologous genes from other species and predictions from the ORF Finder program (www.ncbi.nlm.nih.gov/projects/gorf/). The full-length coding sequence of Tc-MYBPA was amplified by RT-PCR using cDNAs isolated from young leaves of cacao (Scavina 6), in which PAs are actively synthesized and accumulated. The isolated ORF was named Tc-MYBPA (accession no. GU324346). By searching the newly assembled cacao genome, we identified the Tc-MYBPA gene (Tc01_g034240), which is 1477 bp long with two exons. It is not associated with any currently identified quantitative trait loci (QTL) related to flavonoids. However, Tc-MYBPA is very closely associated with 7 out of 17 DFR orthologous genes located near the bottom of chromosome 1. We also searched the whole cacao genome with the protein sequence of Arabidopsis TT2 to check whether there are other possible homologous genes. The search revealed 7 candidate genes with a higher score than Tc-MYBPA (Additional file 1: Figure S1). However, we did not find any confident hits when searching their putative protein sequences back against the cacao EST database. Considering that this EST database covers a variety of tissues that have been shown to synthesize and accumulate PAs [34, 37], including leaves, roots, flowers, pods, seeds, and seed testa, the 7 candidate genes may be pseudogenes that are not expressed at all.
The 864-bp ORF of Tc-MYBPA encodes a protein of 287 amino acids that shares 68 % identity with grape VvMYBPA1. A protein sequence alignment of Tc-MYBPA with other PA- and anthocyanin-regulating MYB proteins revealed that Tc-MYBPA contains an N-terminal R2R3 repeat that corresponds to the DNA-binding domain of plant MYB-type proteins (Fig. 1a). Like the high sequence similarity observed between the R2R3 repeat regions shared by 126 members of Arabidopsis [19, 38], the Tc-MYBPA R2R3 repeat region is highly conserved when compared to other plant R2R3 MYBs. The Tc-MYBPA N-terminal region also contains the [D/E]LX2[R/K]X3LX6LX3R motif for interaction with bHLH partners in the R3 repeat region , whereas the C-terminal region shows little homology to the MYB proteins included in this comparison.
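The bHLH-interaction signature quoted above, [D/E]LX2[R/K]X3LX6LX3R, can be written as a regular expression for screening candidate MYB protein sequences. The following is a minimal sketch in Python, assuming that X stands for any residue and that the subscripts are repeat counts. The function name and the test peptide are hypothetical; the peptide is synthetic and is not the Tc-MYBPA sequence.

```python
import re

# [D/E]LX2[R/K]X3LX6LX3R, reading X as "any amino acid" (assumption)
BHLH_MOTIF = re.compile(r"[DE]L.{2}[RK].{3}L.{6}L.{3}R")


def find_bhlh_motif(protein):
    """Return (start, matched_text) for the first motif hit, or None."""
    match = BHLH_MOTIF.search(protein.upper())
    if match is None:
        return None
    return match.start(), match.group(0)


# Synthetic peptide assembled piece by piece only to satisfy the pattern
# (illustrative; not a real MYB sequence).
motif_example = "DL" + "AA" + "K" + "AAA" + "L" + "A" * 6 + "L" + "AAA" + "R"
synthetic = "MGS" + motif_example + "GG"

print(find_bhlh_motif(synthetic))
```

A hit only indicates that the signature is present; whether the protein actually binds a bHLH partner still has to be shown experimentally.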
To investigate these relationships more closely, a phylogenetic tree was constructed using the full-length amino acid sequences of Tc-MYBPA and sequences of all functionally tested MYBs involved in regulating proanthocyanidin and anthocyanin biosynthesis, as well as MYBs associated with several other biological processes (Fig. 1b). By searching the cacao EST database using tBLASTn with the protein sequence of putative cacao MYB Tc-MYBPA as the query, three EST contigs (CL8212Contig1, CL2621Contig1 and CL158Contig1) containing MYB-like proteins were also identified as the next best cacao matches to Tc-MYBPA. The results show that the putative cacao proanthocyanidin regulatory protein Tc-MYBPA is most closely related to the grape PA regulatory MYB protein VvMYBPA1 and clusters in the same clade with all the anthocyanidin and proanthocyanidin regulatory MYB proteins.
In summary, the Tc-MYBPA protein sequence includes conserved R2R3 regions typical of plant MYB transcription factors. Moreover, in Tc-MYBPA, we were able to detect conserved amino acid homologies shared with all the TT2-like MYB regulators but absent in anthocyanin regulators. These conserved amino acids appear to be specific to this clade and may be used to identify candidate PA-specific MYB regulators from other plant species.
Expression of Tc-MYBPA correlates with PA accumulation in Theobroma cacao
We have previously identified and functionally verified the key PA biosynthesis structural genes TcANR, TcANS and TcLAR. A scan of the promoter sequences of these PA synthesis genes against the PALACE database revealed several target motifs of Myb transcription factors on each of them (Additional file 1: Figure S2). Interestingly, MYBCORE, the key cis-regulatory element for binding PA synthesis regulating Myb transcription factors, was found in all of them, suggesting that they could all be downstream targets of the putative Tc-MYBPA. To assess the involvement of Tc-MYBPA in PA biosynthesis, the expression of the putative Tc-MYBPA gene was examined in tissue samples from different developmental stages of leaves, flowers and pods in which PAs accumulate. In addition, the expression of the cacao PA biosynthesis structural genes TcANR, TcANS and TcLAR was also examined.
The coordinated expression of Tc-MYBPA and TcANS suggests that Tc-MYBPA may contribute to the regulation of anthocyanin synthesis as well as PA synthesis. Nevertheless, the regulation of the PA-specific genes TcANR and TcLAR may also involve other transcription factors such as bHLH and WD40 repeat proteins whose interactions with Tc-MYBPA determine their specific expression patterns, which are slightly different from that of TcANS. To gain a better understanding of their regulation, further characterization and expression analysis of bHLH and WD40 genes will be helpful.
Tc-MYBPA complements the PA-deficient phenotype of the Arabidopsis tt2 mutant
In this study, amino acid sequence motifs specific to the PA-regulating clade of MYB transcription factors from other species were used to identify a candidate cacao ortholog. We compared five genes from four species including Arabidopsis and Lotus TT2 [10, 20], grape VvMYBPA1 and VvMYBPA2 [23, 24] and poplar MYB134. Each of these has been experimentally demonstrated to play a key role in regulating the transcription of PA biosynthesis genes. Arabidopsis and Lotus TT2, poplar MYB134 and grape VvMYBPA2 formed a phylogenetic cluster with ZmC1 from maize, which has been shown to activate the Arabidopsis ANR promoter. However, cacao Tc-MYBPA and grape VvMYBPA1 are not in the clade that contains most of the PA-regulating MYBs; they formed another cluster that is significantly closer to the TT2/C1 clade than to other functionally unrelated MYB regulators. By contrast, the multiple protein sequence alignment including all the known PA and anthocyanin-regulatory MYB proteins revealed some PA-specific motifs in the N-terminal domain. Five sites (1 or 2 amino acids) were conserved in all PA-specific MYBs, including ZmC1, but were absent from all other anthocyanin-specific MYBs. The discrepancy between the phylogenetic analysis, which showed a separate clade of Tc-MYBPA and VvMYBPA1 distinct from all other PA-regulatory MYBs, and the protein alignment, which clearly showed highly conserved PA-specific protein motifs in all PA MYBs, may result from the low-homology C-terminal domains of those R2R3 MYB proteins. Similar to the results of Bogs et al., none of the conserved motifs in the C-terminal domain described by Stracke et al. were found. By contrast, phylogenetic analysis seems to be a strong predictor of the anthocyanin regulatory MYB proteins, with all the functionally proven anthocyanin-specific MYB transcription factors falling into the same subgroup [15, 42–44].
Interestingly, grape and cacao also share the distinction, together with tea, of being commercial species containing the highest levels of PA in all commonly consumed foods .
The analysis of PA levels during leaf development revealed that PA synthesis in cacao leaves occurs at higher levels in young leaves than in older leaves. This correlates with the synthesis of anthocyanins, which are present at much higher concentrations in younger stage leaves than in mature leaves. Anthocyanin and PA synthesis share common structural enzymes in the PA synthesis pathway, including anthocyanidin synthase (ANS/LDOX), which produces cyanidins used in the ANR reaction leading to epicatechins and in the UFGT reaction leading to anthocyanins. Consistent with the PA and anthocyanin accumulation patterns, the cacao PA-specific structural genes ANR and LAR and the anthocyanin PA-common gene ANS were all co-regulated in developing leaves and more highly expressed in younger leaves compared to older leaves. Expression of the Tc-MYBPA gene correlated well with PA accumulation rates and expression of the PA biosynthetic genes TcANR, TcANS and TcLAR. Similar results were observed from Tc-MYBPA transcript profiling in young pods and exocarp tissues, in which Tc-MYBPA exhibits exactly the same pattern as the co-regulated PA synthesis genes TcANR, TcANS and TcLAR, suggesting that the Tc-MYBPA protein is involved in regulation of PA biosynthesis in leaves, young pods and exocarp.
In cacao reproductive tissues, PA synthesis began in developing flowers prior to pollination and continued in fruits until maturation, while anthocyanin synthesis began at the onset of fruit ripening and paralleled PA synthesis until maturation. In contrast to the co-regulated expression of the TcANS, TcANR and TcLAR genes in fruit exocarp, the TcANS gene had a different expression pattern from that of TcANR and TcLAR in ovules. TcANR and TcLAR were still co-regulated in ovules throughout the developmental stages and both dropped at 16 weeks after pollination (WAP), when fruit ripening commences and anthocyanin synthesis begins, while TcANS expression remained relatively high at 16 WAP, likely contributing to anthocyanin synthesis. Surprisingly, Tc-MYBPA shared the same expression pattern with TcANS rather than with the PA-specific genes TcANR and TcLAR, and the expression level remained stable, showing no decrease at 16 WAP. Similar observations were made regarding the expression pattern of VvMYBPA1 in grape skins, in which VvMYBPA1 retained a relatively high transcript level two weeks after the onset of ripening, whereas PA synthesis completely stopped when anthocyanin synthesis began. One interpretation is that the high levels of VvMYBPA1 could also contribute to anthocyanin synthesis, as it could activate the promoter of the VvANS (VvLDOX) gene. Overall, the expression pattern of Tc-MYBPA suggests that the encoded protein is involved in regulation of PA biosynthesis; moreover, it may also be involved in regulation of anthocyanin biosynthesis.
Overexpression of Tc-MYBPA in the Arabidopsis tt2 mutant complemented the PA-deficient phenotype in Arabidopsis mature seeds (Fig. 6). This indicated that this R2R3-type MYB transcription factor was able to substitute for the function of the key Arabidopsis PA regulator TT2. In contrast to grape VvMYBPA1 (the MYB protein most similar to Tc-MYBPA), which can induce ectopic PA accumulation when overexpressed in Arabidopsis, Tc-MYBPA-tt2 transgenic plants accumulated PAs only in the seed coat. This tissue-specific phenotype was similar to that of Arabidopsis TT2, which also failed to induce PA accumulation in tissues other than the seed coat when ectopically expressed. Gene expression analysis of Tc-MYBPA-tt2 transgenic plants showed that overexpression of Tc-MYBPA induced only the late flavonoid biosynthetic genes DFR, LDOX and BAN, similar to Arabidopsis TT2, which also controls only the late flavonoid biosynthetic genes DFR and BAN. By contrast, VvMYBPA1 regulates the entire flavonoid pathway branch leading to PA synthesis, including both early and late flavonoid biosynthetic genes.
In transgenic Arabidopsis expressing the Tc-MYBPA gene, an increased accumulation of anthocyanins was also observed in hypocotyls of young seedlings, especially in Line 6, which showed an obvious visual color difference compared to untransformed controls. This could be explained by the ability of Tc-MYBPA to induce the expression of LDOX (ANS), which is a structural gene contributing to both the anthocyanin and the proanthocyanidin pathways. This is different from the Arabidopsis TT2 MYB transcription factor, which has been shown to be involved specifically in the genetic control of flavonoid late biosynthesis genes (LBGs) including DFR, LDOX and BAN only in seeds. However, neither BAN nor TT2 is expressed in seedlings, while both DFR and LDOX are expressed in seedlings, contributing to anthocyanin synthesis. Their expression is controlled by another MYB transcription factor, AtPAP1 [47–49], whereas over-expression of AtTT2 did not increase the expression levels of LBGs in seedlings, with the exception of BAN, suggesting its specific involvement in PA synthesis. The activity of Tc-MYBPA contrasted with that of grape VvMYBPA1. Although VvMYBPA1 could activate the VvLDOX gene promoter in transient reporter gene assays, it failed to induce anthocyanin synthesis when overexpressed in Arabidopsis. Bogs et al. also showed that anthocyanin synthesis in grape was regulated by another MYB transcription factor, VvMYBA2. However, the data from this research in transgenic Arabidopsis demonstrated that activation of anthocyanin synthesis was consistent with the Tc-MYBPA gene expression pattern in cacao, which was co-regulated with the TcANS gene and coincided with anthocyanin synthesis. Taken together, in cacao, Tc-MYBPA appeared to be capable of regulating both the PA and anthocyanin pathways by activating late PA biosynthetic genes. Potentially, this could provide a means to manipulate the amount and composition of PAs and anthocyanins together in cacao and possibly in other fruits.
The different activities of the related MYB transcription factor genes from diverse species could reflect the evolutionary specialization of duplicated gene family members, which appear to have taken on slightly different functions over evolutionary time, and this can account in part for the differences in PA and anthocyanin accumulation patterns in these species.
In summary, our results support the conclusion that Tc-MYBPA from cacao is involved in regulation of transcription of several PA biosynthesis genes. This is based on several lines of evidence. First, protein sequence comparison showed that Tc-MYBPA was most similar to the grape PA transcriptional regulator VvMYBPA1 and shared the conserved motifs of all the other functionally characterized R2R3-MYB PA synthesis regulators. Second, transcript profiling showed that Tc-MYBPA was expressed in all tissues accumulating PAs and consistently co-regulated with PA biosynthesis structural genes including TcANR, TcANS and TcLAR. Third, over-expression of Tc-MYBPA in Arabidopsis was able to functionally complement the PA-deficient phenotype in the seeds of the tt2 mutant and resulted in a significant increase of PA accumulation compared to the tt2 mutant. This was the result of activation of the PA biosynthetic genes including DFR, LDOX and ANR as shown by gene expression analysis of transgenic plants relative to untransformed tt2 and Col-0 plants.
Two Theobroma cacao varieties, Scavina 6 and Amelonado, were used for this study. Cacao plants were grown in a greenhouse as previously described. Leaf and flower tissues were collected from Scavina 6 plants. For leaf tissues, leaves at various stages were collected. The leaf stages were defined previously; briefly, Stage A leaves are newly emerged and are 5–10 cm long; Stage B leaves are larger, soft, red and translucent, 10–15 cm long; Stage C leaves are green and remain soft; Stage D leaves are at an early stage of lignification; Stage E leaves are fully lignified and mature. Stage A and B leaves were pooled together because of the limited amount of Stage A leaves. Cacao pods were obtained by hand pollinating Amelonado (a self-compatible variety). Upon harvesting, pods were bisected, and seeds and pod exocarps were collected separately. Exocarp samples represent the outer 1–3 mm layer of the fruit obtained using a fruit peeler. All samples were frozen in liquid nitrogen upon collection and stored at −80 °C until extraction.
Arabidopsis plants (Arabidopsis thaliana) were grown in soil at 22 °C, 50 % humidity and a 16 h/8 h light/dark photoperiod in a growth chamber (Conviron, Pembina, ND, USA). Plants grown aseptically were plated on MS medium with 2 % (w/v) sucrose solidified with 0.6 % (w/v) agar. Arabidopsis ecotype Columbia (Col-0) plants were used as the wild type. The T-DNA insertion mutant tt2 (SALK_005260) was obtained from The Arabidopsis Biological Resource Center (Columbus, OH, USA).
Isolation of a Tc-MYBPA cDNA from Theobroma cacao
Total RNA from stage A/B leaves of Theobroma cacao (Scavina 6) was isolated using a modified cetyl trimethyl ammonium bromide (CTAB) extraction method as previously described with the following modifications. RNA isolated from the CTAB extraction and LiCl precipitation was further purified and concentrated using RNeasy columns (Qiagen, Valencia, CA, USA), but the phenol/chloroform extraction and sodium acetate/ethanol precipitation steps were omitted. The quality of RNA was verified by observing absorbance ratios of A260/A280 (1.8-2.0) and A260/A230 (1.8-2.2) and by separating 200 ng RNA samples on 0.8 % agarose gels to examine intact ribosomal bands.
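The absorbance-ratio criteria above can be expressed as a simple acceptance check. The sketch below is illustrative only: the function name and the example readings are hypothetical, and the only values taken from the text are the two acceptance windows (1.8-2.0 for A260/A280 and 1.8-2.2 for A260/A230).

```python
def rna_passes_qc(a260, a280, a230):
    """Return True when both purity ratios fall inside the stated windows."""
    ratio_260_280 = a260 / a280
    ratio_260_230 = a260 / a230
    protein_ok = 1.8 <= ratio_260_280 <= 2.0   # A260/A280 window from the text
    organics_ok = 1.8 <= ratio_260_230 <= 2.2  # A260/A230 window from the text
    return protein_ok and organics_ok


# Hypothetical spectrophotometer readings for two preparations
clean_prep = rna_passes_qc(a260=1.00, a280=0.52, a230=0.48)
contaminated_prep = rna_passes_qc(a260=1.00, a280=0.70, a230=0.48)
print(clean_prep, contaminated_prep)
```

The ratio check complements, but does not replace, inspection of the ribosomal bands on the gel.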
First strand cDNA was synthesized using the SMART RACE cDNA amplification kit (Clontech, Mountain View, CA, USA). The putative EST sequence of Tc-MYBPA was obtained by searching the Theobroma cacao EST database (http://esttik.cirad.fr/)  using BLAST (program: tBLASTn)  with the protein sequence of TT2 (AT5G35550) from Arabidopsis thaliana as the query sequence. The ORF of putative Tc-MYBPA was amplified with the Advantage cDNA PCR Kit (Clontech, Mountain View, CA, USA) using cDNA from stage A/B leaves as template with the following primer pairs: Tc-MYBPA_F (5’- GTCC ATG GGAAGGGCTCCTTGTTGTTC -3’) and Tc-MYBPA_R (5’- AGCGGCCGC TCAGATCAATAATGATTCAGC -3’). To facilitate the subsequent cloning into binary vectors, an NcoI site (CCATGG) was added at the start codon (ATG) and a NotI site (GCGGCCGC) was added immediately 3' to the stop codon (TCA) respectively (sites are shown in italics and the start or stop codons are underlined). The PCR reaction was carried out in a total volume of 20 μL at 94 °C for 5 min; 5 cycles of 94 °C for 30 s, 55 °C for 30 s, and 72 °C for 1 min; another 23 cycles of 94 °C for 30 s, 60 °C for 30 s, and 72 °C for 1 min; followed by a final extension at 72 °C for 5 min. The PCR products were gel purified and cloned into the pGEM-T Easy plasmid (Promega, Madison, WI, USA) and replicated in E. coli strain DH5α. DNA sequencing was performed using 12 of the resulting DNA clones (pGEMT-Tc-MYBPA), and two clones had the precise sequence of the consensus sequences. One clone (pGEMT-Tc-MYBPA-3) was chosen for cloning into the binary vector for plant transformation and subsequent experiments.
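The primer design described above can be checked programmatically. The following sketch takes the two primer sequences exactly as printed (with the spacing removed) and confirms that the NcoI site overlaps the start codon and that the triplet following the NotI site in the reverse primer is the reverse complement of a stop codon. The helper names are hypothetical and chosen for illustration.

```python
# Primer sequences as printed in the text, with the spacing removed
FORWARD = "GTCC ATG GGAAGGGCTCCTTGTTGTTC".replace(" ", "")
REVERSE = "AGCGGCCGC TCAGATCAATAATGATTCAGC".replace(" ", "")

NCOI = "CCATGG"    # NcoI recognition site
NOTI = "GCGGCCGC"  # NotI recognition site

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def site_position(primer, site):
    """0-based index of the first occurrence of a site, or -1 if absent."""
    return primer.find(site)


ncoi_at = site_position(FORWARD, NCOI)
noti_at = site_position(REVERSE, NOTI)

# The ATG sits inside CCATGG, two bases after the start of the site.
start_codon = FORWARD[ncoi_at + 2:ncoi_at + 5]

# The three bases right after the NotI site in the reverse primer,
# read on the coding strand, should be a stop codon.
after_noti = REVERSE[noti_at + len(NOTI):noti_at + len(NOTI) + 3]
stop_on_coding_strand = reverse_complement(after_noti)

print(ncoi_at, start_codon, noti_at, after_noti, stop_on_coding_strand)
```

The same pattern extends to any other enzyme by adding its recognition sequence as a constant.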
Protein sequence alignment and phylogenetic analysis
PA-specific R2R3-MYB protein sequences were retrieved from GenBank (http://www.ncbi.nlm.nih.gov/Genbank/), including AtTT2 from Arabidopsis (CAC40021), VvMYBPA1 and VvMYBPA2 from grape (AM259485, ACK56131) [23, 24], LjTT2a, LjTT2b and LjTT2c from Lotus japonicus (AB300033, AB300034, AB300035) and MYB134 from Populus tremuloides (FJ573151). A protein sequence alignment performed with the ClustalW algorithm was used to construct a phylogenetic tree using the neighbor-joining method in the MEGA package. One thousand bootstrap datasets were used to estimate the confidence of each tree clade. Protein sequence alignment of anthocyanin- and proanthocyanidin-specific MYB proteins was performed using the same method as for the phylogenetic tree, but was edited and displayed using GENEDOC software (Version 2.6.02, http://www.nrbsc.org/gfx/genedoc/gddl.htm).
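To illustrate what a bootstrap dataset is in this context, the sketch below resamples the columns of a toy alignment with replacement, which is the standard way bootstrap replicates are generated for tree building. This is not the MEGA implementation: the toy sequences are invented, the function names are hypothetical, and the p-distance shown is a deliberately simple stand-in, since the text does not state which distance model was used.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Build one bootstrap dataset by sampling columns with replacement."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in columns)
            for name, seq in alignment.items()}


def p_distance(seq_a, seq_b):
    """Fraction of gap-free aligned positions at which two sequences differ."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no gap-free positions to compare")
    return sum(a != b for a, b in pairs) / len(pairs)


# Invented toy alignment (equal-length rows, '-' marks a gap)
toy = {
    "seqA": "MGRAPCCSKV",
    "seqB": "MGRSPCCEKV",
    "seqC": "MGKAP-CDKA",
}

rng = random.Random(0)  # fixed seed so the run is repeatable
replicates = [bootstrap_alignment(toy, rng) for _ in range(1000)]

observed = p_distance(toy["seqA"], toy["seqB"])
print(len(replicates), observed)
```

In a full analysis a tree is built from each replicate, and the support for a clade is the fraction of replicate trees that contain it.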
Proanthocyanidin (PA) quantification
To extract soluble PAs from cacao tissues, 0.3-0.5 g of frozen tissues were ground into a fine powder in liquid nitrogen and then extracted with 5 mL of extraction solution (70 % acetone: 29.5 % water: 0.5 % acetic acid) by vortexing for 5 s followed by water bath sonication for 15 min using a bench top ultrasonic cleaner (Model 2510, Bransonic, Danbury, CT, USA). To extract soluble PAs from Arabidopsis seeds, the same extraction solution and method were applied, except that 100-500 mg dry seeds were used as grinding samples, and 500 μL extraction solution were used. After sonication, samples were vortexed again and centrifuged at 2500 g for 10 min. The supernatant was transferred to a new tube and the pellet was re-extracted twice as above. Pooled supernatants were extracted twice with hexane to remove fat and chlorophyll and then filtered through a 0.45 μm polytetrafluoroethylene (PTFE) syringe filter (Millipore, Billerica, MA, USA). Depending on availability of plant samples, different numbers of biological replicates were performed for cacao and Arabidopsis samples. For cacao, there were at least five biological replicates, and for Arabidopsis, there were three biological replicates.
To quantify PA levels, 50 μL aliquots of samples were mixed with 200 μL of dimethylaminocinnamaldehyde (DMACA; Sigma-Aldrich, MO, USA) reagent (0.1 % DMACA, 90 % reagent-grade ethanol, 10 % HCl) in 96-well microtiter plates. Absorption was measured at 640 nm at one-minute intervals for 20 min, and the mean value of peak readings during this time period was recorded. For each biological replicate, triple technical replicates were performed to obtain mean values. The total PA levels were calculated using a standard molar absorbance curve prepared using procyanidin B2 (Indofine, NJ, USA).
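The conversion from DMACA absorbance to PA concentration through a procyanidin B2 standard curve can be sketched as a linear least-squares fit followed by inversion of the fitted line. The standard concentrations, absorbance values and function names below are hypothetical and chosen only to make the arithmetic easy to follow; a linear response over the standard range is also an assumption.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def absorbance_to_concentration(a640, slope, intercept):
    """Invert the standard curve to get concentration from absorbance."""
    return (a640 - intercept) / slope


# Hypothetical procyanidin B2 standards (concentration in uM, A640 readings)
standard_conc = [0.0, 25.0, 50.0, 100.0, 200.0]
standard_a640 = [0.02, 0.12, 0.22, 0.42, 0.82]
slope, intercept = fit_line(standard_conc, standard_a640)

# Mean of three technical replicates for one biological replicate
technical_replicates = [0.41, 0.43, 0.42]
mean_a640 = sum(technical_replicates) / len(technical_replicates)
sample_conc = absorbance_to_concentration(mean_a640, slope, intercept)
print(round(slope, 4), round(intercept, 4), round(sample_conc, 2))
```

Readings that fall outside the range of the standards should be re-measured after dilution rather than extrapolated.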
For quantitative analysis of insoluble PAs from cacao tissues, the residues from soluble PA extractions were air dried in an exhaust hood for two days, weighed, and 5 mL butanol-HCl reagent (95 % butan-1-ol: 5 % concentrated HCl) was added and the mixture was sonicated for one hour followed by centrifugation at 2500 g for 10 min. An aliquot of clear supernatant was diluted 40-fold in butanol-HCl reagent and absorbance was measured at 550 nm to determine the amount of background absorption. The samples were then boiled for 1 h with vortexing every 20 min, cooled to room temperature and centrifuged again at 2500 g for 10 min. The supernatant from boiled sample was diluted 40-fold in butanol-HCl reagent and absorbance was measured at 550 nm. The values were normalized by subtraction of the background absorbance and the PA levels were calculated as cyanidin equivalents using cyanidin-3-glucoside (Sigma-Aldrich, MO, USA) as standards.
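The background correction and back-calculation for insoluble PAs can be sketched as follows. The 40-fold dilution and the 5 mL reagent volume come from the text; the standard-curve slope, the example readings, the residue weight, the function name and the choice to express the result per mg of dried residue are all assumptions made for illustration, and the standard curve is assumed to pass through the origin.

```python
DILUTION_FACTOR = 40      # 40-fold dilution in butanol-HCl reagent (from the text)
REAGENT_VOLUME_ML = 5.0   # volume of butanol-HCl added to the residue (from the text)


def insoluble_pa_per_mg(a550_boiled, a550_background, slope, residue_mg):
    """Cyanidin equivalents (ug) per mg of dried residue.

    slope is the hypothetical standard-curve slope in absorbance units
    per (ug/mL) of cyanidin-3-glucoside.
    """
    corrected = a550_boiled - a550_background        # background subtraction
    conc_in_cuvette = corrected / slope              # ug/mL in the diluted sample
    conc_in_extract = conc_in_cuvette * DILUTION_FACTOR
    total_ug = conc_in_extract * REAGENT_VOLUME_ML
    return total_ug / residue_mg


# Hypothetical readings for one sample
result = insoluble_pa_per_mg(
    a550_boiled=0.50,
    a550_background=0.10,
    slope=0.02,
    residue_mg=200.0,
)
print(round(result, 2))
```

Because both readings are taken at the same dilution, the subtraction is done before the dilution factor is applied.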
To visualize the presence of PAs in Arabidopsis young seedlings and dry seeds, tissues were immersed in 4-dimethylaminocinnamaldehyde (DMACA) reagent (2 % (w/v) DMACA, 90 % ethanol, 10 % HCl) for 2 days as described previously  and then washed 3 times with 70 % ethanol.
Transformation of Arabidopsis
The coding sequence of Tc-MYBPA was excised from the intermediate cloning vector (pGEMT-Tc-MYBPA-3) with NcoI and NotI restriction enzymes and introduced into the pE2113-EGFP intermediate vector, replacing the original EGFP coding sequence with the coding sequence of Tc-MYBPA. As a result, the cacao gene coding sequence is located immediately downstream of the very strong E12-Ω promoter (a modified CaMV35S promoter) and upstream of the CaMV35S terminator. The over-expression cassette was excised from the pE2113 vector with EcoRI and PvuII restriction enzymes and introduced into the pCAMBIA-1300 binary vector (CAMBIA, Canberra, Australia).
This binary transformation construct was introduced into Agrobacterium tumefaciens strain AGL1  by electroporation as previously described . Arabidopsis transformation was carried out using the floral dip method , and T1 transgenic plants were selected on MS media supplemented with 2 % sucrose, 0.65 % agar and 25 mg/L hygromycin. Hygromycin-resistant T1 seedlings were transferred to soil 7 days after germination and grown in a growth chamber as described above.
Gene expression analysis
Total RNA from leaves, flowers, pods, pod exocarp and ovules of Theobroma cacao (Scavina 6 and Amelonado) was isolated as described above. Total RNA from young Arabidopsis seedlings was isolated using the RNeasy Plant mini kit (Qiagen, Valencia, CA, USA). cDNA was synthesized from 1 μg of total RNA in a total volume of 20 μL using M-MuLV Reverse Transcriptase (NEB, Ipswich, MA, USA) according to the supplier’s protocols, and 2 μL of this reaction were used in the subsequent RT-PCR reactions.
[Table: Sequences of the primers used in the gene expression study. Columns: gene description; sequence (5’ to 3’). Recoverable rows: TT2-like MYB transcription factor; flavonol synthase 1.]
To ensure accurate semi-quantitative RT-PCR measurements, each primer set was tested in time course PCR reactions to measure amplification kinetics and to determine the optimal PCR cycle in which the reaction is well within the linear range (28 cycles). PCR reactions were carried out in a total volume of 20 μL at 94 °C for 5 min; 28 cycles of 94 °C for 30 s, 55 °C for 30 s, and 72 °C for 45 s; followed by a final extension at 72 °C for 5 min. The PCR products were visualized on 1 % agarose gels stained with ethidium bromide and photographed using a Molecular Imager Gel Doc XR+ System equipped with a 16-bit CCD camera (Bio-Rad Laboratories, Hercules, CA). Relative fluorescent intensity of the separated PCR products was quantified using Quantity One 1-D Analysis Software (Bio-Rad Laboratories, Hercules, CA). Expression levels were calculated relative to the expression of TcActin in each sample.
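The final normalization step can be sketched as follows. This is a minimal illustration, not the Quantity One workflow itself: the function name, the nested-dictionary layout and the band intensities are hypothetical, and the only detail taken from the text is that each gene's signal is expressed relative to TcActin in the same sample.

```python
def normalize_to_reference(intensities, reference="TcActin"):
    """Divide each band intensity by the reference-gene band of the same sample.

    `intensities` maps sample name -> {gene name: band intensity}.
    Returns sample name -> {gene name: intensity relative to reference}.
    """
    normalized = {}
    for sample, bands in intensities.items():
        ref_value = bands[reference]
        if ref_value <= 0:
            raise ValueError(f"No usable {reference} signal in sample {sample!r}")
        normalized[sample] = {
            gene: value / ref_value
            for gene, value in bands.items()
            if gene != reference
        }
    return normalized


# Hypothetical band intensities, for illustration only.
example = {
    "leaf": {"TcActin": 2000.0, "Tc-MYBPA": 500.0},
    "ovule": {"TcActin": 1000.0, "Tc-MYBPA": 750.0},
}
relative = normalize_to_reference(example)
print(relative)
```

Normalizing within each sample corrects for differences in RNA input and cDNA synthesis efficiency between samples, provided the reference gene is expressed at a comparable level across the tissues compared.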
Availability of supporting data
The phylogenetic tree for the study has been submitted to DRYAD (doi:10.5061/dryad.57fc0).
- DFR: dihydroflavanol reductase
- LDOX: leucoanthocyanidin dioxygenase
- ANR: anthocyanidin reductase
- ORF: open reading frame
- EST: expressed sequence tag
- WAP: week after pollination
We thank undergraduate students at The Pennsylvania State University: Fred Gouker for assistance with Tc-MYBPA cloning and Dennis Arocena for assistance with PA extraction. We appreciate the help of members of the Guiltinan lab, especially Ann Young and Sharon Pishak, for assistance with greenhouse maintenance, tissue culture and cacao sample collection. This work was supported in part by The Pennsylvania State University, The Huck Institutes of Life Sciences and the American Research Institute Penn State Endowed Program in the Molecular Biology of Cacao.
- Treutter D. Significance of flavonoids in plant resistance and enhancement of their biosynthesis. Plant Biol. 2005;7(6):581–91.
- Norman KH, Naomi DLF, Marjorie LM. Flavanols, the Kuna, cocoa consumption, and nitric oxide. J Am Soc Hypertens. 2009;3(2):105–12.
- Xie DY, Dixon RA. Proanthocyanidin biosynthesis–still more questions than answers? Phytochemistry. 2005;66(18):2127–44.
- Winkel-Shirley B. It takes a garden. How work on diverse plant species has contributed to an understanding of flavonoid metabolism. Plant Physiol. 2001;127(4):1399–404.
- Dixon RA, Pasinetti GM. Flavonoids and isoflavonoids: from plant biology to agriculture and neuroscience. Plant Physiol. 2010;154(2):453–7.
- Lepiniec L, Debeaujon I, Routaboul JM, Baudry A, Pourcel L, Nesi N, et al. Genetics and biochemistry of seed flavonoids. Annu Rev Plant Biol. 2006;57:405–30.
- Grotewold E. The genetics and biochemistry of floral pigments. Annu Rev Plant Biol. 2006;57:761–80.
- Ramsay NA, Glover BJ. MYB bHLH WD40 protein complex and the evolution of cellular diversity. Trends Plant Sci. 2005;10(2):63–70.
- Abrahams S, Tanner GJ, Larkin PJ, Ashton AR. Identification and biochemical characterization of mutants in the proanthocyanidin pathway in Arabidopsis. Plant Physiol. 2002;130(2):561–76.
- Nesi N, Jond C, Debeaujon I, Caboche M, Lepiniec L. The Arabidopsis TT2 gene encodes an R2R3 MYB domain protein that acts as a key determinant for proanthocyanidin accumulation in developing seed. Plant Cell. 2001;13(9):2099–114.
- Baudry A, Heim MA, Dubreucq B, Caboche M, Weisshaar B, Lepiniec L. TT2, TT8, and TTG1 synergistically specify the expression of BANYULS and proanthocyanidin biosynthesis in Arabidopsis thaliana. Plant J. 2004;39(3):366–80.
- Grotewold E, Sainz MB, Tagliani L, Hernandez JM, Bowen B, Chandler VL. Identification of the residues in the Myb domain of maize C1 that specify the interaction with the bHLH cofactor R. Proc Natl Acad Sci USA. 2000;97(25):13579–84.
- Allan AC, Hellens RP, Laing WA. MYB transcription factors that colour our fruit. Trends Plant Sci. 2008;13(3):99–102.
- de Vetten N, Quattrocchio F, Mol J, Koes R. The an11 locus controlling flower pigmentation in petunia encodes a novel WD-repeat protein conserved in yeast, plants, and animals. Genes Dev. 1997;11(11):1422–34.
- Quattrocchio F, Wing J, van der Woude K, Souer E, de Vetten N, Mol J, et al. Molecular analysis of the ANTHOCYANIN2 gene of petunia and its role in the evolution of flower color. Plant Cell. 1999;11(8):1433–44.
- Spelt C, Quattrocchio F, Mol JN, Koes R. ANTHOCYANIN1 of petunia encodes a basic helix-loop-helix protein that directly activates transcription of structural anthocyanin genes. Plant Cell. 2000;12(9):1619–32.
- Nesi N, Debeaujon I, Jond C, Pelletier G, Caboche M, Lepiniec L. The TT8 gene encodes a basic helix-loop-helix domain protein required for expression of DFR and BAN genes in Arabidopsis siliques. Plant Cell. 2000;12(10):1863–78.
- Pelletier MK, Murrell JR, Shirley BW. Characterization of flavonol synthase and leucoanthocyanidin dioxygenase genes in Arabidopsis. Further evidence for differential regulation of "early" and "late" genes. Plant Physiol. 1997;113(4):1437–45.
- Stracke R, Werber M, Weisshaar B. The R2R3-MYB gene family in Arabidopsis thaliana. Curr Opin Plant Biol. 2001;4(5):447–56.
- Stracke R, Ishihara H, Huep G, Barsch A, Mehrtens F, Niehaus K, et al. Differential regulation of closely related R2R3-MYB transcription factors controls flavonol accumulation in different parts of the Arabidopsis thaliana seedling. Plant J. 2007;50(4):660–77.
- Yoshida K, Iwasaka R, Kaneko T, Sato S, Tabata S, Sakuta M. Functional differentiation of Lotus japonicus TT2s, R2R3-MYB transcription factors comprising a multigene family. Plant Cell Physiol. 2008;49(2):157–69.
- Mellway RD, Tran LT, Prouse MB, Campbell MM, Constabel CP. The wound-, pathogen-, and ultraviolet B-responsive MYB134 gene encodes an R2R3 MYB transcription factor that regulates proanthocyanidin synthesis in poplar. Plant Physiol. 2009;150(2):924–41.
- Bogs J, Jaffe FW, Takos AM, Walker AR, Robinson SP. The grapevine transcription factor VvMYBPA1 regulates proanthocyanidin synthesis during fruit development. Plant Physiol. 2007;143(3):1347–61.
- Terrier N, Torregrosa L, Ageorges A, Vialet S, Verries C, Cheynier V, et al. Ectopic expression of VvMybPA2 promotes proanthocyanidin biosynthesis in grapevine and suggests additional targets in the pathway. Plant Physiol. 2009;149(2):1028–41.
- Akagi T, Ikegami A, Yonemori K. DkMyb2 wound-induced transcription factor of persimmon (Diospyros kaki Thunb.), contributes to proanthocyanidin regulation. Planta. 2012;232(5):1045–59.
- Hancock KR, Collette V, Fraser K, Greig M, Xue H, Richardson K, et al. Expression of the R2R3-MYB transcription factor TaMYB14 from Trifolium arvense activates proanthocyanidin biosynthesis in the legumes Trifolium repens and Medicago sativa. Plant Physiol. 2012;159(3):1204–20.
- Verdier J, Zhao J, Torres-Jerez I, Ge S, Liu C, He X, et al. MtPAR MYB transcription factor acts as an on switch for proanthocyanidin biosynthesis in Medicago truncatula. Proc Natl Acad Sci U S A. 2012;109(5):1766–71.
- Keen CL, Holt RR, Polagruto JA, Wang JF, Schmitz HH. Cocoa flavanols and cardiovascular health. Phytochem Rev. 2002;1:231–40.
- Niemenak N, Rohsius C, Elwers S, Ndoumou DO, Lieberei R. Comparative study of different cocoa (Theobroma cacao L.) clones in terms of their phenolics and anthocyanins contents. J Food Comp Anal. 2006;19(6–7):612–9.
- Wright DC, Park WD, Leopold NR, Hasegawa PM, Janick J. Accumulation of lipids, proteins, alkaloids and anthocyanins during embryo development in vivo of Theobroma cacao L. J Am Oil Chem Soc. 1982;59(11):475–9.
- Alemanno L, Berthouly M, Michaux Ferriere N. A comparison between Theobroma cacao L. zygotic embryogenesis and somatic embryogenesis from floral explants. In Vitro Cell Dev Biol Plant. 1997;33(3):163–72.
- Cheesman EE. Fertilization and embryogeny in Theobroma cacao L. Ann Bot. 1927;41(161):107–26.
- Lehrian DW, Keeney PG. Changes in lipid components of seeds during growth and ripening of cacao fruit. J Am Oil Chem Soc. 1980;57(2):61–5.
- Argout X, Fouet O, Wincker P, Gramacho K, Legavre T, Sabau X, Risterucci AM, Da Silva C, Cascardo J, Allegre M, et al. Towards the understanding of the cocoa transcriptome: Production and analysis of an exhaustive dataset of ESTs of Theobroma cacao L. generated from various tissues and under various conditions. BMC Genomics. 2008;9:512.
- Chaves FC, Gianfagna TJ. Cacao leaf procyanidins increase locally and systemically in response to infection by Moniliophthora perniciosa basidiospores. Physiol Mol Plant Pathol. 2007;70(4–6):174–9.
- Argout X, Salse J, Aury JM, Guiltinan MJ, Droc G, Gouzy J, et al. The genome of Theobroma cacao. Nat Genet. 2011;43(2):101–8.
- Liu Y, Shi Z, Maximova S, Payne MJ, Guiltinan MJ. Proanthocyanidin synthesis in Theobroma cacao: genes encoding anthocyanidin synthase, anthocyanidin reductase, and leucoanthocyanidin reductase. BMC Plant Biol. 2013;13:202.
- Czemmel S, Stracke R, Weisshaar B, Cordon N, Harris NN, Walker AR, et al. The grapevine R2R3-MYB transcription factor VvMYBF1 regulates flavonol synthesis in developing grape berries. Plant Physiol. 2009;151(3):1513–30.
- Higo K, Ugawa Y, Iwamoto M, Korenaga T. Plant cis-acting regulatory DNA elements (PLACE) database: 1999. Nucleic Acids Res. 1999;27(1):297–300.
- Akagi T, Ikegami A, Tsujimoto T, Kobayashi S, Sato A, Kono A, et al. DkMyb4 is a Myb transcription factor involved in proanthocyanidin biosynthesis in persimmon fruit. Plant Physiol. 2009;151(4):2028–45.
- McMurrough I, McDowell J. Chromatographic separation and automated analysis of flavanols. Anal Biochem. 1978;91(1):92–100.
- Mathews H, Clendennen SK, Caldwell CG, Liu XL, Connors K, Matheis N, et al. Activation tagging in tomato identifies a transcriptional regulator of anthocyanin biosynthesis, modification, and transport. Plant Cell. 2003;15(8):1689–703.
- Deluc L, Barrieu F, Marchive C, Lauvergeat V, Decendit A, Richard T, et al. Characterization of a grapevine R2R3-MYB transcription factor that regulates the phenylpropanoid pathway. Plant Physiol. 2006;140(2):499–511.
- Deluc L, Bogs J, Walker AR, Ferrier T, Decendit A, Merillon JM, et al. The transcription factor VvMYB5b contributes to the regulation of anthocyanin and proanthocyanidin biosynthesis in developing grape berries. Plant Physiol. 2008;147(4):2041–53.
- U.S. Department of Agriculture, Agricultural Research Service. 2014. USDA National Nutrient Database for Standard Reference, Release 27. Nutrient Data Laboratory Home Page, http://www.ars.usda.gov/ba/bhnrc/ndl
- Lee DW, Brammeier S, Smith AP. The selective advantages of anthocyanins in developing leaves of mango and cacao. Biotropica. 1987;19(1):40–9.
- Dare AP, Schaffer RJ, Lin-Wang K, Allan AC, Hellens RP. Identification of a cis-regulatory element by transient analysis of co-ordinately regulated genes. Plant Methods. 2008;4:17.
- Cominelli E, Gusmaroli G, Allegra D, Galbiati M, Wade HK, Jenkins GI, et al. Expression analysis of anthocyanin regulatory genes in response to different light qualities in Arabidopsis thaliana. J Plant Physiol. 2008;165(8):886–94.
- Borevitz JO, Xia Y, Blount J, Dixon RA, Lamb C. Activation tagging identifies a conserved MYB regulator of phenylpropanoid biosynthesis. Plant Cell. 2000;12(12):2383–94.
- Walker AR, Lee E, Bogs J, McDavid DA, Thomas MR, Robinson SP. White grapes arose through the mutation of two similar and adjacent regulatory genes. Plant J. 2007;49(5):772–85.
- Maximova S, Miller C, Antunez de Mayolo G, Pishak S, Young A, Guiltinan MJ. Stable transformation of Theobroma cacao L. and influence of matrix attachment regions on GFP expression. Plant Cell Rep. 2003;21(9):872–83.
- Liu Y. Molecular analysis of genes involved in the synthesis of proanthocyanidins in Theobroma cacao. PhD Dissertation. University Park: The Pennsylvania State University; 2010.
- Murashige T, Skoog F. A revised medium for rapid growth and bio assays with tobacco tissue cultures. Physiol Plant. 1962;15(3):473–97.
- Verica JA, Maximova SN, Strem MD, Carlson JE, Bailey BA, Guiltinan MJ. Isolation of ESTs from cacao (Theobroma cacao L.) leaves treated with inducers of the defense response. Plant Cell Rep. 2004;23(6):404–13.
- Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215(3):403–10.
- Kumar S, Tamura K, Nei M. MEGA3: Integrated software for molecular evolutionary genetics analysis and sequence alignment. Brief Bioinform. 2004;5(2):150–63.
- Lazo GR, Stein PA, Ludwig RA. A DNA transformation-competent Arabidopsis genomic library in Agrobacterium. Biotechnology (N Y). 1991;9(10):963–7.
- Lin JJ. Optimization of the transformation efficiency of Agrobacterium tumefaciens cells using electroporation. Plant Sci. 1994;101(1):11–5.
- Clough SJ, Bent AF. Floral dip: a simplified method for Agrobacterium-mediated transformation of Arabidopsis thaliana. Plant J. 1998;16(6):735–43.
- Ahn JH. Semiquantitative Analysis of Arabidopsis RNA by Reverse Transcription Followed by Noncompetitive PCR. Cold Spring Harbor Protocols 2009, 2009. pdb.prot5296, doi:10.1101/pdb.prot5296.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.