Transcriptional regulation of the TCA cycle, amino acid metabolism, and photosynthesis
Analysis of the functional genomics of sorghum started only after completion of the genomic sequencing of sorghum BTx623 in 2009. To identify key expressed genes for sorghum-specific phytoalexin synthesis and elucidate their coordinated expression, we performed whole mRNA sequencing by using massively parallel sequencing technology (Table 1); differentially expressed genes, including unannotated genes, were identified on the basis of the piling-up of mapped reads (Table 2; Figures 1 and 2; Additional Files 1-4: Tables S1-S4). We validated the differential expression of these annotated or Cufflinks-predicted unannotated genes by qRT-PCR of biological replicates (Additional File 6: Figure S3).
The glyoxylate shunt in the TCA cycle, which involves the action of isocitrate lyase and malate synthase, was activated (Figure 3A). This shunt allows increased production of carbon compounds by bypassing the CO2-generating steps of the TCA cycle and contributes to the synthesis of cell components such as cell-wall polysaccharides, nucleotides, and amino acids. Genes in the shikimate pathway are ubiquitously expressed (Additional File 1: Table S1); this reinforces the supply of Phe and Tyr, which are precursors of phytoalexins (Figure 3C). The glyoxylate shunt is widespread in plants, bacteria, and fungi. In plants, it is of primary importance for seedling growth; it is involved in the conversion of stored lipids to carbohydrates that serve as primary nutrient sources before photosynthesis begins. However, synthesis of cellular components consumes intermediates of the TCA cycle. To compensate for this loss, succinate, a substrate for the TCA cycle, could be supplied from glutamate, because expression of the gene encoding glutamate decarboxylase (Sb01g041700), which catalyzes the first step from Glu to succinate, was highly induced (Figure 3B). Thus, boosting of the glyoxylate shunt suggests a change in the role of the TCA cycle from energy production to synthesis of cellular components.
Amino acids are not only building blocks of protein; they also serve as biosynthetic precursors for anti-pathogen metabolites. Decarboxylation of Tyr (Figure 3C) is the first step in the production of complex isoquinoline alkaloids, which comprise more than 2500 known compounds found in various plants. The upregulation of genes encoding polyphenol oxidase (Additional File 5: Figure S2) also supports the synthesis of isoquinoline alkaloids. PALs, which were highly expressed constitutively (Additional File 5: Figure S1), are involved in the first step in the biosynthesis of flavonoids by catalyzing the deamination of Phe (Figure 3C). Ser acetyltransferase links Ser metabolism to Cys biosynthesis (Figure 3C). Cys serves as a precursor for various sulfur-containing metabolites, including glutathione (GSH), cofactors, essential vitamins, and sulfur esters [42–44]. The upregulation of genes encoding glutathione S-transferases (Sb02g003090.1, Sb01g030880.1; Additional File 1: Table S1) also supports the activation of GSH-dependent detoxification. These amino acid metabolizing enzymes are located at the branch point between primary and secondary metabolism, suggesting that their upregulation enables irreversible commitment to the pathway (Figure 3C).
An inverse correlation between photosynthesis- and defense-related gene expression has been observed in the C3 plants tobacco and potato. In contrast, sorghum has genes for C4 photosynthesis; six of seven previously identified C4 photosynthesis genes (two for carbonic anhydrase and one each for malate dehydrogenase, malic enzyme, phosphoenolpyruvate carboxylase, and pyruvate orthophosphate dikinase) were downregulated (Additional File 2: Table S2). Even though the changes were small (0.35- to 0.71-fold; Additional File 2: Table S2), the basal expression levels of photosynthesis genes were high (e.g. the RPKM [mock-infected] for pyruvate orthophosphate dikinase was 3915.65; Additional File 5: Figure S2 and Additional File 1: Table S1), and thus the absolute amounts of transcripts would have changed substantially. This response supports an inverse relationship between C4 photosynthesis- and defense-related gene expression in sorghum.
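The arithmetic behind this argument can be sketched as follows. The sketch assumes the standard RPKM definition (reads per kilobase of transcript per million mapped reads) and applies the reported fold-change range (0.35 to 0.71) to the mock-infected RPKM of 3915.65 for pyruvate orthophosphate dikinase. The fold change for this specific gene is not restated here, so the two results are only bounds; the function names are illustrative and not part of the original analysis.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


def absolute_change(baseline_rpkm, fold_change):
    """Change in RPKM implied by a fold change (negative means a decrease)."""
    return baseline_rpkm * (fold_change - 1.0)


# Mock-infected RPKM reported for pyruvate orthophosphate dikinase.
baseline = 3915.65

# Reported range of fold changes for the downregulated C4 genes; applying it
# to this one gene is an assumption made only to bound the effect.
for fold in (0.35, 0.71):
    print(f"fold {fold:.2f}: change of {absolute_change(baseline, fold):.1f} RPKM")
```

Under these assumptions, even the mildest reported fold change corresponds to a loss of more than 1000 RPKM for a gene expressed at this level, whereas the same fold change applied to a gene with an RPKM of 10 would remove fewer than 3 RPKM.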
Coordinated gene expression for sorghum-specific responses
Our mRNA-seq analysis revealed the transcriptional regulation of key enzymatic steps for synthesizing sorghum-specific phytochemicals. Through our genome-wide analysis we also identified candidate genes responsible for the missing steps of the sequential reactions that lead to the accumulation of phytoalexins. Sorghum BTx623 exhibits typical reddish orange leaf lesions after infection with the conidia of B. sorghicola. Apigeninidin, one of the 3-deoxyanthocyanidins, accumulated after infection with B. sorghicola (Figure 4D). 3-Deoxyanthocyanidin also accumulates after infection with Colletotrichum sublineolum or Cochliobolus heterostrophus. In BTx623 we found coordinated induction and suppression of genes; this included the upregulation of CHS, CHI, an unannotated DFR gene (CUFF.115357.1), and a putative anthocyanidin reductase candidate (Sb06g029550), as well as the suppression of F3H and ANS genes (Figure 4A). These findings suggest that accumulation of 3-deoxyanthocyanidin, but not anthocyanidin, occurs upon infection with B. sorghicola. In another sorghum accession, DK46, anthocyanin pigment accumulates through sequential reactions catalyzed by F3H, DFR, and ANS, suggesting that expression of the genes encoding these proteins has changed during the history of sorghum breeding.
What controls the coordinated expression of such genes? One candidate is Yellow seed1 (Y1), which encodes a MYB-type regulatory protein and plays a pivotal role in pericarp pigmentation with 3-deoxyanthocyanidin in seeds of sorghum; deletion of the Y1 allele in BTx623 produces seeds without these 3-deoxyflavonoid pigments. Expression of a putative flavonoid 3′-hydroxylase (F3′H) gene is under the control of the sorghum Y1 gene in synthesizing 3-deoxyanthocyanidin phytoalexins. P1, a y1 homolog in maize, activates the expression of genes encoding CHS and CHI. In this study, leaf expression of y1 was completely suppressed with or without infection with B. sorghicola (Additional File 1: Table S1), but the genes encoding CHS, CHI, and F3′H were differentially expressed (Additional File 5: Figures S1 and S2). We therefore consider that regulation of phytoalexin synthesis could differ between seed and leaf. Other transcription factors may be responsible for the expression of genes for 3-deoxyanthocyanidin production in the leaves of BTx623. Expression of genes for transcription factor families such as ERF, WRKY, DREB, and the zinc finger family was induced (Additional File 4: Table S4). Transcription factor genes were also duplicated in the sorghum genome. For example, a number of WRKYs have been annotated and have undergone lineage-specific gene expansion during the course of plant evolution: one in Chlamydomonas reinhardtii, 37 in the moss Physcomitrella patens, 74 in Arabidopsis, almost 200 in soybean, and 93 in sorghum [51, 52]. Expansion of the WRKY family is likely to be associated with the ongoing development of highly sophisticated defense mechanisms as plants co-evolve with pathogens.
Dhurrin content could be regulated by changes in both synthesis and degradation (Figure 5A); degradation of dhurrin results in release of HCN, which can be lethal to animals, insects, fungi, and plants [13, 53, 54]. In the sorghum genome we identified genes responsible for dhurrin metabolism, and we showed that pathogen infection favored the accumulation, not degradation, of dhurrin (Figure 5B). As the release of HCN inhibits phytoalexin production, HCN might be more damaging to the plant than to the invader. Therefore, in the case of fungal infection, the genes responsible for dhurrin degradation might be strictly suppressed. Moreover, expression of CYP79A1, which is responsible for the synthesis of p-hydroxymandelonitrile (an intermediate of dhurrin; Figure 5A), was also slightly suppressed by infection (Additional File 1: Table S1), whereas CYP79A1 expression is induced by greenbug feeding. Thus, the dhurrin-related defense mechanism in fungal infection might differ from that in insect feeding.
Evolutionary history of phytoalexin synthesis after sorghum and rice split
The size of the sorghum genome is approximately 730 Mb, which is nearly twice the 389 Mb of the rice genome. This difference in size is due mainly to differences in the content of repetitive sequences: retrotransposon sequences make up 55% of the sorghum genome, compared with 26% of the smaller rice genome. Alignment of genetic and cytological maps suggests that sorghum and rice have similar quantities of euchromatin (252 and 309 Mb, respectively), with a largely collinear gene order. Nevertheless, some of the genes in the sorghum genome were duplicated after the sorghum–rice split. We demonstrated tandem duplication and diverse pathogen-inducible expression of the genes encoding aromatic L-amino acid decarboxylase (Figure 3B), DFR, putative anthocyanidin reductase (Figure 4B), dhurrinase (Figure 5B), PAL, CHS (Additional File 5: Figure S1), F3′H, and polyphenol oxidase (Additional File 5: Figure S2). The synthesis of sorghum-specific phytochemicals was explained by the presence of the sorghum-specific genes encoding p-(S)-hydroxymandelonitrile lyase and CUFF.115357.1/DFR3, which were acquired after the sorghum–rice split (Figures 4D and 5C). The sorghum genome had three tandemly duplicated aromatic L-amino acid decarboxylase genes (Figure 3B) and six PAL genes (Additional File 5: Figure S1), whereas the rice genome had two tandemly duplicated aromatic L-amino acid decarboxylase genes and four PAL genes, suggesting that the extra copies were acquired, and the pathway thus strengthened, after the sorghum–rice split. Cytochrome P450 domain–containing genes, which are often involved in phytoalexin synthesis and the scavenging of toxins, are abundant in sorghum, which has 326 such genes, versus 228 in rice, even though the target products have not been fully identified in vivo.
These duplications in sorghum have likely resulted in the diversity of both their genomic sequences and their expression; these genes have thereby developed different functions on an evolutionary time scale.
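The genome-size comparison at the start of this section can be checked with simple arithmetic. The following sketch multiplies the quoted genome sizes by the quoted retrotransposon fractions; because these are rounded published figures, the result is only an approximation, and the variable and function names are illustrative.

```python
# Genome sizes (Mb) and retrotransposon fractions quoted in the text.
genomes = {
    "sorghum": {"size_mb": 730.0, "retro_fraction": 0.55},
    "rice": {"size_mb": 389.0, "retro_fraction": 0.26},
}


def retrotransposon_mb(genome):
    """Approximate retrotransposon content in Mb."""
    return genome["size_mb"] * genome["retro_fraction"]


size_gap = genomes["sorghum"]["size_mb"] - genomes["rice"]["size_mb"]
retro_gap = retrotransposon_mb(genomes["sorghum"]) - retrotransposon_mb(genomes["rice"])
share = retro_gap / size_gap

print(f"genome size difference: {size_gap:.0f} Mb")
print(f"retrotransposon difference: {retro_gap:.0f} Mb")
print(f"share explained by retrotransposons: {share:.0%}")
```

With these inputs, roughly 300 Mb of the 341 Mb difference in genome size is attributable to retrotransposon content, which is consistent with the statement that repetitive sequences account for most of the difference.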
Advantage of mRNA-seq for identification of pathogen-inducible genes
mRNA-seq provides information on all transcribed genes without the need to rely on annotation. Whole-genome tiling arrays can also be used to identify unannotated transcripts, but not alternative splicing variants; this is an advantage of mRNA-seq over microarray technology. We predicted transcripts on the basis of the piling-up of mapped reads; 7674 transcripts were unannotated in Phytozome (Figure 2A). The differentially expressed unannotated transcripts encoded, for example, proteins similar to DFR, responsible for 3-deoxyanthocyanidin biosynthesis, or to maize ZRP4, which encodes the O-methyltransferase involved in suberin biosynthesis (Figure 2B; Additional File 3: Table S3). Suberin is a component of the polymer matrices in lipophilic cell wall barriers. These barriers control the fluxes of gases, water, and solutes, and they also help to protect plants from biotic and abiotic stresses and to control plant morphology. The unannotated differentially expressed genes could be identified only by mRNA-seq. Moreover, mRNA-seq could identify and distinguish the expression of each duplicated gene; it is therefore a powerful tool for analyzing genomes that have large numbers of such duplications. This application of mRNA-seq has generated many new leads and hypotheses in regard to metabolic pathways. Functional linkage of the transcriptome and metabolome is very important and should be elucidated systematically in the future.
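As a rough illustration of what "unannotated" means here, the following sketch flags predicted transcripts that do not overlap any annotated gene model. It is a simplified stand-in and not the Cufflinks procedure used in this study: it ignores strand and exon structure, and all identifiers and coordinates are hypothetical.

```python
from collections import namedtuple

Interval = namedtuple("Interval", "name chrom start end")


def overlaps(a, b):
    """True if two intervals on the same chromosome share at least one base."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def unannotated(predicted, annotated):
    """Predicted transcripts that overlap no annotated gene model."""
    return [p for p in predicted if not any(overlaps(p, g) for g in annotated)]


# Hypothetical coordinates (1-based, inclusive); identifiers are illustrative.
annotated_genes = [
    Interval("gene_A", "chr1", 1000, 3000),
    Interval("gene_B", "chr1", 8000, 9500),
]
predicted_transcripts = [
    Interval("pred_1", "chr1", 1200, 2800),  # overlaps gene_A
    Interval("pred_2", "chr1", 5000, 6200),  # intergenic
    Interval("pred_3", "chr2", 1000, 3000),  # same span, different chromosome
]

print([p.name for p in unannotated(predicted_transcripts, annotated_genes)])
```

A hybridization probe set designed from existing gene models would have no probe for transcripts such as `pred_2` and `pred_3`, which is why read-based prediction can recover them.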
Minimizing technical error is important. We previously validated our sequence-based gene expression profiling against array-based technology in rice. For each gene from shoots (N = 14,575) and roots (N = 14,861), the ratio obtained from the array and the corresponding ratio obtained from RPKM were highly correlated over a broad range (r = 0.72 in shoots and 0.80 in roots). Moreover, we confirmed the differential expression by qRT-PCR of three biological replicates for the genes of interest (Additional File 6: Figure S3). We therefore consider that our sequence-based approach was generally valid as a gene expression profiling technology.
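A comparison of this kind can be sketched as a Pearson correlation between per-gene expression ratios from the two platforms. The sketch below assumes log2-transformed ratios with a pseudocount of 1, which may differ from the transformation used in the original validation, and the five gene values are hypothetical rather than data from either study.

```python
import math


def log2_ratio(treated, control, pseudocount=1.0):
    """log2 expression ratio; the pseudocount (an assumption) avoids log(0)."""
    return math.log2((treated + pseudocount) / (control + pseudocount))


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical (treated, control) values for five genes; not data from the study.
array_signal = [(820, 400), (150, 310), (95, 90), (2300, 600), (40, 160)]
rpkm_values = [(61.2, 27.5), (8.4, 20.1), (5.0, 5.3), (190.7, 41.0), (1.9, 9.8)]

array_ratios = [log2_ratio(t, c) for t, c in array_signal]
rpkm_ratios = [log2_ratio(t, c) for t, c in rpkm_values]
print(f"r = {pearson_r(array_ratios, rpkm_ratios):.2f}")
```

Working with ratios rather than raw signal removes platform-specific scaling, so the correlation reflects agreement in the direction and magnitude of expression changes.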
Following the rapid progress of massively parallel sequencing technology, whole mRNA sequencing has been used for gene expression profiling in sorghum. While this paper was under review, a transcriptome analysis of Sorghum bicolor in response to osmotic stress and abscisic acid was reported.
Pathogen infection activated the glyoxylate shunt in the TCA cycle; this changes the role of the TCA cycle from energy production to synthesis of cell components. Genes encoding amino acid metabolizing enzymes located at the branch point between primary and secondary metabolism of phytoalexin synthesis or of sulfur-dependent detoxification were upregulated. The coordinated gene expression upon pathogen infection suggests the accumulation of the sorghum-specific 3-deoxyanthocyanidin phytochemicals. Particular genes among tandemly duplicated putative paralogs were highly upregulated. Key enzymes for synthesizing these sorghum-specific phytochemicals were not found in the corresponding region of the rice genome. Therefore, pathogen infection dramatically changed the expression of particular paralogs that putatively encode enzymes involved in the sorghum-specific metabolic network.