Sorghum [Sorghum bicolor (L.) Moench] is a staple cereal crop for millions of people in the marginal, semi-arid environments of Africa and South Asia. Its ability to grow in regions of low and variable rainfall highlights its potential to improve agricultural productivity in widespread water-limited environments [1, 2]. Sorghum originated and evolved across the diverse environmental landscape of Africa, where morphological and physiological adaptation strategies have made it a naturally heat- and drought-tolerant warm-season C4 grass that uses water, nitrogen and energy resources more efficiently than other major crops, including maize and wheat [1, 3, 4]. Occupying seven million hectares of farmland, the United States is currently the world’s top sorghum producer (8.8 million metric tons annually), followed by India (7.0), Mexico (6.9), and Nigeria (4.8) (http://cgiar.org/sorghum). Because sorghum is cultivated in diverse climates and environmental conditions, the challenges of increasing performance and yield on marginal lands and in cooler climates remain at the forefront of sorghum improvement efforts worldwide [5, 6].
Sorghum is globally established as an important source of food, feed, sugar and fiber, and recent interest in bioenergy feedstocks also spotlights sorghum as an attractive alternative for sustainable biofuel production. Built upon the 2009 sorghum reference genome, translational genomic resources have been developed that directly impact research in other closely related C4 feedstock grasses, including switchgrass and Miscanthus [8, 9]. A comprehensive understanding of the genetic and molecular mechanisms that regulate metabolite biosynthesis, transport and storage in these species is essential for the efficient development of biofuel feedstocks.
Global transcriptome profiling further provides a means to access gene networks for the discovery of functional connections among genes, mRNAs, their regulatory proteins, and complex traits that are expressed through coordinated and dynamic gene networks across different tissues and developmental stages. Over the last decade, microarray-based expression profiling has provided a reliable high-throughput platform for genome-wide analysis of gene expression in many organisms. Microarrays offer substantial advantages for functional genomics: they are increasingly cost-effective, provide expression-profiling accuracy comparable to that of RNA-sequencing, and have been shown to provide comprehensive expression data (up to 90% of the transcriptome) in a given tissue. Well-established microarray data analysis tools are also available for querying, visualizing and analyzing the genomes and predicted genes [12, 13], as well as for analyzing the transcriptome profiling data and integrating them with other public datasets [14–17].
To provide insight into the sorghum transcriptome, we generated a record of gene expression across seven tissues and six diverse sorghum genotypes. The choice of samples reflects our aim to develop and enrich the current sorghum transcriptome literature. Previous studies have predominantly focused on reproductive tissues, and the majority of these reports do not represent the complete sorghum transcriptome. Several of these studies have also been limited to the reference genotype (BTx623) or to Keller, a recently resequenced sweet sorghum variety [18–22].
Comparable whole-plant transcriptome maps are available for a number of other model species, including Arabidopsis thaliana, maize (Zea mays), barley (Hordeum vulgare), rice (Oryza sativa) [26, 27], and soybean (Glycine max). These recent transcriptome surveys were each constructed with only one genotype, line or accession of the species of interest, whereas the present study aims to highlight the practical importance of examining expression profiles across diverse tissue types, developmental stages, and genotypes in order to accurately target genes and metabolic pathways for the efficient development of improved feedstocks.
A fundamental understanding of sorghum genomics is necessary for improving sorghum for agronomic and compositional traits. Specifically, genotypes with high biomass and increased levels of fermentable stem sugars are ideal for developing feedstocks for the biofuel industry. We developed this genomic resource, comprising the whole-transcriptome array and the vegetative transcriptome in diverse genotypes and tissues, to facilitate the characterization of molecular networks and regulatory mechanisms governing important metabolic pathways. These include, but are not limited to, cell wall biosynthesis for lignocellulosic biomass and the synthesis, translocation, and storage of fermentable photosynthates for energy content. The relevance of our dataset is demonstrated by genotype- and tissue-specific expression of the phenylpropanoid and lignin biosynthetic pathway genes.
Intended as a readily available public resource for functional gene characterization, the transcriptome data presented here are available through NCBI's Gene Expression Omnibus (GEO) under accession number GSE49879, and the Sorghum Genome Array is available through Affymetrix (http://affymetrix.com).