VitisExpDB: A database resource for grape functional genomics
© Doddapaneni et al; licensee BioMed Central Ltd. 2008
Received: 17 September 2007
Accepted: 28 February 2008
Published: 28 February 2008
The family Vitaceae consists of many different grape species that grow in a range of climatic conditions. In the past few years, several studies have generated functional genomic information on different Vitis species and cultivars, including the European grape vine, Vitis vinifera. Our goal is to develop a comprehensive web data source for Vitaceae.
VitisExpDB is an online MySQL-PHP driven relational database that houses annotated EST and gene expression data for V. vinifera and non-vinifera grape species and varieties. Currently, the database stores ~320,000 EST sequences derived from 8 species/hybrids, their annotation (BLAST top match) details and Gene Ontology based structured vocabulary. Putative homologs for each EST in other species and varieties, along with information on their percent nucleotide identities, phylogenetic relationship and common primers, can be retrieved. The database also includes information on the probe sequences and annotation features of the high density 60-mer gene expression chip, which consists of a non-redundant set of ~20,000 ESTs. Finally, the database includes 14 processed global microarray expression profile sets. Data from 12 of these expression profile sets have been mapped onto metabolic pathways. A user-friendly web interface with multiple search indices and extensively hyperlinked result features that permit efficient data retrieval has been developed. Several online bioinformatics tools that interact with the database, along with other sequence analysis tools, have been added. In addition, users can submit their ESTs to the database.
The developed database provides a genomic resource to the grape community for functional analysis of genes in the collection, and for grape genome annotation and gene function identification. The VitisExpDB database is available through our website http://cropdisease.ars.usda.gov/vitis_at/main-page.htm.
Expressed sequence tags (ESTs) are an abundant genomic resource with 44,203,116 ESTs deposited as of July 2007 in the GenBank repository. ESTs are an important genomic resource for all species, and more so in systems for which no genome sequences are available. In such cases, ESTs are used as the basis for structural genomic annotation. ESTs are also fundamentally important for studying global expression patterns. Utilization of effective bioinformatics tools has broadened the applications of EST analysis into the fields of genomics, marker development and genome annotation among others [2–4].
The V. vinifera-based European grapevine is the most economically important fruit species worldwide, with over 7.4 million hectares under cultivation. Grapes are produced for fruit, juice, raisins, wine and spirits. Currently, a draft genome is available, with comprehensive genome annotation still in progress. Recently, several Vitis species genomic projects have contributed to the growing number of ESTs that are publicly available. There are numerous North American native grape species and cultivars bred from these species that have economic value and also are rich in germplasm resistant to different biotic and abiotic stresses (See Additional File 1).
Table 1. Sources of the Vitis EST sequences in the database. There are a total of 329,964 ESTs currently in the database.
Number of ESTs
V. vinifera (wine grape)
V. hybrid cultivar ++
V. rupestris × V. arizonica ++
V. cinerea × V. rupestris
V. cinerea × V. riparia
In 2004, as part of its GeneChip® Consortia Program, Affymetrix introduced the 16K array GeneChip® Vitis vinifera (Grape) Genome Array ver. 1.0 (Affymetrix® Inc., Santa Clara, CA), fabricated mainly for V. vinifera microarray studies. Recently, there have been six reports of mRNA expression profiling studies using either cDNA or oligo arrays for measuring gene expression profiles for flowers and berry skin development [10–12], and from water-deficit and iso-osmotic salinity stress in grapevine shoot tissues, as well as expression profiles associated with viral diseases and tissue-specific profiles on berry tissues. GrapePLEX is a MIAME-compliant and Plant Ontology-enhanced expression database for Vitis microarray studies that is part of the Plant Expression Database [PLEXdb]. Currently, the expression datasets derived from the GeneChip® Vitis vinifera (Grape) Genome Array ver. 1.0 (Affymetrix® Inc., Santa Clara, CA) can be deposited here.
Other online specialized databases contain information on Vitis ESTs, such as the DFCI grape gene index database, which stores information from V. vinifera ESTs (191,616 in total), including the tentative consensus sequences and their BLAST and Gene Ontology details. Functional tools are also available that allow comparison of EST expression profiles from different libraries and provide information on alternatively spliced forms, among others. Similarly, EST sequence information along with microarray probe information can be obtained from the Plant Genome Database (PlantGDB), where about 210,000 V. vinifera ESTs are stored. Information on the V. vinifera ESTs in different metabolic pathways can be obtained from the KEGG website. More up-to-date data on the clustered EST sequences from different Vitis species with large EST collections can be obtained from the TIGR plant transcript assemblies database. The V. aestivalis var. Norton EST database is available at Missouri State Univ.-Mountain Grove and also includes a defense gene database called GREED.
VitisExpDB was developed to curate and permit easy access to all the available grape EST sources and to integrate them with the microarray data, especially data from custom arrays from non-vinifera varieties and species, which are a known source of biotic and abiotic stress resistance germplasm. Putative homologs have been identified across different Vitis cultivars and species and with the model plant Arabidopsis. Several search and retrieval forms along with online bioinformatics tools were developed to create a comprehensive data warehouse for Vitis genomics research. The database will be updated every six months with available new data sets.
Construction and Content
EST sets were downloaded from NCBI data banks (UniGene for V. vinifera, dbEST for the rest of the Vitis species, Table 1) and searched using the BLASTX algorithm against the entire 'nr' protein database using the NCBI BLAST service. These results were reconfirmed by repeating the similarity search using the Personal BLAST Navigator (PLAN) software server. In both cases, an E value cut-off of 10^-4 was used. The Gene Ontology terms were generated using the High Throughput Gene Ontology Functional Annotation (Ht-Go-Fat) toolkit. Sequence similarity search was carried out using the default BLAST search parameters and a cut-off E value of 10^-4. The generated GO terms were hyperlinked to their definitions and ontologies in the latest release of the "Gene Ontology, OBO v1.2", downloaded from the GO website. The database also lists similar gene sequences among different species of Vitis that were identified using our recently developed nWayComp tool. For this, ESTs from V. vinifera, V. shuttleworthii, V. arizonica, V. aestivalis and V. riparia were subjected to reciprocal BLAST searches using the BLASTN program with an expectation score cut-off of E-10. Potential applications of identifying such putative homologs across different Vitis species include deducing the presumptive function of the cloned ESTs and cloning ESTs from other varieties based on primers designed from conserved regions. Similarly, such putative homologs can be used to develop SSR and SNP markers for varietal identification and for construction of genetic maps for marker assisted breeding. For the V. vinifera dataset, the latest expression profiles downloaded from UniGene build #21 have been wrapped with PHP scripts to dynamically generate digital EST expression profiles across nine different tissue types.
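The reciprocal BLAST step described above can be illustrated with a minimal Python sketch. It assumes that a putative homolog pair is called when two ESTs are each other's best hit below the E value cut-off; this reciprocal-best-hit rule is an assumption made for illustration and is not necessarily the criterion applied by nWayComp. The hit tuples, the EST identifiers and the function names are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

# One BLAST hit: (query_id, subject_id, percent_identity, e_value).
Hit = Tuple[str, str, float, float]


def best_hits(hits: Iterable[Hit], max_evalue: float) -> Dict[str, Hit]:
    """For each query, keep the hit with the lowest E value at or below the cut-off."""
    best: Dict[str, Hit] = {}
    for hit in hits:
        query, _subject, _identity, evalue = hit
        if evalue > max_evalue:
            continue  # hit fails the E value cut-off
        if query not in best or evalue < best[query][3]:
            best[query] = hit
    return best


def reciprocal_best_hits(
    a_vs_b: Iterable[Hit],
    b_vs_a: Iterable[Hit],
    max_evalue: float = 1e-10,
) -> List[Tuple[str, str, float]]:
    """Return (id_in_a, id_in_b, percent_identity) for pairs that are mutual best hits."""
    forward = best_hits(a_vs_b, max_evalue)
    reverse = best_hits(b_vs_a, max_evalue)
    pairs = []
    for query, (_, subject, identity, _) in sorted(forward.items()):
        back = reverse.get(subject)
        if back is not None and back[1] == query:
            pairs.append((query, subject, identity))
    return pairs


# Made-up hits between two species, for illustration only.
vinifera_vs_shuttleworthii = [
    ("vv_001", "vs_101", 96.5, 1e-50),
    ("vv_001", "vs_102", 88.0, 1e-20),
    ("vv_002", "vs_103", 91.0, 1e-5),   # fails the 1e-10 cut-off
]
shuttleworthii_vs_vinifera = [
    ("vs_101", "vv_001", 96.5, 1e-48),
    ("vs_103", "vv_002", 91.0, 1e-5),
]

pairs = reciprocal_best_hits(vinifera_vs_shuttleworthii, shuttleworthii_vs_vinifera)
print(pairs)
```

Carrying the percent identity along with each pair mirrors the kind of information the database reports for putative homologs; in practice the hit tuples would be parsed from BLASTN tabular output rather than typed in by hand.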
We have designed a V. vinifera and non-vinifera EST-enriched custom high density microarray gene chip with a total of 20,020 ESTs (1,947 from the SSH libraries, 40 from the cDNA-AFLP experiments, 10,014 from V. vinifera, 5,470 from V. shuttleworthii, 1,219 from V. aestivalis, 780 from V. rupestris × V. arizonica and 588 from V. riparia). The database includes analyzed global microarray expression profiles generated in hosts infected with the plant pathogenic bacterium X. fastidiosa, which causes Pierce's disease in grapevines. The data were generated from 36 hybridization experiments at three time points: early (1 week), mid (6 weeks) and late (10 weeks) stages of disease development, from both infected and non-infected stem and leaf tissues of resistant and susceptible genotypes. In addition, DNA sequence information on each of the EST sequences on the custom microarrays, along with the spotted probe and annotation details, is accessible. Further, the generated expression profiles have been mapped onto 25 metabolic pathways using TAIR's Pathway Tools Omics Viewer. For this, the normalized, fold-change expression values for the Vitis and X. fastidiosa interaction experiments were mapped onto these pathways. The Arabidopsis gene IDs of putative homologs of the Vitis ESTs on the microarray chip were used for this purpose. Details of the microarray experiments can be viewed at the website. Two other published custom microarray datasets [11, 12] have also been added to the database. Data will be updated every six months.
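As a rough illustration of the pathway-mapping input described above, the following Python sketch converts normalized intensities into log2 fold changes and re-keys them by the Arabidopsis gene ID of each EST's putative homolog. The log2 scale, the averaging of several ESTs that share one Arabidopsis homolog, and all identifiers are assumptions made for this example; the source does not state how such cases were handled.

```python
import math
from typing import Dict, List, Tuple


def log2_fold_change(infected: float, control: float) -> float:
    """Log2 ratio of infected to non-infected normalized intensity."""
    return math.log2(infected / control)


def fold_changes_by_arabidopsis_id(
    expression: Dict[str, Tuple[float, float]],
    homologs: Dict[str, str],
) -> Dict[str, float]:
    """Re-key EST fold changes by Arabidopsis gene ID.

    expression maps EST ID -> (infected, control) normalized intensities.
    homologs maps EST ID -> Arabidopsis gene ID of the putative homolog.
    ESTs without a homolog are skipped; ESTs sharing a homolog are averaged
    (an assumption of this sketch).
    """
    grouped: Dict[str, List[float]] = {}
    for est_id, (infected, control) in expression.items():
        gene_id = homologs.get(est_id)
        if gene_id is None:
            continue
        grouped.setdefault(gene_id, []).append(log2_fold_change(infected, control))
    return {gene_id: sum(values) / len(values) for gene_id, values in grouped.items()}


# Placeholder EST IDs, intensities and gene IDs, for illustration only.
expression = {
    "est_a": (400.0, 100.0),
    "est_b": (100.0, 100.0),
    "est_c": (50.0, 200.0),
    "est_d": (300.0, 100.0),  # no Arabidopsis homolog, so it is skipped
}
homologs = {"est_a": "AT1G01010", "est_b": "AT1G01010", "est_c": "AT2G02020"}

mapped = fold_changes_by_arabidopsis_id(expression, homologs)
for gene_id, value in sorted(mapped.items()):
    print(f"{gene_id}\t{value:.2f}")
```

A tab-delimited listing of gene ID and value, as printed here, is the general shape of input that pathway overlay tools accept; the exact file format expected by the Omics Viewer should be checked against its own documentation.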
Utility and Discussion
On the main search page, there are two side panels. The panel on the left lists hyperlinks to the different search pages and online tools, grouped under the three main components: ESTs, microarrays and the bioinformatics tools. The panel on the right lists hyperlinks to the web pages that describe the contents of the database. Information such as experimental set up, data analysis and other relevant text is provided in these pages. A number of useful query interfaces for data mining, analysis and visualization have been developed. These include simple and advanced search forms that offer single-query or multiple-query search options for both the EST and microarray components.
Under the microarray warehouse, a separate HTML page has been designed that has hyperlinked icons to various metabolic pathways. There are 25 different pathways for each of the 12 microarray experiments studied, created as HTML pages from images generated by mapping the differentially regulated Vitis ESTs as described under the sub section Microarray datasets. A separate web page lists the different pathways, where the Arabidopsis gene IDs have been linked to the putative homolog ESTs in VitisExpDB to retrieve further information on interesting Arabidopsis genes. In addition, a microarray data repository that will include all the custom microarray data sets is under development.
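The single-query and multiple-query search options can be sketched against a small relational table. The example below uses Python's built-in sqlite3 module as a stand-in for the MySQL back end, and the table name, column names and rows are hypothetical; the actual VitisExpDB schema is not described in the text.

```python
import sqlite3
from typing import List, Sequence, Tuple

Row = Tuple[str, str, str]


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical EST annotation table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE est (est_id TEXT PRIMARY KEY, species TEXT, annotation TEXT)"
    )
    conn.executemany(
        "INSERT INTO est VALUES (?, ?, ?)",
        [
            ("EST0001", "V. vinifera", "chalcone synthase"),
            ("EST0002", "V. shuttleworthii", "chitinase"),
            ("EST0003", "V. aestivalis", "putative chitinase class IV"),
        ],
    )
    return conn


def search_by_ids(conn: sqlite3.Connection, est_ids: Sequence[str]) -> List[Row]:
    """Multiple-query search: fetch all records for a list of EST IDs."""
    if not est_ids:
        return []
    # One placeholder per ID keeps the query parameterized.
    placeholders = ", ".join("?" for _ in est_ids)
    sql = (
        "SELECT est_id, species, annotation FROM est "
        f"WHERE est_id IN ({placeholders}) ORDER BY est_id"
    )
    return conn.execute(sql, list(est_ids)).fetchall()


def search_by_keyword(conn: sqlite3.Connection, keyword: str) -> List[Row]:
    """Single-query search: match a keyword anywhere in the annotation text."""
    sql = (
        "SELECT est_id, species, annotation FROM est "
        "WHERE annotation LIKE ? ORDER BY est_id"
    )
    return conn.execute(sql, (f"%{keyword}%",)).fetchall()


conn = build_demo_db()
print(search_by_ids(conn, ["EST0003", "EST0001"]))
print(search_by_keyword(conn, "chitinase"))
```

Passing user input through placeholders rather than string concatenation is the main design point here, since a public web form receives arbitrary text.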
In addition to having the most current EST data pool with annotation and Gene Ontology curation, VitisExpDB also has other unique features, such as information on putative homologs from different Vitis species, information on their Arabidopsis putative homologs, integration of microarray and EST databases, mapping of transcriptional responses on to metabolic pathways and several data analysis tools.
The VitisExpDB database is a valuable resource for broad applications to Vitis genetics and breeding, genomics, proteomics and genome annotation. Future expansion plans of the database include cataloging splice variants from the ESTs, identifying full-length ESTs based on tentative consensus sequences that are backed up by the genomic data and generation of genomic landscape maps of gene expression. Development of cross reference tools for users to compare data between Affymetrix gene chip array and other custom arrays is also planned as a part of future database expansion. The VitisExpDB database will be updated frequently as and when more information becomes available.
Availability and requirements
The database is open and freely available.
Project name: VitisExpDB database;
Project home page: http://cropdisease.ars.usda.gov/vitis_at/main-page.htm;
Operating system(s): Platform independent;
Programming language: Perl, HTML, MySQL and PHP;
We gratefully acknowledge the financial support from the California Citrus Research Board (CRB project No. 5300-05F) for a portion of this work and from the California Department of Food and Agriculture's Pierce's Disease Board. We thank the authors of the PLAN server for kindly accommodating our requirements from time to time for large scale data searches.
- Rudd S: Expressed sequence tags: alternative or complement to whole genome sequences. Trends Plant Sci. 2003, 8: 321-329. 10.1016/S1360-1385(03)00131-6.
- Dong Q, Kroiss L, Oakley FD, Wang BB, Brendel V: Comparative EST analyses in plant systems. Methods Enzymol. 2005, 395: 400-418.
- Ohlrogge J, Benning C: Unraveling plant metabolism by EST analysis. Curr Opin Plant Biol. 2000, 3: 224-228.
- Gupta PK, Rustgi S: Molecular markers from the transcribed/expressed region of the genome in higher plants. Funct Integr Genomics. 2004, 4: 139-162. 10.1007/s10142-004-0107-0.
- The French-Italian Public Consortium for Grapevine Genome Characterization, Jaillon O, Aury JM, Noel B, Policriti A, Clepet C, Casagrande A, Choisne N, Aubourg S, Vitulo N, Jubin C, Vezzi A, Legeai F, Hugueney P, Dasilva C, Horner D, Mica E, Jublot D, Poulain J, Bruyere C, Billault A, Segurens B, Gouyvenoux M, Ugarte E, Cattonaro F, Anthouard V, Vico V, Del Fabbro C, Alaux M, Di Gaspero G, Dumas V, Felice N, Paillard S, Juman I, Moroldo M, Scalabrin S, Canaguier A, Le Clainche I, Malacrida G, Durand E, Pesole G, Laucou V, Chatelet P, Merdinoglu D, Delledonne M, Pezzotti M, Lecharny A, Scarpelli C, Artiguenave F, Pe ME, Valle G, Morgante M, Caboche M, Adam-Blondon AF, Weissenbach J, Quetier F, Wincker P: The grapevine genome sequence suggests ancestral hexaploidization in major angiosperm phyla. Nature. 2007, 449: 463-467. 10.1038/nature06148.
- Da Silva FG, Iandolino A, Al-Kayal F, Bohlmann MC, Cushman MA, Lim H, Ergul A, Figueroa R, Kabuloglu EK, Osborne C, Rowe J, Tattersall E, Leslie A, Xu J, Baek J, Cramer GR, Cushman JC, Cook DR: Characterizing the grape transcriptome. Analysis of expressed sequence tags from multiple Vitis species and development of a compendium of gene expression during berry development. Plant Physiol. 2005, 139: 574-597. 10.1104/pp.105.065748.
- Moser C, Segala C, Fontana P, Salakhudtinov I, Gatto P, Pindo M, Zyprian E, Toepfer R, Grando MS, Velasco R: Comparative analysis of expressed sequence tags from different organs of Vitis vinifera L. Funct Integr Genomics. 2005, 5: 208-217. 10.1007/s10142-005-0143-4.
- Peng FY, Reid KE, Liao N, Schlosser J, Lijavetzky D, Holt R, Martinez Zapater JM, Jones S, Marra M, Bohlmann J, Lund ST: Generation of ESTs in Vitis vinifera wine grape (Cabernet Sauvignon) and table grape (Muscat Hamburg) and discovery of new candidate genes with potential roles in berry development. Gene. 2007, 402: 40-50. 10.1016/j.gene.2007.07.016.
- Lin H, Doddapaneni H, Takahashi Y, Walker MA: Comparative analysis of ESTs involved in grape responses to Xylella fastidiosa infection. BMC Plant Biology. 2007, 7: 8. 10.1186/1471-2229-7-8.
- Terrier N, Glissant D, Grimplet J, Barrieu F, Abbal P, Couture C, Ageorges A, Atanassova R, Léon C, Renaudin JP, Dédaldéchamp F, Romieu C, Delrot S, Hamdi S: Isogene specific oligo arrays reveal multifaceted changes in gene expression during grape berry (Vitis vinifera L.) development. Planta. 2005, 222: 832-847. 10.1007/s00425-005-0017-y.
- Waters DL, Holton TA, Ablett EM, Lee LS, Henry RJ: cDNA microarray analysis of developing grape (Vitis vinifera cv. Shiraz) berry skin. Funct Integr Genomics. 2005, 5: 40-58. 10.1007/s10142-004-0124-z.
- Waters DL, Holton TA, Ablett EM, Lee LS, Henry RJ: The ripening wine grape berry skin transcriptome. Plant Science. 2006, 171: 132-138. 10.1016/j.plantsci.2006.03.002.
- Cramer GR, Ergul A, Grimplet J, Tillett RL, Tattersall EA, Bohlman MC, Vincent D, Sonderegger J, Evans J, Osborne C, Quilici D, Schlauch KA, Schooley DA, Cushman JC: Water and salinity stress in grapevines: early and late changes in transcript and metabolite profiles. Funct Integr Genomics. 2007, 7: 111-134. 10.1007/s10142-006-0039-y.
- Espinoza C, Vega A, Medina C, Schlauch K, Cramer G, Arce-Johnson P: Gene expression associated with compatible viral diseases in grapevine cultivars. Funct Integr Genomics. 2007, 7: 95-110. 10.1007/s10142-006-0031-6.
- Grimplet J, Deluc LG, Tillett RL, Wheatley MD, Schlauch KA, Cramer GR, Cushman JC: Tissue-specific mRNA expression profiling in grape berry tissues. BMC Genomics. 2007, 8: 187. 10.1186/1471-2164-8-187.
- Web address of GrapePLEX. [http://www.plexdb.org/plex.php?database=Grape]
- Web address of the DFCI Grape Gene Database. [http://compbio.dfci.harvard.edu/tgi/cgi-bin/tgi/gimain.pl?gudb=grape]
- The Plant Genome Database: PlantGDB. [http://gremlin3dev.gdcb.iastate.edu/]
- Web address of the KEGG Vitis Metabolic Pathway Maps database. [http://www.genome.jp/kegg-bin/show_organism?menu_type=pathway_maps;org=evvi]
- Web address of the TIGR Plant transcript assemblies. [http://plantta.tigr.org/cgi-bin/plantta_release.pl]
- Web address for the Grape Resistance-gene ESTs and Expressions database (GREED). [http://mtngrv.missouristate.edu/CGB/NortonGeneDatabase/NortonGeneDatabase.htm]
- He J, Dai X, Zhao X: PLAN: a web platform for automating high-throughput BLAST searches and for managing and mining results. BMC Bioinformatics. 2007, 8: 53. 10.1186/1471-2105-8-53.
- Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, Harris MA, Hill DP, Issel-Tarver L, Kasarskis A, Lewis S, Matese JC, Richardson JE, Ringwald M, Rubin GM, Sherlock G: Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nature Genetics. 2000, 25: 25-29. 10.1038/75556.
- Web address of the High Throughput Gene Ontology Functional Annotation (Ht-Go-Fat) toolkit [188.8.131.52/ht-go-fat.htm]
- Web address of the GO web site. [http://www.geneontology.org/GO.downloads.shtml]
- Yao J, Lin H, Doddapaneni H, Civerolo EL: nWayComp: A Tool for universal comparison of DNA and protein sequences. In Silico Biology. 2007, 20-
- TAIR's Pathway Tools Omics Viewer. [http://www.arabidopsis.org:1555/expression.html]
- Web address of the XCluster software. [http://genetics.stanford.edu/~sherlock/cluster.html]
- Web address of the VitisExpDB. [http://cropdisease.ars.usda.gov/vitis_at/main-page.htm]
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.