Whole genome wide expression profiles of Vitis amurensis grape responding to downy mildew by using Solexa sequencing technology

Background Downy mildew (DM), caused by the pathogen Plasmopara viticola (PV), is the single most damaging disease of grapes (Vitis L.) worldwide. However, the mechanisms of disease development in grapes are poorly understood. A method for estimating gene expression levels using Solexa sequencing of Type IIS restriction-endonuclease-generated cDNA fragments was used for deep sequencing of the transcriptomes of PV-infected leaves of Vitis amurensis Rupr. cv. Zuoshan-1. Our goal was to identify genes that are involved in resistance to grape DM disease. Results Approximately 8.5 million (M) 21-nt cDNA tags were sequenced in the cDNA library derived from PV pathogen-infected leaves, and about 7.5 M were sequenced from the cDNA library constructed from the control leaves. When annotated, a total of 15,249 putative genes were identified from the Solexa sequencing tags for the infection (INF) library and 14,549 for the control (CON) library. Comparative analysis between these two cDNA libraries showed that about 0.9% of the unique tags increased by at least five-fold and about 0.6% of the unique tags decreased by more than five-fold in infected leaves, while 98.5% of the unique tags showed less than a five-fold difference between the two samples. The expression levels of 12 differentially expressed genes were confirmed by Real-time RT-PCR, and the trends observed agreed well with the Solexa expression profiles, although the degree of change was lower in amplitude. After pathway enrichment analysis, a set of significantly enriched pathways was identified for the differentially expressed genes (DEGs); these pathways were associated with ribosome structure, photosynthesis, and amino acid and sugar metabolism. Conclusions This study presented a series of candidate genes and pathways that may contribute to DM resistance in grapes, and illustrated that the Solexa-based tag-sequencing approach was a powerful tool for gene expression comparison between control and treated samples.


Background
Downy mildew of grapes occurs in most parts of the world where grapes are grown, but favors those regions that experience warm, wet conditions during the vegetative growth of the vine. A major outbreak of the disease can cause severe losses in yield and berry quality. Symptoms of DM are usually first noticed on leaves as yellowish and later oily lesions on the leaf's upper surface, with a 'downy' mass observed on the corresponding underside of the leaf. It can also cause deformation of shoots, tendrils, inflorescences and clusters of young berries. Berries become less susceptible as they mature; however, rachis infection can spread into the older fruit, which leads to direct crop loss by shelling of berries [1]. Downy mildew is caused by the pathogen Plasmopara viticola (PV). Primary infection begins with the overwintering oospore on infected leaves or plant litter in the soil, which germinates in the spring and produces a sporangium [2]. When plant parts are covered with a film of moisture from rain or irrigation, the sporangium releases small swimming spores (zoospores) that are then spread by splashing water. The spores can germinate by producing a germ tube that enters the green tissue (including leaves, inflorescences, bunches and young berries) through the stomata [3]. Secondary infection, which is the major source of disease spread, produces spores that may be mobilized by wind and rain to establish new infection sites. The cycle ends with the sexual production of overwintering oospores [2].
Different genotypes of grapes show varying levels of resistance to PV, ranging from susceptible V. vinifera, to the moderately resistant V. rupestris and V. amurensis, V. cinerea, V. riparia and V. candicans, to the totally resistant Muscadinia rotundifolia [4][5][6]. The worldwide grape industry relies predominantly on V. vinifera, which requires chemical protection to produce healthy fruit. However, such chemicals may have negative environmental impacts and/or pose risks to human health. A promising alternative strategy that could simultaneously improve grape health and limit chemical use is to identify the unique genes or mechanisms from resistant species that could potentially confer resistance to the pathogen or reduce the presentation of symptoms. These elements may potentially be introduced into V. vinifera through long-term breeding efforts or transgenic methods. With this perspective, it is important to unravel the molecular basis of natural defense responses in resistant grapevines to DM challenge, including identification of the genetic processes that may contribute to resistance.
To understand the mechanism(s) of host resistance at the molecular level, a critical first step is to identify the transcripts that accumulate in response to pathogen attack. In this study, "Zuoshan-1", a clonal selection from wild V. amurensis with cold hardiness and high resistance to DM [24], was employed to identify a set of candidate genes associated with DM resistance using Solexa sequencing technology. Solexa sequencing is a technology capable of obtaining novel information on whole-genome-wide transcript expression without prior sequence knowledge. This report presents the findings of these tests.

Inoculation and symptom development
The fourth unfolded leaf from the shoot apex of "Zuoshan-1" was inoculated with PV. No visible symptoms were observed in the first 4 days (Figure 1a and 1b). The 'downy' mass was clearly visible on the 6th day (Figure 1c) and more pronounced on the 8th day (Figure 1d). Oil spots emerged gradually at the site of infection, and the spores had not spread to other healthy tissues 18 days after inoculation (Figure 1e and 1f).

Tag identification and quantification
A total of 8,549,948 and 7,527,499 tags were sequenced in infected (INF) and control (CON) libraries, respectively (Table 1). After filtering out low quality tags (tags containing 'N' and adaptor sequences), 8,474,583 and 7,525,307 tags (noted herein as "clean" tags) remained in INF and CON libraries. To increase the robustness of the approach, single-copy tags in the two libraries (247,900 in INF and 253,156 in CON library) were excluded from further analysis. As a result, a total of 8,226,683 and 7,272,151 clean tags remained from the two libraries, from which 233,653 (INF) and 203,514 (CON) unique tags were obtained. There were 30,139 more unique tags in the INF than in the CON library, possibly representing genes related to pathogen interaction and symptom development. The percentage of unique tags rapidly declined as copy number increased, indicating only a small portion of the transcripts were expressed at high level in the conditions tested.
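The two filtering steps described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `filter_tags`, the `adaptor_only` parameter and the toy tag strings are all hypothetical, and real tags would be 21 nt long.

```python
from collections import Counter

def filter_tags(raw_tags, adaptor_only=frozenset()):
    """Return copy numbers of the tags that survive both filtering steps."""
    # Step 1: drop low-quality tags (ambiguous base 'N') and adaptor-only reads.
    clean = [t for t in raw_tags if "N" not in t and t not in adaptor_only]
    counts = Counter(clean)
    # Step 2: drop single-copy tags, as done in the text for robustness.
    return Counter({tag: n for tag, n in counts.items() if n > 1})

# Toy input (hypothetical, shortened tags).
raw = ["CATGAAAC", "CATGAAAC", "CATGNAAC", "CATGTTTG",
       "CATGCCCA", "CATGCCCA", "CATGCCCA"]
kept = filter_tags(raw)
total_tags = sum(kept.values())   # tags remaining after both filters
unique_tags = len(kept)           # distinct tag sequences remaining

# Consistency check on the counts reported in the text:
# clean tags minus single-copy tags equals the tags carried forward.
assert 8_474_583 - 247_900 == 8_226_683   # INF
assert 7_525_307 - 253_156 == 7_272_151   # CON
```

The same subtraction also reproduces the reported difference in unique tags between the libraries (233,653 - 203,514 = 30,139).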

Depth of sampling
Saturation of the library is determined by identification of unique tags. Sequencing reaches saturation when no new unique tags are detected. The results shown in Figure 2 indicate that the INF and CON libraries were sequenced to saturation, producing a full representation of the transcripts in the conditions tested. In both libraries, fewer new unique tags were identified as the number of sequenced tags increased, reaching a plateau shortly after 6 M tags were sequenced. No new unique tags were identified as the total tag number approached 8.5 M in the INF library and 7.5 M in the CON library.
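A saturation curve of this kind can be computed by tracking how many distinct tags have been seen as sequencing proceeds. The sketch below is only an illustration under simple assumptions: `saturation_curve`, `step` and the toy stream are hypothetical names and data, and the tags are taken in the order they are read.

```python
def saturation_curve(tag_stream, step):
    """Cumulative number of distinct tags seen after every `step` tags."""
    seen = set()
    curve = []
    for i, tag in enumerate(tag_stream, start=1):
        seen.add(tag)
        if i % step == 0:
            curve.append((i, len(seen)))
    return curve

# Toy stream of tag identifiers (hypothetical).
stream = ["A", "B", "A", "C", "B", "A", "C", "A"]
curve = saturation_curve(stream, step=2)

# Number of new distinct tags found in each interval; a run of zeros
# at the end corresponds to the plateau described in the text.
new_per_interval = [curve[0][1]] + [
    later[1] - earlier[1] for earlier, later in zip(curve, curve[1:])
]
```

In this toy case the curve flattens after the fourth tag, which is the pattern the text reports after about 6 M tags.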

Annotation analysis of the unique tag
The unique tags were compared against the genome and gene sequences of V. vinifera cv. Pinot Noir [25] using blastn. Tags with a complete match or a one-base-pair mismatch were considered further. The annotation results are summarized in Table 2. These data indicated that approximately 50% of the transcripts predicted in grape were expressed in the infected or control leaves, with more transcripts present in the infected sample. Tags with no homology to grape were compared by blastn against the VBI Microbial Database [26], which contains genomic sequence information from Phytophthora sojae, Phytophthora infestans and Hyaloperonospora parasitica. In the INF library, 251 tags were found to be identical to oomycete sequences (additional file 1).
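The acceptance rule (perfect match or a single mismatch) can be illustrated with a plain Hamming-distance comparison. This is a stand-in for the blastn search used in the study, not a reimplementation of it; the reference dictionary, gene names and function names are hypothetical.

```python
def mismatches(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def annotate(tag, reference_tags, max_mismatch=1):
    """Return the genes whose reference tag matches within max_mismatch."""
    return [gene for gene, ref in reference_tags.items()
            if len(ref) == len(tag) and mismatches(tag, ref) <= max_mismatch]

# Hypothetical 21-nt reference tags (CATG plus 17 downstream bases).
reference = {
    "geneA": "CATG" + "ACGT" * 4 + "A",
    "geneB": "CATG" + "T" * 17,
}

perfect = annotate("CATGACGTACGTACGTACGTA", reference)   # 0 mismatches
one_off = annotate("CATGACGTACGTACGTACGTT", reference)   # 1 mismatch
two_off = annotate("CATGACGTACGTACGTACGAT", reference)   # 2 mismatches
```

Tags with two or more mismatches to every reference tag return an empty list, which corresponds to the tags that were excluded from annotation.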

Comparison of gene expression level between the two libraries
Differences of tag frequencies that appeared in the INF and CON libraries were used for estimating gene expression levels in response to PV infection. The transcripts detected with at least two-fold differences in the two libraries are shown in Figure 3 (FDR <0.001). The red dots (3,125) and green dots (1,847) represent transcripts higher or lower in abundance for more than two fold in INF library, respectively. The blue dots represent transcripts that differed less than two fold between the two libraries, which were arbitrarily designated as "no difference in expression". The DEGs with five fold or greater differences in accumulation were shown in Figure 4. A total of 513 genes (about 0.9% total unique tags) increased by at least five fold, and 167 genes (about 0.6% total unique tags) were decreased by at least five fold in the INF library, while the expression level of 98.5% unique tags was within five-fold difference between the two samples.
Of the DEGs with differences greater than twenty-fold (Table 3), 69 genes were present at higher levels in the INF library, 67 of which were associated with defense (6), transport (3), transcription (11), signal transduction (14) and metabolism (33). The most strongly induced DEG was a phosphate-induced protein gene, which was present at 229-fold the control level. Among these highly expressed genes, many were associated with senescence and with abiotic and biotic stresses.
Fifteen DEGs were less abundant in the INF library by twenty-fold or more; these are also listed in Table 3. Thirteen of them were classified as defense (2) or metabolism (11), including genes encoding cytochrome P450 and PR proteins. The largest decreases were observed for (-)-germacrene D synthase and immunoglobulin/major histocompatibility complex, both of which were 164-fold lower in the INF library than in the CON library.

Real-time RT-PCR analysis
In order to validate Solexa expression profiles, the steady-state transcript levels of 12 "defense related" genes were analyzed. Among them, seven genes (CHI4D, TL3, PR10, TIP2;1, CYSP, ERF4, STS5) were upregulated and five genes (THX, SHM1, HypP, GLO, ClpP) were downregulated (Figure 5). Actin, tested to be stable in our previous work, was chosen as a reference gene for data normalization. The trend of RT-PCR based expression profiles among these selected genes was similar to those detected by the Solexa-sequencing based method. However, the scales of difference between the INF and CON were generally smaller in Real-time PCR (1-18 fold differences) than in those detected by the Solexa-sequencing based method (2-57 fold) (Table 4).

Figure 3
Comparison of gene expression level between the two libraries. For comparing gene expression level between the two libraries, each library was normalized to 1 million tags. Red dots represent transcripts more prevalent in the infected leaf library, green dots show those present at a lower frequency in the infected tissue and blue dots indicate transcripts that did not change significantly. The parameters "FDR <0.001" and "log2 Ratio ≥ 1" were used as the threshold to judge the significance of gene expression difference.

Note: *percentage of matched tags/total tags; # percentage of matched genes/total assembled CDs of "Pinot Noir".

Pathway enrichment analysis of DEGs
The biological pathways affected by PV were evaluated by enrichment analysis of DEGs. Significantly enriched metabolic pathways and signal transduction pathways were identified. A total of 115 pathways were affected by upregulated DEGs and 107 by downregulated DEGs (additional files 2 and 3). DEGs with pathway annotation were listed according to enrichment priority (additional files 4 and 5). The first ten enriched pathways are reported in Table 5. Pathways with a Q value < 0.05 are significantly enriched. Ribosome-associated proteins constituted the only significantly affected pathway for the upregulated DEGs (Q < 0.05). Other pathways that were not significantly enriched but contained a large number of upregulated DEGs included amino sugar and nucleotide sugar metabolism, starch and sucrose metabolism, secondary metabolism, plant hormone biosynthesis, and spliceosome-associated proteins. There were more significantly enriched pathways (10) for the downregulated DEGs, which were involved in photosynthesis, as well as metabolism of folate, nicotinate, nicotinamide, fructose, mannose, pyruvate, polyketide sugar unit, and purines, along with alkaloids from histidine and purines.
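The text does not state which statistic was used for enrichment. A common choice for pathway enrichment is the hypergeometric upper-tail probability, sketched here purely as an assumption; the function name and the toy numbers are hypothetical, and the multiple-testing correction that yields Q values is not shown.

```python
from math import comb

def hypergeom_pvalue(N, M, n, m):
    """P(X >= m) for a hypergeometric draw.

    N: annotated genes in the background
    M: background genes belonging to the pathway
    n: DEGs with pathway annotation
    m: DEGs that fall in the pathway
    """
    upper = min(M, n)
    favourable = sum(comb(M, k) * comb(N - M, n - k)
                     for k in range(m, upper + 1))
    return favourable / comb(N, n)

# Toy example: 10 background genes, 4 in the pathway,
# 3 DEGs of which 2 are in the pathway.
p = hypergeom_pvalue(N=10, M=4, n=3, m=2)
```

Under this assumed test, a pathway is reported as enriched when its corrected value falls below the Q < 0.05 cut-off mentioned above.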

Discussion
In this report, Solexa sequencing technology, a high-throughput DNA sequencing approach, was utilized to estimate gene expression in libraries prepared from infected and control tissues. The results (Figure 2) provided estimates of gene expression as determined by the frequency with which any given tag (representing a transcript) was sequenced. The data indicate that there was sufficient coverage depth to reach saturation, that is, a complete assessment of all transcripts present in the libraries.
Theoretically, the rate of novel tag discovery should equal zero if all unique tags of the initial sample had been sequenced. However, this number might be slightly higher because new tags may be added due to the accumulation of sequencing errors as the size of the library increased [27]. Strict filtering and conservative matching allow recognition of erroneous tags, which are then disregarded. All of these steps may contribute to a loss of substantial sequence information. However, loss of some data potentially made the results more conservative, revealing only robust and bona fide differences. Moreover, the total number of tags after stringent filtering was sufficient for annotation to the reference genes in the grape genome sequence. Theoretically, tags should be generated by NlaIII from the 3'-most ends of transcripts, but almost 50% of the tags in our results came from other NlaIII sites. Since only one tag could be generated in each transcript from any NlaIII site in a cDNA, these other NlaIII tags represented a given gene redundantly in the expression profile. This phenomenon accounts for the inflated number of unique tags generated (about 200,000) relative to the number of genes annotated in the grape genome (about 30,000). These other tags may also arise because of alternative splicing or incomplete enzyme digestion.
The results represent the first large-scale investigation of gene expression in grapevine in response to DM. Polesani et al. [28] reported 804 transcripts identified in PV-infected leaves of the susceptible cultivar "Riesling" using cDNA-AFLP. Figueiredo et al. [29] found 121 transcripts, representing 29 unique genes, differentially expressed between two V. vinifera cultivars, "Regent" and "Trincadeira" (resistant and susceptible to fungi, respectively), by cDNA microarray. In the current study, 15,249 putative genes were identified among the Solexa sequencing tags for the INF library and 14,549 for the CON library.
The steady-state transcript level for a set of selected genes was confirmed by Real-time RT-PCR. Although the differences in gene expression did not match the magnitude of those detected by the Solexa-based sequencing method, the trends of up- and down-regulation were similar. The lower expression level detected by Real-time RT-PCR could be due to the difference in sensitivity between the two technologies. Solexa sequencing has been documented to be more sensitive for estimation of gene expression, especially for low-abundance transcripts, compared to microarrays and Real-time RT-PCR [30]. The difference could also be attributed to different inoculation seasons and developmental stages of the grapevines. The materials used for the Solexa sequencing method were obtained from plants inoculated and harvested in September, while materials used for the Real-time RT-PCR analyses were obtained from plants inoculated and harvested in June.

Figure caption: The "x" axis represents fold-change of differentially expressed unique tags in the INF library. The "y" axis represents the number of unique tags (log10). Unique tags differing by less than 5-fold between libraries are shown in the red region (98.49%). The blue (0.89%) and green (0.61%) regions represent unique tags that are up- and down-regulated by more than 5-fold in the INF library, respectively.
Due to the sensitivity of Solexa sequencing technology, many rare transcripts were detected. Among 536 transcripts present predominantly (<2-20 fold) in the INF library, 89 were not detected in the CON library at all. These genes were predicted to be involved in many plant biological processes, including defense. For example, genes encoding cinnamyl alcohol dehydrogenase, lipase-like protein, glutathione synthetase, GDSL-motif lipase, ankyrin repeat family protein, serine hydrolase, proline-rich cell wall protein and multicopper oxidase were previously described as plant defense-related genes. Other rare transcripts detected by Solexa technology were predicted to function in signal transduction (protein kinase, calcium ion binding protein, wall-associated kinase), transport (type IIIa membrane protein, ATP binding protein, D-galactonate transporter, peptide transporter), transcription (CCAAT-binding transcription factor, AP2/ERF domain-containing transcription factor, mutator-like transposase-like protein), and protein metabolism (ubiquitin-protein ligase, 50S ribosomal protein, S-locus-specific glycoprotein S13 precursor, Rab5-interacting protein). Two novel genes (nectar protein 1, vernalization-insensitive protein) and some genes encoding hypothetical proteins (LOC100244011, LOC100258240, LOC100249110) were also identified among the PV-induced rare DEGs. Among the 608 rare transcripts more abundant in CON than in INF, 69 were not detected at all in the INF library.
Most of these transcripts have predicted biological functions in growth regulation (growth regulator protein, A-type cyclin, auxin response factor 8), transport (ATP-binding cassette transporter, AWPM-19-like membrane family protein, copper-transporting ATPase P-type), signal transduction (serine-threonine protein kinase, leucine-rich repeat family protein, calcium-binding EF hand family protein, calcium-dependent phospholipid binding), and metabolism (galacturonosyltransferase 6, methylenetetrahydrofolate dehydrogenase, iron ion binding/oxidoreductase, trehalose-6-phosphate synthase, senescence-associated protein).
Pathway enrichment analysis revealed the most significantly affected pathways during the PV infection in "Zuoshan-1". It is not surprising that the "ribosome-related" pathway was the most affected for the DEGs more common in the INF library. This finding implies that the grapevine utilizes new ribosomes or changes in ribosome components to help synthesize additional proteins, such as PR proteins, to protect itself from pathogen attack. The second most affected pathway was the "amino sugar and nucleotide sugar metabolism" pathway. In this pathway, genes encoding chitinase were more prevalent in the INF than in the CON library. In addition, genes required for cell wall biosynthesis were also affected, such as D-xylan synthase, UDP-glucose dehydrogenase, and UDP-glucose 4,6-dehydratase. These enzymes are involved in the interconversion of nucleotide sugars, and may regulate glycosylation patterns in response to the pathogen, thereby linking signaling with primary metabolism and the dynamics of the extracellular matrix. The other noticeable pathways with a large number of DEGs associated with PV infection were starch and sucrose metabolism, secondary metabolism, plant hormone biosynthesis, and spliceosome-associated proteins. For DEGs less prevalent in infected vs. control libraries, there was significant enrichment for transcripts associated with photosynthesis.
This result was similar to the reports of Polesani et al. [28,31]. Photosystem I proteins (PsaA, PsaB, PsaC), photosystem II proteins (PsbB, PsbD, PsbO, PsbP, PsbS), cytochrome b6/f complex (PetD, PetN) and F-type ATPase (beta, alpha, delta, a, b) were all substantially lower in abundance in the INF library compared to the CON library. The reduction of photosynthesis was possibly due to the increase of invertase activity in the nucleotide sugar metabolism pathway. Invertase would cleave sucrose into hexose sugars, whose accumulation inhibits the Calvin cycle.
It was observed that 251 tags identified in the INF library were homologous to oomycete sequences, indicating that they may belong to PV transcripts, as expected given the presence of the pathogen. Many of these putative PV transcripts corresponded to genes involved in protein metabolism (16S, 18S, 26S, 28S and 60S ribosomal protein subunits) as a requirement for protein synthesis in the pathogen during the plant-pathogen interaction. Many housekeeping genes (alpha-tubulin, elongation factor 1 alpha, ubiquitin and heat shock protein 70) and genes related to immune response (spike 1 protein and cyclophilin) were also detected. Several PV transcripts showed similarity to enzymes involved in carbohydrate and amino acid metabolism (chlorophyll apoprotein, aspartate aminotransferase, glutamine synthetase and hyaluronoglucosaminidase-4), energy production (ATP synthase subunit B, glyceraldehyde-3-phosphate dehydrogenase, phosphoenolpyruvate carboxykinase and nitrate reductase), and cellular transport (transportin 1, K+ channel protein and calmodulin).

Transcripts more abundant in infected leaves
A set of transcripts were clearly more abundant in tissue arising after PV infection compared to control. This group possibly contains elements that confer resistance to the spread of the pathogen in "Zuoshan-1". Among these transcripts, those expressed at a relatively high level in infected tissue are of the most interest. These transcripts likely encode genes responding to the pathogen or genuine factors that underlie genetic resistance, which were broadly grouped into the following categories based on their known roles in other plant systems.

Defense response genes
Among defense response genes, thaumatin-like protein [17], polygalacturonase-inhibiting protein (PGIP) [32,33], harpin-induced protein-related [34,35], glutaredoxin [36,37] and beta-glucosidase [38,39] have been widely studied in plant pathogen resistance. Thaumatin-like protein, like many other disease resistance proteins [40], is also induced by abiotic stresses, which may indicate crosstalk between pathogen and abiotic stress responses. In this category, tobacco mosaic virus (TMV) response-related protein (+32 fold in INF vs CON) is associated with TMV attack and may also play an important role in DM resistance of grape.

Transport
Three transcripts were associated with transport function. Multidrug resistance pump proteins (+121 fold in INF vs CON) and multidrug resistance ABC transporter (+25 fold in INF vs CON) are well-known transporters in clinical studies of bacterial infection in humans [41]. Such transporters have also been isolated from plants, such as Coptis japonica [42]. They transport several compounds associated with multidrug (antibiotic) resistance, which can inhibit pathogen infection in animal models [41,43]. Another gene identified as transport related is mitochondrial dicarboxylate carrier protein (+38 fold in INF vs CON), which might be involved in the excretion of organic acids and rhizotoxic aluminum tolerance [44].

Signal transduction
There were fourteen transcripts in our results associated with signal transduction. Two came from genes (GSVIVT00030628001, GSVIVT00030574001) encoding leucine-rich repeat receptor-like protein kinases, which were more prevalent (145 and 20 fold) in the INF library than in the control. Molecules that indicate the presence of a pathogen (elicitors) activate host receptors, which rapidly generate an internal signal that triggers early defense responses [45]. Various signals represented in our results, including phytohormones like ABA and ethylene, as well as intracellular messengers like calcium, phosphoinositide and kinases, have been proposed to regulate plant responses in adverse environmental conditions and thus contribute to the coordination of plant stress physiology [46]. Transcripts representing three kinase-encoding genes (GSVIVT00030628001, GSVIVT00006178001, GSVIVT00019504001) were present 52-145 fold higher in INF than CON, and have been widely documented as signaling factors in many stresses [47][48][49][50] and senescence [51]. Four transcripts (GSVIVT00002706001, GSVIVT00020989001, GSVIVT00036549001, GSVIVT00002973001) were found to be more abundant (27 to 39 fold) in INF than CON, and were associated with the calcium signaling pathway. All of these are also induced by senescence [52] and many stresses [53,54]. Nodulin-like protein (+23 fold in INF vs CON), induced by fungal pathogen treatment [55] and drought/heat combination stress [40], has been shown to be involved in the salicylic acid (SA) signaling pathway [56]. A RING-H2 gene (+22 fold in INF vs CON) has demonstrated regulatory functions in ABA signaling [57], drought tolerance [57], and regulation of growth and defense responses against abiotic/biotic stresses [58]. Ethylene-regulated transcript 2 (ERT2) (+34 fold in INF vs CON) is involved in the ethylene response 'circuit', including ethylene synthesis, perception, signal transduction and regulation of gene expression [59].
The PAR-1a (photoassimilate-responsive) protein (+22 fold in INF vs CON) is a serine/threonine kinase with diverse phosphorylation targets and has been reported to be induced by infection with potato virus Y [60,61].

Synthesis of the hormones
S-adenosyl-L-methionine (GSVIVT00024884001) and 9-cis-epoxycarotenoid dioxygenase 1 (NCED1) (GSVIVT00000988001) are transcripts related to the synthesis of plant hormones, and were found more frequently (97 and 62 fold, respectively) in the INF library. S-adenosyl-L-methionine is the precursor of ethylene [70], which participates in the regulation of growth, development, and responses to stress and pathogen attack in plants [71]. NCED is an important enzyme in the synthesis of the phytohormone ABA, which plays a central role in responses to pathogen attack [72].

Secondary metabolism
This subcategory contained 4 genes, including a higher level of tropinone reductase (GSVIVT00018424001, +48 fold in INF vs CON) transcript in infected leaves, consistent with previous reports showing it to be more abundant after pathogen infection [81]. Isoflavone reductase-like protein 3 (GSVIVT00019233001, +31 fold in INF vs CON) also has a potential pathogen resistance role because it is involved in the biosynthesis of isoflavonoid phytoalexins [82], important products in resistance to pathogen infection [83,84]. UDP-glucose glucosyltransferase (GSVIVT00002450001, +24 fold in INF vs CON) and galactinol synthase (GSVIVT00019669001, +24 fold in INF vs CON) are reported to be induced by abiotic stresses [85,86].

Cell wall organization
Three genes were classified into this subcategory. Cellulose synthase-like D1 (GSVIVT00014029001, +31 fold in INF vs CON) and beta-expansin 1a precursor (GSVIVT00036225001, +27 fold in INF vs CON) contribute to cell wall synthesis and modification [87,88]. The wound-induced protein (WIN2) (GSVIVT00007452001, +26 fold in INF vs CON), which has anti-fungal activity [89], possesses a domain that binds PAMP (pathogen-associated molecular pattern) elicitors (e.g., chitin) [90] and is induced in response to pathogens. In addition, other highly expressed metabolic genes in the INF samples were glucose-1-phosphate adenylyltransferase (GSVIVT00036349001, +24 fold in INF vs CON), cytochrome P450 (GSVIVT00014730001, +70 fold in INF vs CON) and serine acetyltransferase (GSVIVT00007984001, +30 fold in INF vs CON). These transcripts are related to carbohydrate metabolism, photosynthesis and cysteine synthesis. Cysteine synthesis has been reported to respond to oxidative stress through calcium signaling [91].

Transcripts less abundant in infected leaves
The most striking functions for transcripts less abundant in infected tissue were those associated with metabolism and defense response to pathogen attack. Fifteen DEGs were more than 20 fold less prevalent in the INF library than in the CON library. Most of these, such as (-)-germacrene D synthase [92], non-specific lipid transfer protein [93], major histocompatibility complex [94], thioredoxin [95], beta-cyano-alanine synthase [96], expansin [97] and UDP-glucosyltransferase [98], are reported to be positively associated with plant defense responses to pathogen attack. However, our data indicated that the expression level of these transcripts was lower in infected tissues.
Another two transcripts that were less prevalent in infected tissue (GSVIVT00014727001, -35 fold in INF vs CON; GSVIVT00014725001, -41 fold in INF vs CON) belong to the cytochrome P450 family with oxidative function. Interestingly, a novel gene encoding a male sterility-related protein was also identified in this group; its function in the DM response has not been clarified.

Conclusions
Solexa-based sequencing can be used for analyzing variation in gene expression between two samples. The gene expression level in "Zuoshan-1" leaves infected with PV changed significantly in comparison with control leaves. Analysis of differentially-expressed genes involved in the pathogen infection allows delineation of candidate genes potentially relevant to DM resistance in grapevines.

Plant material and pathogen infection
One-year-old, certified virus-free seedlings of "Zuoshan-1" were grown and maintained in the greenhouse under a 16-h light/8-h dark photoperiod at 25°C and 85% relative humidity. Control plants were maintained under the same conditions. P. viticola was collected from sporulating field leaves and used for the artificial inoculation of surface-sterilized leaves. Infections were conducted by dipping the fourth grapevine leaves in a suspension of 10,000 sporangia per ml of pure water. The leaves were covered with plastic bags for one night to ensure high humidity. The fourth unfolded leaf from the shoot apex was harvested from each of three vines, and the three leaves were combined to represent one replicate. Three independent replicates were collected for each sample. Infected leaves were collected every 24 h for 9 days. Control samples were harvested from water-treated leaves incubated under the same conditions.

Preparation of Digital Expression Libraries
Samples from infected leaves from 4 d to 8 d were pooled for RNA isolation and library construction. Comparable control leaves were treated identically and in parallel. Total RNA was isolated from the leaf mixture using a modification of the CTAB method as presented by Murray and Thompson [99]. Sequence tag preparation was done with the Digital Gene Expression Tag Profiling Kit (Illumina Inc; San Diego, CA, USA) according to the manufacturer's protocol (version 2.1B). Six micrograms of total RNA was extracted and mRNA was purified using biotin-Oligo (dT) magnetic bead adsorption. First- and second-strand cDNA synthesis was performed after the RNA was bound to the beads. While on the beads, double-stranded cDNA was digested with NlaIII endonuclease to produce a bead-bound cDNA fragment containing sequence from the 3'-most CATG to the poly (A)-tail. These 3' cDNA fragments were purified using magnetic bead precipitation and the Illumina adapter 1 (GEX adapter 1) was added to the new 5' end. The junction of Illumina adapter 1 and the CATG site was recognized by MmeI, which is a Type IIS endonuclease (with separated recognition and digestion sites). The enzyme cuts 17 bp downstream of the CATG site, producing 17 bp cDNA sequence tags with adapter 1. After removing 3' fragments with magnetic bead precipitation, the Illumina adapter 2 (GEX adapter 2) was ligated to the 3' end of the cDNA tag. These cDNA fragments represented the tag library.
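The tag geometry described above (the 3'-most CATG plus the 17 bases downstream, giving the 21-nt tags mentioned in the abstract) can be mimicked in silico. This is a simplified sketch and not part of the kit protocol: the function name and the example cDNA are hypothetical, and the poly(A) tail and adapter sequences are ignored.

```python
def three_prime_tag(cdna, site="CATG", tag_len=17):
    """Return the 3'-most `site` plus `tag_len` downstream bases, or None."""
    pos = cdna.rfind(site)          # 3'-most occurrence of the NlaIII site
    if pos == -1:
        return None                 # transcript has no CATG, so no tag
    tag = cdna[pos:pos + len(site) + tag_len]
    # Require a full-length tag; shorter fragments are discarded here.
    return tag if len(tag) == len(site) + tag_len else None

# Hypothetical cDNA with two CATG sites; only the 3'-most one yields the tag.
cdna = "GGCATGAAACCC" + "CATG" + "ACGT" * 4 + "A" + "AAAAAAAA"
tag = three_prime_tag(cdna)
```

Tags arising from upstream CATG sites, which the Discussion reports as almost 50% of the observed tags, would correspond to using an earlier occurrence of the site instead of the last one.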

Solexa sequencing
Sequencing was performed by "HuaDa Gene" [100] using the sequencing-by-synthesis method. The samples were first enriched for the desired fragments by 15 cycles of PCR amplification with Phusion polymerase (Finnzymes, Espoo, Finland) and primers complementary to the adapter sequences. The resulting 85-base bands were purified by 6% TBE PAGE gel electrophoresis. The purified fragments were then digested, and the single-stranded molecules were fixed onto the Solexa Sequencing Chip (flow cell). Each molecule grew into a single-molecule cluster sequencing template through in situ amplification. Four color-labeled nucleotides were then added, and sequencing by synthesis was carried out. Image analysis and base calling were performed using the Illumina Pipeline, and cDNA sequence tags were revealed after purity filtering. The tags passing initial quality tests were sorted and counted. Each channel of the flow cell generated millions of raw reads with a sequencing length of 35 bp (target tag plus 3' adaptor). Each molecule in the library represented a single tag derived from a single transcript.

Sequence annotation
"Clean Tags" were obtained by filtering off adaptor-only tags and low-quality tags (containing ambiguous bases). Comparison of the sequences by blastn was carried out using the following databases: NCBI [101], Genoscope Grape Genome database [25] and VBI Microbial Database [26]. All clean tags were annotated based on grape reference genes. For conservative and precise annotation, only sequences with perfect homology or 1 nt mismatch were considered further. The number of annotated clean tags for each gene was calculated and then normalized to TPM (number of transcripts per million clean tags) [30,102]. Sequences were manually assigned to functional categories based on the analysis of scientific literature.

Identification of differentially expressed genes (DEGs)
A rigorous algorithm developed to identify differentially expressed genes between two samples was applied [103]. A P value was used to test differential transcript accumulation. In the formula below, the total clean tag number of the CON library is denoted N1 and that of the INF library N2; gene A holds x tags in the CON library and y tags in the INF library. The probability that gene A is expressed equally in the two samples can be calculated with:

$$p(y \mid x) = \left(\frac{N_2}{N_1}\right)^{y} \frac{(x+y)!}{x!\,y!\,\left(1+\frac{N_2}{N_1}\right)^{x+y+1}}$$

The false discovery rate (FDR) was applied to determine the P value threshold in multiple tests and analyses [104]. "FDR < 0.001 and an absolute value of the log2-Ratio ≥ 1" was used as the threshold for judging the significance of a difference in gene expression.
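
The decision rule can be sketched in code under several stated assumptions. The probability is taken to have the form p(y|x) = (N2/N1)^y (x+y)! / [x! y! (1 + N2/N1)^(x+y+1)], which fits the symbols N1, N2, x and y defined above. The FDR step uses the Benjamini-Hochberg adjustment as a stand-in, since the procedure in [104] is not specified here. The `floor` pseudo-value for genes with no tags in one library, and all function names, are hypothetical.

```python
import math


def equal_expression_prob(x, y, n1, n2):
    """Point probability p(y | x) of seeing y tags in INF given x tags in CON.

    n1, n2: total clean tags in the CON and INF libraries.
    Evaluated in log space so that large tag counts do not overflow.
    """
    ratio = n2 / n1
    log_p = (
        y * math.log(ratio)
        + math.lgamma(x + y + 1)           # log (x + y)!
        - math.lgamma(x + 1)               # log x!
        - math.lgamma(y + 1)               # log y!
        - (x + y + 1) * math.log1p(ratio)  # log (1 + N2/N1)^(x+y+1)
    )
    return math.exp(log_p)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P values (assumed stand-in for the FDR step)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):           # walk from the largest P value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def is_deg(tpm_con, tpm_inf, fdr, floor=0.01):
    """Apply the stated threshold: FDR < 0.001 and |log2 ratio| >= 1.

    floor is a hypothetical pseudo-value that avoids taking the log of zero.
    """
    log2_ratio = math.log2(max(tpm_inf, floor) / max(tpm_con, floor))
    return fdr < 0.001 and abs(log2_ratio) >= 1


print(equal_expression_prob(0, 0, 100, 100))
print(is_deg(10.0, 40.0, 1e-5))
```

The sketch returns the point probability only; a tail P value would be obtained by summing it over the values of y at least as extreme as the one observed.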

Real-time RT-PCR analysis
Samples were prepared as described above, and total RNA was isolated from the leaf mixture. Experiments were carried out on three independent biological replicates, each with three technical replicates. First-strand cDNA was synthesized from 650 ng of DNase-treated (Promega, Madison, Wisconsin, USA) total RNA using ImProm-II™ Reverse Transcriptase (Promega, Madison, Wisconsin, USA) and diluted 20-fold for use as template. Specific primer pairs for twelve randomly selected genes were designed with Primer Express 3.0 (Table 4) and tested by real-time RT-PCR. Primers specific for V. vinifera actin (Forward: AATGTGCCTGCCATGTATGT; Reverse: TCACACCATCACCAGAATCC) were used to normalize the reactions. Reactions were run with Power SYBR Green PCR Master Mix (Applied Biosystems, Warrington, UK) in a StepOne™ Real-Time PCR System (Applied Biosystems). The reaction volume was 20 μl, comprising 10 μl Power SYBR Green PCR Master Mix, 0.9 μl of each 10 mM primer, 2.0 μl cDNA sample and 6.20 μl dH2O. The thermal cycling profile was 95°C for 10 min; 40 cycles of 95°C for 15 s and 59°C for 1 min; then 95°C for 15 s, 60°C for 1 min and 95°C for 15 s. Data were analyzed using StepOne™ Software Version 2.0 (Applied Biosystems), with actin expression as the internal control for normalizing all data. The fold change in mRNA expression was estimated from threshold cycles by the ΔΔCT method [105].
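
The ΔΔCT calculation can be sketched as follows. The sketch assumes the common 2^(-ΔΔCT) form, in which amplification efficiency is taken to be close to 100%, with actin as the reference gene and the control (CON) sample as the calibrator. The function name and the CT values are hypothetical and are not measurements from this study.

```python
from statistics import mean


def fold_change_ddct(target_inf, actin_inf, target_con, actin_con):
    """Relative expression of a target gene in INF versus CON by 2^(-ddCT).

    Each argument is a list of threshold cycles (CT) from technical replicates.
    """
    delta_inf = mean(target_inf) - mean(actin_inf)   # dCT in infected leaves
    delta_con = mean(target_con) - mean(actin_con)   # dCT in control leaves
    ddct = delta_inf - delta_con
    return 2.0 ** (-ddct)


# Hypothetical CT values: the target crosses threshold 2 cycles earlier in INF.
print(fold_change_ddct([22.0, 22.0, 22.0], [18.0, 18.0, 18.0],
                       [24.0, 24.0, 24.0], [18.0, 18.0, 18.0]))
```

A ΔΔCT of -2 corresponds to a four-fold increase, because each cycle represents an approximate doubling of product.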

Pathway Enrichment Analysis of DEGs
Pathway enrichment analysis based on KEGG [106] was used to identify metabolic or signal transduction pathways significantly enriched in differentially expressed genes compared with the whole-genome background. The formula is:

$$P = 1 - \sum_{i=0}^{m-1} \frac{\binom{M}{i}\binom{N-M}{n-i}}{\binom{N}{n}}$$

where N is the number of all genes with KEGG annotation, n is the number of DEGs in N, M is the number of all genes annotated to a specific pathway, and m is the number of DEGs in M. The Q value was used to determine the P value threshold in multiple testing [107]. Pathways with a Q value < 0.05 were considered significantly enriched in DEGs.
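
A pathway enrichment P value of this kind can be sketched as a hypergeometric upper-tail probability, assuming that is the test intended by the definitions of N, n, M and m above. The function name `enrichment_p` and the example numbers are hypothetical. The Q value correction of [107] is not reproduced here.

```python
from math import comb


def enrichment_p(N, n, M, m):
    """Probability of finding at least m DEGs among the M genes of a pathway.

    N: genes with KEGG annotation; n: DEGs among them;
    M: genes annotated to the pathway; m: DEGs annotated to the pathway.
    Sums the upper tail directly, which equals 1 minus the sum over i < m.
    """
    total = comb(N, n)
    upper = sum(
        comb(M, i) * comb(N - M, n - i)
        for i in range(m, min(n, M) + 1)
    )
    return upper / total


# Hypothetical small example: 10 annotated genes, 3 DEGs, a pathway of 4 genes,
# 2 of which are DEGs.
print(enrichment_p(10, 3, 4, 2))
```

Summing the upper tail with exact integer arithmetic avoids the loss of precision that subtracting a value close to 1 would cause for strongly enriched pathways.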