Generation and analysis of 9792 EST sequences from cold acclimated oat, Avena sativa
© Bräutigam et al; licensee BioMed Central Ltd. 2005
Received: 11 March 2005
Accepted: 01 September 2005
Published: 01 September 2005
Oat is an important crop in North America and northern Europe. In Scandinavia, yields are limited by the fact that oat cannot be used as a winter crop. In order to develop such a crop, more knowledge about mechanisms of cold tolerance in oat is required.
From an oat cDNA library, 9792 single-pass EST sequences were obtained. The library was prepared from pooled RNA samples isolated from leaves of four-week old Avena sativa (oat) plants incubated at +4°C for 4, 8, 16 and 32 hours. Exclusion of sequences shorter than 100 bp resulted in 8508 high-quality ESTs with a mean length of 710.7 bp. Clustering and assembly identified a set of 2800 different transcripts denoted the Avena sativa cold induced UniGene set (AsCIUniGene set). Taking advantage of various tools and databases, putative functions were assigned to 1620 (58%) of these genes. Of the remaining 1180 unclassified sequences, 427 appeared to be oat-specific since they lacked any significant sequence similarity (Blast E values > 10⁻¹⁰) to any sequence available in the public databases. Of the 2800 UniGene sequences, 398 displayed significant homology (BlastX E values ≤ 10⁻¹⁰) to genes previously reported to be involved in cold stress related processes. 107 novel oat transcription factors were also identified, out of which 51 were similar to genes previously shown to be cold induced. The CBF transcription factors have a major role in regulating cold acclimation. Four oat CBF sequences were found, belonging to the monocot cluster of DREB family ERF/AP2 domain proteins. Finally, in the total EST sequence data (5.3 Mbp), approximately 400 potential SSRs were found, a frequency similar to what has previously been identified in Arabidopsis ESTs.
The AsCIUniGene set will now be used to fabricate an oat biochip, to perform various expression studies with different oat cultivars incubated at varying temperatures, to generate molecular markers and provide tools for various genetic transformation experiments in oat. This will lead to a better understanding of the cellular biology of this important crop and will open up new ways to improve its agronomical properties.
Avena sativa (oat) belongs to the Poaceae family. Other cereals in this family are wheat, barley and rye. Wild oats are diploid, but all cultivated oats are hexaploid with an estimated 1C genome size of 13.23 pg, corresponding to about 13000 Mbp. The commercial value of oat is derived both from its high-energy grain and from superior break-crop benefits. Oat plantations also have a comparatively low input demand of insecticides, fungicides and fertiliser due to the high disease tolerance and low nutrient requirements of this crop. Today oat is mainly used as animal feed, but it is one of the most promising future cereals in the functional food area. It has unique and well-documented cholesterol lowering effects, as a result of its soluble dietary fibres and high β-glucan content [4–6]. An oat diet greatly improves the well-being of persons with celiac disease and also reduces the risk of diet-related diseases [7–9]. Oat is rich in natural phenolic antioxidants [10–14], which prevent the development of cardiovascular disease and certain cancers. In Sweden most of the harvested oat is used for animal feed. Only about 35000 tons per year are used for human consumption. However, due to its many health-enhancing properties, the market for oat and processed oat products like oat milk for human consumption is rapidly growing.
Many European countries grow winter oat, i.e. oat that is sown in the autumn and survives the winter in the field. Winter oat therefore has a longer growth season compared to summer varieties, allowing an earlier harvest and giving a higher yield. However, oat is inherently not as winter hardy as rye, wheat and barley. Due to the harsh climate in the Scandinavian countries, winter oat is therefore not grown there. Based on the English experience with winter oat, a Swedish winter oat would probably increase the yield of the harvest by at least 30% (John Valentine, IGER, UK, personal communication). In addition, since oat is the most important rotation crop for wheat and oil crops, an early harvest of winter oat would allow earlier sowing of the rotation crops, resulting in increased yields for these crops as well. Developing a winter oat suitable for the Scandinavian climate is therefore of high priority (Anders Jonsson, The Swedish Farmers Supply and Crop Marketing Co-operative, personal communication). Since cold hardiness is a quantitative trait controlled by several genes, traditional plant breeding programs have so far had limited success in improving the cold hardiness of any of the important crop species, and the Swedish oat breeders have more or less given up their efforts to produce a Swedish winter oat (Rickard Jonsson, Svalöf Weibull AB, personal communication).
A cost efficient and rapid way to obtain new data from an organism with a large, complex and unknown genome is through partial sequencing of randomly selected cDNA clones. The resulting collection of expressed sequence tags (ESTs) reflects the level and complexity of gene expression in the sampled tissue and will also give an insight into gene structure of the chosen organism. This not only leads to the identification of a number of genes from the new organism that have known or putative functions but also to the discovery of completely novel, previously unknown putative proteins.
In this work, starting from 9792 EST sequences, we identified 2800 putative oat genes, several of which showed similarities to genes previously defined as cold stress related or involved in transcriptional regulation, signal transduction or metabolism. Several sequences that could represent new, unknown and unique oat genes were also identified. This data will now be used to study cold-acclimation in oat, to identify key genes in regulating winter survival, to produce molecular markers to facilitate the breeding for winter hardiness and to construct transgenic oat with increased freezing tolerance. These experiments will increase our general knowledge about the physiology of cold acclimation in plants in general and in oat in particular and in the long term allow the development of a Scandinavian winter oat.
Cold acclimation in oat. Frost damage in non-acclimated (20°C) and acclimated (12 or 24 h at +4°C) oat spring varieties Birgitta and Matilda and winter varieties Gerald and 83-48-CH. Plants were incubated at -15°C for 3 h, 6 h and 12 h and then recovered for 1 week in the greenhouse. Frost damage was recorded on a scale from 1 to 5, where 1 represents no damage and 5 represents dead plants.
[Table values not recovered; surviving column groups: 12 h acclimation, 24 h acclimation.]
To confirm cold acclimation at the molecular level, a known marker for cold acclimation was investigated. Total RNA was isolated from 3-week-old plants of the winter variety Gerald after incubation at +4°C for 4, 16, 32 and 64 hours. An oat sequence, corresponding to the wheat COR410 gene, known to be cold inducible in wheat, was amplified by PCR from genomic oat DNA. A northern analysis using RNA isolated from cold induced plants and oat COR410 as a probe showed that COR410 gene expression could be detected after 4 h and then remained strong even after 64 h of cold incubation (Figure 1E).
EST sequencing and UniGene set construction
Annotation and functional classification
The annotation of the AsCIUniGene set is based on homology. Each gene in the AsCIUniGene set inherited the annotation from the best match found after a BlastX search against the nr protein database at NCBI. An expectation value (E value) threshold of 10⁻¹⁰ was used. All sequences, in total 427, that had E values above this threshold were annotated as unknown.
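The annotation rule described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this work (which relied on PERL scripts and a PostgreSQL database); the function name, the tuple layout and the example hits are hypothetical, and treating the cutoff as inclusive is an assumption.

```python
E_THRESHOLD = 1e-10  # E value threshold stated in the text


def annotate(hits, threshold=E_THRESHOLD):
    """Assign each query the description of its lowest-E BlastX hit.

    hits: iterable of (query_id, subject_description, e_value) tuples
    (a hypothetical, simplified layout). Queries whose best hit has an
    E value above the threshold are labelled 'unknown'.
    """
    best = {}
    for query, description, e_value in hits:
        if query not in best or e_value < best[query][1]:
            best[query] = (description, e_value)
    return {
        query: (description if e_value <= threshold else "unknown")
        for query, (description, e_value) in best.items()
    }


# Made-up example hits, for illustration only.
example_hits = [
    ("contig_1", "dehydrin-like protein", 3e-45),
    ("contig_1", "hypothetical protein", 2e-12),
    ("singlet_7", "weak partial match", 0.004),
]
annotations = annotate(example_hits)
print(annotations)
```

Queries that return no hit at all never appear in the input here; in a real run they would also have to be labelled as unknown.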
The most abundant ESTs
To analyze the most abundant ESTs from the cold induced cDNA library we grouped the sequences by means of the KOG (Clusters of Eukaryotic Orthologous Groups of Proteins) database. This database currently includes 7 eukaryotic genomes, including Arabidopsis. This gave a functional annotation of sequences based on orthologous proteins. In addition, we used complementary databases to annotate our sequences: FOGs (Fuzzy Orthologous Groups), which contain proteins with promiscuous domains that have not been assigned a KOG identity due to unclear orthologous relationships; TWOGs, which contain provisional clusters of proteins that are represented in two genomes; and LSEs, which contain proteins that are lineage-specific expansions of paralogs present in the KOG database. The classification was based on the best homology match of BlastX searches against Arabidopsis protein sequences, where an expectation value (E value) threshold of 10⁻¹⁰ was used. Proteins annotated in this way were termed "KOG-TWOG-LSE". Since not all Arabidopsis proteins are represented in the KOG database, not all ESTs could be annotated with a KOG-TWOG-LSE. Oat ESTs that had a homology match in Arabidopsis but not a KOG-TWOG-LSE annotation inherited the annotation from the MIPS annotation in MATDB. In addition, several of the sequences did not have an Arabidopsis homolog match that met the E value threshold. These sequences were annotated with the best homolog match from a BlastX search against the nr protein database at NCBI. Again, an E value threshold of 10⁻¹⁰ was used.
The 20 most frequent, randomly picked, EST sequences in two different oat leaf libraries. Gene family, EST sequences identified as belonging to the indicated gene family; CI, total number of genes found in the indicated family in the cold induced leaf library; % of total, relative amount of genes in the indicated gene family; NI, total number of genes found in the indicated family in the non-induced leaf library.
[Table values not recovered. Columns: Gene family; CI; % of total; NI; % of total. Surviving gene family rows: Chlorophyll a/b binding; Ribulose 1,5-bisphosphate carboxylase/oxygenase small chain; Ribulose bisphosphate carboxylase/oxygenase activase; Carbonic anhydrase 2; Cold-induced COR410 (Wcor410); Photosystem II oxygen-evolving complex (PsbP1); LEA/RAB-related COR protein (Wrab17); Photosystem II oxygen-evolving complex (PsbO2); Photosystem II (PsbR); Photosystem I reaction centre subunit XI; Photosystem I reaction centre subunit psaN.]
Cold stress related oat genes. Distribution of genes in the AsCIUniGene set into different functional categories. Functional class, determined as described in Methods; AsCIUniGene set, the set of 2800 different oat genes; CSDB, set of cold related genes extracted from the public domain; % CSR, relative number of oat genes in each family that is similar to cold related genes.
[Table values not recovered. Columns: Functional class; AsCIUniGene set; CSDB; % CSR. Surviving functional class rows: Cell cycle, DNA processing, cell fate and development; Cell rescue, defence and virulence; Cellular communication and signal transduction mechanisms; Cellular transport and transport mechanisms.]
Percentage of cold stress related sequences in different UniGene sets. UniGene specifies the different UniGene sets, which were: AsCI, Avena sativa cold induced; AsNI, Avena sativa non-induced; TaCI, Triticum aestivum cold induced; HvCI, Hordeum vulgare cold induced. Amount refers to the number of sequences that were analysed. In CSDB indicates how many of the genes were also present in the collection of cold stress related genes. Cold related (%) gives the percentage of genes in each set that were cold stress related.
[Table values not recovered. Columns: UniGene; Amount; In CSDB; Cold related (%).]
Phylogenetic analysis of AP2 containing proteins
Classification of oat transcription factors. Distribution of the 107 transcription factors, identified in the AsCIUniGene set, in different families classified according to Riechmann et al., 2000.
[Table values not recovered. Surviving column header: No of genes. Surviving row: Other transcription factors.]
List of sequences containing the ERF/AP2 DNA-binding domain. The list of ERF/AP2 sequences used in the phylogenetic analysis (Figure 4); some of the sequences were also used in the ClustalW alignment (Figure 5). The first two letters in the protein name represent the initial letters of the Latin binomial, followed by the gene abbreviation. Each sequence is assigned to one of three different subgroups of the ERF/AP2 superfamily. A. sativa sequences are grouped according to our phylogenetic analysis (Figure 4). All GI accession numbers correspond to protein sequences in GenBank at NCBI, and the EMB accession numbers correspond to EST sequences available in the EMBL nucleotide sequence database.
Expression of the AsCBF genes
Identification of microsatellites
Work is now in progress to determine which of these SSRs can be reproducibly amplified by PCR, are polymorphic, and can be linked to a phenotypic marker. The vast majority of the oat SSRs were found in non-coding DNA. Since they nevertheless represent actively transcribed genes, we expect that several of these will turn out to be useful markers for breeding.
Plant expressed sequence tags (ESTs) have proven to be valuable tools in molecular biology research and a number of collections from many different plant species are now publicly available. In cereals, which are the most important food providers on earth, several major EST sequencing projects have been carried out. At the time of writing, there are 284 779 publicly available ESTs from Oryza sativa (rice), 562 786 from Triticum aestivum (wheat) and 367 768 from Hordeum vulgare (barley). In contrast, there are only 7 624 entries for Avena sativa (oat) and no sequences from cold acclimated oat are available. Clearly, there is a great need for more EST sequencing in this important crop as well. Here we contribute an additional 9 792 sequences, originating from cold-acclimated oat, to the research community. Since we were mainly interested in genes involved in the perception, signal transduction and early regulation of cold acclimation, we focused on short incubation times from a few minutes to 24 h. After only 12 hours of acclimation, there was a clear difference in freezing tolerance between acclimated and non-acclimated plants, and winter varieties were more tolerant than spring varieties (Table 1). To confirm that cold induced genes were overrepresented in these plants, a northern analysis was performed on an oat gene corresponding to the previously described cold induced wheat COR410 gene, using RNA isolated at several different time points during cold acclimation at +4°C. This revealed that the diagnostic COR410 gene was also cold induced in oat and, interestingly, the peak expression level was higher in the winter varieties (Figure 1 and Table 1). The same tendency, with earlier induction and higher expression levels, was also seen for other cold induced genes (data not shown).
Pooled leaves from confirmed cold induced plants were used for cDNA construction and EST sequence generation. Since leaves were used as the RNA source, the most common ESTs in our collection represent various genes involved in photosynthesis, like chlorophyll a/b binding protein, ribulose 1,5-bisphosphate carboxylase/oxygenase, ribulose bisphosphate carboxylase/oxygenase activase, photosystem I reaction centre protein, fructose-bisphosphate aldolase, carbonic anhydrase/carbonate dehydrase and photosystem II oxygen-evolving complex proteins. Other well-represented sequences are those encoding ribosomal proteins (Table 2). However, dehydrin was also present among the 20 most highly expressed gene families, indicating that our collection indeed represents plants in a cold stress induced condition. This was confirmed by a direct comparison to an EST set derived from leaves of non-induced plants. In this collection, dehydrin and other cold induced genes were not among the 20 most highly expressed.
From our cold induced EST collection, an AsCI UniGene set of 2 800 genes was identified. Of these, 1 726 could be placed into the functional groups defined by MIPS (Figure 3), leaving a relatively large proportion of the genes (approx. 40%) outside of this classification. Perception of the stress stimuli, transduction of the stress signal and a molecular response are necessary activities if the plant is to react to abiotic stress. In oat, however, very little is known about cold stress response at this level, although the general mechanisms are probably similar in all plants. In order to better identify oat genes involved in the cold response we therefore created a database denoted CSDB (cold stress data base), in which all genes available from the public domain and classified as cold stress responsive or cold induced were collected. When the sequences in the CSDB were compared to the oat AsCIUniGene set we found that 398 sequences matched, indicating that at least 14% of all the genes in the AsCIUniGene set are involved in cold stress. Among these, sequences encoding activities related to perception, signal transduction and transcriptional regulation were overrepresented (Table 3). From the oat EST collection generated from leaves of three-week-old oat plants grown under greenhouse conditions we created a UniGene set of 1445 different transcripts using the same tools as with the AsCIUniGene set. This non-induced leaf set was denoted AsNIUniGene. When the CSDB was searched with AsNIUniGene only 5.1% of the genes were found to be similar (Table 4), a dramatic difference from the AsCIUniGene set. These studies were extended to EST collections from cold acclimated wheat and barley. By creating UniGene sets (TaCIUniGene and HvCIUniGene), both these collections were compared against the CSDB. We then found that the proportion of cold stress related genes was around 10% in both the wheat and barley collections (Table 4).
Generalising, it seems that at least 10% of all expressed genes in cold acclimating plants are devoted to various cellular responses needed to prepare the plant for freezing temperatures. The cold induced oat gene collection will now be a valuable new asset in further analysis of such genes.
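The proportions quoted above can be checked with a few lines of arithmetic. The short Python sketch below only reuses numbers stated in the text (398 matches among 2800 AsCIUniGene sequences, and 5.1% for AsNIUniGene); the variable and function names are illustrative.

```python
def percent(part, total):
    """Return part as a percentage of total."""
    return 100.0 * part / total


ASCI_UNIGENES = 2800      # size of the AsCIUniGene set (from the text)
ASCI_CSDB_MATCHES = 398   # AsCIUniGene sequences matching the CSDB (from the text)
ASNI_COLD_PERCENT = 5.1   # cold related share reported for AsNIUniGene (from the text)

asci_cold_percent = percent(ASCI_CSDB_MATCHES, ASCI_UNIGENES)
fold_difference = asci_cold_percent / ASNI_COLD_PERCENT

print(f"AsCI cold related: {asci_cold_percent:.1f}%")      # 14.2%
print(f"AsCI versus AsNI: {fold_difference:.1f}-fold")     # 2.8-fold
```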
Our functional analysis of the AsCIUniGene set showed that transcription factor genes were represented by 107 sequences, belonging to at least 14 different families (Table 5). Of these, 51 were homologous to cold-induced genes from other systems. Of special importance for cold acclimation is the CBF transcription factor family. Genes in this family regulate several different downstream genes, including the COR genes [16, 30]. However, this regulation is complex and several different CBF genes are involved. From the AsCIUniGene set we identified four oat CBF genes, denoted AsCBF1, AsCBF2, AsCBF3 and AsCBF4. Our phylogenetic and multiple alignment analysis showed that all four belonged to the monocot DREB subfamily of ERF/AP2 domain proteins. The AsCBF1 and AsCBF2 genes were very closely related, while the AsCBF3 and AsCBF4 genes were somewhat more distantly related to each other and also belonged to a different clade than the AsCBF1 and AsCBF2 genes (Figures 4 and 5).
To investigate the expression profile of the AsCBF genes, we designed gene specific primers and showed by RT-PCR analysis that these genes were indeed cold induced, but that their expression patterns were different. Their expression ranged from early induction already after 15 min (AsCBF3) to induction after 1 h (AsCBF4) and from peaking at 4 h (AsCBF1 and AsCBF3) to peaking at 8 h (AsCBF4) (Figure 6). AsCBF3 was particularly interesting since it was weakly constitutively expressed, showed a clear increase in expression after cold treatment and was still expressed after 24 h. Despite several attempts using different primer pairs we could not obtain a reproducible expression pattern for the AsCBF2 gene. The reason for this is presently not known. The complex regulation of the AsCBF genes is different from what was previously described in Arabidopsis, where the AtCBF1, AtCBF2, and AtCBF3 genes follow more or less the same expression pattern with a rapid induction after 15 min and a peak after 2 h. This indicates that CBF factors have intricate and different individual roles in inducing and maintaining cold acclimation in oat. This is corroborated by preliminary data from barley. This cereal has at least 10 different genes encoding CBF factors, which are all differentially regulated (Eric Stockinger, Ohio State University, personal communication). Thus, a more detailed analysis of the structure and regulation of CBF genes in cereals may reveal new pathways of cold induction, not present in Arabidopsis.
A number of genes with hitherto unknown functions were identified in the AsCIUniGene set. These were divided into two groups, one in which homologous or similar genes from other systems exist and one where no significant similarities could be found to any other sequence, i.e. genes that could be oat specific. In order to rule out that small "non-real" peptides contributed to this group, only sequences with open reading frames of 100 aa or more were included. Work is now in progress to elucidate which of the 427 oat specific unknown genes are induced by cold stress, drought stress or combinations of different stress factors. Assuming that approx. 10% of these sequences are cold related, more than 40 completely new oat genes involved in cold acclimation will be present in this collection. Such genes are potentially very interesting and could encode hitherto uncharacterised proteins or regulatory factors involved in cold-adaptation and freezing protection.
Microsatellites (SSRs) are excellent DNA markers for genetic mapping, since they are polymorphic, abundant, show a co-dominant inheritance and are easy to analyse by PCR. SSRs have therefore been widely utilized in plant genomic studies [33–36]. They are especially advantageous when there is a need to track desirable traits in large-scale breeding programs and when defining anchor points for map-based gene cloning strategies. However, only a few oat SSRs are currently available. Here we identify approximately 400 potential oat SSRs, the majority present in the non-coding part of the EST sequence. Work is now in progress to optimise primers for these SSRs and to identify the ones that give reliable PCR products and are polymorphic. Crude genetic maps have been developed for both diploid [37, 38] and hexaploid oat, but these maps need to be improved. The best SSR markers will therefore be mapped to the oat genome, and linked to valuable genetic markers.
The AsCIUniGene set will now be used to fabricate an oat biochip carrying all 2800 identified genes. In addition, by constructing subtractive libraries, more cold-related ESTs will be generated. Various expression studies will be performed and genes from our collection that show a rapid induction to either cold or drought will be selected for further analysis. We are especially interested in those genes in our EST collection that show a very rapid induction at +4°C and have DNA binding properties. Especially promising genes will be tested in transgenic Arabidopsis and oat systems [41, 42] and by complementing chosen Arabidopsis T-DNA knock-out mutants.
A UniGene set of 2800 genes was produced from a cold induced oat cDNA library. Further analysis revealed that genes related to cold stress were overrepresented in this library and that several genes could encode hitherto unknown functions. RT-PCR analyses of CBF transcription factor genes revealed that they are differentially expressed in oat and therefore might regulate different cold pathways. Approx. 400 potential SSR markers are also present in the collection, several in non-coding regions and in close vicinity to genes involved in regulating cold acclimation.
Oat plants, Avena sativa v. Gerald, 83-48 CH, SW Matilda and SW Birgitta were obtained from the SW-collection (Svalöf Weibull AB, Landskrona, Sweden). Gerald and 83-48 are English winter varieties while Matilda and Birgitta are Swedish spring varieties. Seeds were germinated in 2-liter pots filled with fertilized and pressed peat. Plants were cultured in a greenhouse under natural light supplemented with metal halogen lamps, giving a photon flux density of 240 μmol per m² per sec. The photoperiod was 18 h, the day/night temperature was 20/12°C and the relative humidity about 70%. The plants were watered as needed.
Cold induction experiments
To investigate the cold acclimation capacity of the chosen oat varieties, 24 pots with 10 seeds each of Gerald, 83-48 CH, Matilda, and Birgitta were prepared. About three weeks after germination, when each plant had produced 3 – 4 leaves, pots were moved to a dark cold room at +4°C (± 0.5°C) and incubated for 12 and 24 hours. After this period the pots were moved to -15°C (± 1°C) for 3 h, 6 h and 12 h. In addition, plants were moved from the greenhouse directly to -15°C, and incubated for 3 h, 6 h and 12 h. After the cold incubation period, the plants were moved back to the greenhouse for recovery. One week later the cold damage was visually scored.
Total RNA preparation
Winter oat (Gerald) was germinated and grown for three weeks in the greenhouse. The plants were then incubated in the dark at +4°C (± 0.5°C) for 4, 16, 32 or 48 hours. At every timepoint, leaves were randomly picked from several individual plants and pooled. RNA was extracted from pooled leaves essentially as described by Chang et al. (1993). Tissue isolates were ground in liquid nitrogen, transferred to 65°C CTAB extraction buffer (2% CTAB [hexadecyltrimethylammonium bromide] [Sigma], 2% PVP [polyvinyl pyrrolidone, intrinsic viscosity 29–32] [Sigma], 100 mM Tris-HCl [pH 8.0], 25 mM EDTA, 2.0 M NaCl, 0.5 g per L spermidine, 2% β-mercaptoethanol), and extracted twice in equal volumes of chisam (phenol:chloroform:iso-amyl-alcohol 1:1:24). RNA was precipitated overnight at 4°C by adding 0.25 v/v 10 M LiCl. The precipitate was dissolved in 1 × SSTE (1.0 M NaCl, 0.5% SDS, 10 mM Tris-HCl, 1 mM EDTA), pH 8.0, extracted with an equal volume of chisam, precipitated with two volumes of 99.5% ethanol, and re-suspended in DEPC-treated distilled water. Total RNA of each sample was quantified spectrophotometrically at 260 nm (OD260), where an OD260 of 1 corresponds to 40 μg/ml RNA. Subsequently, the RNA was precipitated and re-suspended in DEPC-treated distilled water to a final concentration of 1 mg/ml.
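The spectrophotometric quantification above follows a simple conversion, sketched here in Python. Only the factor of 40 μg/ml per OD260 unit and the 1 mg/ml target come from the text; the function names, the dilution factor and the example readings are hypothetical.

```python
UG_PER_ML_PER_OD260 = 40.0  # an OD260 of 1 corresponds to 40 ug/ml RNA (from the text)


def rna_concentration_ug_per_ml(od260, dilution_factor=1.0):
    """Convert an OD260 reading of a diluted sample to the stock concentration."""
    return od260 * UG_PER_ML_PER_OD260 * dilution_factor


def resuspension_volume_ml(total_rna_ug, target_ug_per_ml=1000.0):
    """Volume needed to re-suspend the RNA at the target concentration (1 mg/ml)."""
    return total_rna_ug / target_ug_per_ml


# Hypothetical example: OD260 of 0.25 measured on a 100-fold dilution.
stock = rna_concentration_ug_per_ml(0.25, dilution_factor=100)
print(stock)                          # 1000.0 ug/ml, i.e. 1 mg/ml
print(resuspension_volume_ml(500.0))  # 0.5 ml for 500 ug of RNA
```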
Ten μg of total RNA were denatured with glyoxal/DMSO and separated on a 1% agarose gel. The RNA was blotted onto a nylon membrane (Boehringer Mannheim) and hybridized in Church hybridization buffer. An oat sequence similar to the wheat COR410 gene was used as a probe. This was isolated by PCR amplification from oat genomic DNA using the forward primer 5'-ATGGAGGATGAGAGGAGCAC-3' and the reverse primer 5'-TTTCTTCTCCTCCTCGGGC-3'. Primer design was based on the wheat sequence. Amplification resulted in a 530 bp sequence, which was verified by DNA sequencing (data not shown). The fragment was labelled with 32P-dCTP (Amersham), using a random hexanucleotide mix and labelling-grade Klenow enzyme (Boehringer Mannheim). Stringency washes were performed at 65°C for 2 × 5 min in 2 × SSC, 0.5% SDS and for 4 × 5 min in 0.2 × SSC, 0.1% SDS. Membranes were exposed to X-ray film (Du Pont Medical Scandinavia AB).
cDNA library construction and EST sequencing
Total RNAs isolated from plants incubated at +4°C for 6, 12 and 24 hours were pooled. The pooled RNA preparation was sent to MWG Biotech (Germany), where cDNA libraries were constructed, cDNA was cloned into the pSPORT1 vector and EST sequencing was performed.
All similarity searches were batch executed locally using the BlastN, BlastX or TBlastX tools, all included in the BLASTALL program package. Transeq, a program from the EMBOSS package [48, 49], was used to translate DNA sequences into protein sequences. Conserved domain search (CD-search) was performed against the Conserved domain database (CDD) at NCBI using the Reversed position specific Blast (RPS-BLAST) algorithm with translated ESTs. InterProScan [50, 51] was used to scan translated ESTs for protein signatures in the InterPro member databases. For the multiple alignments we used ClustalW, included in the MacVector 7.2.2 package (Accelrys Inc). The phylogenetic tree was constructed by means of the MacVector 7.2.2 tool kit using the neighbour-joining (NJ) algorithm. Appropriate PERL scripts were written in order to pipeline the process of running tools in sequence, parsing result files and loading the results into the database. All data and results are stored in a PostgreSQL database.
Data sets and treatment
We started with four different primary sequence data sets. The first set was the 9792 ESTs from cold acclimated oat, which was denoted the Avena sativa Cold Induced (AsCI) data set. The second data set comprises 2189 ESTs, which originated from untreated green leaves of 3-week-old oat plants, and was denoted the Avena sativa NonInduced (AsNI) data set. The third data set, which contains 4337 sequences and originates from cold acclimated wheat, was denoted the Triticum aestivum Cold Induced (TaCI) data set. The final data set comprises 5418 sequences from cold stressed barley plants and was denoted the Hordeum vulgare Cold Induced (HvCI) data set.
EST clustering and assembly
The AsCI data set was filtered, clustered and assembled with the Paracel Transcript Assembler™ (PTA) program (Paracel, Pasadena, CA), which integrates quality filtering, clustering, and assembly into a single pipeline. The filtering step includes masking of vector sequence, low-complexity, low-quality, repeats and poly(A) regions. In the next step clustering was performed. Here PTA utilizes the Haste algorithm in an all-versus-all sequence comparison. The criterion set for clustering sequences together was an alignment of a minimum of 100 bases and with at least 93% similarity between the aligned sequences. Sequences that did not fit into such clusters were defined as singlets. In the assembly step PTA uses CAP4, which is a refinement of the CAP3 algorithm. Sequences that did not fit into a contig were also defined as singlets. Finally, ESTs in singlets that had passed the filters but had an unmasked sequence < 100 bases were discarded. The resulting singlets and contigs represented the AsCI candidate gene set.
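The clustering criterion can be illustrated with a small Python sketch. It is not the Haste algorithm used by PTA, whose internals are not described here; it assumes that pairwise alignments are already available and that clusters are formed by simple transitive (single-linkage) grouping of pairs that meet the 100-base and 93% thresholds. All names and example values are hypothetical.

```python
MIN_ALIGNED_BASES = 100  # minimum alignment length (from the text)
MIN_SIMILARITY = 93.0    # minimum percent similarity (from the text)


def cluster(seq_ids, alignments):
    """Group sequences linked by qualifying pairwise alignments.

    alignments: iterable of (id_a, id_b, aligned_bases, percent_similarity).
    Returns (clusters, singlets); single-linkage grouping is an assumption.
    """
    parent = {s: s for s in seq_ids}

    def find(x):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, aligned, similarity in alignments:
        if aligned >= MIN_ALIGNED_BASES and similarity >= MIN_SIMILARITY:
            parent[find(a)] = find(b)

    groups = {}
    for s in seq_ids:
        groups.setdefault(find(s), []).append(s)
    clusters = sorted(sorted(g) for g in groups.values() if len(g) > 1)
    singlets = sorted(g[0] for g in groups.values() if len(g) == 1)
    return clusters, singlets


# Made-up example: est4 fails both links (too short, too dissimilar).
ids = ["est1", "est2", "est3", "est4"]
pairs = [
    ("est1", "est2", 250, 97.5),
    ("est2", "est3", 120, 93.0),
    ("est3", "est4", 80, 99.0),
    ("est1", "est4", 300, 90.0),
]
clusters, singlets = cluster(ids, pairs)
print(clusters, singlets)
```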
The other data sets were clustered and assembled into candidate gene sets using the TGI clustering tool. The clustering was performed by a slightly modified version of NCBI's MegaBlast program and the resulting clusters were assembled using CAP3.
Most abundant ESTs
Individual ESTs were first annotated by the best BlastX homolog match against Arabidopsis thaliana, where an E value of < 10⁻¹⁰ was used. The A. thaliana proteins were retrieved from the MIPS Arabidopsis database (MATDB). Thereafter the annotation given in the KOG database was retrieved for each A. thaliana protein. For those A. thaliana homologs that did not have a KOG annotation, the EST sequence inherited the annotation from MATDB. Those sequences that did not have an A. thaliana homolog meeting the threshold were annotated with the best homolog match from a BlastX search versus the nr database at NCBI. Again an E value of < 10⁻¹⁰ was used.
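The order in which annotation sources are consulted can be summarised as a short decision cascade. The Python sketch below is only an illustration: the dictionaries standing in for KOG and MATDB, the locus names and the function name are hypothetical, and treating the E value cutoff as inclusive is an assumption.

```python
E_THRESHOLD = 1e-10  # E value cutoff stated in the text


def annotate_est(arabidopsis_hit, nr_hit, kog, matdb, threshold=E_THRESHOLD):
    """Return (source, annotation) for one EST.

    arabidopsis_hit: (locus, e_value) of the best A. thaliana match, or None.
    nr_hit: (description, e_value) of the best nr match, or None.
    kog / matdb: dicts mapping locus to annotation (stand-ins for the databases).
    """
    if arabidopsis_hit is not None and arabidopsis_hit[1] <= threshold:
        locus = arabidopsis_hit[0]
        if locus in kog:
            return ("KOG", kog[locus])
        return ("MATDB", matdb.get(locus, "unannotated locus"))
    if nr_hit is not None and nr_hit[1] <= threshold:
        return ("nr", nr_hit[0])
    return ("none", "unknown")


# Placeholder lookup tables, for illustration only.
kog = {"LOCUS_A": "KOG annotation for LOCUS_A"}
matdb = {"LOCUS_A": "MATDB annotation for LOCUS_A",
         "LOCUS_B": "MATDB annotation for LOCUS_B"}

print(annotate_est(("LOCUS_A", 1e-50), None, kog, matdb))
print(annotate_est(("LOCUS_B", 1e-20), None, kog, matdb))
print(annotate_est(("LOCUS_A", 1e-3), ("nr protein description", 1e-15), kog, matdb))
print(annotate_est(None, None, kog, matdb))
```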
UniGene set determination
Non-redundant sets of genomic singlets and contigs (UniGene sets) were created in a two-step procedure. First, sequence information derived from rRNA, chloroplastic DNA or mitochondrial DNA was identified by comparison to homologous Arabidopsis sequences (accession nr. X52322, AP000423, and Y08501/Y08502 respectively) using BlastN. In this way, sequences containing rRNA, chloroplastic DNA or mitochondrial DNA were separated from the genomic sequences.
The second step was a BlastX search of the non-redundant (nr) protein database. The accession numbers and E values of the best matches were extracted from the result file. The criterion used for a sequence to be identified as non-redundant was based on a unique best match based on the accession numbers from the BlastX search. If two or more query sequences resulted in best matches with identical accession numbers they were sorted according to their E values. Only the sequence with the lowest E value was included in the UniGene set.
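The redundancy filter in this second step can be sketched in a few lines of Python. The input layout, function name and example identifiers are hypothetical; how ties in E value and sequences without any BlastX match were handled is not stated in the text, so the sketch simply keeps the first sequence seen in a tie and ignores sequences without a match.

```python
def build_unigene_set(best_hits):
    """Keep one query per best-match accession, choosing the lowest E value.

    best_hits: iterable of (query_id, subject_accession, e_value), one entry
    per query sequence (its best BlastX match against nr).
    """
    chosen = {}
    for query, accession, e_value in best_hits:
        current = chosen.get(accession)
        if current is None or e_value < current[1]:
            chosen[accession] = (query, e_value)
    return sorted(query for query, _ in chosen.values())


# Made-up example: two queries share the best match ACC_1.
example = [
    ("contig_12", "ACC_1", 1e-80),
    ("singlet_3", "ACC_1", 1e-35),
    ("contig_40", "ACC_2", 1e-22),
]
unigenes = build_unigene_set(example)
print(unigenes)
```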
Annotation and functional classification
The UniGene sets were annotated based on the results of BlastX searches of the nr database. The definition line of the Blast match was used as a description of the putative function of the UniGene gene. An E value threshold of 10⁻¹⁰ was used and UniGene genes that did not meet this requirement were annotated as unknown.
Our functional classification of individual genes followed the functional categories defined in the Munich Information Centre for Protein Sequences (MIPS) Arabidopsis thaliana functional catalogue (MATDB; downloaded from http://mips.gsf.de). To create a semi-automated functional classification pipeline, a two-step procedure was developed. First, a BlastX search was performed with the UniGene set versus the MATDB, requiring an E value of < 10⁻¹⁰. The locus name and E value of the best match for each gene were extracted from the result file. Second, the functional classification was identified by searching the Arabidopsis functional catalogue with the locus name. Genes that did not meet the criteria for functional classification by the semi-automated procedure were classified manually, based on the annotation and on the result of a conserved domain search versus the conserved domain database (CDD) downloaded from the NCBI web site.
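The two-step lookup can be sketched as follows. The sketch assumes the BlastX results are available in the standard 12-column tab-separated BLAST output (query ID in column 1, subject ID in column 2, E value in column 11); the format of the original result file is not stated, and the function names and the `funcat` dictionary are hypothetical.

```python
import csv

E_VALUE_CUTOFF = 1e-10  # threshold stated in the text

def best_hits_from_tabular(lines, cutoff=E_VALUE_CUTOFF):
    """Return {query_id: (subject_id, e_value)} with the lowest-E hit per query.

    lines: any iterable of tab-separated lines, e.g. an open file.
    Hits with an E value at or above the cutoff are ignored.
    """
    best = {}
    for row in csv.reader(lines, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query_id, subject_id, e_value = row[0], row[1], float(row[10])
        if e_value >= cutoff:
            continue
        if query_id not in best or e_value < best[query_id][1]:
            best[query_id] = (subject_id, e_value)
    return best

def classify(best_hits, funcat, query_ids):
    """Look up each query's best-match locus in the functional catalogue.

    funcat: dict mapping locus name -> functional category.
    Queries without a qualifying hit or catalogue entry are flagged "manual".
    """
    result = {}
    for query_id in query_ids:
        hit = best_hits.get(query_id)
        result[query_id] = funcat.get(hit[0], "manual") if hit else "manual"
    return result
```

Everything flagged "manual" corresponds to the genes that were classified by hand from the annotation and the conserved domain search.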
Identification of microsatellite sequences
A set of 3716 sequences resulting from the clustering and assembly steps, comprising a total of 5.3 Mb of sequence, was searched for microsatellites (simple sequence repeats; SSRs) in the form of mononucleotide repeats of > 15 bp, dinucleotide repeats of > 14 bp, trinucleotide repeats of > 15 bp, tetranucleotide repeats of > 16 bp, and pentanucleotide repeats of > 20 bp, as previously described. To better locate di- to pentanucleotide repeats, we also used the program Sputnik, developed at the Washington University. This program allows minor imperfections in the SSRs by implementing a scoring system for insertions, mismatches and deletions. To locate mononucleotide repeats we used a simple Perl script developed in-house.
Reverse Transcriptase Polymerase Chain Reactions (RT-PCR) were performed on total RNA prepared from leaves of three-week-old oat plants (variety Gerald), incubated for 0 min, 15 min, 30 min, 45 min, 1 h, 2 h, 8 h and 24 h at +4°C, using the SuperScript™ III One-Step RT-PCR system (Invitrogen™). RNA samples were first DNase treated using DNase I, Amplification Grade, from Invitrogen™. To amplify the different AsCBF genes the following primers were used:
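The length thresholds above can be illustrated with a short regular-expression search for perfect repeats. This is only a sketch under stated assumptions: it reads each threshold as the total repeat length that must be exceeded, it finds perfect repeats only (unlike Sputnik, which scores imperfections), and it is neither the authors' script nor a reimplementation of Sputnik. The function names are hypothetical.

```python
import re

# Total repeat length (bp) that must be exceeded, keyed by repeat-unit size,
# taken from the thresholds stated in the text.
MIN_LENGTH_EXCLUSIVE = {1: 15, 2: 14, 3: 15, 4: 16, 5: 20}

def is_primitive(unit):
    """True if the unit is not itself a repeat of a shorter motif (e.g. 'AA', 'ATAT')."""
    n = len(unit)
    return not any(n % k == 0 and unit[:k] * (n // k) == unit for k in range(1, n))

def find_perfect_ssrs(seq):
    """Return a sorted list of (start, unit, length) for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for unit_len, threshold in MIN_LENGTH_EXCLUSIVE.items():
        # A unit of the given size followed by one or more copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1+" % unit_len)
        for match in pattern.finditer(seq):
            unit = match.group(1)
            length = match.end() - match.start()
            # Skip e.g. 'AA' runs when scanning for dinucleotides; those are
            # reported once, as mononucleotide repeats.
            if is_primitive(unit) and length > threshold:
                hits.append((match.start(), unit, length))
    return sorted(hits)
```

Because the scan is leftmost and non-overlapping, the reported start of a repeat can shift by a base or two depending on the flanking sequence; for marker development the hits would normally be inspected together with their flanks anyway.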
AsCBF1 forward primer 5'-CCACAGTCCACCGTATCAGCAAG-3'
AsCBF1 reverse primer 5'-CGTCTCCTTGAACTTGGTGCG-3'
AsCBF3 forward primer 5'-CGGGCAAAGTTGAGGCAGGC-3'
AsCBF3 reverse primer 5'-TAGGCTCTGGCTCGGCACCTTC-3'
AsCBF4 forward primer 5'-CCCAGCCTTCAGCAGCGTC-3'
AsCBF4 reverse primer 5'-TCTCCACAGTCTCCTCCGTGC-3'
For the AsCBF1 gene a product size of 174 bp was expected, for AsCBF3 104 bp and for AsCBF4 172 bp. The AsActin gene was used as a control and was amplified using the forward primer 5'-GCGACAATGGAACTGGC-3' and the reverse primer 5'-GTGGTGAAGGAGTAACCTCTCTCG-3'. In this case the expected product size was 580 bp. The RT-PCR reactions were run according to the manufacturer's instructions and 100 ng total RNA was used in each reaction. A 30-min reverse transcription at 55°C was followed by a PCR amplification step of 30, 35 or 40 cycles. To verify the outcome of the RT-PCR reactions, equal amounts (30%) of the corresponding RT-PCR reaction mixes were applied on 1% agarose gels containing ethidium bromide (0.5 ng per ml).
This work was supported by grants from the VL-foundation, the Swedish Farmers Supply and Crop Marketing Co-operative, the Swedish Research School in Genomics and Bioinformatics and the Swedish Research Council (VR).
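For readers who want to reuse the primers listed above, the following sketch collects them with their expected product sizes and computes two simple properties, length and GC fraction. The helper names are hypothetical, and the sketch makes no claim about how the primers were originally designed or checked.

```python
# Primer sequences (5'->3') and expected product sizes (bp) as given in the text.
PRIMERS = {
    "AsCBF1": ("CCACAGTCCACCGTATCAGCAAG", "CGTCTCCTTGAACTTGGTGCG", 174),
    "AsCBF3": ("CGGGCAAAGTTGAGGCAGGC", "TAGGCTCTGGCTCGGCACCTTC", 104),
    "AsCBF4": ("CCCAGCCTTCAGCAGCGTC", "TCTCCACAGTCTCCTCCGTGC", 172),
    "AsActin": ("GCGACAATGGAACTGGC", "GTGGTGAAGGAGTAACCTCTCTCG", 580),
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def gc_fraction(seq):
    """Return the fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)

def primer_summary(primers=PRIMERS):
    """Return (gene, fwd_len, fwd_gc, rev_len, rev_gc, product_bp) rows."""
    rows = []
    for gene, (fwd, rev, product_bp) in primers.items():
        rows.append((gene, len(fwd), round(gc_fraction(fwd), 2),
                     len(rev), round(gc_fraction(rev), 2), product_bp))
    return rows
```

`reverse_complement` is included because the reverse primer anneals to the opposite strand, so locating it in an EST sequence requires searching for its reverse complement.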
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.