Cross-species EST alignments reveal novel and conserved alternative splicing events in legumes
© Wang et al; licensee BioMed Central Ltd. 2008
Received: 16 October 2007
Accepted: 19 February 2008
Published: 19 February 2008
Although originally thought to be less frequent in plants than in animals, alternative splicing (AS) is now known to be widespread in plants. Here we report the characteristics of AS in legumes, one of the largest and most important plant families, based on EST alignments to the genome sequences of Medicago truncatula (Mt) and Lotus japonicus (Lj).
Based on cognate EST alignments alone, the observed frequency of alternatively spliced genes is lower in Mt (~10%, 1,107 genes) and Lj (~3%, 92 genes) than in Arabidopsis and rice (both around 20%). However, AS frequencies are comparable in all four species if EST levels are normalized. Intron retention is the most common form of AS in all four plant species (~50%), with slightly lower frequency in legumes compared to Arabidopsis and rice. This differs notably from vertebrates, where exon skipping is most common. To uncover additional AS events, we aligned ESTs from other legume species against the Mt genome sequence. In this way, 248 additional Mt genes were predicted to be alternatively spliced. We also identified 22 AS events completely conserved in two or more plant species.
This study extends the range of plant taxa shown to have high levels of AS, confirms the importance of intron retention in plants, and demonstrates the utility of using ESTs from related species in order to identify novel and conserved AS events. The results also indicate that the frequency of AS in plants is comparable to that observed in mammals. Finally, our results highlight the importance of normalizing EST levels when estimating the frequency of alternative splicing.
Alternative splicing (AS) is an important cellular process that leads to multiple mRNA isoforms from a single pre-mRNA in eukaryotic organisms. Plant AS events used to be regarded as rare. However, a growing number of computational studies have now demonstrated that the frequency of alternatively spliced genes in plants is higher than previously estimated [1, 2]. Large-scale EST-genome alignments show that 20–30% of expressed genes are alternatively spliced in Arabidopsis thaliana (At) and rice (Oryza sativa, Os) [1, 2]. A recent study using gapped alignments of EST pairs (EST-EST) surveyed 11 plant species and suggested that overall AS frequencies vary greatly among plant species, with some rates comparable to those observed in animals. In mammals, exon skipping (ExonS) is the most common type of AS [4, 5], but in At and Os, intron retention (IntronR) is most abundant. Alternative acceptor site (AltA) and alternative donor site (AltD) events are also common in these two model plants [1, 2]. A rare type of AS event is alternative position (AltP), where an alternative intron differs from its constitutive form in both donor and acceptor sites. Examples of all five types of AS events are shown in Additional file 1 (Supplementary Figure S1). Recently, a novel approach involving whole-genome microarray data revealed that IntronR can be detected in ~8% of At genes. The prevalence of IntronR events suggests that an intron recognition mechanism is predominant in At and Os. A small fraction of conserved AS events have also been discovered and confirmed between At and Os, strongly indicating the functional importance of AS in plants.
Most computational studies on AS in mammals and plants use transcript sequences from the same species as their genome sequences. For species with relatively small EST/cDNA collections, transcript sequences from closely related species can be a valuable resource for identification of additional AS events. Even for species with large EST collections, including human and mouse, cross-species EST alignments have been used to reveal novel AS events. As many as 42% of human genes show novel AS patterns when mouse transcripts are aligned to the human genome, and more than 10% of human loci exhibit conserved AS events in mouse. Another study applying the cross-species strategy to human, mouse and rat identified 758 novel cassette-on exons (ExonS) as well as 167 novel retained introns (IntronR). RT-PCR validated 50–80% of tested events, indicating the impressive potential of the cross-species method in identifying novel AS events. In plants, cross-species transcripts have been used mainly for gene annotation. For example, transcript assemblies from 185 species were mapped to the Os genome, confirming about 90% of gene predictions and identifying about 500 novel genes. Similarly, approximately 850 novel genes and 1,000 novel AS events were annotated in Os by aligning ESTs from seven plant species. The AS events supported by cross-species transcripts are likely to be functional, as they are conserved between species.
Experimental studies provide additional insight into the function of AS in plants. A wide range of plant genes with diverse functions are regulated through AS, including (but not limited to) genes involved in transcription, splicing, photosynthesis, disease resistance, stress, flowering and grain quality (reviewed in [12, 13]). Genes involved in splicing, especially in splicing regulation, seem to have a higher frequency of AS. Several recent studies have revealed that serine/arginine-rich (SR) protein transcripts exhibit extensive levels of AS and that some AS patterns are conserved between At and Os [15–18]. Maize SR protein transcripts are also alternatively spliced [19, 20]. Temperature stress (cold and heat) as well as hormone treatment can change the AS patterns of SR proteins in At, suggesting an important role for AS in the stress response. One At U2AF35 homolog (atU2AF35a) is alternatively spliced by removing non-canonical introns with repeated borders in the 3'-end of the coding region. Changing the expression of U2AF35 homologs alters the splicing pattern of the FCA gene and, in turn, causes variation in flowering time. The U1-70K gene encodes a core protein in U1 small nuclear ribonucleoproteins (snRNP). The sixth intron of U1-70K can be retained in At, an event conserved between At and Os. Recently, the IntronR event was experimentally confirmed in Os and maize.
Over 400 genes in 54 plant species are now known to be alternatively spliced. Only a few AS events, however, have been reported in legumes (Fabaceae), one of the largest and most important plant families. In Lotus japonicus (Lj), a phytochelatin synthase gene (LjPCS2) can be alternatively spliced, with one isoform present in nodules (LjPCS2-7N) and another isoform in roots (LjPCS2-7R). The two isoforms encode proteins differing in only five amino acids, where one protein (LjPCS2-7N) confers cadmium (Cd) tolerance while the other does not, at least not when ectopically expressed in yeast cells. A nodule-specific gene (LjNOD70) shows an IntronR event in Lj, where the spliced isoform is less abundant in nodules. Six sucrose synthase genes exist in At, Os and Lj, but only the Lj homolog (LjSUS2) is alternatively spliced. In soybean (Glycine max, Gm), a nodule-specific gene (GmPGN) has been identified through EST data mining. Experiments confirmed the tissue specificity and also revealed AS events for this gene. In kidney bean (Phaseolus vulgaris), a single gene (PvSBE2) can be alternatively spliced to produce two starch-branching enzyme isoforms, each with distinct characteristics and subcellular localization. A highly abundant novel giant retroelement (Orge) of pea (Pisum sativum) is partially spliced, probably regulating the ratio of full-length protein, as the retained intron causes truncation.
Two legume plants, Medicago truncatula (Mt) and L. japonicus (Lj), have large-scale genome sequencing projects in progress. In late 2006, the Medicago genome sequence consortium (MGSC) constructed a partial genome assembly based on 1,996 Bacterial Artificial Chromosome (BAC) clone sequences as a basis for constructing draft pseudochromosomes. A total of 42,358 genes were annotated by the International Medicago Genome Annotation Group (IMGAG), representing ~60% of all Mt genes. The data have been released as Mt1.0. In parallel, Lj has 1,394 Transformation-competent Artificial Chromosomes (TACs) in GenBank (as of mid-2006), with 488 of them at phase 3 (finished). Both legume model plants have relatively large EST collections (over 150,000 sequences). There are also large numbers of transcript sequences from other legume species, especially soybean. These features make Mt and Lj ideal for computational comparison of AS events in legumes and other plants.
In this study, all available transcript sequences from legumes were aligned to Mt and Lj BAC/TAC sequences. At and Os transcript sequences were also aligned to their own genome sequences for comparison. The frequency of alternatively spliced genes is very similar across the different plant species as long as the number of ESTs used as a basis for analysis is standardized across species. In the case of Mt, about 10% of expressed genes are alternatively spliced at current EST coverage, with IntronR the most abundant type. Novel and conserved AS events can be identified if cross-species ESTs are aligned to the genome. These results provide a basis for analyzing AS events conserved in all plants as well as those found in legumes only. This is the first large-scale analysis of AS using EST-genome alignments in plants other than At and Os, and it is also the first detailed comparison using cross-species transcript sequences in plants.
Characteristics of legume exons and introns
Table: Transcript alignments, intron and exon features in plants (values not recovered). Surviving row labels: Mapped to genome; Transcription unit (TU)/Genes; Average (Median) ESTs/gene; Number of Introns; Average (Median) intron size; Long intron (>1000 nt); Number of internal exons; Average (Median) internal exon.
As noted previously [1, 36], the GC-content of introns and exons is ~5% lower in At than in Os. The GC-content of legume introns and exons is very similar to that of At, although Mt has slightly lower GC-content than either At or Lj in both intronic and exonic regions (see Additional file 1, Supplementary Table S1 and Supplementary Figure S2). G-content and A-content are similar in all species including Os, although Os introns are relatively more C-rich and less U-rich. There is more variation in the distribution of U-(T-) and A- content than in G- or C-content in all species (see Additional file 1, Supplementary Figure S3). The difference in GC-content between introns and exons is about 10% in all four species, with Mt showing the largest difference of 11.7% and Os showing the smallest, 9.6% (see Additional file 1, Supplementary Table S1).
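The GC-content comparison above can be reproduced for any collection of intron or exon sequences. The following is a minimal sketch, assuming the sequences are available as plain strings; the function name `gc_content` and the toy sequences are illustrative and are not taken from this study.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T or U).

    Ambiguity codes such as N are ignored; an empty sequence yields 0.0.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T") + seq.count("U")
    total = gc + at
    return gc / total if total else 0.0


# Hypothetical toy sequences, used only to illustrate the exon/intron contrast.
exon = "ATGGCGCTGAAGGCC"
intron = "GTAAATTTTATATTTTGCAG"
exon_intron_difference = gc_content(exon) - gc_content(intron)
```

Applied to all reliably identified exons and introns of a species, the difference between the two averages corresponds to the exon/intron GC-content gap discussed in the text.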
Different plant species have similar levels of alternatively spliced genes
Table: Comparison of alternative splicing events and frequencies in plants (table body not recovered).
IntronR is the most abundant AS type in legumes
As shown in Table 2, the proportions of different AS types are similar in Mt, At and Os. (Lj data are also listed but are not included in the analysis as only ~100 AS events were identified.) More than half of AS events in plants are IntronR, 6–11% are ExonS, and the remaining 30–40% involve different splice sites (AltD/A/P). These numbers are quite similar to those observed previously. Mt has a slightly lower ratio of IntronR (51%) and a higher ratio of AltD (13%) compared with At and Os. Different levels of EST coverage have little effect on the composition of AS events. As shown in Additional file 1 (Supplementary Figure S4), the ratios of different AS types remain largely constant across all EST levels, particularly in At and Os. IntronR is the most abundant at all levels, with a relatively lower ratio in Mt. The ExonS ratio is consistently lower in At than in Os (and Mt), while the AltA ratio is higher.
Cross-species EST alignment in Medicago reveals hundreds of novel AS events
Table: Cross-species EST alignments in Medicago (values not recovered). Surviving row labels: Mapped to Mt BACs; Genes without Mt EST.
Table: AS events predicted from cross-species EST alignment in Medicago (table body not recovered).
Approximately 90% of cross-species AS events are located in open reading frames (ORFs), much higher than the fraction (70–75%) in same-species AS events. There seem to be more cross-species and same-species AS events in the 5'-UTR than in the 3'-UTR (data not shown). For AS events in ORFs, the fractions of translation-readthrough events, where some amino acids are added to or removed from the protein without changing the reading frame, are similar (20–24%) in cross-species and same-species events. AltA has the highest translation-readthrough ratio (35–40%), and IntronR has the lowest (2–10%). Intriguingly, the ratio of AS events producing substrates for nonsense-mediated decay (NMD) is higher in cross-species AS events than in same-species AS events. Nearly half of the cross-species AS events produce NMD substrates, compared with 30–40% in same-species AS events.
Conserved AS events identified from cross-species EST alignments in legumes
To identify AS events with direct evidence of conservation in multiple species, two approaches were employed: (1) Align all legume ESTs to Lj TACs to identify conserved AS events predicted by the same ESTs between Mt and Lj; (2) Identify conserved AS events in Mt with EST evidence from multiple legume species, all showing the same AS pattern. A total of 242 AS events conserved between Mt and Lj were identified through method (1), including 92 (38.0%) IntronR, 26 (10.7%) ExonS, 78 (32.2%) AltA, 41 (17.0%) AltD, and 5 (2.1%) AltP events. These AS events are viewable at the ASIP website. Method (2) identified 22 completely conserved AS events in Mt (see Additional file 1, Supplementary Table S3). Nine of the 22 genes also have close At and/or Os homologs sharing the same AS pattern. For instance, Mt hypothetical protein AC156627_1 has both soybean and Mt EST support for an AltA event in the first ORF intron, whereby an isoform utilizes an alternative acceptor site 5 nt upstream (AACAG) of the constitutive acceptor site (AGCAG), producing a substrate possibly subject to NMD. The At homologs (At5g25360.1 and At1g15350.1) and the Os homolog (LOC_Os02g10720) all have exactly the same AS pattern, including the alternative acceptor sites. This gene seems to be plant-specific, as non-plant homologs cannot be identified. Another example of completely conserved AS events is the Mt AP2 domain-containing protein AC151460_3, where the 3'-UTR intron can be retained. One At homolog and three Os homologs also have the same intron retained. There are also some AS events conserved in legumes but not observed in At and Os. One example is AC124951_11, a highly expressed carbonic anhydrase gene with the 3'-UTR intron alternatively spliced (AltD) in legume species. The AltD event is conserved in all legume species (Mt, Lj, Gm, and others), but not in At and Os even though hundreds of ESTs exist, indicating that this AS event is probably legume-specific.
Comparison of AS frequencies in different species
In this study, alignment of current EST and genomic sequences revealed that ~10% of expressed genes are alternatively spliced in Mt compared with 20% in At and Os. This difference is mainly due to the lower EST coverage found in Mt. We demonstrated that the AS frequencies in the three plants are essentially similar when adjusted for genes having comparable EST numbers. This conclusion differs from that of a recent study based on gapped alignments of EST pairs, in which a greater degree of variation was observed among plant species. Interpretation of EST-only data can be confounded by extensive gene duplication events. With more plant genome sequences becoming available, it should soon be possible to more precisely address the intriguing questions concerning the extent and evolution of AS in plants.
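The normalization described above amounts to comparing AS frequencies only among genes with similar EST support. A minimal sketch of that idea follows, assuming each gene is summarized as a pair (number of ESTs, whether any AS event was detected). The function name, the bin boundaries, and the toy data are illustrative assumptions and are not the bins or data used in this study.

```python
from bisect import bisect_left


def as_frequency_by_est_bin(genes, upper_bounds):
    """Fraction of alternatively spliced genes within each EST-count bin.

    genes: iterable of (est_count, is_alternatively_spliced), est_count >= 1.
    upper_bounds: sorted, inclusive upper limits of the bins; one extra bin
    collects every gene above the last limit.
    Returns a dict mapping a bin label to the AS fraction (None if empty).
    """
    counts = [[0, 0] for _ in range(len(upper_bounds) + 1)]  # [genes, AS genes]
    for est_count, is_as in genes:
        i = bisect_left(upper_bounds, est_count)
        counts[i][0] += 1
        counts[i][1] += int(is_as)

    labels, lower = [], 1
    for upper in upper_bounds:
        labels.append(f"{lower}-{upper}")
        lower = upper + 1
    labels.append(f">{upper_bounds[-1]}")

    return {
        label: (n_as / n if n else None)
        for label, (n, n_as) in zip(labels, counts)
    }


# Hypothetical genes: (EST count, AS detected?)
toy_genes = [(3, False), (5, True), (10, False), (20, True), (50, True)]
toy_result = as_frequency_by_est_bin(toy_genes, [5, 20])
```

Running the same function on each species with identical bin boundaries allows the per-bin frequencies to be compared directly, rather than the overall frequencies, which depend on how deeply each species has been sampled.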
Because alternatively spliced isoforms are usually in low abundance, the chance of capturing them in a small EST collection is low, making it difficult to estimate AS frequencies accurately. Suppose that a fraction p of the transcripts of a gene carry a functional AS event; the probability of observing that event among n ESTs covering the alternative splice site is then 1 - (1 - p)^n. For example, if an alternatively spliced isoform were generated p = 10% of the time, n = 10 transcript sequences would give a 65% probability of observing this event, and 22 transcript sequences would be required to have >90% probability of observing the event. Our results show that the AS frequencies for genes with small numbers of ESTs are similar in Mt, At, and Os, suggesting that they all have similar levels of functional AS events.
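The figures quoted above can be checked directly from the formula 1 - (1 - p)^n. The short sketch below does so; the function names are illustrative, and the calculation treats each EST as an independent draw, which is an assumption implicit in the formula.

```python
def detection_probability(p: float, n: int) -> float:
    """Probability that at least one of n ESTs shows an isoform of abundance p."""
    return 1.0 - (1.0 - p) ** n


def min_transcripts(p: float, target: float) -> int:
    """Smallest n for which the detection probability reaches the target."""
    if not (0.0 < p <= 1.0) or not (0.0 < target < 1.0):
        raise ValueError("p must be in (0, 1] and target in (0, 1)")
    n = 1
    while detection_probability(p, n) < target:
        n += 1
    return n


prob_with_10_ests = detection_probability(0.10, 10)  # about 0.65
ests_for_90_percent = min_transcripts(0.10, 0.90)    # 22
```

For rarer isoforms the required depth grows quickly, which is why genes with few ESTs are unlikely to reveal low-abundance events.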
In cases where AS isoforms are even lower in abundance, greater numbers of transcripts would be clearly necessary to detect the event. Nevertheless, Os seems to have a higher frequency of AS in genes with >30 ESTs than either Mt or At. Focusing on genes with >40 ESTs only, the AS frequency in Os is consistently (>10%) higher than in At. For this analysis, we did not include transcripts from Os subspecies indica in order to eliminate the possibility that the higher AS frequency is falsely caused by cross-subspecies ESTs. In any case, the error rates from EST sequencing or genome contamination are probably similar in all three plants. Consequently, Os does seem to have higher levels of low-abundance AS events than At (or Mt). Some of the low-abundance events may be splicing errors captured in EST libraries constructed from plant tissues under various growth conditions, so the higher level of low-abundance AS events in Os could indicate higher error rates for the Os spliceosome.
Not surprisingly, observed AS frequency is highly correlated with EST numbers in all three plants. Highly expressed genes (genes with large numbers of ESTs) are more likely to be detected as alternatively spliced. Over 60% and 40% of genes with more than 500 ESTs are alternatively spliced in Os and At, respectively. This is comparable to the level in human. Half of human genes are alternatively spliced by the criterion that AS isoforms occur in at least 1% of the observed transcripts, but only 20% of human genes are alternatively spliced if the required abundance level is increased to >10%. This frequency is notably similar to the frequency in plants under the same abundance level, suggesting that the frequency of regulated AS events in plants may not be significantly lower than in mammals.
Splicing errors and functional AS events
A clear difference between AS in plants and mammals is the predominance of IntronR in plants and ExonS in mammals. Both model legumes, Mt and Lj, have 40–50% of AS events as IntronR, a level noticeably lower than in At and Os, but still much higher than in mammals. Similar to the situation in At and Os, introns shorter than 70 nt are more likely to be retained in legumes (data not shown). The spliceosome is a large dynamic RNA-protein complex involving hundreds of proteins. If an intron is too small, the assembly and structural transformation of the spliceosome will be constrained, which may lead to inefficient splicing and IntronR. As the size of introns is considerably larger in Mt and Lj, fewer introns will be retained due to steric hindrance, possibly leading to a lower frequency of IntronR in legumes. These data also suggest that some AS events may be splicing errors. As we proposed previously, the most common splicing error in plants is probably a failure to recognize and splice out introns, so IntronR should be the most common AS type. In mammals, where introns are defined through an exon recognition mechanism, a failure to recognize some exons, and therefore to include them, is likely the most common error. Consequently, ExonS is the most common AS type in human.
Observed AS events are a mixture of functional AS events and splicing errors. Other types of error, such as sequencing errors, genome contamination, and alignment errors, will also contribute to the predicted level of AS events. Two alignment programs (GeneSeqer and GMAP) were applied and only common AS events were used in this study to minimize alignment errors. Genome contamination could be minimized by elimination of ESTs retaining all predicted introns. Distinguishing functional AS events from splicing errors, however, is not an easy task. We attempted to achieve this goal by two methods. First, we selected AS events with each isoform supported by multiple transcripts. As splicing errors are expected to occur at low frequency, the chances they will be captured in two distinct transcripts are low. In this data set, the frequency of IntronR is slightly lower, but still the highest among the five AS types, indicating that IntronR is indeed the most abundant regulated AS result. The second method is to look for conserved AS events through cross-species EST comparison and orthologous gene comparison. A few AS events were completely conserved in Mt, Lj, At and Os.
Functional AS events, however, may not always be conserved. As a dynamic process, splicing requires hundreds of proteins as well as some snRNAs to function accurately. Mutations in both trans- and cis-elements on target genes will impact splicing patterns. Depending on when the mutation and fixation event occurs, functional AS events can be shared among closely related species or be lineage-specific. The AltD event in the 3'-UTR of the highly expressed carbonic anhydrase gene (AC124951_11) may be a good example shared by legume species. Lineage-specific functional AS events are difficult to define from EST data alone.
Centralized data place and standard data set for ASIP
As more plant genomes and ESTs are being sequenced, more AS events will be identified in the future. It is important to have a centralized place to store and compare all AS data. In animal systems, a comprehensive database, ASAP, includes AS data from 16 sequenced animals, which makes a comparison across different animal species straightforward. Such a database is also needed in plants, as the study of splicing signals and alternative splicing is just starting. The AS data identified in this study have been deposited in the ASIP database at PlantGDB, where previous AS data are stored and can be easily compared. Moreover, a database collecting genes related to splicing in At, animals and yeast is available through the SRGD database at PlantGDB [14, 44]. In the future, the database will be expanded to Os and other sequenced plant genomes including Mt, Lj and poplar. The analysis programs and plant genome browsers available at PlantGDB should facilitate the deep mining of AS data in plants. A core data set in which the AS events are conserved in all sequenced plants will be extremely useful for understanding the function of AS events, as well as the signals and regulation of this important and intriguing phenomenon.
As in At and Os, AS events are widespread in the two model legumes Mt and Lj. Thousands of AS events were identified in Mt through a combination of same- and cross-species EST alignments. The frequency of alternatively spliced genes is similar across different plant species when the number of ESTs is standardized. Compared with mammals, plants are thought to have a relatively low frequency of alternatively spliced genes. Our results indicate that this assessment may be due in part to the comparatively low EST coverage in plant species. Among all five AS types discussed, IntronR is the most abundant in different subsets of genes, as previously observed in At and Os. We also identified hundreds of novel and conserved AS events through cross-species EST alignments. This is the first study in plants using cross-species ESTs to explore AS. For species with large EST collections but scant genome sequence data, including wheat and barley, aligning their ESTs to a closely related reference genome, such as Os, should shed light on alternative splicing in these species.
The Medicago Genome Sequence Consortium (MGSC) release 1.0, consisting of the 1,826 BACs analyzed in this study, was downloaded from the Medicago genome sequencing project website. The assembly comprises a total of 186.2 Mb of non-redundant genome sequence, an estimated 38–47% of the entire genome and 55–58% of total gene space. All other sequence data sets used in this study were current as of July 17, 2006, the cutoff date for BACs incorporated into the Mt1.0 genome assembly. For Lotus japonicus, 1,394 BAC/TACs were downloaded from the NCBI nucleotide database using the query "txid34305 [ORGN:noexp] AND HTG [KYWD]". Arabidopsis genome sequences and gene annotation (TAIR release 6.0) were downloaded from the GenBank FTP site, and rice genome sequences and gene annotation (TIGR release 4.0) were downloaded from the TIGR FTP site.
All EST sequences (including full-length cDNAs) were retrieved from the GenBank nucleotide database. Sets of 225,920 Mt and 150,855 Lj transcript sequences were collected using the queries (txid3880 [ORGN] AND "biomol mrna" [PROP]) and (txid34305 [ORGN] AND "biomol mrna" [PROP]), respectively. Soybean transcript sequences (359,834) were retrieved using the query (txid3847 [ORGN] AND "biomol mrna" [PROP]), and 127,684 transcript sequences from all other legumes were retrieved using the query (txid3803 [ORGN:exp] NOT txid3880 [ORGN] NOT txid34305 [ORGN] NOT txid3847 [ORGN] AND "biomol mrna" [PROP]). For At, 691,516 transcript sequences were retrieved using the query (txid3702 [ORGN] AND "biomol mrna" [PROP] AND srcdb_ddbj/embl/genbank [PROP]). For Os, 1,009,574 ESTs from the japonica cultivar-group were retrieved using the query (txid39947 [ORGN] AND "biomol mrna" [PROP] AND srcdb_ddbj/embl/genbank [PROP]). We intentionally excluded transcript sequences from the indica cultivar-group to reduce possible false positive alignments caused by differences between the two Os cultivar-groups.
Spliced alignment of transcript to genome sequences
The legume transcript sequences were mapped to the Mt and Lj BAC sets using the two computer programs GeneSeqer and GMAP. The splice site models for GeneSeqer were set to Medicago-specific parameters using the program option "-s Medicago". Default parameters were used for all other options. Default alignment parameters were used for GMAP. For At and Os, only GMAP alignments were performed locally, and GeneSeqer alignments derived from a larger data set were downloaded from PlantGDB.
GMAP and GeneSeqer output alignment files were processed by a pipeline (ASpipe1.0, available through SourceForge) developed from Perl and shell scripts used in a previous study. ASpipe extracts coordinates and scores for high-quality introns, exons and alignments from the original program outputs and stores them in MySQL 5.0 databases. For same-species EST alignments, the criteria for high-quality alignments were >95% sequence identity and >80% coverage (defined as the portion of the transcript sequence aligned to the genomic sequence). The high identity (95%) cutoff minimizes false mapping of transcript sequences to incomplete genomes. For cross-species transcript alignments, the identity cutoff was decreased to 80%, which selects reliable alignments from divergent transcript sequences. Redundant EST alignments in Mt were removed by comparison with the non-redundant gene list provided for Mt1.0. Exons mapped with >95% and >80% sequence identity were considered as reliably identified exons for same-species and cross-species mappings, respectively. Introns with reliable neighboring exons on both ends were considered as reliably identified introns. A transcription unit (TU) was defined as a consecutive genomic region where transcript sequences were mapped and clustered. Annotated gene models may contain multiple TUs. For Mt, At and Os, annotated genes were used as the base for analysis. For Lj, where no gene annotation is available, TUs were the base for analysis.
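The alignment quality criteria above can be expressed as a simple filter. The following is a sketch only, not the ASpipe code: the `Alignment` record and its field names are hypothetical, and it assumes that the >80% coverage requirement also applies to cross-species alignments, which the text does not state explicitly.

```python
from dataclasses import dataclass


@dataclass
class Alignment:
    """Hypothetical summary of one transcript-to-genome alignment."""
    est_id: str
    est_length: int      # total length of the transcript sequence
    aligned_bases: int   # transcript bases included in the alignment
    identity: float      # sequence identity as a fraction (0 to 1)
    cross_species: bool  # True if the EST comes from another species


def is_high_quality(aln: Alignment,
                    same_species_identity: float = 0.95,
                    cross_species_identity: float = 0.80,
                    min_coverage: float = 0.80) -> bool:
    """Apply the identity and coverage cutoffs described in the text."""
    coverage = aln.aligned_bases / aln.est_length
    min_identity = (cross_species_identity if aln.cross_species
                    else same_species_identity)
    return aln.identity > min_identity and coverage > min_coverage


# Hypothetical alignments for illustration.
alignments = [
    Alignment("est1", 500, 450, 0.97, False),  # passes same-species cutoffs
    Alignment("est2", 500, 450, 0.90, False),  # identity too low for same species
    Alignment("est3", 500, 450, 0.90, True),   # passes the relaxed cross-species cutoff
    Alignment("est4", 500, 300, 0.99, False),  # coverage too low
]
kept = [a.est_id for a in alignments if is_high_quality(a)]
```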
Identification of alternative splicing (AS) and conserved AS events
The coordinates of reliable introns and exons were compared in a pairwise fashion in order to identify candidates for AS events. For intron/intron comparisons, if two introns had the same 3'-end but a different 5'-end, this event was classified as AltD. If two introns differed only in the 3'-ends, this event was classified as AltA. AltP events refer to introns overlapping with each other but with both 5'- and 3'-ends differing. For intron/exon comparisons, if an intron was completely covered by an exon, the event was classified as IntronR. If an exon was completely covered by an intron, the event was classified as ExonS. ExonS events involving terminal exons and the AltA/D/P events related to ExonS events were removed. The process and algorithm for identifying and analyzing AS events are described in more detail elsewhere. AS events identified from cross-species EST alignment were labeled as "cross-species AS events". Correspondingly, the events from same-species EST alignment were referred to as "same-species AS events".
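The pairwise classification rules above can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' implementation: features are (start, end) tuples on the forward strand, so an intron's start is its 5'-end (donor) and its end is its 3'-end (acceptor), and the function names are hypothetical. The later filtering steps (removal of terminal-exon ExonS events and of AltA/D/P events related to ExonS) are not reproduced here.

```python
def classify_intron_pair(a, b):
    """Compare two introns given as (donor, acceptor) coordinates.

    Returns "AltD", "AltA", "AltP", or None when the introns are
    identical or do not overlap.
    """
    if a == b:
        return None
    same_donor = a[0] == b[0]
    same_acceptor = a[1] == b[1]
    if same_acceptor and not same_donor:
        return "AltD"  # same 3'-end, different 5'-end
    if same_donor and not same_acceptor:
        return "AltA"  # same 5'-end, different 3'-end
    overlapping = a[0] <= b[1] and b[0] <= a[1]
    return "AltP" if overlapping else None


def classify_intron_exon(intron, exon):
    """Compare an intron with an exon observed in another transcript.

    Returns "IntronR" if the exon completely covers the intron,
    "ExonS" if the intron completely covers the exon, otherwise None.
    """
    if exon[0] <= intron[0] and intron[1] <= exon[1]:
        return "IntronR"
    if intron[0] <= exon[0] and exon[1] <= intron[1]:
        return "ExonS"
    return None
```

For genes on the reverse strand the donor and acceptor roles of the two coordinates would have to be swapped before applying the same rules.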
Conserved AS events were identified in two ways: (1) Comparing cross-species AS events with same-species AS events and other cross-species AS events from different species; (2) Identifying orthologous gene pairs between Mt and Lj and comparing their AS events. In the first method, the Mt genome coordinates of the AS events predicted from multiple species ESTs were compared. Only events with identical coordinates of an alternatively processed intron(s)/exon(s) were regarded as completely conserved. In the second method, the orthologous genes were identified by searching ESTs mapped in both Mt and Lj genomes. In some cases, orthologs in At and Os were identified by reciprocal BLAST using annotated protein sequences from At, Mt and Os. Gene structures and AS events of orthologous genes were then compared to identify conserved AS events.
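The first method, requiring identical Mt genome coordinates for events predicted from ESTs of different species, can be sketched as a grouping step. The data layout below is an assumption made for illustration: each event is a tuple of the source species of the supporting ESTs, the AS type, and a hashable coordinate key (here a hypothetical sequence identifier plus start and end positions).

```python
from collections import defaultdict


def completely_conserved(events, min_species=2):
    """Group AS events by type and exact coordinates.

    events: iterable of (species, as_type, coords), where coords is a
    hashable key such as (sequence_id, start, end) on the Mt genome.
    Returns {(as_type, coords): sorted list of supporting species} for
    events supported by ESTs from at least min_species species.
    """
    support = defaultdict(set)
    for species, as_type, coords in events:
        support[(as_type, coords)].add(species)
    return {
        key: sorted(species_set)
        for key, species_set in support.items()
        if len(species_set) >= min_species
    }


# Hypothetical events; identifiers and coordinates are made up.
toy_events = [
    ("Mt", "AltA", ("bac_1", 1200, 1450)),
    ("Gm", "AltA", ("bac_1", 1200, 1450)),
    ("Lj", "IntronR", ("bac_2", 300, 410)),
    ("Gm", "AltA", ("bac_1", 1205, 1450)),  # different coordinates, not grouped
]
toy_conserved = completely_conserved(toy_events)
```

Because only exact coordinate matches are grouped, an event that is shifted by even a few nucleotides in another species' ESTs is not counted as completely conserved.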
AltA: Alternative Acceptor site
AltD: Alternative Donor site
AltP: Alternative Position (both donor and acceptor sites are different)
AS: Alternative Splicing
EST: expressed sequence tag
Lj: Lotus japonicus
Mt: Medicago truncatula
ORF: open reading frame
BBW and MOT were supported by National Science Foundation grants DBI-0321460 and DBI-0606966 to NY. Data generated in this study are hosted at and publicly available through the ASIP database at PlantGDB, funded through NSF grant DBI-0606909 to VB.
- Wang BB, Brendel V: Genomewide comparative analysis of alternative splicing in plants. Proc Natl Acad Sci USA. 2006, 103 (18): 7175-7180. 10.1073/pnas.0602039103.
- Campbell MA, Haas BJ, Hamilton JP, Mount SM, Buell CR: Comprehensive analysis of alternative splicing in rice and comparative analyses with Arabidopsis. BMC Genomics. 2006, 7: 327. 10.1186/1471-2164-7-327.
- Ner-Gaon H, Leviatan N, Rubin E, Fluhr R: Comparative cross-species alternative splicing in plants. Plant Physiol. 2007, 144 (3): 1632-1641. 10.1104/pp.107.098640.
- Kim E, Magen A, Ast G: Different levels of alternative splicing among eukaryotes. Nucleic Acids Res. 2007, 35 (1): 125-131. 10.1093/nar/gkl924.
- Gupta S, Zink D, Korn B, Vingron M, Haas SA: Genome wide identification and classification of alternative splicing based on EST data. Bioinformatics. 2004, 20: 2579-2585. 10.1093/bioinformatics/bth288.
- Ner-Gaon H, Fluhr R: Whole-genome microarray in Arabidopsis facilitates global analysis of retained introns. DNA Res. 2006, 13 (3): 111-121. 10.1093/dnares/dsl003.
- Kan Z, Castle J, Johnson JM, Tsinoremas NF: Detection of novel splice forms in human and mouse using cross-species approach. Pac Symp Biocomput. 2004, 9: 42-53.
- Sugnet CW, Kent WJ, Ares M, Haussler D: Transcriptome and genome conservation of alternative splicing events in humans and mice. Pac Symp Biocomput. 2004, 66-77.
- Chen FC, Chen CJ, Ho JY, Chuang TJ: Identification and evolutionary analysis of novel exons and alternative splicing events using cross-species EST-to-genome comparisons in human, mouse and rat. BMC Bioinformatics. 2006, 7: 136. 10.1186/1471-2105-7-136.
- Zhu W, Buell CR: Improvement of whole-genome annotation of cereals through comparative analyses. Genome Res. 2007, 17 (3): 299-310. 10.1101/gr.5881807.
- Chen FC, Wang SS, Chaw SM, Huang YT, Chuang TJ: Plant Gene and Alternatively Spliced Variant Annotator. A plant genome annotation pipeline for rice gene and alternatively spliced variant identification with cross-species expressed sequence tag conservation from seven plant species. Plant Physiol. 2007, 143 (3): 1086-1095. 10.1104/pp.106.092460.
- Reddy AS: Alternative Splicing of Pre-Messenger RNAs in Plants in the Genomic Era. Annu Rev Plant Biol. 2007, 58: 267-294. 10.1146/annurev.arplant.58.032806.103754.
- Reddy ASN: Nuclear pre-mRNA splicing in plants. Critical Rev Plant Sci. 2001, 20: 523-571. 10.1016/S0735-2689(01)80004-6.
- Wang BB, Brendel V: The ASRG database: identification and survey of Arabidopsis thaliana genes involved in pre-mRNA splicing. Genome Biol. 2004, 5 (12): R102. 10.1186/gb-2004-5-12-r102.
- Palusa SG, Ali GS, Reddy AS: Alternative splicing of pre-mRNAs of Arabidopsis serine/arginine-rich proteins: regulation by hormones and stresses. Plant J. 2007, 49 (6): 1091-1107.
- Kalyna M, Lopato S, Voronin V, Barta A: Evolutionary conservation and regulation of particular alternative splicing events in plant SR proteins. Nucleic Acids Res. 2006, 34 (16): 4395-4405. 10.1093/nar/gkl570.
- Iida K, Go M: Survey of conserved alternative splicing events of mRNAs encoding SR proteins in land plants. Mol Biol Evol. 2006, 23 (5): 1085-1094. 10.1093/molbev/msj118.
- Isshiki M, Tsumoto A, Shimamoto K: The serine/arginine-rich protein family in rice plays important roles in constitutive and alternative splicing of pre-mRNA. Plant Cell. 2006, 18 (1): 146-158. 10.1105/tpc.105.037069.
- Gupta S, Wang BB, Stryker GA, Zanetti ME, Lal SK: Two novel arginine/serine (SR) proteins in maize are differentially spliced and utilize non-canonical splice sites. Biochim Biophys Acta. 2005, 1728 (3): 105-114.
- Gao H, Gordon-Kamm WJ, Lyznik LA: ASF/SF2-like maize pre-mRNA splicing factors affect splice site utilization and their transcripts are alternatively spliced. Gene. 2004, 339: 25-37. 10.1016/j.gene.2004.06.047.
- Wang BB, Brendel V: Molecular characterization and phylogeny of U2AF35 homologs in plants. Plant Physiol. 2006, 140 (2): 624-636. 10.1104/pp.105.073858.
- Golovkin M, Reddy AS: Structure and expression of a plant U1 snRNP 70K gene: alternative splicing of U1 snRNP 70K pre-mRNAs produces two different transcripts. Plant Cell. 1996, 8 (8): 1421-1435. 10.1105/tpc.8.8.1421.
- Gupta S, Ciungu A, Jameson N, Lal SK: Alternative splicing expression of U1 snRNP 70K gene is evolutionary conserved between different plant species. DNA Seq. 2006, 17 (4): 254-261. 10.1080/10425170600856642.
- Zhou Y, Zhou C, Ye L, Dong J, Xu H, Cai L, Zhang L, Wei L: Database and analyses of known alternatively spliced genes in plants. Genomics. 2003, 82 (6): 584-595. 10.1016/S0888-7543(03)00204-0.
- Ramos J, Clemente MR, Naya L, Loscos J, Perez-Rontome C, Sato S, Tabata S, Becana M: Phytochelatin synthases of the model legume Lotus japonicus. A small multigene family with differential response to cadmium and alternatively spliced variants. Plant Physiol. 2007, 143 (3): 1110-1118. 10.1104/pp.106.090894.
- Szczyglowski K, Kapranov P, Hamburger D, de Bruijn FJ: The Lotus japonicus LjNOD70 nodulin gene encodes a protein with similarities to transporters. Plant Mol Biol. 1998, 37 (4): 651-661. 10.1023/A:1006043428636.
- Horst I, Welham T, Kelly S, Kaneko T, Sato S, Tabata S, Parniske M, Wang TL: TILLING Mutants of Lotus japonicus Reveal that Nitrogen Assimilation and Fixation can Occur in the Absence of Nodule-enhanced Sucrose Synthase. Plant Physiol. 2007.
- Jeong SC, Yang K, Park JY, Han KS, Yu S, Hwang TY, Hur CG, Kim SH, Park PB, Kim HM, Park YI, Liu JR: Structure, expression, and mapping of two nodule-specific genes identified by mining public soybean EST databases. Gene. 2006, 383: 71-80. 10.1016/j.gene.2006.07.015.
- Hamada S, Ito H, Hiraga S, Inagaki K, Nozaki K, Isono N, Yoshimoto Y, Takeda Y, Matsui H: Differential characteristics and subcellular localization of two starch-branching enzyme isoforms encoded by a single gene in Phaseolus vulgaris L. J Biol Chem. 2002, 277 (19): 16538-16546. 10.1074/jbc.M110497200.
- Neumann P, Pozarkova D, Macas J: Highly abundant pea LTR retrotransposon Ogre is constitutively transcribed and partially spliced. Plant Mol Biol. 2003, 53 (3): 399-410. 10.1023/B:PLAN.0000006945.77043.ce.
- Young ND, Cannon SB, Sato S, Kim D, Cook DR, Town CD, Roe BA, Tabata S: Sequencing the genespaces of Medicago truncatula and Lotus japonicus. Plant Physiol. 2005, 137 (4): 1174-1181. 10.1104/pp.104.057034.
- Town CD: Annotating the genome of Medicago truncatula. Curr Opin Plant Biol. 2006, 9 (2): 122-127. 10.1016/j.pbi.2006.01.004.
- Medicago genome sequence release 1.0. [http://www.medicago.org/genome/downloads/Mt1/]
- Brendel V, Xing L, Zhu W: Gene structure prediction from consensus spliced alignment of multiple ESTs matching the same genomic locus. Bioinformatics. 2004, 20 (7): 1157-1169. 10.1093/bioinformatics/bth058.
- Wu TD, Watanabe CK: GMAP: a genomic mapping and alignment program for mRNA and EST sequences. Bioinformatics. 2005, 21 (9): 1859-1875. 10.1093/bioinformatics/bti310.
- Goodall GJ, Filipowicz W: Different effects of intron nucleotide composition and secondary structure on pre-mRNA splicing in monocot and dicot plants. Embo J. 1991, 10 (9): 2635-2644.
- Alternative Splicing In Plants (ASIP). [http://www.plantgdb.org/ASIP/]
- Johnson JM, Castle J, Garrett-Engele P, Kan Z, Loerch PM, Armour CD, Santos R, Schadt EE, Stoughton R, Shoemaker DD: Genome-wide survey of human alternative pre-mRNA splicing with exon junction microarrays. Science. 2003, 302 (5653): 2141-2144. 10.1126/science.1090100.
- Brett D, Pospisil H, Valcarcel J, Reich J, Bork P: Alternative splicing and genome complexity. Nat Genet. 2002, 30 (1): 29-30. 10.1038/ng803.
- Alexandrov NN, Troukhan ME, Brover VV, Tatarinova T, Flavell RB, Feldmann KA: Features of Arabidopsis Genes and Genome Discovered using Full-length cDNAs. Plant Mol Biol. 2006, 60 (1): 69-85. 10.1007/s11103-005-2564-9.
- Lewis BP, Green RE, Brenner SE: Evidence for the widespread coupling of alternative splicing and nonsense-mediated mRNA decay in humans. Proc Natl Acad Sci USA. 2003, 100 (1): 189-192. 10.1073/pnas.0136770100.
- Kan Z, States D, Gish W: Selecting for functional alternative splices in ESTs. Genome Res. 2002, 12 (12): 1837-1845. 10.1101/gr.764102.
- Kim N, Alekseyenko AV, Roy M, Lee C: The ASAP II database: analysis and comparative genomics of alternative splicing in 15 animal species. Nucleic Acids Res. 2007, 35 (Database issue): D93-98. 10.1093/nar/gkl884.
- Splicing Related Genes Database (SRGD). [http://www.plantgdb.org/SRGD]
- Medicago genome sequencing project. [http://www.medicago.org/genome/]
- Medicago Genome Sequence Release 1.0 white book. [http://www.medicago.org/genome/downloads/Mt1/Mt1.0.pdf]
- National Center for Biotechnology Information (NCBI). [http://www.ncbi.nlm.nih.gov/]
- NCBI Arabidopsis Genome Sequence FTP Site. [ftp://ftp.ncbi.nih.gov/genomes/Arabidopsis_thaliana/]
- TIGR Rice Genome Sequences Release 4.0 FTP site. [ftp://ftp.tigr.org/pub/data/Eukaryotic_Projects/o_sativa/annotation_dbs/pseudomolecules/version_4.0/]
- Dong Q, Lawrence CJ, Schlueter SD, Wilkerson MD, Kurtz S, Lushbough C, Brendel V: Comparative plant genomics resources at PlantGDB. Plant Physiol. 2005, 139 (2): 610-618. 10.1104/pp.104.059212.
- ASpipe project at SourceForge. [https://sourceforge.net/projects/aspipe/]
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.