- Research article
- Open Access
Assessing the contribution of alternative splicing to proteome diversity in Arabidopsis thaliana using proteomics data
© Severing et al; licensee BioMed Central Ltd. 2011
- Received: 6 February 2011
- Accepted: 16 May 2011
- Published: 16 May 2011
Large-scale analyses of genomics and transcriptomics data have revealed that alternative splicing (AS) substantially increases the complexity of the transcriptome in higher eukaryotes. However, the extent to which this complexity is reflected at the level of the proteome remains unclear. On the basis of a lack of conservation of AS between species, we previously concluded that AS does not frequently serve as a mechanism that enables the production of multiple functional proteins from a single gene. Following this conclusion, we hypothesized that the extent to which AS events contribute to the proteome diversity in Arabidopsis thaliana would be lower than expected on the basis of transcriptomics data. Here, we test this hypothesis by analyzing two large-scale proteomics datasets from Arabidopsis thaliana.
A total of only 60 AS events could be confirmed using the proteomics data. However, for about 60% of the loci that, based on transcriptomics data, were predicted to produce multiple protein isoforms through AS, no isoform-specific peptides were found. We therefore performed in silico AS detection experiments to assess how well AS events were represented in the experimental datasets. The results of these in silico experiments indicated that the low number of confirmed AS events was the consequence of a limited sampling depth rather than in vivo under-representation of AS events in these datasets.
Although the impact of AS on the functional properties of the proteome remains to be uncovered, the results of this study indicate that AS-induced diversity at the transcriptome level is also expressed at the proteome level.
- Alternative Splice
- Detection Experiment
- Parent Protein
- Equal Expression
- Proteome Level
Alternative splicing (AS) is a common phenomenon in higher eukaryotes that involves the production of multiple distinct mRNA molecules from a single gene. RNA-Seq surveys have shown that more than 90% of human and over 40% of Arabidopsis thaliana and rice genes are capable of producing multiple diverse mRNA molecules through AS [1–3]. A large fraction of AS events are predicted to result in transcripts that encode premature termination codons (see for instance [1, 4]) and that are likely to be degraded through the nonsense-mediated decay (NMD) pathway. Although it has been the subject of several genome-wide studies (e.g. [6–8]), the extent to which the remaining fraction of AS events contribute to the functional protein repertoires of eukaryotes remains relatively unknown.
We concluded in a previous genome-wide comparative analysis of AS in three plant species that AS does not substantially contribute to functional diversity of the proteome. Our conclusions were based on the limited conservation of AS events that can contribute to proteome diversity and the lack of conserved patterns that relate AS to gene function. Following this conclusion, it is conceivable that most AS events, in particular those that are not targeted towards NMD, result from noise in the splicing process and are not strongly manifested at the protein level. However, lack of conservation can also mean that many protein isoforms have a confined, species-specific function rather than no function at all. In this scenario, it might be expected that most AS events are also expressed at the protein level. Determining which of these two scenarios is the most likely has been a difficult task because the majority of genome-wide studies of AS have been performed using protein isoforms deduced from transcriptomics data. For most of these isoforms no evidence for their expression at the protein level was available.
The gap between the availability of transcriptomics and proteomics data is steadily being bridged by the advancing field of mass spectrometry-based proteomics. This technology, which can be used to characterize complex protein mixtures, is of great value for studying the impact of AS at the proteome level. Indeed, a number of studies have appeared that describe the use of proteomics data for the identification of protein polymorphisms that are the result of AS [10–12].
In this study we address the impact of AS on proteome diversity in the model species Arabidopsis thaliana by reanalyzing the data from two independent large-scale proteomics studies [13, 14]. Although AS was briefly addressed in these studies, their primary focus was on the confirmation and revision of existing gene structures and on the identification of new protein coding genes. The main objective of our study is to assess whether the predicted contribution of AS to the proteome diversity in A. thaliana, as based on transcriptomics data, is indeed observed at the proteome level.
We limited our study to those AS events that could be deduced from the annotated gene structures in the genome annotation database of A. thaliana version TAIR 10.0 (http://www.arabidopsis.org) and that are predicted to contribute to proteome diversity in this species. The absolute numbers of AS events that could be confirmed using the experimental peptide sets were by themselves not very indicative of the contribution of AS to the proteome diversity in A. thaliana, because these numbers depend on the depth of sampling in the experiments. We therefore performed in silico AS detection experiments using randomly generated peptide sets to assess the representativeness of the experimental sampling. This type of in silico experiment has previously been described and applied to Drosophila data.
We show that the outcome of the in silico experiments can lead to conflicting conclusions about the impact of AS on the proteome diversity, depending on the assumption that is used for generating the random peptide sets. We evaluate two such assumptions and show that, according to the biologically most realistic one, AS events were not under-represented in the analyzed proteomics sets. This implies that variation due to splicing is to a large extent expressed at the proteome level.
Throughout this study we used three experimental datasets, the first two of which, hereafter referred to as the Castellana and Baerenfaller sets, contain peptides from two large-scale proteomics experiments on A. thaliana [13, 14]. The third set, hereafter called the Merged set, was created by merging the Castellana and Baerenfaller sets into a non-redundant set. As it was essential for our study that each experimentally identified peptide could be reproduced by an in silico digestion of its parent protein, we only considered those peptides that met the following criteria: first, only one missed cleavage site (an internal lysine or arginine residue that was not used as a cleavage site by trypsin) was allowed per peptide. Second, only those peptides that could be mapped to their parent proteins according to a strict set of rules were considered (see Material and Methods).
The initial set of annotated A. thaliana proteins (TAIR 10.0) was also filtered by removing all proteins for which the exon/intron structure underlying their CDS region was not sufficiently supported by transcript data (see Material and Methods). The filtered protein set contained a total of 25,039 unique protein sequences derived from 21,136 nuclear-encoded, protein-coding TAIR 10.0 loci. Around 14.2% of the loci within the filtered protein set were predicted to produce distinct proteins through AS (hereafter called AS loci).
Table 1. Identification of nuclear encoded TAIR 10 loci. Column headers: Total number of ...; Nr. of mapped ...; % of peptides; Nr. of TAIR loci; % of TAIR loci.
We note that a large fraction of the peptides from both the Baerenfaller (~16%) and Castellana (~45%) sets could not be mapped to any protein using our stringent criteria. These were kept stringent to ensure reproducibility of mapping results in the in silico experiments.
AS detection results
Table 2. Experimentally confirmed AS events. Column headers: Identifiable AS events; AS loci w. confirmed ...; Number of confirmed ...
A total of 38 AS events, corresponding to 38 AS loci, were confirmed using the experimentally identified peptides from the Castellana set. Usage of the peptides from the Baerenfaller set resulted in the confirmation of 21 AS events from 21 AS loci (Table 2). Although more peptides from the Baerenfaller set could be mapped to their parent proteins than from the Castellana set, more AS events were confirmed using the latter set (Table 2). Comparison of the AS loci revealed that seven AS loci had confirmed AS events in both the Castellana and Baerenfaller sets. In total, 60 AS events corresponding to 59 AS loci were confirmed using the Merged set. These AS events represent ~2.9% of all AS events that could theoretically be confirmed using the Merged set. We note that for the Merged set the number of confirmed AS events was higher than the number of AS loci with confirmed AS events. This was due to a single AS locus that had more than one confirmed AS event. An overview of the annotations corresponding to the AS loci with confirmed AS events is provided in Additional file 2, Table S1.
Sampling of AS regions
Table 3. Sampling of AS events. Column headers: Nr. of sampled ...; % of identifiable ...; AS loci w. sampled ...; % of AS loci.
In silico AS detection experiments
The composition of the random peptide sets, and therefore also the AS detection outcome, depends on the pooling probabilities that are assigned to the individual peptides in the initial in silico peptide populations. These pooling probabilities simply reflect the relative abundances of the peptides within the initial populations (see Material and Methods). We used two different assumptions for assigning pooling probabilities to the individual peptides (Figure 1C). The first assumption, to which we refer as the "equal pooling probability" assumption, has previously been described by Tress and co-workers. Under this assumption, all peptides in the initial population are unique and therefore have the same probability of being pooled. Under the second assumption, hereafter referred to as the "equal expression" assumption, it was assumed that all genes were represented by equal numbers of protein molecules and that all isoforms of an AS locus were equally abundant in the protein sample. A consequence of this assumption was that the peptides within the initial populations were not equally abundant (Figure 1C).
Table 4. In silico AS detection experiments. Column headers: Number of experimentally confirmed AS events; Mean nr. of AS events ...; Mean nr. of AS events ...
A different picture emerged from the simulations performed using the "equal expression" assumption. In this case, the number of experimentally confirmed AS events in the Castellana set was around 1.9 times larger than the expected number of events (Table 4; Simulations B). In contrast, the number of experimentally confirmed AS events for the Baerenfaller set fell within just 1 SD of the mean number of events as determined by the in silico experiments. Finally, the number of experimentally confirmed events for the Merged set was one and a half times larger than the expected number of events. In summary, under the "equal expression" assumption the in silico experiments indicate that (i) AS events were not under-represented in the Baerenfaller set, and (ii) AS events were over-represented in both the Castellana set and the Merged set.
Genome-wide studies that address the impact of AS on proteome diversity have thus far mainly been performed using indirect evidence from transcriptomics data. Data that can be used to directly assess this impact is increasingly being provided by high-throughput proteomics experiments. Here we studied the impact of AS on proteome diversity in the model species Arabidopsis thaliana by reanalyzing data from two previous, large-scale proteomics studies [13, 14]. The main goal of our study was to determine whether the contribution of AS events to proteome diversity as predicted using transcriptomics data, is indeed observed at the proteome level.
The absolute numbers of AS events that could be confirmed using the experimentally identified peptides were not particularly high and only represented around 2 to 3% of identifiable AS events. Analysis of the representation of protein regions corresponding to the location of AS events that were sampled in the experiments showed that for roughly two thirds of AS loci no peptides were detected that could discriminate between the different protein isoforms. The absolute numbers of confirmed AS events per se are therefore not very indicative of the extent to which AS contributes to proteome diversity in A. thaliana.
We performed in silico AS detection experiments to determine how well AS events were represented in the biological samples, given the sampling depth achieved in the proteomics experiments. The in silico experiments should thus reveal whether the number of AS events identified using the experimental peptide sets significantly deviated from the expected number of AS events. The latter was calculated using an equally-sized random subset of in silico peptides pooled from an initial peptide population. This initial peptide population consisted of all peptides that theoretically could be obtained through digestion of the proteins (including isoforms resulting from AS) that were encoded by the loci expressed in the experimental samples.
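The comparison between an observed count and the simulated expectation can be sketched as follows. This is only an illustration: the function name `compare_to_simulation`, the use of the sample standard deviation, and the toy counts in the usage lines are assumptions introduced here, not values or code from the study.

```python
from statistics import mean, stdev


def compare_to_simulation(observed, simulated_counts):
    """Summarise how an observed AS-event count relates to simulated counts.

    Returns (mean, sd, ratio, z), where ratio = observed / mean and
    z = (observed - mean) / sd. The sample SD is an assumption of this sketch.
    """
    mu = mean(simulated_counts)
    sd = stdev(simulated_counts)
    ratio = observed / mu if mu > 0 else float("nan")
    z = (observed - mu) / sd if sd > 0 else float("nan")
    return mu, sd, ratio, z


# Toy, made-up counts from five hypothetical in silico runs.
toy_runs = [18, 22, 20, 19, 21]
mu, sd, ratio, z = compare_to_simulation(21, toy_runs)
within_one_sd = abs(z) <= 1  # the "fell within 1 SD" criterion used in the text
```

A ratio well above 1 would correspond to over-representation, and a |z| of at most 1 to the "within 1 SD" case described for the Baerenfaller set.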
One factor that critically influenced the outcome of these in silico experiments involved the pooling probabilities that were assigned to the individual peptides in the initial population. We performed the in silico experiments using two different pooling probability assumptions. The first, "equal pooling probability" assumption, indicated that AS events were under-represented in all experimental peptide sets. In a previous proteomics study performed on Drosophila data, the same "equal pooling probability" assumption was used for generating peptide samples and determining the number of expected AS events. The results in our study are comparable to those obtained for the Brunner set in that study.
The results of the in silico experiments were very different for the "equal expression" assumption. In this case, AS events were found to be over-represented in the Castellana and Merged sets, while for the Baerenfaller set, the number of experimentally identified AS events fell within 1 SD of the expected number of events. The observation that AS events were not under-represented in the experimental samples corresponds to the results of a recent study in which many AS transcript isoforms were shown to be actively translated.
The inconsistency between the conclusions obtained under the two pooling probability assumptions results from the fact that isoform-specific peptides associated with AS events have higher pooling probabilities under the "equal pooling probability" assumption than under the "equal expression" assumption. Under the first assumption, isoform-specific peptides and non isoform-specific peptides are equally abundant. In contrast, under the "equal expression" assumption, non isoform-specific peptides are more abundant than isoform-specific peptides (Figure 1C). This difference results in different pooling probabilities, in which the "equal pooling probability" assumption provides an upper bound to the expected number of AS events. The "equal expression" assumption, however, does not provide a corresponding lower bound, because it does not consider the relative expression levels between two or more AS isoforms. Indeed, the expected number of events would be lowered even further if unequal expression of isoforms were taken into account, which would strengthen the conclusion that AS events were not under-represented in the experimental peptide sets.
Although neither of the two pooling probability assumptions is truly realistic in a biological sense, the "equal expression" assumption arguably provides the better approximation. This follows from the fact that isoform-specific peptides are necessarily less abundant than non-isoform specific peptides. Using Figure 1 as illustration, this can be understood by considering the total amount of peptides produced from a single locus, whatever the relative expression level of the two underlying isoforms is: the amounts of the constitutive peptides p1 and p4 will be the same and will always equal the sum of p2+p3. Given this reasoning, the conclusion derived under the "equal expression" assumption, namely that AS is over-represented, or at least not under-represented in the experimental proteomics datasets, is the most plausible.
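The abundance argument can be written out for a single locus with two isoforms. The notation is introduced here for illustration only: since Figure 1 is not reproduced in this text, it is assumed that p1 and p4 are constitutive peptides present in both isoforms, that p2 and p3 are each specific to one isoform, and that a and b are the (arbitrary) numbers of molecules of the two isoforms, with n(p) the resulting number of copies of peptide p after digestion.

```latex
\begin{align}
  n(p_1) = n(p_4) &= a + b, \\
  n(p_2) = a, \qquad n(p_3) &= b, \\
  n(p_2) + n(p_3) &= a + b = n(p_1) = n(p_4), \\
  \max\bigl(n(p_2),\, n(p_3)\bigr) &\le n(p_1) \quad \text{for all } a, b \ge 0 .
\end{align}
```

Under these assumptions an isoform-specific peptide can never be more abundant than a constitutive peptide of the same locus, whatever the ratio a : b.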
A key factor that might explain the over-representation of AS events in the Castellana set compared to the Baerenfaller set involves the bias of AS events towards disordered regions of proteins in the former set. AS events located within disordered regions can introduce variations that have a limited impact on protein folding. Because cells have evolved mechanisms that can recognize and remove incorrectly folded proteins, AS events that have a limited impact on the protein structure are more likely to be viable and manifested at the protein level. In fact, it has recently been shown that pairs of AS isoforms, for which evidence was available that they were expressed, differed by polymorphisms that were more often located within disordered regions than expected.
One property of disordered regions is that they allow proteins to bind with multiple partners with high specificity and low affinity. AS events within such regions are interesting because they might play an important role in regulating protein-protein interactions.
We conclude that the low numbers of AS events that could be confirmed using the proteomics datasets for A. thaliana are the result of a relatively low depth of sampling in the proteomics experiments. In silico AS detection experiments, performed under the assumption of equal expression of isoforms, indicate that AS events were not under-represented in the experimental peptide sets. An important implication of this is that much or all of the AS variation in A. thaliana that is expressed at the transcriptome level and not degraded through the NMD pathway is also manifested at the proteome level. The true extent to which AS variants are functional, however, remains to be uncovered. Given that AS variation is not well conserved in plants, genome-wide expression of AS variation at the proteome level could point to the possibility that many of the AS events are associated with protein isoforms that either have a species-specific function or that are stable enough to escape rapid protein turnover.
Peptide sequences from the study performed by Baerenfaller and co-workers were obtained by querying the Pride database using the available BioMart interface. Peptide sequences from the study of Castellana and co-workers were downloaded from the webpage of the authors (site referenced in their publication). An additional peptide set was constructed by merging the Baerenfaller and Castellana peptide sets into a non-redundant set. Because trypsin was used for digesting proteins in both proteomics studies, peptides containing internal lysine (K) or arginine (R) residues that were not immediately followed by a proline (P) residue were considered to be the result of missed cleavage sites. All peptides that contained two or more missed cleavage sites were discarded.
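The missed-cleavage rule described above can be illustrated with a short sketch. The function names and the example peptides are hypothetical; only the rule itself (internal K or R not followed by P counts as a missed cleavage, and peptides with two or more are discarded) comes from the text.

```python
def count_missed_cleavages(peptide):
    """Count internal K/R residues that are not immediately followed by P."""
    missed = 0
    # The final residue is the peptide's own C-terminus, so it is not "internal".
    for i in range(len(peptide) - 1):
        if peptide[i] in "KR" and peptide[i + 1] != "P":
            missed += 1
    return missed


def keep_peptide(peptide, max_missed=1):
    """Keep peptides with at most `max_missed` missed cleavage sites."""
    return count_missed_cleavages(peptide) <= max_missed


def merge_non_redundant(*peptide_sets):
    """Merge several peptide collections into one set of unique sequences."""
    merged = set()
    for peptides in peptide_sets:
        merged.update(peptides)
    return merged


# Hypothetical peptides: the first has no missed cleavage (K is followed by P),
# the second has one, the third has two and would be discarded.
examples = ["AKPGR", "AKGGR", "AKGRGK"]
kept = [p for p in examples if keep_peptide(p)]
```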
The predicted proteome of Arabidopsis thaliana version TAIR 10 was downloaded from http://www.arabidopsis.org. The information within the "confidenceranking_exon"-file (ftp://ftp.arabidopsis.org/home/tair/Genes/TAIR10_genome_release/confidenceranking_exon) was used for filtering the proteome using the following criteria: (i) a protein encoded by a multi-exon gene was only kept if all splice junctions located within the corresponding CDS region were supported by transcript (mRNA) data, and (ii) a protein encoded by a single-exon gene was kept if at least 80% of the gene was supported by transcript data.
Mapping peptides against their parent proteins
Vmatch (http://www.vmatch.de/) was used for performing exact searches with the peptides against the filtered proteome of A. thaliana. All matches were subsequently filtered using the following criteria: (i) peptides that did not map to the C-terminus of their parent protein were required to have a K- or R- residue at their C-terminus; (ii) peptide matches were discarded if the corresponding region of the parent protein was not immediately preceded by a K- or R-residue, unless the peptide mapped to the N-terminus of the parent protein; (iii) peptide matches were discarded if the corresponding region of the parent protein was immediately followed by a P-residue. Finally, only those proteins were considered that had at least one mapped peptide which was unique for the locus from which the protein originated.
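The three match-filtering rules can be sketched as a single predicate. This is an illustration rather than the authors' implementation: the function names and the toy protein are hypothetical, exact matching is done with plain string search instead of Vmatch, and "N-terminus" is taken to mean position 0 of the annotated protein sequence (an assumption; initiator methionine removal is not considered).

```python
def is_valid_tryptic_match(protein, start, peptide):
    """Apply rules (i)-(iii) to a peptide matching `protein` at offset `start`."""
    end = start + len(peptide)
    if not peptide or protein[start:end] != peptide:
        return False
    at_n_term = start == 0
    at_c_term = end == len(protein)
    # (i) unless the peptide ends the protein, it must end in K or R
    if not at_c_term and peptide[-1] not in "KR":
        return False
    # (ii) unless the peptide starts the protein, it must be preceded by K or R
    if not at_n_term and protein[start - 1] not in "KR":
        return False
    # (iii) the residue following the matched region must not be P
    if not at_c_term and protein[end] == "P":
        return False
    return True


def valid_match_positions(protein, peptide):
    """Return all offsets where `peptide` matches exactly and passes the rules."""
    positions = []
    pos = protein.find(peptide)
    while pos != -1:
        if is_valid_tryptic_match(protein, pos, peptide):
            positions.append(pos)
        pos = protein.find(peptide, pos + 1)
    return positions


toy_protein = "MAGKLLDERPTSK"  # hypothetical sequence
```

The final requirement in the text (at least one mapped peptide unique to the locus) is a per-locus check across all proteins and is not covered by this sketch.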
Identification of AS events at the proteome level
AS events were deduced from the annotated gene structures using a previously described method. The identification of AS events at the proteome level was only performed with peptides that were unique for one or more, but not all of the protein isoforms of a locus. A schematic overview of the rules that were used for the identification of AS events at the proteome level is provided in Additional file 1, Figure S1.
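The selection of discriminating peptides ("unique for one or more, but not all" isoforms of a locus) can be sketched as below. The function name, the isoform identifiers and the peptide labels are hypothetical, and the sketch covers only this selection step, not the full rule set of Additional file 1, Figure S1.

```python
from typing import Dict, FrozenSet, Set


def isoform_specific_peptides(
    peptides_by_isoform: Dict[str, Set[str]]
) -> Dict[str, FrozenSet[str]]:
    """Map each discriminating peptide of one locus to the isoforms containing it.

    A peptide is discriminating when it occurs in at least one,
    but not all, of the locus's protein isoforms.
    """
    all_isoforms = set(peptides_by_isoform)
    membership: Dict[str, Set[str]] = {}
    for isoform, peptides in peptides_by_isoform.items():
        for peptide in peptides:
            membership.setdefault(peptide, set()).add(isoform)
    return {
        peptide: frozenset(isoforms)
        for peptide, isoforms in membership.items()
        if isoforms != all_isoforms
    }


# Hypothetical locus with two isoforms sharing peptides p1 and p4.
toy_locus = {
    "iso.1": {"p1", "p2", "p4"},
    "iso.2": {"p1", "p3", "p4"},
}
specific = isoform_specific_peptides(toy_locus)
```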
In silico generation of peptide fragments
Peptides were generated by performing an in silico trypsin digestion involving cleavage after K- and R- residues that were not followed by a P-residue. Only one missed cleavage site was allowed per peptide. All peptides with a mass outside the observed mass-range of the experimentally identified peptides (~523-5,399 Da and ~725-4,962 Da for the Castellana set and Baerenfaller set, respectively) were discarded.
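A minimal sketch of such an in silico digestion is given below. The function names and the toy protein are hypothetical. The residue masses are standard monoisotopic values; whether the mass range in the text refers to monoisotopic or average masses is not stated, so the mass filter is an assumption of this sketch.

```python
# Standard monoisotopic residue masses (Da); non-standard residues raise KeyError.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def cleave(protein):
    """Cut after every K or R that is not followed by P (complete digestion)."""
    fragments, start = [], 0
    for i, residue in enumerate(protein):
        following = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and following != "P":
            fragments.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        fragments.append(protein[start:])
    return fragments


def digest(protein, max_missed=1):
    """Return the peptides obtained when up to `max_missed` sites are skipped."""
    fragments = cleave(protein)
    peptides = set()
    for i in range(len(fragments)):
        last = min(i + max_missed, len(fragments) - 1)
        for j in range(i, last + 1):
            peptides.add("".join(fragments[i:j + 1]))
    return peptides


def mono_mass(peptide):
    """Monoisotopic mass of a peptide: residue masses plus one water."""
    return sum(MONO[residue] for residue in peptide) + WATER


def digest_in_mass_range(protein, low, high, max_missed=1):
    """Digest and keep only peptides whose mass lies within [low, high] Da."""
    return {p for p in digest(protein, max_missed) if low <= mono_mass(p) <= high}
```

For example, `digest_in_mass_range("MAGKLLDERPTSK", 523, 5399)` applies the approximate Castellana mass range quoted above to a hypothetical sequence and drops the short fragment `MAGK`.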
In silico AS detection experiments
The in silico AS detection experiments involved randomly pooling non-redundant peptide samples, equal in size to the experimental peptide samples, from an initial peptide population. This initial population only contained peptides that mapped to the protein products encoded by the loci which were expressed in the experimental samples. The probability of pooling a particular peptide depends on its abundance within the initial peptide population. The in silico detection experiments were performed using either one of the following two assumptions on the abundance of individual peptides within the initial peptide populations.
Under the first assumption, to which we refer as the "equal pooling probability" assumption, all in silico generated peptides are equally abundant and therefore have the same probability (1/N) of being pooled, which depends on the size of the initial peptide population (N). This pooling strategy, which has previously been described by Tress and co-workers, reflects a biological scenario in which individual proteins within an experimental sample are present in such numbers that subsequent digestion of the sample results in a population of equally abundant peptides.
Under the second assumption, to which we refer as the "equal expression" assumption, two basic rules are applied: (i) all genes are represented by equal amounts of protein molecules, and (ii) all protein isoforms from an AS locus are present in equal numbers. The abundance of each protein within the sample is therefore determined as follows. Let M be the number of protein isoforms produced by the alternatively spliced gene with the highest number of unique protein isoforms. In order for rule (i) to be fulfilled, each gene has to produce M protein molecules. The protein product from a constitutively spliced gene is therefore present M times within the entire protein sample. To fulfill rule (ii), the number of molecules that correspond to a particular protein isoform of an AS locus that produces X different protein isoforms equals M/X. As a consequence, each peptide originating from this specific protein isoform is also represented by M/X molecules in the total peptide mixture after digestion. When for simplicity each peptide within the final sample is considered to be unique (even when multiple exact sequence copies exist), its pooling probability equals its abundance divided by the total number of peptides within the initial peptide population.
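The two pooling schemes can be compared on a toy example. The sketch below is one possible reading of the description, not the authors' code: all names are hypothetical, peptides are keyed per locus, a peptide shared by several isoforms of a locus is given the summed abundance of those isoforms, and the non-redundant sample is drawn by sequential weighted sampling without replacement.

```python
import random
from fractions import Fraction


def unique_peptides(loci):
    """All distinct (locus, peptide) pairs in the initial population."""
    return {
        (locus, peptide)
        for locus, isoforms in loci.items()
        for peptides in isoforms.values()
        for peptide in peptides
    }


def equal_pooling_weights(loci):
    """Every distinct peptide has the same pooling probability 1/N."""
    keys = unique_peptides(loci)
    return {key: Fraction(1, len(keys)) for key in keys}


def equal_expression_weights(loci):
    """Each isoform of a locus with X isoforms contributes M/X copies per peptide."""
    m = max(len(isoforms) for isoforms in loci.values())
    abundance = {}
    for locus, isoforms in loci.items():
        copies = Fraction(m, len(isoforms))
        for peptides in isoforms.values():
            for peptide in peptides:
                key = (locus, peptide)
                abundance[key] = abundance.get(key, 0) + copies
    total = sum(abundance.values())
    return {key: value / total for key, value in abundance.items()}


def pool_sample(weights, n, rng):
    """Draw n distinct peptides, each draw proportional to the remaining weights."""
    if n > len(weights):
        raise ValueError("sample size exceeds the peptide population")
    remaining = dict(weights)
    sample = []
    for _ in range(n):
        keys = list(remaining)
        pick = rng.choices(keys, weights=[float(remaining[k]) for k in keys], k=1)[0]
        sample.append(pick)
        del remaining[pick]
    return sample


# Toy population: one AS locus with two isoforms and one constitutive locus.
toy_loci = {
    "locusA": {"A.1": {"p1", "p2", "p4"}, "A.2": {"p1", "p3", "p4"}},
    "locusB": {"B.1": {"q1", "q2"}},
}
```

In this toy population the isoform-specific peptide `p2` has pooling probability 1/6 under "equal pooling probability" but 1/10 under "equal expression", which mirrors why the first assumption yields the higher expected number of detected AS events.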
Prediction of disordered regions
Putative disordered regions were predicted using the FoldIndex method, which is based on an algorithm developed by Uversky and co-workers. In brief, the method uses hydrophobicity and net charge of protein sequence segments in order to distinguish disordered from ordered regions. By sliding over the protein sequences using a window of 51 AA and a step size of 1, disordered regions were identified as regions of at least five consecutive amino acid residues located in the centre of a window with a negative FoldIndex value.
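The sliding-window procedure can be sketched as follows. This is an illustrative re-implementation, not the FoldIndex software itself: it assumes the published FoldIndex score 2.785<H> - |<R>| - 1.151, with <H> the mean Kyte-Doolittle hydrophobicity rescaled to the range 0 to 1 and <R> the mean net charge, and it only scores residues for which a full 51-residue window fits. Function names and the toy sequence are hypothetical.

```python
# Kyte-Doolittle hydropathy values; non-standard residues raise KeyError.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def fold_index(segment):
    """FoldIndex-style score of a segment; negative values suggest disorder."""
    n = len(segment)
    hydrophobicity = sum((KD[aa] + 4.5) / 9.0 for aa in segment) / n
    net_charge = abs(sum((aa in "KR") - (aa in "DE") for aa in segment)) / n
    return 2.785 * hydrophobicity - net_charge - 1.151


def disordered_regions(protein, window=51, min_len=5):
    """Return half-open (start, end) index ranges of predicted disordered regions.

    A residue is scored by the window centred on it; runs of at least
    `min_len` consecutive negative-scoring residues are reported.
    """
    half = window // 2
    last_centre = len(protein) - half  # exclusive upper bound for window centres
    regions, run_start = [], None
    for centre in range(half, last_centre):
        negative = fold_index(protein[centre - half:centre + half + 1]) < 0
        if negative and run_start is None:
            run_start = centre
        elif not negative and run_start is not None:
            if centre - run_start >= min_len:
                regions.append((run_start, centre))
            run_start = None
    if run_start is not None and last_centre - run_start >= min_len:
        regions.append((run_start, last_centre))
    return regions


# Toy sequence: a hydrophobic half followed by a highly charged half.
toy = "L" * 60 + "E" * 60
toy_regions = disordered_regions(toy)
```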
This work was supported by the BioRange programme (SP 3.2.1) of the Netherlands Bioinformatics Centre (NBIC), which is supported through the Netherlands Genomics Initiative (NGI).
- Filichkin SA, Priest HD, Givan SA, Shen R, Bryant DW, Fox SE, Wong WK, Mockler TC: Genome-wide mapping of alternative splicing in Arabidopsis thaliana. Genome research. 2010, 20 (1): 45-58. 10.1101/gr.093302.109.
- Lu T, Lu G, Fan D, Zhu C, Li W, Zhao Q, Feng Q, Zhao Y, Guo Y, Huang X, et al: Function annotation of the rice transcriptome at single-nucleotide resolution by RNA-seq. Genome research. 2010, 20 (9): 1238-1249. 10.1101/gr.106120.110.
- Wang ET, Sandberg R, Luo S, Khrebtukova I, Zhang L, Mayr C, Kingsmore SF, Schroth GP, Burge CB: Alternative isoform regulation in human tissue transcriptomes. Nature. 2008, 456 (7221): 470-476. 10.1038/nature07509.
- Zhang G, Guo G, Hu X, Zhang Y, Li Q, Li R, Zhuang R, Lu Z, He Z, Fang X, et al: Deep RNA sequencing at single base-pair resolution reveals high complexity of the rice transcriptome. Genome research. 2010, 20 (5): 646-654. 10.1101/gr.100677.109.
- Lewis BP, Green RE, Brenner SE: Evidence for the widespread coupling of alternative splicing and nonsense-mediated mRNA decay in humans. Proceedings of the National Academy of Sciences of the United States of America. 2003, 100 (1): 189-192. 10.1073/pnas.0136770100.
- Melamud E, Moult J: Stochastic noise in splicing machinery. Nucleic acids research. 2009, 37 (14): 4873-4886. 10.1093/nar/gkp471.
- Severing EI, van Dijk AD, Stiekema WJ, van Ham RC: Comparative analysis indicates that alternative splicing in plants has a limited role in functional expansion of the proteome. BMC genomics. 2009, 10: 154. 10.1186/1471-2164-10-154.
- Tress ML, Martelli PL, Frankish A, Reeves GA, Wesselink JJ, Yeats C, Olason PL, Albrecht M, Hegyi H, Giorgetti A, et al: The implications of alternative splicing in the ENCODE protein complement. Proceedings of the National Academy of Sciences of the United States of America. 2007, 104 (13): 5495-5500. 10.1073/pnas.0700800104.
- Aebersold R, Mann M: Mass spectrometry-based proteomics. Nature. 2003, 422 (6928): 198-207. 10.1038/nature01511.
- Mo F, Hong X, Gao F, Du L, Wang J, Omenn GS, Lin B: A compatible exon-exon junction database for the identification of exon skipping events using tandem mass spectrum data. BMC bioinformatics. 2008, 9: 537. 10.1186/1471-2105-9-537.
- Tanner S, Shen Z, Ng J, Florea L, Guigo R, Briggs SP, Bafna V: Improving gene annotation using peptide mass spectrometry. Genome research. 2007, 17 (2): 231-239. 10.1101/gr.5646507.
- Tress ML, Bodenmiller B, Aebersold R, Valencia A: Proteomics studies confirm the presence of alternative protein isoforms on a large scale. Genome biology. 2008, 9 (11): R162. 10.1186/gb-2008-9-11-r162.
- Baerenfaller K, Grossmann J, Grobei MA, Hull R, Hirsch-Hoffmann M, Yalovsky S, Zimmermann P, Grossniklaus U, Gruissem W, Baginsky S: Genome-scale proteomics reveals Arabidopsis thaliana gene models and proteome dynamics. Science. 2008, 320 (5878): 938-941. 10.1126/science.1157956.
- Castellana NE, Payne SH, Shen Z, Stanke M, Bafna V, Briggs SP: Discovery and revision of Arabidopsis genes by proteogenomics. Proceedings of the National Academy of Sciences of the United States of America. 2008, 105 (52): 21034-21038. 10.1073/pnas.0811066106.
- Olsen JV, Ong SE, Mann M: Trypsin cleaves exclusively C-terminal to arginine and lysine residues. Mol Cell Proteomics. 2004, 3 (6): 608-614. 10.1074/mcp.T400003-MCP200.
- Jiao Y, Meyerowitz EM: Cell-type specific analysis of translating RNAs in developing flowers reveals new levels of control. Mol Syst Biol. 2010, 6: 419.
- Romero PR, Zaidi S, Fang YY, Uversky VN, Radivojac P, Oldfield CJ, Cortese MS, Sickmeier M, LeGall T, Obradovic Z, et al: Alternative splicing in concert with protein intrinsic disorder enables increased functional diversity in multicellular organisms. Proceedings of the National Academy of Sciences of the United States of America. 2006, 103 (22): 8390-8395. 10.1073/pnas.0507916103.
- Goldberg AL: Protein degradation and protection against misfolded or damaged proteins. Nature. 2003, 426 (6968): 895-899. 10.1038/nature02263.
- Hegyi H, Kalmar L, Horvath T, Tompa P: Verification of alternative splicing variants based on domain integrity, truncation length and intrinsic protein disorder. Nucleic acids research. 2010, 39 (4): 1208-19.
- Dunker AK, Oldfield CJ, Meng J, Romero P, Yang JY, Chen JW, Vacic V, Obradovic Z, Uversky VN: The unfoldomics decade: an update on intrinsically disordered proteins. BMC genomics. 2008, 9 (Suppl 2): S1. 10.1186/1471-2164-9-S2-S1.
- Jones P, Cote RG, Cho SY, Klie S, Martens L, Quinn AF, Thorneycroft D, Hermjakob H: PRIDE: new developments and new datasets. Nucleic acids research. 2008, 36 (Database issue): D878-883.
- Prilusky J, Felder CE, Zeev-Ben-Mordehai T, Rydberg EH, Man O, Beckmann JS, Silman I, Sussman JL: FoldIndex: a simple tool to predict whether a given protein sequence is intrinsically unfolded. Bioinformatics (Oxford, England). 2005, 21 (16): 3435-3438. 10.1093/bioinformatics/bti537.
- Uversky VN, Gillespie JR, Fink AL: Why are "natively unfolded" proteins unstructured under physiologic conditions?. Proteins. 2000, 41 (3): 415-427. 10.1002/1097-0134(20001115)41:3<415::AID-PROT130>3.0.CO;2-7.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.