Variation in numbers and chromosomal locations of rDNA
Variation in numbers and distribution patterns of rDNA loci among related species is commonly observed in many different plant groups, including Brassicaceae [10], Cyperaceae [11], Asteraceae [25, 26], Leguminosae [27], Pinus [28], and Rosaceae [14]. Plants typically show some degree of conservatism of rDNA repeat duplication, such that when multiple loci do appear, species are commonly polyploid relatives of diploids. There is no evidence, however, for polyploidy in Paphiopedilum, where the only chromosome number differences are aneuploid, in a series reflective of centric fission or fusion.
In general, FISH patterns of 25S rDNA loci are reported to be more polymorphic than those of the 5S rDNA [12–14, 26, 28–32]. Conversely, in all sections of Paphiopedilum, except for Parvisepalum and Concoloria, 5S rDNA sites showed much more variability both in number and physical location than did 25S rDNA sites.
The most parsimonious ancestral number of 25S rDNA sites in Paphiopedilum is two, based on outgroup comparison to the genera Mexipedium and Phragmipedium (unpublished results; [3, 22]). Duplication of 25S rDNA sites was observed in only three of the seven sections of Paphiopedilum: Parvisepalum (2n = 26), Coryopedilum (2n = 26) and Pardalopetalum (2n = 26) (Table 1). The physical positions of 25S rDNA loci are relatively conservative. In most Paphiopedilum species we analyzed, 25S rDNA signals are located in terminal chromosome positions. Variation was observed in only three species, Paphiopedilum adductum, P. randsii and P. lowii, which showed 1-4 subtelomeric 25S rDNA signals (Figures 6E, F and 7A, respectively). The ancestral number of 5S rDNA sites, again by outgroup comparison, is two (unpublished results from Mexipedium and Phragmipedium), and this number is observed only in sections Parvisepalum and Concoloria. Massive duplication and amplification of 5S rDNA loci, leading to large-scale polymorphism of numbers, sizes and physical positions of signals, was found to be prevalent in the remaining five sections. The numbers and distribution of rDNA loci vary widely among plants; however, usually less than one-third of chromosomes display either 45S rDNA or 5S rDNA [13]. It is therefore noteworthy that in some lineages of Paphiopedilum, up to 24 of the 26 chromosomes bear at least one rDNA locus, and a single chromosome can bear up to five major 5S rDNA loci.
Apparently, there is no strong correlation between the increase in the number of rDNA sites and the increase in the number of chromosomes or genome size. A similar situation has also been described in many other diploid species, e.g. the diploid lineage of Brassicaceae [10], Cyperaceae [11, 12], Iris [13], and Rosaceae [14]. The massive duplication of rDNA loci in Paphiopedilum sections Cochlopetalum, Paphiopedilum, Coryopedilum and Pardalopetalum could partly contribute to the increase of genome size. Perhaps paradoxically, species with the smallest (P. exul; section Paphiopedilum) and largest (P. dianthum; section Pardalopetalum) haploid genome sizes are both members of groups that show considerable 25S and 5S locus duplication in our FISH experiments. These two species differ more than two-fold in genome size (16.1 and 35.1 Mb, respectively) [33]. If we assume that the number of distinct genes among Paphiopedilum species is roughly constant, this would suggest that genome size increase is primarily due to repetitive element amplification. However, since rDNA duplication is associated with both smaller and larger genomes in the genus, size differences may be more logically traceable to other repetitive DNAs, such as mobile elements. In contrast, a possible tendency for elimination of rDNA loci was found in section Barbata, which has the greatest average genome sizes and chromosome numbers [4]. The number of 25S rDNA loci in Barbata remains two in all the species we studied, while the distribution pattern of 5S rDNA is less dispersed than in its sister group, Coryopedilum plus Pardalopetalum. Due to the derived phylogenetic position of section Barbata (Figure 1), it is most parsimonious to conclude that unique chromosomal conditions seen in the group would be similarly derived (autapomorphic). As such, centric fission in Barbata appears to be associated with loss of rDNA loci, while in other systems, centric fission has led to rDNA gains [34].
Elimination of rDNA loci during chromosomal evolution has been documented in, e.g., Brassicaceae and Rosaceae [10, 14]. The mechanism that accounts for such loss of rDNA loci, however, remains unclear. A presumed evolutionary loss of abundant terminal nucleolar organizing regions (NORs) in Arabidopsis has been hypothesized to be the consequence of an ancient fusion event [35]. In the case of section Barbata, additional traceable chromosome markers are needed to provide further evidence that chromosomal rearrangements are related to rDNA loss.
A combination of different mechanisms causes high mobility of rDNA
Different mechanisms have been postulated to account for the mobility and polymorphism of numbers, sizes and positions of rDNA sites, such as transposon-mediated transpositional events [36–38], and chromosome rearrangements (translocation, inversion, duplication, deletion) caused by homologous or non-homologous unequal crossing-over and gene conversion [9, 28, 30, 36]. These processes could act alone or in combination, and they do not necessarily imply changes in overall chromosome morphology [31, 34].
The great degree of 5S repeat dispersion seen in sections Cochlopetalum, Paphiopedilum, Coryopedilum and Pardalopetalum has, to our knowledge, only been observed in the monocots Alstroemeria, Tulipa, and Iris [13, 39, 40]. The original seeding of rDNA repeats to ectopic locations in the genome could be the result of transposable element activity or perhaps incorporation of array segments into breakpoints as part of non-homologous end joining during DNA repair. Indeed, some of the signals we observed may be pseudogenes transported within the genomes by retroelements, therefore leading to the false interpretation that we are visualizing entire and active rDNA arrays. Both subtelomeric and pericentromeric regions are well known as hot spots of breakpoints and are also enriched for TEs [5, 6]. Considering the abundant minor loci we observed in these regions, a contribution of transpositions to the dispersed distribution pattern is tenable, and TEs containing 5S rDNA-derived sequences have in fact been observed in many plants [41] and animals [42]. It is nonetheless possible that due to the similarity of rDNA arrays, chromosomal rearrangement could be induced via heterologous recombination, and in turn, rearrangement could generate repeated sequences through unequal crossovers. After generation of a novel locus, in situ amplification cycles via rearrangement could lead to the origin of FISH-detectable loci. Furthermore, hemizygous 5S rDNA sites have been widely observed in many Paphiopedilum species. A double-strand break occurring in a hemizygous region would increase the probability of causing other rearrangements, owing to the absence of a homologous template for its repair [5]. The lack of dispersed repeats in the basalmost section Parvisepalum may reflect either a lack of seeding events or slow amplification processes that do not yield hybridization-visible arrays.
However, in the case of 5S rDNA, there is in fact strong evidence for NTS sequence diversity, which could be accounted for either by the presence of small loci below the FISH detection limit or perhaps by considerable within-array diversity. One future experimental approach to determine whether considerable intra-array diversity indeed exists would be to perform FISH using 5S-NTS-specific probes.
Diversification of 25S rDNA distribution patterns is also observed in Paphiopedilum, but the numbers of loci and degree of dispersion are much lower than for 5S rDNA. Therefore, 5S rDNA might be more frequently seeded by TEs via transpositional events, or amplification or maintenance of 5S rDNA loci via rearrangement could be more effective and better tolerated during chromosome evolution. The different evolutionary tendencies of 25S and 5S rDNA might be caused by their function and sequence divergence or by their localization in distinct nuclear compartments [43].
5S-NTS sequences highlight interlocus and intralocus diversity and weak concerted evolutionary forces
Previous studies of other angiosperm species have suggested that intralocus 5S rDNA diversity occurs. Within-array 5S rDNA diversity appears very likely in Paphiopedilum as well, since many species (e.g., all Parvisepalum and Concoloria) have only one observable 5S locus. For example, 6 species of section Parvisepalum are represented in our phylogenetic analysis by 6-8 distinct sequence variants. These 5S-NTS variants can be concluded to occur within at least partial arrays, pseudogenized or not, since the amplified pieces include sections of 5S rDNA at their 5' and 3' ends. Recent within-species duplication events may be indicated by single-species clades of 5S-NTS sequences, such as those of P. dayanum, P. lowii, and P. sangii, but these could just as well indicate within-array variation, as single-species clades of Parvisepalum (e.g., P. malipoense) and Concoloria (P. bellatulum) most likely do. In many cases, it can be readily seen that duplication of 5S loci has occurred prior to speciation, for example, within Coryopedilum (a large group of sequences representing P. sanderianum, P. stonei, and P. supardii; similarly also within a group of P. adductum and P. randsii sequences). In some cases, ancient duplications must be much older than the major phylogenetic groups of Paphiopedilum, since, for example, P. delenatii shares sequence variants similar to other Parvisepalum species yet has at least one other variant that is more similar to sequences from all other sections. We investigated the possibility of contamination regarding this finding, but discovered similar repeats across 8 distinct P. delenatii accessions (results not shown). Another explanation for multispecies clades, e.g., within well-defined groups such as Parvisepalum, could be ancient hybridization.
We observed that increasing within-species 5S NTS sequence diversity correlates with increasing minimum numbers of visible 5S rDNA loci in Paphiopedilum (Figure 9); therefore we infer that interlocus concerted evolution is weak within the genus. Our conclusion concurs with previous findings in many plant genera, such as Gossypium [17], Triticum [18], Chenopodium [19], Nicotiana [20] and Pinus [27]. So far, to our knowledge, noticeable interlocus concerted evolution of 5S rDNA arrays has not been demonstrated in plants.
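The relationship summarized in Figure 9 can be expressed as a correlation coefficient. The text does not state which statistic was used, so the following is only a minimal sketch that assumes Pearson's r; `pearson_r` is an illustrative helper, and the paired values are hypothetical placeholders, not data from this study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# HYPOTHETICAL per-species values (not from the study):
# minimum number of visible 5S rDNA loci, and polymorphic 5S-NTS sites.
min_visible_loci = [1, 1, 2, 2, 4, 6, 9]
polymorphic_sites = [50, 70, 100, 120, 160, 200, 250]

print(f"r = {pearson_r(min_visible_loci, polymorphic_sites):.2f}")
```

A positive r close to 1 would correspond to the trend described above; a rank-based coefficient could be substituted if the relationship is monotonic but not linear.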
The best supported hypothesis to explain weak homogenization forces on 5S rDNA arrays is that the chromosomal location of rDNA arrays has a substantial impact on interlocus concerted evolution [17, 20, 44–47]. Arrays located in subtelomeric regions are thought to undergo stronger interlocus homogenization forces than ones located in proximal regions. Potential evidence was observed in section Barbata, in which all six species studied possess two 5S loci. These six species can be categorized into two groups according to the locations of 5S loci. One group harboring proximal 5S loci includes P. wardii, P. dayanum, P. venustum and P. acmodontum, while the other group harboring subtelomeric 5S loci includes P. purpuratum and P. sangii (Figure 8; Table 1). Considering that all six species are closely related and possess the same number of loci, it can be logically assumed that the difference in sequence polymorphism between the two groups is caused by the different locations of the 5S loci. The fact that the average number of polymorphic sites in the proximal-loci group (151.5) is 18% higher than that in the subtelomeric-loci group (128) suggests that proximally located loci are less homogenized than subtelomerically located loci.
Additionally, we found that not only interlocus but also intralocus concerted evolution is influenced by chromosomal localization. In section Concoloria, two closely related species, P. bellatulum and P. niveum, both have one 5S locus, but with different localizations. The difference in sequence polymorphism between the two species may be caused by the different locations of the 5S loci. P. niveum, which has a pericentromeric locus, showed 1.68-fold more polymorphic sites than P. bellatulum, which has a subtelomeric locus (Table 1).
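Both comparisons above are simple ratios of polymorphic-site counts. As a minimal sketch (the helper names `fold_change` and `percent_more` are illustrative and not part of the study), the 18% figure for section Barbata can be reproduced from the two reported group means, and a fold value such as 1.68 is the same ratio before conversion to a percentage.

```python
def fold_change(value: float, reference: float) -> float:
    """Return how many times larger `value` is than `reference`."""
    if reference == 0:
        raise ValueError("reference must be non-zero")
    return value / reference


def percent_more(value: float, reference: float) -> float:
    """Return the excess of `value` over `reference` as a percentage."""
    return (fold_change(value, reference) - 1.0) * 100.0


# Group means reported in the text for section Barbata.
proximal_mean = 151.5       # proximal-loci group
subtelomeric_mean = 128.0   # subtelomeric-loci group

excess = percent_more(proximal_mean, subtelomeric_mean)
print(f"{excess:.1f}% more polymorphic sites")  # 18.4%, reported as 18%
```

Expressed the same way, the 1.68-fold difference between P. niveum and P. bellatulum corresponds to roughly 68% more polymorphic sites, a considerably larger contrast than the one seen between the two Barbata groups.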
It is well known that meiotic homologous recombination is largely suppressed in pericentromeric and centromeric regions. Unequal crossovers between sister chromatids and gene conversion documented in the centromeres of many organisms have been postulated as the major homogenization force for tandem repeats located in these areas [48–52]. A plausible explanation for weaker homogenization near centromeres has been proposed previously: if unequal crossover events between rDNAs of two chromosomes occurred in the region proximal to centromeres, this may result in the exchange of not only a fraction of the rDNA but also the centromeres themselves. Such an event is likely to have significantly greater negative consequences for the organism than an event in the subtelomeric region, which might result in exchange of telomeres [17, 47]; loss of centromeres would prohibit cell division, whereas loss of telomeres might not restrict mitosis or meiosis. As such, centromerically located rDNA arrays are expected to show weaker homogenization forces, since fewer individuals with unequal crossovers in this region are expected to survive. In contrast, the subtelomeric region is characterized by a higher rate of interchromosomal exchange [5], thus stronger concerted evolutionary forces could be expected in this region.
All species of section Parvisepalum, as with P. bellatulum, have subtelomeric 5S loci, some of which are closely linked with 25S loci. If 5S localization correlates significantly with homogenization, as with 25S, which is always telomeric-subtelomerically located, we should expect subtelomeric 5S repeats to show decreased sequence diversity due to stronger concerted evolutionary forces. However, this is not the case, since variation in the number of polymorphic sites is not significantly different by section (with or without Pardalopetalum included in Coryopedilum; single factor ANOVA P = 0.06 and 0.1, respectively). We therefore infer that localization of the 5S rDNA arrays only partially contributes to the weak concerted evolution observed in Paphiopedilum.
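The section-level comparison above is a single-factor (one-way) ANOVA on per-species counts of polymorphic sites. The sketch below shows how the F statistic for such a test is assembled from between-group and within-group sums of squares. The section names follow the text, but every count is a hypothetical placeholder rather than the study's data, and `one_way_anova_f` is an illustrative name. Converting F to a P value additionally requires the F distribution with (k - 1, N - k) degrees of freedom, which a statistics package would normally supply.

```python
from statistics import fmean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or any(len(g) == 0 for g in groups) or n_total <= k:
        raise ValueError("need >= 2 non-empty groups and more values than groups")
    grand_mean = fmean([x for g in groups for x in g])
    # Between-group variation: group means around the grand mean.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Within-group variation: observations around their own group mean.
    ss_within = sum((x - fmean(g)) ** 2 for g in groups for x in g)
    if ss_within == 0:
        raise ValueError("no within-group variation; F is undefined")
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# HYPOTHETICAL polymorphic-site counts per species, grouped by section.
counts_by_section = {
    "Parvisepalum": [100, 120, 140],
    "Concoloria": [90, 150],
    "Barbata": [130, 110, 160, 150],
}

f_stat, df_b, df_w = one_way_anova_f(list(counts_by_section.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A small F relative to the critical value for these degrees of freedom corresponds to the non-significant outcome reported above, in which sections do not differ detectably in their numbers of polymorphic sites.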
There are several other hypothesized mechanisms that could lead to the weak concerted evolutionary force on 5S rDNA arrays. For example, ongoing chromosomal rearrangement such as insertion, deletion, or transposition could occur within arrays too frequently for interlocus concerted evolution to be effective. Another possibility is that concerted evolutionary processes homogenize 5S rDNA arrays at rates lower than the rate of speciation, thus novel mutations cannot be fixed or removed and high levels of intralocus polymorphism are expected within arrays [17]. Additionally, the base composition and secondary structure of rDNA sequences may also affect the rate of concerted evolution [53]. It is unknown whether weak concerted evolutionary forces are shared by other Paphiopedilum tandem repeats, or if this is characteristic of 5S rDNA arrays only. This issue can be elucidated by further studies on other tandem repeats, such as 25S rDNA arrays.