Earlier Oryza chloroplast phylogenies examined the diversity of chloroplasts from divergent species across the genus [6]. The range of wild rice chloroplast genome sequences in samples collected in Asia may be due to both divergence in the wild and gene flow from domesticated to wild populations, which makes the basis of relationships difficult to interpret. Rices with chloroplasts very similar to those in the domesticated gene pool may be found in Asian collections [5], but the direction of gene flow may be difficult to establish without detailed population studies, as revealed in a recent study of wild and domesticated barley [13]. The current study expands specifically on the relationships within the clade that includes Asian domesticated rices. The current phylogeny includes far more samples (more than 3000) than any previous study and is therefore likely to be more accurate [14]. It includes dramatically more domesticated rices, and more wild rices, than earlier analyses.
Recent analysis of the diversity of domesticated rice based upon the nuclear genome suggests three separate domestications, of the japonica, indica and aus types [15].
The extensive analysis of domesticated rices in this study does not refute this but shows that domesticated rice genotypes can also all be grouped into two main clades. This is strongly suggestive of domestication of two distinct clades of the cytoplasmic genome and supports the concept of multiple origins contributing to a single domestication [16]. These two clades (A and B) may be in part associated with the distinct origins of japonica and indica rice. However, the two chloroplast clades generated by neighbour joining and subsequently distinguished by unique polymorphisms (Table 3) each contained a wide range of rice types (Table 2). Clade A included mainly indica rice. Most japonica rices were in clade B, with distinct chloroplast sub-clades for basmati, tropical japonica and temperate japonica. Despite these general groupings, most groups were mixed. Indica rices were predominant in clade A but were also widely distributed across clade B, including all sub-clades. Clade A had more aus types, but many were also present in clade B, confirming the earlier report of aus rice types with both main types of chloroplast [5]. These results demonstrate that cytoplasmic genomes and nuclear genomes have been widely recombined to generate the current domesticated rice gene pool. Rice types are likely to be overwhelmingly determined by the nuclear genome, but the chloroplast genome may often have originated from another rice type. It is not clear to what extent this is a product of genetic recombination in the breeding of modern varieties or of events during or following domestication.
Domesticated rice appears to have common domestication loci [15] but to be derived from more than one ancestral population. One model for rice domestication suggests domestication of japonica and possibly indica from distinct wild populations corresponding to O. rufipogon and O. nivara respectively. This would need to have been followed by some level of introgression to generate the two distinct types of cultivated rice with common alleles at domestication loci. The crosses between these two populations could have had either parent as the maternal parent, allowing domesticated populations with both chloroplast types to evolve. This potential is illustrated by recent evidence for wild populations generating hybrids by crossing in both directions [17]. The existence of reproductive barriers [17] may result in unidirectional pollen flow and rapid nuclear genome replacement or chloroplast capture. The implications of the chloroplast genome type for plant performance are not known but may be significant.
Analysis of the domestication of crop plants is complicated by the potential for wild populations to originate from domesticated plants that develop traits that allow them to perform well outside cultivation. Gene flow from domesticated to wild populations can further complicate analysis. Early domesticates like rice have had a longer period over which these processes may have occurred. Barley, possibly the first plant domesticated, was domesticated in the Fertile Crescent. The presence of distinct wild barley populations in Tibet suggested the possibility of a second independent domestication. However, recent genome analysis showed that the wild barley in Tibet may have been derived from domesticated barley introduced to the region by humans in the last 4000 years [13].
In the case of rice, close relationships between wild and domesticated rice have been used as evidence for the primary site of domestication [18]. However, gene flow from domesticated to wild populations may also explain a close relationship. The two distinct chloroplast genomes found in the Oryza sativa gene pool are only easily explained by the domestication of two distinct maternal genomes. The nuclear genome domestication history is more difficult to define, but the two main types, indica and japonica, are likely products of separate domestications. Human movement of rice and modern plant breeding have resulted in the widespread recombination of nuclear and maternal genomes reported here. Domestication loci may now be shared across the modern O. sativa gene pool despite at least two separate primary domestications that have captured two distinct maternal gene pools. Indica and japonica types have long been recognized [1]. A third group, aus, has been explained by a third possible domestication. The chloroplast evidence presented here shows that aus genotypes all have one of the two main types of chloroplast genome and do not represent a separate maternal domestication.
The chloroplast and nuclear genome phylogenies show significant discordance across Oryza.
O. longistaminata in Africa and O. glumaepatula in South America have similar chloroplast genomes [4] despite their more distant nuclear genome sequences. O. glumaepatula is part of the A genome clade of close relatives of domesticated rice but has a divergent chloroplast genome, the most divergent among the A genome species. However, nuclear genome analysis places the diverse O. meridionalis [19, 20] from Australia as the most divergent in the A genome clade [9].
Another example is provided by the Australian populations with morphological similarity to O. rufipogon in Asia. This taxon has a chloroplast genome that is close to that of wild rices in Australia [21] but a nuclear genome that groups it with the Asian wild rices [9]. The general conclusion from this analysis is that chloroplast transfer between closely related Oryza taxa has been widespread. The recent discovery [22] of some wild populations of hybrids between taxa indicates that this is an ongoing evolutionary process in Oryza.
The two maternal lineages found in domesticated rice are very distinct (Table 3), suggesting a long period of divergence of these two ancestral types. The presence of functional polymorphisms that alter the encoded amino acid is significant, especially given the highly conserved nature of the chloroplast genome. Divergence times of more than half a million years have been suggested for these chloroplasts [6].
The annotation tool GeSeq allowed the detection of more genes than earlier tools that have been widely used for chloroplast annotation. Some genes of unknown function (hypothetical proteins) are now correctly annotated (Supplementary files 1 and 2). Additional tRNA genes were revealed in the chloroplast genome. GeSeq has significant advantages over the classic annotation procedure, as it uses more de novo predictors and searches by profile hidden Markov model (HMM). In addition, this tool has identified more introns in chloroplast genes [12].
Functional adaptation of Oryza chloroplasts has been suggested. The photosynthesis gene psbB, which was found in this study to be polymorphic in domesticated rice, has been implicated in adaptation to shade and sun in Oryza species growing in different habitats [23]. The other chloroplast genes that show differences that would result in changes in protein sequence are associated with gene expression: two changes in an RNA polymerase gene (rpoC2) and a difference in a ribosomal protein gene (rpl20). The two chloroplast clades also show a consistent difference in a tRNA sequence. The presence of an alanine/valine polymorphism in the psbB gene suggests the possibility of parallel adaptation in the two chloroplast lineages. The impact of these differences on rice performance in different environments is worth careful evaluation, as may be the use of chloroplast genomes from wild relatives [24]. The potential for genetic improvement of rice by selection of the maternal genome requires careful analysis. This would be facilitated by the availability of rice genotypes with near identical nuclear genomes and divergent maternal genomes.
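As an illustration of how a coding polymorphism such as the alanine/valine difference discussed above can be classed as altering the encoded amino acid, the following minimal sketch compares two codons using the standard amino-acid assignments (which plastid translation table 11 shares with the standard code). The function name and the example codons are assumptions made for illustration; the actual psbB codons are not given here, and this is not the pipeline used in this study.

```python
from itertools import product

# Standard amino-acid assignments, listed in TCAG codon order.
# '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(ref_codon, alt_codon):
    """Return (ref_aa, alt_aa, effect) for two codons (hypothetical helper).

    effect is 'synonymous' when both codons encode the same amino acid
    and 'nonsynonymous' otherwise.
    """
    # Accept lower case and RNA-style input.
    ref = ref_codon.upper().replace("U", "T")
    alt = alt_codon.upper().replace("U", "T")
    ref_aa = CODON_TABLE[ref]
    alt_aa = CODON_TABLE[alt]
    effect = "synonymous" if ref_aa == alt_aa else "nonsynonymous"
    return ref_aa, alt_aa, effect


# Hypothetical example codons: GCT (alanine) versus GTT (valine).
print(classify_substitution("GCT", "GTT"))  # ('A', 'V', 'nonsynonymous')
```

A single C/T difference at the second codon position is enough to change alanine to valine, whereas a change at the third position of the same codon (for example GCT to GCC) would be synonymous.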