Camelina sativa is a re-emerging oilseed with tremendous potential as an alternative biofuel crop, and genomic information for it is becoming increasingly available. We have obtained molecular data for nine genes, characterized in detail two genes encoding fatty acid biosynthesis enzymes and, in the process, discovered unexpected complexity in the C. sativa genome.
The close relationship between C. sativa and the model plant Arabidopsis thaliana [3, 4] facilitates the manipulation of known pathways, such as the one regulating fatty acid biosynthesis. C. sativa seed oil is high in both polyunsaturated and long-chain fatty acids [5, 60, 61], suggesting that both CsFAD2 and CsFAE1 are present and active. Three copies each of the FAD2 and FAE1 genes were isolated from an agronomic accession of C. sativa using primers designed from A. thaliana or Crambe abyssinica sequences. Previously identified conserved sites in CsFAD2 [44–46] and CsFAE1 [49, 50, 62] are present in all three copies of each gene, and a 5' intron shown to be important in regulating FAD2 expression in sesame was identified in all three CsFAD2 copies. Real-time qPCR data and Sequenom MassARRAY SNP analysis of the CsFAD2 and CsFAE1 cDNA showed that all three copies of each gene are expressed in developing seeds. Thus, it seems likely that all three copies of FAD2 and FAE1 in C. sativa are functional.
The cloning of three copies of FAD2 and FAE1 from the C. sativa genome, together with the observation of three LFY hybridization signals by Southern analysis and of three expressed haplotypes for six more predicted single-copy genes in developing seeds, could be explained by at least two scenarios: segmental duplications of selected regions within a diploid genome, through either tandem duplications or transpositions, or whole genome duplications resulting from polyploidization. Segmental duplications or transpositions affecting all nine examined loci are far less probable than polyploidy. Furthermore, no evidence of recent segmental duplication involving multiple genes has been observed in sequenced plant genomes [36, 63–65].
Triplication of the C. sativa genome therefore likely occurred through whole genome duplication, by either autopolyploidization or allopolyploidization. An autopolyploidy event might have triplicated a single diploid genome, resulting in an autohexaploid with a haploid genome of 18, 21, or 24 chromosomes. Given that C. sativa has a chromosome count of n = 20, chromosome splitting or fusion would then have had to increase the chromosome number from 18 to 20, or decrease it from 21 or 24 to 20.
Alternatively, triplication of the C. sativa genome might have resulted from two allopolyploidy events, producing first a tetraploid and then a hexaploid, similar to the origin of cultivated wheat. According to this hypothesis, the three copies of each gene diverged in different diploid genomes before converging through polyploidy events. Taking into consideration the reported chromosome counts of various Camelina species, the basal chromosome numbers of the diploid parental species contributing to the C. sativa haploid genome of 20 chromosomes could be 7+7+6 or 8+6+6. The allopolyploid hypothesis is supported by the observation that C. sativa demonstrates diploid inheritance [2, 66], as would be expected for an allopolyploid. A hexaploid C. sativa could also be derived from the combination of an autotetraploid and a diploid species if, in the autopolyploidized genome, homologous chromosomes differentiated so that the subsequent chromosome-specific pairing mimicked an allopolyploid genome in its diploid inheritance patterns. Regardless of its evolutionary path, the C. sativa genome appears to be organized in three redundant and differentiated copies and can be formally considered an allohexaploid.
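The chromosome arithmetic behind the two hypotheses can be checked with a short sketch. It assumes candidate base chromosome numbers of 6, 7, and 8, which are inferred here from the autohexaploid counts of 18, 21, and 24 and from the combinations quoted above; the function names are hypothetical and serve illustration only.

```python
from itertools import combinations_with_replacement

# Assumption: candidate base chromosome numbers inferred from the counts
# discussed in the text (18 = 3*6, 21 = 3*7, 24 = 3*8).
BASE_NUMBERS = (8, 7, 6)
TARGET_N = 20  # haploid chromosome count reported for C. sativa


def autohexaploid_counts(bases=BASE_NUMBERS):
    """Haploid count expected if a single base genome were triplicated."""
    return {x: 3 * x for x in bases}


def allohexaploid_combinations(target=TARGET_N, bases=BASE_NUMBERS):
    """Unordered triples of base numbers whose sum equals the target count."""
    return [
        combo
        for combo in combinations_with_replacement(bases, 3)
        if sum(combo) == target
    ]


if __name__ == "__main__":
    for base, count in autohexaploid_counts().items():
        # A nonzero difference implies chromosome splitting or fusion.
        print(f"x = {base}: 3x = {count}, change needed = {TARGET_N - count:+d}")
    print(allohexaploid_combinations())
```

Under these assumptions, no triplicated base number gives 20 directly, whereas only the 8+6+6 and 7+7+6 combinations sum to 20 without invoking further rearrangement.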
Results from our phylogenetic analyses support a history of duplication for both FAD2 and FAE1 in Camelina. For FAD2, duplications were recovered only for C. sativa, C. microcarpa, and C. rumelica. These data are consistent with genome size data, which indicate that all three genomes are larger than those of C. laxa and C. hispida, from which only a single FAD2 copy was recovered. Taken together, the results suggest that C. sativa, C. microcarpa, and C. rumelica are likely polyploids. Given the slightly smaller genome size of C. rumelica, and the fact that we recovered only two FAD2 copies from it, the C. rumelica sampled may be tetraploid while C. sativa and C. microcarpa are hexaploid. Interestingly, in both the FAD2 and FAE1 trees, one copy each from C. rumelica and C. microcarpa are strongly supported as sisters. Thus, trees from these genes indicate that C. rumelica and C. microcarpa are closely related. The varied placement of the C. microcarpa FAD2 and FAE1 copies can be explained if C. microcarpa is the result of a hybridization event between C. rumelica and a currently unsampled, and thus unidentified, species of Camelina. Two of the three copies of both FAD2 and FAE1 are identical, or nearly identical, in C. sativa and C. microcarpa, suggesting that C. sativa and C. microcarpa share a parental genome. Thus, we suggest that a Camelina species we did not sample contributed its genome to the hybrid formation of both C. sativa and C. microcarpa. In the case of C. microcarpa, the hybridization event likely involved C. rumelica. Given the chromosome count of n = 6 for C. rumelica, we expect the other putative parent to have an x = 7 genome, and furthermore to be tetraploid at n = 14. Such a cross would result in the observed C. microcarpa genome, with a chromosome count of n = 20. Interestingly, C. hispida is the only species we sampled with a chromosome count of n = 7, but no strong relationship between C. hispida and C. microcarpa is inferred in either gene tree.
We do, however, infer a weak relationship between C. sativa and C. hispida in the FAE1 tree, and thus the possibility that C. hispida was involved in the polyploid formation of C. sativa should be explored further.
What is the age of the polyploidization events likely to have formed the C. sativa genome? A complete answer will require a better understanding of its genome, but two findings suggest a recent origin. First, the chromosome number of C. sativa is inconsistent with extensive karyotype evolution and likely represents the sum of the ancestral contributions. Second, paleopolyploids such as soybean and maize display duplication of many, but not all, genes, as a sizeable number have decayed to the singleton state. In contrast, the presence of triplicates for all nine test genes of C. sativa is consistent with high retention of duplicates, as expected in recent polyploids.
The likely allohexaploid nature of the Camelina sativa genome has multiple implications. Its vigor and adaptability to marginal growth conditions may result at least in part from polyploidy. Polyploids are thought to be more adaptable to new or harsh environments, with the ability to expand into broader niches than either progenitor [67, 68]. Indeed, C. hispida and C. laxa, both of which are likely diploids, are found only in Turkey, Iran, Armenia, and Azerbaijan, while C. microcarpa and C. sativa are distributed throughout Asia, Europe, and North Africa and are naturalized in North America [8, 69]. The mechanisms behind this increased adaptability are not completely understood, but have been attributed to heterosis, genetic and regulatory network redundancies, and epigenetic factors [30, 70].
Allohexaploidy might also affect any potential manipulations of the C. sativa genome, such as introgression of germplasm or induced mutations. Introgression of exotic germplasm could be facilitated by the type of polyploidy-dependent manipulations that are possible in wheat, a potentially comparable allohexaploid [71, 72]. In addition, polyploids have displayed excellent response to reverse genetics approaches such as Targeting Induced Local Lesions in Genomes (TILLING) [73, 74]. As in wheat, any recessive induced mutations could be masked by redundant homoeologous loci that have maintained function [75, 76]. This mutation masking implies that multiple knockout alleles at different homoeologous loci can be combined to achieve partial or complete suppression of a targeted function [77, 78]. We also expect that single-locus traits, whether transgenic or not, will display diploid inheritance due to preferential intragenomic pairing.
In a hexaploid oilseed crop such as C. sativa, manipulation of oil composition, yield, or both should therefore be possible through transgenic or reverse genetic approaches, or through other genome manipulations similar to those performed in wheat. For example, the characterization of FAD2 and FAE1 in C. sativa could enable the use of TILLING techniques to isolate C. sativa plants with mutations in each of the three identified copies of both genes. We expect these mutations to result in plants with reduced levels of polyunsaturated fatty acids or long-chain fatty acids, possibly in a dosage-dependent manner. This would allow us to manipulate the seed oil composition of C. sativa, potentially creating a broad spectrum of C. sativa varieties with useful biodiesel properties, thereby further increasing the utility of this emerging biofuel crop.
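To illustrate how knockout alleles at three homoeologous copies could be combined, the sketch below enumerates the homozygous knockout combinations for one gene and groups them by the number of copies that remain functional. The copy labels and function names are hypothetical, and the sketch assumes for simplicity that each copy is either fully functional or fully knocked out; it makes no claim about the relative contribution of individual copies.

```python
from itertools import product

# Hypothetical labels for the three homoeologous copies of one gene
# (for example FAD2 or FAE1); not names used in the source.
COPIES = ("copy_1", "copy_2", "copy_3")


def knockout_combinations(copies=COPIES):
    """List (genotype, functional_count) for every on/off combination.

    Each genotype maps a copy label to True (functional) or False
    (homozygous knockout); heterozygous states are ignored here.
    """
    combos = []
    for states in product((True, False), repeat=len(copies)):
        genotype = dict(zip(copies, states))
        combos.append((genotype, sum(states)))
    return combos


def lines_by_dosage(copies=COPIES):
    """Group genotypes by the number of functional copies they retain."""
    grouped = {}
    for genotype, functional in knockout_combinations(copies):
        grouped.setdefault(functional, []).append(genotype)
    return grouped


if __name__ == "__main__":
    for functional, genotypes in sorted(lines_by_dosage().items(), reverse=True):
        print(f"{functional} functional copies: {len(genotypes)} genotype(s)")
```

With three copies this gives eight combinations across four dosage classes (three, two, one, or zero functional copies), which is the range over which a dosage-dependent effect on oil composition could in principle be examined.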