Does epigenetic polymorphism contribute to phenotypic variances in Jatropha curcas L.?

Background There is a growing interest in Jatropha curcas L. (jatropha) as a biodiesel feedstock plant. Variations in its morphology and seed productivity have been well documented. However, there is the lack of systematic comparative evaluation of distinct collections under same climate and agronomic practices. With the several reports on low genetic diversity in jatropha collections, there is uncertainty on genetic contribution to jatropha morphology. Result In this study, five populations of jatropha plants collected from China (CN), Indonesia (MD), Suriname (SU), Tanzania (AF) and India (TN) were planted in one farm under the same agronomic practices. Their agronomic traits (branching pattern, height, diameter of canopy, time to first flowering, dormancy, accumulated seed yield and oil content) were observed and tracked for two years. Significant variations were found for all the agronomic traits studied. Genetic diversity and epigenetic diversity were evaluated using florescence Amplified Fragment Length Polymorphism (fAFLP) and methylation sensitive florescence AFLP (MfAFLP) methods. Very low level of genetic diversity was detected (polymorphic band <0.1%) within and among populations. In contrast, intermediate but significant epigenetic diversity was detected (25.3% of bands were polymorphic) within and among populations. More than half of CCGG sites surveyed by MfAFLP were methylated with significant difference in inner cytosine and double cytosine methylation among populations. Principal coordinates analysis (PCoA) based on Nei's epigenetic distance showed Tanzania/India group distinct from China/Indonesia/Suriname group. Inheritance of epigenetic markers was assessed in one F1 hybrid population between two morphologically distinct parent plants and one selfed population. 30 out of 39 polymorphic markers (77%) were found heritable and followed Mendelian segregation. One epiallele was further confirmed by bisulphite sequencing of its corresponding genomic region. Conclusion Our study confirmed climate and practice independent differences in agronomic performance among jatropha collections. Such agronomic trait variations, however, were matched by very low genetic diversity and medium level but significant epigenetic diversity. Significant difference in inner cytosine and double cytosine methylation at CCGG sites was also found among populations. Most epigenetic differential markers can be inherited as epialleles following Mendelian segregation. These results suggest possible involvement of epigenetics in jatropha development.


Background
Since the petroleum crisis in 1970 s and the recognition of limited world fossil energy resources, plant oils which can replace fossil oil have been given more attention [1]. Special interest has been shown in the cultivation of Jatropha curcas (jatropha, a member of Euphorbiaceae family) for its drought and poor soil tolerance. It can be cultivated on marginal land and does not compete with food production crops. Jatropha, also known as physic nut, is a small tree or large shrub which can reach a height of 5 m [2]. The plant is monoecious and the flowers are unisexual, occasionally hermaphrodite flowers occur. Pollination of the jatropha is by insects because its sweet, heavy perfume, greenish white flowers, versatile anthers and protruding sexual organs, copious nectar, and absence of visible nectar guides [3]. During field trials, it was observed that a number of different insects visited the flowers and helped in pollination. When insects are excluded from the greenhouse, seed set does not occur without hand-helping pollination. It is speculated that the center of origin of jatropha is Central America. From the Caribbean, this species was probably distributed by Portuguese seafarers via the Cape Verde Islands and former Portuguese Guinea (now Guinea Bissau) to other countries in Africa and Asia [2].
The understanding of biodiversity (both phenotypic and genetic diversity) of jatropha is essential to developing strategies for collection, conservation and new variety development. Before 1996, few systematic provenance trials had been conducted to exam morphological characteristic differences among various jatropha collections. All except one found out clear phenotypic differences among collections (summarized by Heller 1996) [2]. For example, a trial found out a good degree of variation in plant height, branches per plant, branched habit, leaves and seeds size and seed yields between India and Nicaragua collections [2]. Recently, Kaushik [4] assessed the variability in seed traits and oil content of 24 accessions of jatropha collected from different agroclimatic zones of Haryana state, India and revealed that there were significant differences in seed size, 100-seed weight and oil content among accessions. There have been many efforts to access the extent of genetic diversity in jatropha by using various molecular markers. Basha et al reported low degree of variation among the accessions collected from different geographical regions of India [5]. In their further research [6], genetic background of 72 jatropha accessions representing 13 countries were elucidated using molecular analysis and biochemical analysis. The biochemical composition analysis of seeds showed wide variation in crude protein, oil and ash content and phorbol ester in seed. However, simple sequence repeat (SSR) analysis found out that accessions from many countries failed to be distinguished with the exception of accessions from Mexico and El Salvador. A recent study [7] used RAPD, AFLP and cTBP to assess genetic polymorphism in J. curcas accessions from 13 countries on 3 continents. High degree of monomorphism in J. curcas accessions was found among the collections, only the accessions from Mexico and Costa Rica exhibited polymorphism. Overall, clear phenotypic variation among jatropha collections is well established. However, high phenotypic variation is matched by low level of genetic diversity as revealed by molecular markers.
Methylation of cytosine is an important epigenetic modification of the nuclear DNA of many eukaryotic organisms. Although these changes do not alter the primary DNA sequence, they are frequently heritable through cell division, sometimes for multiple generations and can thus often be classified as epigenetic marks [8]. These conserved epigenetic marks have been found to influence many aspects of gene expression and chromosome biology, and they have characteristic genomic distribution [9].
Epigenetic changes can occur at a high frequency in crop plants and might generate phenotypic variation that is not correlated with genetic variation [10]. In this study, collections from 5 different countries on three continents have been evaluated for phenotypic variation under the same agronomic climate and practices. The important agronomic traits included in our study are: first year yield per plant (1yY), second year yield (2yY), number of primary branch (NPB), number of fruiting branch (NFB), diameter of canopy (DiC) after two years, plant height (H) after two years, oil content in seed (Oil %) after one year, days from sowing to flowering (DtF) and dormancy of the branch shoot. Many of the traits have not been monitored by any previous work. On the other hand, genetic diversity was evaluated by fAFLP, epigenetic diversity was evaluated by methylation sensitive fAFLP (MfAFLP). An intra-species hybrid F1 population and one selfed population were used to evaluate inheritance of epigenetic markers and their pattern of inheritance. One epigenetic allele was further confirmed by bisulphite sequencing.

Phenotypic variances between different populations
Randomly selected continuous 10 individuals from each population were tracked for the following agronomic traits: first year yield per plant (1yY), second year yield (2yY), number of primary branch (NPB), number of fruiting branch (NFB), diameter of canopy (DiC) after two years, plant height (H) after two years, oil content in seed (Oil %) after one year, days from sowing to flowering (DtF) and dormancy of the branch shoot at drough periods. Significant variations of agronomic traits were observed in the five populations studied (Table 1). First flowering time ranged from 105 days for MD and 175 days for TN, a difference of around 70 days. Accumulated yield for the first year (1yY) varied greatly among population, with an average of 370 grams per tree for MD and as little as 25 grams per tree for CN, a difference of more than 10 times. Such yield difference continued but narrowed down in the second year, with 708 grams per tree for MD and 217 grams per tree for CN. There were also significant differences in plant height and tree canopy size: with AF being the tallest population and MD being the population with largest canopy. Primary branches as well as fruiting branches were most numerous in SU population. In the very hot and dry months, apical bud dormancy was observed but with different severity: most serous in TN and CN populations but much less in MD population. Seed oil content was also variable among populations with AF population had the highest oil content and CN had the lowest oil content. Overall, the population from Indonesia was the earliest to flower and the most productive for both the first and the second year.
Besides the differences among populations, there were also the differences within populations. The coefficient variation (CV%, calculating SE as a percentage of the average) for traits 1yY, 2yY, NFB and NPB were larger than 15% within population, suggesting big variation of agronomic traits even within the same population.

Very low level of genetic diversity between and within different populations
Initially 64 primer combinations of EcoR I/Mse I with three nucleotides selective primers were screened for fAFLP DNA genotyping. A total of 23 primer combinations with each giving about 25 distinct bands (see Additional file 1 for details of primers), were selected for further analysis. A total number of 575 bands were identified by fAFLP analysis of all 162 individuals, out of which only 3 polymorphic bands were detected in 5 plants from China and India (one representative fAFLP result is given in Additional file 2). This represents that only 0.52% of bands were polymorphic. The experiment was repeated twice with the same result. To further investigate the genetic diversity of jatropha, RAPD, DAMD and SSR primers which other researchers used in their studies were also used to analyze our populations. However, the 22 RAPD primers and 2 SCAR primer pairs used by Basha et al [5] and the 7 RAPD primers and 4 DAMD primers from Shirish et al (2008) [11] and 12 SSR primer pairs from Sudheer et al (2009) [12] all failed to detect a single polymorphism in our samples. AFLP, RAPD and DAMD markers are generally believed to distribute randomly throughout genome, SCAR and SSR have specific distribution patterns. Together, these results suggest little genetic diversity among and within the populations we studied.

Moderate but significant epigenetic diversity revealed by MfAFLP genotyping
For MfAFLP analysis, 54 primer combinations for EcoR I/Msp I (EM) and EcoR I/Hpa II (EH) with three nucleotide selective primers were firstly screened on 5 random selected samples with one from each population. 14 primer combinations (Additional file 1) with the most number of amplified bands were used for further analysis of the same 162 individuals by the genetic diversity analysis. A total of 562 distinct bands differentiating between EM and EH in at least one sample were identified (Representative MfAFLP results are given in Additional file 3 and file 4). The percentage of polymorphic epigenetic band varied from 22.42% in AF population and 27.58% in CN population with the mean percentage of polymorphic bands being 25.30% and mean heterozygosity being 0.062 (Table 2). Only MD population had one private band ( Table 2). The AMOVA analysis showed that 28% of the epigenetic variance was assigned to variance among populations and 72% to variance within populations and the differences among and within populations were significant. Relativeness among the 5 populations was examined by Nei's epigenetic distances analysis and showed in Table  3. Principal coordinates analysis (PCoA) was performed based on the matrix of Nei's epigenetic distances among samples and showed in Figure 1. PCoA analysis revealed two jatropha clusters, one containing AF and TN populations, and the other containing CN, MD and SU populations.

Variation of CCGG methylation among populations
MfAFLP is a modified version of standard AFLP [13] using a pair of methylation-sensitive restriction enzymes, Hpa II and Msp I, which are a pair of isoschizomers, recognize the same restriction site CCGG but possess differential methylation sensitivity at the inner or outer cytosine [14]. In the absence of cytosine methylation, both enzymes can cut a CCGG site. Hpa II can't cut when the inner cytosine is methylated. Msp I can't cut when the outer cytosine is methylated. Any difference in AFLP banding pattern between EcoR I + Hpa II (EH) and EcoR I + Msp I (EM) digested DNAs should reflect different states of cytosine methylation at CCGG sites. We found out that all bands were differentiating EH and EM in at least one sample, indicating absence of genetic variation at all CCGG sites surveyed, hence the MfAFLP result exclusively reflects epigenetic diversity at the CCGG sites. The ANOVA analysis for the variation of CCGG methylation pattern among 5 populations was conducted based on frequencies of absence of cytosine methylation (non-C m ), methylation at both inner and outer cytosines (both-C m ), methylation of inner cytosine only (only-inner-C m ) and methylation of outer cytosine only (only-outer-C m ) in each population (Table 4). These multiple comparison tests were based on Tukey (HSD) analysis of the differences between populations with a confidence level of 99%. Nearly half of CCGG sequences were not methylated (non-C m ), with no significant difference among populations. For the three types of methylation, there were significant differences among populations in inner cytosine and double cytosine methylation. MD, CN and SU are higher for inner cytosine methylation but lower for double cytosine methylation. TN and AF have the exact opposite situations. Inner cytosine methylation and double cytosine methylation together happen in nearly half of CCGG sequences. The single outer cytosine methylation happens in less than 5% of CCGG sequences. Consistent with the PCoA analysis, AF and TN populations form a separate group from the group of MD, SU and CN populations. The high and varied level of CCGG methylation suggests possible role of methylation in gene regulation.

Heritability of Epialleles
To reveal the heritability of epigenetic bands of jatropha, we constructed a crossed pollinated population by controlled pollination in an insect free green house between MD24 and TN02 (MT2402 F1Pop), which were from Indonesia and India, respectively. They were different in many agronomic traits including plant height, branching pattern and total number of flowers in florescence.  Table 5 shows the 30 heritable epigenetic bands, their segregation pattern in the 2 populations, corresponding c 2 values, and the deduced epiallele's patterns (Ho for homozygous, or He for heterozygous) in MD24 and TN02. The other 9 polymorphic bands segregated but did not follow Mendelian segregation in one or both populations. For example, bands E1H10-076, E4H9-071 and E4H15-550 had significantly biased segregation in MD24ScPop population. Two bands E1H14-228 and E4H1-400 were not present in parents but appeared in some progenies.
To confirm the heritance of epialleles, band E1H5-486, which was present in TN02 but absent in MD24, was isolated from agarose gel, cloned and sequenced. The regions corresponding to EcoR I and Hap II/Msp I sites were sequenced in MD24, TN02, and there were no sequence difference found between them. Bisulphite conversion the genomic DNAs from the parents and the 17 individuals in MT2402 F1pop was conducted. A pair of bisulphite conversion specific primer (Additional file 1) was used to amplify the E1H5-486 region. PCR amplified bands were cloned into pGEM-Teasy vector and twelve randomly chosen colonies containing insert were sequenced around E1H5-486 region at the restriction site of Hap II/Msp I. For a sample homozygote for the locus, all sequences from 12 colonies would be the same. For a heterozygote sample for the locus, two types of sequence were obtained with each represent one allele. It was found that MD24 was homozygote in the locus with the inner cytosine methylated while TN02 was heterozygote with one allele not methylated. For the progenies, their sequences matched exactly with E1H5-486 band in each plant: absence of the band when the locus is homozygote with inner cytosines methylated  and presence of band when the locus is heterozygote with one allele not methylated (Table 6 and Additional file 5).

Discussion
In this study, jatropha seeds were collected in five countries on three continents. Seeds were brought into Singapore and grown in one single farm under the same agronomic practices in order to reveal difference in heritable agronomic traits. The plants were tracked for many agronomic traits including time to first flowering, accumulated yield in one year and the second year, number of primary branch, number of fruiting branch, diameter of canopy after two years, plant height after two years, dormancy of branch and oil content in seed. These traits are important for agronomy study but seldom studied systematically. Tracking for two years will ensure reliability of data. With this systematic comparison of populations for two years, agronomic differences that are more genetic dependent and less climate and agronomic practice dependent is confirmed among collections. It should be noted that within population performance is also present, as indicated by big standard deviation (>15% average) in some traits especially yields. Better field performance of Indonesian population (MD) may suggest its better adaptation to the humid tropical climate in Indonesia and Singapore.
Surprisingly, there was very little genetic diversity among the populations to match with agronomic trait differences we found. AFLP is believed to be a highly polymorphic genotyping technology that can scan the genome for nucleotide sequence difference at and near restriction sites. In our study, almost all the 575 distinct bands (with the exception of only three bands in five individuals) were monomorphic. We repeated the same experiment twice with the same results. To address the concern of insufficient or biased coverage by our fAFLP analysis, we used polymorphic molecular markers reported by others, of multiple technologies including RAPD, DAMD, SSR and SCAR to analyse the same samples. These further studies all confirmed the lack of detectable genetic diversity in the populations studied. There are two possible reasons for this paradox. One is that all the genotyping methods together are not sensitive enough to identify genetic diversity among the collections. Alternatively, epigenetic mechanism that changing phenotype (appearance) or gene expression with no requirement of changing to the underlying DNA sequence is possibly involved.
The lack of genetic diversity prompted us to look into the possible epigenetic diversity. Methylation of DNA is one of the major epigenetic markers which affect gene expression directly or indirectly. DNA methylation sensitive fAFLP (MfAFLP) is a modified version of AFLP [13] using a pair of methylation insensitive/sensitive isoschizomer restriction enzymes to locate methylated AFLP markers. In normal AFLP analysis, a methylation insensitive enzyme Mse I is used (cutting at 5'-TTAA-3') to digest genomic DNA together with a six nucleotide cutter like EcoR I before PCR amplification, variation in banding pattern reflects only the sequence difference. In this study, a pair of isoschizomers, Hpa II and Msp I, which recognize the same tetranucleotide 5'-CCGG-3' but with different sensitivity to outer and inner cytosine methylation were used in place of Mse I. The same genomic DNA sample is digested in two reactions, one with EcoR I/Msp I (EM) and the other with EcoR I/Hpa II  (EH). Differences in EM and EH amplified bands identify cytosine methylation. CpG methylation (inner cytosine methylation) was found to be the most frequent followed by double cytosine methylation. For these two methylations, there were highly significant differences (p < 0.01) among different collections. Compared to the very low percentage of polymorphic band (0.52%) revealed by fAFLP, epigenetic polymorphic bands ranged from 22.42% to 27.58% in the populations. The AMOVA showed significant diversity both among populations and within populations. Similarly, there are numerous cytosine methylation polymorphisms between rice cultivars [15] and cotton collections [16], and the number of methylation differences is not correlated with their genetic distance. This was also found to be true in A. thaliana accessions [17]. Since methylation has been implicated in gene regulation and various development processes in plants [10], epigenetic diversity we found in jatropha may play a role in determining jatropha agronomic traits and its development.
The genetic improvement of plants requires individuals that differ in heritable traits. DNA methylation can generate novel and heritable phenotypic variation by influencing gene expression [10]. Stable epigenetic events might be important for plant breeding. In this study, jatropha plants collected from various parts of the world exhibited different morphology features when they were planted side by side, notably in their seed yields, oil contents, branching pattern, flowering time, as well as plant height, canopy and dormancy of the branches. The different Further research on inheritance of epigenetic markers based on the MfAFLP method in an intra-species hybrid population (MT2402F1pop) and a selfed population (MD24F1Scpop) revealed that most polymorphic epigenetic markers were heritable and conformed to Mendelian segregation pattern. In the near future, we will focus on the development of epigenetic markers which are linked to important traits and use these markers to perform marker-assisted breeding.
In tropical and subtropical countries, jatropha has high potential as a biofuel crop. Among the oil-bearing tree species, jatropha is desired due to its multiple positive attributes including drought hardiness, rapid growth, easy propagation, high oil content, small gestation period, wide adaptation [18] as well as high quality biodiesel that its oil can be converted to. It is, however, not domesticated, as indicated by the highly variable and unpredictable field performance. There is an immediate need to breed elite varieties for plantation. It is widely believed that the key to success of any breeding program is the availability of collections with desired traits and maximum diversity among collections. Phenotypic variation and seed biochemical composition have been widely reported. Genetic diversity, on the other hand, has been found to be lower than previously thought. Here we report the moderate level of epigenetic diversity and heritability of epigenetic markers, which can also be used to evaluate diversity among jatropha collections.

Conclusions
In summary, we systematically compared collections from five countries on three continents. Seed derived plants were planted in the same farm under the same agronomic practices. Multiple agronomic traits were monitored over a period of two years. Our result confirmed the differences in various agronomic traits among collections as well as within collections. These differences should be determined genetically because of their growth in the same climate under the same practices. Paradoxically, there is lack of genetic diversity to match with agronomic differences. We found moderate level of epigenetic diversity, variation in CCGG methylation pattern and proved that many epigenetic markers are heritable and most epialleles follow Mendelian segregation pattern. All these point to the possible involvement of epigenetics in jatropha development, which needs to be further substantiated.

Plant material
Seeds of jatropha collected from five countries/regions including Yunnan province, China (CN); Java, Indonesia (MD); Tamil Nadu, India (TN); Suriname, South America (SU) and Tanzania, Africa (AF) were germinated and transplanted into pits of 1 m × 1 m × 1 m filled with top soil with space of 2 m × 2 m for each tree on the farm of Temasek Life sciences Laboratory (TLL), Singapore. The field management, such as fertilization, pesticides spraying and weeds controlling, followed the normal practices. Only one or two times of irrigation was conducted in the dry season. One year later from the date of sowing, the plants were pruned at the height of 70 cm from ground. A set of 162 randomly selected plants of jatropha from 5 populations were used for genetic and epigenetic diversity analysis. Ten plants randomly pickup from each population were labeled and monitored over a period of two years from March 2007 to February 2010 for various agronomic traits. One intraspecies hybrid population (MT2402F1pop) and one selfed population from the parent MD24 (MD24Scpop) of jatropha were used to detect the inheritance of epigenetic bands.

DNA extraction
Total genomic DNA was extracted from young leaves of each plant using the DNeasy plant DNA extraction kit (QIAGen, Singapore) following the manufacture's instruction. DNA concentrations were determined using the ND-1000 spectrophotometer (Nanodrop Technologies, Rockland, Delaware, USA) and the quality was checked by electrophoresis on 0.8% agarose gel.  fAFLP, MfAFLP, SSR and RAPD analysis fAFLP analysis was conducted as previously reported [19]. Selective amplification was carried out with 23 different combinations of EcoR I and Mse I selective primers, with each gave about 25 distinct bands. DNA methylation-sensitive fAFLP (MfAFLP) is a modified version of fAFLP using a pair of methylation-sensitive restriction enzymes, Msp I and Hpa II. They are a pair of isoschizomers, which recognize the same tetranucleotide 5'-CCGG-3' but have different sensitivity to CpG methylation [14]. Msp I and Hpa II were used to replace the enzyme Mse I in fAFLP. The structure of Msp I-Hpa II adapter and primers were ordered from Oligo (Singapore) according to Xu (2000) [20]. 14 primer pairs from 54 primer pairs with the most number of amplified bands were used to carry out selective amplification. 0.6 uL PCR products of the amplification were analyzed with ABI 3730 ×l DNA analyzer and data processed by  [12].

Data collection and statistical analysis
An AFLP Excel Macro [21] was used to convert allele size data from GeneMapper 3.7 (Applied Biosystems, USA) into binary form, to indicate the presence (1) or absence (0) of alleles. To elucidate the epigenetic diversity within and among populations, we scored the MfAFLP bands by the criterion described in Li et al (2008) [22] with several modifications. Bands presented in both lanes of EcoR I/Msp I (EM) and EcoR I/Hpa II (EH) were given a binary genotype of EM/EH (1/1); those bands presented in either Msp I or Hpa II were given a binary genotype of EM/EH (1/0) or EM/EH (0/1); bands absent from both Msp I and Hpa II were given a binary genotype of EM/EH (0/0). GenAlEx 6 [23] was employed to compute allele frequency in populations, genetic heterozygosity within populations and, the pairwise Nei's genetic distance between populations. It was also used to conduct Analysis of Molecular Variance (AMOVA) and Principle Coordinates Analysis (PCoA). The ANOVA analysis for the variation of CCGG methylation among 5 populations was conducted with XLStat (version 7.5.2, Addinsoft, USA), which was also used for multiple group Tukey (HSD) analysis. The same software was also used for ANOVA analysis and Tukey (HSD) analysis of all agronomic traits.