Plant material and plant growth conditions
J. curcas PZM16 was crossed to J. integerrima S001 to generate the hybrid CI7041. A backcross (BC) population of 286 individuals was then derived from the cross PZM16 × CI7041. The population and parental lines were planted under standard growth conditions in the experimental field of Lim Chu Kang farm, Singapore.
Isolation of genomic DNA and synthesis of cDNA
Total DNA was extracted from leaves and purified using the DNeasy Plant Mini Kit (QIAGEN, Germany). Oil bodies are located inside the cells of mature seeds, and total oil content and fatty acid composition in mature seeds are agronomically important traits. To investigate the expression of oleosin genes in mature seeds, which are used for oil extraction, total RNA was isolated from mature seeds using Plant RNA Purification Reagent (Invitrogen). Poly(A) tails were then added to the 3' ends of the RNAs by poly(A) polymerase (Ambion), and the polyadenylated RNAs were reverse transcribed by SuperScript II reverse transcriptase (Invitrogen) with the oligo(dT) 3'-RACE adaptor (Ambion).
Trait measurement and data collection
Each sample of the QTL mapping population was ground in liquid nitrogen and divided into three replicates. Each sample consisted of three mature seeds collected randomly from the same tree. Fatty acid composition was analyzed by gas chromatography (GC). Total lipid, extracted from 100 mg of mature seeds, was transmethylated with 3 N methanolic HCl (Sigma, St. Louis, MO, USA) plus 400 μL 2,2-dimethoxypropane (Sigma, St. Louis, MO, USA). Oil was extracted with hexane and then esterified to convert it to methyl esters. The fatty acid methyl esters (FAMEs) were analyzed on an Agilent 6890 GC (Palo Alto, CA, USA) with helium as the carrier gas and DB-23 columns for component separation. The GC program held 140°C for 50 s, ramped at 30°C min⁻¹ to 240°C, and held the final temperature for 50 s, for a total run time of 32 min. Fatty acid (FA) composition values included in the analyses were calculated from peak areas.
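As a rough illustration of the peak-area calculation, the sketch below expresses each FAME peak as a percentage of the summed peak area. It assumes that composition is reported as relative area percent, which the text does not state explicitly; the function name, fatty acid labels and area values are hypothetical and are not data from this study.

```python
def fa_composition_percent(peak_areas):
    """Convert GC peak areas to relative composition (% of total peak area).

    peak_areas: dict mapping a fatty acid label to its integrated peak area.
    """
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {fa: 100.0 * area / total for fa, area in peak_areas.items()}


# Hypothetical peak areas (arbitrary units) for one seed sample.
example_areas = {"C16:0": 150.0, "C18:0": 70.0, "C18:1": 420.0, "C18:2": 360.0}
composition = fa_composition_percent(example_areas)
```

Normalizing to the summed area makes samples comparable when the injected amount differs, at the cost of ignoring any peaks that were not integrated.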
To determine expression levels from the reverse-transcribed cDNAs, real-time PCR was conducted on an iCycler real-time PCR machine (Bio-Rad). Each reaction contained 200 ng of first-strand cDNA, 0.5 μL of 10 mmol L⁻¹ gene-specific primers, and 12.5 μL of real-time PCR SYBR mix (iQ™ SYBR® Green Supermix, Bio-Rad). Amplification conditions were 95°C for 5 min followed by 40 cycles of 95°C for 15 s and 60°C for 60 s. The jatropha 18S rRNA was selected as the endogenous reference to control for sample-to-sample variation in the amount of cDNA. cDNA from mature seeds of jatropha PZM16 was used as the calibrator on each real-time PCR plate. Two technical replicates of each reaction were performed. Normalized expression for each line was calculated as described in [10], i.e. ΔΔCT = (CT,Target − CT,18S)genotype − (CT,Target − CT,18S)calibrator. A lower ΔΔCT value indicates stronger gene expression, and vice versa. Five mature seeds from each plant of the QTL mapping population were used to determine the relative expression levels of OleI, OleII and OleIII. The results presented are the means of the biological replicates for each plant.
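The ΔΔCT normalization can be sketched as follows. The CT values are hypothetical, and averaging the two technical replicates before normalization is an assumption. The final conversion to a fold change via 2^(-ΔΔCT) is the conventional one and assumes near-100% amplification efficiency; the analyses described here use the ΔΔCT values themselves.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)


def delta_delta_ct(ct_target, ct_ref, ct_target_cal, ct_ref_cal):
    """ddCT = (CT_target - CT_18S)_genotype - (CT_target - CT_18S)_calibrator."""
    return (ct_target - ct_ref) - (ct_target_cal - ct_ref_cal)


# Hypothetical CT values, two technical replicates per reaction.
sample_target = mean([24.0, 24.4])  # oleosin gene, backcross plant
sample_18s = mean([12.0, 12.2])     # 18S rRNA, backcross plant
calib_target = mean([25.0, 25.2])   # oleosin gene, PZM16 calibrator
calib_18s = mean([12.0, 12.0])      # 18S rRNA, PZM16 calibrator

ddct = delta_delta_ct(sample_target, sample_18s, calib_target, calib_18s)

# A negative ddCT means stronger expression than the calibrator.
fold_change = 2.0 ** (-ddct)
```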
DNA markers and genotyping
Ninety-five markers covering the 11 linkage groups (LGs) almost evenly were selected from a first-generation linkage map of jatropha [13]. One primer of each pair was labeled with FAM or HEX fluorescent dye at the 5' end. The PCR program for microsatellite amplification on PTC-100 PCR machines (MJ Research, CA, USA) consisted of the following steps: 94°C for 2 min followed by 37 cycles of 94°C for 30 s, 55°C for 30 s and 72°C for 45 s, then a final step of 72°C for 5 min. Each PCR reaction consisted of 1× PCR buffer (Finnzymes, Espoo, Finland) with 1.5 mM MgCl2, 200 nM of each PCR primer, 50 μM of each dNTP, 10 ng genomic DNA and one unit of DNA polymerase (Finnzymes, Espoo, Finland). Products were analyzed on an ABI 3730xl DNA sequencer, and fragment sizes were determined against the size standard ROX-500 (Applied Biosystems, CA, USA) with GeneMapper V3.5 software (Applied Biosystems, CA, USA) as described previously [26].
Statistical analysis and QTL (eQTL) mapping
QTL (eQTL) analysis allows the genetic basis of variation in quantitative traits of interest to be dissected. Scoring every individual of a mapping population for the trait of interest and establishing a genetic linkage map for that population are the two prerequisites for QTL (eQTL) detection. In this study, data on fatty acid composition and content and on OleI, OleII and OleIII expression levels were collected with three replications from the backcross population of 286 individuals. Pearson phenotypic correlations among traits were calculated with SAS PROC CORR. The 95 markers were genotyped in the QTL mapping population. SNP markers for mapping the three genes and primer pairs for determining expression levels by real-time PCR are listed in Table 4.
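For readers without SAS, the Pearson correlation between two traits can be sketched in plain Python. This illustrates the statistic only and does not reproduce the PROC CORR analysis; the function name and trait values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-plant trait means (% of total fatty acids).
oleic = [40.0, 42.0, 44.0, 46.0]
linoleic = [38.0, 36.0, 34.0, 32.0]
r = pearson_r(oleic, linoleic)
```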
The linkage map was constructed using CRIMAP 3.0 [27]. All multipoint distances were calculated using the Kosambi function. MapChart 2.2 was used for graphical visualization of the linkage groups [28]. QTL (eQTL) analysis was performed using QTL Cartographer version 2.5 [29]. Model 6 of composite interval mapping was used to map QTLs (eQTLs) and estimate their effects. The genome was scanned at 2 cM intervals, and the forward regression method was selected. The log of the odds (LOD) score for declaring a significant QTL (eQTL) was determined by permutation tests (1,000 permutations, 5% overall error level) as described previously. To find as many putative QTLs (eQTLs) as possible, and to obtain a clearer understanding of the relationships among the examined traits, a LOD threshold of 2.0 for declaring a QTL (eQTL) was employed. Low thresholds may not be useful in plant breeding programs, but they have been shown to help in understanding relationships among traits [18].
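The Kosambi function mentioned above converts a recombination fraction r into a map distance, d = 25 ln[(1 + 2r)/(1 − 2r)] cM. A minimal sketch of the function and its inverse follows; the function names are illustrative and are not part of CRIMAP.

```python
import math


def kosambi_cm(r):
    """Map distance in cM for a recombination fraction r, with 0 <= r < 0.5."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


def kosambi_recombination(d_cm):
    """Inverse Kosambi function: recombination fraction for a distance in cM."""
    return 0.5 * math.tanh(2.0 * d_cm / 100.0)


distance = kosambi_cm(0.10)  # about 10.14 cM
```

For small r the distance is close to 100r cM; the correction grows as r approaches 0.5 because the function allows for crossover interference.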
The position of the maximum LOD score along the interval was taken as the position of the QTL (eQTL), and the region with LOD scores within 1 LOD unit of the maximum was taken as the confidence interval. Additive effects of the detected QTLs (eQTLs) were estimated from composite interval mapping results as the mean effect of replacing the hybrid (CI7041) alleles at the locus of interest with J. curcas (PZM16) alleles. Thus, at a QTL (eQTL) with a positive effect, the J. curcas alleles increase the trait value. The contribution of each identified QTL (eQTL) to total phenotypic variance (r²) was estimated by variance component analysis. QTL (eQTL) nomenclature was adopted as follows: each name starts with "q", followed by an abbreviation of the trait name, the name of the linkage group, and the number of the QTL (eQTL) affecting the trait on that linkage group.
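The 1-LOD confidence interval can be read off a LOD profile as sketched below. This is a simplified version that works at the resolution of the scan grid (2 cM in this study) without interpolating between scan points, and it is not QTL Cartographer's own routine. The function name and the LOD profile are hypothetical.

```python
def one_lod_interval(positions_cm, lod_scores, drop=1.0):
    """Return (peak, lower, upper) positions of the 1-LOD support interval.

    The interval is the contiguous run of scan points around the peak whose
    LOD score is within `drop` units of the maximum.
    """
    if not positions_cm or len(positions_cm) != len(lod_scores):
        raise ValueError("positions and LOD scores must be non-empty and equal in length")
    peak_idx = max(range(len(lod_scores)), key=lod_scores.__getitem__)
    cutoff = lod_scores[peak_idx] - drop

    left = peak_idx
    while left > 0 and lod_scores[left - 1] >= cutoff:
        left -= 1
    right = peak_idx
    while right < len(lod_scores) - 1 and lod_scores[right + 1] >= cutoff:
        right += 1
    return positions_cm[peak_idx], positions_cm[left], positions_cm[right]


# Hypothetical LOD profile scanned at 2 cM intervals along one linkage group.
positions = [0, 2, 4, 6, 8, 10, 12]
lods = [0.5, 1.8, 2.9, 3.6, 3.1, 2.2, 0.9]
peak, lower, upper = one_lod_interval(positions, lods)
```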
In order to investigate associations between phenotypic traits and genotypes at the two QTLs on LGs 1 and 4, mean phenotypic values were calculated for progeny grouped by the microsatellite marker alleles they carried: alleles inherited from J. integerrima (aa) or alleles inherited from J. curcas (AA). A two-way ANOVA was performed on the progeny using the two allelic combinations (AA, Aa) at markers linked to the QTLs. The analysis used the general linear model (GLM) procedure of SAS (SAS Institute) and the Bonferroni method of multiple comparisons with α < 0.01.
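The grouping step behind this comparison can be sketched as follows. The code computes only the mean trait value for each two-marker genotype class and the Bonferroni per-comparison significance level; it does not reproduce the two-way ANOVA of SAS PROC GLM. The record layout, function names and trait values are hypothetical.

```python
from collections import defaultdict


def genotype_class_means(records):
    """Mean trait value per (LG1 genotype, LG4 genotype) class.

    records: iterable of (genotype_lg1, genotype_lg4, trait_value) tuples.
    """
    groups = defaultdict(list)
    for g_lg1, g_lg4, value in records:
        groups[(g_lg1, g_lg4)].append(value)
    return {key: sum(vals) / len(vals) for key, vals in groups.items()}


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance level under the Bonferroni correction."""
    return alpha / n_comparisons


# Hypothetical backcross progeny: genotypes at markers linked to the two QTLs.
records = [
    ("AA", "AA", 44.0), ("AA", "AA", 46.0),
    ("AA", "Aa", 41.0), ("AA", "Aa", 43.0),
    ("Aa", "AA", 40.0), ("Aa", "AA", 42.0),
    ("Aa", "Aa", 36.0), ("Aa", "Aa", 38.0),
]
means = genotype_class_means(records)

# Four genotype classes give six pairwise comparisons.
per_test_alpha = bonferroni_alpha(0.01, 6)
```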