Direct stacking of sequence-specific nuclease-induced mutations to produce high oleic and low linolenic soybean oil

Background: The ability to modulate levels of individual fatty acids within soybean oil has potential to increase shelf-life and frying stability and to improve nutritional characteristics. Commodity soybean oil contains high levels of polyunsaturated linoleic and linolenic acid, which contribute to oxidative instability, a problem that has been addressed through partial hydrogenation. However, partial hydrogenation increases levels of trans-fatty acids, which have been associated with cardiovascular disease. Previously, we generated soybean lines with knockout mutations within fatty acid desaturase 2-1A (FAD2-1A) and FAD2-1B genes, resulting in oil with increased levels of monounsaturated oleic acid (18:1) and decreased levels of linoleic (18:2) and linolenic acid (18:3). Here, we stack mutations within FAD2-1A and FAD2-1B with mutations in fatty acid desaturase 3A (FAD3A) to further decrease levels of linolenic acid. Mutations were introduced into FAD3A by directly delivering TALENs into fad2-1a fad2-1b soybean plants.

Results: Oil from fad2-1a fad2-1b fad3a plants had significantly lower levels of linolenic acid (2.5 %), as compared to fad2-1a fad2-1b plants (4.7 %). Furthermore, oil had significantly lower levels of linoleic acid (2.7 % compared to 5.1 %) and significantly higher levels of oleic acid (82.2 % compared to 77.5 %). Transgene-free fad2-1a fad2-1b fad3a soybean lines were identified.

Conclusions: The methods presented here provide an efficient means for using sequence-specific nucleases to stack quality traits in soybean. The resulting product comprised oleic acid levels above 80 % and linoleic and linolenic acid levels below 3 %.

Electronic supplementary material: The online version of this article (doi:10.1186/s12870-016-0906-1) contains supplementary material, which is available to authorized users.

Due to high levels of polyunsaturated fatty acids, soybean oil has poor oxidative and frying stability, which limits its use in food products and industrial applications. In an effort to lower the levels of polyunsaturated fatty acids, soybean oil is partially hydrogenated; however, partial hydrogenation significantly increases trans-fatty acids, which have been linked with coronary heart disease and buildup of plaque in arteries [1]. The Food and Drug Administration (FDA) made a preliminary determination that partially hydrogenated oils are no longer 'generally recognized as safe' (GRAS) and is now taking steps to remove artificial trans fats from human food [2]. Altering the composition of soybean oil by decreasing the levels of polyunsaturated fatty acids may help reduce the need for hydrogenation.
Decreasing levels of linolenic acid is predicted to improve soybean oil characteristics by decreasing total levels of polyunsaturated fatty acids, thereby increasing frying and oxidative stability. Conversion of linoleic to linolenic acid is catalyzed by the fatty acid desaturase 3 (FAD3) enzyme, which is produced by a family of genes consisting of FAD3A (Glyma14g37350), FAD3B (Glyma02g39230) and FAD3C (Glyma18g06950). Consistent with its high expression in developing seeds, FAD3A has the greatest effect on linolenic acid concentrations in soybean oil [15]. Combining mutations within FAD3A with FAD3B and/or FAD3C resulted in oil having <3 % linolenic acid [16-20].
With the advent of sequence-specific nucleases, including TALENs and CRISPR/Cas, it has become possible to introduce targeted knockout mutations within genes of interest [21]. When delivered to plant cells, sequence-specific nucleases generate targeted DNA double-strand breaks. These double-strand breaks are then repaired predominantly by non-homologous end joining (NHEJ), which may result in the introduction of small insertions or deletions at the repair site. If double-strand breaks are generated within gene coding sequences, imprecise repair by NHEJ has potential to introduce frameshift mutations or in-frame deletions that destroy protein function.
The objective of this study was to create high oleic and low linolenic soybean lines by stacking targeted mutations within FAD2-1A, FAD2-1B and FAD3 genes. With the current industrial standard for low linolenic acid soybean oil being 3 % [20], we sought to inactivate a sufficient number of FAD3 genes to achieve linolenic levels <3 %.
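The distinction between frameshift mutations and in-frame deletions drawn above depends, to a first approximation, only on whether the net length change is a multiple of three. The following is a minimal sketch of that rule; the function name `classify_indel` is hypothetical, and the sketch ignores complications such as indels that span splice sites or create premature stop codons.

```python
def classify_indel(length_change: int) -> str:
    """Classify an indel within a coding sequence by its net length change.

    length_change is negative for deletions and positive for insertions.
    Simplifying assumption: only the reading frame is considered.
    """
    if length_change == 0:
        raise ValueError("length_change must be non-zero for an indel")
    # A change that is a multiple of three keeps the downstream reading frame.
    return "in-frame" if length_change % 3 == 0 else "frameshift"


# Example values chosen for illustration only.
for change in (-4, -135, 1):
    print(change, classify_indel(change))
```

Under this rule a 4 bp deletion shifts the reading frame, whereas a 135 bp deletion removes 45 codons while leaving the frame intact.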

Results
Designing TALENs targeting the soybean FAD3 genes
Soybean oil is primarily composed of palmitic acid, stearic acid, oleic acid, linoleic acid and linolenic acid. In oil from wild type plants, these five fatty acids are present at approximately 13, 4, 20, 55 and 8 %, respectively (Fig. 1a). Previously, we used TALENs to generate knockout mutations within both FAD2-1A and FAD2-1B genes [14]. Oil from the resulting plants contained higher levels of oleic acid (~79 %) and lower levels of linoleic and linolenic acids (~5 % each), compared to oil from WT plants. Here, we sought to further improve oil characteristics by knocking out genes involved in the conversion of linoleic to linolenic acid. We predicted that by knocking out the FAD3 linoleate desaturase genes, levels of linolenic acid would further decrease.
Fig. 1 Design of TALENs targeting FAD3 genes within Glycine max. a Illustration of the fatty acid pathway. Relative percent composition of individual fatty acids in the oil from WT and fad2-1 knockout plants is shown on the right. b Schematic of the FAD3A genomic sequence. Triangles, approximate TALEN binding sites; black boxes, exons; gray boxes, 5' and 3' untranslated regions. c Nucleotide sequences of the predicted TALEN target sites within the FAD3A, FAD3B, and FAD3C genes. Bold and underlined nucleotides indicate TALEN binding sequence. Lower case nucleotides indicate positions of SNPs. d Illustration of a TALEN monomer expression vector. PNOS, nopaline synthase promoter; SV40 NLS, simian virus 40 large T-antigen nuclear localization signal; TNOS, nopaline synthase terminator; AmpR, ampicillin resistance gene.
The soybean genome contains three linoleate desaturase genes: FAD3A (Glyma14g37350), FAD3B (Glyma02g39230) and FAD3C (Glyma18g06950). In terms of nucleotide similarity of their coding sequences, FAD3A shares 96.2 % identity to FAD3B and 14.4 % identity to FAD3C. Compared to FAD3B and FAD3C, mutations in FAD3A confer the greatest decrease in linolenic acid levels in soybean oil (from ~8 to ~4 %) [22], which is consistent with higher expression of FAD3A within developing seeds [15]. Therefore, we sought to design TALENs that primarily recognize FAD3A sequence. Three TALEN pairs were synthesized that recognize sequence within exon two or exon three of FAD3A (designated as GmFAD3_T01.1, GmFAD3_T02.1, and GmFAD3_T03.1) (Fig. 1b). TALENs were designed to recognize FAD3A sequence that is partially conserved in FAD3B and FAD3C; however, the recognition sequences for all TALEN pairs at FAD3B and FAD3C contained between one and 11 single-nucleotide polymorphisms (SNPs) when compared to the FAD3A sequence (Fig. 1c).
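Counting SNPs between a TALEN binding site in FAD3A and the corresponding site in a paralog amounts to a position-by-position comparison of two equal-length sequences. The sketch below illustrates that comparison; the function names and the two example sequences are hypothetical placeholders, not the actual target sites from Fig. 1c, and the ungapped comparison is only appropriate for pre-aligned sites of the same length.

```python
def count_snps(reference: str, other: str) -> int:
    """Count mismatched positions between two pre-aligned, equal-length sites."""
    if len(reference) != len(other):
        raise ValueError("sites must be the same length (ungapped comparison)")
    return sum(1 for a, b in zip(reference.upper(), other.upper()) if a != b)


def percent_identity(reference: str, other: str) -> float:
    """Percent of positions that match between two equal-length sequences."""
    matches = len(reference) - count_snps(reference, other)
    return 100.0 * matches / len(reference)


# Hypothetical 15-nt binding sites; lower case marks the assumed SNP positions.
site_a = "ACGTACGTACGTACG"
site_b = "ACGTACcTACGTAtG"
print(count_snps(site_a, site_b))
print(round(percent_identity(site_a, site_b), 1))
```

Whole coding sequences would first need a proper pairwise alignment, because insertions and deletions between paralogs break the one-to-one correspondence of positions assumed here.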

Assessing TALEN activity in protoplasts by deep sequencing
To determine TALEN activity, protoplasts were transformed with plasmid DNA encoding each TALEN pair and the FAD3 target sites were deep-sequenced. To this end, approximately 500 000 protoplasts were transformed with 15 μg each of two plasmids encoding a complete TALEN pair. Protoplasts were transformed using polyethylene glycol. Genomic DNA was isolated ~48 h post-transformation and used as a template in a PCR with primers designed to individually amplify TALEN target sites within the FAD3A, FAD3B or FAD3C gene. Amplicon pools for each TALEN target site were sequenced by 454 pyrosequencing. For all three TALEN pairs, we observed evidence of NHEJ mutations in two of the three FAD3 genes (Fig. 2a). TALEN pair GmFAD3_T02.1 introduced mutations within both FAD3A and FAD3B, and, relative to the other TALEN pairs, had the highest activity at its intended target sequence, FAD3A (16.0 %). On the other hand, TALEN pair GmFAD3_T03.1 had the lowest activity at its intended FAD3A target sequence (4.9 %). Activity of all three TALEN pairs at the FAD3B and FAD3C target sites was lower than at the respective FAD3A target site, which is most likely due to SNPs within the FAD3B and FAD3C TALEN binding sites.
We observed a correlation between the number of SNPs within TALEN binding sites and the relative mutation frequencies (Fig. 2b). Mutation frequencies at FAD3A target sites (containing 0 SNPs) for TALEN pairs GmFAD3_T01.1, GmFAD3_T02.1 and GmFAD3_T03.1 were 11.2, 16.0 and 4.9 %, respectively. After normalizing TALEN mutation frequencies at FAD3A, the relative mutation frequencies at FAD3B and FAD3C were determined. Target sites with one or two SNPs decreased mutation frequencies to ~53 or 63 %, respectively, relative to the activity of the corresponding TALEN pair at FAD3A; target sites with four SNPs decreased mutation frequencies to 14 %; target sites with five SNPs decreased mutation frequencies to 0.041 %; and target sites with >5 SNPs decreased mutation frequencies to undetectable levels. Whereas these data do not account for relative position of the SNPs, they provide evidence for TALEN target site specificity, indicating that target sites with five or more SNPs are unlikely to be recognized and cleaved.
Fig. 2 FAD3 TALEN activity in soybean protoplasts. a TALEN pairs were assessed for their activity ~48 h after transformation in soybean protoplasts. The frequency of mutagenesis represents the total number of sequence reads with insertions or deletions divided by the total number of sequence reads. The resulting number was then divided by the transformation frequency (90 %), which was determined using a YFP control plasmid. b TALEN activity relative to the number of SNPs present within the predicted TALEN binding sites.
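The two normalizations described for Fig. 2 (correcting the raw indel read fraction for the 90 % transformation frequency, then expressing activity at a paralog relative to activity at FAD3A) can be sketched as follows. This is a minimal illustration, not the authors' analysis pipeline: the function names and the read counts are hypothetical, and only the formula and the 90 % value come from the text.

```python
def mutation_frequency(indel_reads: int, total_reads: int,
                       transformation_frequency: float = 0.90) -> float:
    """Indel read fraction, corrected for the fraction of transformed cells."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 < transformation_frequency <= 1:
        raise ValueError("transformation_frequency must be in (0, 1]")
    return (indel_reads / total_reads) / transformation_frequency


def relative_activity(frequency_at_site: float, frequency_at_fad3a: float) -> float:
    """Activity at an off-target site as a fraction of activity at FAD3A."""
    if frequency_at_fad3a <= 0:
        raise ValueError("frequency_at_fad3a must be positive")
    return frequency_at_site / frequency_at_fad3a


# Hypothetical read counts, chosen only to illustrate the arithmetic.
on_target = mutation_frequency(indel_reads=1440, total_reads=10000)
off_target = mutation_frequency(indel_reads=202, total_reads=10000)
print(f"on-target: {100 * on_target:.1f} %")
print(f"relative activity: {100 * relative_activity(off_target, on_target):.0f} %")
```

Dividing by the transformation frequency estimates the mutation rate among cells that actually received plasmid, which makes frequencies comparable across transformations with different efficiencies.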

Generating soybean plants with FAD3 mutations
To generate soybean plants with knockout mutations in FAD3 genes, DNA encoding TALEN pair GmFAD3_T02.1 was stably integrated into the soybean genome [14,23]. Both WT and fad2-1a fad2-1b mutant soybean lines were transformed; from four independent transformations, a total of 72 events were generated (Table 1). To detect TALEN-induced mutations, the FAD3A gene was amplified and digested with T7 endonuclease I. We observed that 16 of the 72 events had cleavage products consistent with mutations within the GmFAD3_T02.1 target sequence. Cloning and sequencing of FAD3A amplicons revealed that all 16 plants harbored short deletions within the TALEN spacer sequence, ranging from 4 to 135 bp. Together, these results confirm the successful mutagenesis of FAD3A within T0 soybean plants, with a mutagenesis frequency of ~22 %.
To confirm that TALEN-induced mutations can be stably transmitted to subsequent generations, candidate T1 plants derived from experiment Gm183 were screened for mutations within FAD3A by PCR amplification and sequencing of clones. From three different T0 events (Gm183-4, Gm183-5 and Gm183-6), we identified T1 plants harboring heterozygous or homozygous mutations within FAD3A, indicating that mutations were stably transmitted to the next generation (Table 2). Further, we assessed T1 plants by PCR for the presence of transgene sequence. Of the 25 T1 plants assayed, 20 were positive for transgene sequence and five were negative (i.e., null segregant for the TALEN transgene). Importantly, two of the five transgene-free T1 plants harbored mutations within FAD3A. These two plants were self-pollinated to produce homozygous-mutant, transgene-free fad2-1a fad2-1b fad3a soybean plants. Notably, we also identified a single-gene fad3a knockout T1 plant from experiment Gm184 (identified as Gm184-3-20), which contains a homozygous −4 bp deletion within FAD3A. We failed, however, to identify plants with combinations of FAD3A and FAD3B mutations, indicating that the frequency of mutagenesis at FAD3B was <1.4 % (i.e., less than 1 out of 72 events).
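The frequencies quoted in this section follow directly from the event counts. The short sketch below reproduces that arithmetic; the helper name `percent` is hypothetical, and all counts are taken from the text.

```python
def percent(numerator: int, denominator: int) -> float:
    """Express a count as a percentage of a total."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return 100.0 * numerator / denominator


# 16 of 72 T0 events carried FAD3A mutations.
t0_mutagenesis = percent(16, 72)

# No event carried a FAD3B mutation, so the frequency is below 1 in 72.
fad3b_upper_bound = percent(1, 72)

# 5 of 25 T1 plants were negative for the TALEN transgene.
transgene_free = percent(5, 25)

print(f"T0 mutagenesis frequency: {t0_mutagenesis:.1f} %")
print(f"FAD3B frequency bound: < {fad3b_upper_bound:.1f} %")
print(f"transgene-free T1 plants: {transgene_free:.0f} %")
```

Note that the FAD3B figure is a detection limit set by the number of events screened, not a measured frequency.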

Discussion
In 2015, the FDA ruled that trans fat is no longer 'generally recognized as safe' for use in food, and has set a 3-year deadline to remove partially hydrogenated oils from food products. In an effort to improve shelf life and cooking characteristics, soybean oil is partially hydrogenated. However, partial hydrogenation results in increased levels of trans fats. Generating soybean oil with lower levels of polyunsaturated fatty acids promises to enhance shelf life and heat stability, thereby reducing the need for hydrogenation. Previously, we generated soybean lines that produce high oleic acid oil by knocking out both FAD2-1A and FAD2-1B genes. Here, we further improved oil characteristics by decreasing polyunsaturated fatty acids (linoleic and linolenic) to levels below 3 %. The methods and products presented here provide solutions for the demand for soybean oil with increased oxidative stability.
We observed lower levels of linoleic acid and higher levels of oleic acid within oil from fad2-1a fad2-1b fad3a plants, when compared to oil from fad2-1a fad2-1b plants. It would be expected that knockout mutations within desaturase genes would result in accumulation of the corresponding substrate. Indeed, this is the case for plants containing mutations in either FAD3A or FAD2-1A FAD2-1B; mutations in FAD3A resulted in increased levels of linoleic acid, and mutations in FAD2-1A FAD2-1B resulted in increased levels of oleic acid. When we introduced FAD3A mutations within fad2-1a fad2-1b soybeans, the level of linoleic acid decreased and the level of oleic acid increased. This trend was also observed in high oleic and low linolenic soybean plants generated after combining different sources of mutant FAD2-1A, FAD2-1B and FAD3A genes [20]; however, fatty acid levels were significantly affected by environmental conditions. Further, and unexpectedly, we observed that oil within fad3a and fad2-1a fad2-1b plants had decreased levels of the two fatty acids immediately preceding the substrate of the inactivated desaturase (i.e., palmitic and stearic acid in fad2-1a fad2-1b plants, or stearic and oleic acid in fad3a plants). Understanding properties of additional soybean desaturase proteins and the effects of genetic background and environmental conditions may provide a better understanding of the lipid biosynthetic pathway in soybean.
Here, we used TALENs to generate fad2-1a fad2-1b fad3 knockout soybean lines; however, there are other sequence-specific nucleases that can be used for plant genome editing, including meganucleases, zinc-finger nucleases and CRISPR/Cas systems. Three key parameters for choosing a sequence-specific nuclease are efficacy (i.e., how likely the nuclease is to introduce a desired modification), target site specificity, and ease of construction. Although meganucleases and zinc-finger nucleases have achieved acceptable mutation frequencies and target site specificity [28,29], their widespread use has been hindered by difficulties with construction [30,31]. TALEN and CRISPR/Cas systems have overcome these challenges, as they rely on a modular 'one RVD to one base pair' design or an RNA-DNA interaction, respectively, allowing for efficient reconstruction of nucleases with altered target sites. One difference between TALENs and CRISPR/Cas is target site length. CRISPR/Cas9 from Streptococcus pyogenes recognizes, in general, 17-20 nucleotides of sequence plus a three-nucleotide PAM sequence (NGG), providing ~19-22 nucleotides of target site specificity. TALEN pairs, on the other hand, are frequently engineered to recognize 30-40 nucleotides, which may lead to fewer off-target double-strand breaks. Whereas both TALENs and CRISPR/Cas9 tolerate certain nucleotide changes within their target sequences, both provide sufficient specificity to target a single site within a complex plant genome, provided the target site (or a similar target site) is not repeated elsewhere in the genome.
An advantage of engineering crops with sequence-specific nucleases is that the resulting product is not required to harbor transgenic DNA. Within this study, we identified two modified soybean lines with undetectable levels of transgenic DNA. The genotypes of these plants were described to the USDA for the purpose of determining regulatory status. An opinion letter, issued May 20th, 2015, indicated that the resulting FAD3A knockout plants are not regulated by the USDA under 7 CFR part 340 [32]. This means that trials can be launched with transgene-free fad3a knockout plants to assess their phenotype under field conditions. Because deregulation is a lengthy and costly process, the technology and methods presented within this study provide a clear advantage over conventional transgenesis, thereby enabling more groups to contribute to crop improvement and food security.
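The effect of target site length on specificity can be illustrated with a back-of-the-envelope calculation: in a random sequence, the expected number of exact matches to a given site falls fourfold with each added nucleotide. The sketch below is only a rough illustration under loudly simplifying assumptions. It treats all four bases as equally likely and independent, which real genomes (with repeats and paralogs such as the FAD3 family) violate, and it assumes a genome size of roughly 1.1 Gb for soybean; the function name and the genome-size constant are introduced here for illustration and are not from the study.

```python
def expected_chance_matches(site_length: int, genome_size_bp: float) -> float:
    """Expected exact matches to one specific site in random sequence.

    Counts both strands. Assumes equal, independent base frequencies,
    which is a simplification that ignores repeats and gene families.
    """
    if site_length <= 0 or genome_size_bp <= 0:
        raise ValueError("site_length and genome_size_bp must be positive")
    return 2 * genome_size_bp / 4 ** site_length


# Assumption: approximate soybean genome size, used only for scale.
ASSUMED_GENOME_SIZE_BP = 1.1e9

# Site lengths from the text: ~19-22 nt (CRISPR/Cas9), 30-40 nt (TALEN pairs).
for length in (19, 22, 30, 40):
    matches = expected_chance_matches(length, ASSUMED_GENOME_SIZE_BP)
    print(f"{length} nt: {matches:.2e} expected chance matches")
```

The calculation addresses exact matches only; because both nuclease types tolerate some mismatches, as the protoplast data for FAD3B and FAD3C show, real off-target risk depends on near-matches as well.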

Conclusions
Here we describe methods to efficiently stack quality traits within plants using sequence-specific nucleases. TALENs targeting FAD3 were directly delivered to soybean fad2-1a and fad2-1b knockout lines to produce triple knockout fad2-1a fad2-1b fad3 plants. Seed oil from the triple knockout lines had significantly altered fatty acid levels, compared to the parent fad2-1a fad2-1b lines. The polyunsaturated fatty acids, linoleic and linolenic acid, decreased to levels below 3 %, and the monounsaturated fatty acid oleic acid increased to levels over 80 %.

Plant material
Plant material used within this study was from soybean [Glycine max (L.) Merr.] variety 'Bert'.

Plasmid construction
Coding sequences for the TALEN pairs used in this study (GmFAD3_T01.1, GmFAD3_T02.1, and GmFAD3_T03.1) were synthesized as previously described [33]. Individual TALEN monomers were cloned into protoplast expression vectors harboring a nopaline synthase (NOS) promoter and terminator. TALEN backbone architecture comprised N-terminal truncations (N152: TAAAKFERQHMDSIDIADLRTLGYSQQQQEKIKPKVRSTVAQHHEALVGHGFTHAHIVALSQHPAALGTVAVKYQDMIAALPEATHEAIVGVGKQWSGARALEALLTVAGELRGPPLQLDTGQLLKIAKRGGVTAVEAVHAWRNALTGAPLN) and C-terminal truncations (C40: SIVAQLSRPDPALAALTNDHLVALACLGGRPALDAVKKGL). Each TALEN monomer comprised 15 repeat domains for targeting 15 nucleotides of FAD3 sequence, as shown in Fig. 1c. Repeat variable diresidues within the TALE repeats included NI (for targeting adenine), HD (for targeting cytosine), NN (for targeting guanine), and NG (for targeting thymine). To facilitate trafficking to plant cell nuclei, an SV40 NLS (PKKKRKV) was added to the N-terminus of the TALEN protein. The size of plasmids encoding TALE monomers was 6151 bp. Plasmids were isolated from bacteria using the QIAGEN® maxiprep kit.
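The 'one RVD to one base pair' code described above (NI for adenine, HD for cytosine, NN for guanine, NG for thymine) maps a 15-nucleotide binding site directly onto an ordered list of 15 repeat variable diresidues. A minimal sketch of that mapping follows; the function name and the example sequence are hypothetical, and the sequence shown is a placeholder rather than an actual FAD3 target site from Fig. 1c.

```python
# RVD code as stated in the text: one diresidue per target nucleotide.
RVD_FOR_BASE = {"A": "NI", "C": "HD", "G": "NN", "T": "NG"}


def rvds_for_target(target: str, expected_length=15) -> list:
    """Return the ordered RVDs needed to recognize a DNA binding site.

    expected_length defaults to the 15 repeats used per monomer in this
    study; pass None to skip the length check.
    """
    target = target.upper()
    if expected_length is not None and len(target) != expected_length:
        raise ValueError(f"expected {expected_length} nucleotides, got {len(target)}")
    try:
        return [RVD_FOR_BASE[base] for base in target]
    except KeyError as exc:
        raise ValueError(f"unsupported nucleotide: {exc.args[0]!r}") from exc


# Hypothetical 15-nt binding site, for illustration only.
print(" ".join(rvds_for_target("TACGATTCAGGCATC")))
```

Because the mapping is a simple lookup, retargeting a TALEN to a new site only requires reordering the repeats, which is the ease-of-construction property that distinguishes TALENs from meganucleases and zinc-finger nucleases.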

Soybean transformation
Experiments within this study were performed using the soybean variety 'Bert' and the fad2-1a fad2-1b double homozygous mutant soybean line, as previously described [14]. Transformation was carried out following previously described protocols [14,23]. Briefly, half-seeds were transformed with plasmid sequence encoding TALEN pairs and a selectable marker, and soybean plants were regenerated on medium containing glufosinate [34]. Explants were incubated in a growth incubator at 28 °C with ~110 μmol/m²/s of light. Rooted seedlings were transferred to soil containing a peat-based substrate (BM1, Berger, Les Tourbièr Berger Ltee, Saint-Modeste, QC, Canada), and acclimated to ambient humidity.

Protoplast transformation
TALEN pairs were assessed for activity using soybean protoplasts. Protoplasts were isolated from immature cotyledons using a protocol similar to those previously described [35]. Briefly, immature cotyledons were digested in an enzyme solution containing 0.45 M D-mannitol, 20 mM MES, 2 % cellulase, 0.5 % macerozyme, pH 5.8. Digestion was carried out for 16 h at 25 °C in the dark with shaking at 26 rpm. Protoplasts were passed through a 100 μm cell filter and collected in a 50 mL Falcon tube. Protoplasts were then pelleted by centrifugation at 100 rpm for 5 min. Supernatant was removed and cells were resuspended in WB-N solution (0.45 M D-mannitol, 10 mM calcium chloride, pH 5.8). Protoplasts were transformed using polyethylene glycol 4000 (20 % diluted concentration) for 30 min. For each TALEN pair, ~500 000 protoplasts were transformed with 30 μg of plasmid (15 μg for each of the two TALEN monomer plasmids). Protoplasts were washed three times in WB-N and transferred to low retention 15 × 10 mm petri plates. Protoplasts were incubated at 25 °C for 48 h before genomic DNA was isolated.

Oil analysis
Individual T2 seeds from homozygous-mutant fad2-1a fad2-1b fad3a T1 plants were isolated and assessed for oil composition, as shown in Fig. 3. Five T2 seeds from each of four different T1 parent plants with a fad2-1a fad2-1b fad3a genotype were sampled; 20 seeds from fad2-1a fad2-1b lines, five seeds from one fad3a parent, and four WT seeds were also sampled. Seeds were sent to Eurofins BioDiagnostics (507 Highland Drive, River Falls, WI 54022) for fatty acid analysis. Oil composition was determined using gas chromatographic analysis of fatty acid methyl esters (GC FAME analysis). The fatty acid levels were reported as the percentage of palmitic, stearic, oleic, linoleic, and linolenic acids relative to the total fatty acids. Raw GC FAME data are presented in Additional file 1.
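Reporting each fatty acid as a percentage of the total amounts to dividing each measurement by the sum over the five fatty acids. The sketch below illustrates that calculation under stated assumptions: the input is taken to be one GC peak area per fatty acid, the function name is hypothetical, and the example values are invented for illustration rather than taken from Additional file 1. The service laboratory's actual quantification procedure is not described in the text.

```python
FATTY_ACIDS = ("palmitic", "stearic", "oleic", "linoleic", "linolenic")


def percent_composition(peak_areas: dict) -> dict:
    """Express each fatty acid as a percentage of the five-acid total.

    peak_areas maps fatty acid name to a non-negative measurement
    (assumed here to be a GC peak area).
    """
    missing = [name for name in FATTY_ACIDS if name not in peak_areas]
    if missing:
        raise ValueError(f"missing fatty acids: {missing}")
    total = sum(peak_areas[name] for name in FATTY_ACIDS)
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {name: 100.0 * peak_areas[name] / total for name in FATTY_ACIDS}


# Invented peak areas for a single seed, for illustration only.
example = {"palmitic": 260, "stearic": 80, "oleic": 400,
           "linoleic": 1100, "linolenic": 160}
for name, value in percent_composition(example).items():
    print(f"{name}: {value:.1f} %")
```

Because the five percentages are constrained to sum to 100, a decrease in one fatty acid necessarily appears as an increase in the others, which is worth keeping in mind when comparing genotypes.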

Additional file
Additional file 1: GC FAME data contains the raw GC FAME data used to generate the graph in