Genetic variation assessment of stacked-trait transgenic maize via conventional breeding

Background The safety assessment and control of stacked transgenic crops is increasingly important due to continuous crop development and is urgently needed in China. The genetic stability of foreign genes and unintended effects are the primary problems encountered in safety assessment. Omics techniques are useful for addressing these problems. The stacked transgenic maize variety 12–5 × IE034, which has insect-resistant and glyphosate-tolerant traits, was developed via a breeding stack using 12–5 and IE034 as parents. Using 12–5 × IE034, its parents (12–5 and IE034), and different maize varieties as materials, we performed proteomic profiling, molecular characterization and a genetic stability analysis. Results Our results showed that the copy number of foreign genes in 12–5 × IE034 is identical to that of its parents 12–5 and IE034. Foreign genes can be stably inherited over different generations. Proteomic profiling analysis found no newly expressed proteins in 12–5 × IE034, and the differences in protein expression between 12–5 × IE034 and its parents were within the range of variation of conventional maize varieties. The expression levels of key enzymes participating in the shikimic acid pathway, which is related to glyphosate tolerance, in 12–5 × IE034 were not significantly different from those of its parents or five conventional maize varieties, which indicated that without selective pressure by glyphosate, the introduced EPSPS does not have a pronounced impact on the synthesis of aromatic amino acids in maize. Conclusions Stacked-trait development via conventional breeding did not have an impact on the genetic stability of T-DNA, and the impact of stacked breeding on the maize proteome was less significant than that of genotypic differences. The results of this study provide a theoretical basis for the development of a safety assessment approach for stacked-trait transgenic crops in China.


Background
Stacked genetically modified crops (GMCs) with their improved traits, versatility and low cost, have been well received by many growers and researchers since their inception and are now leading transgenic crop developments. In 2016, stacked GMCs were grown in 14 countries and had a planting area of 75.4 million hectares, accounting for 41% of the global transgenic crop planting area [1]. The rapid application of stacked GMCs has led to concerns over whether the safety of such products differs from that of single-trait products and how the safety of such products will be assessed.
Stacked GMCs can be obtained through cotransformation, retransformation and conventional breeding [2,3]. Typically, a stacked GMC that is produced by cotransformation and retransformation requires a de novo safety assessment as a new event [4]. However, the requirements for the safety assessment of products obtained using conventional breeding stack strategies are not standardized and differ among countries [5][6][7][8][9]. The main question is whether a breeding stack creates unintended effects and changes that require additional safety assessments. Two primary concerns are 1) whether a breeding stack can increase genomic instability and 2) whether potential interactions between the products of the transgenes in stacked GMCs impact safety [4,10].
Recently, omics approaches including genomics, transcriptomics, metabolomics and proteomics have provided a valuable platform to analyze the unintended effects of GMCs [11][12][13][14]. Proteomic analysis is especially useful to assess unintended effects in GMCs because proteins are responsible for much of plant growth and metabolism [15]. Ren et al. analyzed the impacts of different environmental treatments and genetic modifications on the proteome of Arabidopsis. A total of 102 significantly different proteins were detected between 12 transgenic Arabidopsis plants featuring different T-DNA insertion sites and wild-type Arabidopsis. The impact of cold treatment on the Arabidopsis proteome was more significant than that of genetic modification [16]. Proteomics analyses have also been used to test for unintended effects in GMCs including genetically modified (GM) rice [17], oilseed [15], tomato [18], maize [11,19], wheat [20], pea [21] and tobacco [22]. Most of these studies have found that the percentage of significantly different proteins between transgenic and non-transgenic varieties is very low and that the differences were expected or within the range of natural variation [15,17]. Agapito-Tenfen et al. [23] reported that protein changes observed in the stacked insecticidal (cry) and herbicide tolerance (epsps) transgenic maize proteome differed significantly from those of single event lines and a conventional counterpart. Using transcriptomics and metabolomics profile analysis, Wang et al. [24] reported far fewer differences in gene expression and metabolites resulting from the breeding stack than those found among traditional maize varieties. An important issue to address is how altered protein production compares with the range of natural variability after a breeding stack event.
The stacked transgenic maize 12-5 × IE034 which contains the insecticidal cry and glyphosate tolerance G10-epsps genes was obtained by sexual hybridization of transgenic maize 12-5 and IE034. It has simultaneous resistance to glyphosate and pests. The target genes can be expressed successfully at the RNA and protein level in 12-5 × IE034 [24]. In this study, proteomics was used as a molecular profiling technique to identify potential effects of the breeding stack in GM varieties. We compared the proteomic data of eight maize varieties including 12-5 × IE034, its breeding parents (12-5 and IE034), and five traditional maize varieties, and we evaluated the protein changes due to variety, transformation and the breeding stack. The results indicated that far fewer protein differences resulted from the breeding stack than from transformation and traditional maize varieties. This finding provides a theoretical basis and scientific data for the development of a safety assessment of stacked GMCs in China.
Micro-droplet digital PCR is an absolute quantification technique for nucleic acid molecules based on the Poisson distribution and allows the copy number of foreign DNA to be determined quickly and accurately. The primers and probes used in this study showed good specificity and were capable of clearly distinguishing positive and negative micro-droplets. The number of micro-droplets generated in the test was greater than 13,000, which met the Poisson distribution criterion. In addition, the relative standard deviation (RSD) value for the number of micro-droplets generated by three wells was smaller than 0.25, which met the EU's nucleic acid molecule detection requirement. The copy number of the target genes was calculated using a prepared digital PCR system. The copy number of the T-DNA integrated in the genome of the stacked-trait transgenic maize 12-5 × IE034 was 0.47-0.5, whereas that integrated in the parent events was 0.97-1.07 (Table 1). This difference was expected because the heterozygous foreign DNA content in 12-5 × IE034 was theoretically half of that in its transgenic parents 12-5 and IE034. The micro-droplet digital PCR results were consistent with the theoretical value, which indicated that stacked-trait development via conventional breeding is not expected to change the copy number of foreign DNA in the genome.
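The Poisson reasoning behind the droplet counts can be illustrated with a minimal Python sketch. It assumes the standard zero-class Poisson correction (mean copies per droplet = -ln(1 - fraction of positive droplets)) and takes the copy number as the ratio of target to reference. The function names and all droplet counts below are hypothetical and are not data from this study.

```python
import math
from statistics import mean, stdev


def copies_per_droplet(positive: int, total: int) -> float:
    """Mean copies per droplet from the Poisson zero class: -ln(1 - p)."""
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total")
    return -math.log(1.0 - positive / total)


def copy_number_ratio(target_pos: int, ref_pos: int, total: int) -> float:
    """Target copies relative to a single-copy reference gene."""
    return copies_per_droplet(target_pos, total) / copies_per_droplet(ref_pos, total)


def droplet_rsd(droplet_counts: list[int]) -> float:
    """Relative standard deviation of droplet numbers across wells."""
    return stdev(droplet_counts) / mean(droplet_counts)


# Hypothetical counts for illustration only (not measured values).
total_droplets = 14000
reference_positive = 2800
hemi_ratio = copy_number_ratio(1460, reference_positive, total_droplets)
homo_ratio = copy_number_ratio(2800, reference_positive, total_droplets)
rsd = droplet_rsd([13500, 14200, 13900])

print(f"heterozygous-like ratio: {hemi_ratio:.2f}")
print(f"homozygous-like ratio:   {homo_ratio:.2f}")
print(f"droplet RSD:             {rsd:.3f}")
```

With these illustrative counts, a target that is positive in roughly half as many droplets as the reference gives a ratio close to 0.5, which mirrors the halving expected for a heterozygous insert.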

Determination of the expression levels of foreign genes in stacked-trait transgenic maize 12-5 × IE034
The results of real-time PCR indicated that the expression of the target genes in 12-5 × IE034 was lower than that in the parents 12-5 and IE034. The expression levels of cry1Ab and G10-EPSPs in 12-5 were nearly twice those of 12-5 × IE034. The expression level of cry1Ie in IE034 was 2.5-fold that of 12-5 × IE034.

Inter-maize variety proteome analysis
In this study, mass spectrometry generated 449,238 secondary (MS/MS) spectra. The number of identified spectra was 164,031, yielding an identification rate of 36.51%. In addition, 21,837 peptides were identified, and the average peptide length was 13.55 amino acids, which was within a reasonable range of peptide lengths. A total of 3560 proteins were identified, yielding an average protein identification coverage of 23.93%. In addition, 2772 proteins containing at least two unique peptides were identified, accounting for 77.87% of the total number of identified proteins (Fig. 3).
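The two percentages quoted above follow directly from the reported counts. The short Python check below recomputes them; the variable names are illustrative, and the counts are the ones given in the text.

```python
# Counts reported in the text.
spectra_total = 449_238        # spectra (patterns) generated
spectra_identified = 164_031   # spectra (patterns) identified
proteins_total = 3560          # proteins identified
proteins_two_unique = 2772     # proteins with at least two unique peptides


def percent(part: int, whole: int) -> float:
    """Share of `part` in `whole`, as a percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


identification_rate = percent(spectra_identified, spectra_total)
two_peptide_share = percent(proteins_two_unique, proteins_total)

print(f"identification rate: {identification_rate}%")
print(f"proteins with >= 2 unique peptides: {two_peptide_share}%")
```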

Proteomic analysis of genotype effects
The number of differentially expressed proteins in the various traditional maize varieties ranged from 102 to 380 (Fig. 4). The highest number of differentially expressed proteins (380) was observed between Z58 and Z31, which are the two recipient varieties used to establish 12-5 and IE034, respectively. This result further demonstrated that the genetic distance between the two recipient varieties of the transgenic maize is large. Gene Ontology (GO) analysis showed that the differentially expressed proteins were primarily associated with the following terms: metabolic process and cellular process in the biological process category, cell and cell part in the cellular component category, and binding and antioxidant in the catalytic activity category. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis showed that the differentially expressed proteins were primarily related to metabolic pathways, biosynthesis of secondary metabolites, ribosomes, and microbial metabolism in diverse environments.

Proteomic analysis of the transgene effects
A total of 264 and 397 differentially expressed proteins were found between IE034 and its recipient Z31, and between 12-5 and its recipient Z58, respectively. GO analysis showed that the proteins were associated with the following terms: metabolic process, cellular process, cell, cell part, organelle, binding and catalytic activity (Fig. 5a). KEGG analysis indicated that these proteins were primarily related to metabolic pathways and biosynthesis of secondary metabolites (Fig. 5b).
Correlation analysis of differentially expressed proteins among IE034, 12-5 and the recipients Z31 and Z58 through pairwise comparisons showed 72 common differentially expressed proteins in the comparison groups of IE034/Z31, Z58/Z31 and 12-5/Z58. In the pairwise comparison of groups, 73, 79 and 157 proteins were exclusively differentially expressed in Z31/IE034, Z31/Z58, and Z58/12-5, respectively (Fig. 6). These differentially expressed proteins were neither toxins nor allergens, and almost all were associated with the terms metabolic process, cellular process, cell, cell part, organelle, binding and catalytic activity.

Proteomic analysis of stack breeding effects
Relative to its parents 12-5 and IE034, 12-5 × IE034 contained 72 and 87 differentially expressed proteins, respectively. These differentially expressed proteins were primarily associated with the following GO terms: metabolic process, cellular process and response to stimulus in the biological process category, cell, cell part and organelle in the cellular component category, and catalytic activity and binding in the molecular function category (Fig. 7a). In the KEGG analysis, the differentially expressed proteins were primarily related to metabolic pathways, biosynthesis of secondary metabolites, ribosomes and microbial metabolism in diverse environments (Fig. 7b). These results indicated that the functions of the differentially expressed proteins between 12-5 × IE034 and its parents are similar to those of proteins differentially expressed among other maize varieties.

Effect of stacked breeding on the shikimic acid pathway
GO and KEGG analysis of the identified proteins resulted in 70 proteins annotated to the biosynthesis pathways of lysine, tyrosine and phenylalanine, including such key enzymes as chorismic acid synthetase, EPSP synthase and shikimate kinase. Analysis of the protein expression levels indicated that the expression levels of the key enzymes participating in the shikimic acid pathway were not significantly different between the stacked-trait GM maize 12-5 × IE034 and each of its parents 12-5 and IE034, or any of the five conventional maize varieties. This result indicated that without the presence of selective pressure by glyphosate, integration of the gene encoding the EPSPS protein is not expected to produce a marked impact on the shikimic acid pathway or on the synthesis of plant aromatic amino acids (Table 3).
Therefore, it can be concluded that the impact of stacked-trait development via conventional breeding on the maize proteome was less significant than that of genetic modification or genotype. In addition, stacked-trait development via conventional breeding did not have a significant impact on the maize proteome.

Discussion
A transcript level reduction of approximately 50% of each target foreign gene was observed in 12-5 × IE034 relative to the levels in the parents 12-5 and IE034. Similarly, Agapito-Tenfen et al. [23] observed significant reductions (from 29 to 41%) in the transcript levels of three transgenes in stacked varieties relative to the levels in parental single event varieties. Homologous ubiquitin promoters control foreign gene expression in the stacked line 12-5 × IE034. The observed reduction may be due to homology-dependent gene silencing resulting from the introduction of transgenes [25][26][27][28][29][30]. Gene silencing mediated by 35S promoter homology is a common problem and occurs in tagged lines from different collections [31]. Alternatively, the reduction in gene expression at the transcript level might be related to the high energetic demand of the cell. High energetic costs occur when foreign genes are driven by constitutive promoters in transgenic plants [32,33]. Another possible reason is that the foreign genes in 12-5 × IE034 are heterozygous. Changes in the expression level may cause environmental or food safety concerns. Therefore, in the safety assessment, it is necessary to investigate the expression level of foreign genes in stacked lines.
Natural variation is widespread in the plant proteome. Nonspecific proteome profile and specific protein analyses have shown that protein expression is impacted by genotype and environmental factors [11][34][35][36][37]. Lehesranta et al. [38] found that most detected proteins exhibited significant differences between one or more GM potato varieties and landraces. Agapito-Tenfen et al. [23] reported that protein changes in stacked transgenic maize differed significantly from those of single event lines and a conventional counterpart. In the present study, the proteomic profiles of different traditional maize varieties, stacked transgenic maize and its parents were analyzed using protein iTRAQ technology. A total of 102-380 differentially abundant proteins were detected among the five traditional maize varieties from different provinces. Seventy-two and 87 differentially abundant proteins were found between 12-5 × IE034 and its single-trait parents 12-5 and IE034, respectively. These proteins are primarily associated with the same KEGG pathways. In addition, no new non-target proteins were found in a comparison between 12-5 × IE034 and the traditional varieties. These results indicated that the impact of stacked-trait development via conventional breeding on the maize proteome was less significant than that of genetic manipulation or natural variation among maize varieties.
It is common to combine beneficial traits when breeding new crop varieties. Scientists combine multiple favorable traits, such as disease resistance, insect resistance and high yield, by leveraging hybrid vigor to develop new varieties that meet production demands. Gene introgression and gene pyramiding would inevitably occur in this breeding process. It has been suggested that over two decades, a total of 111 genes of 19 crops have been transferred from wild-type to cultivated species, of which 80% were related to disease resistance [39,40]. The development of stacked-trait transgenic crops using single-trait transgenic crops as parents is substantially equivalent to conventional cross breeding which does not require molecular-level genetic manipulation. New varieties obtained through conventional cross breeding based on native crop varieties have a long, safe history as food/feed and are not required to undergo safety assessment prior to commercial cultivation. The process of stacked-trait development via conventional breeding is equivalent to the conventional cross breeding process except that transgenic crops are used as the parents. In addition, parent transgenic crops have undergone comprehensive safety assessments to ensure that they are as safe as recipient varieties. Therefore, it can be concluded that stacked-trait transgenic crops developed via conventional breeding do not present higher risks regarding food and feed safety than do their parents [10,41].
Evaluations of substantial equivalence are essential for the commercial release of transgenic crops, and substantial equivalence is an important concept in transgenic crop safety assessments [42,43]. Substantial equivalence has been widely used in safety assessments of transgenic crops worldwide [44]. The substantial equivalence concept also applies to stacked-trait transgenic crops developed via conventional breeding. Comparative analyses of 15 agronomic characteristics such as pollen viability, yield, stalk and root lodging, seedling vigor, disease resistance, insect resistance, and herbicide tolerance of stacked-trait transgenic MON 89034 × TC1507 × NK603 × DAS-40278-9 maize developed via conventional breeding indicated that transgenic maize was equivalent to conventional maize except in its targeted traits [45]. Geography- and season-related natural variation may change soybean components, and the impact of natural variation on soybean components such as isoflavones, fatty acids and vitamin E was shown to be more significant than that of genetic manipulation [46].
Thus, a breeding stack of two or more transgenic events is, in essence, a traditional breeding process. The stacking process does not involve gene transfer in vitro. In addition, the single-trait event varieties used as parents often undergo rigorous safety assessment before commercial release. Thus, the safety assessment of stacked GMCs should adopt simplified procedures based on the safety assessment of the single-trait parents. The producer should provide data related to genetic stability, foreign gene expression and the interaction between target genes as well as regulatory elements on a case-by-case basis.

Conclusions
Stack breeding is substantially equivalent to the traditional breeding process. It does not affect the insertion site, copy number or genetic stability of the foreign gene. Foreign gene expression in stacked transgenic maize is lower than that in its parents. Proteomic profiling showed that the impact of stacked breeding on the maize proteome was less significant than that of genotypic differences. This is the first report on the comparative proteomic profiling of stacked versus different maize varieties. This result provides a theoretical basis for the development of a safety assessment approach to stacked-trait transgenic crops in China.

PCR detection of stacked-trait GM maize 12-5 × IE034
Genomic DNA (gDNA) was extracted from the leaves of transgenic maize using a broad-spectrum plant genomic DNA quick extraction kit (Tiangen Biotech (Beijing) Co., Ltd.) and was used for multiplex PCR detection of three target genes, cry1Ab/cry2Aj, cry1Ie and G10-EPSPs. The operating steps and primers used in this study were as previously described by Zhang et al. [47].

Copy number determination of foreign genes in stacked-trait GM maize 12-5 × IE034
The following PCR primers and probes were designed based on the sequences of cry1Ab/cry2Aj and cry1Ie: AB-F (GAGCCTGTTCCCCAACTACG), AB-R (GGTGTAGATGGTGATGCTGTTC), AB-probe (HEX-ACTACGACAGCCGCACCTACCCCAT-BHQ-1), IE-F (AACCCCGACAAGCACCAGAG), IE-R (GGAAGTCCTCGTGGTTGATGTT) and IE-probe (HEX-CACCAGAGCCTGAGCAGCAACGCC-BHQ-1). The following primers and probes were designed using the maize single-copy native gene hmga as a reference gene and by referring to the national standard GB/T1945.7-2004: HMG-F (TTGGACTAGAAATCTCGTGCTGA), HMG-R (GCTACATAGGGAGCCTTGTCCT) and HMG-probe (6-FAM-CAATCCACACAAACGCACGCGTA-BHQ-1). A 20-μL probe digital PCR system (containing 10 μL of 2× ddPCR Super Mix for Probes, 1.4 μL of forward primer (700 nM), 1.4 μL of reverse primer (700 nM), 0.5 μL of probe (FAM/HEX) (250 nM), 1 μL of DNA (25 ng/μL), and 5.7 μL of H2O) was prepared. The well-mixed reaction system was added to a droplet generator to obtain micro-droplets, which were then transferred to a 96-well plate with a specific heat seal and incubated for 10 s at 180°C. PCR was performed using a QX200 platform (Bio-Rad, USA) under the following conditions: pre-denaturation (94°C, 10 min); cycles of denaturation (94°C, 30 s) and annealing and extension (62°C, 60 s); and a final incubation (98°C, 10 min). Upon completion of the reaction, a micro-droplet reader was used to read the signal. QuantaSoft software (Bio-Rad, USA) was used to analyze the test results. The T-DNA copy number was calculated using the following formula: copy number of T-DNA = T-DNA content/reference gene content [48].
Fig. 7 GO and KEGG analysis of differentially expressed proteins between 12-5 × IE034 and its parents. a GO analysis of differentially expressed proteins between 12-5 × IE034 and its parents. 12-5/12-5 × IE034, the GO analysis results of differentially expressed proteins between transgenic maize 12-5 × IE034 and its parent 12-5. IE034/12-5 × IE034, the GO analysis results of differentially expressed proteins between transgenic maize 12-5 × IE034 and its parent IE034. The red column presents the functional classification ratios of upregulated proteins under the categories biological process, cellular component and molecular function. The green column presents the functional classification ratios of downregulated proteins under the categories biological process, cellular component and molecular function. b KEGG analysis results of differentially expressed proteins between 12-5 × IE034 and its parents. The percentages of the top ten pathways related to the differentially expressed proteins are shown. The number of proteins in each pathway is shown in parentheses.
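The component volumes of the 20-μL digital PCR reaction described in this section can be checked with a short Python sketch. The volumes and the stated final concentrations come from the text; the 10 μM primer and probe stock concentration is an assumption (the text does not give it), chosen because it reproduces the stated 700 nM and 250 nM values. All names are illustrative.

```python
REACTION_UL = 20.0

# Volumes (in microlitres) as listed in the text.
components_ul = {
    "2x ddPCR supermix": 10.0,
    "forward primer": 1.4,
    "reverse primer": 1.4,
    "probe": 0.5,
    "DNA template": 1.0,
    "water": 5.7,
}


def final_conc_nM(stock_uM: float, volume_ul: float,
                  reaction_ul: float = REACTION_UL) -> float:
    """Final concentration (nM) after diluting a stock into the reaction."""
    return stock_uM * 1000.0 * volume_ul / reaction_ul


ASSUMED_STOCK_UM = 10.0  # assumption: stock concentration is not stated in the text

total_ul = sum(components_ul.values())
primer_nM = final_conc_nM(ASSUMED_STOCK_UM, components_ul["forward primer"])
probe_nM = final_conc_nM(ASSUMED_STOCK_UM, components_ul["probe"])
dna_ng = 25.0 * components_ul["DNA template"]  # 25 ng/uL template

print(f"total volume: {total_ul:.1f} uL")
print(f"primer: {primer_nM:.0f} nM, probe: {probe_nM:.0f} nM, DNA: {dna_ng:.0f} ng")
```

Under the assumed stock concentration, the listed volumes sum to the stated reaction volume and give the stated primer and probe concentrations.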

qRT-PCR detection of stacked-trait GM maize 12-5 × IE034
Maize total RNA was extracted using an EASYspin plant RNA extraction kit. The extracted RNA was reverse transcribed to cDNA through catalysis by reverse transcriptase. Using the first cDNA strand as a template, qRT-PCR was performed on an ABI7500 Real-Time System (Applied Biosystems, USA) using a SYBR Premix Ex Taq kit (TaKaRa, Dalian, China). For qRT-PCR analysis, the actin gene was used as an internal control, and the relative quantification method was used to assess the fold changes of the target genes. Five biological and three technical replicates were performed for each sample. The primers AC200F (ATGTTTCCTGGGATTGCCGAT) and AC200R (CCAGTTTCGTCATACTCTCCCTTG) were used for actin gene amplification. The primer pairs Ab-189-F (GAGCCTGTTCCCCAACTACG)/Ab-189-R (GGTGTAGATGGTGATGCTGTTC), GF1 (CCTCTGGCACCACTTTCGTGACCG)/GR1 (CGGAGCGTGGGACTTGATGTC), and IE256F (ATGTTTCCTGGGATTGCCGAT)/IE256R (CCAGTTTCGTCATACTCTCCCTTG) were used for cry1Ab/cry2Aj, G10evo-EPSPs and cry1Ie gene amplification, respectively. PCR was performed for 15 s at 95°C, followed by 40 cycles of 95°C for 5 s and 60°C for 34 s.
Statistical analysis was carried out using SPSS software. The average quantification cycle (Cq) for each biological replicate was calculated from its technical replicates and used for statistical comparisons based on the standard deviation. The fold change data were log10 transformed because of their non-normal distribution. The fold change means obtained for different samples were compared using t-tests at P < 0.05 (SPSS software).
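The relative quantification step can be sketched in Python. The text only states that "the relative quantification method" was used with actin as the internal control; the 2^(-ΔΔCq) formula and the assumption of 100% amplification efficiency below are illustrative choices, not details confirmed by the source. All Cq values are hypothetical.

```python
import math
from statistics import mean


def fold_change(target_cq_sample, ref_cq_sample,
                target_cq_calibrator, ref_cq_calibrator) -> float:
    """Relative expression by 2^(-ddCq), assuming 100% PCR efficiency.

    Each argument is a list of technical-replicate Cq values; the technical
    replicates are averaged first, as described in the text.
    """
    delta_sample = mean(target_cq_sample) - mean(ref_cq_sample)
    delta_calibrator = mean(target_cq_calibrator) - mean(ref_cq_calibrator)
    return 2.0 ** -(delta_sample - delta_calibrator)


# Hypothetical Cq values (three technical replicates each).
stack_target = [25.1, 25.0, 24.9]   # transgene in the stacked line
stack_actin = [20.0, 20.1, 19.9]    # actin in the stacked line
parent_target = [24.0, 24.1, 23.9]  # transgene in the parent (calibrator)
parent_actin = [20.0, 20.0, 20.0]   # actin in the parent

fc = fold_change(stack_target, stack_actin, parent_target, parent_actin)
log_fc = math.log10(fc)  # log10 transform applied before the t-test

print(f"fold change vs parent: {fc:.2f}")
print(f"log10 fold change:     {log_fc:.3f}")
```

In this illustration a one-cycle increase in normalized Cq corresponds to half the transcript level of the calibrator; the log10-transformed values of the biological replicates would then be the input to the t-test.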

Maize proteome analysis
Ten plants exhibiting normal growth and at the same stage (5-6 leaves) of each maize variety were selected for protein extraction. Soluble protein was extracted from a defined amount of maize leaves and then quantified using the Bradford method [49]. Proteins extracted from 10 plants of each variety were used to create protein pools (in equimolar ratios) before trypsin digestion. The pooled protein solution was diluted to a final concentration of 1 μg/μL. Approximately 500 μL of 50 mM NH4HCO3 and 2 μL of trypsin were added to 100 μL of protein solution, and the mixture was incubated at 37°C for 8-16 h. After trypsin digestion, the peptides were purified using a StrataX C18 column to remove the salt and dried in a freeze dryer. The peptides were divided into six equal parts and labeled using an iTRAQ reagent 8-plex kit (Applied Biosystems, Waltham, US) according to the manufacturer's instructions. Equal amounts of labeled samples were mixed and separated into 12 fractions using a Thermo Dionex UltiMate 3000 BioRS chromatograph and a Durashell C18 separation column (5 μm, 100 Å, 4.6 × 250 mm). An AB SCIEX nano LC-MS/MS (TripleTOF 5600 plus) mass spectrometer, AB SCIEX separation column (internal diameter: 75 μm, filling: 3 μm, ChromXP C18 column materials: 120 Å, length: 12 cm), New Objective injection needle (internal diameter: 20 μm, tip diameter: 10 μm), and Eksigent ChromXP Trap Column (3 μm C18-CL, 120 Å, 350 μm × 0.5 mm) were used in the study.
Protein identification was performed using the ProteinPilot™ V4.5 software, which is specific to the AB Sciex 5600 Plus system. The database used for the ProteinPilot™ V4.5 software analysis was the UniProt Zea mays database, which contains 142,200 proteins. The identified proteins that contained at least one unique peptide fragment and had a confidence level greater than 95% (unused score ≥ 1.3) were considered credible proteins. In addition, peptide fragments with a confidence level greater than or equal to 95% were deemed credible peptides. iTRAQ quantification of the proteome was completed using ProteinPilot software. The ratios output by ProteinPilot were normalized by the median to give the final fold difference for each protein.
Fig. 8 Venn diagram of differentially expressed proteins due to breeding stack. IE034/12-5 × IE034, proteomic comparison of the stacked transgenic maize 12-5 × IE034 and its parent IE034 for detecting stack effects. 12-5/12-5 × IE034, proteomic comparison of the stacked transgenic maize 12-5 × IE034 and its parent 12-5 for detecting stack effects. IE034/12-5, proteomic comparison of the transgenic maize 12-5 and IE034 for detecting differences between the breeding parents.
Proteins were considered significantly differentially expressed when they met both of the following conditions: i) the difference was 1.5 times or greater (upregulated ≥1.5-fold or downregulated ≤0.67-fold), and ii) the P value from the statistical significance test was less than or equal to 0.05 (the P value is calculated by an internal algorithm of the ProteinPilot software from the ratios of the peptides contained in the identified proteins of each sample). Functional annotation of the significantly different proteins was completed by accessing the GO and KEGG databases.
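The two selection criteria can be expressed as a small Python filter. This is a minimal sketch: the median normalization shown here (dividing every ratio by the median ratio) is an assumed reading of the normalization step, since ProteinPilot's internal procedure is not described in the text, and the protein identifiers, ratios and P values are hypothetical.

```python
from statistics import median


def median_normalise(ratios: dict[str, float]) -> dict[str, float]:
    """Divide every ratio by the median ratio (assumed normalization)."""
    centre = median(ratios.values())
    return {protein: value / centre for protein, value in ratios.items()}


def classify(ratio: float, p_value: float,
             up: float = 1.5, down: float = 0.67, alpha: float = 0.05) -> str:
    """Apply both criteria: fold-change threshold and P <= alpha."""
    if p_value > alpha:
        return "unchanged"
    if ratio >= up:
        return "up"
    if ratio <= down:
        return "down"
    return "unchanged"


# Hypothetical iTRAQ ratios and P values for five proteins.
raw_ratios = {"P1": 2.0, "P2": 1.0, "P3": 0.5, "P4": 1.6, "P5": 1.0}
p_values = {"P1": 0.01, "P2": 0.50, "P3": 0.03, "P4": 0.20, "P5": 0.04}

normalised = median_normalise(raw_ratios)
calls = {p: classify(normalised[p], p_values[p]) for p in normalised}

for protein, call in calls.items():
    print(protein, f"{normalised[protein]:.2f}", call)
```

Note that P4 exceeds the fold-change threshold but is not called because its P value is above 0.05, and P5 passes the P value criterion but not the fold-change criterion; both conditions must hold.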