Identification and validation of mutation points associated with waxy phenotype in cassava

Background The granule-bound starch synthase I (GBSSI) enzyme is responsible for the synthesis of amylose, and therefore, its absence results in individuals with a waxy starch phenotype in various amylaceous crops. The validation of mutation points previously associated with the waxy starch phenotype in cassava, as well as the identification of alternative mutant alleles in the GBSSI gene, can allow the development of molecular-assisted selection to introgress the waxy starch mutation into cassava breeding populations. Results A waxy cassava allele has been identified previously, associated with several SNPs. A particular SNP (intron 11) was used to develop SNAP markers for screening heterozygote types in cassava germplasm. Although the molecular segregation corresponds to the expected segregation at 3:1 ratio (dominant gene for the presence of amylose), the homozygotes containing the SNP associated with the waxy mutation did not show waxy phenotypes. To identify more markers, we sequenced the GBSS gene from 89 genotypes, including some that were segregated from a cross with a line carrying the known waxy allele. As a result, 17 mutations in the GBSSI gene were identified, in which only the deletion in exon 6 (MeWxEx6-del-C) was correlated with the waxy phenotype. The evaluation of mutation points by discriminant analysis of principal component analysis (DAPC) also did not completely discriminate the waxy individuals. Therefore, we developed Kompetitive Allele Specific PCR (KASP) markers that allowed discrimination between WX and wx alleles. The results demonstrated the non-existence of heterozygous individuals of the MeWxEx6-del-C deletion in the analyzed germplasm. Therefore, the deletion MeWxEx6-del-C should not be used for assisted selection in genetic backgrounds different from the original source of waxy starch. Also, the alternative SNPs identified in this study were not associated with the waxy phenotype when compared to a panel of accessions with high genetic diversity. Conclusion Although the GBSSI gene can exhibit several mutations in cassava, only the deletion in exon 6 (MeWxEx6-del-C) was correlated with the waxy phenotype in the original AM206–5 source.


Background
Starch is widely used for multiple commercial purposes. Not only does it provide more than 80% of the calories in the human diet [1], it also has many industrial applications, such as for glues and adhesives. It basically consists of two types of polymers: amylose (essentially α1,4polyglucans, linear) and amylopectin (α1,4-polyglucans and α-1,6-polyglucans, branched) [2], whose proportions and arrangements confer characteristics that define the starch's commercial advantages and are the focus of breeding programs. The main commercial sources of starch are maize, cassava, rice, wheat and potato. Cassava is the second most important source of starch worldwide after maize. The functional and specific properties of each of these species define their industrial applications [3][4][5][6].
Amylopectin, the basic unit of the starch granule, is composed of many short chains of glucose molecules. In contrast, amylose is a smaller molecule with longer chains [7]. Starch from different crops typically consists of 20-30% amylose and 70-80% amylopectin [1]. Starches with low or no amylose, known as waxy types, are important because they present lower syneresis and retrogradation, are generally clear and have a more viscoelastic gel form [8]. Retrogradation is a process in which the disrupted amylose and amylopectin chains (usually heated in the presence of water) can gradually re-associate into a different ordered structure. Over time, this reorganization can release the water retained within the structure (syneresis), with detrimental effects on the sensory qualities and storage time of foods derived from this type of starch, primarily because it alters the texture and nutritional properties of the food [9]. Not only does waxy cassava starch present lower syneresis in comparison with waxy starch from other crops [10], the structure of amylopectin is not modified by the mutation [11]. Cassava starch has a neutral taste due to its low contents of lipids and proteins, which gives it competitive advantages over starches from cereals for use in the food industry [12].
Though it is possible to perform chemical and physical modifications of starch to change its functional properties to meet specific market demands, there are certain limitations on these modifications. In addition, regardless of the type, making such changes to starch is always associated with a higher industrial cost. Another aspect to be considered is that consumers are increasingly demanding products that are more natural, with a minimum of industrial processing. Hence, there is a growing market for naturally differentiated starches. Breeding programs can contribute to the discovery of genetic variants of interest and the further incorporation of these characteristics in commercial varieties, and waxy starch is one of these special traits.
From the genetic point of view, a clear understanding of the genes involved in the starch biosynthesis allows the identification of accessions from germplasm carrying the mutations without the need for a laborious phenotyping approach. Among the enzymes related to starch synthesis, GBSSI (granule-bound starch synthase I) is a synthase related to amylose elongation. Waxy mutants of many species exhibit deficient activity of this enzyme [7,13,14]. Several allelic forms derived from different molecular mechanisms have been identified as responsible for the waxy phenotype. In maize, the coding region of the waxy gene comprises 3718 bp, which is composed of 14 exons, ranging from 64 to 392 base pairs (bp), and 13 introns of 81 to 139 bp [15]. According to Fan et al. [16] the wx-D7 allele involves a 30-bp deletion at the junction of the 7th exon-intron and generates an abnormal transcript retaining the 7th intron, which introduces an immature stop codon and inactivates GBSSI. In rice, two SNPs were identified in exons 6 and 10 of the GBSSI gene, with associations to the amylose content and paste properties [17]. A strong correlation was also observed between an SNP located at exon1/intron1 and paste properties in rice starch [18]. In common hexaploid wheat (Triticum aestivum L.), each of the genomes A, B and D has a waxy protein, whose respective locus coder (Wx-A1, Wx-B1, and Wx-D1) has several mutations reported in the literature. In another work, the insertion of an SNP and deletion of a single nucleotide in the null allele Wx-A1 induced premature termination codons approximately 55 nucleotides forward of the mutation point in Triticum dicoccoides A. and T. dicoccum S [19]. More recently, a new null allele was characterized by the insertion of a transposon and consequent loss of function in this species [20]. Two new mutations have also recently been discovered in maize, associated with transposable elements, with a 466-bp retrotransposon inserted in exon 6 and a transposable repeating element of 116 bp inserted in exon 7 [21].
In cassava, studies of the genetic control of the waxy phenotype and even its exploitation for the development of varieties with this characteristic are relatively recent. The first waxy clones in cassava were induced by modifications in GBSSI gene expression via transgenic techniques [22,23] and later by Zhao et al. [24]. Then, a natural mutation was reported by [25] in a series of selfpollinations carried out at the International Center for Tropical Agriculture (CIAT), which evidenced the recessive nature of the waxy gene from genotype AM206-5. Recently, the CRISPR-Cas9 technology was used in cassava to mediate targeted mutagenesis of two genes involved in amylose biosynthesis (GBSSI and Protein Targeting to Starch -PTST1) resulting in waxy clones or clones with reduced amylose content [26].
Aiemnaka et al. [27] performed molecular characterization of the GBSSI gene in segregating progenies from the AM206-5 source and identified potential functional mutations. The authors found an indel in exon 6 with a single base exclusion (cytosine) that creates a premature stop codon; a two-base variant in exon 11 (GC and AT); and a substitution in intron 11 (C to G). Based on this last mutation, the authors developed a SNAP (singlenucleotide-amplified polymorphism) marker whose information would allow breeders to direct self-pollination and backcrossing assisted by molecular markers, with the aim of reducing the costs and time required for breeding programs. In this context, the objectives of the present work were as follows: i) to validate the mutations (SNPs and indels) previously associated with the original source of the waxy starch from AM206-5 in Latin American cassava germplasm; and ii) to identify alternative alleles in the GBSSI gene in a panel of waxy and non-waxy accessions that may be useful in molecular-assisted selection (MAS) for this trait.

Results
Screening of cassava germplasm with SNAP markers SNAP primers developed previously for intron 11 [27] were used to screen 1529 cassava accessions from four countries. The profile analysis of the electrophoresis gels was based on the presence or absence of the fragment amplified by the primers MeWxI11-G and MeWxI11-C, which indicated the presence of heterozygous individuals, whereas the amplification of only one of the two alleles indicated the presence of homozygous individuals for the waxy (MeWxI11-G -GG genotype) or non-waxy (MeWxI11-C, CC genotype) phenotype. According to the molecular analyses, none of the cassava accessions evaluated were identified as having the recessive allelic condition (GG). This result was expected, since none of the 1529 genotyped accessions had the waxy starch phenotype. However, 206 of the evaluated individuals were identified as heterozygotes (CG), because they had alleles amplified by the primers MeWxI11-C and MeWxI11-G.

Genotypic and phenotypic evaluation of S 1 populations with SNAP markers
Twenty-eight heterozygous accessions from the total (206) were self-pollinated in the field with the aim of generating S 1 segregation populations for the waxy gene. From this total (28), three heterozygous accessions (BGM0061, BGM0935 and BGM0438) produced 187 plants (61, 101 and 25, respectively) that were used to evaluate the molecular segregation of the alleles and the waxy phenotype. The results of the allelic segregation of the S 1 progenies, analyzed with the primers MeWxI11-C and MeWxI11-G, showed no significant deviations from the expected proportion of individuals (1:2:1) in progenies S 1 -BGM0061, S 1 -BGM0935 and S 1 -BGM0438, according to the χ 2 test (Table 1). Therefore, even with a small number of individuals in these three progenies, it was possible to observe the Mendelian segregation of the C and G alleles located at intron 11, position 3197 bp of the GBSSI gene, as expected for a recessive trait (Supplement Fig. S1). However, although molecular segregation occurred as expected in progenies S 1 -BGM0061, S 1 -BGM0935 and S 1 -BGM0438, the evaluation of the waxy phenotype in all 28 of the S 1 progenies in the field, based on the 2% iodine test, did not identify any individual with the waxy phenotype. Therefore, this result showed that the mutation of the C by G allele at the 3197 bp position of the GBSSI gene is not a suitable molecular marker for use in molecular-assisted selection of waxy individuals when the genetic background is completely different from the original source of the mutation (AM206-5).

Identification of new genetic variation in the GBSSI by gene sequencing
To identify new allelic variants associated with waxy starch, complete GBSSI gene sequencing was performed on 89 cassava genotypes. After the alignment trimming of the sequences, 16 SNPs were identified in the GBSSI gene as well as one deletion, indicating genotypic differences among the 89 cassava accessions evaluated, which included waxy (AM206-5 derived population) and nonwaxy genotypes (germplasm bank) (Fig. 1). Six SNPs were found in untranslated regions (UTRs), one in a non-coding region and nine in coding regions. However, the allelic variation identified in the GBSSI gene was not able to identify any SNP that could precisely distinguish the waxy and non-waxy individuals (Fig. 2). Only the deletion of the nucleotide cytosine (MeWxEx6-del-C) previously identified by Aiemnaka et al. [27] was found exclusively in waxy accessions. Therefore, the SNPs identified in the GBSSI gene were not able to individually determine stop codons or errors in the genetic code reading that led to the development of a waxy phenotype in cassava.   main objective was to determine how the other mutations could jointly discriminate the waxy phenotype, in order to increase the possibilities of identifying individuals with waxy alleles in the cassava germplasm from Brazil. The overlap of the discriminant function density and the membership assignment indicated that the combination of SNPs was not able to cluster the waxy and non-waxy individuals with 100% accuracy (Fig. 3). Although the classification model demonstrated good accuracy, there was no complete discrimination (100%) of non-waxy individuals (BGM1444, BGM1288, BGM0399, BGM1140, and BGM0263), according to its membership assignment (Fig. 3).
The SNPs that contributed most to the discriminant function are located in the 3'UTR at 4041 and 4195 bp, and exon 13 at 3459 bp. Since the non-waxy individuals BGM1444, BGM1288 and BGM0399 shared these same SNPs with the waxy individuals, the waxy phenotype discrimination could not be performed accurately. The non-waxy individuals BGM1140 and BGM0263 shared the SNPs with small contributions (≤ 0.01) to discriminate the phenotype, located in the 5'UTR at 86 and 93 pb, as well as exon 1 at 381 and 534 bp and exon 3 at 1010 bp. The clone BGM1140 also shared the SNP of 3'UTR region (4195 bp) with waxy individuals.

Genotypic evaluation of the germplasm bank via KASP
After the complete sequencing of the GBSSI gene, we also identified a deletion in exon 6 at position 1701 bp (henceforth referred to as MeWxEx6-del-C), previously described by Aiemnaka et al. [27]. Considering that only the deletion of cytosine at the 1701 bp position of the GBSSI gene was able to distinguish the waxy from nonwaxy phenotypes (Fig. 2), the next step was to implement a genotyping system for this deletion on a large scale. Thus, an initial analysis with the KASP (Kompetitive Allele Specific PCR) technique was performed to evaluate the amplification of the target fragments. In the previous genotyping, the SNAP primers MeWxI11-G and MeWxI11-C were enriched with three nucleotides at the 3′ end to allow differential amplification of an SNP at intron 11. The same strategy was used in the other mutations reported by Aiemnaka et al. [27], without success [data not shown]. In contrast, KASP technology could detect only one base difference by the use of primers differentially marked by fluorescence [28].
HEX fluorescence was used to detect waxy homozygous (wxwx) accessions, and FAM fluorescence to detect non-waxy homozygous (WxWx) accessions, while the heterozygous accessions were identified by intermediate fluorescence (Wxwx). In the first analysis of the MeWxEx6-del-C deletion via KASP, we used a set of 94 accessions, and only the 2017wx-02-17 and 7934-1 genotypes did not generate consistent signals for fluorescence detection or did not amplify (Fig. 4), while the genotypes 2017wx-01-02 and 2017wx-02-19 did not  (Fig. 4). Thus, KASP allowed the precise identification of 98% of the individuals.
The main results of these analyses were as follows: i) absence of heterozygous cassava accessions for deletion of the C allele at position 1701 bp of the GBSSI gene in the evaluated germplasm collection; and ii) high specificity in the identification of the allelic condition of the cassava genotypes. Therefore, this point mutation can allow the accurate identification of the allelic condition of the waxy gene for any cassava germplasm (Fig. 4).

Discussion
The main objective of the initial evaluation of cassava germplasm with SNAP markers was to identify the presence of alleles associated with the waxy phenotype and then guide the self-pollination of the accessions once the waxy phenotype is expressed under the recessive condition. So, we used primers related to the waxy mutations previously described by Aiemnaka et al. [27] in the original source AM205-6. However, the inefficiency of the MeWxI11-G/C primers for the identification of individuals with the waxy phenotype in the germplasm from Latin America was demonstrated when evaluating the 28 S 1 families derived from self-pollination identified as having the G allele associated with the waxy trait. The reported primers did not identify the waxy phenotype, since the S 1 individuals with the G/G genotype had high amylose content according to the iodine test in the field. Therefore, it is possible to speculate that the SNP previously identified by Aiemnaka et al. [27] (C-G at position 3197 bp) can be used for molecular marker-assisted selection only in segregating populations derived from the AM205-6 source, and it is not possible to validate its use for different genetic backgrounds.
For the GBSSI gene sequencing, a panel of 90 cassava accessions from different origins in Latin America representing the great genetic diversity of this crop was used. This high diversity was crucial to identify SNPs that were common to waxy and non-waxy genotypes. The data are therefore consistent and can serve as a basis for further investigations of cassava in relation to waxy starch. Similar results have also been reported for Indian rice varieties in Northeast India, where again mutations in the GBSSI gene previously associated with amylose content did not distinguish the waxy phenotype [29]. Of the five varieties of waxy rice evaluated by the authors, two did not have the mutation, and among the nonwaxy, one mutation occurred in one variety. Choudhury et al. [29] also investigated the OsC1 locus responsible for the apiculus color in rice, and they found that a 10 bp deletion in the OsC1 gene, attributed to a colorless apiculus, was not detected in five of the 21 varieties with this phenotype. Moreover, one of the nine rice varieties with a colored apiculus presented the deletion. Thus, the mutations considered to be associated with waxy starch and apiculus color in rice did not necessarily correspond to the expected phenotype in different genetic backgrounds.
Only the deletion MeWxE6-delC (position 1701 bp) was exclusive of waxy individuals from the AM206-05 original source and thus able to characterize the waxy phenotype in these populations. This deletion creates a premature TGA stop codon at Exon 8 (position 2069 bp) that prevents complete synthesis of the GBSSI enzyme [27]. GBSS is bound to the granule, most likely by synthesizing the amylose within the granular matrix formed by amylopectin. Therefore, mutants produce less or no amylose, suggesting that no other synthase can replace it in this function [30].
The KASP genotyping for screening of the MeWxE6del-C deletion showed high accuracy of waxy and nonwaxy classification [98%]. Indeed, other authors have reported the great flexibility and high accuracy of the KASP technique for high-performance genotyping [28]. The KASP genotyping allowed a clear distinction between the homozygous and heterozygous accessions in the training population, evidencing its great potential in molecular-assisted selection. In this sense, our work is an important advance for cassava breeding, as it validates KASP genotyping for routine detection of the waxy phenotype by the marker MeWxEx6-del-C, when using alleles from the AM206-05 source. In the future, highthroughput screening (HTS) with several important functional markers for cassava will be possible, as has been the case in other crops such as wheat, for which more than 38 polymorphisms from different agronomic traits have been analyzed with the KASP technique [31]. The speed of the KASP genotyping assays was 45 times higher than that of the gel-based PCR markers.
Analyses in different species support the perception that the waxy phenotype is related to the absence/deficiency of the GBSSI enzyme, associated with partial and complete deletions of the gene, the presence of SNPs, and transposable elements, sometimes leading to stop codons [17-21, 32, 33]. In cassava, a first natural recessive mutation was identified in the GBSSI gene characterizing the inheritance of this phenotype [27]. However, this causal mutation determined by the deletion of a cytokine in exon 6 at position 1701 bp was not identified in the cassava germplasm analyzed here, so it will not be possible to explore the use of this mutation to generate segregating populations for the waxy gene in backgrounds other than the original source AM206-05 [34]. However, it is important to note that throughout the evolutionary process, different mutation points have already been described as responsible for the waxy phenotype in other amylaceous species. Therefore, other points of causal mutations can be discovered in cassava based on gene sequencing GBSSI in the whole cassava germplasm of Latin America. Indeed, unpublished results suggest there are at least two different mutations resulting in these waxy phenotypes in cassava (Ceballos, personal communication). In maize, several mutations related to waxy starch have been reported, including insertions and deletions of varying sizes as well as transposable elements and SNPs [15]. The most recent mutations include transposition of the rf2 gene [35], deletions in exons 7 and 10 [16] and transposable elements in exons 6 and 7 [21] identified as functional mutations in the GBSSI gene and related to the waxy phenotype in maize.
Another important aspect is the phenotyping of individuals for quantitative determination of the amylose content, which could allow the identification of alternative allelic patterns for different amylose contents. According to Dobo et al. [36], three polymorphism in rice: exon 1 (G/T polymorphism), exon 6 (A/C polymorphism), and exon 10 (C/T polymorphism), exhibited five different allelic patterns: TCC, TAC, GCC, GAC and GAT. The allelic forms TAC and TCC were found in low-amylose varieties, GCC in varieties with intermediate amylose content, and GAT and GAC in varieties with high levels of amylose.
In general, starch synthesis is complex and derived from the coordinated action of several enzymes and their isoforms [13]. An example is the Protein Targeting to Starch (PTST) identified in Arabidopsis thaliana (L.), which is related to amylose synthesis. This protein acts on targeting and GBSSI transport [37,38], so amylose synthesis is not yet completely elucidated. Mutants without the PTST factor in A. thaliana do not produce amylose because the GBSS protein, which normally binds to the starch, cannot bind in the absence of PTST. Recently, the application of CRISPR-Cas9 was demonstrated to generate cassava clones with modified starch by targeted mutagenesis of PTST1 genes. These cassava clones exhibited lower amylose content in comparison with the wild type, showing that PTST also participates in the synthesis of amylose in the cassava storage roots [26]. Therefore, it is possible that in addition to direct mutations at the gene level, other genomic regions may help to explain the waxy phenotype, and therefore be selectable.
The identification of different functional mutations in the PTST gene (Manes.02G075700) already described and mapped on cassava chromosome 2, as well as the other GBSSI isoform located on chromosome 1 (Manes.01G055700) [39], not evaluated in this work, may bring new contributions to a better understanding of the molecular bases of waxy gene expression in the germplasm of Latin America. Considering that the GBSSI gene is the only synthase known to be responsible for amylose synthesis and that the PTST protein is responsible for GBSSI targeting, a complete (paired-end) sequencing of such genes, including promoter regions, in the cassava germplasm would be one of the first steps in the search for alternative alleles for the waxy phenotype. All of the nucleotide diversity of the evaluated genes would allow an extensive in silico evaluation of points of potential mutations for inactivation of amylose production. This information, linked to the sequencing results, would direct the self-pollination of heterozygous accessions. With populations that have already been obtained, a molecular assessment of the segregation preceded by a field assessment using the iodine test would identify promising individuals. Finally, with validated markers in new sources of waxy genotypes, a KASP evaluation with different genetic backgrounds could be carried out, reaffirming the potential of these mutations to select potential cassava parents for waxy breeding.

Conclusions
Previously reported GBSSI-related SNPs and those found in this work are not useful for molecular markerassisted selection in genetic backgrounds other than the AM206-5 source. Only the deletion in exon 6 (MeW-xEx6-del-C) was able to completely discriminate the non-waxy and waxy genotypes. Although the evaluated germplasm did not have the allele associated with the original AM206-5 source, our results contribute to the establishment of a practical model for the use of molecular-assisted selection for allelic variants associated with the waxy phenotype in cassava, with the goal of optimizing the genotyping and early identification of specific genotypes with the desired alleles for generation of segregating populations.

Methods
Screening of cassava germplasm with single nucleotide amplified polymorphism [SNAP] markers associated with the waxy phenotype in the source AM206-5 A total of 1529 accessions belonging to the Brazilian Cassava Germplasm Bank of Embrapa Mandioca e Fruticultura ("Brazilian Agricultural Research Corporation -Embrapa Cassava and Fruit"), originating from different ecosystems of Brazil, Colombia, Venezuela and Uganda, were analyzed (Supplement -Table S1). DNA extraction was performed according to the CTAB (cetyltrimethylammonium bromide) protocol [40]. To verify the quality and quantity of extracted DNA, ethidium bromide (1.0 mg. L − 1 ) was used to stain the DNA in 1% (w/v) agarose gel (Invitrogen, USA) and was visually compared with various concentrations of Lambda DNA (Invitrogen, USA).
The mutations located in exon 6 (deletion of 1 bp at nucleotide 92) and the SNP at exon 11 (GC to AT) found by Aiemnaka et al. [27] were not optimized even after several PCR adjustments. Therefore, the genotyping of cassava germplasm (landraces and improved varieties) was performed using only the single nucleotide amplified polymorphism (SNAP) primers MeWxI11-G and
To perform the self-pollination of the cassava accessions, the receptive female flowers were covered before opening in the morning with cloth bags (20 × 15 cm) to protect them from pollen contamination. Male flowers of the same genotype were collected in the morning and placed in pre-labeled, large-cap bottles. In the late morning and early afternoon, self-pollination was carried out through the contact of the anthers with the stigma of the female flower to ensure artificial pollination. Then, the pollinated flowers were covered with voile [light cotton cloth] until the collection of the seeds after natural dehiscence.
The seeds from the self-pollination of the 28 accessions were sown in a greenhouse, and after 30 days, the seedlings were transplanted to the field. The plants were harvested and evaluated 10 months after planting. The S 1 plants of each family were evaluated for the presence of waxy starch using the 2% iodine test (2 g KI and 0.2 g I2 in distilled water), applied to a cross section of the roots of all genotypes [25]. The long amylose chains have a high ability to bind to the iodine in the solution, which gives a blue color to the starches containing amylose when stained with iodine. In contrast, amylopectin has a low iodine-binding capacity, and therefore, redbrown stains are characteristic of starches essentially containing only this polymer [7].
After the evaluations of S 1 families in the field, 185 genotypes belonging to families S1-BGM0061 (60 individuals), S1-BGM0463 (24 samples), and S1-BGM0935 (101 samples) were genotyped with SNAP markers, as previously described. The molecular segregation for the GBSSI gene in the S 1 individuals based on the primers MeWxI11-G and MeWxI11-C was evaluated by the χ 2 where f o and f e are the observed and expected frequencies of the phenotypes.

GBSSI gene sequencing
Complete GBSSI gene sequencing was performed on 89 cassava genotypes. Of this total, 54 accessions were considered homozygous (CC) or heterozygous (CG) using the primers MeWxI11-G and MeWxI11-C (non-waxy phenotype), and 35 genotypes were segregating from AM206-5 derived populations. Of the latter, 3 were homozygous (CC) or heterozygous (CG) (non-waxy phenotype), and 32 were homozygous (GG) genotypes (waxy phenotype) ( Table 2). The extraction and quantification of the genomic DNA was performed as described previously. For complete amplification of the gene, five primers were developed using the reference sequence of the GBSSI gene (Manes.02G001000) from cassava genome v6.1 [39] deposited in the Phytozome v12.1 database [41] (Table 3). The Primer3 program [42] was used to design primers with the following criteria: product size amplified between 800 and 1100 bp; annealing temperature above 60°C; and G / C percentage above 40%. The primers were allocated to overlap each of the sequences to ensure complete gene coverage.
For primers optimization, different annealing temperatures (58 to 64°C), magnesium chloride concentrations (1, 1.5 and 2 mM), and different numbers of PCR extension cycles were tested. PCR reactions were optimized in final volume of 50 μl containing 10 ng of DNA, 1X PCR buffer, 1.5 / 2.0 mM MgCl 2 (Invitrogen, USA), 0.2 mM dNTP (Promega, USA), 2 mM of each primer (Integrated DNA Technologies, USA), and 1 U of Taq DNA High Fidelity Polymerase (Invitrogen, USA). The amplification program was optimized using an initial denaturation of one cycle at 95°C for 1 min; followed by 30/35 cycles at 95°C for 15 s, annealing at 62°C for 15 s and 72°C for 30 s; and final extension at 72°C for 7 min, performed in a Veriti® 96-well thermocycler (Applied Biosystems, USA).
The electrophoresis product was purified with ExoSap-IT (Affymetrix, USA), sequenced in both directions using   [43,44], aligned with the reference sequence Manes.02G001000, and the SNPs were identified with the aid of the newSNP v3.0.1 program [45]. Identification of the coding and non-coding regions was performed by comparison with the GBSSI gene (Manes.02G001000) of the cassava genome v6.1 [39].

Discriminant analysis of principal components
Discriminant analysis of principal components (DAPC) was performed based on the allelic variants identified by the complete GBSSI gene sequencing. DAPC was performed, retaining more than 90% of the variation of the data with the first 10 principal components and one discriminant function. The analysis was performed using the adegenet package [46] of the software R v3.5 [47].
Genotypic evaluation of the cassava germplasm for deletion MeWxEx6-del-C The Kompetitive Allele Specific PCR (KASP) technology [28] for screening the deletion MeWxEx6-del-C was performed by the company Intertek AgriTech. The KASP genotyping optimization was performed in two stages. In the first step, a set of individuals with known allelic conditions was used, i.e., 31 individuals homozygous for the waxy gene (wxwx); 31 heterozygous individuals (Wxwx) from segregating populations for the waxy genotype derived from the AM206-5 source, and 32 non-waxy individuals (WxWx) from the germplasm bank (Table 4). Allele-specific fluorescence (HEX and FAM) was detected using the SNPviewer2 v4.0.0 program. and clusters were visualized using R v.3.5 [47]. After optimization and validation of the MeWxEx6-del-C marker, genotyping of 1529 cassava accessions was performed.
Additional file 2: Table S1. List of cassava clones used in screening with single nucleotide amplified polymorphism [SNAP] markers.