Identification of quantitative trait nucleotides and candidate genes for tuber yield and mosaic virus tolerance in an elite population of white guinea yam (Dioscorea rotundata) using genome-wide association scan

Abstract

Background

Improvement of tuber yield and tolerance to viruses are priority objectives in white Guinea yam breeding programs. However, phenotypic selection for these traits is quite challenging due to phenotypic plasticity and cumbersome screening of phenotypic-induced variations. This study assessed quantitative trait nucleotides (QTNs) and the underlying candidate genes related to tuber yield per plant (TYP) and yam mosaic virus (YMV) tolerance in a panel of 406 white Guinea yam (Dioscorea rotundata) breeding lines using a genome-wide association study (GWAS).

Results

Population structure analysis using 5,581 SNPs differentiated the 406 genotypes into seven distinct sub-groups based on delta K. Marker-trait association (MTA) analysis using the multi-locus linear model (mrMLM) identified seventeen QTN regions significant for TYP and five for YMV with various effects. The seventeen QTNs were detected on nine chromosomes, while the five QTNs were identified on five chromosomes. Through marker-effect prediction, we identified variants that predict higher yield and lower virus severity scores in the breeding panel. Gene annotation for the significant SNP loci identified several putative genes associated with growth, development and tuber yield, and others that code for tolerance to mosaic virus.

Conclusion

Application of different multi-locus GWAS models identified 22 QTNs. Our results provide valuable insight for marker validation and deployment for tuber yield and mosaic virus tolerance in white yam breeding. The information on SNP variants and genes from the present study would fast-track the application of genomics-informed selection decisions in breeding white Guinea yam for rapid introgression of the targeted traits through marker validation.

Background

Root and tuber crops are significant contributors to global food supply next to cereal crops. Yam is among the principal root and tuber crops, after cassava and potato, that are widely grown and consumed as subsistence staples [1]. Yam is a collective name for the Dioscorea species extensively cultivated in the tropics and subtropics by smallholder farmers for its starchy underground tuber and aerial bulbils [2, 3]. The global estimated mean annual yam production and gross values are approximately 73 million tons and 14 billion US dollars, respectively, with West Africa accounting for 92% of the total yam production [4, 5]. There are over 600 Dioscorea species, of which 11 are economically significant [6]. White Guinea yam (D. rotundata), indigenous to Africa, is the most produced and consumed among cultivated species, supporting the livelihood of over 300 million people [2]. Yam is also important in many key life ceremonies in the major producing areas of West Africa [7].

Despite its socio-economic importance, a significant yield increase has not been achieved over the decades compared to cereal crops [1]. Improved varieties are vital for attaining increased productivity in farmers’ fields. The development of improved yam varieties requires a better understanding of the genetic control of traits contributing to the increased yield and acceptable quality by growers and consumers. However, the breeding efforts have not adequately explored the genetic basis of tuber yield and virus resistance traits to fast-track improved cultivar development. Genes controlling key traits such as resistance to pests and diseases, tuber yield, and tuber quality traits exhibit quantitative inheritance. They may not be linked in a preferred direction, making improving these traits challenging using conventional breeding techniques [8]. In QTL mapping studies, the variation in virus resistance is attributed to a single major locus with a modest contribution [9]. Two random amplification of polymorphic DNA (RAPD) markers tightly linked in the coupling phase with Ymv-1 locus on the same linkage group were reported in resistant genotypes of D. rotundata.

For tuber yield, limited knowledge exists regarding QTL mapping studies [8]. The QTLs detected for YMV in yam were mainly based on conventional family-based linkage mapping. In contrast, the GWAS strategy using naturally occurring variants is a more robust and efficient method for identifying significant loci and the genes involved in the genetic control of complex traits. The GWAS strategy has increasingly been utilized in many crops, including root and tuber crops, to dissect the underlying genetic control mechanism in complex traits. However, GWAS mapping for tuber yield and YMV tolerance in yam has not been reported to date.

Supporting yam breeding efforts based on quantitative genetics principles and genomics tools is indispensable to increase the program’s effectiveness for increasing productivity. Yam cultivar development using conventional strategies spans at least ten years from crossing to variety release recommendation [4, 6]. The complementation of the traditional breeding techniques with advanced molecular tools has reduced the breeding cycle in crops [10]. In theory, genotypic information from molecular markers, when associated with phenotypic traits of interest, may be extensively used to select individuals with higher genetic value through marker-assisted selection (MAS) [11].

This study's objective was to dissect the genetic control of tuber yield and YMV tolerance in white Guinea yam.

Material and methods

Plant materials

The study panel comprised 406 white Guinea yam clones, of which 36 were trait progenitors, 49 were elite clones, and 321 were early-generation breeding lines from the IITA yam breeding program (Supplementary Table 1). All genotypes are from the International Institute of Tropical Agriculture (IITA), Ibadan, Nigeria, and are maintained by the Yam Breeding Improvement Unit.

Phenotyping

Phenotypic data on tuber yield per plant (TYP) and yam mosaic virus (YMV) severity were recorded on the plant materials assessed at different breeding stages at IITA in Nigeria. TYP and YMV severity were recorded on plants in the field using the procedure described in the yam ontology (http://www.cropontology.org/ontology/CO_343/Yam) and the yam standard operation protocol [12]. Tuber yield was recorded in kilograms on a per-plant basis at harvest (eight months after planting). YMV severity was assessed at 30-day intervals from 2 to 6 months after planting, based on a visual assessment of the relative area of plant leaf surfaces affected by the mosaic virus disease, using a five-point ordinal scale of 1–5. A score of 1 represented no visible symptoms of virus infection; 2, mild mosaic, vein-banding, green spotting or flecking, curling and mottling on a few leaves but no leaf distortion; 3, low incidence (25–50%) of the mosaic virus on the entire plant; 4, severe mosaic on most leaves and leaf distortion; and 5, severe mosaic and bleaching with severe leaf distortion and stunting. The virus severity scores were converted to percentages and then used to estimate the area under disease progress curve (AUDPC) values as described by Forbes et al. [13]:

$$\mathrm{AUDPC}=\sum_{i=1}^{n-1}\left(\frac{y_{i}+y_{i+1}}{2}\right)\left(t_{i+1}-t_{i}\right)$$

where yi = disease severity at the ith observation, ti = time (days) at the ith observation, and n = total number of observations.
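The AUDPC formula above is a trapezoidal sum, and a minimal Python sketch may make the arithmetic concrete. The function name `audpc` and the example severity values are hypothetical and are not taken from the study data. Severity is assumed to be already expressed as a percentage, since the text does not state how scores were converted.

```python
def audpc(severity, days):
    """Trapezoidal area under the disease progress curve.

    severity: disease severity (assumed to be in percent) at each observation
    days:     time of each observation, in days, in increasing order
    """
    if len(severity) != len(days):
        raise ValueError("severity and days must have the same length")
    if len(severity) < 2:
        raise ValueError("at least two observations are needed")
    total = 0.0
    for i in range(len(severity) - 1):
        mean_severity = (severity[i] + severity[i + 1]) / 2  # (y_i + y_{i+1}) / 2
        interval = days[i + 1] - days[i]                     # t_{i+1} - t_i
        total += mean_severity * interval
    return total


# Hypothetical plant scored every 30 days from 2 to 6 months after planting
example = audpc([0, 25, 25, 50, 75], [60, 90, 120, 150, 180])
```

With five observations, the sum runs over four intervals, matching the upper limit n − 1 in the formula.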

Phenotypic data analysis

We applied a one-step linear mixed model that used G-matrix to compute the best linear unbiased predictor (BLUP) values of an individual clone for a trait from the best fit model using the average information criterion (AIC) in restricted maximum likelihood (REML) algorithm [14] in the ASReml-R version 4 package [15]. The model used was:

$$y_{ij}=\mu+\beta_{i}+\tau_{j}+\gamma_{k}+\varepsilon_{ij}+\mathbf{Z}_{u}\mathbf{u}$$

where yij is the phenotypic value, μ is the overall mean (shared by all observations), βi is the effect of block i, τj is the effect specific to genotype j, γk is the effect specific to trial k, εij is the effect specific to each experimental unit (block-by-genotype combination), and u is the vector of random additive and non-additive genetic effects within location, with corresponding design matrix Zu. Accordingly, the genetic variance was partitioned into the additive effects, which were associated with a covariance structure proportional to genetic relationships derived from the molecular markers, and the non-additive genetic effects. The non-additive genetic variance is explained by individual identity rather than the genomic relationship matrix, following the approach described by Borgognone et al. [16] and Ovenden et al. [17].

Broad sense heritability (H2) estimates for the traits were calculated from phenotypic variance (σ2p) and the genotypic variance (σ2g). The BLUP values of the genotypes for the traits extracted from the best fit model were used as input for the GWAS model.
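The text does not spell out the estimator. Assuming the standard ratio definition, which is an assumption and not a formula given by the authors, broad-sense heritability would be written as:

$$H^{2}=\frac{\sigma_{g}^{2}}{\sigma_{p}^{2}}$$

where σ2g is the genotypic variance and σ2p is the phenotypic variance, so that values close to 1 indicate that most of the observed variation is genetic.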

Genotyping and SNP data analysis

For each genotype, total genomic DNA was isolated from lyophilized young, fully expanded, healthy leaves using the CTAB procedure with slight modification [18]. DNA quality and concentration were assessed using agarose gel and nanodrop, respectively, following the methods described in Aljanabi and Martinez [19]. High-throughput genotyping was conducted using the 96-plex DArTseq protocol, and SNPs were called using DArT's proprietary software, DArTSoft, as described by Killian et al. [20]. Reads and tags found in each sequencing result were aligned to the Dioscorea rotundata reference genome version 2 (https://drive.google.com/drive/folders/1H5T4xjKAEl9LliR-4qK_IR6TypCDe8nj) with Hisat2 [21]. The raw HapMap file generated was first converted to Variant Call Format (VCF) and then filtered to remove SNPs with low sequence depth (<5), missing values >20%, minor allele frequency (MAF) <0.05, or heterozygosity >50. Of the 16,242 SNP markers subjected to these quality criteria, 5,581 good-quality SNPs were retained for further analyses.
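As an illustration of the per-SNP quality filter described above, the following Python sketch applies the missing-data, MAF and heterozygosity thresholds to one marker. It is not the authors' pipeline. The function name `passes_qc`, the 0/1/2 genotype coding, and the reading of "heterozygosity >50" as 50% are assumptions, and the sequence-depth filter is not modelled.

```python
def passes_qc(genotypes, max_missing=0.20, min_maf=0.05, max_het=0.50):
    """Return True if one SNP passes the three thresholds.

    genotypes: one call per clone, coded 0 (homozygous reference),
               1 (heterozygous), 2 (homozygous alternative), None (missing).
               This coding is an assumption for illustration.
    """
    n = len(genotypes)
    called = [g for g in genotypes if g is not None]
    if n == 0 or not called:
        return False
    missing_rate = 1 - len(called) / n
    alt_freq = sum(called) / (2 * len(called))  # alternative-allele frequency
    maf = min(alt_freq, 1 - alt_freq)
    het_rate = called.count(1) / len(called)    # observed heterozygosity
    return missing_rate <= max_missing and maf >= min_maf and het_rate <= max_het


# Hypothetical markers scored on ten clones
kept = passes_qc([0, 1, 2, 0, 1, 2, 0, 0, 1, None])
dropped_monomorphic = passes_qc([0] * 10)
```

A marker is retained only if it passes all three tests, which mirrors the way the criteria are listed together in the text.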

Population genetic analysis

Several population genetic analyses were conducted to explore the structure and level of genetic diversity in the study material. SNP distribution and density were estimated using the ‘CMplot’ function implemented in the CMplot R package [22]. The rates of transitions and transversions (from the reference to the alternative allele) across the retained SNPs were estimated using the SNPlay open website. Statistics such as the minor allele frequency (MAF), the observed and expected heterozygosity, and the polymorphism information content were estimated using the "--freq" and "--hardy" options of PLINK v1.90 [23].
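For readers unfamiliar with these summary statistics, a short Python sketch shows how they can be derived from genotype counts at a single biallelic SNP. The function name `snp_summary` is hypothetical. The expected heterozygosity (2pq) and the biallelic PIC expression used here are the textbook forms and are assumed, since the article does not give its formulas.

```python
def snp_summary(n_ref_hom, n_het, n_alt_hom):
    """Diversity statistics for one biallelic SNP from genotype counts."""
    n = n_ref_hom + n_het + n_alt_hom
    if n == 0:
        raise ValueError("no genotype calls")
    p = (2 * n_ref_hom + n_het) / (2 * n)  # reference-allele frequency
    q = 1 - p                              # alternative-allele frequency
    return {
        "maf": min(p, q),
        "obs_het": n_het / n,
        "exp_het": 2 * p * q,
        # Textbook PIC for two alleles (assumed form)
        "pic": 1 - (p * p + q * q) - 2 * p * p * q * q,
    }


# Hypothetical SNP with 25 reference homozygotes, 50 heterozygotes, 25 alternative homozygotes
stats = snp_summary(25, 50, 25)
```

Under these formulas both the expected heterozygosity and the PIC of a biallelic marker are largest when the two alleles are equally frequent.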

The genetic relationship among the plant materials was explored using principal component analysis (PCA) in the FactoMineR R package [24]. For the PCA, the origin of the plant (early generation and parental profile) was used as a factor.

Structure software version 2.3.3 [25, 26] was used to cluster samples into populations. Structure simulations were carried out using an admixture model with a burn-in period of 20,000 iterations followed by 20,000 Markov chain Monte Carlo (MCMC) iterations. The simulations were repeated three times for K-values of 1 to 10. The optimal subpopulation model was investigated in two ways: (1) by applying the informal pointers (i.e. geographical origin) proposed by Pritchard et al. [25] and Falush et al. [27]; and (2) by considering ΔK, a second-order rate of change with respect to K, as defined in Evanno et al. [28] and implemented in STRUCTURE HARVESTER [29], from which the most likely value of K was determined. The population structure was then plotted using the barplot function implemented in R. The phylogenetic tree was constructed using ape version 5.0 implemented in R [30].
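The ΔK criterion can be sketched in a few lines of Python. This is an illustration under stated assumptions and not the STRUCTURE HARVESTER code. The function name `delta_k` and the log-likelihood values are hypothetical, and the second-order difference is taken on the replicate means, which is one common reading of the Evanno method.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Evanno-style delta K from replicate log-likelihoods.

    lnp_by_k: dict mapping K to a list of Ln P(D) values, one per replicate run.
    Returns a dict K -> delta K for every K that has both neighbours K-1 and K+1.
    """
    ks = sorted(lnp_by_k)
    means = {k: mean(lnp_by_k[k]) for k in ks}
    result = {}
    for k in ks:
        if (k - 1) in means and (k + 1) in means:
            # |L(K+1) - 2 L(K) + L(K-1)|: second-order rate of change
            second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            spread = stdev(lnp_by_k[k])  # sample SD across replicates at this K
            if spread > 0:
                result[k] = second_diff / spread
    return result


# Hypothetical three-replicate runs for K = 1..3
runs = {1: [-1000, -1002, -998], 2: [-900, -901, -899], 3: [-890, -891, -889]}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)
```

Because ΔK needs both neighbours, it is undefined for the smallest and largest K tested, so runs over K = 1 to 10 can only support a choice between K = 2 and K = 9.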

Genome-wide association analysis (GWAS)

The GWAS was performed using the R package mrMLM v4.0.2 [31] with six multi-locus models: 1) multi-locus random-SNP-effect mixed linear model (mrMLM) [32]; 2) fast multi-locus random-SNP-effect EMMA (FASTmrEMMA) [33]; 3) iterative sure independence screening EM-Bayesian LASSO (ISIS EM-BLASSO) [34]; 4) polygenic-background-control-based least angle regression plus empirical Bayes (pLARmEB) [35]; 5) polygenic-background-control-based Kruskal-Wallis test plus empirical Bayes (pKWmEB) [36]; and 6) fast mrMLM (FASTmrMLM) [37].

In the mrMLM analysis, we accounted for population structure (Q) generated from the Structure analysis. For each trait, the number of sub-populations (Q) included in the GWAS models was determined based on the highest ΔK value. The percentage of variation explained by each associated marker (R2) and the marker effects were estimated in the mrMLM (v4.0.2) R package (https://cran.r-project.org/web/packages/mrMLM/index.html).

Identification of existing putative genes

Candidate genes within each significant QTL region were searched within a 1 Mb window (500 kb upstream and 500 kb downstream of the SNP) in the yam Generic Feature Format (GFF3) file. Linkage disequilibrium (LD) was assessed between the significant SNPs using the LDheatmap library [38]. The GFF3 file of the reference genome was used to identify the main gene in the inter-genic region using the SNPReff. Functions of the genes associated with the identified SNPs were determined using the public database Interpro, European Molecular Biology Laboratory-European Bioinformatics Institute (EMBL-EBI) [39].
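The window search described above can be illustrated with a small Python sketch that scans GFF3 records for genes overlapping the region 500 kb either side of a SNP. The function name `genes_near_snp`, the chromosome label `chr05` and the example records are hypothetical; only the nine tab-separated GFF3 columns are standard.

```python
def genes_near_snp(gff3_lines, chrom, pos, flank=500_000):
    """Return the attribute field of every 'gene' feature overlapping
    the window [pos - flank, pos + flank] on the given chromosome."""
    lo, hi = max(1, pos - flank), pos + flank
    hits = []
    for line in gff3_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and GFF3 directives/comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "gene":
            continue
        seqid, start, end = cols[0], int(cols[3]), int(cols[4])
        if seqid == chrom and start <= hi and end >= lo:
            hits.append(cols[8])
    return hits


# Hypothetical GFF3 records around a SNP at position 24,682,916
records = [
    "##gff-version 3",
    "chr05\tsrc\tgene\t24600000\t24605000\t.\t+\t.\tID=geneA",
    "chr05\tsrc\tgene\t26000000\t26005000\t.\t+\t.\tID=geneB",
    "chr04\tsrc\tgene\t24600000\t24605000\t.\t-\t.\tID=geneC",
]
nearby = genes_near_snp(records, "chr05", 24682916)
```

Overlap rather than full containment is used here, so a gene that straddles the window boundary is still reported; whether the original analysis made the same choice is not stated.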

Haplotype estimation and SNP markers effect prediction

Haplotypes associated with significant QTLs were developed using the “rstatix” package implemented in R, and the sequence of each haplotype was defined based on the 406 clones, considered as the testing and/or identification population. The variant effect prediction was evaluated through the adjusted posterior probability, and the markers with high segregation were identified. Marker effects were then plotted for visualization.

Results

Phenotypic data of the white yam

Table 1 presents summary statistics for the phenotypic traits assessed. Broad-sense heritability estimates were high: 0.708 for tuber yield per plant and 0.903 for yam mosaic virus. The phenotypic value for tuber yield ranged from 0.93 to 1.47 kg plant-1 with an average of 1.19 kg. The area under the disease progress curve for YMV ranged from 100.56 to 2900.45 with an average of 936.16 (Supplementary Table 2).

Table 1 Descriptive statistics of tuber yield per plant (TYP) and yam mosaic virus (YMV)

Genetic diversity, population structure and linkage disequilibrium

The DArT genotyping of 406 white Guinea yam clones detected the highest number of SNPs (637) on chromosome 5 and the lowest (123) on chromosome 11 (Supplementary Fig. 1A). Transition SNPs (60.13%, 3,356 SNPs) were more frequent than transversions (39.87%, 2,225 SNPs) (Supplementary Fig. 1B). The observed heterozygosity value ranged from 0.029 to 0.622, with an average of 0.336 (Supplementary Fig. 1C). The expected heterozygosity value ranged from 0.09 to 0.5, with an average of 0.331 (Supplementary Fig. 1D). The minor allele frequency ranged from 0.05 to 0.5, with a mean of 0.24 (Supplementary Fig. 1E). The polymorphic information content (PIC) ranged from 0.087 to 0.335, with an average value of 0.267 (Supplementary Fig. 1F).

The population structure analysis of the yam diversity panel shows that the delta K values from the mean log-likelihood probabilities plateaued at K=7 (1306.47) (Fig. 1A). At K=7, the 406 yam diversity panel was divided into 7 sub-populations (Fig. 1C). Using the 50% cutoff criterion of membership probability threshold, 305 accessions were successfully assigned to the 7 different sub-populations. The remaining 101 accessions with a probability of associations less than 50% were designated as an admixed population. The phylogenetic tree also showed seven sub-populations with higher degrees of admixture similar to the delta K plot from the STRUCTURE (Fig. 1B).
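The 50% membership rule used above can be expressed as a short Python sketch. The function name `assign_subpopulation` and the membership vectors are hypothetical and only illustrate the cutoff logic.

```python
def assign_subpopulation(q_row, threshold=0.5):
    """Assign one accession from its STRUCTURE membership coefficients.

    q_row: membership probabilities for sub-populations 1..K (summing to ~1).
    Returns the 1-based sub-population index, or "admixed" when no
    coefficient reaches the threshold.
    """
    best = max(range(len(q_row)), key=lambda i: q_row[i])
    return best + 1 if q_row[best] >= threshold else "admixed"


# Hypothetical accessions at K = 7
clear_member = assign_subpopulation([0.70, 0.10, 0.05, 0.05, 0.05, 0.03, 0.02])
mixed_member = assign_subpopulation([0.30, 0.30, 0.10, 0.10, 0.10, 0.05, 0.05])
```

With a cutoff at 0.5, at most one sub-population can ever exceed the threshold, so each accession is assigned to one group or treated as admixed, as in the 305 versus 101 split reported above.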

Fig. 1

Graphical representation of the population structure of the 406 yam diversity panel. A Plot of mean likelihood of delta K against the number of K groups. The highest peak, observed at K=7, signifies the grouping of accessions into seven groups. B Phylogenetic tree showing the 7 sub-populations. The colors represent each sub-population. C Population structure from STRUCTURE based on K=7. Each vertical bar represents a single yam clone

Exploring the genetic relationship through principal component analysis showed that the first two PCs accounted for 63.7% of the total variation (Fig. 2). The PCA showed a high degree of admixture between the early generation and parental profile clones: both groups were distributed along PC1 and PC2 (Fig. 2).

Fig. 2

Principal component displaying the relationship between and among the early generation and parental profile clones used in this study

Genome-wide scan for traits

Tuber yield

We found seventeen SNP markers, distributed on 9 chromosomes, significantly associated with tuber yield (kg plant-1) (Table 2; Fig. 3). The LOD values for these SNPs ranged from 5.07 to 10.88, with minor allele frequency (MAF) ranging from 0.09 to 0.50. Of the 17 SNP markers associated with tuber yield, four were mapped on chromosome 4, two on chromosome 5, two each on chromosomes 8, 10, 14, and 17, and a single SNP each on chromosomes 13, 15, and 19 (Table 2). The SNP marker chr05_24682916 explained the highest proportion of the total phenotypic variance (8.47%).

Table 2 SNP markers associated with the tuber yield per plant (TYP) and yam mosaic virus severity score.
Fig. 3

Genome-wide association analysis of tuber yield per plant. Manhattan plot indicating the SNP markers located on chromosomes 4, 5, 8, 10, 13, 14, 15, 17 and 19 associated with the tuber yield per plant. The blue letters are the Interpro IDs for the different putative genes near the SNP markers associated with the tuber yield per plant

Yam mosaic virus resistance

We found five SNP loci that showed a significant association with the reaction to mosaic virus infection (Table 2, Fig. 4). Of the significant SNPs associated with YMV, three markers, chr03_6338751, chr05_30671001 and chr16_1482029, displayed negative quantitative trait nucleotide effects (Table 2). Across the different models, SNP marker chr15_3906069 on chromosome 15 was identified by two methods, pLARmEB and pKWmEB. The total phenotypic variance explained by the markers associated with the yam mosaic virus varied from 0.33% to 5.96%. The minor allele frequency (MAF) of the associated SNP markers ranged from 0.16 to 0.49.

Fig. 4

Genome-wide association analysis of yam mosaic virus. Manhattan plot indicating SNPs associated with the YMV. The y-axis represents the p-value of the marker-trait association on a –log10 scale. The red letters are the Interpro ID for the different putative genes near the SNP markers associated with the yam mosaic virus

SNP-trait association mapping

Four multi-locus models (MLMs), FASTmrMLM, mrMLM, pKWmEB and pLARmEB, detected a total of 22 QTNs across the 20 chromosomes of white yam for the TYP and YMV traits (Table 2). Of the 22 QTNs, 17 were significantly associated with TYP. Among the 17 loci, two SNPs each were detected by FASTmrMLM and mrMLM, and seven SNPs each by pKWmEB and pLARmEB. These QTNs were distributed unevenly on 9 chromosomes (Table 2). The 7 QTNs of model pKWmEB were detected on chromosomes 4, 5, 8 and 10, while those of model pLARmEB were detected on chromosomes 4, 5, 14, 15, 17 and 19.

For YMV, a total of five QTNs were detected by pLARmEB and pKWmEB and unevenly distributed on five chromosomes.

TYP tuber yield (kg plant-1), YMV Yam mosaic virus severity score (AUDPC value), LOD Logarithm of odds, Chr chromosomes, Pos position, bp base-pair, MAF Minor allele frequency, r2 r-square, QTN quantitative trait nucleotide

Identification of existing putative genes

Tuber yield

We explored the association of the identified QTN regions on the physical map with the potential candidate genes and their functions using the white Guinea yam genome sequence. The LD heatmap of the significant SNPs on chromosomes 4, 5, 8, 13, 14, 15, 17 and 19 displayed a high genetic correlation (0.3 to 0.85) between the specific SNPs in the vicinity of the peak adjacent to the putative gene (Fig. 5). On chromosome 4, the significant SNP for tuber yield is located in a genomic region harboring six putative genes with known functions (Gibberellin regulated protein; AP2/ERF domain; NB-ARC; Dirigent protein; Membrane transport protein; and Importin subunit beta-1, plants). On chromosome 5, we detected three putative genes (Expansin; AUX/IAA protein; and AP2/ERF domain). On chromosome 8, we identified two putative genes (AUX/IAA protein; Glycine-rich protein) (Supplementary Table 4). Several putative genes were identified on chromosome 14 (Supplementary Table 4). On chromosome 15, which displayed average correlation in the LD heatmap, five genes were identified in the vicinity of the targeted SNP marker. The LD heatmap for the SNP associated with tuber yield on chromosome 19 revealed the presence of 9 putative genes (ABC transporter-like; Exportin-1/Importin-beta-like; Sodium/calcium exchanger membrane region; AUX/IAA protein; Geminivirus AL3 coat protein; AP2/ERF domain; Major facilitator, sugar transporter-like; and Expansin).

Fig. 5

Heatmap of LD haplotype blocks for different SNP markers located on different chromosomes: A chromosome 4; B chromosome 5; C chromosome 8; D chromosome 10; E chromosome 13; F chromosome 14; G chromosome 15; H chromosome 17 and I chromosome 19. The R2 color key indicates the degree of significant association with the putative genes

Yam mosaic virus resistance

We identified four candidate genes, namely AP2/ERF domain, Major facilitator, sugar transporter-like, and AUX/IAA protein, on chromosome 3 near the SNP associated with YMV. Of these, the AP2/ERF domain and AUX/IAA protein were reported to confer essential gene functions related to plant defense and growth. The pairwise LD between the SNPs on chromosomes 3, 5, 10, 15 and 16 situated in genomic regions associated with YMV displayed a higher correlation within the three main haplotype blocks (Fig. 6). On chromosome 10, fifteen different putative genes were identified near the significant SNPs associated with YMV resistance, namely SNF2-related domain, Geminivirus AL3 coat protein, SANT/Myb domain, Geminivirus AL1 replication-associated protein, CLV type, Chlorophyll A-B binding protein, AP2/ERF domain, Gdt1 family, NB-ARC, Probable transposase, Ptta/En/Spm plant, Geminivirus AL1 replication-associated protein, catalytic domain, Kinesin-like protein and Geminivirus Rep catalytic domain.

Fig. 6

Summary of the local LD and haplotype blocks for different SNP markers located on different chromosomes: A chromosome 3, B chromosome 10, C chromosome 5, D chromosome 15 and E chromosome 16. The R2 color key indicates the degree of significant association

Haplotype SNP distribution and SNP markers effect prediction

The frequencies and marker prediction effects of various haplotypes associated with tuber yield and resistance to yam mosaic virus in white Guinea yam are presented in Table 3. Of the seventeen SNP markers associated with tuber yield, six (chr04_6236404, chr05_24237388, chr08_7046574, chr13_13467988, chr14_11128124 and chr17_15363223) displayed high haplotype segregation among the different variants. For these SNP markers on chromosomes 4, 5, 8, 13, 14 and 17, variants CC and CT were associated with genotypes with higher tuber yield, whereas variants TT and AT were associated with lower tuber yield (Fig. 7). Of the five SNP markers associated with YMV, two (chr10_1116193 and chr16_1482029) showed highly significant haplotype variation (Table 3). On chromosome 10, for the SNP marker located at 1116193 bp, variants GG and AG were linked to a lower predicted YMV value, while variant AA predicted a higher YMV score (Fig. 8A). For the marker chr16_1482029, located at 1482029 bp, variants TT and AT were linked to a lower predicted YMV value (Fig. 8B).

Table 3 Frequencies and marker prediction effects of various haplotypes associated with tuber yield (kg plant-1) and reaction to yam mosaic virus infection (AUDPC value)
Fig. 7

Boxplots showing the effect of the significant markers associated with tuber yield per plant on: A chromosome 4, B chromosome 5, C chromosome 8, D chromosome 13, E chromosome 14 and F chromosome 17. The letters on the X-axis represent allele variants

Fig. 8

Boxplots showing the effect of the only significant markers associated with yam mosaic virus identified from the haplotype segregation on: A chromosome 10 with one SNP, chr10_1116193, and B chromosome 16 with one SNP, chr16_1482029. The letters on the X-axis represent allele variants for the different SNP markers

Discussion

Phenotypic variation

The natural variation among the studied traits was high and very informative. The relatively high broad-sense heritability of 0.708 for tuber yield per plant and 0.903 for yam mosaic virus severity score demonstrated substantial genetic variation in these traits among the clones. Therefore, the studied traits are amenable to genetic improvement through selection [40]. Furthermore, the observed natural genetic variation in the study materials signifies their relevance for genetic studies.

Population differentiation

Understanding population structure within the studied clones is imperative to determine how it affects the ability of GWAS to infer marker-trait association. The population structure analysis based on delta K revealed 7 sub-populations, indicating high genetic variability. This high genetic variability indicates the potential of the studied clones for genetic improvement of tuber yield per plant and yam mosaic virus tolerance. The phylogenetic analysis gave results similar to the population structure analysis, indicating their relevance in preventing spurious associations in GWAS in this study [41, 42]. Thus, the marker density, diversity, and sample size demonstrated that the yam breeding panel used for this study is sufficiently powered to capture allelic variations for the studied traits.

Genome-wide association studies

The whole-genome scan for phenotypic and allelic variation in tuber yield and yam mosaic virus resistance identified genome regions on ten chromosomes (chromosomes 4, 5, 8, 10, 13, 14, 15, 16, 17 and 19) with significant −log10 values. The Q matrix (population structure) was considered in a mixed linear model for the association analysis to reduce false-positive associations. The models used for tuber yield and tolerance to yam mosaic virus showed no inflation of p-values, indicating that the structure of relationships was well accounted for in the GWAS analysis. This is consistent with the view that the absence of p-value inflation indicates that structural relationships have been adequately controlled in a GWAS analysis [42]. Genome-wide association mapping has been used to explore the elite alleles of agronomic traits such as tuber dry matter and oxidative browning in water yam (Dioscorea alata) [42]. In the present study, the phenotypic effects of the alleles associated with TYP and YMV were evaluated and inferred to affect the individual traits either positively or negatively. Based on the stringent criterion of −log10, we identified 17 significant marker-trait associations ranging between 1.01 e-20 and 0.044 for tuber yield per plant, and 5 significant marker-trait associations ranging between 5.25 e-14 and 0.029 for yam mosaic virus. The information on SNP variants from the present study would fast-track the application of genomics-informed selection decisions in breeding white Guinea yam for higher tuber yield and resistance to mosaic virus. Such potential of GWAS has been reported for other root and tuber crops such as cassava [43], potatoes [44] and water yam [42].

Detection of QTNs by multi-locus models (MLMs)

This study used different MLMs (FASTmrMLM, mrMLM, pKWmEB and pLARmEB) to identify genomic regions associated with TYP and YMV. A total of 17 SNPs were significantly associated with TYP by the four MLMs across 9 of the 20 chromosomes, viz. chromosomes 4, 5, 8, 10, 13, 14, 15, 17 and 19. The four models detected different and complementary sets of SNPs: pKWmEB and pLARmEB (7 QTNs each) > FASTmrMLM and mrMLM (2 QTNs each). This indicates that detection varied among the models. The MLMs used in this study detected putative candidate genes for the studied traits, indicating their usefulness in GWAS. These results support the view that MLMs are useful for identifying QTNs and candidate genes in plants [45]. The findings of this study established a link between single nucleotide polymorphisms and quantitative traits such as tuber yield and yam mosaic virus severity. The variations observed in the population panel constitute a pool of quantitative trait nucleotides (QTNs) that modulate tuber yield and yam mosaic virus traits in white yam.

Identification of putative genes

Our results identified SNP markers that associate significantly with allelic variation for tuber yield and YMV tolerance in white yam. The detected markers offer good targets for further validation and analysis due to their proximity to candidate genes regulating growth, development and disease resistance. The SNP on chromosome 3 is near AP2/ERF domain; AUX/IAA protein; and major facilitator, sugar transporter-like genes. Zarei et al. [46] reported that the AP2/ERF-domain transcription factor ORA59 acts as the integrator of the jasmonic acid (JA) and ethylene (ET) signaling pathways and is the key regulator of JA- and ET-responsive PLANT DEFENSIN1.2 (PDF1.2) expression. The SNP on chromosome 4 is near Geminivirus AL1 replication-associated protein, catalytic domain; AP2/ERF domain; NB-ARC; Dirigent protein; and membrane transport protein genes. The NB-ARC domain is a functional ATPase domain comprising NB, ARC1, and ARC2 subdomains, and its nucleotide-binding state regulates R protein activity, and hence resistance, in plants [47]. Plant defense is induced by the R proteins in response to specific pathogen-derived molecules, called avirulence (AVR) proteins, thereby restricting pathogen proliferation [48]. The SNP on chromosome 10 is near Geminivirus AL1 replication-associated protein, catalytic domain; Geminivirus Rep catalytic domain; Geminivirus AL3 coat protein; AP2/ERF domain; NB-ARC; and Chlorophyll A-B binding protein, plant and chromista genes. Sunter and Bisaro [49] reported that the geminivirus AL2 gene product transactivates expression of the viral AR1 and BR1 genes. Chlorophyll A-B binding protein is known as a light receptor that stimulates growth and development in plants [50].
The SNP on chromosome 16 is near genes annotated as Geminivirus AR1/BR1 coat protein; AP2/ERF domain; Geminivirus AL1 replication-associated protein, catalytic domain; Geminivirus AL1 replication-associated protein, central domain; and NB-ARC. The SNP on chromosome 14 is near genes annotated as expansin, cellulose-binding-like domain; mitochondrial substrate/solute carrier, expansin, root cap; dirigent protein; small auxin-up RNA; and major facilitator, sugar transporter-like. Expansins and expansin-like proteins (loosenins) loosen the plant cell wall and enhance lignocellulose saccharification [51]. Mitochondrial carrier proteins play roles in plant growth and disease resistance [52]. The SNP on chromosome 15 is near genes annotated as gibberellin regulated protein; major facilitator, sugar transporter-like; senescence regulator S40; and ABC transporter-like. Gibberellin regulated proteins (GRPs) are up-regulated by gibberellin; most have a role in plant development, and some members have antimicrobial activity [53, 54]. The SNP on chromosome 19 is near genes annotated as Exportin-1/Importin-beta-like; expansin; sodium/calcium exchanger membrane region; major facilitator, sugar transporter-like; and AUX/IAA protein. The sodium/calcium exchanger is an ion carrier whose interactions are subject to metabolic regulation [55]. The SNPs on chromosomes 6 and 8 are near the AUX/IAA protein and Protein ENHANCED DISEASE RESISTANCE 2, C-terminal (EDR2) genes. Aux/IAA genes play cellular and developmental roles throughout the plant life cycle, including root development, shoot growth and fruit ripening [56]. In plants, EDR2 limits the initiation of cell death and the establishment of the hypersensitive response [57].
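
As an illustration of how candidate genes near a significant SNP can be listed, the following minimal Python sketch scans an annotation table for genes that overlap a fixed window around a SNP position. The `Gene` record, the `genes_near_snp` helper, the 50 kb default window and all coordinates are hypothetical and chosen only for illustration; they are not the annotation procedure or values used in this study.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    """One annotated gene (hypothetical record layout)."""
    name: str
    chrom: int
    start: int
    end: int


def genes_near_snp(genes, chrom, pos, window=50_000):
    """Return names of genes on `chrom` that overlap [pos - window, pos + window]."""
    lo, hi = pos - window, pos + window
    return [
        g.name
        for g in genes
        if g.chrom == chrom and g.end >= lo and g.start <= hi
    ]


# Toy annotation: gene names follow the text, coordinates are invented.
toy_genes = [
    Gene("AP2/ERF domain", 3, 120_000, 123_500),
    Gene("AUX/IAA protein", 3, 160_000, 162_000),
    Gene("NB-ARC", 4, 90_000, 95_000),
    Gene("Expansin", 14, 400_000, 402_000),
]

# Genes within 20 kb of an assumed SNP at position 130,000 on chromosome 3.
hits = genes_near_snp(toy_genes, chrom=3, pos=130_000, window=20_000)
print(hits)
```

The window size is a design choice: in practice it would normally be guided by the extent of linkage disequilibrium in the panel rather than fixed arbitrarily.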
The identified putative candidate genes and SNPs linked with these economically important traits could help design new breeding strategies to accumulate superior alleles for these key traits in future marker-based breeding. The genomic regions identified in this study have not been reported previously, possibly because of the limitations of the marker systems used in earlier studies.

Our findings indicate that multiple loci with unequal effects influence the variation for TYP and YMV in white yam. The novel candidate genomic regions identified in our study, which harbor growth, development and disease resistance genes, require further validation and testing in yam germplasm. This could be done by converting these MTAs into low-cost Kompetitive Allele-Specific PCR (KASP) markers that can be used to transfer alleles efficiently into elite yam genotypes, as reported for wheat [58]. These genomic resources and PCR-based KASP markers could greatly support selection for key traits in yam breeding through marker-assisted selection (MAS). They will also support the systematic study of the genetics, comparative genomics and evolution of yam, and so expedite the isolation and characterization of genes that control agronomically important traits such as tuber yield and yam mosaic virus tolerance.

The SNP marker-TYP association showed strong segregation among genotype classes. The genotypes CC and CT were associated with high predicted tuber yield per plant in the diversity panel used in the study, while the genotypes TT and GG were associated with low yield. For YMV, the genotypes GG, AG and TT were associated with low predicted disease severity scores. These findings suggest that mining favorable alleles is essential for improving tuber yield and YMV tolerance in yam through marker-assisted selection. Moreover, the results could be helpful for marker validation and deployment in yam breeding. Our findings agree with the view that information on marker effects based on segregation patterns is fundamental for marker validation and deployment in a breeding program [47, 59]. Association mapping has been used to explore elite alleles for many agronomic traits, including yield and related attributes in bread wheat [60].
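
The marker-effect comparison described above can be sketched as a grouping of trait values by genotype class at a single SNP. This is a minimal Python sketch under stated assumptions: the `genotype_class_means` helper, the genotype calls and the trait values are all invented for illustration and are not data or code from this study.

```python
from collections import defaultdict
from statistics import mean


def genotype_class_means(genotypes, phenotypes):
    """Average the trait value within each genotype class at one SNP."""
    groups = defaultdict(list)
    for call, value in zip(genotypes, phenotypes):
        groups[call].append(value)
    return {call: mean(values) for call, values in groups.items()}


# Hypothetical genotype calls and trait values (e.g. BLUPs); not study data.
calls = ["CC", "CT", "TT", "CC", "TT", "CT"]
typ = [2.0, 1.5, 0.5, 3.0, 1.0, 2.5]

means = genotype_class_means(calls, typ)
best = max(means, key=means.get)  # genotype class with the highest mean
print(means, best)
```

A real analysis would also test whether the class differences are significant and would account for population structure, which this sketch does not attempt.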

Conclusion

Useful genetic variability exists among the 406 genotypes studied. The genetic architecture of TYP and YMV is governed by multiple QTNs unevenly distributed across the 20 chromosomes of white yam. Among the four multi-locus models, pKWmEB and pLARmEB were the most robust, identifying the most QTNs. The associated SNP markers could be employed for targeted and accelerated improvement of tuber yield per plant and YMV resistance in white yam. The information from our study could help design new breeding strategies to accumulate superior alleles for tuber yield per plant and yam mosaic virus tolerance in future marker-based breeding. The chromosomal regions controlling these traits could be exploited for selection and effective pyramiding of favorable alleles in white yam population improvement. The findings are relevant for population improvement for TYP and YMV using marker-assisted breeding (MAB) and haplotype-based schemes.

Availability of data and materials

The Variant Call Format (VCF) file used for the analysis in this study can be viewed on www.yambase.org under genotypic data through this link: https://yambase.org/breeders/trial/445?format= . The associated phenotypic data are presented as a supplementary table within the document.

References

  1. FAO. Food and Agriculture Organization of the United Nations Statistics database, FAOSTAT. 2020. http://www.fao.org/faostat/en/#data/QC
  2. Asiedu R, Sartie A. Crops that feed the world 1. Yams: Yams for income and food security. Food Secur. 2010;2:305–15. https://doi.org/10.1007/s12571-010-0085-0.
  3. Cormier F, Lawac F, Maledon E, Gravillon MC, Nudol E, Mournet P, et al. A reference high-density genetic map of greater yam (Dioscorea alata L.). Theor Appl Genet. 2019;132:1733–44. https://doi.org/10.1007/s00122-019-03311-6.
  4. Darkwa K, Olasanmi B, Asiedu R, Asfaw A. Review of empirical and emerging methods and tools for yam (Dioscorea spp.) improvement: status and prospects. Plant Breed. 2020a;139(3):474–97. https://doi.org/10.1111/PBR.12783.
  5. Onda Y, Mochida K. Exploring genetic diversity in plants using high-throughput sequencing techniques. Curr Genomics. 2016;17:358–67.
  6. Lebot V. Tropical root and tuber crops: cassava, sweet potato, yams and aroids, vol. XIX. Wallingford: CABI; 2009. p. 413.
  7. Obidiegwu JE, Akpabio EM. The geography of yam cultivation in southern Nigeria: Exploring its social meanings and cultural functions. J Ethn Foods. 2017;4:28–35.
  8. Darkwa K, Agre P, Olasanmi B, Iseki K, Matsumoto R, Powell A, et al. Comparative assessment of genetic diversity matrices and clustering methods in white Guinea yam (Dioscorea rotundata) based on morphological and molecular markers. Sci Rep. 2020b;10:13191. https://doi.org/10.1038/s41598-020-69925-9.
  9. Mignouna H, Mank R, Ellis T, Van Den Bosch N, Asiedu R, Ng S, et al. A genetic linkage map of Guinea yam (Dioscorea rotundata Poir.) based on AFLP markers. Theor Appl Genet. 2002;105(5):716–25. https://doi.org/10.1007/s00122-002-0911-7.
  10. Norman PE, Asfaw A, Tongoona PB, Danquah A, Danquah EY, Koeyer DD, et al. Can parentage analysis facilitate breeding activities in root and tuber crops? Agric J. 2018;8:1–24.
  11. Jiang GL. Molecular markers and marker-assisted breeding in plants. In: Andersen SB, editor. Plant Breeding from Laboratories to Fields. IntechOpen; 2013. p. 45–83. https://doi.org/10.5772/52583.
  12. Asfaw A, editor. Standard operating protocol for yam variety performance evaluation trial. Ibadan: IITA; 2016. p. 27.
  13. Forbes G, Pérez W, Andrade-Piedra JL. Field assessment of resistance in potato to Phytophthora infestans: International cooperators guide. Lima (Peru): International Potato Center (CIP); 2014. ISBN 978-92-9060-440-2. 35 p. https://doi.org/10.4160/9789290604402.
  14. Gilmour AR, Thompson R, Cullis BR. Average information REML: An efficient algorithm for variance parameter estimation. Biometrics. 1995;51:1440–50. https://doi.org/10.2307/2533274.
  15. Butler DG, Cullis BR, Gilmour AR, Gogel BJ, Thompson R. ASReml-R Reference manual version 4. Hemel Hempstead, HP1 1ES, UK: VSNi Ltd; 2018.
  16. Borgognone MG, Butler DG, Ogbonnaya FC, Dreccer MF. Molecular marker information in the analysis of multi-environment trials helps differentiate superior genotypes from promising parents. Crop Sci. 2016;56:2612–28.
  17. Ovenden B, Milgate A, Wade LJ, Rebetzke GJ, Holland JB. Accounting for genotype-by-environment interactions and residual genetic variation in genomic selection for water soluble carbohydrate concentration in wheat. G3 Genes Genomes Genet. 2018;8:1909–19.
  18. Dellaporta SL, Wood J, Hicks JB. A plant DNA minipreparation: version II. Plant Mol Biol Rep. 1983;1:19–21.
  19. Aljanabi SM, Martinez I. Universal and rapid salt-extraction of high-quality genomic DNA for PCR-based techniques. Nucleic Acids Res. 1997;25:4692–3. https://doi.org/10.1093/nar/25.22.4692.
  20. Kilian A, Sanewski G, Ko L. The application of DArTseq technology to pineapple. Acta Hortic. 2016;1111:181–8.
  21. Kim D, Langmead B, Salzberg SL. HISAT: A fast spliced aligner with low memory requirements. Nat Methods. 2015;12:357.
  22. Yin L. Package "CMplot". 2019. https://github.com/YinLiLin/R-CMplot/blob/master/CMplot.r
  23. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, et al. PLINK: A tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 2007;81:559–75. https://doi.org/10.1086/519795.
  24. Le S, Josse J, Husson F. FactoMineR: an R package for multivariate analysis. J Stat Softw. 2008;25(1):1–18. https://doi.org/10.18637/jss.v025.i01.
  25. Pritchard JK, Stephens M, Donnelly P. Inference of population structure using multilocus genotype data. Genetics. 2000;155:945–59.
  26. Falush D, Stephens M, Pritchard JK. Inference of population structure using multilocus genotype data: linked loci and correlated allele frequencies. Genetics. 2003;164(4):1567–87.
  27. Falush D, Stephens M, Pritchard JK. Inference of population structure using multilocus genotype data: dominant markers and null alleles. Mol Ecol Notes. 2007;7(4):574–8. https://doi.org/10.1111/j.1471-8286.2007.01758.x.
  28. Evanno G, Regnaut S, Goudet J. Detecting the number of clusters of individuals using the software structure: a simulation study. Mol Ecol. 2005;14(8):2611–20. https://doi.org/10.1111/j.1365-294x.2005.02553.x.
  29. Earl DA, vonHoldt BM. Structure harvester: a website and program for visualizing Structure output and implementing the Evanno method. Conserv Genet Resour. 2012;4:359–61.
  30. Paradis E, Schliep K. ape 5.0: an environment for modern phylogenetics and evolutionary analyses in R. Bioinformatics. 2019;35:526–8.
  31. Zhang YW, Tamba CL, Wen YJ, Li P, Ren WL, Ni YL, et al. mrMLM v4.0.2: an R platform for multi-locus genome-wide association studies. Genomics Proteomics Bioinformatics. 2020;18(4):481–7. https://doi.org/10.1016/j.gpb.2020.06.006.
  32. Wang SB, Feng JY, Ren WL, Huang B, Zhou L, Wen YJ, et al. Improving power and accuracy of genome-wide association studies via a multi-locus mixed linear model methodology. Sci Rep. 2016;6:19444. https://doi.org/10.1038/srep19444.
  33. Yang-Jun W, Hanwen Z, Yuan-Li N, Bo H, Jin Z, Jian-Ying F, et al. Methodological implementation of mixed linear models in multi-locus genome-wide association studies. Brief Bioinform. 2017;19(4):700–12.
  34. Lwaka TC, Yuan-Li N, Yuan-Ming Z. Iterative sure independence screening EM-Bayesian LASSO algorithm for multi-locus genome-wide association studies. PLoS Comput Biol. 2017;13(1):e1005357.
  35. Zhang J, Feng J-Y, Ni Y-L, Wen Y-J, Niu Y, Tamba CL, et al. pLARmEB: integration of least angle regression with empirical Bayes for multi-locus genome-wide association studies. Heredity. 2017;118:517–24.
  36. Ren W-L, Wen Y-J, Dunwell JM, Zhang Y-M. pKWmEB: integration of Kruskal-Wallis test with empirical Bayes under polygenic background control for multi-locus genome-wide association study. Heredity. 2018;120(3):208–18.
  37. Tamba CL, Zhang YM. A fast mrMLM algorithm for multi-locus genome-wide association studies. bioRxiv. 2018;341784. https://doi.org/10.1101/341784.
  38. Shin JH, Blay S, McNeney B, Graham J. LDheatmap: an R function for graphical display of pairwise linkage disequilibria between single nucleotide polymorphisms. J Stat Softw. 2006;16:1–10.
  39. Hunter S, Jones P, Mitchell A, Apweiler R, Attwood TK, Bateman A, et al. InterPro new developments in the family and domain prediction database. Nucleic Acids Res. 2011;40:306–12.
  40. Piaskowski J, Hardner C, Cai L, Zhao Y, Iezzoni A, Peace C. Genomic heritability estimates in sweet cherry reveal non-additive genetic variance is relevant for industry-prioritized traits. BMC Genet. 2018;19(1):23. https://doi.org/10.1186/s12863-018-0609-8.
  41. Yu D, Lane SN. Urban fluvial flood modelling using a two-dimensional diffusion-wave treatment, part 2: development of a sub-grid-scale treatment. Hydrol Process. 2006;20:1567–83.
  42. Gatarira C, Agre P, Matsumoto R, Edemodu A, Adetimirin V, Bhattacharjee R, et al. Genome-wide association analysis for tuber dry matter and oxidative browning in water yam (Dioscorea alata L.). Plants. 2020;9:969. https://doi.org/10.3390/plants9080969.
  43. Zhang S, Chen X, Lu C, Ye J, Zou M, Lu K, et al. Genome-wide association studies of 11 agronomic traits in cassava (Manihot esculenta Crantz). Front Plant Sci. 2018;9:503.
  44. Björn B, Keizer PL, Paulo MJ, Visser RG, Van Eeuwijk FA, Van Eck HJ. Identification of agronomically important QTL in tetraploid potato cultivars using a marker–trait association analysis. Theor Appl Genet. 2014;127:731–48.
  45. Karikari B, Wang Z, Zhou Y, Yan W, Feng J, Zhao T. Identification of quantitative trait nucleotides and candidate genes for soybean seed weight by multiple models of genome-wide association study. BMC Plant Biol. 2020;20:404. https://doi.org/10.1186/s12870-020-02604-z.
  46. Zarei A, Körbes PA, Younessi P, Montiel G, Champion A, Memelink J. Two GCC boxes and AP2/ERF-domain transcription factor ORA59 in jasmonate/ethylene-mediated activation of the PDF1.2 promoter in Arabidopsis. Plant Mol Biol. 2011;75:321–31. https://doi.org/10.1007/s11103-010-9728-y.
  47. van Ooijen G, Mayr G, Kasiem MMA, Albrecht M, Cornelissen BJC, Takken FLW. Structure–function analysis of the NB-ARC domain of plant disease resistance proteins. J Exp Bot. 2008;59(6):1383–97. https://doi.org/10.1093/jxb/ern045.
  48. DeYoung BJ, Innes RW. Plant NBS-LRR proteins in pathogen sensing and host defense. Nat Immunol. 2006;7(12):1243–9. https://doi.org/10.1038/ni1410.
  49. Sunter G, Bisaro DM. Transactivation of geminivirus AR1 and BR1 gene expression by the viral AL2 gene product occurs at the level of transcription. Plant Cell. 1992;4(10):1321–31.
  50. Xu YH, Liu R, Yan L, Liu ZQ, Jiang SC, Shen YY, et al. Light-harvesting chlorophyll a/b-binding proteins are required for stomatal response to abscisic acid in Arabidopsis. J Exp Bot. 2012;63:1095–106. https://doi.org/10.1093/jxb/err315.
  51. Ríos-Fránquez FJ, Rojas-Rejón ÓA, Escamilla-Alvarado C. Microbial enzyme applications in bioethanol producing biorefineries: overview. In: Ray RC, Ramachandran S, editors. Bioethanol production from food crops sustainable sources, interventions, and challenges: Academic Press; 2019. p. 249–66.
  52. Palmieri F. Mitochondrial carrier proteins. FEBS Lett. 1994;346:48–54. https://doi.org/10.1016/0014-5793(94)00329-7.
  53. Berrocal-Lobo M, Segura A, Moreno M, Lopez G, Garcia-Olmedo F, Molina A. Snakin-2, an antimicrobial peptide from potato whose gene is locally induced by wounding and responds to pathogen infection. Plant Physiol. 2002;128:951–61.
  54. Inomata N. Gibberellin-regulated protein allergy: clinical features and cross-reactivity. Allergol Int. 2020;69:11–8.
  55. DiPolo R, Beaugé L. Sodium/calcium exchanger: influence of metabolic regulation on ion carrier interactions. Physiol Rev. 2006;86:155–203. https://doi.org/10.1152/physrev.00018.2005.
  56. Luo J, Zhou JJ, Zhang JZ. Aux/IAA gene family in plants: molecular structure, regulation, and function. Int J Mol Sci. 2018;19:259. https://doi.org/10.3390/ijms19010259.
  57. Madeira F, Park YM, Lee J, Buso N, Gur T, Madhusoodanan N, et al. The EMBL-EBI search and sequence analysis tools APIs in 2019. Nucleic Acids Res. 2019;47(W1):W636–41. https://doi.org/10.1093/nar/gkz268.
  58. Rasheed A, Wen W, Gao F, Zhai S, Jin H, Liu J, et al. Development and validation of KASP assays for genes underpinning key economic traits in bread wheat. Theor Appl Genet. 2016;129:1843–60. https://doi.org/10.1007/s00122-016-2743-x.
  59. Li L, Tacke E, Hofferbert HR, Lübeck J, Strahwald J, Draffehn AM, et al. Validation of candidate gene markers for marker-assisted selection of potato cultivars with improved tuber quality. Theor Appl Genet. 2013;126:1039–52.
  60. Sun C, Zhang F, Yan X, Zhang X, Dong Z, Cui D, et al. Genome-wide association study for 13 agronomic traits reveals distribution of superior alleles in bread wheat from the Yellow and Huai Valley of China. Plant Biotechnol J. 2017;15:953–69. https://doi.org/10.1111/pbi.12690.
  61. Yang J, Zaitlen NA, Goddard ME, Visscher PM, Price AL. Advantages and pitfalls in the application of mixed model association methods. Nat Genet. 2014;46:100–6.

Acknowledgments

We are grateful to the Yam Improvement Program team for their assistance during the research implementation. We also acknowledge Afolabi Agbona, Kayondo Siraj Ismael and Nnannna Nwanchukwu for their critical suggestions during the analysis.

Funding

The work was financially supported by the Bill and Melinda Gates Foundation (BMGF) through the AfricaYam project (OPP1052998). BMGF fully sponsored the sequencing activities and the charges related to field evaluation, and will also pay the publication fee.

Author information

Contributions

AA and PAA conceptualized the study. PAA did the data curation and analysis. PEN managed the phenotypic data. AA acquired the funding for the research. PEN and PAA wrote the draft manuscript with input from AA. PEN, PAA, RA and AA reviewed and edited the manuscript. All authors have read, made corrections to, and approved the final manuscript.

Corresponding author

Correspondence to Paterne A. Agre.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that the research was conducted in the absence of any potential conflict of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1: Supplementary Table 1. Description of trait progenitors utilized for the study. Supplementary Table 2. BLUP values of tuber yield per plant (TYP) and yam mosaic virus (YMV) among 406 clones of white yam. Supplementary Table 3. Cluster membership of 406 genotypes of white yam based on structure and phylogeny tree analyses. Supplementary Table 4. Single nucleotide polymorphism (SNP) markers associated with tuber yield per plant (TYP) and yam mosaic virus (YMV), and putative genes identified in chromosomes of 406 clones of white yam.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Agre, P.A., Norman, P.E., Asiedu, R. et al. Identification of quantitative trait nucleotides and candidate genes for tuber yield and mosaic virus tolerance in an elite population of white guinea yam (Dioscorea rotundata) using genome-wide association scan. BMC Plant Biol 21, 552 (2021). https://doi.org/10.1186/s12870-021-03314-w

Keywords

  • Genetic diversity
  • population structure
  • GWAS
  • SNP markers
  • white Guinea yam