Association mapping of North American spring wheat breeding germplasm reveals loci conferring resistance to Ug99 and other African stem rust races

Background: The recently identified Puccinia graminis f. sp. tritici (Pgt) race TTKSK (Ug99) poses a severe threat to global wheat production because of its broad virulence on several widely deployed resistance genes. Additional virulences have been detected in the Ug99 group of races, and the spread of this race group has been documented across wheat growing regions in Africa, the Middle East (Yemen), and West Asia (Iran). Other broadly virulent Pgt races, such as TRTTF and TKTTF, present further difficulties in maintaining abundant genetic resistance for their effective use in wheat breeding against this destructive fungal disease of wheat. In an effort to identify loci conferring resistance to these races, a genome-wide association study was carried out on a panel of 250 spring wheat breeding lines from the International Maize and Wheat Improvement Center (CIMMYT), six wheat breeding programs in the United States and three wheat breeding programs in Canada.
Results: The lines included in this study were grouped into two major clusters, based on the results of principal component analysis using 23,976 SNP markers. Upon screening for adult plant resistance (APR) to Ug99 during 2013 and 2014 in artificial stem rust screening nurseries at Njoro, Kenya and at Debre Zeit, Ethiopia, several wheat lines were found to exhibit APR. The lines were also screened for resistance at the seedling stage against races TTKSK, TRTTF, and TKTTF at USDA-ARS Cereal Disease Laboratory in St. Paul, Minnesota; and only 9 of the 250 lines displayed seedling resistance to all the races. Using a mixed linear model, 27 SNP markers associated with APR against Ug99 were detected, including markers linked with the known APR gene Sr2. Using the same model, 23, 86, and 111 SNP markers associated with seedling resistance against races TTKSK, TRTTF, and TKTTF were identified, respectively. These included markers linked to the genes Sr8a and Sr11 providing seedling resistance to races TRTTF and TKTTF, respectively. We also identified putatively novel Sr resistance genes on chromosomes 3B, 4D, 5A, 5B, 6A, 7A, and 7B.
Conclusion: Our results demonstrate that the North American wheat breeding lines have several resistance loci that provide APR and seedling resistance to highly virulent Pgt races. Using the resistant lines and the SNP markers identified in this study, marker-assisted resistance breeding can assist in development of varieties with elevated levels of resistance to virulent stem rust races including TTKSK.
Electronic supplementary material: The online version of this article (doi:10.1186/s12870-015-0628-9) contains supplementary material, which is available to authorized users.


Background
Stem rust of wheat, caused by the fungal pathogen Puccinia graminis Pers. f. sp. tritici (Pgt), is considered potentially the most damaging disease of wheat. Historical crop losses caused by stem rust epidemics have been recorded in all wheat growing regions of the world [1,2]. This disease has been primarily controlled via genetic resistance with resistance genes discovered in bread wheat and its relative species [3][4][5]. However, the emergence of the highly virulent stem rust race TTKSK (also known as Ug99) and its variants has rendered many of the resistance genes ineffective to the pathogen [5,6], and threatens global wheat production and supply.
The race TTKSK [7], first observed in Uganda in 1998, was found to be virulent to Sr31 [8], a widely deployed and important stem rust resistance gene. Within a few years, virulence to three other resistance genes was documented in the Ug99 lineage: Sr24 was defeated by race TTKST [7], Sr36 was defeated by race TTTSK [7,9], and Sr9h was defeated by race TTKSF+ [10,11]. The urediniospores of the stem rust fungus can travel long distances on the wind [1,12]. Consequently, races of the Ug99 group have spread from hotspot areas in East Africa southward to South Africa and northward to Iran [13,14]. The Ug99 race group is projected to spread further to other wheat growing areas in the world [5,14,15].
Realizing the threat posed by the Ug99 race group, over 200,000 wheat lines, ranging from accessions in germplasm collections to breeding materials from wheat breeding programs throughout the world, were screened for resistance to Ug99 in Kenya and Ethiopia [5]. The results showed that 85-95 % of wheat lines grown globally are susceptible to Ug99. These results highlight the risk looming over worldwide wheat production due to the susceptibility of current varieties. It is therefore essential that resistance genes are identified and used in breeding programs in these areas, including North America, to prepare for the possible arrival of the Ug99 race group and other highly virulent races. One such highly virulent race is the newly identified race TKTTF, which has a different genetic lineage from the Ug99 race group [16]. This race caused yield losses close to 100 % on the most widely grown wheat cultivar, 'Digalu', in southeastern regions of Ethiopia in 2013-2014 [16]. Hence, there is an urgent need to identify and characterize new genes for resistance to such races, and to incorporate them rapidly into the breeding pipeline to develop varieties with improved levels of resistance.
Two main types of resistance are used in wheat breeding against stem rust: 1) all-stage resistance (ASR) or seedling resistance, and 2) adult plant resistance (APR). ASR is usually characterized by a hypersensitive reaction upon fungal attack, and usually confers a high level of resistance that is effective in all stages of plant development [17]. In contrast, APR is generally expressed during the adult growth stages of the plant, usually beginning at booting stage. The major drawback of ASR is the high likelihood of the gene being defeated by new pathogen races when resistance genes are deployed singly, in so-called "boom and bust cycles" [18]. While APR genes are considered more durable against wheat rusts than ASR genes, they may not provide adequate levels of resistance under high disease pressure [19]. Therefore, a gene-pyramiding strategy that combines a few seedling genes, or 4-5 APR genes, or genes of both resistance types would be highly desirable. Discovery and development of reliable markers for effective marker-assisted gene introgression and selection is vital to routinely combine resistance genes. Such genes and linked markers can be identified in primary, secondary, and tertiary gene pools of wheat and its related species. However, to minimize linkage drag during resistance gene introgression from wild relatives and increase breeding efficiency, discovery of resistant material in existing breeding programs is preferable.
Association mapping (AM), or linkage disequilibrium (LD) mapping, is a powerful technique used to identify marker-trait associations, and has been used successfully in several crop species such as wheat, barley, soybean, and maize [20][21][22]. The AM strategy exploits historical recombination events existing in the lines being studied, which can range from natural collections to breeding populations. Since existing populations can be used for mapping loci associated with traits of interest, as opposed to specifically designed populations, this approach is widely used to study the genetic makeup controlling trait variation in wider germplasm pools [22,23]. As allelic variation and marker polymorphisms are observed at a higher frequency in a genome-wide association study (GWAS) panel compared to a biparental population [24,25], useful and novel alleles associated with traits of interest may be identified when pairing with high-throughput marker technologies. One drawback of AM is that the underlying population stratification due to breeding history, selection, genetic drift, or founder effects can lead to false associations [26,27]. This issue, however, can be reduced by accounting for population structure using the relationship matrix or distance matrix among the lines [28].
The objective of this study was to evaluate spring wheat lines from nine breeding programs in the United States and Canada, in addition to CIMMYT lines, for their field-based resistance to the Ug99 race group and to conduct a GWAS to identify loci associated with resistance. We also attempted to distinguish between loci associated with APR and loci associated with seedling resistance to the exotic virulent Pgt races TTKSK, TRTTF, and TKTTF.

Plant material
Two-hundred and fifty spring wheat lines from wheat breeding programs in North America were assembled as part of the Triticeae Coordinated Agricultural Project (www.triticeaecap.org). Elite lines were representative of the following wheat breeding programs in the United States: Montana State University (MSU), South Dakota State University (SDSU), University of California-Davis (UCD), University of Idaho (UI), University of Minnesota (UMN), and Washington State University (WSU); in Canada: Agriculture and Agri-Food Canada (Ag-Canada) Manitoba, Ag-Canada Saskatchewan, and Ag-Canada Alberta; and in Mexico: the International Maize and Wheat Improvement Center (CIMMYT). This germplasm panel was previously used to assess root morphology traits [29].

Field stem rust evaluation
A panel of 250 lines was evaluated for field response to stem rust in four disease environments: at Njoro, Kenya during the off-season (January to April 2013) and the main-season (June to October 2013), and at Debre Zeit, Ethiopia during the off-season (January to June 2013, and 2014). These environments are referred to in the text as KenOff13, KenMain13, EthOff13, and EthOff14, respectively. In all environments, phenotypic data were collected from plant heading stage up to grain maturing stage (i.e. between Zadoks 50 and 90) [30], when the susceptible checks reached maximum severity (usually ~80 % severity).
In Njoro, plots were arranged in an augmented design with the lines represented once and the susceptible check line 'Red Bobs' planted after every fifty entries. The lines were sown as 70 cm long twin rows, 20 cm apart, on a flat bed. Spreader rows were sown perpendicular to the twin rows, surrounding the field to initiate disease development and maintain uniform disease pressure in the nursery. The spreader rows comprised a mixture of lines susceptible to race TTKST (Ug99 + Sr24 virulence): 'CCK' (Canadian Cunningham Kennedy), 'PBW343', 'Morocco', and a few susceptible CIMMYT lines. The disease was initiated by inoculating the spreader rows using a bulk inoculum of Pgt urediniospores collected at the Njoro field site. Wheat stem rust differential lines with known stem rust resistance genes indicated that the predominant, if not only, race present in the nursery since 2008 was race TTKST [31]. The urediniospores were suspended in water and injected into spreader plants at 1 m distance prior to booting (growth stage Z35-Z37; [30]). The spreader plants were then sprayed with urediniospores suspended in light mineral oil Soltrol 170 (Chevron Phillips Chemical Company, The Woodlands, TX).
The nursery in Debre Zeit was set up similarly to the Njoro nursery. Lines were planted in 1 m long twin rows, flanked by spreader rows comprising a mixture of the susceptible wheat varieties 'PBW343', 'Morocco', and 'Local Red'. The spreader rows were artificially inoculated with a bulk of fresh urediniospores collected from PBW343 (PBW343 has Sr31 and several races in the Ug99 race group are virulent to Sr31) and also collected from local fields. Inoculation was carried out in Debre Zeit as it was in the Njoro nursery.
Disease severity on a 0-100 % modified Cobb scale [32] and infection response [17] were recorded for each line. Severity and infection response notes were recorded 2-3 times during the season; the terminal data, which exhibited better disease segregation among the lines in the panel, were used in the analysis. The infection response to disease was assigned constant values as recommended by Stubbs et al. [33] with the response types 'resistant-moderately resistant' and 'moderately susceptible-susceptible' coded as 0.3 and 0.9, respectively. The stem rust severity values were multiplied by the infection response values to obtain coefficient of infection values [33], which were used in subsequent analyses.
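The coefficient of infection calculation described above can be sketched in a few lines of Python. This is a minimal illustration only: the function name is hypothetical, and only the two response constants stated in the text are included; constants for other response classes would have to be taken from Stubbs et al. [33].

```python
# Response constants stated in the text (after Stubbs et al.). Other response
# classes are deliberately omitted because the text does not list their values.
RESPONSE_VALUE = {
    "resistant-moderately resistant": 0.3,
    "moderately susceptible-susceptible": 0.9,
}


def coefficient_of_infection(severity_pct, response_type):
    """Multiply severity (0-100 % modified Cobb scale) by the response constant."""
    if not 0 <= severity_pct <= 100:
        raise ValueError("severity must lie between 0 and 100 %")
    return severity_pct * RESPONSE_VALUE[response_type]


# Hypothetical example: two lines with the same severity but different responses.
for response in RESPONSE_VALUE:
    print(response, round(coefficient_of_infection(40, response), 1))
```

Two lines with equal severity therefore receive different coefficients of infection when their infection responses differ, which is what lets the combined value separate resistant from susceptible reactions.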

Seedling stem rust evaluation
Seedling assays were conducted at the United States Department of Agriculture, Agricultural Research Service (USDA-ARS) Cereal Disease Laboratory during the winter months (December to February), from December 2012 through February 2014. The 250 spring wheat lines were evaluated with three virulent races of Pgt: race TTKSK (isolate 04KEN156/04, Ug99), race TRTTF (isolate 06YEM34-1), and race TKTTF (isolate 13ETH18-1). Race TRTTF was detected in Yemen and Ethiopia and characterized as broadly virulent to wheat stem rust resistance genes including Sr13 and Sr1RS Amigo [34]. Seedling assays were performed as described previously by Rouse et al. [35] for evaluating wheat germplasm for reaction to Pgt races. Two biological replicates of the seedling assays were performed for each Pgt race. Infection types (ITs) were recorded on a 0 to 4 scale according to Stakman et al. [36]. ITs less than or equal to 2+ are considered low infection types whereas ITs greater than or equal to 3- are considered high infection types [36]. In order to use the Stakman ITs in the GWAS, the 0-4 scale was converted to a 0-9 linear scale as proposed by Zhang et al. [37] (Additional file 1). The average linear scale score across the two replications was used in the AM analyses.
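The low/high threshold and the replicate averaging described above can be illustrated with a short Python sketch. The ordering of simple infection types below is an assumption about how the Stakman scale is usually ranked, the function names are hypothetical, and the sketch does not reproduce the Zhang et al. [37] conversion table or handle complex ITs such as 22+ or 0; / 3+.

```python
# Simple Stakman infection types ranked from lowest to highest infection.
# This ordering is an assumption for illustration (';' denotes flecks).
IT_ORDER = ["0", ";", "1-", "1", "1+", "2-", "2", "2+", "3-", "3", "3+", "4"]


def is_low_infection_type(infection_type):
    """Return True for ITs <= 2+ and False for ITs >= 3- (thresholds from the text)."""
    return IT_ORDER.index(infection_type) <= IT_ORDER.index("2+")


def mean_linear_score(rep1_score, rep2_score):
    """Average two replicate scores that were already converted to the 0-9 scale."""
    return (rep1_score + rep2_score) / 2


# Hypothetical readings for one line in two replicates.
print(is_low_infection_type("2+"), is_low_infection_type("3-"))
print(mean_linear_score(4, 6))
```

In practice, complex or heterogeneous readings need an explicit rule before conversion; the published conversion scheme should be consulted for that step.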

Statistical analysis
Data sets from each field environment were fitted to a mixed model with environment as a fixed effect and wheat lines as a random effect to correct for data distortion due to trial effects. Using this combined analysis model, best linear unbiased predictors (BLUPs) for each line were obtained in SAS 9.1 and used as the final corrected trait values. After the mean values were normalized for each environment, trait values for each line in all environments were averaged, and used for genome-wide mapping. Heritability on an entry mean basis was calculated based on the method described by Holland et al. [38].
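As a rough illustration of heritability on an entry-mean basis, the Python sketch below assumes the simplest case: each line is observed once per environment, so that genotype-by-environment interaction and plot error are pooled into a single residual variance. The exact estimator the authors took from Holland et al. [38] is not stated in the text; the function name and the variance components in the example are hypothetical.

```python
def entry_mean_heritability(var_genotype, var_residual, n_environments):
    """H2 = var_g / (var_g + var_residual / e) for unreplicated multi-environment data.

    Assumes one observation per line per environment, so the residual variance
    absorbs genotype-by-environment interaction.
    """
    if n_environments < 1:
        raise ValueError("at least one environment is required")
    return var_genotype / (var_genotype + var_residual / n_environments)


# Hypothetical variance components, not values from the study.
print(round(entry_mean_heritability(2.0, 4.0, 4), 3))
```

Averaging over more environments shrinks the residual term, which is why entry-mean heritability is typically higher than heritability on a single-plot basis.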

SNP genotyping and analysis of molecular data
The AM panel was genotyped at the USDA-ARS Cereal Crops Research Unit, Fargo, ND as part of the TCAP project for 90,000 gene-based SNPs using a custom Infinium iSelect bead chip assay following the manufacturer's instructions (Illumina Inc., Hayward, CA) [39]. Allele calls were performed using the computer program GenomeStudio v2011.1 (Illumina Inc.). As genotyping of these lines was carried out as a part of the TCAP, the genotype data were obtained from The Triticeae Toolbox. During data download, SNP markers with a minor allele frequency below 0.05 or with no calls for more than 10 % of the lines were removed from the dataset. Lines that were genotyped at fewer than 20 % of loci were also removed from the dataset. These filtering steps yielded 23,976 SNP markers and 241 lines that were used for genome-wide association analysis.
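The filtering thresholds above can be expressed as a small Python sketch. It is an illustration under stated assumptions, not the procedure used by The Triticeae Toolbox: genotypes are assumed to be coded as allele counts (0, 1, 2) with None for a no-call, and the function and marker names are hypothetical.

```python
def minor_allele_frequency(calls):
    """Minor allele frequency from allele counts (0/1/2), ignoring no-calls (None)."""
    observed = [c for c in calls if c is not None]
    if not observed:
        return 0.0
    freq = sum(observed) / (2 * len(observed))
    return min(freq, 1 - freq)


def filter_markers(genotypes, min_maf=0.05, max_missing=0.10):
    """Keep markers with MAF >= 0.05 and no-calls in at most 10 % of lines."""
    kept = {}
    for marker, calls in genotypes.items():
        missing = sum(c is None for c in calls) / len(calls)
        if missing <= max_missing and minor_allele_frequency(calls) >= min_maf:
            kept[marker] = calls
    return kept


def filter_lines(genotypes, n_lines, min_call_rate=0.20):
    """Return indices of lines genotyped at 20 % or more of the retained markers."""
    markers = list(genotypes.values())
    keep = []
    for i in range(n_lines):
        called = sum(calls[i] is not None for calls in markers)
        if markers and called / len(markers) >= min_call_rate:
            keep.append(i)
    return keep


# Hypothetical toy data: three markers scored on ten lines.
toy = {
    "snpA": [0, 2] * 5,                              # polymorphic, complete
    "snpB": [0] * 10,                                # monomorphic, fails MAF
    "snpC": [0, 2, None, None, 0, 2, 0, 2, 0, 2],    # 20 % no-calls, fails
}
retained = filter_markers(toy)
print(list(retained), filter_lines(retained, 10))
```

Marker filters are applied before the line filter here, following the order in which the steps are described; applying them in the other order could retain a slightly different set.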
Principal component analysis (PCA) on the marker data of the lines was performed by using the Genomic Association and Prediction Integrated Tool (GAPIT) R package [40]. For the purpose of displaying the PCA results, the R function 'princomp' was used to reconstruct the covariance matrix. Principal component 1 scores were plotted against principal component 2 scores for each line in the mapping panel. In order to confirm the PCA results, the lines were also clustered by their genetic distances using JMP Pro 11.0.0 (SAS Institute Inc.) with Ward's hierarchical clustering method [41].
Linkage disequilibrium between the markers was estimated as squared allele-frequency correlations (r²). The package 'LDheatmap' in R was used to calculate r² in each of the A, B, and D genomes of common wheat. The LD decay between the marker pairs for each genome was estimated using the least squares regression function, and is represented by the exponential curve. Map positions of the SNPs were obtained from Wang et al. [39].
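For readers unfamiliar with the statistic, the squared correlation between two markers can be sketched in Python as follows. This is a simplified stand-in, not the estimator implemented in 'LDheatmap': it assumes largely homozygous inbred lines coded as allele counts (0 or 2, with None for a no-call), in which case the squared Pearson correlation of the codes corresponds to r². The function name is hypothetical.

```python
def ld_r2(marker_a, marker_b):
    """Squared Pearson correlation between two markers, skipping no-calls."""
    pairs = [(a, b) for a, b in zip(marker_a, marker_b)
             if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        return float("nan")
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    if var_a == 0 or var_b == 0:
        return float("nan")  # a monomorphic marker carries no LD information
    return cov * cov / (var_a * var_b)


# Hypothetical markers scored on four inbred lines.
print(ld_r2([0, 2, 0, 2], [0, 2, 0, 2]))  # identical allele patterns
print(ld_r2([0, 0, 2, 2], [0, 2, 0, 2]))  # independent allele patterns
```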

Linkage disequilibrium and association analysis
A total of 18,302 mapped SNP markers were used to estimate pairwise LD. LD between markers was calculated for the A, B, and D genomes, and plotted against the genetic distance in centimorgans (cM). The extent of LD between marker pairs was visualized by fitting locally weighted polynomial regression (LOESS) curves to the scatter plot.
Genome-wide association analysis of marker-trait associations was performed using the R package GAPIT [40], with the growth stages of genotypes included as a covariate for Pgt resistance. The population parameters previously determined (P3D) model [42] was used to conduct association analyses with all trait data. Based on model selection using the Bayesian information criterion (BIC), a kinship-mixed linear model (K-MLM) approach, which controls the inflation of the Type I error rate caused by relatedness, was used for all traits.

Phenotypic data
Adequate disease pressure was observed in each environment to discriminate among the entries, as indicated by highly significant (p < 0.01) F-values from ANOVA results of the field data (results not shown). Distribution of rust severities in all environments indicated a quantitative mode of disease distribution with the Njoro main season (KenMain13) recording both higher severity and wider distribution of scores (Fig. 1a). On average, however, disease severity was relatively lower in the Njoro environments, with mean values of 33 % during the 2013 off-season and 35 % during the 2013 main-season, compared to Debre Zeit, Ethiopia, with 52 % severity during the 2013 off-season and 50 % during the 2014 off-season. Moderate correlations (r ranging from 0.44 to 0.57) among the field data from the four seasons were observed (Table 1). The estimated broad sense heritability across the four environments was 0.72.
The complete panel of 250 lines was inoculated with Pgt races TTKSK, TRTTF, and TKTTF at the seedling stage to uncover genetic factors contributing resistance to these virulent stem rust races. Replications of the seedling tests were highly correlated, with r-values ranging from 0.84 to 0.99 (p < 0.001). However, the pair-wise correlations among the three races were low and not significant (data not shown). Most of the lines screened for seedling resistance against race TTKSK were susceptible, and only 15 (6 %) showed resistant reactions (IT 22+ or lower) (Fig. 1b). Seedling resistance was more common against races TRTTF and TKTTF, with 44 % and 63 % of lines showing low infection types, respectively. Of the seedling resistant lines, nine lines showed resistance to all three races; three lines showed resistance only to races TTKSK and TRTTF; and five lines showed resistance only to races TTKSK and TKTTF (Additional file 1). The University of Minnesota cultivar 'Thatcher' was heterogeneous for resistance to race TTKSK (IT 0; / X in the first replication and IT 0; / 3+ in the second replication) yet susceptible to both TRTTF and TKTTF (IT 33+ to both races in both replications). The average adult plant disease severity of lines with seedling resistance to race TTKSK was lower in all four environments than that of lines susceptible to this race (t-test p-value of 0.02 at α = 0.05; Additional file 1).

Population structure
To investigate the population structure of the germplasm panel, the genotypes were analyzed for clustering [...] Saskatchewan, SDSU, and UMN form a single subcluster; and Cluster 2 into four sub-clusters with lines from CIMMYT, UCD, UI, and WSU. The population stratification and germplasm sharing among the lines revealed by the PCA was also corroborated by the results from hierarchical clustering using Ward's method in JMP (Additional file 1).

Linkage disequilibrium
In the A-genome, LD declined to 50 % of its original value at about 8 cM (Fig. 3a), whereas the corresponding values for the B-genome and the D-genome were about 7 cM and 6 cM, respectively (Figs. 3b and c, respectively). These values are similar to the LD values reported by Chao et al. [43] in their detailed LD characterization of wheat varieties having different growth habits from several breeding programs.
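The distance at which LD falls to half of its initial value can be illustrated with a simple exponential model, r²(d) = a·exp(−b·d), for which the half-decay distance is ln(2)/b. The Python sketch below fits that model by least squares on the log scale. It is an assumption-laden illustration: the authors' exact curve-fitting procedure is not specified in detail, the function names are hypothetical, and the data in the example are synthetic.

```python
import math


def fit_exponential_decay(distances_cm, r2_values):
    """Least-squares fit of ln(r2) = ln(a) - b * d; returns (a, b)."""
    points = [(d, math.log(r)) for d, r in zip(distances_cm, r2_values) if r > 0]
    n = len(points)
    mean_d = sum(d for d, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((d - mean_d) ** 2 for d, _ in points)
    sxy = sum((d - mean_d) * (y - mean_y) for d, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_d
    return math.exp(intercept), -slope


def half_decay_distance(b):
    """Distance at which r2 drops to 50 % of its value at d = 0."""
    return math.log(2) / b


# Synthetic, noise-free data generated from a = 0.8 and b = 0.1 per cM.
distances = [0, 2, 4, 8, 16]
r2 = [0.8 * math.exp(-0.1 * d) for d in distances]
a, b = fit_exponential_decay(distances, r2)
print(round(a, 3), round(b, 3), round(half_decay_distance(b), 2))
```

With real marker pairs the scatter is large, so smoothing (for example a LOESS curve) or a nonlinear fit on the original scale may give a somewhat different half-decay distance than this log-linear fit.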

Association analysis

APR mapping
Initial association analysis was conducted on all 241 lines, without removing any lines that showed resistance to race TTKSK during seedling screening of the lines as described later. This approach detected 24 SNP markers on seven different chromosomes (2A, 2B, 3B, 4A, 6A, 6B, and 7A) that were significantly associated (p-value <0.001) with field resistance to Ug99 (Table 2). In addition, five significant markers with unknown map positions were detected. The phenotypic variance explained by these 29 SNP markers ranged from 0.2 % to 4.6 %.
As the germplasm in the panel was known to possess ASR genes that are effective against the Ug99 race group, it was assumed that these ASR genes could mask potential APR genes. Therefore, in an attempt to detect loci conferring APR, lines that were resistant to race TTKSK during seedling screening were removed. Lines that were resistant (IT ranging from 0 to 2+) in either of the two replications of seedling screening, as well as lines with a complex score combining low and high ITs (for example 2+3-), were also removed. This filtering step yielded a subset of 219 lines for the APR-specific genome-wide association analysis.
The APR-specific AM approach identified 26 SNP markers associated with APR to the Ug99 race group (Table 2). Of the 26 significant SNPs, 23 SNPs were distributed across nine chromosomes (1B, 2A, 2B, 2D, 3B, 4A, 5A, 7A, 7B), whereas the remaining three SNPs were unmapped. Six of the 23 mapped SNPs were also detected in the initial analysis on the whole set of 241 lines. The R² values for the significant SNPs ranged from 0.1 % to 0.6 % (Table 2).
Significant SNP markers detected on the complete panel and by APR-specific mapping were cross-checked with the GWAS results for each of the four environments. While no significant SNPs were detected in the Kenya 2013 off-season environment, three, six, and nine SNPs were confirmed in the Ethiopia off-season 2013, Ethiopia off-season 2014, and Kenya main-season 2013 environments, respectively (Table 2). Two SNPs, IWA3120 (mapped to 1B) and IWB35697 (mapped to 6B), were common between the Ethiopia 2013 and 2014 environments. No SNP markers were common to all four environments.

Mapping of seedling resistance
The genome-wide scan for SNPs linked with seedling resistance to race TTKSK detected 16 significant SNP markers on chromosomes 1D, 3B, 4A, and 5B, and 7 additional significant SNP markers with no mapped locations (Table 3). These SNPs explained 2 to 7 % of the variation observed in race TTKSK seedling resistance. The results also revealed that the loci conferring seedling resistance to TTKSK are different from those involved in APR to Ug99 (Table 3, Table 2).

Discussion
Wheat stem rust disease has been primarily controlled by the use of resistance genes discovered in hexaploid wheat and its related species. However, the Ug99 race group has defeated many of the widely deployed resistance genes, and thus poses a threat to wheat production globally. Moreover, several of the previously identified genes discovered in wild progenitors or landraces are not desirable for use in resistance breeding because of linkage drag [3,44]. Therefore, discovery of loci contributing resistance to Ug99 and other virulent races in elite breeding germplasm is a clear advantage. The resistance uncovered in this study, which examined elite germplasm from North American breeding programs, can provide a valuable resource for the fight against Ug99 and stem rust in general. No SNP markers were significant across all four field environments; differences among the disease environments with regard to races present, temperature, and other environmental factors, as well as locus by environment interaction, are likely involved in this lack of consistency. The lack of strong correlations among the environments also corroborates this assumption (Table 1).

Comparison of significant APR Loci with published studies
The map locations of significant SNP markers in our study, obtained from Wang et al. [39], were compared to positions of markers and genes/quantitative trait loci (QTL) reported in previous mapping studies conducted to uncover loci associated with stem rust resistance. In this section, we used the integrated genetic map consisting of different marker types generated by Maccaferri et al. [45] to obtain the relative distances between previously reported markers and the significant markers in our study.
Five significant SNP markers (IWA3120, IWB21176, IWB31027, IWB56771, IWB59663) were detected at position 90 cM on chromosome 1B, of which all except IWA3120 were also detected in Ethiopia 2013. The marker cfd48 reported by Pozniak et al. [46] in a durum wheat (Triticum durum Desf.) GWAS is located 4 cM from the SNP markers we detected, and could represent the same locus. Bhavani et al. [47] and Njau et al. [48] both reported the marker wPt-1560 on 1BL to be associated with Ug99 resistance in separate spring wheat RIL populations. This marker, as well as Sr58, an APR gene for stem rust of wheat [49,50], is located at a distance of >50 cM from these five SNP markers. QTL on chromosome 2A providing APR to Ug99 have also been reported, mainly in durum wheat mapping populations. Letta et al. [51] detected gwm1045 to be significantly associated with Ug99 resistance in a durum wheat AM panel; and Haile et al. [52] reported a QTL linked to the marker gwm1198 on 2A that confers resistance to Ug99 in the durum wheat population Kristal/Sebatel. Neither of these markers was in proximity to the SNP markers detected in our study. We detected one significant marker, IWB8481, located at 9 cM on chromosome 2D. The only reported QTL on 2D that provides APR to Ug99 and its derivative races is in the CIMMYT biparental population PBW343/Kiritati [47]. Two Sr genes, Sr32 and Sr46, have been mapped to the short arm of 2D [49,53], and both provide resistance to Ug99 [44]. It should be noted that Sr32 has also been introgressed to 2A and 2B [54], but is not expected to be present in the 250 lines analyzed in this study. We used the Sr32 markers developed by Mago et al. [53] to screen our panel but found the markers not to be predictive of the gene (Additional file 1). As no reliable marker for Sr46 has been developed, we are unable to distinguish between these two genes and the marker we found on 2D.
The marker IWA4275 detected on chromosome 2B (position 197 cM) in our study is very close (distance of 2.7 cM) to the marker wPt-8460, known to be linked to Sr9h in the 1956 Rockefeller Foundation cultivar Gabo 56 (CI 14035) [11]. The same marker was also reported by Yu et al. [55] in their association mapping study of CIMMYT spring wheat germplasm. Sr9h, previously temporarily designated as SrWeb, is derived from the Canadian wheat cultivar 'Webster' (RL6201) and confers ASR effective against TTKSK [56]. Markers developed by Rouse et al. [11] showed that Sr9h is present in 13 lines (5 %) in our panel (Additional file 1), implying that IWA4275 could represent the Sr9h locus in our panel. The gene Sr9a is also located on 2BL [55,57], but is ineffective to Ug99 [44]. Nine SNP markers were detected on the short arm of chromosome 3B, with 8 SNPs located in the range 11.5-14.1 cM and one additional SNP at 32.2 cM. The 8 SNPs in the range 11.5-14.1 cM may be proximal to Sr2, a highly important APR gene for stem rust of wheat [3,59]. Upon marker screening, it was found that 22 lines (9 %) in the panel contain Sr2 (Additional file 1). This gene is used extensively in the CIMMYT spring wheat breeding program, and is shared by some US breeding programs that also incorporated this gene in their germplasm for broad-spectrum resistance. It is possible that the SNP at 32.2 cM is associated with Ug99 resistance that has been observed near the Sr12 locus [60]. Another stem rust APR gene, Sr57 [61], is located on chromosome 7D. Screening of the panel using the sequence-tagged site marker developed by Lagudah et al. [62] showed that 97 lines (39 %) could contain Sr57. For two other stem rust APR genes, Sr55, located on 4D [63], and Sr56, located on 5B [64], no diagnostic markers are available. As no SNP markers were detected on chromosomes 4D and 5B during the analysis, we believe these genes are not present in our mapping panel.
Several QTL located on chromosome 4A that provide resistance to Ug99 have been reported in association mapping studies [55,65,66], and in biparental studies [47] in CIMMYT germplasm. These sources of stem rust resistance are not located in the vicinity of the SNPs IWB46973, IWB56556, and IWB67877 that we detected on 4A. Similarly, QTL on chromosome 5A providing resistance against Ug99 have been reported in biparental and association mapping studies [46,47]. However, chromosome positions of the QTL and significant loci reported in these studies differ from those detected in our study. We detected only one significant SNP (IWA233) on 6AS. Mapped at 66 cM, this SNP is located far (>100 cM) from the marker gwm617 reported by Pozniak et al. [46], and from the marker Sr26#43 linked to Sr26, which provides resistance to Ug99 and its derivative races [55,67]. Marker screening confirmed that Sr26 is absent in the panel under study (Additional file 1). Several QTL effective to Ug99 and its derivative races have also been discovered on chromosome 6B [4,49]. Of the reported QTL, the DArT marker wPt-6116 in the AM study conducted by Yu et al. [65] is located very close to the significant markers detected in this study: 1.1 cM from IWB24757 and 2.2 cM from IWB45581. The gene Sr11 is located on 6BL, but is ineffective to Ug99 and its derivative races [3,5]. Likewise, several QTL have been reported on 7A that provide field resistance to Ug99 [46,47,51,52,68]. However, none of the reported QTL or positions of significant marker effects coincide with the significant markers detected in this study. Two 7B SNP markers, IWB47548 and IWA4175, were significantly associated with resistance to Ug99. Letta et al. [51] have reported loci associated with resistance to Ug99 in durum wheat germplasm; however, they are located at a large distance (>50 cM) from both markers in our study.
(Table footnote: The '+' sign indicates that the SNP was also detected in GWAS results in each of the environments. The '-' sign indicates that the SNP was detected only in combined analysis of all environments, and not in individual environments.)
The significant SNP markers associated with APR to Ug99 reported in this study point to several resistance loci for fighting the disease, some of which are likely novel. Validation of the significant markers on all chromosomes is essential to confirm the identity of the associated resistance loci as well as to test their usefulness in marker-assisted resistance breeding.

Comparison of significant seedling-resistance Loci with existing resistance genes
The results of the GWAS for seedling-resistance in this study were compared with previous findings for ASR to stem rust of wheat. As the discovered SNPs are suspected to be linked primarily with existing or putatively novel resistance genes, a search for similarities in chromosomal location with known resistance genes was emphasized. We detected 23 SNPs in our germplasm panel that were significantly associated with race TTKSK resistance at the seedling stage. The SNP marker IWA642 mapped at 67.7 cM on 1D is relatively close to Sr50, a gene that provides resistance to the Ug99 group of races [28]. The seedling resistance genes SrCad and SrTmp are considered to be present in the panel used in this study. SrCad is a stem rust resistance gene derived from the Canadian wheat lines 'Peace' and 'AC Cadillac', and is effective to Ug99 and its derivative races [69]. Located on chromosome 6D, this gene confers a highly resistant reaction (IT of 1 to 12) to TTKSK in seedling stages, and is moderately resistant to Ug99 in field nurseries. SrCad has not been shown to be different from Sr42 in either map position or resistance specificity [70]. SrTmp is another gene resistant to TTKSK, yet no SNPs were detected on 4B where the SrTmp gene is thought to be located [3]. Additional data suggest that SrTmp may be located on 6DS at a similar location to Sr42/SrCad [71]. We used two markers, the SSR marker cfd49 [72] and a SNP marker (Gao et al., unpublished), to screen the panel for presence/absence of Sr42. Results indicated that at least 71 lines in the panel could carry this gene, yet the markers did not support each other (Additional file 1). Nor did the results of either marker corroborate our TTKSK seedling screening results. We are not aware of any study carried out on broad germplasm to determine if these two markers are diagnostic or even predictive. From our results, it appears that they are neither diagnostic nor predictive of Sr42/SrCad.
Similarly, Sr9h confers a resistant reaction to race TTKSK at the seedling stage (infection type 1 to 2) [56]. The presence of Sr9h was confirmed by marker screening, as discussed above. Except for the likely presence of Sr9h, the positions of the loci conferring Ug99 resistance in this study suggest that genes other than those discussed above could be present in our panel. We suspect that association mapping is limited by the low frequency of resistance (only 15 (6 %) of 250 lines were resistant to TTKSK), leading to a lack of detection of SNPs significantly associated with SrCad or SrTmp. Since no ASR genes effective against Ug99 are known to exist on chromosomes 3B and 5B, our findings indicate that the North American breeding germplasm might contain previously undiscovered, important sources of resistance to the disease.
The genes Sr24 and Sr36 confer resistance to race TTKSK, yet no SNPs associated with resistance to this race were detected on the chromosomes containing these genes. Sr24, located on 3DL, is widely used in Mexico and the USA; Sr36, located on 2BS, is known to be present in wheat lines in the USA [3]. The lack of detection of these genes can be attributed to either 1) germplasm carrying these genes not being represented in our GWAS panel; or 2) if present, the allele frequency being too low to pass our stringent analysis filters. Upon marker screening (Additional file 1), we discovered that Sr36 is not present in our panel, and that only 7 lines (3 %) contain Sr24, confirming our assumptions.
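The allele-frequency argument can be checked with simple arithmetic. The following is a minimal sketch, assuming a minor allele frequency threshold of 5 % (an illustrative value; the filter actually applied in this study is not restated in this section) and treating each inbred line as a single homozygous genotype. The function names are hypothetical.

```python
def carrier_frequency(n_carriers: int, n_lines: int) -> float:
    """Fraction of lines in the panel that carry the allele."""
    if n_lines <= 0:
        raise ValueError("panel must contain at least one line")
    if not 0 <= n_carriers <= n_lines:
        raise ValueError("carrier count must lie between 0 and panel size")
    return n_carriers / n_lines


def passes_maf_filter(n_carriers: int, n_lines: int, threshold: float = 0.05) -> bool:
    """True if the minor allele frequency reaches the (assumed) threshold."""
    freq = carrier_frequency(n_carriers, n_lines)
    minor = min(freq, 1.0 - freq)
    return minor >= threshold


PANEL_SIZE = 250      # lines in the GWAS panel (from the text)
SR24_CARRIERS = 7     # lines carrying Sr24 by marker screening (from the text)

sr24_frequency = carrier_frequency(SR24_CARRIERS, PANEL_SIZE)   # 7 / 250
sr24_retained = passes_maf_filter(SR24_CARRIERS, PANEL_SIZE)    # assumed 5 % cutoff
```

Under this assumed cutoff, an allele present in 7 of 250 lines (about 3 %) would be filtered out before association testing, which is consistent with the absence of significant SNPs near Sr24.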
Of the 77 mapped SNPs significantly associated with resistance to TRTTF, 57 were located on the short arm of chromosome 6A (position range 2 cM to 26 cM). These markers are most likely linked to the gene Sr8a, which is located on 6AS and is effective against race TRTTF [34,73]. Similarly, the SNP marker IWB48466, located on the long arm of 7A (217 cM), is in the same region as the stem rust resistance gene Sr22. This gene was introgressed into 7AL of hexaploid wheat from its diploid relative Triticum boeoticum [74], and is effective against TRTTF [75]. However, screening of the GWAS panel with a robust sequence tagged site (STS) marker developed by Periyannan et al. [76] showed that Sr22 is not present in the panel. Sr31, while ineffective against TTKSK, is effective against TRTTF, and is located on 1BL [75,77]. Our GWAS detected 13 significant SNPs, all on the short arm of chromosome 1B (position range 44 cM to 65 cM). Given the presence of CIMMYT lines in our panel and the widespread use of Sr31 in breeding programs, lines known to carry Sr31 need to be screened with these markers to determine whether the markers are linked to Sr31, or whether a novel source of resistance to TRTTF is located on 1BS.
Chromosome 1DS is known to harbor multiple Sr genes [49], and could be represented by the two SNPs that were detected on 1DS in our analysis. We also discovered markers on 5DL and 6BS associated with resistance to TRTTF. As no ASR genes effective against TRTTF are known to exist on 5DL and 6BS, the North American elite breeding germplasm likely possesses novel genes for resistance to the Yemeni stem rust race TRTTF.
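The reasoning used for TRTTF, grouping significant SNPs by chromosome arm and comparing each cluster with genes known on that arm, can be sketched as follows. This is an illustrative sketch only: the marker names other than IWB48466, the individual positions within the reported ranges, and the 5DL position are hypothetical placeholders, and the `KNOWN_GENES` lookup contains only the arm-to-gene pairs mentioned in the text.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Optional, Tuple


class Hit(NamedTuple):
    marker: str
    arm: str            # chromosome arm, e.g. "6AS"
    position_cm: float  # genetic map position in cM


# Genes effective against TRTTF on each arm, as named in the text.
# Arms that are absent from this lookup have no known ASR gene in this sketch.
KNOWN_GENES: Dict[str, str] = {"6AS": "Sr8a", "7AL": "Sr22"}

Summary = Tuple[int, float, float, Optional[str]]


def summarize_by_arm(hits: Iterable[Hit]) -> Dict[str, Summary]:
    """Return (count, min cM, max cM, candidate gene or None) per arm."""
    positions = defaultdict(list)
    for hit in hits:
        positions[hit.arm].append(hit.position_cm)
    return {
        arm: (len(pos), min(pos), max(pos), KNOWN_GENES.get(arm))
        for arm, pos in positions.items()
    }


# Hypothetical toy hits; only the arm names and IWB48466 come from the text.
toy_hits = [
    Hit("snp_a", "6AS", 2.0),
    Hit("snp_b", "6AS", 26.0),
    Hit("IWB48466", "7AL", 217.0),
    Hit("snp_c", "5DL", 100.0),
]

summary = summarize_by_arm(toy_hits)
putatively_novel = sorted(arm for arm, s in summary.items() if s[3] is None)
```

Arms that return no candidate gene are only putatively novel. A co-located known gene is likewise only a candidate, since Sr22 lies in the same 7AL region as IWB48466 yet marker screening indicated that it is absent from the panel.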
One hundred and nine SNPs associated with seedling resistance to the newly detected Ethiopian stem rust race TKTTF were found on five chromosome arms: 1AS, 4AL, 5AL, 6BL, and 7AS. The 52 SNPs on 6BL, distributed over the positional range of 109 cM to 123 cM, likely represent the gene Sr11, which is effective against this race. Fifty-one significant SNPs were located on 4AL (142 cM to 164 cM), possibly indicative of the resistance gene Sr7a (TKTTF is virulent to Sr7b). No ASR genes are known to be located on chromosome 5AL, and therefore the germplasm under study may possess a new source of resistance to race TKTTF in this region. APR QTL providing resistance to Ug99 and its derivative races have been detected in the 1AS region [49,55]. Additionally, the gene Sr1RS Amigo is located on the 1RS.1AL rye chromosome arm translocation. Chromosome 7AS does not possess any known ASR genes, yet APR QTL effective against Ug99 and its derivative races have been detected in the region [49,68].
None of the SNPs associated with seedling resistance were shared among the three races, suggesting that none of the genes in this material are broadly effective. Further studies involving the development of populations for fine mapping and allelism tests are required to elaborate and confirm the nature of the genetic mechanisms controlling resistance to these three rust races.
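The check for markers shared among races amounts to intersecting the sets of significant SNPs per race. The following is a minimal sketch with hypothetical placeholder marker names (only IWA642 and IWB48466 are named in the text); the full marker lists are in Additional file 1.

```python
from itertools import combinations
from typing import Dict, Set, Tuple


def shared_markers(hits: Dict[str, Set[str]]) -> Set[str]:
    """Markers significant for every race in the dictionary."""
    marker_sets = list(hits.values())
    return set.intersection(*marker_sets) if marker_sets else set()


def pairwise_shared(hits: Dict[str, Set[str]]) -> Dict[Tuple[str, str], Set[str]]:
    """Markers significant for each pair of races."""
    return {
        (a, b): hits[a] & hits[b]
        for a, b in combinations(sorted(hits), 2)
    }


# Hypothetical toy marker sets, one per race.
hits = {
    "TTKSK": {"IWA642", "snp_x"},
    "TRTTF": {"IWB48466", "snp_y"},
    "TKTTF": {"snp_z", "snp_w"},
}

common_to_all = shared_markers(hits)
common_by_pair = pairwise_shared(hits)
```

An empty intersection for every pair, as in this toy example, corresponds to the situation described above, in which no single detected locus is associated with resistance to more than one race.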

Using breeding lines in GWAS
One of the main advantages of conducting association mapping on a panel consisting of breeding germplasm is the ability to explore the genetic composition of the lines and to estimate the effects of loci significantly associated with the trait(s) of interest. The discovery of significant SNPs allows lines enriched for alleles associated with the trait to be tagged and used in gene introgression for resistance breeding. More importantly, as the lines used in this AM study are elite, they possess the desired agronomic traits and are adapted to the desired regions. This helps to avoid the linkage drag that could otherwise arise when more diverse germplasm is used to introgress alleles of interest. Singh et al. [5] have reported that up to 95 % of germplasm from global seed collections and breeding programs is susceptible to Ug99. As Ug99 and its derivative races have not yet been observed in North America, it is prudent to prepare for their possible arrival by developing resistant varieties. Discovery of resistant sources in existing breeding programs can speed up gene introgression into elite lines, gene pyramiding for elevated resistance to the disease, and the possible identification of diagnostic markers for use in marker-assisted resistance breeding. Germplasm sharing among the breeding programs for this purpose, at least within the US, is plausible given the genetic similarity among the lines, as observed in Fig. 2. The availability of SNP alleles associated with reduced disease severity (as well as increased severity) at both the adult plant and seedling stages (Additional file 1) should help breeders select lines to be used as parents in their breeding programs. Breeders may also use the significant SNP markers we have provided to design assays for marker-assisted selection or for screening resistant materials in their own breeding programs.
Additionally, we list the lines that exhibited high levels of APR to Ug99 (Table 6) and seedling resistance to TTKSK (Table 7). The complete genotypic and phenotypic data presented in this study have also been made available on The Triticeae Toolbox (T3 webportal) with the goal of facilitating line selection based on Sr marker associations. We are confident that the North American wheat breeding programs can fortify the stem rust resistance in their germplasm by capitalizing on the information provided in this study.

Conclusions
In this study, we report the frequency and variability of seedling resistance and APR to virulent exotic Pgt races present in North American spring wheat breeding germplasm. Several loci were found to be significant, which indicates that, despite the relatively narrow goals for germplasm development, the current North American breeding germplasm contains enough genetic variation to breed for resistance against the virulent stem rust races, including the Ug99 race group.
While only a small portion (6 %) of the germplasm showed seedling resistance, the analysis of APR to Ug99 revealed several likely novel genomic regions associated with resistance. The lines that performed well at either or both growth stages (seedling and adult) could be used immediately in crosses with elite lines to generate lines with improved rust resistance. Specific crosses could also be made to create mapping populations to fine map the regions of interest in an effort to identify diagnostic markers linked with the resistance loci. The discovery of such diagnostic markers will add great value to recurrent selection breeding programs and to the identification of lines that carry the resistance loci. Further characterization and validation of the detected loci is therefore necessary for effective utilization of these results. The availability on the T3 webportal of the marker and trait data generated for this GWAS panel should enable interested groups to pursue these studies.

Availability of supporting data
The genotypic data generated on this GWAS panel and used in this article are available under the genotyping experiment 'TCAP90K_SpringAM_panel' in The Triticeae Toolbox repository (https://triticeaetoolbox.org/wheat/display_genotype.php?trial_code=TCAP90K_SpringAM_panel). The phenotypic data collected on this GWAS panel and used in this article are available under the experiment set 'USSpring_GWAS' in The Triticeae Toolbox repository (https://triticeaetoolbox.org/wheat/view.php?table=experiment_set&uid=48). The data sets supporting the results of this article are included within the article and its additional files.