Skip to main content

Genome-wide association studies of seven agronomic traits under two sowing conditions in bread wheat



Wheat is a cool seasoned crop requiring low temperature during grain filling duration and therefore increased temperature causes significant yield reduction. A set of 125 spring wheat genotypes from International Maize and Wheat Improvement Centre (CIMMYT-Mexico) was evaluated for phenological and yield related traits at three locations in Pakistan under normal sowing time and late sowing time for expose to prolonged high temperature. With the help of genome-wide association study using genotyping-by-sequencing, marker trait associations (MTAs) were observed separately for the traits under normal and late sown conditions.


Significant reduction ranging from 9 to 74% was observed in all traits under high temperature. Especially 30, 25, 41 and 66% reduction was observed for days to heading (DH), plant height (PH), spikes per plant (SPP) and yield respectively. We identified 55,954 single nucleotide polymorphisms (SNPs) using genotyping by sequencing of these 125 hexaploid spring wheat genotypes and conducted genome-wide association studies (GWAS) for days to heading (DH), grain filled duration (GFD), plant height (PH), spikes per plant (SPP), grain number per spike (GNS), thousand kernel weight (TKW) and grain yield per plot (GY). Genomic regions identified through GWAS explained up to 13% of the phenotypic variance, on average. A total of 139 marker-trait associations (MTAs) across three wheat genomes (56 on genome A, 55 on B and 28 on D) were identified for all the seven traits studied. For days to heading, 20; grain filled duration, 21; plant height, 23; spikes per plant, 13; grain numbers per spike, 8; thousand kernel weight, 21 and for grain yield, 33 MTAs were detected under normal and late sown conditions.


This study identifies the essential resource of genetics research and underpins the chromosomal regions of seven agronomic traits under normal and high temperature. Significant relationship was observed between the number of favored alleles and trait observations. Fourteen protein coding genes with their respective annotations have been searched with the sequence of seven MTAs which were identified in this study. These findings will be helpful in the development of a breeder friendly platform for the selection of high yielding wheat lines at high temperature areas.


As the world population will be increased to 9 billion at the end of twenty-first century, it is predicted that the food demand, especially for wheat, will be increased by 50% by 2030 and 70% by 2050 [1, 2]. On the other hand, the mean temperature in South Asia will increase by 4 °C until 2050 or by end of this century [3]. Significant wheat yield losses of 32 to 39% worldwide [4] and 40% in less developed-irrigated wheat growing areas [5] have been reported. During grain filling duration, high temperature (> 30 °C) also occurs in 40% of the temperate zones that grow 36 million ha of wheat [6].

Around 6% decline in global wheat production [7] and 3–4% in indo-gigantic planes [8] has been assessed for each degree rise in temperature during the reproductive stage of the crop. High temperature decreases chlorophyll content and photosynthetic capacity of leaves [9], decreases the grain number per spike due to floret abortion, and lowers the grain weight [10]. Grain filling rate and ultimately the yield are highly affected as the temperature increases above 30 °C during anthesis [11]. Stone and Nicolas [12] reported that exposure of 32 °C for 4 days reduced yield by 23%. Losses due to heat stress are phenomenal and compelling wheat researchers to focus more on heat stress as compared to other abiotic stresses such as drought, metal or radiation etc.

To cope with high temperature in wheat during reproductive stage is desired by the breeders. Traits affected by high temperature are investigated to develop heat tolerant wheat cultivars. It is therefore essential to grow wheat lines under normal and heat stressed conditions to examine the variability. Using conventional wheat breeding approaches, valuable traits related to stress tolerance have been incorporated. However, it is time consuming due to limited field screening methods and laborious due to low probability of combining coveted alleles [13]. Progeny screening can be more precise and genetic gains can be accelerated if the conventional approaches are aided by molecular based techniques.

To dissect the genetic basis of the quantitative traits associated with heat tolerance, loci for grain filling duration and heat tolerance were mapped [14, 15] with few simple sequence repeats (SSRs). Other quantitative trait loci (QTL) for heat tolerance using SSRs were then identified [16, 17]. These studies were based on low marker density using SSRs. QTL mapping identifies genomic regions with low resolution; it hinders their use in diverse germplasm [18].

In order to obtain better genomic dissection of complex quantitative traits like heat tolerance, a high-throughput and robust genotyping approach is in demand [19]. Most recently Diversity Arrays Technology (DArT) markers were used to map agronomic traits in heat prone areas [20]. Genotyping with SSRs and DArT is cost-ineffective, laborious and less accuracy of prediction [19]. Genotyping-by-sequencing (GBS) is a high-throughput, cost effective, next generation sequence-based technology [21] which detects single nucleotide polymorphisms (SNP). GBS markers allow researchers to obtain high prediction accuracy as compared to previously used array of hybridization-based markers [22].

Based on such high density SNP markers, genome-wide association study (GWAS) can inspect large gene pools representative of diverse breeding reservoirs. GWAS is the most suitable approach to locate robust QTLs that show effect in both normal and stressed conditions. Recently GWAS has been effectively used to map QTLs for grain weight, drought stress, floret fertility, grain architecture and starch granule size [23,24,25,26,27].

Our objective was to use a high-density (GBS) platform through GWAS to identify chromosomal regions potentially associated with days to heading (DH), grain filling duration (GFD), plant height (PH), spikes per plant (SPP), grain number per spike (GNS), thousand kernel weight (TKW) and grain yield (GY) under normal and late sown conditions. Information from this scientific work will help wheat breeders for genetic improvement in the development of heat tolerant lines.


Phenotypic assessment

Description of seven studied trait at three locations under two treatments (NS: Normal and LS: Late sown) indicates significant (p-value < 0.01) effect on 125 genotypes (Table 1). For each trait, the difference recorded between NS and LS condition across all the environments was above the critical difference i.e. least significant difference (LSD at p-value 0.05). It was confirmed that the late sown trials faced high temperature from anthesis to physiological maturity that affects each studied trait. Yield losses due to high temperature ranged from 66 to 76% on average at three locations. Plant height was reduced by 25, 30 and 20% on average at Islamabad (ISD), Sargodha (SGD) and Bahawalpur (BWP) respectively. Average days to heading were decreased from 126 to 88 at ISD, 123 to 81 at SGD, 108 to 78 at BWP. Grain filling duration was decreased by 9 to 34%. Mean square of genotypes was also separately calculated for each location and treatment as a one-way source of variation to observe the inherent potential of the lines under study. Mean square of genotypes, locations and treatments for all studied traits were found to be significant (p-value < 0.001) with the degree of precision (CV%) ranging from 6.5 to 18.52 (Table 2). Interaction mean squares for all the traits except GFD were significant.

Table 1 Description of studied traits at three locations (Islamabad: ISD, Sargodha: SGD and Bahawalpur: BWP) along with over all mean, mean squares, percent reduction due to late sown and heritability
Table 2 Mean squares of studied traits from two-way analysis of variance

In NS conditions, GY was positively correlated with SPP (r = 0.24) and GNS was correlated with TKW (r = 0.26), DH (r = 0.21) and PH (r = 0.23); whereas GFD was significantly negative correlated with DH (r = − 0.56) and positive with SPP (r = 0.20) at p-value < 0.05 (Fig. 1). In LS conditions, GY was significantly negative correlated with DH (r = − 0.28) and positively correlated with GFD (r = 0.18), PH (r = 0.31), SPP (0.41) and GNS (r = 0.25); whereas DH was significantly negative correlated with GFD (r = − 0.73) and PH (r = − 0.25) at p-value < 0.05.

Fig. 1

Pearson’s coefficient (r) of correlation among traits studied under normal sown (upper right diagonal) and late sown (lower left diagonal) conditions with traits arranged according to first principal component (FPC). The absolute value of 0.189 and 0.184 was found significant for normal sown and late sown trials respectively. DH: days to heading, GFD: grain filled duration, PH: plant height, SPP: spikes per plant, GNS: grain number per spike, TKW: thousand kernel weight and GY: grain yield

Biplots were constructed for NS and LS conditions separately. The two principal components explained 45.27% variability in NS and 53.09% variability in LS conditions. Traits TKW, GNS, PH, GY, SPP occupied the same section in the biplot which indicates similar pattern of variability among these traits. These traits were found correlated to each other. On the other hand GFD and DH were opposite to each other in two dimensional biplot spread indicates their highly negative correlation. Genotypes with serial number 6, 81, 86, 97 and 39 performed well in normal sown conditions while 111, 120, 121, 110 performed poorly under NS conditions. Under LS conditions, genotypes with serial numbers 87, 120, 38 and 34 were found efficient with respect to GY, GNS, PH and SPP traits (Fig. 2). Genotype with serial number 34 was (waxwing*2/Heilo), the released wheat variety “Lemu”, genotype with serial number 38 was developed from Kachu#1/Kiritati//Kachu while the genotype with serial number 87 has well known synthetic wheat in its pedigree. Genotype with serial number 120 was the line having Pastor, Kakuna, Milan, Kauz in its pedigree, further detail has been given in Additional file 1: Table S1. Genotypes with serial numbers 104, 88 and 113 showed the lowest DH and the highest GFD under LS condition.

Fig. 2

Biplot of studied traits with 125 genotypes under normal sown conditions (a) and under late sown conditions (b). PC1 and PC2 are the principal components along x-axis and y-axis respectively. Numbers are genotypes serial number as in Additional file 1: Table S1. Trait legends are same as in Table 1

To check the overall performance of each genotype in NS and LS conditions, squared cosine values were calculated from principal component analysis (PCA). In Fig. 3, red colored rectangles indicate significant squared cosines of genotypes at F1 (PC1) and F2 (PC2). Genotypes with significant scores in LS conditions performed best with respect to all seven traits. These genotypes can be the genetic resource in the development of heat tolerant cultivars.

Fig. 3

Heat map of squared cosine values of 125 lines in normal and late sown (NS, LS) conditions through factorial analysis along with F1 and F2 on the basis of seven traits

Marker distribution, principal component analysis (PCA), agglomerative hierarchical clustering (AHC) and linkage disequilibrium (LD)

In total 87,096 SNPs were identified but 55,954 GBS markers were filtered with MAF > 0.05, average polymorphism information content (PIC) and diversity was 0.28 and 0.32, respectively. Out of these 55,954 markers, 35% were from A genome, 41% from B genome and 24% from D (Table 3). Chromosome 2B has the highest number of markers (4248), while 4D has the lowest (1180). Average physical distance between two adjacent SNP loci (SNP-density) along each chromosomal length can be found in Additional file 2: Figure S2.

Table 3 Genome-wide details of 55,954 markers along with polymorphism information contents and marker diversity

Five sub-groups (Fig. 4a) were expressed by 125 genotypes after the distribution along two principal components (PCs) and this finding was further validated through AHC analysis which confirmed five distinct clusters. Sub-group 3, 5 and 2 were distinctly spread in two dimensional space (Fig. 4a). Range of dissimilarity index was 0.05 to 0.25 which elucidated five clusters at cutoff value of 0.12 (Fig. 4b). Between-class variance decomposition for the optimal classification was 0.022 (15.51%). Range of distance between each class centroid was 0.178 to 0.406. Clusters 1 to 5 contained 43, 17, 14, 22 and 29 genotypes respectively. Each genotype with its corresponding score of five PCs and its designated cluster can be seen in Additional file 1: Table S1.


A panel of 125 lines classified through principal component analysis (a) and Agglomerative Hierarchical Clustering (b) using 55,954 markers

Linkage disequilibrium was calculated from 3,154,450 pairs using 100 marker sliding window operation out of which 16% of the pairs were with zero LD and 12% of the pairs were found in significant (p-value < 0.05) LD. Out of these significant pairs, inter-and intra-chromosomal LD was calculated for 3838 and 320,510 pairs respectively. Out of 320,510, there were 162,362 (51%) intra-chromosomal pairs which showed > 0.1 LD with an average of 0.293 within 16.70 M base pair (Mb) throughout the whole genome. There were 504 inter-chromosomal pairs with > 0.1 r2 LD with an average of 0.151. Intra-chromosomal LD ranged from 0.1 at chromosome 4D within 38.84 Mb pair distance to 0.24 r2 at chromosome 1D within 13.33 Mb pair distance (Fig. 5a). Average LD on A, B and D-genome was 0.177, 0.17 and 0.168 with in the average 14.62, 14.58 and 20.19 Mb pairs distance respectively. Inter-chromosomal LD was noted between 0.06 to 0.08 r2 (Fig. 5b). The extent of (threshold LD > 0.1 R2) LD-decay was observed at 25 Mb pair distance (Fig. 5c). A total of 162,866 pairs were identified having > 0.1 r2 LD (Fig. 5d).

Fig. 5

a Intra chromosomal LD with Mega base (Mb) pair distance annexed with chromosome number at x-axis and inter chromosomal LD (b) with number of pairs in significant (p-value < 0.05) LD; c LD-decay plot against distance base pair with LOWESS smoothing curve; d Frequency distribution of LD (R^2) shows number of base pairs in significant (p-value < 0.05)

Marker-trait associations (MTAs)

A total of 139 MTAs were identified out of which 69 were identified in NS experiments and 70 were found under LS conditions. Among them, 56, 55 and 28 were on genome A, B and D, respectively (Table 4). Marker S4A_610095520 was found to be associated with two (DH, GFD) traits in LS conditions; S7B_700803008 was linked with SPP in NS and GY in LS conditions and marker S6A_453869891 was identified for TKW in both NS and LS conditions (Table 4).

Table 4 Marker-trait associations found for normal and late sown conditions keeping significant threshold p-value at 3.00 (−log10)

Six genomic regions were identified in NS and 14 were detected in LS conditions that affected days to heading. For GFD, 12 marker in NS and 11 markers in LS conditions were detected. The trait PH was found to be associated with 12 genomic regions in NS and 12 in LS conditions; SPP with seven regions in NS and six in LS conditions. Three markers were found associated with GNS in NS and five in LS conditions. Fourteen MTAs were found in NS and seven in LS conditions that affected TKW. Sixteen MTAs in NS and 17 MTAs in LS conditions were detected that control GY (Table 4). Manhattan plots for the studied traits are available as Additional files 3: Figures S3 to S9. A map based on physical base pair distance of 139 markers (associated with studied traits) was constructed and an 11 Mb (752 to 763 Mb) genomic region on chromosome 2A was associated with TKW; a very narrow distance of just 0.3 Mb (from 748.5 Mb to 748.2 Mb) was found associated with PH and GY under late sown conditions. A region spanning 10 Mb (465.5 to 475.2 Mb) on chromosome 1B is associated with GY, SPP and from 592.9 to 608.2 Mb (16 Mb) is linked with GY and GFD under LS conditions. A 27 Mb region on 4A was associated with PH and TKW in normal sown conditions. On chromosome 5D, a 2 Mb (42 to 44 Mb) region was detected that was associated with DH. A narrowed distance of 0.48 Mb (611.56 to 611.08 Mb) on chromosome 6A was linked with PH and SPP. Another noticeable 4 Mb region which extends from 47 to 51 Mb on 6B is responsible for controlling TKW in NS conditions. A genomic region (321.6 Mb to 735 Mb) on chromosome 7A was found to be of prime importance due to association with multiple agronomic traits including DH, TKW and GY traits (Fig. 6). Effect of favorable alleles was estimated for all seven traits studied under LS conditions. The highest allelic effect (61%) was observed for days to heading under LS conditions (Fig. 7). Genotypes (serial number 34, 109, 115, 120; Additional file 1: Table S1) with nine favorable alleles exhibited the highest (> 0.06 kg/plot) grain yield. Favorable alleles for grain yield explained 45% variability. Increase in grain yield was observed as the number of favored allele increased additively (R2 = 0.45). Number of favored alleles in each genotype for seven traits under LS conditions is given in Additional file 1: Table S1.

Fig. 6

A Physical map showing significantly associated markers for all the studied traits. Colored shapes indicating markers significantly associated with traits under late sown conditions

Fig. 7

Effect of favorable alleles estimated for traits studied under late sown conditions

Seven MATs were selected on the basis of > 0.3 MAF and their sequences were searched using Basic Local Alignment Search Tool (BLAST) through the link ( /Multi/Tools/Blast?db = core). With these seven sequences, 144 hits were found. Fourteen gene hits with greater than 90% sequence similarity were selected with their respective annotations (Table 5). Our BLAST results can be accessed with the link:

Table 5 Gene hits of marker-trait association (MTAs) regions identified with basic local alignment search tool (BLAST) and their annotation with high confidence by IWGSC assembly BLASTX (NCBI BLAST)


Source to sink ratio is disturbed when heat stress affects the photosynthetic apparatus due to alteration in carbon assimilatory processes [28] which causes reduction in yield and yield related agronomic traits. Hence the present study was designed to evaluate the heat resistance potential of 125 spring wheat genotypes under three different locations. In this detailed phenotypic experiment, the traits were examined under two sowing conditions (NS and LS) that provided the copious opportunity to observe the difference exhibited by a trait under normal and heat-stressed (late sown) conditions. Traits studied in different locations were averaged to understand the stability across the environment as performed by Ogbonnaya et al. [20]. In this way their combined contribution under overall environmental variation in the detection of associated genomic regions has been highlighted under normal and heat-stressed conditions. Thus our study showed the MTAs for normal and late sown conditions separately, so that actively participating genomic regions controlling the traits under heat stress can be emphasized.

After reviewing the previous literature, only those traits were selected for evaluation and eventually genome-wide mapped which were reportedly affected by the heat stress [16, 17]. Significant reduction in all studied traits was observed due to heat stress. Except DH and PH, all traits were affected by genotype x environment interaction and their ranges show that the traits are quantitatively inherited, as has been in prior investigation [17] also showed such trends. Heritability estimates for the traits DH, SPP and GNS were lower in late sown than normal sown trials same as reported by Tiwari et al. [17] while GFD, PH, TKW and GY have higher heritability in late sown than normal sown conditions as shown by Paliwal et al. [16], similar heritability estimates for TKW have been reported by Pinto et al. [29].

Two-way analysis of variance showed acceptable ranges for CV% indicated that the design of the experiment is appropriate, similar trends in CV % have been shown by Ogbonnaya et al. [20]. Under heat-stressed conditions, GY is significantly affected by PH, GFD, SPP and GNS in a positive manner while PH significantly affected DH, SPP and TKW. Under heat stressed conditions DH is inversely related with SPP, GFD and GY as well while GFD if reduced that can cause ultimate reduction in the grain yield. Similar trends have also been reported by Paliwal et al. [16]. Early maturing wheat lines have maximum GFD that can cause an increase in GY.

The biplot was constructed to see the expression of genotypes in normal and heat-stressed conditions with respect to the studied traits. The biplot distinctly which genotypes performed well under normal conditions, under heat-stressed and under both. It has also been observed that the first principal component in LS conditions explains higher variability (30.14%) as than NS conditions (24.52%), and the same trend was followed by second principal component in NS and LS conditions. Variation among the genotypes has been well explained under heat-stressed conditions as compared to under normal conditions. Genotypes with good height were high yielding and those genotypes having high number of spikes per plant (SPP) have been found with high GNS under heat-stressed conditions and delayed heading caused shrinkage in GFD.

As proven by the LSD value for each trait, the effect of heat stress was significant, so the MTAs detected in both normal and heat-stressed conditions were underlined. The main focus of the study was to identify the genomic regions under normal and heat-stressed conditions so the heat susceptibility indices for the traits as calculated and mapped by some of the previous studies [16, 17] was not the objective of this study. Mapping for heat susceptibility indices of the traits cannot detect those genomic regions specifically expressed under heat stress, so the main interest was to identify the MTAs under heat stress and to pinpoint the involvement of different chromosomal regions due to normal and heat-stressed conditions.

In the present trait DH was associated with six markers in normal and 14 markers on 8 different chromosomes in heat-stressed conditions. Two markers on 1D and two markers on 5D along with three markers on 6B and three markers on 7A were identified as controlling the DH in LS conditions. Four markers, one on each of 2A, 4A, 5B and 6A were also identified meanwhile. Fifteen percent variation in DH has been explained under heat stressed conditions by the marker (S6A_71077435) on 6A; Ogbonnaya et al. [20] also reported a region on 6A with 7.48 R2 for DH. Using functional markers VRN-A1, VRN-B1 and VRN-D1 on chromosomes 5A, 5B and 5D Ogbonnaya et al. [20] explained the variation in DH. In the present study, four markers (one on 5B, three on 5D) explained 15% variation in DH.

Grain filling duration was controlled by the regions on some multiple chromosomes (Table 4) in NS and LS conditions. Above 12% variation has been explained by most of the regions in both sowing conditions. Paliwal et al. [16] reported the distance between two markers (Xgwm935-Xgwm1273) on chromosome 2B long arm linked with GFD, later on Tiwari et al. [17] identified other QTLs for GFD were identified and most recently Ogbonnaya et al. [20] detected a region on 1B explaining 4.5% of GFD. Present study identified some GFD operating regions on 2B, 5B, 5D and 7D which validates previous [16, 17] findings and also underlined some additional regions on chromosomes 3B and 4A. A region on chromosome 1B that explained 15% variation in GFD under heat-stressed conditions also corroborate with the findings of Ogbonnaya et al. [20].

Using DArTseq, plant height was mapped by Bellucci et al. [30] on chromosome 2A and 6A explaining 6 to 7% variation. Ogbonnaya et al. [20] reported a reported an MTA for PH on 5A with 4.85 R2. We report a 6 Mb region on 5A (426 to 732 Mb) associates with PH and explains 12 to 14% variation along with another locus on the same (5A) chromosome with the 23% R2. Under heat stress, there is another region of 1 Mb (32 Mb to 33 Mb) on chromosome 3B that is associated with PH. In European winter wheat, using SSR, MTA on 6A for PH was reported [31, 32]. In our study a genomic region on chromosome 6A at 212 Mb is linked with PH in NS conditions but a region on the same (6A) chromosome ~ 400 Mb away (at 611 Mb) has also been detected controlling plant height under heat stress explaining 16 and 10% variation respectively. Spikes per plant was associated with genetic regions on 5A, 5B, 6A, 7A, 7B in NS conditions but in LS conditions the trait variation has been explained up to 14% with regions on 1B, 4B and 7A. A two Mb spanned genetic region on 2D (72-74 Mb) explained 12% variation in SPP under heat stress. Number of spikes per plant is a trait of prime importance for yield [33], so the information given by these MTAs can be used to predict the number of spikes under heat stress. For grain number per spike, in addition to the previously reported [20] region on 7B, we report MTAs for GNS on 3A and 5D under NS and on 2A, 2D, 6A and 6D under LS conditions with explained R2 from 10 to 19%.

Thousand kernel weight is reportedly governed by the regions on 5A, 6A [18], 1A, 1B, 2D, 5A, 6A [20]. Paliwal et al. [16] detected MTAs for TKW on 2B, 7B and 7D. Tiwari et al. (2013) also showed the MTAs for TKW in normal and heat-stressed conditions on 2B, 6B and 2B, 6A, 7D respectively. In the present research a 4 Mb region on chromosome 6B uncovered by four GBS markers explained 12 to 14% variation in TKW, not only these results validating the previous findings as well as revealing the function of additional genomic region on 6B in NS conditions. But in heat-stressed conditions, a chromosomal location on 3B revealed by three GBS markers explained 10 to 13% variation in TKW. An MTA for TKW on 2B identified in this study is in agreement with the findings of Paliwal et al. [16] and Tiwari et al. [17]. Another set of three markers on 7A in the region from 718 to 735 Mb explained 10 to 11% variation in TKW.

Grain yield is a complex quantitative trait and has MTAs spread over different chromosomes; in both NS and LS conditions, two MTAs were detected on chromosome 1A that explained 13% variation. Chromosome 1B has been reported [20] responsible for 7.55% explained variation while in our study we identified three markers on 1B which explained 13 to 16% variation in GY under heat-stress conditions. In NS condition, two markers on 2B explained 12 to 13% variation; Tiwari et al. [17] and Paliwal et al. [16] also reported QTLs on 2B for GY. Moreover, some MTAs on 3B, 5B, 6A, 6D, 7A, 7B and 7D have also been detected in this study for GY.

As argued by Paliwal et al. [16] in the heat stress studies, a chromosomal region on 2B is of prime importance. In the present study, 11 MTAs for different traits have been detected on 2B. Using reference genome, GBS-markers associated with the traits of interest were ordered on the basis of physical mega base pair distance and have been reported in this study in the form of a physical map as guided by Poland et al. [22].


To assess the thermo tolerance of the wheat germplasm, there is a dire need to highlight those regions involved in the adaptation to high temperature. This can be done by finding the association of genomic regions with the help of next generation SNP genotyping (such as GBS) so that the detected sequences can be utilized for the development of high throughput user friendly markers which may be used in marker assisted or genomic selection. Our study contributed in the effort of finding the genomic regions expressed during high temperature. Seven chromosomal regions with minor allele frequencies greater than 0.3 have been identified and their sequence was searched with BLAST which showed 14 gene hits with > 90% similarity. The genotypes with high number of favorable alleles detected in this study can be utilized in wheat breeding programs.

Material and methods


Germplasm was comprised of 125 spring wheat genotypes (Additional file 1: Table S1) obtained from International Maize and Wheat Improvement Centre (CIMMYT). Field trials were conducted at three different locations in Pakistan; i) National Agriculture Research Center (NARC), Islamabad; ii) University of Sargodha, Sargodha and iii) Regional Agriculture Research Institute (RARI) Bahawalpur. Geographical coordinates for the experimental location at Islamabad (ISD) were 33o40’34″N 73o7’54″E, Sargodha (SGD) 32o7’50″N 72o40’50″E and Bahawalpur (BWP) 29o23’11″N 71o39’12″E. At each location, one complete set of germplasm was sown at the optimum time (during third week of November) considered as normal and the second complete set was sown late (during third week of December) to uncover the genotype response to high temperature during anthesis. The meteorological data has been presented in the Additional file 4: Figure S10.


Alpha lattice design with two replications was followed for all the experiments. Plots of two meter two rows with 30 cm row spacing were specified for each entry. Each trait under study was recorded separately for normal and late sown trials. Days to heading (DH) which is the period from sowing to the appearance of heads was noted when more than 50% of the plants of each genotype displayed heads at Zadoks stage 59 [34]. Grain filling duration (GFD) was recorded as the phase between heading and physiological maturity; the stage from sowing date to the time when green color of more than 50% of the spikes disappeared at Zadoks stage 89 [34], which is referred to as physiological maturity. After reaching physiological maturity, plant height (PH) in centimeters was measured with meter rule from the base of the plant up to the top of the spike excluding the awn length. Number of spikes per plants (SPP) was also estimated by taking the average of the number of spikes of five randomly sampled plants of each entry. Grain number per spike (GNS) was calculated, as it was the average of kernels in the main stem spikes of each of ten randomly selected plants from each entry. Grains of all spikes of ten randomly selected plants of each entry were bulked separately. Thousand kernel weight (TKW) was measured in grams by counting thousand grains randomly from each bulk using electrical weighing balance. Grain yield in kilograms for each entry was recorded after threshing whole 2 meter two rows plot.

GBS-SNP genotyping

All 125 lines were genotyped using genotyping-by-sequencing (GBS). DNA extraction kit from Prima Scientific (Bangkok, Thailand) was used to extract genomic DNA. GBS was performed following the Mascher et al. [35] protocol using two-enzyme (MspI-PstI) approach. GBS-SNPs were identified from sequence tags by aligning the sequence reads to the reference genome of Chinese Spring (CS) wheat. A pipeline of TASSEL 5 software was used for SNP calling against the whole wheat (CS) genome assembly (WGA 0.4, International Wheat Genome Sequencing Consortium). SNPs were called by setting minimum/maximum minor allele frequencies (MAF, 0.02 and 0.5) and minimum locus coverage was set at 0.2. Identified SNPs were named as “chromosome number_ physical position” i.e. 6D_13778811. The SNP markers with > 10% missing data and > 20% heterozygosity were not considered for further analysis. The genotypic panel of 55,954 SNP marker used in the study has average minor allele frequency 0.197; proportion of overall heterozygosity was 0.107 while the average missing proportion was 0.052.

Statistical analysis

By using the software Meta-R, best linear unbiased estimates (BLUEs) for three locations were calculated from restricted maximum likelihood (REML) analysis. The REML analysis was performed on results achieved in the alpha lattice experiments. Variance components were estimated with the software Statistix 8.1. Heritability (h2) was calculated for normal and late sown trials using the formula, h2 = σ2g / σ2p; where σ2g (σ2g = [(σ2genotypes- σ2error)/replicates]) is genotypic mean square and σ2p (σ2p = σ2g + σ2error) is phenotypic mean square. Biplot construction and correlation analysis for the traits in normal and late sown trials was performed using XLSTATS 2010 and the package “Corrplot” in R-software respectively [36].

To understand the genetic stratification of 125 genotypes, principal component analysis (PCA) was performed using their marker data and distance square matrix was obtained using built-in relatedness tab through TASSEL. This distance square matrix was used to explore the clusters in the panel with Ward’s method of agglomerative hierarchical clustering (AHC).

Linkage disequilibrium (LD) analysis was perform to obtain LD estimates expressed as r2 on sliding window of 100 markers by treating heterozygous calls as missing. To observe the LD decay, intra-chromosomal r2 was plotted against base pair distance using R-package ggplot2. Locally weighted scatter-plot smoother (LOWESS) function was used to draw a trend line for the detection of a threshold r2 below which relationship of any two pairs is to be considered not due to physical linkage. Marker pairs in significant (p-value < 0.05) LD were further observed with respect to inter and intra-chromosomal aspect and pairs with < 0.1 r2 were identified.

Polymorphism information content (PIC) and minor allele frequency (MAF) were calculated using the software Trait Analysis by Association, Evolution and Linkage (TASSEL 5.0) [37]. For the examined traits, calculated BLUEs were used for marker-trait association (MTAs). Genome-wide association study (GWAS) was performed using an optimally compressed mixed linear model (MLM) along with variance component estimation with P3D (population parameters previously determined) as implemented in TASSEL by Zhang et al. [38]. A kinship matrix (K) and five principal components (PC) as covariate were used [39] to run MLM (K + PCA) in TASSEL. The equation fitted in TASSEL 5.0 was: y = Xβ + Zμ + e, where y is the vector of observation for phenotypic values; β is an unknown vector with fixed effects including genetic marker and PC and μ is the unknown vector of random additive genetic effects for lines. To avoid the false positives, kinship matrix (K) was used in the model as an additive genetic effect for lines; X and Z are known matrices; and e is the vector which is unobserved of random residual. In this model μ and e are the vectors which are assumed to be normally distributed. False discovery rate (FDR) method was used to estimate the significance between marker and trait association [40] keeping a q-value cutoff of 0.05.



Agglomerative hierarchal clustering


Best linear unbiased estimations




Diversity array technology


Days to heading


Falls discovery rate


Genotyping-by- sequencing


Grain filled duration


Grain number per spike


Genome-wide association study


Grain yield




Linkage disequilibrium


Late sown


Minor/minimum allele frequency


Mega base


Marker-trait association


National agriculture research center


Normal sown


Principal component analysis


Plant height


Polymorphism information contents


Quantitative trait locus


Regional agriculture research institute


Restricted maximum likelihood




Single nucleotide polymorphism


Spikes per plant


Thousand kernel weight


Whole genome assembly


  1. 1.

    Borlaug NE, Dowswell CR. Feeding a world of ten billion people: a 21st century challenge. In R Tuberosa, RL Phillips, M Gale, Eds, Proceedings of the international congress in theWake of the double Helix: from the green revolution to the gene revolution, 27–31 may 2003, Bologna, Italy. Avenue Media, Bologna; 2005. p 3–23.

    Google Scholar 

  2. 2.

    Stratonovitch P, Semenov MA. Heat tolerance around flowering in wheat identified as a key trait for increased yield potential in Europe under climate change. J Exp Bot. 2015;66(12):3599–609

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  3. 3.

    IPCC. Summary for Policymakers of Climate Change 2007: The physical science basis contribution of Woring group I to the fourth assessment report of the inter-governmental panel on climate change. Cambridge: Cambridge University Press; 2007.

  4. 4.

    Ray DK, Mueller ND, West PC, Foley JA. Yield trends are insufficient to double global crop production by 2050. PLoS One. 2013;8:e66428.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  5. 5.

    Joshi AK, Mishra B, Chatrath R, Ferrara GO, Singh RP. Wheat improvement in India: present status, emerging challenges and future prospects. Euphytica. 2007;157:431–46.

    Article  Google Scholar 

  6. 6.

    Reynolds MP, Nagarajan S, Razzaque MA, Ageeb OAA. Breeding for adaptation to environmental factors, heat tolerance. In: Reynolds MP, Ortiz- Monasterio I, McNab A, editors. Application of physiology in wheat breeding. Mexico: CIMMYT; 2001. p. 124–5.

    Google Scholar 

  7. 7.

    Asseng S, Ewert F, Martre P, Rötter RP, Lobell DB. Rising temperatures reduce global wheat production. Nat Clim Chang. 2015;5:143–7.

    Article  Google Scholar 

  8. 8.

    Wardlaw IF, Dawson IA, Munibi P. The tolerance of wheat to high tem- perature during grain reproductive growth. I. Survey procedures and general response patterns. Aust J Agric Res. 1989;40:1–13.

    Article  Google Scholar 

  9. 9.

    Prasad PVV, Pisipati SR, Ristic Z, Bukovnik U, Fritz AK. Impact of nighttime temperature on physiology and growth of spring wheat. Crop Sci. 2008;48(6):2372–80.

    Article  Google Scholar 

  10. 10.

    Hays DB, Do JH, Mason RE, Morgan G, Finlayson SA. Heat stress induced ethylene production in developing wheat grains induces kernel abortion and increased maturation in a susceptible cultivar. Plant Sci. 2007;172:1113–23.

    CAS  Article  Google Scholar 

  11. 11.

    Rane J, Pannu RK, Sohu VS, Saini RS, Mishra B, Shoran J, et al. Performance of yield and stability of advanced wheat genotypes under heat stressed environments of indo-Gangetic Plains. Crop Sci. 2007;47:1561–73.

    Article  Google Scholar 

  12. 12.

    Stone PJ, Nicolas ME. Wheat cultivars vary widely in their responses of grain yield and quality to short periods of post-anthesis heat stress. Aust J Plant Physiol. 1994;21:887–900.

    Google Scholar 

  13. 13.

    Mondal S, Rutkoski JE, Velu G, Singh PK, Crespo-Herrera LA, et al. Harnessing diversity in wheat to enhance grain yield, climate resilience, disease and insect pest resistance and nutrition through conventional and modern breeding approaches. Front Plant Sci. 2016;7(991)

  14. 14.

    Yang J, Sears RG, Gill BS, Paulsen GM. Quantitative and molecular characterization of heat tolerance in hexaploid wheat. Euphytica. 2002;126:275–82.

    CAS  Article  Google Scholar 

  15. 15.

    Mohammadi V, Zali AA, Bihamta MR. Mapping QTLs for heat tolerance in wheat. J Agr Sci Tech. 2008;10:261–7.

    Google Scholar 

  16. 16.

    Paliwal R, Röder MS, Kumar U, Srivastava JP, Joshi AK. QTL mapping of terminal heat tolerance in hexaploid wheat (T. aestivum L.). Theor Appl Genet. 2012;125(3):561–75

    Article  PubMed  Google Scholar 

  17. 17.

    Tiwari C, Wallwork H, Kumar U, Dhari R, Arun B, Mishra VK, et al. Molecular mapping of high temperature tolerance in bread wheat adapted to the eastern Gangetic plain region of India. Field Crops Res. 2013;154:201–10

    Article  Google Scholar 

  18. 18.

    Sukumaran S, Dreisigacker S, Lopes M, Chavez P, Reynolds MP. Genome-wide association study for grain yield and related traits in an elite spring wheat population grown in temperate irrigated environments. Theor Appl Genet. 2015;128(2):353–63

    CAS  Article  PubMed  Google Scholar 

  19. 19.

    Rasheed A, Hao Y, Xia X, Khan A, Xu Y, Varshney RK, et al. Crop breeding chips and genotyping platforms: Progress, challenges, and perspectives. Mol Plant. 2017;10(8):1047–64

    CAS  Article  PubMed  Google Scholar 

  20. 20.

    Ogbonnaya FC, Rasheed A, Okechukwu EC, Jighly A, Makdis F, Wuletaw T, et al. Genome-wide association study for agronomic and physiological traits in spring wheat evaluated in a range of heat prone environments. Theor Appl Genet. 2017;130(9):1819–35

    Article  PubMed  Google Scholar 

  21. 21.

    Elshire RJ, Glaubitz JC, Sun Q, Poland JA, Kawamoto K, Buckler ES, et al. A robust, simple genotyping-by-sequencing (GBS) approach for high diversity species. PLoS One. 2011;6(5):e19379

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  22. 22.

    Poland JA, Rife TW. Genotyping-by-sequencing for plant breeding and genetics. The Plant Genome Journal. 2012;5(3):92

    CAS  Article  Google Scholar 

  23. 23.

    Sukumaran S, Lopes M, Dreisigacker S, Reynolds M. Genome wide association mapping for grain weight in spring wheat across multiple environments. Conference Paper, (March), 2017; 20–27.

  24. 24.

    Mwadzingeni L, Shimelis H, Rees DJG, Tsilo TJ. Genome-wide association analysis of agronomic traits in wheat under drought-stressed and non-stressed conditions. PLoS One. 2017;12(2):1–13

    Article  Google Scholar 

  25. 25.

    Guo Z, Chen D, Alqudah AM, Röder MS, Ganal MW, Schnurbusch T. Genome-wide association analyses of 54 traits identified multiple loci for the determination of floret fertility in wheat. New Phytol. 2017;214(1):257–70

    CAS  Article  PubMed  Google Scholar 

  26. 26.

    Arora S, Singh N, Kaur S, Bains NS, Uauy C, Poland J, et al. Genome-wide association study of grain architecture in wild wheat Aegilops tauschii. Front Plant Sci. 2017;8(886)

  27. 27.

    Li J, Rasheed A, Guo Q, Dong Y, Liu J, Xia X, et al. Genome-wide association mapping of starch granule size distribution in common wheat. J Cereal Sci. 2017;77:211–8

    CAS  Article  Google Scholar 

  28. 28.

    Kumar S, Kumari P, Kumar U, Grover M, Singh AK, Singh R, Sengar RS. Molecular approaches for designing heat tolerant wheat. J Plant Biochem Biot. 2013;22(4):359–71.

    Article  Google Scholar 

  29. 29.

    Pinto RS, Reynolds MP, Mathews KL, McIntyre CL, Olivares-Villegas J, Chapman SC. Heat and drought adaptive QTL in a wheat population designed to minimize confounding agronomic effects. Theor Appl Genet. 2010;121:1001–21.

    Article  PubMed  PubMed Central  Google Scholar 

  30. 30.

    Bellucci A, Torp AM, Bruun S, Magid J, Andersen SB, Rasmussen SK. Association mapping in Scandinavian winter wheat for yield, plant height, and traits important for second-generation bioethanol production. Front Plant Sci. 2015;6(November):1–12

    Google Scholar 

  31. 31.

    Zanke CD, Ling J, Plieske J, Kollers S, Ebmeyer E, Korzun V, et al. Analysis of main effect QTL for thousand grain weight in European winter wheat (Triticum aestivum L.) by genome-wide association mapping. Front Plant Sci. 2015;6(September):1–14.

    Google Scholar 

  32. 32.

    Wurschum T, Langer SM, Longin CFH. Genetic control of plant height in European winter wheat cultivars. Theor Appl Genet. 2015;128(5):865–74.

    CAS  Article  PubMed  Google Scholar 

  33. 33.

    Chen GF, Wu RG, Li DM, Yu HX, Deng Z, Tian JC. Genome wide association study for seeding emergence and tiller number using SNP markers in an elite winter wheat population. J Genet. 2017;96(1):177–86

    Article  PubMed  Google Scholar 

  34. 34.

    Zadoks JC, Chang TT, Konzak CF. A decimal code for growth stages of cereals. Weed Res. 1974;14:415–21.

    Article  Google Scholar 

  35. 35.

    Mascher M, Wu S, Amand PS, Stein N, Poland J. Application of genotyping-by-sequencing on semiconductor sequencing platforms: a comparison of genetic and reference-based marker ordering in barley. PLoS One. 2013;8(10):e76925.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  36. 36.

    Wei T, Simko V, Levy M, Xie Y, Jin Y, Zemla J. (2017). Package ‘corrplot’. Statistician. 2017;56:316–24.

    Google Scholar 

  37. 37.

    Bradbury PJ, Zhang Z, Kroon DE, Casstevens TM, Ramdoss Y, Buckler ES. TASSEL: software for association mapping of complex traits in diverse samples. Bioinformatics. 2007;23:2633–5.

    CAS  Article  Google Scholar 

  38. 38.

    Zhang Z, Ersoz E, Lai CQ, Todhunter RJ, Tiwari HK, Gore MA, et al. Mixed linear model approach adapted for genome-wide association studies. Nat Genet. 2010;42(4):355–60

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  39. 39.

    Yu J, Pressoir G, Briggs WH, Vroh I, Bi M, Yamasaki JF, et al. A unified mixed-model method for association mapping that accounts for multiple levels of relatedness. Nat Genet. 2006;38:203–8.

    CAS  Article  PubMed  Google Scholar 

  40. 40.

    Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc B. 1995;57:289–300.

    Google Scholar 

Download references


We thank P. St Amand, A. Bernardo and G. Bai (United States Department of Agriculture-Agricultural Research Service, Hard Winter Wheat Genetics Research Unit, Manhattan, KS 66506, USA) for supervising the genotyping-by-sequencing and Manzoor Hussain (Regional Agricultural Research Institute (RARI) Bahawalpur, Pakistan) for conducting the field trials at Bahawalpur.


No funding was released for this research work.

Availability of data and materials

The phenotypic data of the current study is available in the Additional file 1: Table S1. Any other datasets used and/or analyzed are available upon request through email to the corresponding author.

Author information




MJ performed the experiment and wrote the paper. AA performed the multivariate data analysis. AG2 performed genotyping-by-sequencing. AG3 performed the field planting and sample collection of wheat lines. AAN collected the field data. AMHI formulated the univariate statistical analysis. NHN and NAY contributed in revisions. AMK planned the experiment. All the authors have read and approved the final manuscript.

Corresponding author

Correspondence to Muhammad Jamil.

Ethics declarations

Ethics approval and consent to participate

We declare that these experiments within the ethical standards and legislations in Pakistan.

Consent for publication

National and international guidelines were followed during the collection of plant material.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Additional files

Additional file 1:

Table S1. Pedigree of 125 studied hexaploid spring wheat lines with PCs, AHC clustering, phenotypic values of studied traits under normal, late sown conditions at three locations, average of three locations under normal, late sown conditions and number of favored alleles in each genotype under late sown conditions. (XLSX 83 kb)

Additional file 2:

Figure S2. Single nucleotide polymorphism (SNP) distributions on 21 chromosomes in 125 wheat lines, in the vertical axis are the 21 chromosomes. The horizontal axis shows chromosome length (Mb); 0 ~ 20 depicts SNP density (the number of SNPs per window). (JPG 4135 kb)

Additional files 3:

Figure S3-S9. Manhattan plot with QQ plot of days to heading, grain filled duration, plant height, spikes per plant, grain numbers per spike, thousand kernel weight and grain yield under normal (DHN, GFDN, PHN, SPPN, GNSN, TKWN, GYN) and late (DHL, GFDL, PHL, SPPL, GNSL, TKWL, GYL) conditions in 125 wheat lines. (ZIP 38475 kb)

Additional file 4:

Figure S10. Temperature details of the cropping season at three experimental locations. (PNG 106 kb)

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated.

Reprints and Permissions

About this article

Verify currency and authenticity via CrossMark

Cite this article

Jamil, M., Ali, A., Gul, A. et al. Genome-wide association studies of seven agronomic traits under two sowing conditions in bread wheat. BMC Plant Biol 19, 149 (2019).

Download citation


  • Wheat
  • Heat tolerance
  • Genotyping-by-sequencing
  • SNPs
  • GWAS