Discovery, distribution and diversity of Puroindoline-D1 genes in bread wheat from five countries (Triticum aestivum L.)

Background Grain texture is one of the most important characteristics in bread wheat (Triticum aestivum L.). Puroindoline-D1 genes play the main role in controlling grain texture and are intimately associated with the milling and processing qualities in bread wheat. Results A series of diagnostic molecular markers and dCAPS markers were used to characterize Pina-D1 and Pinb-D1 in 493 wheat cultivars from diverse geographic locations. A primer walking strategy was used to characterize PINA-null alleles at the DNA level. Results indicated that Chinese landraces encompassing 12 different Puroindoline-D1 allelic combinations showed the highest diversity, while CIMMYT wheat cultivars containing 3 different Puroindoline-D1 allelic combinations showed the lowest diversity amongst wheat cultivars from the five countries surveyed. Two novel Pina-D1 alleles, designated Pina-D1s with a 4,422-bp deletion and Pina-D1u with a 6,460-bp deletion in the Ha (Hardness) locus, were characterized at the DNA level by a primer walking strategy, and corresponding molecular markers Pina-N3 and Pina-N4 were developed for straightforward identification of the Pina-D1s and Pina-D1u alleles. Analysis of the association of Puroindoline-D1 alleles with grain texture indicated that wheat cultivars with Pina-null/Pinb-null allele, possessing an approximate 33-kb deletion in the Ha locus, have the highest SKCS hardness index amongst the different genotypes used in this study. Moreover, wheat cultivars with the PINA-null allele have significantly higher SKCS hardness index than those of Pinb-D1b and Pinb-D1p alleles. Conclusions Molecular characterization of the Puroindoline-D1 allele was investigated in bread wheat cultivars from five geographic regions, resulting in the discovery of two new alleles - Pina-D1s and Pina-D1u. Molecular markers were developed for both alleles. 
Analysis of the association of the Puroindoline-D1 alleles with grain texture showed that cultivars with PINA-null allele possessed relatively high SKCS hardness index. This study can provide useful information for the improvement of wheat quality, as well as give a deeper understanding of the molecular and genetic processes controlling grain texture in bread wheat.


Background
Grain texture is one of the most important characteristics determining the end-use properties of bread wheat (Triticum aestivum L.). It is well known that grain texture is mainly controlled by the Ha (Hardness) locus on the short arm of chromosome 5D, even though Ha loci have been identified on the homoeologous group 5 chromosomes in bread wheat [1]. Compared with the Ha loci on chromosomes 5AS and 5BS, the Ha locus on chromosome 5DS possesses three special genes: Puroindoline a (Pina-D1), Puroindoline b (Pinb-D1) and Grain Softness Protein (Gsp-1). Puroindoline genes have been proven to play a key role in modulating grain texture in bread wheat [2][3][4]. However, the mechanism by which Puroindoline genes soften the endosperm remains unknown. Moreover, the Gsp-1 gene does not perform a significant function in determining grain texture [5,6].
The Pina-D1 and Pinb-D1 genes were shown to encode wheat endosperm-specific lipid-binding proteins with a unique tryptophan-rich domain, which is considered responsible for the strong affinity of the Puroindoline-D1 proteins for polar lipids. The Puroindoline-D1 genes have been identified in almost all wheats, in their diploid ancestors and in related species, except the tetraploid Triticum species [7]. When both Pina-D1 and Pinb-D1 are wild type, the endosperm is soft, whereas a mutation in either Pina-D1 or Pinb-D1 results in endosperm hardening in bread wheat [8][9][10]. Since the first mutation in the Puroindoline-D1 genes was reported in bread wheat [2], many natural mutations in the Pina-D1 and Pinb-D1 genes have been found (i.e. the Pina-D1b~t and Pinb-D1b~ac alleles; reviewed in [11][12][13]). All these mutations produce hard endosperm in bread wheat, and variations in Puroindoline-D1 alleles have also been associated with differences in wheat quality [14][15][16]. In most geographic regions, the Pinb-D1b allele is predominant in bread wheat cultivars, but there are some exceptions, e.g. the PINA-null allele is the most common allele in CIMMYT (International Maize and Wheat Improvement Center) bread wheat cultivars [17] and Pinb-D1p is prevalent in Chinese landrace cultivars [18,19]. Moreover, cultivars with the PINA-null allele tend to have harder endosperm than those with Pinb-D1b [14,18,20,21,22], and the former may be less preferable from a milling standpoint. Recently, a novel group of Pinb-2 variants described as Pinb-like genes [23][24][25] was shown not to contribute significantly to grain softening when compared with the Puroindoline-D1 genes [26,27], and these variants were also not closely associated with the quality characteristics surveyed by Mohler et al. [28].
Hexaploid wheat has a 29 kb smaller Ha locus in the D genome than its diploid donor Ae. tauschii, mainly due to transposable element insertions and two large deletions caused by illegitimate recombination (Chantret et al. [1]). It is possible that the large deletion in the Pina-D1b allele occurred after the DD genome of Ae. tauschii evolved into the AABBDD genome of bread wheat. Compared with the wild type, the Pina-D1b allele possesses a 15,380-bp deletion containing most of the Pina-D1 coding region [29]. The Pina-D1r allele has a 10,415-bp deletion containing the entire Pina coding region and was identified in both Chinese and Japanese landraces [30,31]. Two corresponding molecular markers (Pina-N1 and Pina-N2), which span the deletions, were developed for straightforward identification of the Pina-D1b and Pina-D1r alleles.
As the largest producer and consumer of bread wheat in the world, China is also a secondary origin center of bread wheat and holds a highly diverse stock of wheat germplasm. Of the ten Chinese agro-ecological zones, the Yellow and Huai wheat production region, which covers eight provinces including all of Henan, is the largest and most important, accounting for 45% of the country's total harvested area and 48% of total wheat production [32]. Meanwhile, the Yellow and Huai Valley is one of the secondary origin centers of bread wheat in China, and a large number of landraces have been collected or developed in this region, some of which have played an important role in improving wheat productivity. Moreover, in order to improve genetic diversity and avoid germplasm homogenization, a wide variety of germplasm from abroad (e.g. CIMMYT, Australia, USA, and Europe) has been introduced or exchanged into this region for wheat breeding programs. The main purposes of this study were to investigate the distribution of Puroindoline-D1 alleles in landraces and introduced cultivars of the Yellow and Huai Valley of China, to further characterize the molecular basis of PINA-null alleles, and to develop robust PCR-based molecular markers for wheat breeding.

Plant materials
In this study, a total of 493 bread wheat cultivars and advanced lines, including 204 Chinese landrace cultivars, 104 CIMMYT cultivars, 88 Australian cultivars, 53 Chilean cultivars and 44 cultivars from the Netherlands, were used to determine SKCS (Single Kernel Characterization System) hardness and to genotype the Puroindoline-D1 genes. These cultivars and advanced lines, exchanged or introduced from different countries or regions by the Seed Bank of Henan Agricultural University, possess one or more superior agronomic traits and are widely used as parents in wheat breeding programs in the Yellow and Huai Valley of China. All accessions surveyed were planted at the Zhengzhou Scientific Research and Education Center of Henan Agricultural University during the 2009-10 and 2010-11 cropping seasons and grew well under local management practices, which included the use of a supporting net for Chinese landraces. Accessions were harvested at different times according to their maturity to ensure that each cultivar was fully mature.
Seven near-isogenic lines (NILs) with different Puroindoline-D1 alleles (Pina-D1b, Pinb-D1b, Pinb-D1c, Pinb-D1d, Pinb-D1e, Pinb-D1f and Pinb-D1g), kindly provided by Prof. Xia Xianchun from Chinese Academy of Agricultural Sciences, were planted in Zhengzhou and Zhoukou in 2011-2012 cropping seasons, and were used to further examine the influence of Puroindoline-D1 alleles on grain texture. These lines were developed at the USDA-ARS Western Wheat Quality Laboratory, Pullman, Washington. The NILs were developed by crossing donor parents possessing unique Puroindoline a and Puroindoline b gene haplotypes as male to the soft white spring wheat cultivar Alpowa. Seven backcrosses were conducted such that the general pedigree of each NIL is: Alpowa/donor parent//7*Alpowa [33].
The kernel hardness index of all wheat cultivars and advanced lines was measured with the Perten Single Kernel Characterization System (SKCS) 4100, following the manufacturer's operating procedure (Perten Instruments North America Inc., Springfield, IL). The mean, standard deviation (SD), and distribution of the SKCS hardness data were used to classify the cultivars into 'soft', 'mixed', and 'hard' types.

DNA extraction and PCR parameters
Genomic DNA of each hard wheat cultivar surveyed was extracted separately from three pulverized kernels following the method of Chen et al. [24]. Genomic DNA from seedlings was used for either marker development or the primer walking strategy [18]. PCR amplifications were performed in a PTC-200 Peltier Thermocycler or an ABI 9700 in 25 μl reactions containing 100 ng of genomic DNA, 10 pmol of each primer, 200 μM of each dNTP, 1× Taq DNA polymerase reaction buffer with 1.5 mM MgCl2, and 0.5 units of Taq DNA polymerase. The cycling conditions were 94°C for 5 min, followed by 35 cycles of 94°C for 50 s, 50°C to 65°C for 50 s (primer-specific annealing temperatures, see Table 1) and 72°C for 1 min, followed by a final extension at 72°C for 10 min. All PCR products were separated by electrophoresis on a 1.5% agarose gel stained with ethidium bromide and visualized under UV light.
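The cycling conditions described above can be written down as a small data structure, which makes it easy to check the step order and the total programmed hold time. The following is only an illustrative sketch: the function names are hypothetical, the 58°C annealing temperature is an arbitrary example within the stated 50°C to 65°C range, and ramp times between steps are ignored.

```python
from typing import List, Tuple

Step = Tuple[str, float, int]  # (label, temperature in °C, hold time in seconds)


def cycling_steps(annealing_c: float, cycles: int = 35) -> List[Step]:
    """Build the thermocycler program described in the text (hypothetical helper)."""
    if not 50 <= annealing_c <= 65:
        raise ValueError("annealing temperature outside the 50-65 °C range given in the text")
    steps: List[Step] = [("initial denaturation", 94.0, 5 * 60)]
    for _ in range(cycles):
        steps.append(("denaturation", 94.0, 50))
        steps.append(("annealing", annealing_c, 50))  # primer-specific, see Table 1
        steps.append(("extension", 72.0, 60))
    steps.append(("final extension", 72.0, 10 * 60))
    return steps


def total_hold_seconds(steps: List[Step]) -> int:
    """Sum of programmed hold times; ramping between temperatures is not included."""
    return sum(seconds for _, _, seconds in steps)


# 58 °C is an arbitrary example value, not a temperature taken from Table 1.
program = cycling_steps(annealing_c=58.0)
print(len(program), "steps,", total_hold_seconds(program) / 60, "min of hold time")
```

With 35 cycles the programmed hold time is 300 s + 35 × 160 s + 600 s = 6,500 s, i.e. a little under two hours before ramping is taken into account.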

Genotyping of Puroindoline-D1 alleles in bread wheat
Five soft wheat cultivars were randomly selected for directly sequencing their Pina and Pinb genes with primer sets Pina-D and Pinb-D (Table 1) because they all should be wild-type Puroindoline-D1 genes, i.e. Pina-D1a/Pinb-D1a. We removed all mixed wheat cultivars from this study as they possibly contained more than one genotype for each cultivar [17,34].
For the hard wheat cultivars based on SKCS classification, we first divided them into three groups by amplification with primer sets Pina-D (containing the whole Pina-D1 coding region) and Pinb-D (containing the whole Pinb-D1 coding region) (Table 1), i.e. Group I with both expected fragments of Pina-D1 and Pinb-D1, Group II with only the expected Pinb-D1 fragment, and Group III without any expected fragment of the Pina-D1 and Pinb-D1 genes. In Group I, the Pinb-D1b allele was initially identified by a reciprocal pair of primer sets, Pinb-D1b1 and Pinb-D1b2 [3]. PCR products amplified with the Pinb-D primer set were digested with restriction enzymes PvuII and PflMI for identification of the Pinb-D1c and Pinb-D1p alleles, respectively, following the methods of Lillemo and Morris [9] and Li et al. [19]. For the remaining cultivars in this group, the PCR products amplified with the Pina-D and Pinb-D primer sets were directly sequenced from both strands by SinoGenoMax Co., Ltd (http://www.sinogenomax.com/), and genotypes were confirmed by alignment with known Puroindoline-D1 alleles or by searching the NCBI BLAST website (http://blast.ncbi.nlm.nih.gov/).
In Groups II and III, the Pina-N1 and Pina-N2 markers we previously developed [29,30] were first used to identify the Pina-D1b and Pina-D1r alleles, respectively, and the remaining cultivars were subjected to the primer walking strategy described below.
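The grouping rule used for hard cultivars can be summarized as a simple decision on whether the Pina-D and Pinb-D primer sets each yield the expected fragment. The sketch below is a minimal illustration of that rule; the function and dictionary names are hypothetical, and the case of a Pina-D1 fragment without a Pinb-D1 fragment is not described in the text, so it is returned as unclassified.

```python
def assign_group(pina_amplified: bool, pinb_amplified: bool) -> str:
    """Assign a hard cultivar to Group I, II or III from Pina-D / Pinb-D amplification."""
    if pina_amplified and pinb_amplified:
        return "Group I"
    if pinb_amplified:
        return "Group II"  # only the expected Pinb-D1 fragment
    if not pina_amplified:
        return "Group III"  # neither expected fragment
    return "unclassified"  # Pina-D1 fragment only: not covered by the described scheme


# Follow-up work per group, paraphrased from the genotyping workflow in the text.
NEXT_STEPS = {
    "Group I": "Pinb-D1b allele-specific primers, then restriction digestion of the "
               "Pinb-D product, then sequencing of the remaining cultivars",
    "Group II": "Pina-N1 and Pina-N2 markers, then primer walking",
    "Group III": "Pina-N1 and Pina-N2 markers, then primer walking",
}

group = assign_group(pina_amplified=False, pinb_amplified=True)
print(group, "->", NEXT_STEPS[group])
```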

Development of dCAPS marker
Although the Pina-D1l and Pina-D1n alleles were discovered in previous studies [8,18,30], no valid detection marker existed. This prompted us to establish a dCAPS (derived Cleaved Amplified Polymorphic Sequences) technique as a detection method. Two sets of primers for amplifying fragments containing the SNPs in cultivars with the Pina-D1l or Pina-D1n allele were designed, along with the appropriate restriction enzymes, using the dCAPS Finder 2.0 software (http://helix.wustl.edu/dcaps/dcaps.html) [35]. The restriction enzymes BalI and BsrDI were used to directly digest the PCR products amplified with primer sets BalI_Pina-D1l and BsrDI_Pina-D1n (Table 1), respectively, for the detection of the Pina-D1l and Pina-D1n alleles. Cultivars with 176-bp and 124-bp digested fragments carry the Pina-D1n and Pina-D1l alleles, respectively.
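Scoring the two dCAPS markers amounts to checking whether a digested band of the diagnostic size is present. The sketch below illustrates this with the fragment sizes given in the text (176 bp for Pina-D1n, 124 bp for Pina-D1l); the function name and the 5-bp sizing tolerance are assumptions made for illustration, not part of the described method.

```python
from typing import Iterable, Optional

# Marker name -> (allele detected, diagnostic digested fragment size in bp)
DCAPS_MARKERS = {
    "BalI_Pina-D1l": ("Pina-D1l", 124),
    "BsrDI_Pina-D1n": ("Pina-D1n", 176),
}


def call_dcaps(marker: str, band_sizes_bp: Iterable[float],
               tolerance_bp: float = 5.0) -> Optional[str]:
    """Return the allele name if a band of the diagnostic size is observed, else None.

    tolerance_bp is a hypothetical allowance for gel sizing error.
    """
    allele, diagnostic_bp = DCAPS_MARKERS[marker]
    if any(abs(size - diagnostic_bp) <= tolerance_bp for size in band_sizes_bp):
        return allele
    return None


print(call_dcaps("BsrDI_Pina-D1n", [176]))  # diagnostic band present
print(call_dcaps("BalI_Pina-D1l", [200]))   # no diagnostic band (200 bp is a made-up size)
```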

Primer walking strategy
Sequences of the Ha-5D locus (CT009735) from NCBI were used to design genome-specific primers around the Pina and Pinb coding regions by comparison against the Ha-5A (CT009586) and Ha-5B (CT009585) loci. A total of 38 primer sets spanning an approximately 40-kb region (Table 1) were designed between −10,386 bp (relative to the ATG of the Pina gene) and +11,447 bp (relative to the ATG of the Pinb gene) for the primer walking strategy, in order to characterize the molecular basis of the absence of the Pina gene, or of both the Pina and Pinb genes, in the cultivars concerned (Figure 1). Based on the failure or success of PCR amplification, the size and position of each deletion were deduced, and new primers spanning the estimated deletion were designed for straightforward amplification of the samples under test (see schematic diagram in Figure 2). Successfully amplified PCR products were sequenced to obtain the exact size and position of the deletions.
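The deduction step of primer walking, in which a deletion is bracketed by the nearest successfully amplified fragments on either side of a run of failed amplifications, can be sketched as follows. This is a simplified illustration under stated assumptions: a single contiguous deletion, coordinates on one common reference axis, and hypothetical amplicon names and positions rather than the actual Table 1 primers.

```python
from typing import List, NamedTuple, Optional


class Amplicon(NamedTuple):
    name: str
    start: int        # position of the amplicon start on the reference (bp)
    end: int          # position of the amplicon end on the reference (bp)
    amplified: bool   # True if the expected fragment was obtained


class DeletionBracket(NamedTuple):
    left_flank: str   # last amplicon that still amplifies upstream of the deletion
    right_flank: str  # first amplicon that amplifies again downstream
    start_bound: int  # deletion begins at or after this position
    end_bound: int    # deletion ends at or before this position

    @property
    def max_size(self) -> int:
        return self.end_bound - self.start_bound


def bracket_deletion(amplicons: List[Amplicon]) -> Optional[DeletionBracket]:
    """Bracket a single deletion from the success/failure pattern of tiled amplicons."""
    ordered = sorted(amplicons, key=lambda a: a.start)
    failed = [i for i, a in enumerate(ordered) if not a.amplified]
    if not failed:
        return None  # everything amplified: no deletion detected
    first, last = failed[0], failed[-1]
    if failed != list(range(first, last + 1)):
        raise ValueError("failed amplicons are not contiguous")
    if first == 0 or last == len(ordered) - 1:
        raise ValueError("deletion extends beyond the walked region")
    left, right = ordered[first - 1], ordered[last + 1]
    return DeletionBracket(left.name, right.name, left.end, right.start)


# Hypothetical tiling; names and coordinates are made up for illustration.
walk = [
    Amplicon("P1", 0, 1000, True),
    Amplicon("P2", 900, 2000, False),
    Amplicon("P3", 1900, 3000, False),
    Amplicon("P4", 2900, 4000, True),
]
bracket = bracket_deletion(walk)
print(bracket, bracket.max_size)
```

A forward primer from the left flanking amplicon and a reverse primer from the right flanking amplicon would then be candidates for a deletion-spanning marker, whose product can be sequenced to obtain the exact breakpoints.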

Results
Distribution of Puroindoline-D1 alleles in bread wheat cultivars of five countries
Based on the SKCS hardness index and its distribution, 139 and 354 of the 493 wheat cultivars and advanced lines surveyed were classified as soft and hard genotypes, respectively. The 139 soft wheat cultivars were assumed to possess the wild-type Puroindoline-D1 haplotype (Pina-D1a/Pinb-D1a) [4,11,12,13]. This suggests that hard wheat is predominant in the surveyed Chinese landraces and in the cultivars from Mexico, Australia, Chile and the Netherlands, even though soft wheat also accounts for a considerable share (39%) of the Chinese landraces surveyed.
All 354 hard wheat cultivars in this study were genotyped for Puroindoline-D1 alleles. The results from PCR amplification with allele-specific primers, digestion with restriction enzymes PvuII and PflMI, and sequencing indicated that 105, 4 and 47 of the 354 hard wheat cultivars possessed the Pinb-D1b, Pinb-D1c and Pinb-D1p alleles, respectively (Figure 2A, 2B). Based on detection with the molecular markers we developed previously [29,30], 129 and 21 cultivars carried the Pina-D1b and Pina-D1r alleles, respectively. For the remaining 48 cultivars, we tried to ... (Table 3) according to the above-mentioned nomenclature. For four cultivars without Pina and Pinb genes, including one Netherlands wheat cultivar, Pcatan, and three Chinese landraces, Yumai, Changyaomai and Yangqingke, expected fragment sizes could only be obtained with primer sets Pina-1 to Pina-3 and Pinb-9 to Pinb-13. Therefore, an approximately 33-kb deletion containing the Pina and Pinb coding regions could be deduced to occur in these four cultivars when compared with the Ha sequence on chromosome 5DS of Chinese Spring (Table 3, Figure 1). However, although several primer sets spanning this large deletion were designed, a valid marker for locating this new allele was not obtained because of the high similarity with the Ha loci of the A and B genomes in this region. This mutation, a single ≈ 33-kb deletion containing the Pina and Pinb coding regions, was temporarily designated Pina-null/Pinb-null because the large deletion affects the Pina and Pinb genes simultaneously. In previous reports [6,45], some cultivars were found to lack the Pina and Pinb coding regions, and this allele was designated Pina-D1k by Morris and Bhave [13]. However, Pina-null/Pinb-null is still used to describe this allele in this study because it is not known whether the above four cultivars have the same molecular characteristics at the DNA level as the Pina-D1k allele.

Distribution of Puroindoline-D1 alleles and their association with grain texture
Amongst hard cultivars from different countries, Chinese landraces showed the highest diversity of Puroindoline-D1 genes, with 11 types of Puroindoline-D1 alleles in hard wheat landraces (Table 4). CIMMYT hard wheat, with only two kinds of Puroindoline-D1 alleles, showed the lowest diversity among the countries surveyed, and Pina-D1b was predominant at 94.6%, which is consistent with previous studies [14,17]. In wheat cultivars from Australia and the Netherlands, Pinb-D1b was predominant, at 73.6% and 56.7%, respectively. In Chile, Pina-D1b and Pinb-D1b were almost equally distributed among hard wheat cultivars. Surprisingly, 4 of the 6 cultivars carrying the rare allele Pinb-D1d, which shows relatively superior processing quality [15], were found in the Netherlands. Pinb-D1p was found only in Chinese landraces, where it was prevalent (Table 4). Notably, based on the results of five kernels, one Chinese landrace cultivar, Bailaolaibian, possesses the double-mutation genotype Pina-D1r/Pinb-D1p, with an SKCS hardness index of 73.2.
In this study, we divided all cultivars surveyed into two groups, Chinese landraces and introduced cultivars, to analyze the association of Puroindoline-D1 alleles with grain texture, because of the obvious differences in agronomic traits between them. Although grain texture has a high heritability of more than 80% according to previous reports, the two-year average SKCS hardness index was compared among genotypes by analysis of variance, separately for Chinese landraces and introduced cultivars. In Chinese landraces, the cultivars with the Pina-null/Pinb-null allele possess the highest SKCS hardness index among the genotypes (Table 5). Due to the absence of both Pina-D1 and Pinb-D1 genes, those cultivars have a grain texture similar to that of durum wheat, which also has an extremely high SKCS hardness index. The three types of Pina-D1 mutations resulting in the absence of the PINA protein do not differ significantly in SKCS hardness, but all have significantly higher SKCS hardness than the Pinb-D1b genotype (Table 5). Of the introduced cultivars, the PINA-null and PINB-D1c genotypes show significantly higher SKCS hardness than the PINB-D1b and PINB-D1d genotypes, which is consistent with the result of Morris et al. [10] that the PINB-D1c genotype possesses significantly higher SKCS hardness than the PINB-D1b genotype.
Table 3 notes: a and b indicate the positions of primers relative to the ATG of the Pina and Pinb genes, respectively. c "Yes" and "No" indicate success and failure of amplification, respectively. d "-" indicates that PCR amplification was not performed.
In order to further investigate the influence of Puroindoline-D1 alleles on grain texture and to obtain a clean association of Puroindoline-D1 alleles with SKCS hardness without the impact of other loci in the genome, seven near-isogenic lines with different Puroindoline-D1 alleles were used to compare SKCS hardness index (Table 5). The results indicate that the PINA-null genotype has significantly the highest SKCS hardness, whereas the PINB-D1b and PINB-D1d genotypes have significantly the lowest, among the seven hard genotypes (Table 5). These results are consistent with the above-mentioned results from Chinese landraces and introduced wheat cultivars.

Discussion
Grain texture, which is mainly controlled by the Puroindoline-D1 genes on chromosome 5DS, has an important impact on the milling and processing qualities of bread wheat (Triticum aestivum L.). The discovery of many Puroindoline-D1 alleles has shown that mutations in either the Pina-D1 or the Pinb-D1 gene result in a hard endosperm in bread wheat. However, most of the mutations identified previously in bread wheat were single nucleotide polymorphisms (SNPs) in the Pinb-D1 or Pina-D1 genes. In this study, we found that diverse mutations occurred in the Ha locus of bread wheat in the form of large deletions covering all or part of the Pina-D1 coding region, giving rise to PINA-null alleles.
For a long time, no straightforward marker was available for identifying the PINA-null allele, which led us to develop the Pina-N1 marker for detection of the Pina-D1b allele [29]. Previously, the most common approach for detecting the Pina-D1b allele was to examine the presence or absence of the PINA protein; however, this approach cannot characterize a PINA-null allele at the DNA level. Therefore, almost all PINA-null alleles were regarded as Pina-D1b [17,18,39,46]. However, the findings of our study show that PINA-null alleles can have completely different molecular characteristics at the DNA level. The PINA-null allele (previously called Pina-D1b) is known to be the most prevalent genotype in CIMMYT bread wheat cultivars [14,17]. In this study, all PINA-null alleles in the CIMMYT wheat surveyed were shown to be the Pina-D1b allele. The four molecular markers (Pina-N1 and Pina-N2, which we developed previously; Pina-N3 and Pina-N4, developed in this study) will be useful for straightforward and efficient identification of PINA-null alleles in bread wheat cultivars.
Up to now, many Pina-D1 and Pinb-D1 alleles have been identified in bread wheat cultivars from different geographic regions around the world. Among countries and regions, China seems to possess relatively more diverse bread wheat germplasm with respect to grain texture genotype, based on several investigations of Puroindoline-D1 alleles [10,17,18,47]: almost all of the Puroindoline-D1 alleles previously reported outside China have also been found in Chinese wheat cultivars, whereas a number of Puroindoline-D1 alleles appear so far to be exclusive to Chinese wheat cultivars, e.g. Pinb-D1p, Pinb-D1q, Pinb-D1t, Pinb-D1u, Pinb-D1v, Pinb-D1w, Pinb-D1x, Pinb-D1aa, Pinb-D1ac, Pina-D1m, Pina-D1n, Pina-D1p, Pina-D1q, Pina-D1r etc. [18,20,34,41,42,44]. In this study, Chinese landraces also showed the highest diversity of Puroindoline-D1 alleles among wheat cultivars from five different countries. Because the PINA-null alleles possibly arose during the evolution of hexaploid wheat from Ae. tauschii [30,31], the molecular basis of the PINA-null allele in each cultivar was characterized using either known molecular markers or the primer walking strategy. The discovery of Pina-D1v and Pina-D1u showed that the five types of PINA-null alleles (Pina-D1b, Pina-D1s, Pina-D1r, Pina-D1v and Pina-D1u) have different deletion sites, suggesting that the deletions could have occurred independently. According to previous reports [1,7,10,22] and recent work on Puroindoline-D1 genes in Ae. tauschii (personal communication with Craig F. Morris, Washington State University), none of the five above-mentioned Pina-D1 alleles has been found in Ae. tauschii so far, suggesting that the large deletions in these Pina-D1 alleles possibly occurred during the formation of hexaploid wheat.
Interestingly, all wheat cultivars with the PINA-null allele from CIMMYT, Australia, the Netherlands and Chile were further identified as carrying the Pina-D1b allele in this study, whereas Chinese landraces with the PINA-null allele were shown to possess four different alleles, Pina-D1s, Pina-D1r, Pina-D1v and Pina-D1u, possibly because China is a secondary origin center of hexaploid wheat in the world.
The Yellow and Huai Valley is the largest and most important Chinese wheat production region and contributes greatly to national food security. However, wheat production and quality have not improved significantly in this region during the past decade. A potential reason is the narrow genetic base of modern wheat cultivars, which is a serious obstacle to sustaining and improving wheat productivity because genetically uniform cultivars are vulnerable to new biotic and abiotic stresses. In an attempt to improve this situation, a large number of foreign wheat germplasms were introduced into the Yellow and Huai wheat production region of China. Although we previously reported on Puroindoline-D1 alleles in wheat cultivars of the Yellow and Huai wheat production region [30], the present investigation focused primarily on landraces and introduced cultivars that are or were core parents in breeding programs in this region. The cultivars used in Chen et al. [30] were mainly historical and modern cultivars, and all accessions used previously were excluded from this study. Therefore, the current and previous [30] studies together could provide a more comprehensive understanding of wheat germplasm, particularly of potential parents for wheat breeding programs, with respect to grain texture in the Yellow and Huai wheat production region.

Conclusion
In the present study, molecular characterization of the Puroindoline-D1 allele was investigated in bread wheat cultivars from five geographic regions. Two novel alleles Pina-D1s and Pina-D1u at the Pina-D1 locus were characterized at the DNA level by a primer walking strategy, and corresponding molecular markers were developed for straightforward identification of these two alleles. Analysis of the association of Puroindoline-D1 alleles with grain texture indicated that wheat cultivars with Pina-null/Pinb-null allele have the highest SKCS hardness index amongst the different genotypes, and wheat cultivars with the PINA-null allele have significantly higher SKCS hardness index than those of Pinb-D1b and Pinb-D1p alleles.