Genome-wide association study reveals genetic variation and candidate genes of lint yield components under salt field conditions in cotton (Gossypium hirsutum L.)


Background

Salinity is one of the most decisive environmental factors limiting the productivity of cotton. However, the key genetic components leading to the reduction of cotton yield in saline-alkali soils are still unclear. 

Results

Here, we evaluated three main components of lint yield, namely single boll weight (SBW), lint percentage (LP) and boll number per plant (BNPP), across 316 G. hirsutum accessions under four salt conditions over two years. Phenotypic analysis indicated that LP did not change across salt conditions, whereas BNPP decreased significantly and SBW increased slightly under the high salt condition. Based on 57,413 high-quality single nucleotide polymorphisms (SNPs) and genome-wide association study (GWAS) analysis, a total of 42, 91 and 25 stable quantitative trait loci (QTLs) were identified for SBW, LP and BNPP, respectively. Few overlapping QTLs and no significant phenotypic correlations were observed among the three traits. Gene Ontology (GO) analysis indicated that their regulatory mechanisms were also quite different. Comparing the different salt conditions revealed 8 overlapping QTLs for LP but fewer for SBW and BNPP. We found that 10 genes from the 8 stable LP QTLs were predominantly expressed during fiber development. Further, haplotype analyses showed that a MYB gene (GhMYB103), carrying two SNP variations in its cis-regulatory and coding regions, was significantly correlated with lint percentage, implying a crucial role in lint yield. Transcriptome analysis identified 40 candidate genes from BNPP QTLs as salt-inducible. However, these genes exhibited different regulation patterns: genes related to carbohydrate metabolism and cell structure maintenance were enriched under the high salt condition, while genes related to ion transport were active under the low salt condition.

Conclusions

This study provides a foundation for elucidating the mechanism of salt tolerance in cotton and contributes gene resources for developing upland cotton varieties with high yield and salt stress tolerance.

detected only once, implying that these QTLs are apt to be affected by environmental conditions (Additional file 9: Fig. S3). To improve the reliability and stability of the associated QTLs, we selected QTLs detected three or more times across different methods or environments as stable QTLs for further analysis. As a result, 42, 91 and 25 stable QTLs were identified for SBW, LP and BNPP, respectively (Table 1).
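The stability criterion above (a QTL is kept if it is detected three or more times across methods or environments) can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the QTL identifiers, the method labels and the record layout are hypothetical placeholders.

```python
from collections import defaultdict


def stable_qtls(detections, min_hits=3):
    """Return QTL ids seen in at least `min_hits` distinct (method, environment) pairs."""
    seen = defaultdict(set)
    for qtl_id, method, environment in detections:
        # A set ignores duplicate reports of the same method/environment pair.
        seen[qtl_id].add((method, environment))
    return sorted(q for q, hits in seen.items() if len(hits) >= min_hits)


# Hypothetical detection records: (QTL id, association method, salt condition).
detections = [
    ("qLP-example-1", "method_1", "A"),
    ("qLP-example-1", "method_1", "B"),
    ("qLP-example-1", "method_2", "A"),
    ("qSBW-example-1", "method_1", "C"),
]

stable = stable_qtls(detections)
print(stable)
```

With these placeholder records, only the QTL reported three times passes the filter; the one detected once is treated as environment-sensitive and dropped.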
The chromosomal distribution showed that these stable QTLs were widely distributed across the 26 chromosomes, with more QTLs for SBW and BNPP located on the At sub-genome than on the Dt sub-genome, while QTLs for LP showed the opposite pattern (Fig. 2A). Most SBW QTLs were located on chromosomes A11 and A12, LP QTLs on A08, D06 and D13, and BNPP QTLs on A05, A12 and D07 (Fig. 2B). The Venn diagram of these stable QTLs showed that no QTL was shared by all three traits and that most QTLs were specific to an individual trait (

Identification of candidate genes in QTLs
Potential candidate genes in these stable QTL regions were extracted based on the released G. hirsutum TM-1 genome [20]. In total, 1166, 2748 and 711 candidate genes were identified in the QTL regions for SBW, LP and BNPP, respectively (Fig. 2C), with most genes distributed on chromosomes A12, D13 and A12 for SBW, LP and BNPP, respectively (Fig. 2D). GO analysis showed that the genes in the QTL regions for SBW were enriched in "embryo development" and "regulation of cell shape" (Additional file 10: Fig. S4 and Additional file 11: Table S7). The genes from LP QTLs were mainly enriched in several groups of pathways: "regulation of organ growth"; hormone and ROS regulation, such as "regulation of gibberellic acid mediated signaling pathway", "positive regulation of reactive oxygen species metabolic process", "brassinosteroid biosynthetic process" and "defense response by cell wall thickening"; and carbohydrate metabolism, such as "glycosylation", "glucose metabolic process", "monosaccharide biosynthetic process" and "hexose biosynthetic process". This is consistent with previous reports that these GO terms play crucial roles in fiber development [11,21]. In addition, we identified 14, 21 and 10 genes related to "Golgi vesicle transport", "plant-type secondary cell wall biogenesis" and "glucose metabolic process", respectively. However, no genes related to these processes were found in the QTL regions of BNPP and SBW (Additional file 12: Fig. S5 and Additional file 11: Table S7). The genes associated with BNPP were mainly enriched in "mitotic cell cycle", "ion transmembrane transport" and "polysaccharide catabolic process" (Additional file 13: Fig. S6 and Additional file 11:  4A). The single nucleotide mutation (from C to G) at the TM55217 locus changed the encoded amino acid from leucine (L) to valine (V) (Fig. 4A). Using Student's t test, we found that LP was significantly higher in accessions with the A genotype at TM55216 than in those with the G genotype (Fig. 4B), and significantly higher in accessions with the G genotype at TM55217 than in those with the C genotype (Fig. 4C). The two QTNs generated 3 haplotypes: H1 (AG), H2 (AC) and H3 (GC). LP was significantly higher for the AG and AC haplotypes than for the GC haplotype. However, there was no significant difference between the AG and AC haplotypes, implying that QTN TM55216 might play the more important role in LP (Fig. 4D).
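The haplotype comparison described above can be illustrated with a short sketch that groups accessions by their combined genotype at TM55216 and TM55217 and computes the pooled-variance Student's t statistic between two haplotype groups. The LP values below are made-up numbers for illustration only, not data from this study, and the sketch stops at the t statistic (no p-value is derived).

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, variance


def group_by_haplotype(samples):
    """Group LP values by the two-locus haplotype (TM55216 allele + TM55217 allele)."""
    groups = defaultdict(list)
    for tm55216, tm55217, lp in samples:
        groups[tm55216 + tm55217].append(lp)
    return dict(groups)


def student_t(a, b):
    """Two-sample Student's t statistic with pooled variance (equal variances assumed)."""
    n1, n2 = len(a), len(b)
    pooled = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / n1 + 1 / n2))


# Illustrative (invented) records: (TM55216 allele, TM55217 allele, LP in %).
samples = [
    ("A", "G", 42.0), ("A", "G", 43.0), ("A", "G", 44.0),
    ("A", "C", 41.5), ("A", "C", 42.5), ("A", "C", 43.5),
    ("G", "C", 38.0), ("G", "C", 39.0), ("G", "C", 40.0),
]

groups = group_by_haplotype(samples)
t_ag_vs_gc = student_t(groups["AG"], groups["GC"])
t_ag_vs_ac = student_t(groups["AG"], groups["AC"])
print(sorted(groups), round(t_ag_vs_gc, 3), round(t_ag_vs_ac, 3))
```

In this toy data set the AG versus GC contrast gives a much larger t statistic than AG versus AC, which mirrors the qualitative pattern reported in the text: the groups differ mainly by the allele at TM55216.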
In addition, we integrated the LP QTLs with GWAS signals published in previous reports [11,12], and found that three QTLs (A12: 602614-743324; D13: 58792627-59289811; A09: 4676815-5076815) overlapped with those signals. The three QTLs were detected in different salt conditions. Further, 10 genes from these QTL regions were found to be predominantly expressed during fiber development (Additional file 16: Table S9). Of them,
Taken together, the genes that lie in these stable LP QTLs and are predominantly expressed during fiber development play an important role in LP improvement in breeding practice.

Genes relevant to BNPP
The QTLs of BNPP were easily affected by the environment.  Table S10 and Additional file 19: Table S11). Gh_A04G1216 encodes a high-affinity K+ transporter 1 (HKT1). AtHKT1 limits root-to-shoot sodium transport and is believed to be essential for salt tolerance in Arabidopsis thaliana.
To explore the key genes and regulatory mechanisms of boll number under high and low salt conditions, we compared the genes located in the BNPP QTL regions between conditions A and D. In total, 204 and 265 genes were identified under salt condition A and classification of genes between high and low salt conditions. GO terms related to "polysaccharide metabolic process", "carbohydrate catabolic process" and "cell wall organization" were enriched under high salt condition A (Fig. 5A, Additional file 20: Table S12), and "cell cycle", "ion transmembrane transport" and "regulation of signal transduction" under low salt condition D (Fig. 5B), indicating that carbohydrate metabolism and cell structure maintenance play a crucial role under the high salt condition, whereas ion transport is more fundamental under the low salt condition. Under the high salt condition, several candidate genes associated with BNPP were detected. Gh_A11G1551 encodes a proline dehydrogenase 1 (ProDH1), also called early responsive to dehydration 5 (ERD5), which has been studied extensively, especially under abiotic stress [39]. To counteract the osmotic stress caused by salt, some plants accumulate compatible osmolytes, such as proline, glycine betaine and sugar alcohols, to protect macromolecules and maintain osmotic equilibrium between the inside and outside of the cell membrane [40]. The expression of ProDH2, a highly homologous gene of ProDH1, can promote proline accumulation under stress conditions [41]. For energy metabolism, Gh_A05G1912 encodes an isoamylase 3 (ISA3), which contributes to starch breakdown. Atisa3 mutants have more leaf starch and a slower rate of starch breakdown than wild-type plants [42]. Under the low salt condition, three candidate genes were identified that play important roles in the balance between sodium and potassium. Specifically, Gh_A10G0441 encodes a potassium transporter 1 (KUP1) [43], Gh_A12G0074 encodes a high affinity K+ transporter 5 (HAK5) [44] and Gh_A12G0061 encodes a sodium hydrogen exchanger 2 (NHX2) [45].
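The comparison of QTL-region genes between the high salt (A) and low salt (D) conditions amounts to a set comparison. The sketch below uses only the five candidate genes named above as a stand-in for the full gene lists, which are not reproduced here; the function name and the output keys are illustrative choices, not part of the original analysis.

```python
def compare_gene_sets(high_salt, low_salt):
    """Split two gene lists into condition-specific and shared genes."""
    high, low = set(high_salt), set(low_salt)
    return {
        "high_only": sorted(high - low),
        "low_only": sorted(low - high),
        "shared": sorted(high & low),
    }


# Candidate genes named in the text (a small subset of each condition's list).
condition_a = ["Gh_A11G1551", "Gh_A05G1912"]                 # ProDH1, ISA3
condition_d = ["Gh_A10G0441", "Gh_A12G0074", "Gh_A12G0061"]  # KUP1, HAK5, NHX2

result = compare_gene_sets(condition_a, condition_d)
print(result)
```

For this subset the shared list is empty, so each candidate gene is specific to one salt condition. Each condition-specific list would then be the input for a separate GO enrichment test.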

Discussion
With the decrease in arable land area and the deterioration of soil environments throughout the world, there is an urgent need to improve stress tolerance in crop plants. Xinjiang is the main cotton production area in China, but its soil salinization is serious. Excavating elite alleles that can increase cotton yield under saline-alkali QTLs associated with LP on chromosomes A08 and D08, which were reported to contain many QTLs or key genes related to fiber development [11,12]. However, the distribution of QTLs associated with LP was quite different from that reported by Su et al. (2016) [49]. Taken together, LP is a complex quantitative trait, and the majority of loci detected in our study were novel and might be related to salt stress. In particular, 8 QTLs were shared across the four salt conditions and could contribute to increasing LP under salt stress. Further, we identified 10 genes in these overlapping LP QTLs that were closely related to fiber development, such as PILS, RAB and MYB. In Arabidopsis, RabA4d is necessary for the proper regulation of pollen tube growth. Loss of RabA4d disrupts pollen tube growth and alters the structure of the cell wall [50]. MYB transcription factors are also well known to play crucial roles in fiber development. GhMYB212 RNAi plants (GhMYB212i) accumulated less sucrose and glucose in developing fibers, and had shorter fibers and a lower lint index [51]. Zhang et al. (2007) showed that the transcription factor MYB103 affects callose dissolution during anther development in Arabidopsis [32]. Although many genes related to LP have been identified by GWAS analysis, the candidate genes in this study may play a more important role in improving LP under salt stress. In addition, we identified three QTLs overlapping with LP loci previously reported from GWAS analysis and further identified 10 candidate genes in these QTL regions. These results could provide gene resources for improving LP in both salt and normal environments.
GWAS analyses of BNPP are relatively rare, especially under salt stress. Our study showed that salt stress can lead to a significant decrease in boll number per plant, which is consistent with previous reports [46,52] and indicates that boll number is the first limiting factor for increasing cotton lint yield under stress environments. We also found that few QTLs associated with BNPP overlapped across salt conditions, implying a complex regulatory mechanism for BNPP. GO analysis showed that genes associated with BNPP were mainly involved in "mitotic cell cycle", "ion transmembrane transport" and "polysaccharide catabolic process". Among these, a large number of ion transport related processes were enriched, which suggests that efficient ion transport plays a key role in the salt tolerance of cotton. Na+ accumulation can lead to ion toxicity, which reduces biomass and causes yield losses in crop plants [1].
Maintaining ion homeostasis by ion uptake and compartmentalization is crucial for plant growth during salt stress. With RNA-seq analysis, we found that HKT1, which is known to retrieve Na+ from the xylem and return it to the root, was downregulated under salt stress. Overexpression of HKT1 in roots can decrease Na+ accumulation in the shoot and significantly improve salt tolerance in Arabidopsis thaliana [53]. Interestingly, HKT1 was also found to be downregulated under salt stress in G. davidsonii, a D-genome diploid cotton species with notable resistance to salinity stress [54]. This suggests that the function of HKT1 could be improved in cotton to increase stress tolerance. We also found no overlapping QTLs associated with BNPP between the high and low salt conditions, implying a complex regulatory mechanism under different salt conditions. The genes enriched under the high salt condition are mainly related to energy metabolism and maintenance of cell morphology. High salt stress can decrease the photosynthetic efficiency of plants [55]. Under non-stressed conditions, plants use the majority of their energy to maintain vegetative and reproductive growth. However, as salt concentration increases, plants need to allocate more energy to resisting the stress [1]. In addition, high salt concentration also increases osmotic stress, and plants need to synthesize more osmolytes to maintain cell morphology. In this study, we found that ISA3 played a crucial role in energy metabolism and ProDH1 contributed to the maintenance of cell morphology. In contrast, the genes enriched under the low salt condition are mainly related to ion transport. As a salt-tolerant crop, cotton suffers less salt damage under low salt conditions, which may be due to its efficient ion transport capacity. In this study, KUP1, HAK5 and NHX2 were identified as contributing to ion homeostasis. This suggests that active sodium and potassium ion exchange at low salt concentration is the basis of salt tolerance in cotton. It also reflects the different demands for stress resistance under different salt stress conditions in cotton.
Several reports have suggested that lint yield can be improved by altering the expression of salt-tolerance genes in cotton. For example, overexpressing AvDH1 can decrease membrane ion leakage and increase the activity of superoxide dismutase, leading to salinity tolerance and increased yield in cotton [52]. Overexpression of SNAC1, which belongs to the stress-related NAC superfamily of transcription factors, could improve drought and salt tolerance by enhancing root development and reducing the transpiration rate in transgenic cotton [56]. In this study, we present the first report of phenotypic and GWAS analyses of three lint yield components in cotton under salt stress, and found that BNPP was the most important factor for cotton lint yield in saline-alkali soil. Further, we identified a large number of elite alleles that contribute to the improvement of lint yield under salt conditions. These findings will help us understand the salt tolerance mechanism in cotton and support the breeding of cultivars for saline-alkali soil environments.

Plant materials and field experiments
A total of 316 upland cotton accessions, with 303 cultivars/lines collected from different regions of China and 13 landraces introduced from the United States, were used in this study (Additional file 1: Table S1).  Table S2). With a wide/narrow row alternating planting pattern (10 cm for narrow rows and 66 cm for wide rows), each accession was grown in two rows per plot, with a row length of 2 m and 0.10 m between plants. Drip fertilization beneath mulch film was used for plant growth. Other agronomic practices were the same for all treatments.

Phenotype investigation and data analysis
Ten plants of each accession in each plot were selected randomly from the middle part of each row and tagged for recording SBW, LP and BNPP. At plant maturity (approximately 70% of bolls open), BNPP was counted with ten biological replicates.    represented the total salt contents in the four soil environments, with 19 g/kg (condition A), 10 g/kg (condition B), 7 g/kg (condition C) and 5 g/kg (condition D), respectively.

Additional Files
Additional file 1: Table S1. Information on 316 cotton accessions used in this study.