Comparative fiber property and transcriptome analyses reveal key genes potentially related to high fiber strength in cotton (Gossypium hirsutum L.) line MD52ne

Background Individual fiber strength is an important quality attribute that greatly influences the strength of the yarn spun from cotton fibers. Fiber strength is usually measured from bundles of fibers due to the difficulty of reliably measuring strength from individual cotton fibers. However, bundle fiber strength (BFS) is not always correlated with yarn strength since it is affected by multiple fiber properties involved in fiber-to-fiber interactions within a bundle in addition to the individual fiber strength. Molecular mechanisms responsible for regulating individual fiber strength remain unknown. Gossypium hirsutum near isogenic lines (NILs), MD52ne and MD90ne showing variations in BFS provide an opportunity for dissecting the regulatory mechanisms involved in individual fiber strength. Results Comprehensive fiber property analyses of the NILs revealed that the superior bundle strength of MD52ne fibers resulted from high individual fiber strength with minor contributions from greater fiber length. Comparative transcriptome analyses of the NILs showed that the superior bundle strength of MD52ne fibers was potentially related to two signaling pathways: one is ethylene and the interconnected phytohormonal pathways that are involved in cotton fiber elongation, and the other is receptor-like kinases (RLKs) signaling pathways that are involved in maintaining cell wall integrity. Multiple RLKs were differentially expressed in MD52ne fibers and localized in genomic regions encompassing the strength quantitative trait loci (QTLs). Several candidate genes involved in crystalline cellulose assembly were also up-regulated in MD52ne fibers while the secondary cell wall was produced. Conclusion Comparative phenotypic and transcriptomic analyses revealed differential expressions of the genes involved in crystalline cellulose assembly, ethylene and RLK signaling pathways between the MD52ne and MD90ne developing fibers. Ethylene and its phytohormonal network might promote the elongation of MD52ne fibers and indirectly contribute to the bundle strength by potentially improving fiber-to-fiber interactions. RLKs that were suggested to mediate a coordination of cell elongation and SCW biosynthesis in other plants might be candidate genes for regulating cotton fiber cell wall assembly and strength. Electronic supplementary material The online version of this article (doi:10.1186/s12870-016-0727-2) contains supplementary material, which is available to authorized users.


Background
Cotton (Gossypium spp.) fiber is the most important natural fiber in the textile industry [1]. Physical properties such as strength, length, maturity (degree of thickness), and fineness determine the value and quality of cotton fibers and the yarn spun from them. With the advent of modern high speed spinning machinery, which produces a quality yarn in a cost effective way, the demand for stronger fiber has increased in the highly competitive global textile market. During the past 2 decades, the fiber strength of US cotton has gradually improved through breeding [2], however, the pace of improvement is restricted by our limited knowledge of fiber strength development.
Both cotton producers and processers usually measure fiber strength from a bundle composed of thousands of individual fibers. The strength simply refers to the force required to break fibers [1,3]. The bundle fiber strength (BFS), which is also called fiber tenacity, is measured by an automated high-volume instrument (HVI) designed to measure the market value of cotton fibers in efficient and rapid ways. Currently, the Agricultural Marketing Service (AMS) of the United States Department of Agriculture (USDA) separates cotton fibers into different classes ranging from the lowest (23.0 g/tex or below) to the highest class (31.0 g/tex and above) based on the BFS values measured by the HVI. A stelometer is used to measure the BFS value for a relatively small amount of fibers [1,[3][4][5]. Due to the inherent variability among fibers within a bundle, two fiber bundles of the same weight do not have the same number of fibers. Thus, the actual BFS is determined in grams per tex (g/tex) in which the breaking force (g) is standardized by the linear density or fineness (tex = g/km). Hence, high BFS (g/ tex) is obtained by either increasing the breaking force (g) or deceasing the fineness value (tex).
The cotton industry has been using fiber properties determined by the HVI and Advanced Fiber Information System (AFIS) as parameters for predicting yarn quality and selecting the right raw cotton materials to produce different qualities of yarns. However, several textile papers have reported that the BFS data of cotton fibers were inadequate to predict yarn strength due to insignificant correlation between the cotton BFS and the yarn strength [4,6,7]. The BFS values are regulated by intrinsic fiber strength and other fiber properties which affect fiber-to-fiber interactions within a bundle [1,3,4,[8][9][10].
Fiber length values that promote fiber interactions within a bundle impact the BFS values [11]. Fiber thicknessrelated properties including micronaire (MIC), maturity ratio (MR), and fineness values (tex) also affect bundle strength of cotton fibers since they determine the number of individual fibers in a fiber bundle [9].
The intrinsic fiber breaking force value is correlated with yarn strength and can be measured from individual fibers by Mantis or Favimat instruments [1,3,8,9,11]. Average breaking force (cN) obtained from several hundred individual fibers was not affected by other bundle fiber properties and was reported as one of the most important factors in determining the strength of the yarn spun from those fibers [4,12,13]. Despite the usefulness of the individual fiber properties, the Mantis and Favimat instruments, which require laborious processes, have not been previously utilized in cotton genetics and genomics research.
Gossypium hirsutum germplasm NILs, MD52ne and MD90ne were developed through backcross breeding [14]. MD90ne is the recurrent parent and MD52ne is a BC 6 high-BFS selection. MD52ne contains 10 % higher BFS values, 22 % less short fibers, and 7 % greater fiber length than its NIL, MD90ne [15]. The stronger BFS trait of MD52ne was suggested to be controlled by a small number (≤2) of genes [15]. By using the prototype of cotton oligonucleotide microarray chips that was developed from cotton expressed sequence tags (ESTs) [16], we previously observed a temporal up-regulation of secondary cell wall (SCW) biogenesis genes in MD52ne fiber at the transition from fiber elongation to secondary cell wall (SCW) biosynthesis as compared with MD90ne [17]. Another group also suggested that SCW biogenesis-related genes might contribute to the fiber strength of G. hirsutum chromosome introgression lines (CSILs) carrying chromosomal segments from G. barbadense whose fibers are longer, stronger, and finer than their recurrent parent G. hirsutum TM-1 [18]. The G. hirsutum CSILs were used for identifying potential genes responsible for fiber length [19] in addition to fiber strength [18]. Since the fiber length and fineness affecting fiber-to-fiber interactions within a bundle can modulate the BFS values of the cotton NILs [17] and CSILs [18], it is not clear if the SCW biogenesis-related genes are involved in regulating individual fiber breaking force and/or other bundle fiber properties. Indeed, the same SCW biogenesis-related genes were identified as differentially expressed genes (DEGs) that may be involved in molecular mechanisms of controlling fiber length [19] and fiber thickness-related properties [20,21] in addition to the BFS [17,18]. Therefore, it remains to be answered which candidate genes are really involved in individual fiber breaking force and therefore yarn strength.
In this research, we measured fiber properties of mature and developing fibers of MD52ne, MD90ne, and their F 2 progenies using both novel and conventional methods including Favimat, cross-section image analysis microscopy, attenuated total reflectance Fourier transform infrared spectroscopy (ATR-FTIR), HVI, Stelometer, AFIS, and gravimetric fineness methods. The extended fiber property analyses showed that the superior BFS of MD52ne fibers resulted primarily from a higher individual fiber breaking force with a minor contribution from increased fiber length. Comparative transcriptome analyses of the NILs suggested that the superior BFS of MD52ne fibers was potentially related to ethylene pathway-directed fiber elongation and enhanced cell wall integrity due to the activity of receptor-like kinase (RLK) signaling pathways.

Results
Comparisons of bundle fiber properties between mature MD52ne and MD90ne fibers Fiber property analysis by HVI showed that the BFS of mature MD52ne fibers was significantly greater (19~25 %) than that of MD90ne fibers grown in two separate fields that substantially differ in geographic location and environmental conditions (Table 1). In addition, upper half mean length (UHML) of the MD52ne fiber was longer (5~6 %). Average fiber elongation values of MD52ne were lower than those of MD90ne but was only statistical significant at one location. No significant differences in the MIC values were observed between the MD52ne and MD90ne grown in both locations.

Correlation of bundle fiber strength with other fiber properties
To determine how the BFS values of the NILs were affected by other physical properties involved in fiber-tofiber interactions, we developed an F 2 population of 384 progeny plants derived from a cross between MD52ne and MD90ne. Among the F 2 progenies, the BFS values showed a Gaussian distribution with a broad range (31.99 and 42.66 g/tex) (Fig. 1a). Other fiber properties including UHML length (27.82~32.48 mm), elongation (4.78~6.45 %), MIC (4.39~5.78), maturity ratio (0.979~1.078), and fineness (172.0~215.0 mtex) measured by HVI and AFIS showed Gaussian distributions among the F 2 progenies so we were able to determine correlations of the BFS with other fiber physical properties. Table 2 showed that the BFS values of the NILs were affected by fiber length (UHML, mm) and micronaire (MIC) measured by HVI, and the maturity ratio (MR) and fineness (mtex) measured by AFIS. The correlation coefficient values (r) by Pearson's method [22] showed that the BFS value was positively correlated with UHML, MIC, MR, or fineness, whereas the BFS showed no significant correlation with FE. The R 2 values suggested that BFS variances were slightly affected by UHML (12.9 %), whereas they had little effect from fiber thickness-related properties such as MIC (1.2 %), MR (2.9 %), and fineness (1.2 %).

Comparisons of individual fiber properties between mature MD52n and MD90ne fibers
To minimize the influences of other physical fiber properties on fiber strength, we first screened for the NIL plants that had similar fiber properties except the BFS values. Among the selected NIL plants having a significant variation (24 %) of BFS values, we identified each cotton MD52ne and MD90ne line having an almost identical MIC value (4.931) representing a combination of fiber maturity ratio and fineness. The fiber properties measured by AFIS in addition to the HVI confirmed that there was no significant variation in the fiber thickness-related properties (MIC, MR, and fineness) but a significant difference (5 %, UHML) in fiber length between the selected NILs ( Table 3). The average breaking force of individual MD52ne fibers was significantly greater (22.4 %) than that of individual MD90ne fibers (Table 3). The distribution curves showed great variation of breaking forces among individual fibers of each NIL and a similar distribution range of breaking force between the individual MD52ne fibers (0.93~11.25 cN) and MD90ne fibers (0.62~11.27 cN) (Fig. 1b). Stronger fibers with higher cN values were more prevalent in MD52ne than MD90ne.

Comparisons of fiber properties between developing MD52ne and MD90ne fibers
To determine when variations of fiber properties occurred during cotton fiber development between the two NILs, we measured fiber physical properties from developing fibers at ten different developmental time points (10,13,15,17,20,24,28,33,37, and 44 DPA) covering entire developmental stages (Fig. 2). The growth rates of actively elongating MD52ne and MD90ne fibers declined at 20 DPA (Fig. 2a). Statistical analyses using t test showed no significant variances on the average lengths of the developing fibers younger than 24 DPA between the NILs. Average fiber lengths of developing MD52ne fibers at 28 DPA and older were significantly longer (p value, 0.0012) than those of developing MD90ne fibers at the corresponding time points. At around 15 DPA, fibers entered a transition phase with both SCW cellulose biosynthesis and fiber elongation (Fig. 2b). Crystallinities of developing MD52ne and MD90ne fibers were measured by an ATR-FTIR spectroscopy and its corresponding algorithm [23,24]. The comparative plot showed low crystallinity (CI IR ) of the developing fibers between 10 and 15 DPA. The CI IR values increased rapidly to 17 DPA, and reached its peak at 37 DPA (Fig. 2b). The observation implied that the developing fibers at 10, 13, and 15 DPA were composed of mainly primary cell wall (PCW) containing a low crystallinity and the transition from elongation to SCW biosynthesis occurred from 15 to 17 DPA. SCW containing a high crystallinity was deposited on the developing fibers until 37 DPA of both NILs. There was no statistically significant difference in the CI IR values of developing fibers between the two NILs during the entirety of the developmental stages (Fig. 2b). The bolls became fully developed and began to open at approximately 42 DPA. The mature fibers dehisced at 44 DPA. Fiber maturity ratio or degree of wall thickness (M IR ) was also assessed from developing MD52ne and MD90ne fibers by ATR-FTIR spectra (Fig. 2c) [23,24]. The developing fibers at 17 DPA and younger showed little of the SCW cellulose that is mainly responsible for fiber maturity [25]. The M IR values revealed that the developing fibers at 20 DPA and older consisted of SCW cellulose. During the SCW biosynthesis stage (20-33 DPA) of both NIL fibers, the M IR values increased likewise (Fig. 2c). Cross-sectioned images of fully mature MD 52ne and 90ne fibers and their calculated MR showed similarities between the NILs (Fig. 3 (Fig. 2d). Developing MD52ne fibers at 20 DPA and older were significantly (p-value < 0.0008) stronger than developing MD90ne fibers at the corresponding DPAs ( Fig. 2e). When the developing fibers reached 20 DPA, the BFS differences were clearly detected between MD52ne (21.7 g/tex) and MD90ne (17.5 g/tex). A stelometer, which requires relatively less fiber than the HVI, was used for measuring bundle strength of developing fibers. Neither the stelometer nor the HVI could measure BFS values from developing fibers that were younger than 20 DPA, because the sugary and sticky developing fibers cannot be individualized.

Transcriptome analysis of developing fibers between MD52ne and MD90ne by RNA-seq
To investigate the molecular basis for the superior fiber strength of MD52ne to MD90ne, whole genome comparative transcriptome analyses were performed with total RNAs extracted from developing fibers between MD52ne and MD90ne. Based on the fiber property data obtained from developing fibers ( Fig. 2), two different developmental time points, 15 DPA containing mainly PCW and 20 DPA containing both PCW and SCW, were compared between the NILs with two biological replicates. The average number of raw reads per library obtained by paired-end Illumina sequencing ranged from 32.80 to 33.49,000,000 reads ( Table 4). The numbers of average raw reads were slightly higher in MD52ne in both time points than MD90ne. Of the raw reads, 80.83 to 84.35 % of reads per library were mapped to annotated protein coding genes in the draft genome G. hirsutum, TM-1 [26]. A total of 61,263 genes were mapped in this Upland cotton genome for all four libraries (Additional file 1). Expressed genes were annotated with Arabidopsis Tair 10 homeolog genes. The percentages of genes having reads per kilo-base per million reads (RPKM) >1 were ranged from 71.61 to 79.22 % per library. We selected an RPKM Fig. 3 Comparisons of fiber maturity ratios of mature fibers between MD52ne and MD90ne measured by image analysis microscopy. Circularity (θ = 4πA/P 2 ) representing the degree of fiber cell wall thickness was calculated from average wall area (A) excluding lumen and perimeter (P) of cross-sections of 300 fibers. The maturity ratios between MD52ne (0.948 ± 0.010) and MD90ne (0.971 ± 0.033) calculated from the circularities showed no significant variation (p-value, 0.519). A scale bar represents 10 μm. MD52ne, MD90ne threshold of 1 to be consistent with prior work using this draft genome [26]. Expressed mapped genes were further evaluated between two NILs using RPKM values for the respective time points. The false discovery rate (FDR) was set at 5 % to determine the threshold of p value in multiple tests and analyses. To judge the significance of gene expression difference, adjusted p ≤ 0.05 and the absolute value of log 2 ratio ≥ 1 was used as the threshold [27]. Of the 37,675 expressed genes, 4005 and 1080 unique genes were significantly differentially expressed at 15 and 20 DPA, respectively between the NILs (

Annotation and gene ontology analyses of DEGs in MD52ne fibers
GO enrichment analysis of the annotated DEGs by agriGo [28] showed that the DEGs involved in responses to stimuli and phytohormones, intracellular signaling, cellular metabolic process, cell wall modification, lipid localization, and carbohydrate metabolic process were commonly identified in developing MD52ne fibers at 15 and 20 DPA (Fig. 4c and Table 5). Numerous transcription factors regulated by growth stimulating phytohormones such as auxin, ethylene, and gibberellins (GA) were differentially expressed in developing MD52ne fibers (Table 5). Among them, 1-aminocyclopropane-1-carboxylic acid synthase 6 (ACC6) and 1-aminocyclopropane-1carboxylate oxidase 4 (ACO4), key enzymes for synthesizing ethylene stimulating fiber elongation [29,30], and EIN3-binding F box protein 1 (EBF1) involved in ethylene signaling [31] were up-regulated at 15 and 20 DPA. GAST1 encoding for a GA promoting growth enzyme [32] was highly up-regulated in MD52ne. We also conducted GO enrichment analysis using DEGs at 15 and 20 DPA separately (Additional file 8).

Validation of DEGs in MD52ne fibers
The expression patterns of the DEGs identified from RNA-seq data of the two NILs were validated with real time quantitative PCR (RT-qPCR) analysis. Based on GO enrichment analysis, 32 DEGs showing different expression pattern in the RNA-seq were selected for RT-qPCR using RNAs from developing fibers at 10, 15, 20 and 24 DPA. Comparative transcript levels of the all 32 DEGs obtained by both RT-qPCR and RNA-seq were presented in Additional file 10, and the RT-qPCR results from the 9 critical DEGs were shown in Fig. 6. The identical expression patterns of all the tested DEGs between RT-qPCR and RNA-seq indicated the reliability of both sequencing and the DEG filtering process. ACO4 and EBF1, which are both involved in the ethylene signaling pathway [29,30], were expressed more in MD52ne than in MD90ne at all stages in PCW biosynthesis (10 DPA), transition (15 DPA), and SCW biosynthesis (20 and 24 DPA) (Fig. 6). The most significant upregulation in all developmental stages was found in the LRR RLK that is a new signaling pathway of cell wall integrity [33]. A transcription factor showing sequence similarity to Arabidopsis MYB 46 regulating biosynthesis of cellulose, hemicelluloses, and lignin [44,46] was expressed more abundantly in MD52ne than in MD90ne specifically at the transition stage (15 DPA) between the PCW and SCW biosyntheses (Fig. 6). Consistently, cellulose synthase catalytic subunit A4 (CesA4) which is involved in the SCW biosynthesis of cotton fiber [47] showed an identical expression pattern to the MYB46 transcription factor. The expression pattern of a zinc finger protein COL5 involved in cellular metabolic process was also highly abundant in MD52ne at all tested DPAs. A CHI/CHI (Gh_D06G0479) responsible for crystalline cellulose content in other plants [43,48] was highly up-regulated in MD52ne as compared to MD90ne. During the active SCW biosynthesis stage (20)(21)(22)(23)(24), the transcript levels of CHI/CHI were 30 fold higher in MD52ne fibers than in MD90ne fibers. An FAD-linked oxidase and a ribosomal L18e/L15 protein (L18e/L15) were highly up-regulated in MD52ne over MD90ne in all developmental stages.

DEGs at the QTL regions for bundle strength and fiber length
We previously identified stable quantitative trait loci (QTLs) associated with BFS and fiber length (UHML) using an F 2 population derived from a cross between MD52ne and MD90ne [49]. The QTLs associated with BFS and UHML were linked with simple sequence repeat markers, BNL4034 and GH454 located on chromosome 3 (A03) and chromosome 24 (D08), respectively. We aligned the QTL regions along with the DEGs with the physical map of the G. hirsutum TM-1 genome [26,50]. A total of 75 genes were differentially expressed either at 15 and 20 DPA in the QTL regions. Of the 75 DEGs, 17 genes (9 DEGs in the BFS QTL and 8 DEGs in the UHML QTL) up-regulated more than 2 fold in MD52ne were involved in cell wall modification based on GO analysis (Additional file 11). Three LRR RLKs, two NAC transcription factors, COL5, MADS-box transcription factor, and WRKY transcription factor involved in phytohormonal and RLK signaling pathways were located on the QTL regions (Fig. 7). Trehalose-phosphatse / synthase 9, XET9, Sus3, and germin like protein involved in cell wall biosynthesis or wall carbohydrate metabolisms were also found near the QTL regions. Two ribosomal L18e/L15 proteins were also located on chromosome A03 QTL location. Among them, the ribosomal L18e/L15 protein (Gh_D02G1619) at the A03 QTL and the LRR RLK (Gh_D08G0203) located near the D08 QTL were the DEGs that were mostly upregulated at all developmental stages (Fig. 6).

Discussion
Greater force is required for breaking individual fibers of MD52ne than its NIL, MD90ne The BFS value has been broadly used to evaluate fiber strength which is the breaking force of a fiber bundle. In addition to the breaking force, the BFS is also affected by variable fiber properties like length, fineness, maturity, and MIC involved in fiber-to-fiber interactions. The correlation analysis of fiber properties from the 384 F 2 progenies from a cross between MD52ne and MD90ne showed that the BFS variation of the NILs was mainly determined by the breaking force with low effects from variable fiber properties related to fiber-to fiber interactions ( Table 2). The fiber length showing 5-6 % differences between the NILs contributed a slight influence (12.9 %) on the BFS variances between the NILs, whereas the fiber thickness related properties having insignificant variations between the NILs had almost no effect on the BFS variances of the NILs (Tables 1 and 2). Since the BFS variance of the NILs is mainly driven by breaking force, the ratio of the bundle strength between the NILs was expected to be similar to the ratio of the individual fiber strength that is not affected by fiber-to fiber interactions. As predicted, the strengths of bundle (24 %) and individual fibers (22 %) from the MD52 were similarly higher than those from the MD90ne (Tables 1  and 3). As results, we concluded that the superior breaking force mainly contributed to the high bundle strength of the MD52ne fibers with a minor contribution from longer fiber. Compared with the cotton CSIL lines which have been used for studying fiber strength despite great potential effects from fiber length and thickness related properties due to their substantial variances among the CSILs [18,19], the MD52ne and MD90ne are more ideal cotton NILs for dissecting molecular mechanisms of intrinsic fiber strength since their BFS variance was mainly affected by the breaking force but minimally regulated by other fiber properties involved in fiber-to fiber interactions within a fiber bundle.
Based on the fiber property analyses of the developing fibers whose crystallinity increased rapidly from 15 to 17 DPA (Fig. 3b), we determined that the transition from PCW to SCW biosynthesis stages began at approximately 15 DPA. Thus, we compared the transcript abundance in developing MD52ne and MD90ne fibers at two In the developing fibers at the transition stage, a new cell wall layer named as winding layer is deposited [51]. Based on the observation of high fiber strength in developing fibers (21 DPA) composed of a winding layer with minimum SCW [8], the winding layer has been speculated as a potential source of fiber strength [17,52]. To determine if and how the winding layer contributes to the strength of bundle and individual fibers, further comprehensive studies may be necessary since single fiber strength of developing fibers was previously measured by an Instron tensile tester that was a prototype for measuring individual fiber strength [8] and caused high variability and inconsistency [53]. In our studies, high BFS values of the NILs were also detected at the transition stage (20 DPA) of the NILs (Fig. 3e). The BFS of developing MD52ne fibers (21.73 g/tex, 20 DPA) was significantly (p value, 0.0027) higher than that of developing MD90ne fibers (17.52 g/tex, 20 DPA).

Ethylene and its networking phytohormonal pathways may be involved in superior fiber length development in MD52ne
Comparative transcriptome analyses showed that transcripts related to ethylene and its networking auxin and GA signaling pathways for promoting fiber elongation were highly abundant in MD52ne (Table 5 and Fig. 4c). Ethylene gas is known as a major phytohormone stimulating fiber elongation [54]. Up-regulations of ethylene synthesizing genes like ACC and ACO as well as ethylene signaling gene like EBF are critical for active fiber elongation [29][30][31]. Consistent with the prior findings, ACC6, ACO4, EBF1, and ERF1 involved in ethylene biosynthesis and signaling pathway required for fiber elongation were highly expressed in elongating MD52ne fibers whose length (UHML) was longer than its NIL, MD90ne (Table 5 and Fig. 6). In addition, transcripts (AUX/IAA, auxin-responsive GH3 protein, and GAST1) involved in auxin and GA were required for differentiating and elongating fibers [41,55]. For promoting fiber elongation, multiple expansins involved in loosening the cell walls and lipid transfer proteins involved in fiber elongating [29] were also up-regulated in the MD52ne fibers (Table 5 and Fig. 8). A large set of genes (XET, PME, and Sus) involved in xyloglucan and pectin biosynthesis and carbohydrate metabolism requiring cotton fiber elongation [56] was also enriched in the elongating MD52ne fibers (Tables 5 and 6). A XET (Gh_A03G1432) and a Sus (Gh_D08G1309) were linked with the QTLs associated with BFS and UHML (Fig. 7).

Receptor-like kinase signaling pathway regulating cellulose deposition and maintaining cell wall integrity may be involved in superior fiber strength development in MD52ne
In addition to phytohormone and transcriptional networks controlling plant growth and development, receptor-like kinases (RLKs) have been found as novel regulators for both plant development and stress responses [33,57]. Among the various RLK classes described in Fig. 5, RLKs containing a leucine-rich repeat (LRR) were most frequently identified from developing MD52ne fibers. Three LRR RLKs were also found in the two QTLs associated with BFS and length ( Table 5, Figures 6 and 7). In Arabidopsis elongating root tips and seeds, two LRR RLKs named FEI 1 and 2 have been reported to play a role in cellulose deposition in Arabidopsis elongating root tips [34] and seed coat [35]. ACC (1-aminocyclopropane-1-carboxylic acid) that is essential for ethylene signaling has been suggested to be a signaling molecule for FEI 1 and 2 [58], so both ethylene and LRR RLK signaling are most likely involved in cellulose deposition in elongating tissues (Fig. 8).
Other LRR RLKs whose ligands are unknown are reported as a regulator of the SCW formation in Arabidopsis [59] and popular trees [60]. A cotton LRR RLK named GhRLK1 located in the plasma membrane was reported to be induced during active SCW synthesis stage [61]. Therefore, LRR RLK signaling pathways might be involved in mediating a coordination of cell elongation and SCW biosynthesis during cotton fiber development as suggested in other plants [57,58].
Temporal regulation of the genes involved in crystalline cellulose assembly at secondary wall biosynthesis stage of MD52ne fibers Three genes (Gh_D12G0298, Gh_A13G0320, and Gh_D13G0359) encoding COBRA-like protein were specifically up-regulated during the SCW biosynthesis stage in MD52ne fibers (Table 5). COBRA-like protein 2, a member of the GPI-anchored COBRA-like family, has been recently identified to play a role in crystalline cellulose deposition in Arabidopsis seed coat [62]. When a COBRA-like protein was deficient in brittle culm1 rice mutant [63] or brittle stalk2 maize mutant [64], mechanical strength of the stems was reduced. A COBRA-like protein interacting with cellulose modulates cellulose assembly in rice [42]. Therefore, the three COBRA-like proteins up-regulated at the SCW stage of MD52ne fibers can be candidate genes that contribute to the superior strength of MD52ne (Fig. 8).
Four CHI/CTL genes (Gh_D06G0479, Gh_A06G0439, Gh_A10G1271, and Gh_D09G2016) specifically upregulated in the SCW stage of MD52ne fibers were similar to the sequences of Chitinase (CHI) and Chitinase like protein (CTL) in other plants. CTL2 is responsible for crystalline cellulose content in Arabidopsis [43], whereas CHI is a pathogenesis-related gene responding to biotic stress in rice and maize [65,66]. Thus, the real function of CHI/CTL genes identified from MD52ne remains to be determined.

Temporal regulation of genes at transition stage of MD52ne fibers
We previously reported temporal up-regulation of SCW biosynthesis-related genes at the transition from the PCW to SCW stage in the MD52ne fibers based on the transcriptome profiles performed with the first generation of cotton oligonucleotide microarray [16,17]. The present transcriptome analyses performed with RNA-seq and G. hirsutum genome sequence identified many more DEGs in the MD52ne fibers (Additional file 12) than the previous microarray analysis [17]. The RNA-seq analysis also showed that some SCW biosynthesis-related genes such as MYB26, MYB46, and CesA4 [44,45,47] were  (Table 5 and Fig. 6). In contrast, other 26 CesAs identified by the RNA-seq did not show temporal up-regulation at the transition stage. For further analysis of the temporal regulation of SCW biosynthesis-related genes, we retrieved 181 Arabidopsis genes from PlaNet [67] that are temporally and spatially co-expressed during SCW biosynthesis in Arabidopsis. Among the MD52ne genes ortholgous to the 181 Arabidopsis SCW-related genes, 64 genes (35.4 %) showed temporal up-regulation in 15 DPA developing MD52 fibers (Additional file 13), whereas others showed no temporal up-regulation. In addition to the limited numbers of up-regulated SCW genes in the MD52ne over MD90ne, the identical levels of the crystalinity between the NILs (Fig. 2B) might imply that the genes related to SCW cellulose biosynthesis were less involved in the superior fiber strength of the MD52ne than the genes related to SCW cellulose assembly and wall integrity.

Conclusions
The demand of high fiber strength has been increased dramatically with the advent of modern high speed spinning technology for producing yarn. Cotton researchers have tried to improve this trait in G. hirsutum genetic backgrounds. MD52ne was proven to have higher fiber strength than its NIL MD90ne. This study was conducted to unveil the molecular mechanisms behind the formation of superior individual fiber strength, which is correlated with yarn strength, in MD52ne using RNA-seq technology. The bundle strength of the MD52ne fibers predominantly depends on individual fiber strength combined with fiber length. Comparative transcriptome analyses of the NILs suggested that the superior strength of MD52ne fibers was potentially related to two signaling pathways (Fig. 8): one is ethylene and its interconnected phytohormonal pathways involved in fiber elongation and interfiber interactions, and the other is RLKs signaling pathways involved in regulating cell wall integrity and potentially mediating a coordination of cell elongation and SCW biosynthesis. Several secondary cell wall biogenesis related genes and transcription factors such as COBRA-like protein, CHI/CTL, NAC, WRKY, COL5, Zinc finger family protein and MYB were up-regulated in MD52ne developing fibers. The superior BFS of MD52ne fibers might be the result of high individual fiber strength with a minor contribution from longer fiber length. The longer fibers may increase the fiber-to-fiber interactions and are likely the result of differential regulation of some PCW related genes. The improved individual fiber strength of MD52ne is likely related to pathways regulating cell wall integrity.

Plant material
The cotton NILs MD52ne and MD90ne were bred and provided by Dr. William Meredith of USDA-ARS-SEA (Additional file 14) [14,15]. The two NILs were grown at two different fields for two growing seasons and at a greenhouse for one growing season. To compare fiber physical properties between the two NILs and determine variations in each NIL plant, ten individual plants of each NIL were grown at a field located in Stoneville, MS in 2012. The soil type in Stoneville is Bosket fine sandy loam. The mature fibers were collected from each plant of each NIL. To perform comparative transcriptome analyses, three biological replications (approximately 40 cotton bolls per replication) at each developmental time point (10,13,15,17,20,24,28,33,37,44,and 48 DPA) were collected from 50 plants of the two NILs grown at a field located in New Orleans, LA in 2013. The soil type in New Orleans was Aquent dredged over alluvium in an elevated location to provide adequate drainage. Fiber samples from 384 F 2 progeny plants derived from crosses between MD90ne and MD52ne were collected at Stoneville, MS, USA in 2012 as described in Islam et al. [49]. All naturally-open bolls were manually harvested from each of the F 2 plants. Developing fibers (10-37 DPA) were manually collected from ovules and mature fibers (44 DPA) were ginned using a laboratory roller gin. The collected fibers were frozen immediately with liquid nitrogen for RNA extraction or dried in 40°C incubator for physical property analyses. To extract additional RNAs that were used for verification of the transcriptome results, three biological replications of cotton fibers at each developmental time point  were collected from the two NILs grown in 5 gal pots with Metro-Mix 360 in a greenhouse located in New Orleans, LA. Throughout all processes from planting, tagging, harvesting, and ginning, the two NILs grown side by side were equivalently treated.

Fiber property measurements
For measurements of fiber properties from cotton fibers, fibers were pre-equilibrated with 65 ± 2 % humidity at 21 ± 1°C for 48 h. All fiber properties were obtained from instruments that were properly calibrated according to the manufacturers' instructions and standard cotton fibers were obtained from USDA-AMS.
BFS (g/tex), UHML (inch), FE (%), and MIC values of the mature fibers from the two NILs were obtained from five replicates measured by HVI (USTER Technologies Inc., Knoxville, TN). To determine BFS from developing fibers, a stelometer (SDL Atlas, Stockport, England) was used with three replicates of fiber samples. BFS is given as tenacity, expressed as kilonewton meters per kilogram, and is the force required to break a bundle of fibers of a specific gravimetric linear density. For measuring the breaking force of individual fibers, Favimat (Textechno, Mönchengladbach, Germany) was used with 303 individual fibers and a 13 mm length gauge, according to Delhom et al. [68].
AFIS maturity ratio and fineness were measured using Uster® AFIS-Pro (USTER Technologies Inc., Knoxville, TN). The average AFIS fiber data were obtained from five replicates with 5000 fibers per replicate.
For measuring the gravimetric fineness (mtex, mg km −1 ) of the fibers, three hundred fibers were combed, cut at the top and bottom to leave them 15 mm long, and measured by a microbalance [69]. Average gravimetric fineness was calculated from the three measurements.
Circularity (θ) representing the degree of fiber cell wall thickness was directly measured from light microscopic images from cross-sectioned fibers [70]. Average cell wall area (A), excluding lumen and perimeter (P) of the fiber cross sections, was measured from 300 crosssections using the image analysis software according to Xu and Huang [71]. The obtained circularities from the equation, θ = 4πA/P 2 [72] were converted to maturity ratio (MR) using the equation, MR = θ / 0.577 [73].

ATR-FTIR spectral collection and data analysis
All spectra were collected with an FTS 3000MX FTIR spectrometer (Varian Instruments, Randolph, MA) equipped with a ceramic source, KBr beam splitter, and deuterated triglycine sulfate (DTGS) detector. The ATR sampling device utilized a DuraSamplIR single-pass diamond-coated internal reflection accessory (Smiths Detection, Danbury, CT), and a consistent contact pressure was applied by way of a stainless steel rod and an electronic load display. At least six measurements at different locations for individual samples were collected over the range of 4000-600 cm −1 at 4 cm −1 and 16 coadded scans. All spectra were given in absorbance units and no ATR correction was applied. Following the import to GRAMS IQ application in Grams/AI (Version 9.1, Thermo Fisher Scientific, Waltham, MA), the spectra were smoothed with a Savitzky-Golay function (polynomial = 2 and points = 11). Then, the spectral set was loaded into Microsoft Excel 2000 to assess cotton crystallinity and maturity by using a previously proposed algorithm analysis [23,24]. In the original concept of assessing cellulose maturity (M IR ) and crystallinity index (CI IR ) from IR measurement [22][23][24], the key wavelengths was identified and then two algorithms (R 1 and R 2 ) were developed to estimate the degree of cotton cellulose M IR and CI IR by representing the R 1 and R 2 values. Both mean value and standard deviation for each fiber sample were used for the comparison between two types of varieties.

RNA extraction, library preparation and sequencing
Total RNA was extracted from the developing cotton fibers (10, 15, 20 and 24 DPA) using the Sigma Spectrum™ Plant Total RNA Kit (Sigma-Aldrich, St. Louis, MO) with DNase1 digestion according to the manufacturer's protocol. The quality and quantity of total RNA were determined using a NanoDrop 2000 spectrophotometer (NanoDrop Technologies Inc., Wilmington, DE) and an Agilent Bioanalyzer 2100 (Agilent Technologies Inc., Santa Clara, CA). The RNA samples of two biological replications at two different developmental stages (15 and 20 DPA) from both NIL fibers were sent to Data2-Bio LLC (Ames, Iowa) for library preparation and subsequent paired-end Illumina mRNA sequencing according to the methods that were previously described [74].

RNA-seq data processing
The raw RNA-seq reads were trimmed with SICKLE (https://github.com/najoshi/sickle) using a quality score cutoff of 20. Then, the RNA-seq reads were aligned to the Gossypium hirsutum draft genome [26] with the GSNAP software program [75]. Reads mapped to each annotated gene were counted with Bedtools software [76].

Identification of differentially expressed genes
Differential gene expression was calculated by the negative binomial method of the EdgeR software using the tagwise estimation of dispersion [77]. RPKM was used to estimate gene expression levels as calculated: RPKM = [10 9 /NL] C, where C stands for the number of reads that could map to the target unigene, N represents the number of reads that could map to at least one unigene, and L refers to the length of the target unigene. The accuracy of the test result was corrected by FDR. In this study, FDR < 0.05 and the absolute value of the log 2 ratio (in which 'ratio' refers to the fold change in expression of the target unigene in among libraries) were used to select DEGs.

Validation of RNA-seq results with RT-qPCR
The experimental procedures and data analysis related to RT-qPCR were performed according to the Minimum Information for Publication of Quantitative Real-Time PCR Experiments guidelines [70]. Four fiber development stages (10, 15, 20 and 24 DPA) were used for RT-qPCR analyses for validating the RNA-seq result of selected genes. The detailed description of cDNA preparation, qPCR, and calculations were previously reported [20]. Specific primer pairs were designed from 32 DEGs for validation of the RNA-seq. The endogenous reference genes used in the RT-qPCR reactions were the 18S rRNA (U42827), and αtubulin 4 (AF106570). The reference and target gene primer sequences are shown in Additional file 15. Three biological replications and three technical replications for each time-point were used for RT-qPCR.

Gene annotation analyses
RNA-seq data obtained from 15 and 20 DPA fibers of two NILs were first subjected to Venn analysis utilizing BioVenn [78] to determine which DEGs were common between two time-points. To assist in the identification of biological processes represented in the data, GO enrichment analysis was performed using the agriGO Singular Enrichment Analysis [28]. The statistical test method used was the Fisher's exact test (significance level 0.05). For metabolic analysis, MapMan software [79] was used to identify and illustrate metabolic overview of cell wall related molecules.

Ethics approval and consent to participate
Not applicable.

Availability of supporting data
All supporting data can be found within the manuscript and its additional files. The biological sequences were deposited in the Sequence Read Archive (SRA), National Center for Biotechnology Information (NCBI) under the accession numbers SRS843151, SRS843159, SRS843160, and SRS843163.

Additional files
Additional file 1: Name of mapped gene in Gossypium hirsutum (TM-1) and their annotation with Arabidopsis homeolog genes. This table contains 61,263 allotetraploid cotton genes that aligned to Gossypium hirsutum (TM-1) draft genome. Mapped genes were also annotated with Arabidopsis homeolog genes and provide their description. The physical locations including chromosome name of each gene are also provided. First column have G. hirsutum (TM-1) gene name followed by Arabidopsis gene name, TAIR 10 gene description, G. Additional file 10: RT-qPCR validation of some selected differentially expressed genes from RNA-seq data during fiber development. This file contains the results of qPCR analysis and RNA-seq expression data performed on 32 genes that were used to compare fold expression levels between MD52ne and MD90ne. Four (10, 15, 20 and 24 DPA) and two (15 and 20 DPA) developing fiber samples were used for RT-qPCR and RNA-seq, respectively. RT-qPCR values were corrected to Tubilin and 18S gene for each sample. For each treatment group three qPCR measurements were taken for each of three biological replicates and then averaged. (DOCX 644 kb) Additional file 11: Names of the important genes related to cotton fiber cell wall development are located near the identified QTL regions. This table contains differentially expressed genes from RNA-seq data that physically located near the two previously reported QTLs associated with BFS and UHML located on chromosomes 03 and 24, respectively by Islam et al. 2014 [49]. (XLSX 12 kb) Additional file 12: Comparison of deferentially expressed genes between RNA-seq and microarray of Hinchliffe et al. 2010. These figure compare the number of differential expressed genes between