Functional analysis of GhCHS, GhANR and GhLAR in colored fiber formation of G. hirsutum L.

The formation of naturally colored fibers results mainly from the accumulation of different anthocyanidins and their derivatives in the fibers. Chalcone synthase (CHS) is the first committed enzyme of flavonoid biosynthesis. After biosynthesis, anthocyanidins are transported into the fiber cell mainly by anthocyanidin reductase (ANR) and leucoanthocyanidin reductase (LAR), where they present diverse colors with distinct stability. The biochemical and molecular mechanism of pigment formation in naturally colored cotton fiber is not clear. The three key genes GhCHS, GhANR and GhLAR were predominantly expressed in the developing fibers of colored cotton. In the GhCHSi, GhANRi and GhLARi transgenic cottons, the expression levels of GhCHS, GhANR and GhLAR decreased significantly in the developing cotton fiber, and this decrease was accompanied by reduced anthocyanidin content and lighter fiber color. In colored cotton Zongxu1 (ZX1) and the GhCHSi, GhANRi and GhLARi transgenic lines of ZX1, HZ and ZH, the anthocyanidin contents of the leaves, the cotton kernels, and the mixture of fiber and seed coat all changed and were positively correlated with the fiber color. The three genes

its derivatives and pCLCrVB. Agrobacteria harboring pCLCrVA or one of its derivatives were mixed with an equal volume of Agrobacteria harboring pCLCrVB. The mixed Agrobacteria solutions were infiltrated into the abaxial side of the cotyledons of 2-week-old cotton seedlings using syringes without needles. The agroinfiltration was repeated at least three times with at least 30 plants for each vector. Total DNA was extracted from the leaves of pCLCrV-inoculated cotton plants. The presence of pCLCrV DNA in infected plants was detected by PCR using primers specific for pCLCrV DNA-A (CLCrVA F and CLCrVA R) as described before [46]. Plants infiltrated with the pCLCrVA-empty vector and the wild-type C312 were used as controls in the experiment. Plants infiltrated with the pCLCrV-PDS vector showed the typical photobleaching phenotype in newly developing leaves and in different tissues and organs. The efficiency of gene silencing was judged by the intensity of photobleaching during the whole growth period. The leaves and developing bolls of the GhCHSi, GhANRi, GhLARi and GhPDSi transgenic plants and the control (CK) plants were collected at 0 DPA, 5 DPA, 10 DPA and 15 DPA for measurement of anthocyanin content and gene expression analysis.

Gene expression analysis by quantitative real-time PCR
Total RNA was isolated from the mixture of fiber and seed coat according to the manufacturer's instructions (RNAprep Pure, TIANGEN BIOTECH, China) and treated extensively with RNase-free DNase I. Double-stranded cDNA was synthesized from 200 ng of RNA using the FastQuant RT kit with gDNase (TIANGEN BIOTECH, China) according to a standard double-stranded cDNA synthesis protocol. Quantitative real-time PCR (qRT-PCR) assays were performed using the SYBR FAST qPCR kit (KAPA SYBR®, USA) on an ABI QS3 fluorescence quantitative PCR instrument (ABI, USA). The specificity of the amplified PCR products was determined by melting curve analysis.
Primers for the target genes were designed using Primer Express 5 (Premier Biosoft, Palo Alto, CA) and are listed in Table 1. The cotton Ubiquitin7 gene (GhUBQ7, GenBank accession number: DQ116441) was used as an internal control for the assays. The expression levels of the GhANR, GhLAR and GhCHS genes in cotton were normalized to the expression level of the constitutively expressed GhUBQ7 gene.
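The text states only that target gene expression was standardized to GhUBQ7; it does not give the calculation. As a minimal sketch, assuming the commonly used 2^-ΔCt approach and an amplification efficiency of 100% (both are assumptions, not details taken from this study), the normalization could look like the following Python example. The function names and the Ct values are hypothetical.

```python
def relative_expression(ct_target, ct_reference):
    """Expression of a target gene relative to a reference gene (2^-dCt).

    Assumes 100% amplification efficiency for both primer pairs,
    which is an illustrative assumption and not stated in the source.
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


def mean_relative_expression(ct_targets, ct_references):
    """Average the relative expression over paired technical or biological replicates."""
    if len(ct_targets) != len(ct_references) or not ct_targets:
        raise ValueError("need equal, non-empty lists of Ct values")
    values = [relative_expression(t, r) for t, r in zip(ct_targets, ct_references)]
    return sum(values) / len(values)


# Hypothetical Ct values: target gene (e.g. GhANR) versus reference (GhUBQ7).
example = mean_relative_expression([20.0, 20.0, 20.0], [18.0, 18.0, 18.0])
```

A target that crosses the threshold two cycles later than the reference is expressed at one quarter of the reference level under these assumptions.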

The analysis of anthocyanin content of transgenic plants
Measurements of anthocyanidin accumulation were performed as described by Jeong et al. (2010) [48] and Wade et al. (2003) [49]. Weighed samples (approx. 100 mg) in 1.5 mL microfuge tubes were frozen in liquid nitrogen. Samples were extracted overnight in 1 mL of 0.5% (v/v) HCl in methanol and then vortexed vigorously for 30 sec. The extracts were shaken at 120 rpm in the dark for 1 hour and then centrifuged at 2,630 × g for 15 min at 20 °C. This process was repeated 3 times. The supernatant was assayed spectrophotometrically (UV-2600, Shimadzu, Japan), and anthocyanidin absorbance units (A530 - A657) per gram fresh weight were calculated. The blank consisted of 480 mL of methanol with 0.5% (v/v) HCl and 320 mL of Milli-Q H2O, for a total of 800 mL. A spectrophotometer was used for the absorbance measurements at 530, 620, and 650 nm. The results were determined based on the following
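As an illustration of the quantity described in this section (absorbance units A530 - A657 per gram fresh weight), a minimal Python sketch is given here. It is not the authors' equation: it uses only the two wavelengths named in that expression, it does not model how the 620 and 650 nm readings are used, and the function names and replicate values are hypothetical. The summary step reports mean and SD over three replicates, matching the n = 3 convention stated for the graphs.

```python
from statistics import mean, stdev


def anthocyanin_units_per_gram(a530, a657, fresh_weight_mg):
    """Absorbance units (A530 - A657) per gram fresh weight.

    Only the expression quoted in the text is implemented; any further
    correction terms are unknown and deliberately left out.
    """
    if fresh_weight_mg <= 0:
        raise ValueError("fresh weight must be positive")
    return (a530 - a657) / (fresh_weight_mg / 1000.0)


def summarize(replicates):
    """Return (mean, SD) for a list of (a530, a657, fresh_weight_mg) tuples."""
    values = [anthocyanin_units_per_gram(*r) for r in replicates]
    return mean(values), stdev(values)


# Hypothetical readings for three replicates of one sample (about 100 mg each).
avg, sd = summarize([(0.62, 0.12, 100.0), (0.60, 0.10, 100.0), (0.65, 0.15, 100.0)])
```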

Identification and phylogenetic analysis of GhCHS, GhLAR and GhANR genes
Differentially expressed genes were screened from the transcriptomes of brown cotton and its near-isogenic line [11]. Among them, the genes of the anthocyanidin biosynthesis pathway, including GhCHS, GhLAR and GhANR, were selected for further analysis in the colored fibers of G. hirsutum. In G. hirsutum, 7 GhCHS genes, 6 GhCHS-like genes, 2 GhANR genes and 3 GhLAR genes were identified (Fig. 2).
The GhCHS proteins shared high amino acid homology, and the homology of specific motifs reached 100%, indicating that the GhCHS genes are highly conserved in G. hirsutum (Fig. 2A). The GhLAR and GhANR genes were also highly conserved in G. hirsutum (Fig. 2B, C). All members of the GhCHS family except GhCHSL-2 had two domains, one N-terminal and one C-terminal, whereas GhCHSL-2 had only the N-terminal domain (Fig. 3A). GhANR1 and GhANR2 also had an N-terminal and a C-terminal domain (Fig. 3B). The GhLARs had only one N-terminal domain (Fig. 3C).

Expression pattern of GhCHS, GhLAR and GhANR in the developing fibers
The transcript levels of the 7 GhCHS genes, 6 GhCHS-like genes, 2 GhANR genes and 3 GhLAR genes were measured in developing fibers at different stages in the naturally colored cotton Zongxu1 (ZX1) and in different cotton species. Three GhCHS genes (named GhCHS1, GhCHS2 and GhCHS3) were detected in the developing fibers of ZX1. GhCHS2 was predominantly expressed: its expression level was far higher than that of GhCHS1 and GhCHS3, especially in fibers at 5 and 10 DPA (days post anthesis) (Fig. 4A).
The two GhANR genes (GhANR1 and GhANR2) were quantified in the developing fibers of ZX1, and their maximal expression level appeared in fibers at 10 DPA (Fig. 4B). The expression levels of the GhANR genes were about 10-fold higher than those of the GhLAR genes in fibers at 5 DPA and about 30-fold higher in fibers at 10 DPA. All GhLAR genes were detected in the developing fibers from 0 DPA to [...] with white cotton fibers, the transcription level of GhANR genes in brown cotton fibers was significantly higher than in white fibers. Among the 5 cotton species, GhLAR genes had the highest expression levels in the deep brown fibers of the HZ lines (Fig. 5C), significantly higher than in the naturally colored cotton ZX1 and the white fiber cottons C312 and HS2. Moreover, the expression level of GhLAR1 was significantly increased in the HZ lines (Fig. 5C). Therefore, the conserved sequences of GhANR1 and GhANR2, and of GhLAR1 and GhLAR2, were used to silence their transcripts.

Phenotypic analysis of transgenic RNAi colored cotton
The naturally colored cotton ZX1 was used to silence the endogenous GhCHS2, GhLAR and GhANR (Fig. 7). Among the 5 cotton species, the fiber color of HZ was deeper than that of the other 4 cotton species, and the fiber color of GhANRi HZ plants was obviously lighter than that of WT (ZX1) and of non-silenced HZ plants (Fig. 7D). The fiber color of GhANRi ZH plants was also obviously lighter than that of WT (ZX1) and of non-silenced ZH plants (Fig. 7E). These results indicated that GhANR and GhLAR played an important role in anthocyanin synthesis and in the accumulation of pigment in cotton fiber.

Expression analysis of GhCHS, GhANR and GhLAR in RNAi plants
In the gene-silenced ZX1 plants, the expression levels of GhANR, GhLAR and GhCHS were significantly lower than those in the WT and in the GhPDSi control plants (Fig. 8). In the GhCHSi ZX1 plants, the expression level of GhCHS2 in fibers at 5 DPA, 10 DPA and 15 DPA was significantly lower than that of the WT and CKs at all three stages, especially in developing fibers at 5 DPA (Fig. 8A). The expression level of GhLAR in the GhCHSi ZX1 plants showed no significant change (Fig. S1A; see Additional file 2). GhANR played the main role in transporting colored anthocyanidins into the fiber cell, whereas GhLAR, which transports leucoanthocyanidin in the fiber cell, could also enhance the fiber color, perhaps through polymerization and oxidation to form anthocyanin derivatives [11]. The GhLARs were preferentially expressed in the deep colored fibers of HZ plants, and the fiber color became lighter in the GhLAR-suppressed plants.
PAs (also called condensed tannins) are synthesized via a branch of the anthocyanin biosynthesis pathway catalyzed by leucoanthocyanidin reductase (LAR) and anthocyanidin reductase (ANR). LAR catalyzes the conversion of leucoanthocyanidin (flavan-3,4-diol) to catechin, while ANR catalyzes the synthesis of epicatechin from anthocyanidin [36,38,58]. Ectopic expression of the tea CsLAR gene in tobacco results in the accumulation of a higher level of epicatechin than of catechin, suggesting that LAR may be involved in the biosynthesis of epicatechin [37]. ANRs from grapevine and tea have been shown to have epimerase activity and can thus convert anthocyanidin to a mixture of epicatechin and catechin [37,59]. Further, previous engineering experiments in soybean, Arabidopsis, and petunia have redirected metabolic flux from anthocyanin biosynthesis into the isoflavone pathway, from lignin biosynthesis into the flavonoid pathway, and from flavonol biosynthesis into the anthocyanin pathway, by suppressing anthocyanin, lignin, and flavonol branchpoint genes, respectively [60-62]. Overexpression of the ANR gene from Medicago truncatula in tobacco resulted in reduced anthocyanin pigmentation in the flower and elevated PA levels [58]. These results suggested that ANR can compete with the anthocyanin biosynthesis enzyme UDP-glycose: flavonoid-3-O-glycosyltransferase (UF3GT) for the substrate anthocyanidin, so that suppression of ANR genes results in increased anthocyanin accumulation.
The Arabidopsis ANR (or BAN) knockout mutant displayed precocious accumulation of cyanic pigments in the seed coat during early seed development [63]. The accumulation was only temporary and resulted in a transparent testa (tt) phenotype with black pigmentation confined to the raphe of the dried grain [63]. This contrasts with the phenotype in soybean, where high-level suppression of ANR genes gives a red-brown grain [41]. Underlying mechanistic and metabolite differences may explain the differences in grain phenotypes between these species. In

The anthocyanidin content in the fiber directly influenced fiber color
In the transgenic RNAi cotton plants, the content of anthocyanidins was reduced by suppressing the endogenous GhANR, GhLAR and GhCHS genes, resulting in fading of the fiber color. CHS plays an important role in the phenylalanine metabolic pathway and in plant growth and development, including stress response, plant fertility and plant color [71]. LAR is a key enzyme in the synthetic pathway of plant flavonoids from phenylalanine and catalyzes the conversion of colorless leucoanthocyanidins to catechins [58,65,66]. Transcript levels of the LAR1 and ANR2 genes were significantly correlated with the contents of catechin and epicatechin, respectively, and thereby regulate PA synthesis. Ectopic expression of the apple MdLAR1 gene in tobacco suppresses expression of the late genes in the anthocyanin biosynthetic pathway, resulting in loss of anthocyanin in flowers [66].
The anthocyanidin content in the fiber and seed coat of GhLARi plants was higher than that of GhANRi plants, and the fiber color was also deeper than that of GhANRi plants, although LAR transports colorless leucoanthocyanidins into the fiber cell. In our previous research, the transcription level of GhLAR in brown cotton fibers was higher than that in white cotton fibers during fiber development; here, the fiber color of GhLARi plants faded slightly. Compared with white cotton fibers, the expression level of GhANR in brown cotton fibers was significantly higher. GhANR was actively expressed in brown cotton fibers and reached its peak at 12 DPA, when its expression level in brown cotton fibers was more than 7 times higher than that in white cotton fibers [11]. During fiber development, the GhLAR expression level in brown cotton was much lower than that of GhANR, so suppression of GhLAR had a smaller effect on fiber color than suppression of GhANR, and suppression of GhANR in ZX1 made the fiber color significantly lighter. Our NMR analyses demonstrated that the flavan-3-ols in brown and white cotton fibers were in the 2,3-cis form, but part of the proanthocyanidins in the white cotton fibers were modified by acylation. The relative percentage of prodelphinidin (PD) was similar to that of procyanidin (PC) in white cotton fibers, whereas proanthocyanidins with 90.1% PD were found in brown cotton fibers. The proanthocyanidin monomeric composition was consistent with the expression profiles of proanthocyanidin synthase genes, suggesting that ANR represented the major flow of the proanthocyanidin biosynthesis pathway in brown cotton fibers. Compared with white fibers, all of the proanthocyanidin synthase genes were expressed at a higher level in brown fibers [11]. The trans-form and cis-form of flavan-3-ols are synthesized via the leucoanthocyanidin reductase (LAR) and anthocyanidin reductase (ANR) branches, respectively [11,38,58,72].
Biochemical analyses by mass spectrometry (MS) revealed that the main PA monomers in brown cotton fibers contained three hydroxyls on the B ring (gallocatechin or epigallocatechin) [11,21,73]. PA accumulation in brown fibers starts at an early stage (5 DPA) and peaks at 30 DPA, whereas in mature brown fibers, PAs are converted to oxidized derivatives (quinones). Because developing brown fibers do not exhibit distinct coloration until maturation, the condensed quinones, rather than their PA precursors, were proposed to contribute directly to brown pigmentation in cotton fibers [11]. Therefore, the three key genes in the anthocyanin metabolic pathway played a very important role in the coloration of cotton fibers and became target genes for genetic manipulation to improve cotton fiber color.

Conclusions
In colored cotton fibers of G. hirsutum, the GhCHS2 gene was predominantly expressed in developing fibers among the 7 GhCHS and 6 GhCHS-like genes, and represented the CHS gene acting in anthocyanin metabolism in colored fibers. The 2 GhANR genes and 3 GhLAR genes were highly conserved and homologous, and were significantly expressed in the developing colored cotton fibers. The GhCHS2, GhANR and GhLAR genes were differentially expressed in colored cotton fibers of different color depth.
When the GhCHS, GhANR and GhLAR genes were silenced in colored cottons of different color depth, the expression levels of the three genes declined significantly, the anthocyanin contents of the RNAi cotton plants fell significantly along with the declining gene expression, and the fiber color changed significantly and became weaker. The three genes GhCHS, GhANR and GhLAR played a crucial part in cotton fiber color formation and are of great significance for improving the quality of naturally colored cotton and for creating new colored cotton germplasm resources through genetic engineering.

The expression analysis of GhCHS, GhLAR and GhANR genes in the developing fiber of 10 DPA in 5 cotton species. Data presented in all graphs are means ± SD (n = 3).
