Transcriptomic analysis of differentially expressed genes in an orange-pericarp mutant and wild type in pummelo (Citrus grandis)

Background The external colour of fruit is a crucial quality feature, and the external coloration of most citrus fruits is due to the accumulation of carotenoids. Knowledge of the molecular regulation of carotenoid biosynthesis and accumulation in the pericarp is limited by the lack of suitable mutants. In this work, an orange-pericarp mutant (MT) with altered pericarp pigmentation was used to identify genes potentially related to the regulation of carotenoid accumulation in the pericarp. Results High Performance Liquid Chromatography (HPLC) analysis revealed that the β-carotene content of the pericarp of MT fruits was 10.5-fold that of the Wild Type (WT). Quantitative real-time PCR (qRT-PCR) analysis showed that the expression of all downstream carotenogenic genes was lower in MT than in WT, suggesting that this down-regulation is critical for the β-carotene increase in the MT pericarp. RNA-seq analysis of the transcriptome revealed extensive changes in gene expression in the MT, with 168 genes down-regulated and 135 genes up-regulated. Gene ontology (GO) and KEGG pathway analyses indicated that seven reliable metabolic pathways are altered in the mutant, including carbon metabolism, starch and sucrose metabolism and biosynthesis of amino acids. The differential expression of transcription factors and of genes in the affected metabolic pathways, which may be involved in carotenoid regulation, was confirmed by qRT-PCR analysis in the MT pericarp. Conclusions This study has provided a global picture of the gene expression changes in a novel mutant with distinct colour in the fruit pericarp of pummelo. Interpretation of differentially expressed genes (DEGs) revealed new insight into the molecular regulation of β-carotene accumulation in the MT pericarp. Electronic supplementary material The online version of this article (doi:10.1186/s12870-015-0435-3) contains supplementary material, which is available to authorized users.


Background
Citrus is one of the most important fruit crops with great economic significance and value for humans in the world [1]. As a crucial quality feature, the external colour of citrus fruit first attracts the attention of consumers, and uniform bright coloration will enhance the fruit attractiveness and consumers' acceptance. The external and internal coloration of most citrus fruits is due to the accumulation of carotenoids [2].
Carotenoids play indispensable roles in plants as components for all photosynthetic organisms and protectors against oxidation by quenching triplet chlorophyll, singlet oxygen, and superoxide anion radicals [3]. In higher plants, carotenoids provide flowers and fruits with distinct colors, ranging from yellow to orange or red, to attract insects and animals for pollination as well as seed dispersal [4,5]. Carotenoids also serve as precursors of the phytohormones abscisic acid (ABA), strigolactones, and other signalling molecules [6][7][8]. Some carotenoids are the precursors of vitamin A that cannot be artificially synthesized and therefore are essential nutritional components for animals and humans [9]. Moreover, they also have beneficial effects on human health, including enhancement of the immune system and reduction of the risk for degenerative diseases such as cancer, cardiovascular diseases and cataract [10][11][12]. Today, carotenoids are extensively used in health and nutritional products as important micronutrients [10].
Carotenoids are naturally synthesized in chloroplasts and chromoplasts by enzymes that are nuclear encoded [13]. In higher plants, structural genes of the carotenoid biosynthesis pathway have been isolated and characterized [14][15][16][17][18]. The first committed step of carotenoid biosynthesis is the head-to-head condensation of two molecules of a C20 precursor, geranylgeranyl pyrophosphate (GGPP), to form colourless phytoene, catalyzed by phytoene synthase (PSY). Next, phytoene is converted into red lycopene by four desaturation reactions (catalyzed by phytoene desaturase, PDS, and ζ-carotene desaturase, ZDS) and/or by two isomerization reactions mediated by carotene isomerase (CRTISO) and 15-cis-ζ-carotene isomerase (ZISO). Then, the lycopene flux branches into two pathways via cyclization reactions. Lycopene β-cyclase (LCYb) adds two β-rings to the ends of the lycopene molecule to form β-carotene, while the combined action of LCYb and lycopene ε-cyclase (LCYe) generates α-carotene, with one β-ring and one ε-ring. Subsequently, α-carotene is converted into lutein by hydroxylations catalyzed by ε-ring hydroxylase and β-ring hydroxylase (BCH). Zeaxanthin and violaxanthin are generated from β-carotene by hydroxylation reactions catalyzed by BCH and epoxidation catalyzed by zeaxanthin epoxidase (ZEP). The plant hormone ABA is an end product of the carotenoid biosynthetic pathway, generated through the cleavage of 9-cis-epoxycarotenoids by 9-cis-epoxycarotenoid dioxygenases (NCEDs). Carotenoid cleavage dioxygenases (CCDs) cleave carotenoids into apocarotenoids at different double-bond positions. In the last decade, owing to the importance of carotenoids, many efforts have been made to understand the molecular basis of the regulation of carotenoid biosynthesis and accumulation.
Citrus is a complex source of carotenoids, with the largest number of carotenoid species found in any one fruit [19]. More than 115 different carotenoids have been identified in the pericarp and pulp of citrus, including lycopene, β-carotene, β-cryptoxanthin, zeaxanthin, and violaxanthin [20]. Because of the large diversity of carotenoid patterns, citrus has become an important model species for studies on plant carotenoid metabolism [19,21], such as analyses of carotenoid composition and content and of the expression of the main carotenoid biosynthetic genes [22][23][24][25][26]. Mutants with alterations in the carotenoid biosynthetic pathway have proven to be useful experimental materials for identifying the molecular mechanisms regulating the process [27]. In the past few years, many pulp mutants have been identified in grapefruit (Citrus paradisi) and orange (Citrus sinensis), such as Red marsh, Shara, Cara Cara, and Hong Anliu [28][29][30][31][32], and these mutants have been used to study the complex regulatory mechanism of carotenoid biosynthesis at the gene and/or protein expression level [33][34][35][36][37], facilitating the understanding of the carotenoid regulation mechanism in the pulp of citrus [38][39][40][41]. Owing to the lack of mutants affected in the pericarp, the carotenoid regulation mechanism has been studied less in the pericarp than in the pulp of citrus. Recently, an orange-pericarp mutant (MT) originating from Guanxi pummelo was discovered in China, providing a potential material for studying this regulation mechanism.
In this study, we investigated the composition and level of carotenoids and the expression of carotenoid biosynthetic genes in the pericarps of MT and wild type (WT) in the ripe stage. From the whole genome perspective, the differentially expressed genes (DEGs) in MT and WT were identified using the RNA-seq technology. The identified genes provide useful information for studying the molecular mechanism of carotenoid biosynthesis in citrus pericarp.

β-carotene is significantly accumulated in the MT
The pummelo MT was originally found in an orchard in Zhangzhou (Fujian, China) in the 2010s as a spontaneous bud mutation of the commercial variety 'Guanxi' pummelo. The most obvious phenotypic change in the MT is the orange colour of the pericarp, which contrasts sharply with the pale yellow colour of the mature pericarp of the WT fruit (Figure 1A, B). The orange-pericarp mutant was propagated by grafting onto different rootstocks and retained the stable phenotype of the orange-coloured pericarp under field conditions, and no reversion to the parental phenotype has been observed so far. Moreover, 73 pairs of Simple Sequence Repeat (SSR) markers were used to evaluate the genetic background of the mutant. All the SSR patterns were the same between MT and WT (Additional file 1), indicating that the two genotypes share an identical genetic background.
To characterize the phenotype differences between MT and WT, the carotenoid composition and content of mature fruits were analysed by High Performance Liquid Chromatography (HPLC). The most obvious difference in carotenoid between MT and WT pericarps was β-carotene content ( Figure 1C, D). The β-carotene content of MT was about 10.5-fold that of the WT, accounting for 90.0% of the total identified carotenoids in MT. Additionally, the total carotenoid concentration of MT was 7.9-fold that of WT. Moreover, the concentrations of lutein, violaxanthin, α-carotene and β-cryptoxanthin were higher in MT than in WT. However, in the MT and WT pulps, the carotenoid species and content were similar to each other (Additional file 2).

Three carotenogenic genes involved in β-carotene degradation are significantly down-regulated in the MT
Firstly, we compared the sequence information of the carotenoid biosynthetic genes in MT and WT and isolated full-length cDNAs, including ggps, psy, pds, crtiso, lcyb, lcye, lcy2b, ccd4c, bch, nced2 and nced3. The result showed that the sequences were 100% identical between MT and WT (Additional file 3). These 11 sequences were submitted to GenBank with accession numbers from KP462725 to KP462735. Then, the effect of the mutation on carotenogenic gene expression was examined by quantitative real-time PCR (qRT-PCR) using the probes of pummelo cDNAs encoding GGPS, PSY, PDS, ZDS, CRTISO, LCYb, LCYe, LCY2b, CCD1, CCD4a, CCD4c, BCH, NCED2, NCED3 and ZEP (Figure 2). The expression levels of upstream carotenogenic genes (ggps, zds and crtiso) in MT and WT were almost the same. However, the expression levels of psy, pds and lcy2b were much higher in WT than in MT. The expression level of all downstream carotenogenic genes was lower in MT than in WT. In particular, ccd1, bch and nced2 showed significantly reduced transcript levels in the MT pericarp.

RNA-seq and global detection of DEGs
Solexa/Illumina RNA-Seq analysis was performed to identify the genes involved in the regulation of carotenoid biosynthesis in pummelo pericarp. Six libraries were constructed and sequenced, including three biological replicates for WT (termed WT1, WT2 and WT3) and three biological replicates for MT (termed MT1, MT2 and MT3). The major characteristics of these six libraries are summarized in Table 1. A sequencing depth of over thirteen million raw tags was obtained for each of the six libraries, with the number of raw tags ranging from 13,520,581 to 16,301,802. After filtration, we obtained a total of 13,347,784 (WT1), 14,532,229 (WT2) and 15,027,468 (WT3) clean tags for the WT RNA-Seq libraries and 16,084,513 (MT1), 14,223,118 (MT2) and 14,025,066 (MT3) clean tags for the MT RNA-Seq libraries, with the clean tags accounting for more than 98% of the total. The clean tags were then mapped to the sweet orange genome [42]. These reads were deposited in the NCBI GEO database under accession no. GSE64764. In the MT and WT samples, 76.0% (MT1), 76.5% (MT2), 76.4% (MT3), 75.9% (WT1), 76.4% (WT2) and 75.4% (WT3) of the clean tags from the RNA-Seq data were mapped uniquely to the genome, while a small proportion of them were mapped to multiple locations in the genome (Table 2).
Differentially expressed tags in the samples were identified by calculating the number of unambiguous tags for each gene and then normalizing this to the number of reads per kilobase of exon model per million mapped reads (RPKM). All the uniquely mapped reads were used for calculating the RPKM values of the genes. Genes within the RPKM range of 0-3 were considered to be expressed at a low level; genes within the RPKM range of 3-15 were considered to be expressed at a medium level; and genes with an RPKM value above 15 were considered to be expressed at a high level [43]. Low-level expressed genes accounted for the highest percentage in both MT and WT. The DEGs in the MT samples were identified at padj < 0.05, yielding a total of 303 significant DEGs, with 135 up-regulated and 168 down-regulated (Additional file 4). The details of these genes are listed in Additional file 5.
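The RPKM normalization and the three expression bins described above can be illustrated with a minimal Python sketch. The function names, the example counts and the treatment of the bin boundaries (3 assigned to "medium", 15 assigned to "medium") are assumptions made for illustration; the text does not state how boundary values were handled, and this is not the authors' pipeline.

```python
def rpkm(read_count, exon_length_bp, total_mapped_reads):
    """Reads per kilobase of exon model per million mapped reads."""
    if exon_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("exon length and library size must be positive")
    # reads / (length in kb) / (mapped reads in millions)
    return read_count * 1e9 / (exon_length_bp * total_mapped_reads)


def expression_level(value):
    """Bin an RPKM value as low (0-3), medium (3-15) or high (>15).

    Boundary handling is an assumption: 3 and 15 both fall in 'medium'.
    """
    if value < 3:
        return "low"
    if value <= 15:
        return "medium"
    return "high"


# Hypothetical gene: 500 uniquely mapped reads, a 2,000 bp exon model,
# and a library of 10 million uniquely mapped reads.
example = rpkm(500, 2000, 10_000_000)
print(example, expression_level(example))
```

Because RPKM divides by both gene length and library size, values are comparable between genes of different lengths and between libraries of different depths, which is why the same thresholds can be applied to all six libraries.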

Annotation of DEGs in MT and WT
These DEGs may be involved in different functions. Gene ontology (GO) is an international standardized gene functional classification system that describes the properties of genes and their products in any organism. To understand the functions of the 303 DEGs, we mapped them to the three GO ontologies, including molecular function, cellular component, and biological process (Figure 3). According to cellular component, the most abundant DEGs were involved in "membrane" (9.2%), "cell" (5.3%) and "cell part" (5.3%). From the perspective of biological process, the DEGs were involved in "metabolic process" (28.4%), "cellular process" (20.8%), "organic substance metabolic process" (18.5%), "primary metabolic process" (17.8%) and "cellular metabolic process" (13.9%). In terms of molecular function, the genes were dominant in "catalytic activity" (31.4%), "binding" (24.4%), "ion binding" (15.5%), "heterocyclic compound binding" (13.5%) and "organic cyclic compound binding" (13.5%). In addition, the whole genome background was examined by GO category enrichment analysis (P-value ≤ 0.05). Three cellular component terms were significantly enriched in the up-regulated genes, including microtubule cytoskeleton, cytoskeletal part and cytoskeleton. To further understand the biological functions of these genes, KEGG (http://www.genome.jp/kegg/) ontology assignments were used to classify their functional annotations. All the 303 DEGs were assigned to 52 KEGG pathways. Among the pathways, carbon metabolism, starch and sucrose metabolism, biosynthesis of amino acids, and a few others were highly represented (Table 3).
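The idea behind a GO category enrichment test can be sketched with a plain hypergeometric test in Python. This is a simplified stand-in: the study used the GOseq package, which additionally accounts for gene-length bias, and this sketch does not. The function name and all counts below are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(background, annotated, n_degs, degs_annotated):
    """One-sided P(X >= degs_annotated) under the hypergeometric distribution.

    background     : number of genes in the genome background
    annotated      : background genes carrying the GO term
    n_degs         : number of differentially expressed genes
    degs_annotated : DEGs carrying the GO term
    """
    upper = min(annotated, n_degs)
    total = comb(background, n_degs)
    tail = sum(
        comb(annotated, i) * comb(background - annotated, n_degs - i)
        for i in range(degs_annotated, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20,000 background genes, 200 annotated to a term,
# 303 DEGs of which 10 carry the term.
p_value = hypergeom_enrichment_p(20_000, 200, 303, 10)
print(f"enrichment P-value: {p_value:.3g}")
```

A small P-value means the term appears among the DEGs more often than expected from its frequency in the background; in practice such P-values are then corrected for the number of terms tested.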

Verification of DEGs
A total of 22 DEGs were selected for qRT-PCR verification. Among them, 10 were differentially expressed transcription factors. The other 12 genes belonged to the affected pathways, including sugar metabolism and amino acid metabolism. The results showed that the expression of 19 out of the 22 DEGs in MT and WT was consistent with the RNA-seq data (Figure 4). Linear regression [(RNA-seq value) = a(qRT-PCR value) + b] analysis of these 19 DEGs showed an overall correlation coefficient of 0.78, indicating a good correlation between the transcription profile revealed by the RNA-seq data and the transcript abundance assayed by qRT-PCR (Additional file 6). These results confirmed the reliability of the RNA-seq data.
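The regression (RNA-seq value) = a(qRT-PCR value) + b and its correlation coefficient can be computed as in the following minimal Python sketch. The function name and the input values are hypothetical and are not the measurements reported in Additional file 6.

```python
from math import sqrt


def linear_fit(x, y):
    """Least-squares slope a, intercept b and Pearson r for y = a*x + b."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each contain at least two distinct values")
    a = sxy / sxx
    b = mean_y - a * mean_x
    r = sxy / sqrt(sxx * syy)
    return a, b, r


# Hypothetical log2 fold changes (MT vs WT) for five genes.
qrt_pcr = [1.8, -0.9, 2.4, -1.5, 0.7]
rna_seq = [1.5, -1.2, 2.9, -1.1, 0.4]
a, b, r = linear_fit(qrt_pcr, rna_seq)
print(f"a = {a:.2f}, b = {b:.2f}, r = {r:.2f}")
```

A slope near 1 and a high r would indicate that the two methods agree in both direction and magnitude of change; r alone only measures how well the points follow a line.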

Changes in fruit soluble sugar, amino acid, and fatty acid content
Considering the significant expression changes in a number of MT genes implicated in starch and sucrose metabolism as well as in the biosynthesis of amino acids and fatty acids, the content of these metabolites was determined by GC-MS analysis (Table 4). The results showed that the content of most sugars, such as sucrose, glucose, fructose and mannose, was lower in MT than in WT. Additionally, the MT pericarp, when compared with the WT pericarp, showed a decrease in the levels of four amino acids (proline, serine, threonine and GABA), but an increase in the levels of another four amino acids (lysine, valine, asparagine and aspartic acid). Interestingly, we detected a measurable amount of asparagine in MT but only a trace in WT. We also detected four fatty acids in WT and MT pericarps. The content of octadecanoic acid and hexadecanoic acid was significantly lower in the MT pericarp than in the WT pericarp.

Discussion
The mutant used in this study is derived from a spontaneous mutation in Guanxi pummelo, and the mutation confers a novel phenotype that is regulated in a fruit-specific pattern, with the pericarp exhibiting an obvious orange colour. The distinctive orange colour of the mutant pericarp has clearly been shown to be due to the massive accumulation of β-carotene. The β-carotene accumulation induced by the mutation also leads to an obvious increase in total carotenoids in the MT. In the past few years, many citrus carotenoid mutants have been discovered, but almost all of them show the red-fleshed phenotype and have been shown to accumulate lycopene abnormally. Therefore, the pummelo MT identified in this study is a special material for the study of citrus carotenoid regulation, particularly for the investigation of pigmentation regulation in the pericarp. Previous studies on carotenoid biosynthesis in a red-fleshed mutant concluded that the induction of lycopene accumulation coincided with increased expression of upstream carotenogenic genes and reduced expression of genes downstream of lycopene synthesis [30]. We hypothesized that the mechanism regulating β-carotene accumulation would be similar to that of lycopene. As expected, the downstream genes involved in β-carotene degradation in the carotenoid biosynthetic pathway (ccd1, ccd4a, ccd4c, bch, nced2, nced3 and zep) exhibited a decreased expression level in MT. Previous studies in potato tubers found that silencing the bch gene can significantly enhance β-carotene levels [44,45]. In maize, the bch alleles associated with reduced transcript expression also correlate with higher β-carotene concentrations [46]. In our research, the expression of bch in the WT was 1.58-fold that in the MT, indicating that the significantly reduced expression of bch may result in the accumulation of β-carotene in the MT pericarp. 
However, our analyses failed to find a dramatically increased expression of upstream carotenogenic genes in MT when compared with WT.
The genes encoding three key enzymes for β-carotene accumulation (psy, lcy2b and lcyb) exhibited an obvious decrease in expression in MT. These results implied that the mutation exerted a major effect on β-carotene accumulation via the down-regulation of downstream genes, especially bch.
To understand the potential mechanisms involved in the regulation of carotenoid biosynthesis in the citrus pericarp, we used the RNA-seq approach to investigate the transcriptome profiles in MT and WT. Our analysis showed that a total of 303 genes had altered expression patterns. Similar results have been reported in several studies on mutant-progenitor pairs [33,36,37]. GO analysis of annotated genes revealed that most of the DEGs were involved in catalytic activity and metabolic process (Figure 3). This is expected, because carotenoid biosynthesis, which belongs to secondary metabolism, is a dynamic and complex process catalyzed by a series of enzymes. Functional category analysis revealed that the DEGs are involved in a number of important pathways (Table 3), such as the metabolic pathways, which is consistent with the GO results that large numbers of genes are implicated in catalytic activity and metabolic process. The most noticeable pathways are carbon metabolism, starch and sucrose metabolism and biosynthesis of amino acids. Key genes in sucrose and starch metabolism, including alpha-1,4 glucan phosphorylase (Cs6g22020), pectinesterase 3 (Cs1g16550), sucrose-phosphate synthase 4 (Cs5g19060) and pectinesterase 2 (orange1.1t00214), were differentially expressed in WT and MT pericarps, indicating that sucrose and starch metabolism was significantly affected in MT. For example, alpha-1,4 glucan phosphorylase, which is involved in sucrose degradation, was up-regulated and sucrose-phosphate synthase 4, which is involved in sucrose accumulation, was down-regulated in MT, indicating an acceleration of sucrose degradation. Our gas chromatography-mass spectrometry (GC-MS) analysis also supported the conclusion that sucrose degradation in the pericarp is higher in MT than in WT (Table 4). Moreover, the content of most sugars was significantly decreased in MT, suggesting that the precursors for glycolysis were increased by the accelerated degradation of sugars. 
Previous reports have also shown that β-carotene synthesis is tightly linked to carbon metabolism [47,48]. Five genes involved in carbon metabolism were differentially expressed in MT and WT in our results. The expression of one gene encoding glyceraldehyde-3-phosphate dehydrogenase (Cs2g14940) was significantly increased (2.9-fold) in MT. This gene, catalyzing the conversion of glycerate 3-phosphate to glyceraldehyde 3-phosphate, is important for glycolysis, which is consistent with the above speculation that glycolysis was increased in MT. The present research also found that the expression of five genes involved in amino acid biosynthesis was significantly changed in MT, which is in line with our GC-MS analysis showing that the content of amino acids differed significantly between MT and WT. A similar result was also observed in carotenoid-enhanced transgenic tomato fruits [49]. Interestingly, our research found that asparagine was the most affected amino acid. Compared to WT, the content of asparagine increased 8.85-fold in the carotenoid-enhanced transgenic tomato fruits. These data indicated that the content of asparagine was strongly correlated with carotenoid accumulation. In order to identify potential candidate genes involved in the regulation of carotenoid biosynthesis, we also analysed the top 10 most differentially expressed genes in MT and WT (Additional file 7). Among them, two genes were involved in fatty acid metabolism. The expression of one gene encoding fatty acyl-CoA reductase 3 (Cs8g15220), which is important for fatty acid biosynthesis, was significantly reduced in the MT. The expression of the other gene, encoding GDSL esterase/lipase (Cs2g04220), which is involved in fatty acid degradation, was significantly increased in the MT. The altered expression of these two genes indicated a decrease of the fatty acid content in MT, which was consistent with our GC-MS analysis result that the contents of octadecanoic acid and hexadecanoic acid were lower in MT than in WT (Table 4). 
The biosynthesis of carotenoids and fatty acids requires common precursors derived from pyruvate [50]. We concluded that these two genes may play an important role in the regulation of carotenoid metabolism. We also found that the expression of one gene belonging to the cytochrome P450 family (Cs6g20050) was significantly increased in MT. Cytochrome P450 enzymes catalyze various reactions in the plant biosynthesis of secondary metabolites, including carotenoids [51,52]. Cytochrome P450 hemoproteins, which catalyze NADPH- and O2-dependent hydroxylation reactions, were postulated to also be able to use hydrocarbon carotenes as substrates [53].
Transcription factors are the key switches for secondary metabolite gene regulation [54]. In the present study, twelve genes encoding transcription factors were identified by RNA-Seq analysis (Additional file 8). Among the group of transcription factors, we identified three genes belonging to the MYB family of transcription factors (Cs3g02020, Cs3g23070 and orange1.1t01787). Previous studies on carotenoid mutants also identified a number of MYB transcription factors [34,35]. The superfamily of MYB transcription factors has been shown to control many biological processes, primarily anthocyanin biosynthesis [55,56]. Overexpression of a Vitis vinifera R2R3-MYB transcription factor (MYB5b) in tomato resulted in an increased content of β-carotene [57]. These results indicated that the MYB genes may be involved in regulating carotenoid biosynthesis. We also detected two significantly differentially expressed NAC transcription factors. NAC proteins constitute one of the largest families of plant-specific transcription factors [58]. Genes from this family participate in various biological processes including developmental programs, defence, and biotic and abiotic stress responses [59,60]. Recently, a NAC transcription factor (SlNAC4) has been shown to be a positive regulator of carotenoid accumulation [61]. In this study, both of the identified NAC transcription factors showed down-regulated expression in MT samples, suggesting that both of them may play a feedback regulatory role in carotenoid biosynthesis. Ethylene plays a key regulatory role in fruit ripening and carotenoid accumulation [62]. Our results showed that the ethylene-responsive transcription factor (RAP2-7) was highly expressed in MT. In this study, we also identified several other significantly differentially expressed transcription factors, such as WRKY (Cs2g25560), BHLH (Cs8g03200) and MUTE (Cs9g06130).

Conclusions
This is the first investigation of the biochemical and molecular alterations associated with an orange-pericarp fruit mutation in pummelo. In this study, the content of carotenoids and the expression patterns of carotenoid biosynthetic genes in the pericarps were comparatively analysed for the pummelo MT and its WT. We used RNA-seq to identify the differentially expressed genes in the MT by comparison with the WT. GO analysis and pathway mapping of the DEGs provide significant insight into the underlying molecular mechanisms governing β-carotene accumulation. Critical genes and pathways involved in carbon metabolism, starch and sucrose metabolism and biosynthesis of amino acids were associated with β-carotene accumulation. The results suggest that the considerable β-carotene accumulation is due to a down-regulation of downstream genes for β-carotene degradation. Moreover, several candidate genes and transcription factors that possibly regulate carotenoid biosynthesis in the pericarp of pummelo were also identified. However, the functions of these genes remain to be elucidated in the future. The overall findings from this study facilitate the understanding of the molecular regulation of β-carotene accumulation in the pummelo mutant strain and provide useful information for further related studies.

Plant materials and RNA extraction
The materials used in this study were 'Guanxi' pummelo and its MT cultivated in the city of Zhangzhou, Fujian province, China. The samples were harvested at the ripe stage with three biological replicates. After separation from the fruits, the pericarps were immediately frozen in liquid nitrogen and kept at −80°C until further use. Total RNA was extracted from the pericarps of WT and MT as previously described [30]. The quality of the RNA was assessed by 1% agarose gel electrophoresis coupled with a NanoPhotometer® spectrophotometer (IMPLEN, CA, USA). RNA concentration was measured using the Qubit® RNA Assay Kit in a Qubit® 2.0 Fluorometer (Life Technologies, USA). RNA integrity was confirmed using a 2100 Bioanalyzer (Agilent Technologies) with a minimum RNA integrity number (RIN) value of 8.0.

Carotenoid content measurement
Carotenoid extraction and quantification were performed as previously described with modification [30]. Carotenoids were analyzed by reversed-phase HPLC. Chromatography was carried out with a Waters liquid chromatography system equipped with a model 600E solvent delivery system, a model 2996 photodiode array detection (PAD) system, a model 717 plus autosampler, and an Empower Chromatography Manager. Carotenoids were eluted with MeOH-acetonitrile [75:25 v/v, eluent A] and MTBE [eluent B] using a C30 carotenoid column (15 × 4.6 mm; YMC, Japan). Carotenoids were identified by their characteristic absorption spectra, typical retention time, and comparison with authentic standards (Bern, Switzerland).

RNA-seq library preparation and sequencing
Sequencing libraries were constructed using three biological replicates each for WT and MT pericarps, which were named WT1, WT2, WT3, MT1, MT2 and MT3, respectively. A total amount of 3 μg RNA per sample was used as input material for the RNA sample preparation. Sequencing libraries were generated using the NEBNext® Ultra™ RNA Library Prep Kit for Illumina® (NEB, USA) following the manufacturer's recommendations, and index codes were added to attribute sequences to each sample. Briefly, mRNA was purified from total RNA using poly-T oligo-attached magnetic beads. Fragmentation was carried out using divalent cations under elevated temperature in NEBNext First Strand Synthesis Reaction Buffer (5×). First strand cDNA was synthesized using random hexamer primer and M-MuLV Reverse Transcriptase (RNase H-). Second strand cDNA synthesis was subsequently performed using DNA polymerase I and RNase H. Remaining overhangs were converted into blunt ends via exonuclease/polymerase activities. After adenylation of the 3′ ends of DNA fragments, NEBNext Adaptor with hairpin loop structure was ligated before hybridization. To preferentially select cDNA fragments of 150-200 bp in length, the library fragments were purified with the AMPure XP system (Beckman Coulter, Beverly, USA). Then 3 μl USER Enzyme (NEB, USA) was used with size-selected, adaptor-ligated cDNA at 37°C for 15 min followed by 5 min at 95°C before PCR. The PCR was performed with Phusion High-Fidelity DNA polymerase, Universal PCR primers and Index (X) Primer. Finally, PCR products were purified (AMPure XP system) and library quality was assessed on the Agilent Bioanalyzer 2100 system. The clustering of the index-coded samples was performed on a cBot Cluster Generation System using the TruSeq SR Cluster Kit v3-cBot-HS (Illumina) according to the manufacturer's instructions. After cluster generation, the library preparations were sequenced on an Illumina HiSeq 2000 platform and 100 bp single-end reads were generated.

Data analysis
Raw sequence reads were first processed using an in-house Perl script. In this step, clean data were obtained by removing reads containing adaptors only, reads with more than 10% unknown bases and reads with a quality score of less than 5.0 for more than half of the bases. Meanwhile, the Q20, Q30 and GC content of the clean data were calculated. All the downstream analyses were based on these high-quality clean data. For annotation, all clean tags were mapped to the reference sequence of the sweet orange genome [42]. Mismatches of no more than two bases were allowed in the alignment. The remaining clean tags were designated as unambiguous clean tags. The RPKM method was used to estimate the unique gene expression levels [63]. Reference genome and gene model annotation files were downloaded directly from the genome website (http://citrus.hzau.edu.cn/orange/index.php). The index of the reference genome was built using Bowtie v2.0.6 (Broad Institute, Cambridge, MA, USA) and single-end clean reads were aligned to the reference genome using TopHat v2.0.9 (Broad Institute). TopHat was selected as the mapping tool because it can generate a database of splice junctions based on the gene model annotation file and thus give a better mapping result than non-splice mapping tools. Differential expression analysis of the two groups (three biological replicates each) was performed using the DESeq R package (1.10.1) [64]. DESeq provides statistical routines for determining differential expression in digital gene expression data using a model based on the negative binomial distribution. The resulting P-values were adjusted using the Benjamini and Hochberg approach for controlling the false discovery rate. Genes with an adjusted P-value < 0.05 found by DESeq were considered differentially expressed. GO enrichment analysis of DEGs was implemented by the GOseq R package. GO terms with a corrected P-value < 0.05 were considered significantly enriched by differentially expressed genes. The statistical enrichment of the differentially expressed genes in KEGG pathways was tested using the KO-Based Annotation System (KOBAS) software.
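The Benjamini and Hochberg adjustment used to obtain the adjusted P-values (padj) can be written out as a short Python sketch. This is a standalone illustration of the standard procedure, not the DESeq code itself; the function name and the example P-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    m = len(pvalues)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P-values for four genes.
raw = [0.01, 0.04, 0.03, 0.005]
padj = benjamini_hochberg(raw)
significant = [i for i, p in enumerate(padj) if p < 0.05]
print(padj, significant)
```

Each raw P-value is scaled by the number of tests divided by its rank, and the running minimum keeps the adjusted values non-decreasing with rank, so a threshold of padj < 0.05 controls the expected proportion of false discoveries among the genes called differentially expressed.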

qRT-PCR analysis
To validate the RNA-Seq results and provide more information on the affected metabolic processes, 22 selected DEGs corresponding to the metabolic pathways and transcription factors were verified by qRT-PCR. Actin was amplified along with the target gene as an endogenous control to normalize expression between different samples. Primer sequences used for qRT-PCR are listed in Additional file 9. Samples collected in a different year from those used for the RNA-seq analysis were used for qRT-PCR validation. One μg of total RNA from each sample was used to synthesize the first strand cDNA using the PrimeScript Reverse Transcriptase Kit (TaKaRa) according to the protocol of the manufacturer. The qRT-PCR was carried out in an ABI PRISM® 9600 Sequence Detection System (Applied Biosystems) using SYBR Green Supermix according to the manufacturer's instructions, under the thermal cycle conditions of an initial denaturation at 94°C for 10 min, followed by 40 cycles of 94°C for 15 s and 60°C for 31 s for annealing, and a final extension step at 72°C for 7 min. The expression level of genes was calculated by the delta-delta-Ct method [65]. Each biological sample was examined in duplicate with two to three technical replicates.
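The delta-delta-Ct calculation with actin as the endogenous control can be sketched in Python as follows. The sketch assumes an amplification efficiency of about 100% (a doubling per cycle), which is the usual assumption of the method; the function name and the Ct values are hypothetical.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the delta-delta-Ct method.

    'sample' is the tested genotype (e.g. MT), 'control' the calibrator
    (e.g. WT), and 'ref' the endogenous control gene (e.g. actin).
    Assumes the amount of product doubles in every cycle.
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_sample - delta_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target needs one more cycle in MT than in WT,
# while actin is unchanged, so expression in MT is half that in WT.
fold = relative_expression(26.0, 18.0, 25.0, 18.0)
print(fold)
```

Subtracting the reference-gene Ct first removes differences in cDNA input between samples, so the remaining difference reflects the change in the target transcript itself.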

Determination of the sugar, amino acid and fatty acid content in the pericarp
The extraction and derivatization of sugars, amino acids and fatty acids were performed as previously described with modification [66]. A 200 mg sample was added to the extracting solution containing 2,700 μl of methanol and 300 μl of 0.2 mg ml⁻¹ ribitol in water as a quantification internal standard. Each sample (1 μl) was injected into the GC system through a fused-silica capillary column with a DB-5 MS stationary phase (30 m × 0.25 mm i.d., 0.25 μm). Total ion current (TIC) spectra were recorded in the mass range of 45-600 atomic mass units (amu) in scanning mode.