Transcriptome analysis of nitrogen-starvation-responsive genes in rice

Background Nitrogen (N), a critical macronutrient for plant growth and development, is a major limiting factor in most agricultural systems. Microarray analyses have been conducted to investigate genome-wide gene expression in response to changes in N concentrations. Although RNA-Seq analysis can provide a more precise determination of transcript levels, it has not previously been employed to investigate the expression of N-starvation-induced genes. Results We constructed cDNA libraries from leaf sheaths and roots of rice plants grown under N-deficient or -sufficient conditions for 12 h. Sequencing the libraries resulted in identification of 33,782 annotated genes. A comparison of abundances revealed 1,650 transcripts that were differentially expressed (fold-change ≥ 2) due to an N-deficiency. Among them, 1,158 were differentially expressed in the leaf sheaths (548 up-regulated and 610 down-regulated) and 492 in the roots (276 up, 216 down). Among the 36 deficiency-induced genes first identified via RNA-Seq analyses, 34 were subsequently confirmed by qRT-PCR. Our RNA-Seq data identified 8,509 multi-exonic genes with 19,628 alternative splicing events. However, we saw no significant difference in alternative splicing between N-sufficient and -deficient conditions. We found 2,986 novel transcripts, of which 192 were regulated under the N-deficiency. Conclusion We identified 1,650 genes that were differentially expressed after 12 h of N-starvation. Responses by those genes to a limited supply of N were confirmed by RT-PCR and GUS assays. Our results provide valuable information about N-starvation-responsive genes and will be useful when investigating the signal transduction pathway of N-utilization. Electronic supplementary material The online version of this article (doi:10.1186/s12870-015-0425-5) contains supplementary material, which is available to authorized users.


Background
The macronutrient nitrogen (N) is an essential component of numerous important compounds, including amino acids, proteins, nucleic acids, chlorophyll, and some plant hormones. This element is a major limiting factor in most agricultural systems. Because the Nutilization efficiency strongly influences crop productivity, a vast amount of N fertilizers is applied to maximize yields. However, it is estimated that more than half of that N is lost from the plant-soil system, with unused N fertilizers severely polluting the environment [1]. Thus, N-uptake efficiency must be increased to improve productivity and reduce pollution.
During periods of N-starvation, various deficiencyresponsive genes function to support plant survival by increasing the level of chlorophyll synthesis [2], altering root architecture [3], improving N-assimilation [4], enhancing lignin content [5], and changing the amounts of sugars and sugar phosphates [6]. Nitrate transporter genes (NRTs) are responsible for the high-affinity NO 3 − transport system and stimulate lateral root growth. Arabidopsis NRT2.1 plays a major role in NO 3 − uptake and determines root architecture by controlling lateral root formation [7]. The ammonia transporter gene AtAmt1.1, which is highly expressed in the roots, also restructures this architecture under limited-N conditions [8]. The plant-specific Dof1 transcription factor (TF) from maize also functions to increase N-assimilation [9]. In Dof1-overexpressing Arabidopsis plants, genes are upregulated under N-starvation to encode enzymes for carbon skeleton production [9]. Those transgenic plants also show markedly elevated amino acid contents, reduced levels of glucose, and improved growth during periods of N-deficient stress [9]. Overexpression of glutaminesynthetase1 in tobacco and maize is associated with significant gains in plant heights, dry weights, and kernel numbers [10,11]. Overexpression of NADH-glutamatesynthase in rice and alanine aminotransferase in canola and rice also causes increases in grain weights [12] and biomass [13,14]. An early nodulin gene, OsE-NOD93-1, responds to both increases and reductions in N supplies. Furthermore, transgenic rice plants overexpressing OsENOD93-1 have greater shoot dry biomass and seed yields [15].
Microarray analyses have been conducted to investigate genome-wide gene expression in response to changes in N conditions. Wang et al. [16] studied gene responses in Arabidopsis plants that were first grown for 10 d with ammonium as the sole N source, then treated with 250 mM nitrate for 20 min. That analysis identified 1,176 nitrate-responsive genes in the roots and 183 in the shoots. Peng et al. [17] monitored expression profiles from Arabidopsis plants grown under nitratelimiting or -sufficient conditions. There, N-starvation altered transcript levels for 629 genes, with 340 being upregulated and 289 down-regulated. Palenchar et al. [18] identified over 300 genes regulated by interactions between carbon and N signaling in Arabidopsis. Bi et al. [19] detected differential expression of genes under mild or severe chronic N stress. Plant responses were much more pronounced under severe conditions.
With 'Minghui 63' rice, Lian et al. [20] applied EST microarrays to examine expression profiles under low-N stress. In seedling roots, 473 responsive genes were identified, with 115 being up-regulated and 358 downregulated. Beatty et al. [21] generated transgenic rice plants that overexpress alanine aminotransferase. Comparisons of transcriptomes between the transgenic plants and controls revealed that 0.11% and 0.07% of those genes were differentially regulated in the roots and shoots, respectively. Cai et al. [22] analyzed the dynamics of the rice transcriptome at 1 h, 24 h, and 7 d after N-starvation treatment. In all, 3,518 genes were identified, with most being transiently responsive to such stress.
Xu et al. [23] performed a genome-wide investigation to detect miRNAs that responded to either chronic or transient nitrate-limiting conditions in maize. They identified miRNAs showing overlapping or unique responses as well as those that were tissue-specific. Humbert et al. [24] reported that the concomitant presence of N and a water deficit affected expression much more than was anticipated in maize. This research group also revealed how the interaction between those two stresses shaped patterns of expression at different levels of water stress as well as during the recovery period. Finally, Brouillette and Donovan [25] identified five genes that had markedly different responses to nitrogen limitations in Helianthus anomalus when compared with H. petiolaris and H. annuus.
Although microarray analyses have been extensively used for the past few decades, RNA-Seq analysis can more precisely measure transcript levels and allow for the absolute quantification of gene expression. However, RNA-Seq has not previously been employed to investigate N-deficiency-induced genes. Here, we report transcriptome profiles for 1,650 N-starvation-responsive genes from rice for which expression was altered in the roots or shoots due to an N-limitation.

RNA-Seq analysis of N-deficiency stress-responsive genes
Through microarray analyses, early-responsive genes have been detected in rice roots but not in leaves when sampled after 20 min, 1 h, and 2 h of N-starvation [20,22]. Cai et al. have monitored such genes after longterm (1-and 7-d) treatments with limited-N [22].
To identify additional responsive genes, we transferred rice seedlings at the six-leaf stage to an N-deficient hydroponic solution. Leaf sheaths and roots were harvested after 3 h, 6 h, 12 h, 1 d, and 2 d. We used two previously identified N-starvation-induced genes -NRT2.3 and AMT2.1to investigate induction kinetics ( Figure 1). In both sheaths and roots, transcript levels were increased upon starvation, peaking at 12 h before declining to basal levels after 1 d. This trend was consistent with earlier reports [2,3,26]. Therefore, we selected the 12-h point for RNA-Seq analyses to distinguish between our results and those of studies that had investigated only very early-or late-responsive genes. Because expression of stress-responsive genes is mostly transient, we believed our data would be valuable for finding a new class of N-starvation-responsive genes. Leaf sheaths and roots were harvested from plants grown under deficient or sufficient conditions. RT-PCR analyses were used to determine the response of several N-metabolism genes, including OsAMT1.1, OsAMT1.2, OsAMT2.1, OsAMT3.2, OsNAR2.2, OsNR, OsNRT2.2, OsNRT2.3, OsPEPC, and OsASN. Significant changes in expression were revealed in the 12-h N-deficient samples ( Figure 2).
We constructed eight cDNA libraries from two biological replicates of leaf sheaths and roots from plants grown under deficient or sufficient conditions. Sequencing those libraries resulted in the identification of 40,756,549 and 41,703,971 paired-end reads (202-nucleotide read length) from the sheaths and roots, respectively. The generated reads were then aligned to the rice genome (IRGSP/RAP build 5 data set) [27,28] by applying Bowtie [29] and TopHat2 programs [30]. In all, 86% of the reads from the sheaths and 69% from the roots were mapped to the reference genome, for which nearly 87% were correctly aligned and approximately 98% of them had unique locations in that genome (Table 1).
Transcript profiles of the RNA-Seq data were analyzed by calculating the reads per kilo base per million reads (RPKM). The sequenced RNA covered 33,782 annotated genes, accounting for 86.2% and 86.7% of those genes in the sheaths and roots, respectively. In addition, 2,986 novel transcripts were detected. Transcripts with low RPKM values were removed because they may not have been reliable due to low abundance or statistical faults. Among the 36,768 transcripts, 26,699 had RPKM ≥ 2. Of those, 22,992 were present in the leaf sheaths, 24,087 in the roots, and 18,319 in both. We identified 6,319 transcripts that were uniquely expressed in the sheaths (2,612) or roots (3,707).
Among the transcriptionally active transcripts, the top 500 most highly expressed were identified from the leaf sheath (Additional file 1) and roots (Additional file 2) under N-limited conditions. In both organ types, the most frequent transcripts functioned for protein synthesis, protein degradation, photosynthesis, stress responses, TFs, and DNA synthesis. Transcripts involved in lipid metabolism, transport, secondary metabolism, and amino acid metabolism were also common.

Differential expression of transcripts due to N-deficiency
Comparing transcript abundances revealed 1,650 transcripts that were differentially expressed (fold-change ≥ 2; p ≤ 0.05) due to a deficient N supply (Additional files 3 and 4). Among them, 1,158 were differentially expressed in the leaf sheaths and 492 in the roots. Of those identified in the N-deficient sheaths, 548 transcripts were up-regulated and 610 transcripts were down-regulated. In the N-deficient roots, 276 transcripts were up-regulated and 216 were down-regulated. To gain insight into the effect of N status on transcript expression profiles, we illustrated expression patterns with a heat map obtained via hierarchical cluster analysis (Additional file 5). This clustering revealed the relatedness of the various transcripts.
Transcription factors are important for controlling the expression of other genes. Several TFs have been described in plants exposed to limited N. For example, an R2R3-type MYB TF, CmMYB1, is a central regulator of N-assimilation in Cyanidioschyzon merolae and enhances the expression of CmNRT, CmNAR, CmNIR, CmAMT, and CmGS under N-starvation [4]. A member of the Arabidopsis GATA TF gene family, At5g56860, is inducible by nitrate; loss-of-function mutants cause reduced chlorophyll levels and downregulation of the genes involved in carbon metabolism [2]. In Arabidopsis, ANR1 encodes a MADS box protein and is induced by nitrate. When expression of this gene is suppressed, lateral root proliferation is altered due to a reduction in sensitivity to NO 3 − [3]. Of the 1,650 transcripts that we found differentially expressed under an N-deficiency, 86 were identified as TFs, covering 28 families (Table 2; Additional file 6). This included one TF each from the GATA, Dof, and MADS families. The AP2/EREBP and WRKY TF families are the two largest families responsive to this deficiency. Here, six AP2/EREBP TF members were increased in the sheaths and seven in the roots under stress. Twelve WRKY members were induced in the sheaths versus none in the roots. It will be valuable in future investigations to determine whether these TFs also play a critical role in the N-starvation response and plant development.  We classified the 1,650 differentially expressed genes into 54 functional groups by GO analysis (Figure 3). The dominant terms were 'cell part' (GO:0044464) in Cellular Component, 'binding' (GO:0005488) in Molecular Function, and 'cellular process' (GO:0009987) in Biological Process. In the third category, more than 30% of the genes for 'metabolic process' (GO:0008152), 'response to stimulus' (GO:0050896), and 'biological regulation' (GO:0065007) responded to N-starvation. 'Cellular process' accounted for 72.4% and 58.3% of the starvation-related genes in the leaf sheath and root, respectively. 'Metabolic process' genes made up 70.0% and 53% in the sheath and root, respectively; while those proportions were 46.7% (sheath) and 41.1% (root) for 'response to stimulus' and 45.7% (sheath) and 36.3% (root) for 'biological regulation'.

Confirmation by real-time PCR
Our RNA-Seq data appeared to be quite reliable for genes up-regulated by N-starvation, with 34 of the 36 deficiency-responsive genes first identified via RNA-Seq analyses subsequently being confirmed by qRT-PCR (Table 3). Only two could not be verified in that latter examination. By contrast, the identification of downregulated genes by RNA-Seq was less reliable. Among 12 examined, eight were later confirmed through qRT-PCR (Table 4).

Validation by GUS assays
We used GUS assays of T-DNA gene trap lines to confirm the N-starvation-responsive TF genes. Those tagging lines were previously generated to have a translational fusion between the tagged gene and GUS [31]. Five GUS-positive lines displayed N-responsive GUS activity. Although this activity was weak when plants were grown in a standard N-sufficient medium, it was rapidly induced by N-starvation. Under low-N conditions, four lines (3A-60813, 3A-51694, 4A-02639, and 4A-01614) showed preferential GUS-staining in the sheaths (vascular bundles) while one (1B-11001) showed staining in the roots (vascular cylinder) (Figure 4, Table 5). In all five lines, GUS activity was higher for plants in the low-N medium than in the normal MS medium. The Os01g14440 and Os11g02480 genes encode a WRKY TF, Os12g07640 encodes a MYB TF, Os03g55220 encodes a bHelix-loop-helix TF, and Os02g43300 encodes the trihelix TF GTL1. Although the T-DNA vectors carry an intron with triple splicing donors/acceptors at the right border, only one pair of donors and acceptors is utilized that reduces the frequency of translational fusion between the tagged gene and GUS [32]. Nonetheless, the GUS-trapped TF lines are valuable for investigating their roles during Nstarvation.

Analysis of alternative splicing
Alternative splicing (AS) is an important regulatory mechanism common in higher eukaryotes that results in a single gene coding for multiple proteins, thereby enhancing biological diversity [33]. Its products are efficiently identified using high-throughput sequencing techniques [34,35]. To investigate potential splicing junctions, we performed computational analyses that revealed 8,509 multi-exonic genes with 19,628 AS events ( Figure 5). These events were categorized into six common types. 'Intron retention' was the dominant type (42.8%), which is consistent with previous observations from plants [36,37]. By contrast, 'exon skipping' is the most prevalent mechanism in humans and yeast [38,39]. Here, 'alternative 3' site', 'exon skipping', and 'alternative first exon' accounted for 16.1%, 14.0%, and 13.2%, respectively, of all events. Frequencies were relatively low for 'alternative 5' site' (7.1%) and 'alternative last exon' (6.9%) (Figure 5a). These data were consistent with other recent reports for plants [36,37,40,41]. An example of the transcript isoforms is shown in Figure 5b. Alternative splicing can occur because of environmental factors. For example, expression of Wdreb2 is activated by cold, drought, salt, or exogenous ABA treatment; depending upon the source of the stress, three transcript forms may be produced [42]. However, we found no significant difference in AS between Nsufficient and -deficient conditions, which suggests that it is not involved in the low-N stress response.

Novel transcribed regions (NTRs) validated by RT-PCR
RNA-Seq technology has revealed novel transcripts that could not be identified previously [43]. Our RNA-Seq data contained 2,986 novel transcribed regions, of which 192 were regulated under the N-deficiency. To confirm their existence, we conducted semi-quantitative reverse transcription PCR with 13 NTRs (Figure 6). Among them, 10 (77%) were detected in the leaf sheath and 7 of those 10 (70%) showed expression patterns consistent with the sequencing data. The three exceptions, with inconsistent patterns, were NTR-1489, NTR-2195, and NTR-2240.

Conclusion
We performed deep transcriptomic investigations with rice plants and obtained detailed expression profiles for genes involved in responses to low-N stress. These data provide valuable information about the genes (1650 transcripts) induced by N-starvation, expecially the 86 TFs that are key regulators of growth and development. We then confirmed these RNA-Seq data by conducting qRT-PCR and GUS assays of T-DNA tagging lines. In all, 8,509 multi-exonic genes could be linked with 19,628 AS events. However, we found no significant difference in alternative splicing between N-deficient samples and controls. Our data will be useful for identifying Ndeficiency-induced genes and investigating the signal transduction pathway of N-utilization.

Plant materials and growth conditions
Oryza sativa L. ssp. japonica cv. Dongjin rice was used in all experiments. Seeds were surface-sterilized and   [44]. At the six-leaf stage, the seedlings were divided into two groups: 1) N-starvation, with the amount of NH 4 NO 3 in the solution reduced to 0.072 mM; and 2) N-sufficient, for which the nutrient solution contained the normal N concentration of 1.44 mM. At 12 h after the treatment began, the total roots and leaf sheaths were harvested from plants in both groups. Each biological replicate constituted a pool of three plants. Two of those replicates were subjected to RNA-sequencing.

Read alignment and assembly
RNA-Seq reads were aligned to the rice reference genomes by the TopHat2 program [30]. That program analyzes the RNA sequences to identify splice junctions between exons by using the ultra-high-throughput short-read aligner Bowtie [29]. Each read was mapped with Cufflinks, which assembled the alignments within the Sequence Alignment/ Map file into transfrags [45]. The assembly files were then merged with reference transcriptome annotations into a unified annotation for further analysis [46].  Expression levels for each gene were calculated by quantifying the Illumina reads according to the RPKM method [47]. Replicates were examined independently for statistical analysis. Genes that were differentially expressed by at least two-fold were tested for False Discovery Rate correlations at p-values ≤ 0.05 [48]. We also selected any transcripts with RPKM ≥ 2 in at least one cDNA library. Heat maps illustrating patterns for differentially expressed genes were generated as described by Severin et al. [49].
Gene Ontology (GO) term analysis and discovery of alternatively spliced exons Gene Ontology terms were examined by applying tools for GO enrichment (http://amigo.geneontology.org/cgibin/amigo/term_enrichment [50]) and Blast2GO [51], at p-values ≤ 0.05. Six basic modes of AS were identified by Cufflinks software, in which differentially spliced exons were detected by comparing pairs of gene models annotated to the same locus [46].

Identification of novel transcripts
Paired-end reads were mapped to the genome with a spliced-read mapper. Afterward, the reference annotations were used to generate faux-read alignments that covered the transcripts. Those alignments were used together with the spliced-read alignments to produce a reference genome-based assembly. Finally, this assembly was merged with the reference annotations and "noisy" read mappings were filtered, resulting in all reference annotation transcripts in the output as well as novel transcripts [52].

Real-time RT-PCR
Total RNA was isolated from seedling leaf sheaths and roots, using RNAiso Reagent. For first-strand cDNA synthesis, 1 μg of total RNA was reverse-transcribed in a total volume of 25 μL that contained 10 ng of oligo(dT) 12-18 primer, 2.5 mM dNTPs, and 200 units of AMV Reverse Transcriptase (Promega, Madison, WI, USA) in a reaction buffer. The samples were diluted 10 times prior to PCR. Gene-specific primers were designed using the Oligonucleotide Properties Calculator, or OligoCalc (http://basic.northwestern.edu/biotools/OligoCalc.html). Real-time PCR was performed with 3 μL of template cDNA, 1 μL of forward primer (5 pmol), 1 μL of reverse primer (5 pmol), and 5 μL of SYBR Green mix (Qiagen, Hilden, Germany). Conditions included 5 min of predenaturation at 95°C, then 45 cycles of 10 s at 95°C and 20 s at 60°C, followed by steps for dissociation curve generation (15 s at 95°C, 60 s at 60°C, and 15 s at 95°C). To examine the expression of novel transcripts, we performed semi-quantitative RT-PCR with OsUbiquitin as the internal reference to equalize the quantity of RNA. After 28 cycles of amplification, PCR products were resolved on a 2% agarose gel and stained with ethidium bromide. All primers are listed in Additional file 7.