Identification and characterization of long non-coding RNAs involved in osmotic and salt stress in Medicago truncatula using genome-wide high-throughput sequencing
BMC Plant Biology volume 15, Article number: 131 (2015)
Long non-coding RNAs (lncRNAs) have been shown to play crucially regulatory roles in diverse biological processes involving complex mechanisms. However, information regarding the number, sequences, characteristics and potential functions of lncRNAs in plants is so far overly limited.
Using high-throughput sequencing and bioinformatics analysis, we identified a total of 23,324 putative lncRNAs from control, osmotic stress- and salt stress-treated leaf and root samples of Medicago truncatula, a model legume species. Out of these lncRNAs, 7,863 and 5,561 lncRNAs were identified from osmotic stress-treated leaf and root samples, respectively. While, 7,361 and 7,874 lncRNAs were identified from salt stress-treated leaf and root samples, respectively. To reveal their potential functions, we analyzed Gene Ontology (GO) terms of genes that overlap with or are neighbors of the stress-responsive lncRNAs. Enrichments in GO terms in biological processes such as signal transduction, energy synthesis, molecule metabolism, detoxification, transcription and translation were found.
LncRNAs are likely involved in regulating plant’s responses and adaptation to osmotic and salt stresses in complex regulatory networks with protein-coding genes. These findings are of importance for our understanding of the potential roles of lncRNAs in responses of plants in general and M. truncatula in particular to abiotic stresses.
Non-coding RNAs (ncRNAs) are a set of RNAs that have no capacity to code for proteins. They are used to be considered as inconsequential transcriptional “noises”, because of limited information for their functions [1, 2]. However, this situation is being changed. Recent studies have shown that ncRNAs play important regulatory roles in numerous biological processes [3, 4].
NcRNAs are grouped into small RNAs, such as microRNAs (miRNAs) and small interfering RNAs (siRNAs), and long non-coding RNAs (lncRNAs) according to the length . LncRNAs are defined as a group of ncRNAs that have a length of more than 200 nucleotides . They are usually expressed at low levels and lacking sequence similarities among species, exhibit tissue and cell-specific expression patterns, and transcripts are localized to subcellular compartments [4, 7]. LncRNAs can be further grouped into sense, antisense, bidirectional, intronic and intergenic lncRNAs according to their relative locations with protein-coding genes . In Arabidopsis thaliana, >30 % of lncRNAs are intergenic, and antisense lncRNAs are also abundant [9, 10].
It has been shown that some lncRNAs regulate the expression of genes in a close proximity (cis-acting) or in a distance (trans-acting) in the genome via a number of mechanisms, including modifying promoter activities by nucleosome repositioning, histone modifications, DNA methylation, activating/gathering/transporting of accessory proteins, epigenetic silencing and repression [8, 11, 12]. Increasing evidence supports that lncRNAs play a crucial role in disease occurrence, genomic imprinting and developmental regulation in mammals [13–15].
In contrast to extensive studies of lncRNAs in mammals [13, 14, 16, 17], only a few studies have been reported of the function of lncRNAs in plants [18, 19]. For example, COOLAIR and COLDAIR have been identified to be associated with FLOWERING LOCUS C (FLC) in Arabidopsis. COLDAIR includes two antisense lncRNAs transcribed from the antisense strand of FLC, while COLDAIR is an intronic lncRNA transcribed from the first intron of FLC. They have been implicated in silencing and epigenetic repression of FLC to regulate flowering time [20, 21]. AtIPS1 and At4 have been shown to act as target mimics of miR399 by binding and sequestering miR399 and reduce miR399-mediated cleavage of PHO2 which is important for phosphate uptake [22, 23]. Genome-wide identification of lncRNAs in A. thaliana has been reported in several studies [24–27]. In rice, LDMAR has been shown to regulate photoperiod-sensitive male sterility . Bioinformatics analyses reveal that 60 % of lncRNAs are precursors of small RNAs and 50 % of lncRNAs are expressed in a tissue-specific manner [29–31].
Medicago truncatula is a model legume widely used in genomics, genetics and physiological studies of legumes due to its small genome size and relative ease in genetic transformation [32, 33]. Legumes account for one third of primary crop production in the word and are important sources of dietary proteins for human and animals . In M. truncatula, Enod40 and Mt4 involved in nodulation and phosphate uptake, respectively, have been identified as lncRNAs [35, 36]. Although a recent in silico analysis of lncRNAs has been conducted in M. truncatula, only limited information is presented, because only lncRNAs with poly(A) tails have been analyzed, using less finished genome sequences available at the time . As most lncRNAs have no poly(A) tails and are lowly and specifically expressed [4, 16], to identify a comprehensive set of lncRNAs including non-poly(A)-tailed lncRNAs in M. truncatula, we conducted genome-wide high-throughput sequencing of six libraries prepared using complementary sequences of synthetic adaptors. Similar to other plant species, legumes are also frequently encountered adverse environments such as osmotic and salt stresses. Previous studies of molecular mechanisms underlying plant’s tolerance to abiotic stresses are mainly focused on functional studies of protein-coding genes, while few studies have systemically investigated the roles of lncRNAs in osmotic and salt stress responses of plants. In the present study, we identified a comprehensive set of lncRNAs that are responsive to osmotic and salt stresses in leaves and roots of M. truncatula using high throughput sequencing of six cDNA libraries.
Physiological response to osmotic and salt stress
Materials used to construct cDNA libraries were treated by osmotic or salt stress for 5 h. Foliar osmolality was increased from 350 mOsmol kg−1 to 450 and 390 mOsmol kg−1, after the treatments with osmotic and salt stress, respectively (Table 1). There was a significant increase in foliar Na+ concentration after 5-h salt treatment (Table 1). No effects of osmotic and salt stress on concentrations of proline (Pro) and soluble sugars were detected (Table 1). These results suggest that plants under our treatment regime are at the early stage of stress-response to activate genes and their regulatory networks.
Six cDNA libraries were constructed using mRNA isolated from leaves and roots of M. truncatula seedlings treated with osmotic stress (OS), salt stress (SS), and control (CK) and complementary sequences of synthetic adaptors. They were sequenced by an Illumina-Solexa sequencer. The high-throughput sequencing led to more than 90,000,000 raw sequence reads. To assess the quality of RNA-seq data, each base in the reads was assigned a quality score (Q) by a phred-like algorithm using the FastQC . The analysis revealed that the data are highly credible with a mean Q-value of 36 (Additional file 1: Figure S1). Of the raw reads, more than 99 % were clean reads after initial processing (Table 2). We performed 100 bp paired-end sequencing, and led to 56.7 G raw bases and 56.6 G clean bases in total.
Identification and characterization of lncRNAs
The clean reads were mapped to the M. truncatula genome (Mt4.0) using the TopHat . Transcripts were then assembled and annotated using the Cufflinks package . Known mRNAs were identified according to the latest annotation of the M. truncatula genome sequence, and this led to the identification of 31,034, 36,482, 29,770, 36,832, 29,629 and 36,930 unique mRNAs from the six cDNA libraries, respectively (Table 2). The remaining reads were filtered according to length and coding potentials, such that transcripts smaller than 200 bp were excluded and transcripts with the coding potentials greater than –1 were removed. The remaining transcripts were considered as putative lncRNAs.
From these analyses, we identified 11,501, 18,275, 8,571, 18,277, 10,458 and 19,186 unique lncRNAs from the six cDNA libraries, respectively (Table 2). In total, 23,324 unique lncRNAs were obtained in the present study (Additional file 2: Table S1). And this number was similar to that of lncRNAs in Arabidopsis and maize [30, 41]. We found that these lncRNAs were more evenly distributed across the 8 chromosomes in M. truncatula with no obvious preferences of locations (Fig. 1a). According to the locations of lncRNAs in the genome, 10,426 intronic, 5,794 intergenic, 3,558 sense and 3,546 antisense lncRNAs were identified (Fig. 1b and e). In terms of the lncRNAs’ length, the majority of lncRNAs was relatively short. For example, 84.1 % of them were shorter than 1,000 nt (Fig. 1c). Interestingly, lncRNAs and mRNAs were much more abundant in roots than in leaves, given that similar amounts of raw reads were obtained for both leaf and root samples. In all libraries, more lncRNAs were detected in roots than in leaves (Table 2). For example, 18,275 lncRNAs were identified in roots, while there were 11,501 lncRNAs in leaves under control condition (Fig. 2a). Furthermore, we found that the accumulative frequency of lncRNAs differed in leaves from that in roots. The proportion of lncRNAs with a high level of expression was more than mRNAs in leaves, but this expression pattern was in contrary in roots under the control conditions (Fig. 1d). Moreover, these patterns of expression were not altered by treatments with osmotic and salt stress (Additional file 1: Figure S2). The lack of chloroplast-derived RNAs in roots might be a possible reason for the difference between leaves and roots.
All putative lncRNAs in M. truncatula were aligned with lncRNAs in A. thaliana from NONCODE database . We can only detect 140 lncRNAs that were comparable to those lncRNAs in A. thaliana, suggesting that lncRNAs are weakly conserved between the two species (Additional file 2: Table S1). Moreover, lncRNAs which were from transposons or which encoded microRNAs were marked (Additional file 2: Table S1).
Responses of lncRNAs to osmotic and salt stresses
To identify osmotic stress- and salt stress-responsive lncRNAs, the normalized expression (fragments per kilobase of exon per million fragments mapped, FPKM) of lncRNAs was compared amongst the six libraries.
LncRNAs that were responsive to osmotic and salt stresses in leaves and roots were identified by determining the P-value and false discovery rate. To verify the results from the RNA-seq experiments, 12 lncRNAs were selected to verify their expression by quantitative real-time PCR (qRT-PCR) (Fig. 3 and Additional file 1: Figure S3). These results indicate that our transcriptomic analysis is highly reproducible and reliable, and that lncRNAs identified from the high throughput sequencing represent real transcripts.
Transcript levels of 7,863 lncRNAs in leaves and 5,561 lncRNAs in roots were detected to be changed by the osmotic stress, and 7,361 lncRNAs in leaves and 7,874 lncRNAs in roots were identified to be responsive to the salt stress. Venn diagrams showed common and specific lncRNAs, whose expression was altered in roots and leaves by osmotic and salt stresses (Fig. 2b). Some lncRNAs in leaves and roots showed different responses to osmotic and salt stresses. There were 1,783 and 2,148 lncRNAs, whose expression was changed in both leaves and roots by osmotic and salt stresses, respectively. In leaves, more than half of stress-responsive lncRNAs were common between osmotic stress (59.6 %) and salt stress (63.7 %). However, these values were decreased to 47.0 % and 33.2 % in roots, respectively. The expression levels of 471 lncRNAs were found to be changed in the four treated samples (Fig. 2b). Among the lncRNAs, whose expression was changed in responses to osmotic and salt stresses, we further classified them to up-regulated and down-regulated classes (Additional file 1: Figure S4). For examples, 2,236 and 2,477 lncRNAs in leaves were up-regulated in responses to osmotic and salt stresses, respectively, and 475 lncRNAs shared similar expression patterns in responses to these two stresses. Twenty-eight and 213 lncRNAs were found to be up-regulated and down-regulated, respectively, in both roots and leaves treated with osmotic and salt stresses.
Functional analysis of stress-responsive lncRNAs
Previous studies showed that lncRNAs are preferentially located in a close proximity to genes that they regulate [13, 43–45]. To reveal potential functions of the identified lncRNAs, we analyzed Gene Ontology (GO) terms of genes that were co-expressed and spaced by less than 100 kb with the stress-responsive lncRNAs. We detected significant enrichments (P < 0.05) of 26 and 8 GO terms in leaves under osmotic stress and salt stress, respectively (Fig. 4, Additional file 1: Tables S2 and S3). For examples, we found GO term enrichments in cellular component (GO:0015934, large ribosomal subunit), molecular functions (GO:0004089, carbonate dehydratase activity; GO:0004075, biotin carboxylase activity; GO:0003735, structural constituent of ribosome; GO:0008270, zinc ion binding; GO:0019843, rRNA binding) and biological processes (GO:0015976, carbon utilization; GO:0006412, translation). In roots, GO term enrichments were greater than those in leaves (i.e., 52 vs 37), suggesting that roots are more sensitive to osmotic and salt stresses than leaves (Additional file 1: Figure S5, Tables S4 and S5). These findings suggest that the stress-responsive lncRNAs may regulate genes involved in many biological processes, including signal transduction, energy synthesis, molecule metabolism, detoxification, transcription and translation in response to osmotic and salt stresses.
One lncRNA may regulate multiple other lncRNAs and protein-coding genes, and vice versa . To unravel the relationship among lncRNAs and protein-coding RNAs which were co-expressed and spaced by less than 100 kb, putative interactive networks were constructed using Cytoscape (Fig. 5 and Additional file 1: Figure S6). About half of them had less than or equal to three nodes like networks in Fig. 5c. More complex interactive networks were also observed. For example, thirteen protein-coding genes involved in oxidation/reduction reaction, transcription, energy synthesis and signal transduction were found to be regulated by three lncRNAs in the situation of salt stress in leaves (Fig. 5a). Two transcription factors of MYB and zinc finger families were found in the network of Fig. 5b, which may activate stress-responsive genes in the downstream under osmotic stress in roots. The expression of lncRNAs in Fig. 5c has been validated in Fig. 3. TCONS_00046739 was identified as regulator of cytochrome P450 in roots under salt stress. The targets of TCONS_00100258 and TCONS_00118328 may be two transmembrane proteins in leaves under salt stress. These networks among lncRNAs and protein-coding genes may play important roles in sensing and responding to osmotic and salt stresses. The construction of putative network based on gene expression and vicinity of the lncRNAs and protein-coding genes may not be very robust due to the few number of samples used. Future studies to validate the regulatory relationships between lncRNAs and protein-coding genes by specifically investigating the functions of lncRNAs are warranted.
Under stresses, many GO terms were enriched, such as carbonate dehydratase activity (GO:0004089) and carbon utilization (GO:0015976) that are highly significant (because of the lowest P value) in leaves under osmotic and salt stresses (Fig. 4). The carbonic anhydrase gene Medtr6g006990, belonging to these two GO terms was down-regulated by these two abiotic stresses. This gene is predicted to be regulated by the lncRNA TCONS_00097188 located in the upstream of the coding region of Medtr6g006990 (Fig. 6a). Carbonic anhydrase catalyzing the reversible hydration of CO2 into bicarbonate plays an essential role in the accumulation of CO2 in the active site of rubisco . Our results suggest that TCONS_00097188 may regulate photosynthesis under the abiotic stresses by regulating the expression of Medtr6g006990.
Under conditions of abiotic stresses, signal transduction networks are mobilized to cope with the stressed environment. The pathway of phospholipids metabolism has been proposed to be an important in response to a number of abiotic stresses . For example, drought and salt stresses up-regulate the expression of genes encoding phosphatidylinositol-specific phospholipase C (PI-PLC), which hydrolyzes phosphatidylinositol 4,5-bisphosphate to the secondary messenger molecules inositol 1,4,5-trisphosphate and diacylglycerol . In the present study, the expression of a PI-PLC gene (Medtr3g069280), which belongs to GO:0004435 (Phosphatidylinositol phospholipase C activity) and GO:0007165 (Signal transduction) was up-regulated in response to osmotic and salt stresses, and the lncRNA TCONS_00047650 was expressed from the regulatory region of Medtr3g069280 (Fig. 6b). These results suggested that TCONS_00047650 may regulate the expression of Medtr3g069280.
Plants under osmotic and salt stresses often display oxidative stress symptoms as indicated by marked accumulation of reactive oxygen species (ROS), which damages membrane systems. To cope with the excessive accumulation of ROS, plants mobilize antioxidant enzymes to scavenge ROS . We found that the expression of Medtr7g094600 coding for glutathione peroxidase (POD) was up-regulated in roots. We identified the lncRNA TCONS_00116877 located approximately 3.9 kb upstream of the coding sequence of Medtr7g094600 (Fig. 6c). These results suggest that TCONS_00116877 may be involved in regulating plant’s tolerance to the oxidative stress by modulating the expression of POD.
Effect of salinity on plant growth can be divided into ionic toxicity and osmotic stress . Plants often exhibit similar tolerance mechanisms, such as altered energy synthesis, phospholipids signal transduction and detoxification to osmotic and salt stresses . In addition, we found that the expression of the Na+/H+ exchanger (NHX) gene Medtr1g081900 was up-regulated by the salt stress in roots. This gene codes for a vacuolar Na+/H+ antiporter mediating Na+ influx into the vacuoles . This gene is predicted to be regulated by the lncRNA TCONS_00020253 located in the upstream of the coding region of Medtr1g081900 (Fig. 6d). These results suggest that TCONS_00020253 is likely a regulator of Medtr1g081900.
Less than 2 % of the human genome sequences codes for proteins . However, transcription is not limited to protein-coding regions [17, 52]. In fact, more than 90 % of the human genome sequences are likely transcribed . These non-coding transcribed sequences are from introns, intergenic regions or the antisense strand of protein-coding genes . An increasing number of studies have shown that ncRNAs play important roles in many vital biological processes, highlighting that ncRNAs are not transcriptional “noises” .
Studies on lncRNAs are less extensive in plants than in mammals, and those studies are mainly conducted in A. thaliana [25, 26]. In addition to cereals, legumes are the most important sources for human foods and animal feeds worldwide. Moreover, legumes are unique among cultivated plants for their ability to directly utilize atmospheric nitrogen through symbiotic interactions with the soil bacteria rhizobia . According to the genome sequences of M. truncatula, only about 17 % of the sequences code for proteins . Previous studies in M. truncatula have been concentrated upon protein-coding sequences associated with nodulation, abiotic stresses and developmental processes [53–56]. Several recent studies have investigated functions of small RNAs involved in nodulation and abiotic stresses [57–59]. In this report, we show that lncRNAs are distributed in almost the entire genome of M. truncatula, suggesting that lncRNA-coding regions are much more widespread than protein-coding regions (Fig. 1a and b). Whole genome sequencing and annotation facilitate functional studies of protein-coding genes [32, 33]. Identification and characterization of the large number of lncRNAs in M. truncatula in the present study provide valuable information for functional characterization of lncRNAs in plants in general and in legumes in particular.
In the present study, the reverse transcription was made by using complementary sequences of artificial adaptors to enrich lncRNAs with or without poly(A) tails. To distinguish sense from antisense lncRNAs, strand-specific libraries were constructed and paired-end sequencing was carried out in the present study. As a result, our results can be used to identify different types of lncRNAs to facilitate functional studies. Moreover, the abundant original data (56.7 G) generated in the present study allow us to detect lncRNAs that have low expression levels. Given that the expression of lncRNAs is highly tissue-specific , lncRNAs from both leaves and roots of M. truncatula were sequenced and their expression patterns were compared. In addition, we also sequenced and compared the expression of protein-coding genes in both leaves and roots under control and stressed conditions. This information is useful for predicting putative targets of lncRNAs. Furthermore, we identified common and specific lncRNAs from leaves and roots treated with osmotic or salt stresses to study potential functions of lncRNAs in plant’s responses to abiotic stresses. To our best knowledge, this is the first report of a comprehensive set of lncRNAs isolated from osmotic-and salt-stress treated leaf and root samples of higher plants using high-throughput sequencing. Unlike previous studies where osmotic and salt stress-responsive lincRNAs (intergenic lncRNAs) were detected in Arabidopsis  and Populus , the present study identified all types of lncRNAs involved in osmotic and salt stresses in M. truncatula by the strand-specific sequencing. Moreover, to make sure that the putative lncRNAs in this study conform to the criteria of length and protein-coding ability, the putative lncRNAs were selected to have >200 bp in length and less than –1 for the coding potential score. These strict criteria and improved methods made the identified lncRNAs with high sensitivity and selectivity.
To minimize the adverse effects of abiotic stresses, plants have evolved a suite of responsive mechanisms . There are many protein-coding genes which are identified to play regulatory roles under varying abiotic stresses, such as DREB1A/CBF3, SOS1 and so on [61–64]. However, little is known of biological functions of lncRNAs in abiotic stress responses in plants. Moreover, lncRNAs are putative potent tools for plant improvement to enhance their resistance to abiotic stresses . Therefore, identification of abiotic stress-responsive lncRNAs, characterization of their functions and dissection of their regulatory networks can enhance our mechanistic understanding of plant response and adaptation to stressed environment. Several recent studies have identified lncRNAs involved in biotic/abiotic stresses in plants. Fusarium oxysporum, a soil-borne plant fungal pathogen, causes the vascular wilt disease through roots in several plant species . LncRNAs that are responsive to F. oxysporum have been identified by RNA-seq, and functional characterization of these lncRNAs reveals that lncRNAs are important components of the anti-fungal networks in A. thaliana . For abiotic stress responses, 76 lncRNAs have been identified from a full-length cDNA library of A. thaliana . Of these, 22 lncRNAs have been shown involved in abiotic stress responses; overexpression of two identified lncRNAs renders plants more tolerance to salinity. However, because the full-length cDNA library was made from mRNAs with poly(A) tails, lncRNAs without poly(A) tails have not been identified in that study. In our present study, reverse transcription was made by complementary sequences of artificial adaptors, thus, lncRNAs with or without poly(A) tails were obtained. Liu et al.  identified 6,484 lincRNAs, of which 1,832 lincRNAs are responsive to drought, cold, salinity and/or abscisic acid. In a recent study, a total of 504 drought-responsive lincRNAs has been detected in Populus . However, in these studies, only lincRNAs, rather than all types of lncRNAs, are analyzed. In our study, all types of lncRNAs, including those of sense, antisense, intronic and intergenic lncRNAs were identified using the advanced sequencing technology and analytic methods such as strand-specific sequencing and Cuffcompare analysis.
In this study, we identified 23,324 putative lncRNAs from six RNA-seq libraries of M. truncatula by high-throughput sequencing, of which 11,641 and 13,087 lncRNAs are found to be responsive to osmotic stress and salt stress, respectively. Of these, 5,634 lncRNAs are found to be responsive to both osmotic and salt stress. We analyzed GO terms of genes that either overlap with or are immediate neighbors of the stress-responsive lncRNAs. We found enrichments of GO terms in many biological processes, including signal transduction, energy synthesis, molecule metabolism, detoxification, transcription and translation. Moreover, a number of complex interaction networks were constructed based on co-expression and genomic co-location of lncRNAs and protein-coding genes. These results suggest that lncRNAs are likely involved in regulating plant’s responses and adaptation to osmotic and salt stresses in complex regulatory networks with protein-coding genes. These findings provide valuable information for further functional characterization of lncRNAs in responses of plants in general and M. truncatula in particular to abiotic stresses.
Plant materials and stress treatments
Seeds of Medicago truncatula ecotypes Jemalong A17 were treated with concentrated sulfuric acid for 8 min, and then thoroughly rinsed with water. After chilled at 4 °C for 2 days, seeds were sown on 0.8 % agar to germinate at 25 °C till the radicals being about 2 cm. The seeds were planted in the plastic buckets filled with aerated nutrient solution under controlled conditions (26 °C day/22 °C night, and 14-h photoperiod). The composition of full-strength nutrient solution is: 2.5 mM KNO3, 0.5 mM KH2PO4, 0.25 mM CaCl2, 1 mM MgSO4, 100 μM Fe-Na-EDTA, 30 μM H3BO3, 5 μM MnSO4, 1 μM ZnSO4, 1 μM CuSO4 and 0.7 μM Na2MoO4 with pH of 6.0.
Three-week-old seedlings were transferred into nutrient solutions containing either 265 mM mannitol or 150 mM NaCl, which had identical osmolality, for 5 h. Leaves and roots from at least ten individual plants were collected and frozen immediately in liquid nitrogen until use. At the same time, M. truncatula seedlings grown in the full-strength solution without mannitol or NaCl were harvested and were used as control. The regimes of treatment used in this study were chosen based on previous studies [25, 67].
Construction of cDNA libraries and high-throughput sequencing
To construct libraries, total RNA was extracted from leaves and roots of seedlings grown in different solutions (osmotic stress, salt stress and control) using the Trizol (Invitrogen) according to the manufacturer’s protocols. Ribosome RNA of six RNA samples was removed using Ribo-Zero™ Magnetic Kit (Epicentre). Thereafter the strand-specific sequencing libraries were constructed following a previously described protocol . The paired-end sequencing (2 × 100 bp) was performed on an Illumina Hiseq2000 sequencer at the LC Biotech, Hangzhou, China.
Reads mapping and transcriptome assembling
The resulting directional 100 bp paired-end reads were quality-checked with FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/), and adapter contaminations and low quality tags in the raw data were removed. Ribosome RNA data were also removed from the remaining data by alignment. Then, the clean reads from six-cDNA libraries were merged and mapped to the M. truncatula genome sequence (Mt4.0) using the spliced read aligner TopHat . To construct transcriptome, the mapped reads were assembled de novo using Cufflinks . All transcripts were required to be >200 bp in length.
Identification of lncRNAs
The assembled transcripts were annotated using the Cuffcompare program from the Cufflinks package . According to the annotation of M. truncatula genome sequence (Mt4.0), the known protein-coding transcripts were identified. The remaining unknown transcripts were used to screen for putative lncRNAs. The transcripts smaller than 200 bp were firstly excluded. Then, the coding potential for the remaining transcripts was calculated by the Coding Potential Calculator based on quality, completeness, and sequence similarity of the open reading frame to the proteins in the protein databases . A transcript was deemed to be noncoding if the coding potentials are scored to be less than −1, which suggest that this transcript has no capacity of coding for proteins.
Analysis of differential expression patterns
Expression levels of all transcripts, including putative lncRNAs and mRNAs, were quantified as FPKM using the Cuffdiff program from the Cufflinks package . Differential gene expression was determined using DESeq with a P-value < 0.05 and a false discovery rate threshold of 5 % .
Quantitative real-time PCR (qRT-PCR)
Total RNA was isolated using RNAiso Plus reagent (TaKaRa) and treated with RNase-free DNase I (Promega). About 0.5 μg RNA was reverse-transcribed into first-strand cDNA with PrimeScript® RT reagent Kit (TaKaRa). Quantitative real-time PCR (qRT-PCR) was performed using ABI Stepone Plus instrument. Gene-specific primers and internal control primers were listed in Additional file 1: Table S6. All qRT-PCR reactions were performed in triplicates for each cDNA sample with an annealing temperature of 57 °C and a total of 40 cycles of amplification. The relative expression levels were calculated by the comparative CT method.
Prediction of lncRNA function based on co-expression and genomic co-location
A number of investigations have indicated that one major function of lncRNAs is regulating the expression of neighboring protein-coding genes via epigenetic modification or transcriptional co-activation/repression [13, 43, 44]. The relative loci between lncRNAs and their neighbors can be exhibited using Integrative Genomics Viewer . Moreover, differentially expressed lncRNAs and mRNAs were forecasted to play roles in regulation of tolerance to osmotic and salt stresses. Therefore, the genomic co-locational analysis of these lncRNAs and mRNAs was performed. We defined two genes as a co-expressed and co-located pair if they were co-expressed and spaced by less than 100 kb, according to the previously described method .
The neighbors of lncRNA genes were analyzed by Gene Ontology (GO) , and GO terms were enriched when significance (P) was less than 0.05 using Blast2GO . Interaction networks among lncRNAs and protein-coding RNAs were constructed based on co-expression and genomic co-location using software Cytoscape .
RNA-seq data sets are available in the Sequence Read Archive database under accession number SRR1523070 (for CK-L), SRR1523071 (for CK-R), SRR1523072 (for OS-L), SRR1523075 (for OS-R), SRR1523077 (for SS-L) and SRR1523078 (for SS-R).
- FLC :
FLOWERING LOCUS C
Fragments per kilobase of exon per million fragments mapped
Long non-coding RNAs
Phosphatidylinositol-specific phospholipase C
quantitative real-time PCR
Reactive oxygen species
Small interfering RNAs
Struhl K. Transcriptional noise and the fidelity of initiation by RNA polymerase II. Nat Struct Mol Biol. 2007;14(2):103–5.
Ponjavic J, Ponting CP, Lunter G. Functionality or transcriptional noise? Evidence for selection within long noncoding RNAs. Genome Res. 2007;17(5):556–65.
Kim YJ, Zheng BL, Yu Y, Won SY, Mo BX, Chen XM. The role of mediator in small and long noncoding RNA production in Arabidopsis thaliana. EMBO J. 2011;30(5):814–22.
Wilusz JE, Sunwoo H, Spector DL. Long noncoding RNAs: functional surprises from the RNA world. Gene Dev. 2009;23(13):1494–504.
Brosnan CA, Voinnet O. The long and the short of noncoding RNAs. Curr Opin Cell Biol. 2009;21(3):416–25.
Rinn JL, Chang HY. Genome regulation by long noncoding RNAs. Annu Rev Biochem. 2012;81:145–66.
Cabili MN, Trapnell C, Goff L, Koziol M, Tazon-Vega B, Regev A, et al. Integrative annotation of human large intergenic noncoding RNAs reveals global properties and specific subclasses. Gene Dev. 2011;25(18):1915–27.
Ponting CP, Oliver PL, Reik W. Evolution and functions of long noncoding RNAs. Cell. 2009;136(4):629–41.
Yamada K, Lim J, Dale JM, Chen HM, Shinn P, Palm CJ, et al. Empirical analysis of transcriptional activity in the Arabidopsis genome. Science. 2003;302(5646):842–6.
Wang XJ, Gaasterland T, Chua NH. Genome-wide prediction and identification of cis-natural antisense transcripts in Arabidopsis thaliana. Genome Biol. 2005;6(4):R30.
Kornienko AE, Guenzl PM, Barlow DP, Pauler FM. Gene regulation by the act of long non-coding RNA transcription. BMC Biol. 2013;11:59.
Liu J, Wang H, Chua NH. Long noncoding RNA transcriptome of plants. Plant Biotechnol J. 2015. doi:10.1111/pbi.12336.
Rinn JL, Kertesz M, Wang JK, Squazzo SL, Xu X, Brugmann SA, et al. Functional demarcation of active and silent chromatin domains in human HOX loci by noncoding RNAs. Cell. 2007;129(7):1311–23.
Zhao J, Sun BK, Erwin JA, Song JJ, Lee JT. Polycomb proteins targeted by a short repeat RNA to the mouse X chromosome. Science. 2008;322(5902):750–6.
Sleutels F, Zwart R, Barlow DP. The non-coding Air RNA is required for silencing autosomal imprinted genes. Nature. 2002;415(6873):810–3.
Derrien T, Johnson R, Bussotti G, Tanzer A, Djebali S, Tilgner H, et al. The GENCODE v7 catalog of human long noncoding RNAs: analysis of their gene structure, evolution, and expression. Genome Res. 2012;22(9):1775–89.
Birney E, Stamatoyannopoulos JA, Dutta A, Guigo R, Gingeras TR, Margulies EH, et al. Identification and analysis of functional elements in 1 % of the human genome by the ENCODE pilot project. Nature. 2007;447(7146):799–816.
Wierzbicki AT. The role of long non-coding RNA in transcriptional gene silencing. Curr Opin Plant Biol. 2012;15(5):517–22.
De Lucia F, Dean C. Long non-coding RNAs and chromatin regulation. Curr Opin Plant Biol. 2011;14(2):168–73.
Heo JB, Sung S. Vernalization-mediated epigenetic silencing by a long intronic noncoding RNA. Science. 2011;331(6013):76–9.
Swiezewski S, Liu FQ, Magusin A, Dean C. Cold-induced silencing by long antisense transcripts of an Arabidopsis Polycomb target. Nature. 2009;462(7274):799–802.
Franco-Zorrilla JM, Valli A, Todesco M, Mateos I, Puga MI, Rubio-Somoza I, et al. Target mimicry provides a new mechanism for regulation of microRNA activity. Nat Genet. 2007;39(8):1033–7.
Shin H, Shin HS, Chen R, Harrison MJ. Loss of At4 function impacts phosphate distribution between the roots and the shoots during phosphate starvation. Plant J. 2006;45(5):712–26.
MacIntosh GC, Wilkerson C, Green PJ. Identification and analysis of Arabidopsis expressed sequence tags characteristic of non-coding RNAs. Plant Physiol. 2001;127(3):765–76.
Ben Amor B, Wirth S, Merchan F, Laporte P, d’Aubenton-Carafa Y, Hirsch J, et al. Novel long non-protein coding RNAs involved in Arabidopsis differentiation and stress responses. Genome Res. 2009;19(1):57–69.
Liu J, Jung C, Xu J, Wang H, Deng SL, Bernad L, et al. Genome-wide analysis uncovers regulation of long intergenic noncoding RNAs in Arabidopsis. Plant Cell. 2012;24(11):4333–45.
Matsui A, Ishida J, Morosawa T, Mochizuki Y, Kaminuma E, Endo TA, et al. Arabidopsis transcriptome analysis under drought, cold, high-salinity and ABA treatment conditions using a tiling array. Plant Cell Physiol. 2008;49(8):1135–49.
Ding JH, Lu Q, Ouyang YD, Mao HL, Zhang PB, Yao JL, et al. A long noncoding RNA regulates photoperiod-sensitive male sterility, an essential component of hybrid rice. P Natl Acad Sci USA. 2012;109(7):2654–9.
Boerner S, McGinnis KM. Computational identification and functional predictions of long noncoding RNA in Zea mays. PLoS One. 2012;7(8), e43047.
Li L, Eichten SR, Shimizu R, Petsch K, Yeh CT, Wu W, et al. Genome-wide discovery and characterization of maize long non-coding RNAs. Genome Biol. 2014;15(2):R40.
Zhang W, Han ZX, Guo QL, Liu Y, Zheng YX, Wu FL, et al. Identification of maize long non-coding RNAs responsive to drought stress. PLoS One. 2014;9(6), e98958.
Young ND, Debelle F, Oldroyd GED, Geurts R, Cannon SB, Udvardi MK, et al. The Medicago genome provides insight into the evolution of rhizobial symbioses. Nature. 2011;480(7378):520–4.
Branca A, Paape TD, Zhou P, Briskine R, Farmer AD, Mudge J, et al. Whole-genome nucleotide diversity, recombination, and linkage disequilibrium in the model legume Medicago truncatula. P Natl Acad Sci USA. 2011;108(42):E864–70.
Benedito VA, Torres-Jerez I, Murray JD, Andriankaja A, Allen S, Kakar K, et al. A gene expression atlas of the model legume Medicago truncatula. Plant J. 2008;55(3):504–13.
Campalans A, Kondorosi A, Crespi M. Enod40, a short open reading frame-containing mRNA, induces cytoplasmic localization of a nuclear RNA binding protein in Medicago truncatula. Plant Cell. 2004;16(4):1047–59.
Burleigh SH, Harrison MJ. A novel gene whose expression in Medicago truncatula roots is suppressed in response to colonization by vesicular-arbuscular mycorrhizal (VAM) fungi and to phosphate nutrition. Plant Mol Biol. 1997;34(2):199–208.
Wen J, Parker BJ, Weiller GF. In Silico identification and characterization of mRNA-like noncoding transcripts in Medicago truncatula. In Silico Biol. 2007;7(4–5):485–505.
Ewing B, Green P. Base-calling of automated sequencer traces using phred. II Error probabilities Genome Res. 1998;8(3):186–94.
Trapnell C, Pachter L, Salzberg SL. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics. 2009;25(9):1105–11.
Trapnell C, Williams BA, Pertea G, Mortazavi A, Kwan G, van Baren MJ, et al. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat Biotechnol. 2010;28(5):511–5.
Wang H, Chung PJ, Liu J, Jang IC, Kean MJ, Xu J, et al. Genome-wide identification of long noncoding natural antisense transcripts and their responses to light in Arabidopsis. Genome Res. 2014;24(3):444–53.
Xie CY, Yuan J, Li H, Li M, Zhao GG, Bu DC, et al. NONCODEv4: exploring the world of long non-coding RNA genes. Nucleic Acids Res. 2014;42(D1):D98–103.
Mercer TR, Dinger ME, Mattick JS. Long non-coding RNAs: insights into functions. Nat Rev Genet. 2009;10(3):155–9.
Yu WQ, Gius D, Onyango P, Muldoon-Jacobs K, Karp J, Feinberg AP, et al. Epigenetic silencing of tumour suppressor gene p15 by its antisense RNA. Nature. 2008;451(7175):202–6.
Pauli A, Valen E, Lin MF, Garber M, Vastenhouw NL, Levin JZ, et al. Systematic identification of long noncoding RNAs expressed during zebrafish embryogenesis. Genome Res. 2012;22(3):577–91.
Studer AJ, Gandin A, Kolbe AR, Wang L, Cousins AB, Brutnell TP. A limited role for carbonic anhydrase in C4 photosynthesis as revealed by a ca1ca2 double mutant in maize. Plant Physiol. 2014;165(2):608–17.
Xiong LM, Schumaker KS, Zhu JK. Cell signaling during cold, drought, and salt stress. Plant Cell. 2002;14:S165–83.
Mittler R. Oxidative stress, antioxidants and stress tolerance. Trends Plant Sci. 2002;7(9):405–10.
Zhu JK. Salt and drought stress signal transduction in plants. Annu Rev Plant Biol. 2002;53:247–73.
Yamaguchi T, Aharon GS, Sottosanto JB, Blumwald E. Vacuolar Na+/H+ antiporter cation selectivity is regulated by calmodulin from within the vacuole in a Ca2+- and pH-dependent manner. P Natl Acad Sci USA. 2005;102(44):16107–12.
Collins FS, Lander ES, Rogers J, Waterston RH, Conso IHGS. Finishing the euchromatic sequence of the human genome. Nature. 2004;431(7011):931–45.
Carninci P, Kasukawa T, Katayama S, Gough J, Frith MC, Maeda N, et al. The transcriptional landscape of the mammalian genome. Science. 2005;309(5740):1559–63.
Wang TZ, Zhang JL, Tian QY, Zhao MG, Zhang WH. A Medicago truncatula EF-hand family gene, MtCaMP1, is involved in drought and salt stress tolerance. PLoS One. 2013;8(4), e58952.
de Zelicourt A, Diet A, Marion J, Laffont C, Ariel F, Moison ML, et al. Dual involvement of a Medicago truncatula NAC transcription factor in root abiotic stress response and symbiotic nodule senescence. Plant J. 2012;70(2):220–30.
Xie C, Zhang RX, Qu YT, Miao ZY, Zhang YQ, Shen XY, et al. Overexpression of MtCAS31 enhances drought tolerance in transgenic Arabidopsis by reducing stomatal density. New Phytol. 2012;195(1):124–35.
Ge L, Peng J, Berbel A, Madueno F, Chen R. Regulation of compound leaf development by PHANTASTICA in Medicago truncatula. Plant Physiol. 2014;164(1):216–28.
Lelandais-Briere C, Naya L, Sallet E, Calenge F, Frugier F, Hartmann C, et al. Genome-wide Medicago truncatula small RNA analysis revealed novel microRNAs and isoforms differentially regulated in roots and nodules. Plant Cell. 2009;21(9):2780–96.
Wang TZ, Chen L, Zhao MG, Tian QY, Zhang WH. Identification of drought-responsive microRNAs in Medicago truncatula by genome-wide high-throughput sequencing. BMC Genomics. 2011;12:367.
Zhou ZS, Zeng HQ, Liu ZP, Yang ZM. Genome-wide identification of Medicago truncatula microRNAs and their targets reveals their differential regulation by heavy metal. Plant Cell Environ. 2012;35(1):86–99.
Shuai P, Liang D, Tang S, Zhang Z, Ye CY, Su Y, et al. Genome-wide identification and functional prediction of novel and drought-responsive lincRNAs in Populus trichocarpa. J Exp Bot. 2014;65(17):4975–83.
Liu Q, Kasuga M, Sakuma Y, Abe H, Miura S, Yamaguchi-Shinozaki K, et al. Two transcription factors, DREB1 and DREB2, with an EREBP/AP2 DNA binding domain separate two cellular signal transduction pathways in drought- and low-temperature-responsive gene expression, respectively, in Arabidopsis. Plant Cell. 1998;10(8):1391–406.
Jaglo-Ottosen KR, Gilmour SJ, Zarka DG, Schabenberger O, Thomashow MF. Arabidopsis CBF1 overexpression induces COR genes and enhances freezing tolerance. Science. 1998;280(5360):104–6.
Liu JP, Zhu JK. A calcium sensor homolog required for plant salt tolerance. Science. 1998;280(5371):1943–5.
Qiu QS, Guo Y, Dietrich MA, Schumaker KS, Zhu JK. Regulation of SOS1, a plasma membrane Na+/H+ exchanger in Arabidopsis thaliana, by SOS2 and SOS3. P Natl Acad Sci USA. 2002;99(12):8436–41.
Liu R, Zhu JK. Non-coding RNAs as potent tools for crop improvement. Natl Sci Rev. 2014;1(2):186–9.
Zhu QH, Stephen S, Taylor J, Helliwell CA, Wang MB. Long noncoding RNAs responsive to Fusarium oxysporum infection in Arabidopsis thaliana. New Phytol. 2014;201(2):574–84.
Zhang LM, Liu XG, Qu XN, Yu Y, Han SP, Dou Y, et al. Early transcriptomic adaptation to Na2CO3 stress altered the expression of a quarter of the total genes in the maize genome and exhibited shared and distinctive profiles with NaCl and high pH stresses. J Integr Plant Biol. 2013;55(11):1147–65.
Borodina T, Adjaye J, Sultan M. A strand-specific library preparation protocol for RNA sequencing. Methods Enzymol. 2011;500:79–98.
Kong L, Zhang Y, Ye ZQ, Liu XQ, Zhao SQ, Wei L, et al. CPC: assess the protein-coding potential of transcripts using sequence features and support vector machine. Nucleic Acids Res. 2007;35:W345–9.
Anders S, Huber W. Differential expression analysis for sequence count data. Genome Biol. 2010;11(10):R106.
Robinson JT, Thorvaldsdottir H, Winckler W, Guttman M, Lander ES, Getz G, et al. Integrative genomics viewer. Nat Biotechnol. 2011;29(1):24–6.
Liao Q, Liu CN, Yuan XY, Kang SL, Miao RY, Xiao H, et al. Large-scale prediction of long non-coding RNA functions in a coding-non-coding gene co-expression network. Nucleic Acids Res. 2011;39(9):3864–78.
Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, et al. Gene ontology: tool for the unification of biology. Nat Genet. 2000;25(1):25–9.
Conesa A, Gotz S, Garcia-Gomez JM, Terol J, Talon M, Robles M. Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics. 2005;21(18):3674–6.
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 2003;13(11):2498–504.
We acknowledge the three anonymous reviewers and academic editor for their constructive suggestions on the previous version of the manuscript. We thank Dr. Jianning Liu of LC Biotech for his help of bioinformatics analysis. This study was supported by the National Natural Science Foundation of China (31300231, 31272234), the National Basic Research Program of China (2015CB150800) and the External Cooperation Program of the Chinese Academy of Sciences (151111KYSB20130008).
The authors declare that they have no competing interests.
TZW and WHZ designed the experiments; TZW and ML conducted the experiments; TZW, ML, MGZ, RC and WHZ analyzed the data; TZW, RC and WHZ wrote the paper. All authors read and approved the final manuscript.
The quality score (Q) value of RNA-seq from six samples. Figure S2. Accumulative frequency of lncRNAs and mRNAs from leaves and roots under osmosis and salt stress. Figure S3. Compare of expressional results between RNA-seq and qRT-PCR. Figure S4. The common and specific lncRNAs identified to be up-regulated (a) and down-regulated (b) in leaves and roots under osmosis and salt stress. Figure S5. GO enhancements in roots of M. truncatula under osmotic stress (a) and salt stress (b). The reliability is calculated by –log10 (P-value). Figure S6. The interaction networks among lncRNAs and protein-coding genes. Table S2. The GO enhancements in leaves under osmotic stress. Table S3. The GO enhancements in leaves under salt stress. Table S4. The GO enhancements in roots under osmotic stress. Table S5. The GO enhancements in roots under salt stress. Table S6. Sequences of primes using in this study.
All putative lncRNAs identified in this study.
About this article
Cite this article
Wang, TZ., Liu, M., Zhao, MG. et al. Identification and characterization of long non-coding RNAs involved in osmotic and salt stress in Medicago truncatula using genome-wide high-throughput sequencing. BMC Plant Biol 15, 131 (2015). https://doi.org/10.1186/s12870-015-0530-5
- Long non-coding RNAs (lncRNAs)
- Osmotic stress
- Salt stress
- Medicago truncatula
- Legume plants
- High-throughput sequencing
- Transcriptional regulation