Genomic and evolutionary aspects of chloroplast tRNA in monocot plants

Background Chloroplasts are indispensable organelles whose capacity for photosynthesis makes life on earth possible. These organelles possess a circular genome with a number of coding genes responsible for self-regulation. tRNAs are an important, evolutionarily conserved gene family responsible for protein translation. However, the tRNA machinery of the chloroplast genome is poorly understood. Results In the present study, the chloroplast genomes of six monocot plants, Oryza nivara (NC_005973), Oryza sativa (NC_001320), Saccharum officinarum (NC_006084), Sorghum bicolor (NC_008602), Triticum aestivum (NC_002762), and Zea mays (NC_001666), were downloaded and analyzed to identify tRNA sequences. Further analysis of these tRNA sequences identified several novel features. The length of tRNAs in the chloroplast genome of the monocot plants ranged from 59 to 155 nucleotides. Pair-wise sequence alignment revealed the presence of a conserved A-C-x-U-A-x-U-A-x-U-x5-U-A-A nucleotide consensus sequence. In addition, the tRNAs in the chloroplast genomes of the monocot plants contain 21–28 anti-codons against 61 sense codons in the genome. They also contain a group I intron and a C-A-U anti-codon for tRNAIle, which is the common anti-codon of tRNAMet. Evolutionary analysis indicates that tRNAs in the chloroplast genome have evolved from multiple common ancestors, and tRNAMet appears to be the ancestral tRNA that underwent duplication and diversification to give rise to other tRNAs. Conclusion The results obtained from this study of chloroplast tRNA will help to increase our understanding of tRNA biology. Functional studies of the reported novel aspects of the chloroplast tRNA of the monocot plants will help to decipher their roles in diverse cellular processes.
Electronic supplementary material The online version of this article (10.1186/s12870-018-1625-6) contains supplementary material, which is available to authorized users.


Background
Chloroplasts are multi-copy cellular organelles [1] responsible for photosynthesis and carbohydrate metabolism in photoautotrophic plants, which regulate our biosphere [2,3]. They are an active metabolic center and sustain life on earth by converting solar energy into carbohydrates through photosynthesis [4][5][6]. In addition to photosynthesis, chloroplasts play an important role in various other molecular processes, including the synthesis of nucleotides, amino acids, fatty acids, vitamins, phytohormones, and several other metabolites [7][8][9][10][11][12]. Furthermore, they contribute to the assimilation of nitrogen and sulphur [13][14][15]. In plants, these metabolites have been shown to play a critical role in the regulation of physiology, growth, and development, as well as stress response. Therefore, chloroplasts can be regarded as the "metabolic center" of cellular reactions. Evolutionary studies indicate that chloroplasts arose from a cyanobacterial ancestor through internalization within a eukaryotic cell and have maintained an independent genome inside the plant cell [16][17][18][19][20]. The chloroplast genome (cpDNA) is a double-stranded circular molecule containing tRNA, rRNA, and a number of protein-coding genes [21]. The majority of the protein-coding genes are associated with photosynthesis and bioenergetics [22,23]. The chloroplast genome contains two large 6-76 kb inverted repeats (IRs) that are separated by a large single copy (LSC) region and a small single copy (SSC) region [24][25][26]. The chloroplast genome is non-recombinant and inherited uniparentally through maternal inheritance [27,28]. Therefore, the chloroplast genome is an excellent tool for genomic and evolutionary studies. It is very difficult, however, to detect polymorphisms in cpDNA due to a low level of substitutions [29,30].
Recently, the advances in high-throughput genome sequencing technology have enabled rapid progress in the sequencing and analysis of chloroplast genomes. Specifically, these technological gains have enabled us to obtain and analyze the complete chloroplast genomes of several plants to better understand their molecular and genomic characteristics.
Since chloroplasts maintain a complete and independent genome, it is important to study chloroplast genomes, especially the chloroplast tRNAs that are responsible for protein translation. Because the chloroplast genome is involved in the synthesis of nucleotides, amino acids and proteins, understanding its organization helps to determine how these processes are regulated within the chloroplast. Protein translation within the chloroplasts is regulated by tRNA and other associated genes. Thus, detailed analyses of chloroplast tRNAs can provide insight into the genomics and evolution of cyanobacterial tRNAs. Compared with eudicot genomes, monocot genomes are more conserved, and monocots have evolved from the eudicot lineage [31][32][33]. In addition, several important agronomic crop species are monocots. Therefore, in the present study, we examined the chloroplast genomes of six monocot plants to better understand the genomic and evolutionary characteristics of chloroplast tRNA and to enable future functional studies.
tRNAs are among the most important and versatile molecules responsible for sustaining the protein translation machinery. They are characterized by a cloverleaf-like structure, as proposed by Robert Holley [34]. This structure contains an acceptor arm, D-arm, D-loop, anti-codon arm, anti-codon loop, variable arm, pseudouridine arm, and pseudouridine loop. tRNAs are encoded within the nuclear genome and in the genomes of sub-cellular organelles, including plastids and mitochondria. Over the years, the characterization of nuclear tRNA has gained considerable attention [35][36][37]. The structure and function of tRNAs and tRNA genes of the chloroplast genome were previously described by Maréchal-Drouard et al. (1991) [38]. However, because complete chloroplast genome sequences were lacking at the time, that study could not provide complete genomic details of plastid tRNAs. Therefore, we attempted to understand the detailed genomic and molecular aspects of chloroplast tRNA in plants.
Considering the conserved evolutionary lineages of monocots, six economically important monocots were investigated and reported within this study.

Genomics of chloroplast tRNA
The whole chloroplast genome sequences of six monocot plants, Oryza nivara (NC_005973), Oryza sativa (NC_001320), Saccharum officinarum (NC_006084), Sorghum bicolor (NC_008602), Triticum aestivum (NC_002762), and Zea mays (NC_001666), were downloaded from the National Center for Biotechnology Information (NCBI) database. Subsequently, the sequences were annotated to identify the genomic tRNA sequences in these genomes (Fig. 1). The obtained genomic tRNA sequences were further analyzed using the tRNAscan-SE server to confirm their identity as tRNAs. Results indicated that O. nivara, O. sativa, S. officinarum, S. bicolor, T. aestivum, and Z. mays encode 38, 35, 37, 29, 39, and 39 tRNAs, respectively (Table 1). The length of the chloroplast tRNAs ranged from 59 nt [tRNA Thr GGU, Sorghum bicolor, (20385)] to 155 nt [tRNA Lys NNN, T. aestivum, (4982_TraeCt095)]. tRNA Gly UCC of O. nivara (6129) was found to contain only 65 nt, whereas tRNA Gln UUG of T. aestivum (4985) and tRNA Leu UAG of T. aestivum (5086_TraeCt128) contained 118 nt and 100 nt, respectively. In tRNA Gln UUG (4985_TraeCt096), the tRNA begins at nucleotide 46, and in tRNA Leu UAG (5086_TraeCt128), it begins at nucleotide 21. Pairwise sequence alignment of the 5′ nucleotide sequences of these two tRNAs revealed 22.2% similarity (55.6% gaps) and the presence of a conserved A-C-x-U-A-x-U-A-x-U-x5-U-A-A consensus sequence. On average, chloroplast tRNAs in the examined monocot plants contain 76 nucleotides. tRNA Cys, tRNA Asn, tRNA Ala, tRNA Asp, tRNA Phe, and tRNA Trp were found to contain 71, 72, 73, 74, 73, and 74 nucleotides, respectively. All of the tRNA Leu and tRNA Ser sequences were found to contain 80 nt or more. tRNA Lys was found to be absent from the chloroplast genomes of O. sativa and S. bicolor (Table 1). Additionally, tRNA Ala and tRNA Ile were found to be absent in S. bicolor (Table 1).
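The reported A-C-x-U-A-x-U-A-x-U-x5-U-A-A consensus can be expressed as a regular expression, which makes it easy to scan other sequences for the same motif. The following is a minimal sketch, assuming that x stands for any single nucleotide and x5 for a run of five; the function name and the example sequence are hypothetical and are not taken from any of the six genomes.

```python
import re

# Consensus from the text: A-C-x-U-A-x-U-A-x-U-x5-U-A-A
# (assumption: x = any one nucleotide, x5 = any five nucleotides).
CONSENSUS = re.compile(r"AC[ACGU]UA[ACGU]UA[ACGU]U[ACGU]{5}UAA")


def find_consensus(seq: str) -> list[tuple[int, str]]:
    """Return (1-based start, matched text) for each non-overlapping hit."""
    rna = seq.upper().replace("T", "U")  # accept DNA-style gene sequences
    return [(m.start() + 1, m.group()) for m in CONSENSUS.finditer(rna)]


# Made-up sequence for illustration only.
example = "GGACGUAAUACUGGCCAUAAGG"
print(find_consensus(example))
```

Because the pattern is written over the RNA alphabet, a gene sequence downloaded as DNA is converted from T to U before matching.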

Conservation of chloroplast tRNA sequences is family specific
Multiple sequence alignment analysis of all 20 tRNA gene family members of studied monocot species revealed small, highly conserved consensus sequences in  (Table 3). The Ψ-loop was found to possess a conserved U-U-C-x-A consensus nucleotide sequence (Table 3). The majority of the tRNAs contained a G nucleotide at the first position. tRNA Val, tRNA Met, and tRNA Pro, however, were found to possess an A nucleotide at the first position instead of a G (Table 3). tRNA Gln and tRNA Asn were found to possess a U nucleotide at the first position in the acceptor arm. Although no consensus sequence conservation was observed in the 5′-acceptor arm, the D-arm contained a conserved C nucleotide at the 4th position of the arm (13th position of the tRNA). In contrast, tRNA Glu, tRNA Gly, tRNA Met, tRNA Ser, tRNA Tyr, and all other tRNAs, possessed a C nucleotide at the 4th position of the D-arm. Nucleotides 7 to 16 of the canonical tRNA form an A box, which has been reported to contain two conserved consensus sequences, 7 GUGGCNNAGU 16 and -GGU-AGNGC 15 (- stands for a gap and N stands for any nucleotide) [39]. Our analysis revealed that among the 20 tRNAs analyzed, only six possess a conserved G nucleotide at the 7th position (Table 3). In the remaining tRNAs, the 7th position is instead occupied by an A, U, or C nucleotide (Table 3). The 14th position (1st nucleotide of the D-loop) was found to be conserved in the majority of tRNAs. Except for tRNA Arg, tRNA Asn, tRNA Gly, and tRNA Met, all other tRNAs were found to contain a conserved A nucleotide at the 14th position (Table 3). Similarly, the last nucleotide of the D-loop was found to be a conserved A nucleotide, except in tRNA Tyr (Table 3). The consensus sequence 52 GGUUCGANUCC 62, which starts at the 52nd position and ends at the 62nd position of the tRNA, forms a B box [40]. Our analysis indicates that the conservation of box A and B nucleotide sequences in tRNA occurs in a family-specific manner.
The G-G nucleotides at the 52nd and 53rd positions were found to be conserved in the majority of tRNAs, except for tRNA Glu, tRNA Lys, and tRNA Val, whereas the nucleotide sequence U-U-C-x-A-x-U was found to be conserved at the 54th, 55th, 56th, 58th, and 60th positions (Table 3). tRNA Met was found to contain a conserved U-U-C-x-A-U-C consensus sequence at the 54th, 55th, 56th, 58th, 59th, and 60th positions, instead of the U-U-C-x-A-x-U consensus sequence (Table 3). Similarly, tRNA Asp had a conserved U-U-C-G-A-G-C consensus sequence, while tRNA Val contained U-U-C-G-A-x-x conserved nucleotides. No conserved nucleotides were found at the 59th and 60th positions of tRNA Val. The anti-codon loop was found to contain conserved C-U or U-U nucleotides at the 32nd and 33rd positions. tRNA Gln, tRNA Gly, tRNA His, tRNA Pro, and tRNA Val contained conserved U-U nucleotides instead of the C-U nucleotides. In addition, in the majority of cases, the anti-codon loop had a conserved A nucleotide at the 38th position. tRNA Gln, tRNA Pro, and tRNA Val, however, possessed a conserved U nucleotide at the 38th position instead of an A (Table 3). The chloroplast genome encodes a predefined C-C-A tail in the gene of the tRNA. When the tRNA gene is transcribed, the C-C-A tail is included. The present study found that tRNA Ala, tRNA Arg, tRNA Ile, tRNA Lys, and tRNA Tyr contain C-C-A nucleotides at their 3′-end. A few of the encoded tRNA Leu genes in the monocot chloroplast genomes also contain a C-C-A tail at the 3′-end; however, the remaining tRNAs do not possess a C-C-A consensus sequence at their 3′-end.
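The position-by-position comparison against the B-box consensus 52 GGUUCGANUCC 62, and the check for a genomically encoded C-C-A tail, can be sketched in a few lines. This is only an illustration under stated assumptions: the window is taken to be the 11 nucleotides at canonical positions 52 to 62, N matches any nucleotide, and the function names and test windows are hypothetical rather than part of the original analysis.

```python
# B-box consensus quoted in the text (positions 52-62); N = any nucleotide.
B_BOX = "GGUUCGANUCC"


def b_box_mismatches(window: str, consensus: str = B_BOX, start: int = 52) -> list[int]:
    """Return the tRNA positions at which an 11-nt window departs from the consensus."""
    if len(window) != len(consensus):
        raise ValueError("window and consensus must have the same length")
    rna = window.upper().replace("T", "U")
    return [start + i
            for i, (w, c) in enumerate(zip(rna, consensus))
            if c != "N" and w != c]


def has_cca_tail(seq: str) -> bool:
    """True if the sequence ends in C-C-A at its 3' end."""
    return seq.upper().replace("T", "U").endswith("CCA")


# Made-up windows for illustration only.
print(b_box_mismatches("GGUUCGAAUCC"))  # fully matches the consensus
print(b_box_mismatches("AGUUCAAAUCC"))  # differs at two positions
print(has_cca_tail("GGGACCA"))
```

Reporting mismatches as tRNA positions rather than offsets keeps the output directly comparable with the position numbering used in Table 3.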

Nucleotide variation in the arms and loops of tRNA
In the present study, the acceptor arm of chloroplast tRNA was revealed to contain 1-7 nucleotides. Among the 213 tRNA sequences representing six species of monocot plants, only two were found to contain one nucleotide, one had five nucleotides, and one contained six nucleotides, while the remaining 209 (98.12%) tRNAs had seven nucleotides. The D-arm was found to contain      Table S1). All of the tRNAs, except for one, had seven nucleotides in the anti-codon loop. tRNA 6160_OrniCt018 of O. nivara contained nine nucleotides instead of seven (Additional file 1: Table S1). The variable loop was found to possess a diverse number of nucleotides, with different tRNAs having 4 (9.38%), 5 (59.62%), 6 (3.75%), 7 (5.63%), 11 (2.34%), 12 (0.46%), 13 (6.1%), 14 (0.46%), 15 (1.87%), 16 (2.34%), 18 (2.34%), or 19 (5.63%) nucleotides. None of the chloroplast tRNAs were found to possess 8, 9, 10, 17, 20 or more nucleotides in the variable loop (Additional file 1: Table S1). tRNA Leu, tRNA Ser, and tRNA Tyr had 10 or more nucleotides, whereas the other tRNAs possessed fewer than 10 nucleotides in the variable loop (Additional file 1: Table S1). Among the 213 examined tRNA sequences, only three tRNA Gly genes had four nucleotides in the Ψ-arm, while the remaining tRNA sequences had five nucleotides. Similarly, the Ψ-loop region in all of the 213 tRNAs possessed seven nucleotides. In summary, our study found 7 bp in the acceptor arm and 3-4 bp in the D-arm, and considerable variation was observed in the other parts. The anti-codon arm was found to possess 4-5 bp, and the anti-codon loop 7 or 9 nucleotides. The number of nucleotides making up the variable loop ranged from 4 to 19, and none of the tRNAs had more than 19 nucleotides in the variable loop. The Ψ-arm possessed 4-5 nucleotides.
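As a quick consistency check on the variable-loop distribution, the reported percentages can be converted back into whole-number counts. The sketch below assumes that every percentage was computed as a count divided by the 213 sequences mentioned in the text; the recovered counts are therefore an inference, not values reported in the study.

```python
TOTAL = 213  # number of tRNA sequences stated in the text

# Variable-loop length -> percentage of tRNAs, as printed in the text.
reported_pct = {4: 9.38, 5: 59.62, 6: 3.75, 7: 5.63, 11: 2.34, 12: 0.46,
                13: 6.1, 14: 0.46, 15: 1.87, 16: 2.34, 18: 2.34, 19: 5.63}

# Assumption: each percentage is count / 213, so rounding recovers the count.
counts = {length: round(pct * TOTAL / 100) for length, pct in reported_pct.items()}

print(counts)
print(sum(counts.values()))         # equals TOTAL if the inference holds
print(round(100 * 209 / TOTAL, 2))  # acceptor-arm share quoted as 98.12%
```

Under that assumption the inferred counts add up to the full set of 213 sequences, and the acceptor-arm figure of 209 tRNAs reproduces the quoted 98.12%.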

Chloroplast tRNAs contain a group I intron
In our study, chloroplast tRNA was found to contain introns. tRNA Lys of T. aestivum (4982_TraeCt095) was found to contain a group I intron located in the anti-codon loop region (Fig. 2). The intron was 84 nucleotides in length; it began at nucleotide 37 and ended at nucleotide 120 of the tRNA Lys gene. The group I introns of chloroplast tRNA contain conserved U-U-x2-C and A-G-x2-U consensus sequences (Fig. 3). A phylogenetic tree was constructed to elucidate the evolution of the group I intron. The phylogenetic analysis indicated that the group I intron of chloroplast tRNA grouped with the group I intron of cyanobacteria (Fig. 4).
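The intron coordinates given for tRNA Lys can be checked with simple arithmetic: an intron spanning nucleotides 37 to 120 inclusive is 120 - 37 + 1 = 84 nt long, which leaves 155 - 84 = 71 nt of exon sequence in the 155 nt gene. The sketch below illustrates the excision with 1-based inclusive coordinates; the function name is hypothetical and the gene is a placeholder string, since the actual sequence is not reproduced here.

```python
def excise_intron(gene: str, start: int, end: int) -> tuple[str, str]:
    """Split a gene into (joined exons, intron) using 1-based inclusive coordinates."""
    if not 1 <= start <= end <= len(gene):
        raise ValueError("intron coordinates fall outside the gene")
    intron = gene[start - 1:end]           # convert to Python's 0-based slicing
    exons = gene[:start - 1] + gene[end:]  # 5' exon joined to 3' exon
    return exons, intron


# Placeholder 155-nt gene: E marks exon positions, I marks intron positions.
placeholder_gene = "E" * 36 + "I" * 84 + "E" * 35
exons, intron = excise_intron(placeholder_gene, 37, 120)
print(len(intron), len(exons))
```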

Chloroplast tRNA encodes putative novel tRNAs
In the present study, a few putative novel tRNAs were found to be encoded by the chloroplast genome (

C-A-U anti-codon codes for tRNA Ile in chloroplast tRNAs
The C-A-U anti-codon is a characteristic feature of tRNA Met and has only one iso-acceptor. In addition to the presence of a C-A-U anti-codon in tRNA Met, we found that the tRNA Ile of chloroplast tRNA also encodes a C-A-U anti-codon. groups in cluster II (Fig. 8). In cluster I, tRNA Arg is grouped twice: once with tRNA Ala and once near tRNA Met. Similarly, tRNA Met is also grouped twice: once near the group containing tRNA Thr and once near the group containing tRNA Arg (Fig. 8). tRNA Arg, tRNA Ile, tRNA Leu, and tRNA Met present in cluster I are also found in cluster II of the phylogenetic tree. The tRNAs with the anti-codons G-A-C and U-A-C of tRNA Val; G-G-U and U-G-U of tRNA Thr; U-G-A, G-C-U, and G-G-A of tRNA Ser; G-C-C and U-C-C of tRNA Gly; U-A-A, U-A-G, and C-A-A of tRNA Leu; C-A-U of tRNA Ile; and U-G-C, U-C-U, and A-C-G of tRNA Arg all grouped separately (Fig. 8). tRNA Trp (CCA) is closely grouped with tRNA Arg (UCU) in cluster II, suggesting the evolution of tRNA Trp from tRNA Arg (Fig. 8). Similarly, tRNA Tyr (GUA) is closely grouped with tRNA Met (CAU) and tRNA Ile (CAU), suggesting the evolution of tRNA Tyr (GUA) and tRNA Ile (CAU) from tRNA Met (CAU). The grouping of tRNA Met (CAU) with tRNA Ile (CAU), and their similar anti-codon nucleotides, strongly suggests that tRNA Ile evolved directly from tRNA Met. In addition, the close grouping of tRNA Met (CAU) with tRNA Arg (ACG) further suggests that tRNA Arg has evolved from tRNA Met as well. The grouping of tRNA Glu (UUC) with tRNA Gly (GCC), and of tRNA His (GUG) with tRNA Gln (UUG) and tRNA Pro (UGG), suggests that these tRNAs may have evolved from a common ancestor or by a gene duplication event. tRNA Ser (GGA, GCU, UGA) grouped with tRNA Leu (UAA), which suggests that tRNA Ser evolved from tRNA Leu. Notably, tRNA Leu contains a C-A-A anti-codon, while the tRNA Leu that grouped with tRNA Ser contains a U-A-A anti-codon.
This suggests that tRNA Leu (CAA) has undergone a base substitution to give rise to tRNA Leu (UAA) and that further duplication and diversification resulted in tRNA Ser (GGA, GCU, UGA). The grouping of tRNA Ile (GAU), tRNA Lys (UUU), and tRNA Asp (GUC) together suggests their common evolutionary lineage. Further, the grouping of tRNA Met with tRNA Thr (UGU and GGU) suggests that tRNA Thr (UGU and GGU) evolved from tRNA Met. Similarly, the close phylogenetic relationship of tRNA Met with tRNA Ala and tRNA Val in cluster I indicates that tRNA Ala and tRNA Val also evolved from tRNA Met. A disparity index test of substitution pattern homogeneity was conducted using Monte Carlo replications to determine whether the substitutions and the rate of substitution of the nucleotides are homogenous. Results indicated that the null hypothesis was rejected for tRNA Arg, tRNA Gln, tRNA Ala, tRNA Met, tRNA Thr, and tRNA Val; suggesting that the rate of substitution of nucleotides in these groups is homogenous. Outside of these six tRNA isotypes, 14 did not show pattern homogeneity, and hence, the substitution of nucleotides and evolution of tRNA Gly, tRNA Pro, tRNA Ser, tRNA Leu, tRNA Phe, tRNA Asn, tRNA Lys, tRNA Asp, tRNA Glu, tRNA His, tRNA Ile, tRNA Tyr, tRNA Cys, and tRNA Trp are not homogenous. To better understand the relationship of chloroplast tRNAs with the Archaea, we incorporated tRNA sequences from two Archaea species, and the tRNA sequences of three cyanobacterial species were used as ingroups. The complementary DNA sequences of two Arabidopsis thaliana NAC transcription factors (AtNAC1 and AtNAC2) were used as outgroups (Additional file 2: Figure S1). The phylogenetic analysis showed some overlapping relationship of Archaea tRNAs with the chloroplast tRNAs. However, chloroplast tRNAs were much closer to cyanobacterial tRNAs than to those of the Archaea.

The rate of transition and transversion is isoacceptor specific
tRNAs are evolutionarily conserved molecules, and the possibility of their undergoing major transition or transversion events is minimal. The rates of transition (8.33) and transversion (8.34) of tRNA Ala, tRNA Asn, tRNA Asp, tRNA His, tRNA Phe, and tRNA Pro are almost equal. This indicates that, although the rate of transversion is slightly higher than the rate of transition, these tRNAs have evolved at almost an equal rate with respect to transition and transversion (Table 4). Additionally, the rates of transition (25.00) and transversion (0.00) were the same for tRNA Cys, tRNA Gln, tRNA Trp, and tRNA Tyr (Table 4). Notably, these four tRNAs in the chloroplast genome of monocot plants have undergone a high rate of transition but have not undergone any transversion. In contrast, the rate of transversion in tRNA Ile (8.60), tRNA Lys (10.09), and tRNA Ser (9.15) was found to be higher than the rate of transition for tRNA Ile (7.80), tRNA Lys (4.82), and tRNA Ser (6.70), respectively (Table 4). A higher transition rate was also observed in tRNA Arg (12.40), tRNA Glu (12.53), tRNA Gly (17.39), tRNA Leu (11.88), tRNA Met (16.87), tRNA Thr, and tRNA Val (Table 4). The highest rate of transition substitutions (25.00) was found in tRNA Cys, tRNA Gln, tRNA Trp, and tRNA Tyr. When all of the tRNAs are collectively examined, however, the average rate of transition (14.71) is greater than the average rate of transversion (5.15) (Table 4).
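For readers less familiar with the terminology, a transition is a purine-to-purine (A/G) or pyrimidine-to-pyrimidine (C/U) substitution, and a transversion swaps a purine for a pyrimidine or the reverse. The sketch below simply counts the two classes between a pair of aligned sequences. It is not the estimation procedure behind the rates in Table 4, which the text does not describe in detail; the function names and example sequences are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "U"}


def classify(a: str, b: str) -> str:
    """Label one aligned column as 'identical', 'transition' or 'transversion'."""
    if a == b:
        return "identical"
    if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
        return "transition"
    return "transversion"


def count_substitutions(seq1: str, seq2: str) -> dict[str, int]:
    """Count transitions and transversions between two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    counts = {"transition": 0, "transversion": 0}
    rna1 = seq1.upper().replace("T", "U")
    rna2 = seq2.upper().replace("T", "U")
    for a, b in zip(rna1, rna2):
        if a not in "ACGU" or b not in "ACGU":
            continue  # skip gaps and ambiguity codes
        kind = classify(a, b)
        if kind != "identical":
            counts[kind] += 1
    return counts


# Made-up aligned pair for illustration only.
print(count_substitutions("GCAUC-A", "ACAUUGC"))
```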

Duplication of chloroplast tRNA precedes over deletion
Plant genomes contain a greater abundance of duplicated genes and whole genome duplication events have occurred multiple times over the past 200 million years [41][42][43][44]. Given the cyanobacterial origin of the chloroplast genome, the rate of duplication and loss events could be different from genes within the nuclear-encoded genome. In the present study, duplication/loss analyses of chloroplast tRNA in monocot plants revealed that 101 genes experienced a duplication event and that 139 genes underwent losses; whereas, 80 genes underwent conditional duplication. The majority of chloroplast tRNAs underwent losses during the course of evolution. Although all of the tRNAs descended from the same lineage (monocot), the loss of genes was still greater than the duplicated genes (Fig. 9).

Discussion
tRNAs are a conserved gene family responsible for protein translation. Their presence in the chloroplast genome contributes to making the organelle semi-autonomous. Multiple sequence alignment of chloroplast tRNAs revealed several basic conserved genomic features. A few tRNAs were found to contain extended nucleotide sequences at the 5′-end. However, the tRNAscan-SE server was not able to confirm whether these 5′-end nucleotide sequences were introns. It therefore remains possible that these sequences are introns of the tRNAs. A previous study reported the presence of a group I intron in cyanobacterial tRNA [45]. Given the origin of the chloroplast genome from a cyanobacterial lineage, it is reasonable to consider that these sequences are most likely introns of the chloroplast tRNAs [45]. Analysis of each tRNA sequence revealed that tRNA Leu and tRNA Ser encode the longest tRNA sequences. A previous study also reported the presence of 80 or more nucleotides in tRNA Leu and tRNA Ser of Oryza sativa [45]. This indicates that tRNA Leu and tRNA Ser encode longer tRNA sequences than the others. This study also revealed the absence of tRNA Lys, tRNA Ala, and tRNA Ile genes in the chloroplast genomes of some of these monocot plants. The absence of important tRNA-encoding genes in the chloroplast genome is intriguing and makes it important to understand how protein translation in these monocot plants is conducted in the absence of important tRNAs. Most likely, genomic tRNAs compensate for the absence of plastidal tRNAs, or other tRNAs from the organellar genome may perform multiple functions to conduct protein translation. This is the first report regarding the absence of tRNA Lys, tRNA Ala, and tRNA Ile in the chloroplast genome. In addition to the absence of tRNA Lys, tRNA Ala, and tRNA Ile, the chloroplast genome of monocot plants also lacks selenocysteine, pyrrolysine and suppressor tRNAs (Table 1).
Our analysis also revealed that the monocot chloroplast genome contains the highest number of genes encoding tRNA Leu and tRNA Met (4), followed by tRNA Arg and tRNA Ser (3). The universal genetic table contains 64 codons, of which 61 are sense codons and 3 are stop codons. In principle, therefore, there could be tRNAs with 61 unique anti-codons to decode the 61 sense codons. Approximately 33 anti-codons were found to be absent from the tRNAs of the chloroplast genome. However, the absence of the UCC anti-codon of tRNA Gly is compensated by the presence of the GCC anti-codon of tRNA Gly, whereas the absence of the anti-codon UAC of tRNA Val is compensated by the presence of the GAC anti-codon of tRNA Val. Similarly, the absence of the anti-codon GGU of tRNA Thr is compensated by the presence of the UGU anti-codon of tRNA Thr, and that of the anti-codon UAA of tRNA Leu is compensated by the presence of the anti-codons UAG and CAA. The complete absence of a tRNA gene for tRNA Lys (UUU, CUU) in O. sativa and S. bicolor, and for tRNA Ala (AGC, GGC, CGC, and UGC), is difficult to understand. Nevertheless, it can be speculated that the deficiency created by the absence of these tRNAs in the chloroplast genome might be compensated by genomic tRNAs or other tRNAs of chloroplast or nuclear origin. The anti-codon CAU is encoded by tRNA Met and tRNA fMet. Our analysis indicated that the chloroplast genomes of the investigated monocot plants encode both tRNA Met and tRNA fMet. Previously, Howe (1985) and Hiratsuka et al. (1989) reported the presence of tRNA fMet in the chloroplast genome [46,47]. All of the species were found to contain at least one tRNA fMet and one tRNA Met. O. nivara (6128_Orni Ct006), O. sativa (3694_OrsajCt127), S. officinarum (6569), S. bicolor (20382), T. aestivum (4994), and Z. mays (1994) each encode one tRNA fMet.
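The codon arithmetic in this paragraph can be made explicit with a short sketch. It enumerates the 64 codons of the standard genetic code, removes the three stop codons, and derives the Watson-Crick anti-codon of a codon as its reverse complement. The figure of 28 used in the last line is the upper end of the 21–28 anti-codon range quoted in the abstract and is an assumption about how the "approximately 33" absent anti-codons were obtained; wobble pairing is deliberately ignored.

```python
from itertools import product

BASES = "UCAG"
STOP_CODONS = {"UAA", "UAG", "UGA"}  # standard genetic code
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def anticodon_for(codon: str) -> str:
    """Watson-Crick anti-codon (written 5'->3') for a codon: its reverse complement."""
    return codon.translate(COMPLEMENT)[::-1]


all_codons = ["".join(c) for c in product(BASES, repeat=3)]
sense_codons = [c for c in all_codons if c not in STOP_CODONS]

print(len(all_codons), len(sense_codons))  # 64 codons, 61 of them sense codons
print(anticodon_for("AUG"))                # the C-A-U anti-codon discussed above
print(len(sense_codons) - 28)              # anti-codons absent if 28 are present
```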
In the prokaryotic genome, the initiation of protein translation is mediated by tRNA fMet , whereas subsequent addition of methionine to the polypeptide chain is mediated by tRNA Met [48][49][50]. The presence of tRNA Met and tRNA fMet is a characteristic feature of prokaryotic and organellar genes [51] and the presence of tRNA fMet in the chloroplast genome of monocot plants suggests its prokaryotic origin.
tRNAs are an evolutionarily conserved multigene family due to their functional similarities across many species. The nucleotide composition of a tRNA is responsible for maintaining the tertiary structure of the transcribed tRNA. Thus, the common conserved functions of tRNA should also be reflected in conserved coding sequences. A previous study reported the presence of a conserved nucleotide consensus sequence in tRNAs which was confined to the Ψ-loop only [45]. In our study, we found a U-U-C-x-A nucleotide consensus sequence in the Ψ-loop. However, no conserved consensus sequences were found in other parts of the tRNAs. Instead, they were found to contain some conserved nucleotides. The nuclear-encoded tRNA Gln and tRNA Asn contain a U nucleotide at the first position (Table 3) [45]. However, multiple sequence alignment indicated that the sequence conservation present in chloroplast tRNAs is family specific (Table 3). During transcription, RNA polymerase binds to the intragenic promoter of the tRNA gene, which consists of the A and B boxes. These two boxes contain conserved consensus sequences. Box A starts at the +8 nucleotide of the mature tRNA, whereas box B contains the conserved 52 GGUUCGANUCC 62 nucleotide consensus that constitutes a part of the Ψ-arm and the whole Ψ-loop. Box A of chloroplast tRNA was not well conserved, whereas box B was highly conserved. Boxes A and B are considered to be the intragenic transcription promoter signal sequence for RNA polymerase III [52]. The signal sequence for transcription activation is not conserved in a universal manner in the tRNAs of the chloroplast genome. The anti-codon loop was reported to be conserved at the 32nd position [52]. However, in the present study, conservation of nucleotides was found at the 32nd and 33rd positions in the majority of cases. In addition, several tRNA sequences were found to contain a 3′ C-C-A tail.
The addition of a C-C-A tail to the 3′-end of a tRNA is facilitated by a tRNA nucleotidyltransferase. However, chloroplast genomes do not encode tRNA nucleotidyltransferases. Thus, adding a C-C-A tail to the 3′-end of the tRNA would be difficult in the absence of nucleotidyltransferases. The absence of a C-C-A tail at the 3′-end of some tRNAs may reflect their recent evolution, as the majority of nuclear tRNAs lack a 3′ C-C-A tail. Given the cyanobacterial origin of the chloroplast genome, it should be prokaryotic in nature and, in general, should be intron free. However, we found group I introns in the chloroplast tRNAs. Previous studies have also reported the presence of introns in tRNA Leu (UAA) and tRNA fMet (UAC) of cyanobacteria [53,54]. Additionally, a recent study conducted in our laboratory also reported the presence of introns in cyanobacterial tRNA Arg, tRNA Gly, and tRNA Lys [45]. Although the presence of introns in the cyanobacterial genome has been reported by several studies, the present study appears to be the first to report the presence of introns in chloroplast tRNA. Group I introns lack significant sequence conservation; however, the present analysis indicated that they contain short conserved consensus sequences. The group I intron of chloroplast tRNA grouped with the group I intron of cyanobacteria (Fig. 4), thus providing additional evidence to suggest that they evolved from a common cyanobacterial lineage.
As proposed by Robert Holley [34], tRNAs are characterized by a cloverleaf-like structure, although a few tRNAs vary in their secondary structure [35]. tRNAs contain various arms and loops that function in protein translation. Each arm and loop has its own unique nucleotide composition. A previous study reported that the acceptor arm contains 7 bp, the D-stem 3-4 bp, the D-loop 4-12 nucleotides, the anti-codon arm 5 bp, the anti-codon loop 7 nucleotides, the variable region 4-23 nucleotides, the Ψ-arm 5 bp, and the Ψ-loop 7 nucleotides [37]. The previous report, along with the present study, suggests that significant variation exists in the arms and loops of chloroplast tRNAs. The acceptor arm contains distinct information for tRNA nucleotidyltransferases. However, the absence of an acceptor arm in tRNA Gly (UCC) of O. nivara and tRNA Thr (GGU) of S. bicolor is quite intriguing. The question arises as to how a tRNA without an acceptor arm can carry an amino acid during the process of protein translation. Some tRNAs contain novel loops having A-C-U-U-U-U-G nucleotides. The stem of the novel loop allows the bonding of A to U and G to U nucleotides. The novel loop structures identified in the present study raise the question of whether these loops mimic the anti-codon loop of the tRNA and play a critical role in the protein translation machinery within the chloroplast. Some of the tRNAs were also found to contain nine nucleotides in the anti-codon loop, which may represent a novel phenomenon of tRNA. The functional impact of having nine nucleotides in the anti-codon loop remains to be determined. In addition to a few putative novel tRNA structures, chloroplast tRNAs were found to contain a C-A-U anti-codon that codes for tRNA Ile as well. However, the presence of a C-A-U anti-codon in tRNA Ile was previously reported in Bacillus subtilis [55].
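The base-pairing rules mentioned for the stems (Watson-Crick pairs plus the G to U wobble pair) can be checked mechanically. The following is a minimal sketch, assuming both strands of a stem are written 5′ to 3′ so that the second strand has to be reversed before partners line up; the function name and the 7 bp example stem are hypothetical and do not come from the analyzed tRNAs.

```python
# Watson-Crick pairs plus the G-U wobble pair.
ALLOWED = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def stem_pairing(five_prime: str, three_prime: str) -> list[bool]:
    """Report, pair by pair, whether two stem strands (both 5'->3') can base-pair."""
    if len(five_prime) != len(three_prime):
        raise ValueError("both strands of a stem must have the same length")
    partners = reversed(three_prime.upper())  # antiparallel orientation
    return [(a, b) in ALLOWED for a, b in zip(five_prime.upper(), partners)]


# Made-up 7 bp stem for illustration only; the fifth pair is a U-G wobble.
print(stem_pairing("GGGAUUG", "CAGUCCC"))
print(all(stem_pairing("GGGAUUG", "CAGUCCC")))
```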
Phylogenetic analysis of chloroplast tRNA showed two distinct clusters and multiple groupings. Some of the tRNA members of cluster I were also found to be present in cluster II, suggesting their evolution by duplication and divergence. However, the anti-codons GAC, UAC, GGU, UGU, UGA, GCU, GGA, GCC, UCC, UAA, UAG, CAA, CAU, UGC, UCU, and ACG fall independently in the phylogenetic tree, suggesting their evolution from multiple common ancestors. The overlapping grouping of tRNA family members suggests that the tRNAs with these anti-codon groups may have evolved from different common ancestors or may have arisen from duplication events. The presence of tRNA Met twice in cluster I and once in cluster II indicates that tRNA Met is one of the tRNA families that has undergone major duplication event(s) to give rise to other tRNAs. Phylogenetic analysis further revealed that tRNA Leu (CAA), tRNA Trp (CCA), tRNA Arg (UCU), tRNA Asn (GUU), tRNA Tyr (GUA), tRNA Met (CAU), tRNA Cys (GCA), and tRNA Phe (GAA) present in cluster II are the most primitive forms of tRNA, with tRNA Leu as the most basal evolutionary ancestor. The grouping of tRNA Met (CAU) with tRNA Ile (CAU), and their similar anti-codon nucleotides, strongly suggests that tRNA Ile evolved directly from tRNA Met. The overall analysis clearly indicates that tRNA Met is a major player in the evolution of tRNAs in the chloroplast genome. The distribution of tRNA Met in two different clusters strongly suggests that tRNA Met underwent several major substitution and duplication events to give rise to diverse tRNA families with distinct anti-codons. The rate of transition of chloroplast tRNAs was higher than the rate of transversion. tRNA Cys, tRNA Gln, tRNA Trp, and tRNA Tyr belong to a polar R group, and the rate of transversion is zero in tRNAs that carry polar amino acids. Polar amino acids are readily soluble in water and form strong hydrogen bonds with interacting molecules.
This suggests that the evolution of chloroplast tRNA Cys, tRNA Gln, tRNA Trp, and tRNA Tyr strongly favors transition substitutions over transversion substitutions and that some tRNA isoacceptors undergo transition more readily than transversion. A few tRNAs, however, underwent a higher rate of transversion than transition, suggesting that the rate of evolution and the rates of transition and transversion are isoacceptor-specific and that tRNAs have not evolved at an equal rate. In addition to mutational events, gene duplication is a major force in evolution and represents an important mechanism by which species acquire new genes [56]. The majority of novel gene functions have evolved through gene duplication events, which can occur by genome duplication, retrotransposition, and unequal crossing over [57,58]. Ancient duplication events, coupled with the retention of extant pairs of duplicated genes, have contributed enormously to the evolution of gene families and functional diversification [59]. Plant genomes tend to evolve at a high rate, leading to greater genome diversity relative to other organisms [60]. The study of chloroplast tRNAs showed that the rate of tRNA loss is higher than the rate of duplication. This suggests that the maternally inherited, cyanobacteria-derived chloroplast genome is more intact than the nuclear genome of the plant. Nevertheless, although the species were part of the same lineage, some genes were still lost within each species. This provides further evidence that cyanobacterial tRNAs originated from polyphyletic common ancestors, and hence, loss events are more pronounced than duplication events. Almost all of the tRNAs experienced loss events in one or more of the species studied (Table 5).

Conclusion
We conducted a tRNA analysis of the chloroplast genomes of six monocot plants and found that the chloroplast genome in these plant species encodes 28 to 39 tRNA genes. The number of tRNA isoacceptors ranged from 23 to 29, and the majority of tRNAs were associated with only one isoacceptor. The tRNAs in the chloroplast genome were also found to contain a group I intron in the anti-codon region, and a phylogenetic analysis revealed that the chloroplast tRNAs in monocot plants evolved from multiple common ancestors. The chloroplast genomes of the examined monocot plant species were also found to contain putative novel tRNAs, which need to be further investigated to understand  [64].

Analysis of chloroplast tRNA of monocot plants
The collected genomic sequences of chloroplast tRNAs of monocot plants were subjected to further analysis using ARAGORN and the tRNAscan-SE server [65]. Default parameters were used to analyze the genomic tRNA sequences in ARAGORN. In the tRNAscan-SE server, the following parameters were used to analyze the genomic tRNA: sequence source, bacterial; search mode, default; query sequences, formatted (FASTA); and genetic code for tRNA isotype prediction, universal. All of the tRNAs were analyzed using the same parameters, and the number and composition of nucleotides in the different arms and loops were recorded individually. The tRNAs that were found to have a structure different from the canonical cloverleaf-like structure characteristic of tRNA were considered putative novel tRNAs.
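The screening step described above can be illustrated with a minimal sketch. It assumes the arm and loop sizes have already been recorded per tRNA, and it uses the size ranges quoted from [37] in the discussion as the definition of "canonical". The dictionary keys and the example record are hypothetical names introduced only for illustration; they are not output fields of any prediction tool.

```python
# Canonical arm/loop sizes as quoted from ref. [37] in the text
# (bp for arms/stems, nucleotides for loops). Key names are hypothetical.
CANONICAL_RANGES = {
    "acceptor_arm_bp": (7, 7),
    "d_stem_bp": (3, 4),
    "d_loop_nt": (4, 12),
    "anticodon_arm_bp": (5, 5),
    "anticodon_loop_nt": (7, 7),
    "variable_region_nt": (4, 23),
    "psi_arm_bp": (5, 5),
    "psi_loop_nt": (7, 7),
}


def noncanonical_features(record):
    """Return the features whose size is missing or outside the canonical range."""
    flagged = []
    for feature, (low, high) in CANONICAL_RANGES.items():
        size = record.get(feature, 0)  # a missing arm or loop is treated as size 0
        if not low <= size <= high:
            flagged.append(feature)
    return flagged


# Hypothetical record: a tRNA with a nine-nucleotide anti-codon loop.
example = {
    "acceptor_arm_bp": 7,
    "d_stem_bp": 4,
    "d_loop_nt": 8,
    "anticodon_arm_bp": 5,
    "anticodon_loop_nt": 9,
    "variable_region_nt": 5,
    "psi_arm_bp": 5,
    "psi_loop_nt": 7,
}
print(noncanonical_features(example))
```

A tRNA for which the returned list is non-empty would be a candidate putative novel tRNA under this simplified rule; the structural judgment in the study itself was made from the predicted secondary structures.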

Multiple sequence alignment
To identify and analyze the conserved nucleotide sequences of tRNA isotypes, the nucleotide sequences of the 20 isotypes were grouped separately. The tRNA isotypes were then subjected to multiple sequence alignment using the Multalin server. All of the sequences, in FASTA format, were used in the alignment analysis with the following parameters: sequence input format, auto; display of sequence alignment, colored; alignment matrix, Blosum62-12-2; gap penalty at opening and extension, default; gap penalty at extremities, none; and one iteration only, none. The highest alignment consensus value was maintained at 90% (default), whereas the lowest consensus value was kept at 50% (default). In the displayed alignments, red indicates a similarity/conservation of 90% or more, whereas blue indicates a sequence conservation of less than 90%. Alignment positions displayed in black indicate no conservation.
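The two consensus thresholds can be illustrated with a simplified sketch that classifies each alignment column by the frequency of its most common residue. This is an approximation written for illustration, not the Multalin implementation; the function name and the toy alignment are hypothetical.

```python
from collections import Counter

HIGH = 0.90  # high-consensus threshold (90%, as stated in the text)
LOW = 0.50   # low-consensus threshold (50%, as stated in the text)


def column_classes(aligned):
    """Classify each alignment column as 'high', 'low', or 'none'."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")
    classes = []
    for column in zip(*aligned):
        residues = [c for c in column if c != "-"]  # gaps never form a consensus
        top = Counter(residues).most_common(1)[0][1] if residues else 0
        fraction = top / len(column)
        if fraction >= HIGH:
            classes.append("high")   # shown in red in the colored display
        elif fraction >= LOW:
            classes.append("low")    # shown in blue
        else:
            classes.append("none")   # shown in black
    return classes


# Toy alignment of four hypothetical sequences.
toy = ["ACGU", "ACGA", "ACCG", "AUC-"]
print(column_classes(toy))
```

In the toy alignment, the first column is fully conserved, the second and third reach the 50% threshold, and the last has no residue present in at least half of the sequences.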

Construction of phylogenetic tree
To analyze the evolution of chloroplast tRNAs in monocot plants, a phylogenetic tree was constructed using MEGA6 software [66]. Prior to construction of the phylogenetic tree, a Clustal file of all the tRNAs was created using the Clustal Omega server. The generated Clustal file of tRNAs was converted to the MEGA file format using MEGA6 software. Model selection was then conducted in MEGA6 using the following statistical parameters: analysis, model selection (ML); tree to use, automatic (neighbor-joining); statistical method, maximum likelihood; substitution type, nucleotide; gaps/missing data treatment, partial deletion; and site coverage cutoff, 95%. The model with the lowest Bayesian information criterion (BIC) score was considered the best model for constructing the phylogenetic tree. The lowest BIC score, 7785.682, was obtained for the Kimura 2-parameter + G + I model; this model was therefore used to construct the phylogenetic tree. The statistical parameters used for tree construction with the Kimura 2-parameter + G + I model were: analysis, phylogeny reconstruction; statistical method, maximum likelihood; test of phylogeny, bootstrap method; no. of bootstrap replicates, 1000; substitution type, nucleotide; rates among sites, Gamma distributed with invariant sites (G + I); no. of discrete Gamma categories, 5; gaps/missing data treatment, partial deletion; site coverage cutoff, 95%; and branch swap filter, very strong.
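The lowest-BIC rule can be sketched as follows, using the standard definition BIC = -2 ln L + k ln(n), where ln L is the maximized log-likelihood, k is the number of free parameters, and n is the sample size. The log-likelihoods, parameter counts, and sample size below are hypothetical values chosen for illustration; they are not the values obtained in this study, and the exact sample-size definition used by MEGA6 is not assumed here.

```python
import math


def bic(log_likelihood, n_params, sample_size):
    """Bayesian information criterion: -2 ln L + k ln(n)."""
    return -2.0 * log_likelihood + n_params * math.log(sample_size)


def best_model(candidates, sample_size):
    """Return (name of the lowest-BIC model, dict of all BIC scores).

    candidates maps a model name to (log_likelihood, n_params).
    """
    scores = {
        name: bic(lnl, k, sample_size) for name, (lnl, k) in candidates.items()
    }
    return min(scores, key=scores.get), scores


# Hypothetical candidates: (maximized log-likelihood, free parameters).
candidates = {
    "JC": (-4000.0, 10),
    "K2": (-3900.0, 11),
    "K2+G+I": (-3850.0, 13),
}
name, scores = best_model(candidates, sample_size=100)
print(name)
```

The penalty term k ln(n) grows with model complexity, so a richer model is preferred only when its gain in likelihood outweighs the extra parameters.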

Analysis of transition and transversion
The MEGA-format file of tRNAs used to construct the phylogenetic tree was also used to analyze the transition/transversion rate for all of the tRNAs. Additionally, the transition/transversion rates of the 20 tRNA isotypes were studied separately. The tRNA isotypes were subjected to multiple sequence alignment using the Clustal Omega server to generate a Clustal file for each individual isotype. The generated Clustal files of tRNA isotypes were converted to the MEGA file format, and the rate of substitution was estimated using MEGA6 software. The following statistical parameters were used to study the transition/transversion rates in the chloroplast tRNAs of monocot plants: analysis, substitution pattern estimation (ML); tree to use, automatic (neighbor-joining tree); statistical method, maximum likelihood; substitution type, nucleotide; model/method, Kimura 2-parameter model; rates among sites, Gamma distributed (G); no. of discrete Gamma categories, 5; gaps/missing data treatment, partial deletion; site coverage cutoff, 95%; and branch swap filter, very strong.
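To make the distinction between the two substitution classes concrete, the sketch below counts observed transitions (purine to purine, or pyrimidine to pyrimidine) and transversions (purine to pyrimidine, or the reverse) between two aligned sequences. This is a raw pairwise count for illustration only, not the maximum-likelihood estimate produced by MEGA6; the function name and example sequences are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "U"}
BASES = PURINES | PYRIMIDINES


def count_ts_tv(seq_a, seq_b):
    """Return (transitions, transversions) between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    # Treat DNA-style T as U so both alphabets are handled the same way.
    seq_a = seq_a.upper().replace("T", "U")
    seq_b = seq_b.upper().replace("T", "U")
    transitions = transversions = 0
    for a, b in zip(seq_a, seq_b):
        if a == b or a not in BASES or b not in BASES:
            continue  # skip identical sites, gaps, and ambiguous symbols
        if (a in PURINES) == (b in PURINES):
            transitions += 1    # A<->G or C<->U
        else:
            transversions += 1  # purine<->pyrimidine
    return transitions, transversions


# Hypothetical aligned pair differing at three sites.
print(count_ts_tv("GCAUUGGUAG", "GCGUCGGAAG"))
```

In this example the A/G and U/C differences are transitions and the U/A difference is a transversion.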

Disparity index analysis
To determine whether nucleotide substitutions occurred homogeneously (at equal rates) during evolution, a disparity index test of substitution pattern homogeneity was conducted. The statistical parameters used were: analysis, disparity index test of substitution pattern homogeneity; scope, in sequence pairs; no. of Monte Carlo replications, 1000; substitution type, nucleotide; gaps/missing data treatment, partial deletion; and site coverage cutoff, 95%.
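As a rough illustration of the quantity being tested, the sketch below computes a disparity index for one aligned sequence pair. It assumes the commonly described form of the statistic, in which the composition distance (half the sum of squared differences in base counts) is compared with the number of observed differences; this form is an assumption here, the sketch does not include the Monte Carlo significance test, and it is not the MEGA6 implementation. The function name and example sequences are hypothetical.

```python
from collections import Counter

BASES = "ACGU"


def disparity_index(seq_a, seq_b):
    """Return (composition_distance, n_differences, disparity_index).

    Assumed form: I_D = 0.5 * sum_i (x_i - y_i)^2 - N_d, where x_i and y_i
    are the counts of base i in each sequence and N_d is the number of
    differing sites. Only sites with an unambiguous base in both sequences
    are used.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a in BASES and b in BASES]
    counts_a = Counter(a for a, _ in pairs)
    counts_b = Counter(b for _, b in pairs)
    composition = 0.5 * sum((counts_a[x] - counts_b[x]) ** 2 for x in BASES)
    n_diff = sum(1 for a, b in pairs if a != b)
    return composition, n_diff, composition - n_diff


# Hypothetical pair whose base composition differs strongly.
print(disparity_index("AAAACCGG", "GGGGCCGG"))
```

Under this form, a value well above zero indicates that the two sequences differ in base composition more than their number of differences alone would explain, which is the kind of pattern heterogeneity the test is designed to detect.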

Analysis of gene duplication and loss
To analyze the duplication and loss events of tRNA genes, a species tree covering all six species was first constructed using the NCBI taxonomy browser (https://www.ncbi.nlm.nih.gov/Taxonomy/CommonTree/wwwcmt.cgi). The species used to construct the species tree were O. nivara, O. sativa, S. officinarum, S. bicolor, T. aestivum, and Z. mays. The phylogenetic tree used for the evolutionary analysis served as the gene tree. Gene duplication/loss events were studied using Notung 2.6 software. During the analysis, the gene tree was reconciled with the species tree to identify the duplication and loss nodes of the genes.