Phylogeny and a structural model of plant MHX transporters
© Gaash et al.; licensee BioMed Central Ltd. 2013
Received: 2 October 2012
Accepted: 13 March 2013
Published: 2 May 2013
Skip to main content
© Gaash et al.; licensee BioMed Central Ltd. 2013
Received: 2 October 2012
Accepted: 13 March 2013
Published: 2 May 2013
The Arabidopsis thaliana MHX gene (AtMHX) encodes a Mg2+/H+ exchanger. Among non-plant proteins, AtMHX showed the highest similarity to mammalian Na+/Ca2+ exchanger (NCX) transporters, which are part of the Ca2+/cation (CaCA) exchanger superfamily.
Sequences showing similarity to AtMHX were searched in the databases or sequenced from cDNA clones. Phylogenetic analysis showed that the MHX family is limited to plants, and constitutes a sixth family within the CaCA superfamily. Some plants include, besides a full MHX gene, partial MHX-related sequences. More than one full MHX gene was currently identified only in Oryza sativa and Mimulus guttatus, but an EST for more than one MHX was identified only in M. guttatus. MHX genes are not present in the currently available chlorophyte genomes. The prevalence of upstream ORFs in MHX genes is much higher than in most plant genes, and can limit their expression. A structural model of the MHXs, based on the resolved structure of NCX1, implies that the MHXs include nine transmembrane segments. The MHXs and NCXs share 32 conserved residues, including a GXG motif implicated in the formation of a tight-turn in a reentrant-loop. Three residues differ between all MHX and NCX proteins. Altered mobility under reducing and non-reducing conditions suggests the presence of an intramolecular disulfide-bond in AtMHX.
The absence of MHX genes in non-plant genomes and in the currently available chlorophyte genomes, and the presence of an NCX in Chlamydomonas, are consistent with the suggestion that the MHXs evolved from the NCXs after the split of the chlorophyte and streptophyte lineages of the plant kingdom. The MHXs underwent functional diploidization in most plant species. De novo duplication of MHX occurred in O. sativa before the split between the Indica and Japonica subspecies, and was apparently followed by translocation of one MHX paralog from chromosome 2 to chromosome 11 in Japonica. The structural analysis presented and the identification of elements that differ between the MHXs and the NCXs, or between the MHXs of specific plant groups, can contribute to clarification of the structural basis of the function and ion selectivity of MHX transporters.
A number of Mg2+ transport proteins have been identified in prokaryotic and eukaryotic organisms (reviewed in [1–4]). The bacterial CorA Mg2+ channel includes two transmembrane segments (TMSs) . Homologs of CorA, termed Alr proteins, were found in the plasma membrane of yeast . Human MRS2, a mitochondrial Mg2+ channel, shares many of the properties of the bacterial CorA and yeast Alr1 proteins (reviewed in ). Another family of bacterial Mg2+ transporters comprises the MgtA and MgtB proteins that have ten TMSs and are members of the superfamily of P-type ATPases (reviewed in ). Mammalian Mg2+ transporters of the SLC41 family share similarity with some regions of the bacterial MgtE transporters, which possess five TMSs (reviewed in ). The mammalian ancient conserved domain protein (ACDP) Mg2+ transporters were also identified in prokaryotes (reviewed in ). However, other mammalian transporters, including the TRPM6/7, MagT, NIPA, MMgT, and HIP14 families, were not identified in prokaryotic genomes (reviewed in ).
Four groups of proteins were shown to carry Mg2+ ions in plants (reviewed in [1, 2, 4]). The first group comprises transporters that are part of the CorA superfamily [3, 5, 6]. In Arabidopsis thaliana, this gene family has ten members, and the family was named MRS2 , or alternatively MGT . The slow-vacuolar (SV) channel is a non-selective cation channel that can also carry Mg2+ ions (reviewed in ). The cyclic nucleotide-gated channel AtCNGC10 carries Ca2+ and Mg2+ ions . The forth group consists of magnesium proton exchangers (MHX) proteins. The activity of a protein that exchanges protons with Mg2+, Zn2+ and Cd2+ ions was first identified in vacuolar vesicles of rubber tree (Hevea brasiliensis) [9, 10]. Electrophysiological analysis and overexpression studies indicated that the A. thaliana MHX gene (AtMHX) encodes a Mg2+/H+ exchanger, which exchanges protons with Mg2+, Zn2+, Cd2+, and possibly Fe2+ ions across the vacuolar membrane [11, 12]. As the vacuole is acidic compared to the cytosol, AtMHX apparently sequesters the metal cations into the vacuolar lumen, at the expense of releasing vacuolar protons into the cytosol. It is currently unknown whether the main impact of AtMHX on plant physiology is related to metal or proton (pH) homeostasis [12, 13]. AtMHX is highly expressed in the vascular region, particularly the phloem, of tissues with photosynthetic potential . The 5′ untranslated region (5′ UTR) of AtMHX includes an AUG codon upstream to the initiation codon of the main open reading frame (ORF). The resulting upstream ORF (uORF) significantly inhibits AtMHX expression, by inhibiting its translation  and subjecting its transcript to degradation by the nonsense mediated mRNA decay (NMD) pathway .
AtMHX showed high similarity (32% identity) to mammalian sodium calcium exchanger (NCX) transporters . NCX proteins are included in the Ca2+/cation (CaCA) exchanger superfamily. This superfamily was defined as a group of transporters that carry cytosolic Ca2+ ions across membranes against their electrochemical gradient, by utilizing the electrochemical gradients of other cations, such as H+, Na+, or K+. The CaCA superfamily was classified into five major families, which were named, according to their first characterized member, YRBG, CAX, CCX, NCX, and NCKX [17, 18]. YRBG transporters were mainly found in bacteria . CAX (CAtion eXchangers) are cation/H+ exchangers found in plants, bacteria, fungi, and lower vertebrates, but not in higher animals (reviewed in ). All plant CAX genes tested thus far transported Ca2+, Mn2+, Cd2+, and Zn2+ to varying degrees . CCX (Ca2+/cation exchangers) characterized thus far catalyse both Na+/Ca2+ and Li+/Ca2+ exchange (reviewed in ). NCX are Na+/Ca2+ exchangers, and NCKX are K+-dependent Na+/Ca2+ exchangers. NCX and NCKX proteins were identified in mammals, nematodes, insects, squid, and algae [17, 21, 22].Vertebrates NCX proteins were clasified into four groups (named NCX1-4) [18, 21, 23].
NCX1 includes a large intracellular loop between TMSs 5 and 6. This loop is not essential for Na+-Ca2+ exchange activity, but has a regulatory function . NCX1 is activated by binding of intracellular Ca2+ ions to two high-affinity Ca2+-binding domains, called CBD1 and CBD2, which are located in this large loop  (Figure 1). NCX1 inactivation by intracellular Na+ ions is mediated by a basic 20-amino acid segment of this large loop, called the XIP (eXchanger Inhibitory Peptide) region [40, 41] (Figure 1).
The crystal structure of a prokaryotic Na+/Ca2+ exchanger (NCX_Mj of the archaea Methanococcus jannaschii) was recently characterized . While the biochemically-determined topological model of the mammalian NCX1 exchanger only possesses nine TMSs, the crystal structure of NCX_Mj contains ten TMSs.
With the recent increase in genomic information from various organisms, more knowledge is gained about the phylogenetics of plant metal transporters [22, 43, 44]. To increase our knowledge about the MHX group of Mg2+ transporters, we present here a phylogenetic and structural analysis of 31 MHX proteins originating from 26 plant species.
We determined the cDNA sequence of the MHX genes of Solanum lycopersicum (tomato), Solanum tuberosum (potato), and Triticum aestivum (wheat). All sequences were determined on both strands. The specific clones used for sequencing were CTOA20E13, derived from S. lycopersicum cv. TA492, ST_BEa0006L05, derived from S. tuberosum cv. Bintje, and whoh15o16 (BJ273167), derived from T. aestivum cv. Chinese Spring. The CTOA20E13 clone of S. lycopersicum includes a foreign DNA insert of 112 bp, whose borders were determined by cloning the corresponding region of S. lycopersicum cv. VF-36.
The RefSeq (NCBI) and Phytozome databases were searched using the BLASTP and TBLASTN commands. The JGI database was searched (using the TBLASTN command) for additional algal proteins (at http://genome.jgi.doe.gov/genome-projects/pages/projects.jsf?kingdom=Alga). Additional file 1 presents a table of all proteins analysed in this study, and Additional file 2 provides the sequences, accession numbers and explanations about manual modifications (see below). Proteins identified based on their similarity to AtMHX are detailed in section 1 of Additional file 1. Some sequences were partial, or apparently included extra N-terminal regions, probably due to misannotation of introns. These sequences were manually re-annotated based on similarity searches in the genomic sequences (Additional file 2). Most proteins were named here by their source organisms. The two MHX paralogs identified in each of the Oryza sativa (rice) subspecies, Japonica and Indica, were named O.sativa_J1 (J1 stands for Japonica 1), O.sativa_J2, O.sativa_I1 (I1 stands for Indica 1), and O.sativa_I2. Additional similarity searches were carried out using the proteins identified in Chlamydomonas reinhardtii, Selaginella moellendorffii, and Physcomitrella patens as queries. Only searches using the sequence from C. reinhardtii allowed us to identify new proteins, and this was followed by similarity searches using some of the new proteins as queries. The resulting 14 proteins are included in section 2 of Additional file 1. Nine proteins that were identified in lower organisms based on their similarity to Homo sapiens NCX1 (HsNCX1) are included in section 3 of Additional file 1. The proteins in sections 2 and 3 of Additional file 1 were named according to their source organisms, together with an arbitrary number when more than one protein was identified in the same organism (this number is, therefore, not related to the four groups of vertebrate NCXs).
The number of EST clones derived from each of the M. guttatus and P. patens MHX genes was determined using files including the EST data of these plants, which were sent to us ahead of publication by the Phytozome database.
The phylogenetic analyses included marker proteins for the CAX, CCX, NCKX and YRBG families of the CaCA superfamily (section 4 of Additional file 1). These markers were chosen from the proteins classified by Cai and Lytton , which performed the first phylogenetic analysis of the CaCA superfamily. The A. thaliana CAX1-5 proteins served as markers for the CAX family. The A. thaliana CAX7-9 and CAX11 proteins served as markers for the CCX family. Although the latter proteins were initially called CAX, they belong to the CCX family [17, 19]. For clarity, the name of each marker protein included, besides the transporter name, the name of the family to which it belongs. For example, AtCAX7 was called AtCAX7_CCX. Markers for the NCX family (section 4.5 of Additional file 1) were chosen from the proteins classified by Marshall and co-workers . The latter proteins were named by their source organisms, the name of their family (NCX), and a number that indicates to which of the four groups of vertebrate NCXs they belong, as indicated by . Updated sequence information (as for April 2011) was retrieved from the databases for each of the marker proteins.
Phylogenetic analyses were conducted using MEGA5. We utilized the Maximum Likelihood method based on the Jones-Taylor-Thornton (JTT) amino acid substitution model . The bootstrap consensus trees inferred from 1000 replicates  were presented. Branches corresponding to partitions reproduced in less than 50% bootstrap replicates were collapsed. The phylogenetic trees were drawn to scale, with branch lengths measured in the number of substitutions per site. All positions containing gaps and missing data were eliminated.
Western blot analysis of AtMHX under reducing or non-reducing conditions (in the presence of β-mercaptoethanol or N-ethylmaleimide, respectively) was carried out according to  with some modifications. Tobacco (Nicotiana tabaccum cv. Samsun NN) plants overexpressing AtMHX  were grown in a climate-controlled greenhouse with a photoperiod of 16 h light and 8 h darkness. Leaves of five-week-old plants were harvested and crushed into a fine powder in liquid nitrogen. Frozen plant powder (25 mg) was added into pre-weighted tubes containing 150 μl of 125 mM Tris-HCL pH 6.8, 20% glycerol, 6% SDS, 0.006% bromophenol blue, protease inhibitors (1 mM PMSF, 0.5 μg/ml leupeptin, and 1 μg/ml aprotinin), and either 2.145 M β-mercaptoethanol or 25 mM N-ethylmaleimide (NEM). The samples were mixed well and incubated on ice for 2 h with frequent mixing. The samples were then centrifuged for 2 min at 4°C, 14,000 g, and aliquots that derived from 2.5 mg plant powder were fractionated by SDS-PAGE and blotted on a PVDF membrane (Bio-Rad). Western blot analysis was performed by the chemiluminescence method, using polyclonal antibodies against a peptide from AtMHX sequence (Cys-Glu-Glu-Ile-Asp-Thr-Ser-Lys-Asp-Asp-Asn-Asp-Asn-Asp-Val-His-Asp) that were affinity-purified prior to use against the same peptide using the SulfoLink Coupling Gel (Pierce).
Genes with similarity to AtMHX were searched in the databases or sequenced by us (Additional files 1 and 2) (see Methods). For the sake of clarity, most proteins were named here according to their source organisms. To determine the identity of the identified proteins, it was necessary to include in the phylogenetic analysis marker proteins for each of the five known families of the CaCA superfamily (section 4 of Additional file 1; see Methods). In particular, many marker proteins were used for the NCX family.
MHX proteins were found in three divisions of the plant kingdom. The 28 MHXs identified in the first division, Magnoliophyta (angiosperm) showed a similarly level of ~70% to each other (Additional files 1 and 3). In the second division, Lycopodiophyta, a protein showing 41% similarity to AtMHX was found in S. moellendorffii, a vascular non-seed plant. In the third division, Bryophyta, two proteins showing 39 and 40% similarity to AtMHX were found in the moss P. patens, a non-vascular, non-seed plant. Although the angiosperm and non-seed plant proteins appeared in two different clades (Figure 2), the non-seed plant proteins showed a higher similarity to angiosperm MHXs than to NCX proteins (scores of ~40 and ~30%, respectively; Additional file 3), and were, therefore, considered part of the MHX family. The existence of MHX family members in the moss P. patens indicates that although the vascular system is one of the major sites of AtMHX expression in A. thaliana, MHX proteins also exist in non-vascular plants.
Database searches identified proteins with similarity to AtMHX in a fourth division of the plant kingdom, namely Chlorophyta (green algae). Within the Chlorophyta, six proteins showing similarity to AtMHX were identified in four single-celled algae - Ostreococcus lucimarinus, Ostreococcus sp. RCC809, Micromonas pusilla, and C. reinhardtii (section 1.1.4 of Additional file 1). The phylogenetic analysis showed that the protein identified in C. reinhardtii belongs to the NCX family, while the five proteins identified in the three other species belong to the NCKX family (Figure 2 and Additional file 1). NCX and NCKX proteins were previously identified in algae . Among the chlorophytes whose genomic information is currently available, no proteins showing similarity to AtMHX were identified in Volvox carteri or Chlorella sp. NC64A. The currently available sequences of red algae do not include a protein with similarity to AtMHX. Proteins showing similarity to AtMHX were identified in some algae that do not belong to the Plantae but to the Chromalveolata kingdom (section 1.2 of Additional file 1). Among the nine proteins identified in Chromalveolata, three belong to the NCX family while four proteins belong to the NCKX family (Figure 2 and Additional file 1). Two additional proteins identified in Thalassiosira pseudonana apparently belong to a novel family of the CaCA superfamily (Figure 2).
Most of the proteins identified by their similarity to the C. reinhardtii NCX (CrNCX) (section 2 of Additional file 1), as well as the proteins identified in lower organisms by their similarity to HsNCX1 (section 3 of Additional file 1), were more related to NCX than to MHX proteins (Figure 2, and Additional files 1, 3, and 4). Searches in a large number of fungal species whose genome sequences were available revealed only five proteins showing limited similarity to AtMHX. The protein identified in Schizosaccharomyces pombe was classified to the CCX family . The analysis showed that the proteins identified in Botryotinia fuckeliana and Yarrowia lipolytica were related to the CCX family as well (Figure 2). The proteins identified in the fungi Aspergillus nidulans and Schizophyllum commune could be the founder members of novel families of the CaCA superfamily (Figure 2). It, therefore, seems that there are no close homologs of MHX proteins in fungi. Yet, it is possible that Mg2+/H+ exchange is carried out in fungi by proteins with a different phylogenetic identity. It was reported that there is Mg2+/H+ exchange activity in Saccharomyces cerevisiae, but the involved protein has still not been identified.
Many prokaryotic proteins showed some degree of similarity to AtMHX or HsNCX1. However, preliminary phylogenetic analyses showed that even the prokaryotic proteins with the highest similarity to AtMHX or HsNCX1, which were identified in Desulfococcus oleovorans and the cyanobacterium Oscillatoria, respectively, did not belong to either the MHX or NCX families. We, therefore, did not include in the phylogenetic analysis other prokaryotic proteins except those of D. oleovorans, Oscillatoria, and M. jannaschii. The latter species was added because the crystal structure of its Na+/Ca2+ exchanger was recently characterized , and we wanted to evaluate the phylogenetic relatedness of this protein to MHX and NCX family proteins. As shown in Figure 2, the M. jannaschii Na+/Ca2+ exchanger belongs to the YRBG family, the D. oleovorans protein belongs to the NCKX family, and the Oscillatoria protein can belong to a novel family of the CaCA superfamily. It, therefore, seems that prokaryotes do not contain MHX-family proteins.
The analyses of all currently available sequences indicated that the MHX family is limited to plants and does not exist in other organisms. The plant kingdom includes two major phylogenetic groups that splitted ~1.2 billion years ago , namely the streptophytes (containing land plants and charophyte algae) and the chlorophytes (containing green algae, such as C. reinhardtii and O. lucimarinus). Genomic information is not yet available for charophyte algae. The currently available chlorophyte genomes do not include proteins with homology to the MHXs, but only to NCX or NCKX proteins. As indicated by their pairwise similarity scores (Additional files 1 and 3), the MHXs are more similar to the NCXs than to the NCKXs. The currently available data are consistent with the suggestion that the MHXs evolved from the NCXs after the split of the chlorophyte and streptophyte lineages of the plant kingdom.
Most plants had only one MHX ortholog. Among the angiosperm whose genome sequences were available, two full MHX genes were identified only in Mimulus guttatus (an eudicot) and O. sativa (a monocot) (see below). Three of the plant species analysed here are polyploid: T. aestivum (a hexaploid), and S. tuberosum and Glycine max (which are both tetraploids). The G. max genome had been completely sequenced. Chromosome 2 of G. max includes a 1 kb fragment with high similarity to part of the full G. max MHX gene located on chromosome 10 (see Additional file 2). This 1 kb fragment cannot encode a full MHX protein. This indicates that the MHX loci were reduced to only two alleles subsequent to tetraploidization of the G. max genome. The secondary inactivation of some alleles subsequent to a previous polyploidization event, which results in the remaining of only two active alleles, was termed ‘functional diploidization’ (e.g., [55, 56]). This process can occur, for example, by turning some of the genes into pseudogenes . Thus, the MHX loci of G. max underwent functional diploidization. Only one cDNA clone encoding a full MHX protein could be identified in the currently available databases of T. aestivum and S. tuberosum, and all the ESTs found were related to the single cDNA identified in each of the two species. One possible explanation for the identification of only one MHX cDNA in each of these species is related to the fact that their genomes have not been completely sequenced. Yet, the absence of ESTs corresponding to another MHX gene in these polyploid species is in line with the possibility that their MHX genes underwent functional diploidization, which rendered extra gene copies non-functional.
Zea mays (maize) is a diploid but includes, in additional to a full MHX gene, two regions with similarity to parts of this gene (see Additional file 2). The existence of more than one copy of certain genes in diploid plants can be explained by the fact that there have been widespread genome duplication events throughout the history of flowering plants, including most species that are now considered to be diploid ( and references therein). A polyploidization event occurred about 70 million years ago (MYA) in the common ancestor of the major cereals . Maize underwent another genome-wide duplication event approximately 11 MYA . However, currently, only one functional MHX gene remains in the maize genome, indicating that the MHXs of this species underwent functional diploidization.
Although O. sativa is a diploid, it has two MHX genes. The most recent genome-wide duplication event in rice history predated rice divergence from sorghum . This fact, together with the phylogenetic trees shown in Figure 3, indicate that the two MHX genes of rice originated from isolated gene duplication and not from a genome-wide duplication event. The two proteins identified in each of the O. sativa subspecies Japonica and Indica showed a high similarity (97%) to each other (Figure 3, and Additional file 3). These four proteins segregated into two mini-clades, each including one Japonica and one Indica ortholog (Figure 3B). This suggests that MHX gene duplication in Oryza occurred before the split between the O. sativa Indica and Japonica subspecies, which was estimated to have occurred 200,000–400,000 years ago [61, 62]. It is interesting that the two Indica MHX paralogs are located relatively close to each other (about 57,000 bp apart) on chromosome 2, whereas in Japonica, one MHX paralog is located on chromosome 2 and the other on chromosome 11. This indicates that one MHX paralog had translocated in one of the two subspecies (most likely, following an initial duplication in chromosome 2, one Japonica MHX paralog translocated to chromosome 11). The two O. Sativa orthologous MHX proteins of the first mini-clade, O.sativa_J1 and O.sativa_I1 (of the Japonica and Indica subspecies, respectively), are highly similar to each other (Figure 3B). The two orthologs of the second mini-clade, O.sativa_J2 and O.sativa_I2, are somewhat less similar to each other (Figure 3B). Thus, the second mini-clade underwent a more rapid evolutionary divergence process compared to the first one. Moreover, for both Japonica and Indica, all the ESTs of MHX genes that were found in the RefSeq database (nine for Japonica and three for Indica) originated from the first, but not the second, mini-clade. This suggests that the second mini-clade is expressed at a much lower level than the first, or its expression is much more restricted at the spatial or temporal level. These data are also consistent with the possibility that the genes of the second mini-clade turned into pseudogenes and are, therefore, neither expressed nor subjected to natural selection that restricts their divergence process. In accord with this possibility, we were unable to identify in the genomic sequence of the Indica subspecies the coding region of the last ~20 amino acids of the O.sativa_I2 protein, despite the high similarity in this region between the MHXs and, in particular, the three other O. sativa MHX sequences (Additional files 2 and 5). O.sativa_I2 was, therefore, omitted from all sequence alignments.
The presence of two MHX genes in M. guttatus, which was sequenced from the diploid inbred line IM62, can be explained by the fact that members of the genus Mimulus underwent polyploidization events in their history . As far as we are aware, no data were published about the exact time of the polyploidization event in M. guttatus history, but the observation of only 74% similarity between the two MHXs of M. guttatus (Figure 3 and Additional file 3) suggests that considerable time had passed since this event. One EST could currently be identified in the M. guttatus EST database for each of its two MHX genes. This makes M. guttatus the only plant species in which evidence for the expression of more than one MHX-paralogous gene was obtained thus far.
The moss P. patens includes two MHX paralogous genes, whose deduced protein sequences share 65% similarity with each other (Additional file 3). Similar to all mosses, the enduring P. patens plant represents the haploid gametophyte. However, there is evidence that similar to most seed plants, P. patens is a paleopolyploid, which underwent a genome duplication event between 30-60 MYA . The phylogenetic analysis indicates that MHX gene duplication occurred in mosses (or some of them) after the divergence of vascular plants from the mosses, which occurred approximately 460 MYA ( and references therein). It is, therefore, likely that MHX gene duplication resulted from the indicated polyploidization event in P. patens history. One EST was found in the database for the gene encoding P.patens_1, but no EST was found for P.patens_2. In addition, we were unable to undoubtedly identify the sequence encoding the first ~16 amino acids of P.patens_2 in the P. patens genomic data, although this may result from the low similarity between the MHXs in this region. More data will be necessary to clearly determine if P.patens_2 is expressed, but based on the current data it is possible that MHX underwent functional diploidization in (the diploid generation of) P. patens.
To conclude, the current EST and genomic data suggest that the MHXs underwent functional diploidization in most plant species.
The 5′ UTR of AtMHX includes an uORF that substantially inhibits the expression of this gene by lowering its transcript content through NMD  and by inhibiting its translation . It was interesting to learn whether the MHX genes of other plants include uORFs as well. Information about the 5′ UTR was available for only some of the MHX genes. It was found that whereas only about 20% of the genes in plants include an uORF , uORFs are present in a high percentage of the MHX genes (Additional file 6). Among the ten eudicot MHX genes whose 5′ UTR sequences were available, all genes (100%) included at least one uORF. Three (50%) of the six available 5′ UTR sequences of monocot MHX genes included an uORF. This suggests that the presence of an uORF in MHX genes has some evolutionary advantage, possibly for restricting the expression of these genes.
In eukaryotes, the likelihood that an upstream AUG (uAUG) codon will be recognized by the ribosome and, hence, can inhibit expression, depends on the strength of its sequence (Kozak) context [66, 67]. The uAUGs of plant MHX genes have not only weak but also sub-optimal and strong contexts (Additional file 6A). Moreover, we showed that the uAUG codon of AtMHX is well recognized despite its weak context due to its stable secondary structure, resulting in strong inhibition of this gene expression [15, 16, 68]. Recognition of uAUG codons may lead to inhibition of translation and/or NMD [15, 16]. The likelihood that either of the latter impacts will be realized is increased when the length of the uORF peptide is increased [69, 70]. The potential uORF peptides of the currently identified MHX 5′ UTRs are presented in Additional file 6B. However, it is not possible to specify a definite cutoff size for peptide length that will inhibit translation , and even the short uORF of AtMHX inhibited translation and elicited NMD [15, 16]. It will, therefore, be necessary to experimentally determine whether the expression of other plant MHX genes is inhibited by their uORFs. It will also be necessary to determine if uORFs, which are much more abundant in the MHXs than in most plant genes, play a role in functional diploidization of some MHXs. For example, there are two MHXs in O. sativa, but the 5′ UTR sequence is available for only O.sativa_J1. This 5′ UTR does not include any uAUG. Our data suggest that O.sativa_J1 is expressed either exclusively or to a much higher level than O.sativa_J2 (see above). In M. guttatus, the gene encoding the M.guttatus_1 protein has a particularly high number of uAUGs (11), including one with a strong Kozak context. The 5′ UTR of the gene encoding M.guttatus_2 is currently unknown.
The peptides encoded by the uORFs of plant MHXs are not evolutionary conserved (Additional file 6B). This suggests that the amino acid sequence of the uORF peptide does not have a functional importance not only for AtMHX[15, 16] but also for the expression of other MHX genes. The similarity between the uORF peptides of the MHXs of A. thaliana and A. halleri, or S. lycopersicum and S. tuberosum (Additional file 6B) apparently resulted from the evolutionary relatedness of each pair of species.
We attempted to identify sequence elements that are conserved in the MHX and NCX families, as well as those that distinguish them from each other. We also attempted to identify sequence elements that characterize the MHX proteins of specific plant groups. This analysis can provide the basis for future experiments designed to determine the significance of the identified elements for the function and ion selectivity of these transporters. Alignment of all currently identified MHX proteins is shown in Additional file 5. Some of the proteins lack or apparently have few extra residues in the N-terminus. This is due to the low similarity between the MHXs in this region, which made it difficult to accurately predict the initiation point in the genomic sequence (see Additional file 2). The longest motif conserved in all MHX proteins, which includes six amino acids, is TADSAI at position 430 of AtMHX. There are also two conserved five-residue motifs – ASKIA and ELGGP – at positions 420 and 505 of AtMHX, respectively. As shown in Additional file 7, NCX proteins have, at the corresponding locations, similar motifs that are, however, not completely conserved in all NCXs. When only angiosperm MHX proteins were aligned, much longer conserved motifs could be identified (Additional file 8).
For each protein (except SmMHX and CrNCX), only residues that were totally conserved in the whole group it represented were highlighted, either in gray or as detailed below. Residues conserved among all MHX and NCX proteins investigated here were highlighted in red. The motif ELGG (at position 505 of AtMHX) is the longest common sequence element of the two protein families, and it will be necessary to determine its functional significance. Interestingly, ten of the 32 residues conserved between MHX and NCX proteins are glycines. It was suggested that glycines provide flexibility to active enzyme sites . Many conserved glycines were also identified in CAX proteins . NCX_Mj, which similar to NCX1 functions as a Na+/Ca2+ exchanger, shares only four of the 32 residues that are completely conserved in all MHX and NCX proteins (these four residues are indicated by bold, red letters in the Mj sequence in Figure 4). Thus, there might be some structural or functional properties that are shared by MHX transporters and Na+/Ca2+ exchangers of the NCX family, but not by other Na+/Ca2+ exchangers such as NCX_Mj.
Negatively charged residues are particularly important for the function of cation transporters. The negatively charged residues conserved between the MHXs and NCXs are Glu200-Glu199, Asp432-Asp829, and Glu505-Glu901 of AtMHX and HsNCX1, respectively (Figure 4) (note that all HsNCX1 residue numbers refer to those of the mature protein, which are indicated on the N1 sequences, for coherence with NCX1 literature). However, since MHX and NCX proteins carry different ions, these residues are unlikely to determine the ion specificity. The two-proline element at positions 361-362 and 758-759 of AtMHX and HsNCX1, respectively, might be structurally significant [all MHX and NCX proteins have two prolines at that position, except the Ciona intestinalis NCX that has one proline (Additional file 9)]. Serine, threonine and cysteine residues shared by the MHXs, NCXs and NCX_Mj will be discussed afterwards.
Sequence elements that distinguish MHX from NCX transporters can provide a clue to the basis of the difference in their biochemical activities and ion specificities. To illustrate these elements, residues (except those highlighted in red) that were totally conserved in all MHX proteins identified thus far were highlighted in light green (Figure 4). There are only three residues (asterisked in Figure 4) in which all MHX proteins differ from all NCX proteins analysed. These three residues, which are located inside or near the functionally important α1 region, include: (i) Asp at position 100 of AtMHX - a negatively charged residue present in all MHXs, whereas all NCXs include the non-charged amino acid asparagine at the corresponding position. This asparagine is essential for NCX1 function, since its mutation into cysteine rendered the exchanger insensitive to regulation by cytoplasmic Na+ or Ca2+ ions . (ii) Gln at position 112 of AtMHX is a polar, non-charged amino acid found in all MHXs, while all NCXs have the negatively charged glutamate residue (Glu113 of NCX1) at the same position. The different properties of Gln112 of AtMHX and Glu113 of NCX1 have been previously noted . It is also interesting to note that the mutation of Glu113 of NCX1 into Gln resulted in a complete loss of NCX1 activity . This substitution is identical to the natural variation between NCX and MHX proteins at this position. It will be interesting to examine whether this mutation, despite abolishing Na+/Ca2+ exchange activity, allowed NCX1 to carry other ions. (iii) The third residue that differs between all MHX and NCX proteins is Asp at position 119 of AtMHX, which parallels a glutamate residue of the NCXs (Glu120 of HsNCX1). Although both aspartate and glutamate are negatively charged, this variation can be functionally significant. For example, two NCX1 variants that had a Glu-to-Asp mutation at either position 113 or 199, were completely inactive . However, mutagenesis of Glu120 into glutamine resulted in only 36% reduction in NCX1 activity . Among the three residues that differ between all MHXs and NCXs, NCX_Mj shares the same residue with the NCXs only in the position that corresponds to Gln112 of AtMHX. Residues in this position are, therefore, particularly interesting candidates for participation in the determination of the different ion selectivities of Mg2+/H+ and Na+/Ca2+ exchangers.
It is likely that in addition to the three residues that differ between all MHX and NCX proteins, other residues can contribute to the different ion specificities of the two types of transporters. Particularly interesting candidates are conserved MHX amino acids whose properties differ from those of the corresponding residues of most NCXs. Such residues include (as presented in Figure 4 and Additional file 7), Leu104 (the positions refer to AtMHX) – most NCXs have methionine at this position; Ser114 – most NCXs have leucine at this position; Glu173 and Leu418 – most NCXs have threonine at these positions; Gln207 – most NCXs have phenylalanine at this position; His214 – most NCXs do not have a positively charged residue at this position; Pro228 – most NCXs do not have proline, a structurally important amino acid, at the vicinity of this position; and Ala513 – most NCXs have a positively charged residue at this position.
According to the discussion about MHX evolution, CrNCX as well as the MHXs apparently evolved from an NCX protein that was present in the common ancestor of the streptophytes and chlorophytes. To visualize the similarity of CrNCX to the NCXs and the MHXs, residues of CrNCX that matched the conserved amino acids of either NCX or MHX proteins were highlighted in yellow or light green, respectively (Figure 4). CrNCX sequence is similar to that of the NCXs in almost all the conserved sites of the latter proteins, but the same doesn’t hold true for the MHXs. The few conserved NCX residues that do not match their corresponding CrNCX residues can provide the basis for a putative difference between the activities of CrNCX and NCX proteins (if any). The consensus sequence of the MHXs, NCXs and CrNCX (Additional file 9) apparently represents parts of the ancestral NCX protein that was present in the last common plant-animal ancestor, which existed ~1.6 billion years ago .
We also attempted to identify sequence elements that characterize the MHXs of different plant groups – vascular versus non-vascular plants, angiosperm compared to non-seed plants, and monocots versus eudicots. The differences between the MHXs of vascular and non-vascular plants could be functionally significant, considering the fact that AtMHX is highly expressed in the vascular system. It is, therefore, possible that MHX proteins had to adapt for their role in this system during the evolution of vascular from non-vascular plants. Residues that are conserved in the currently identified MHXs of all vascular plants, but differ from the corresponding amino acids of both P. patens MHXs (alignment not shown), were highlighted in orange in Figure 4. Among these residues, there is a relatively large difference between the properties of corresponding amino acids for Trp191 (the positions refer to AtMHX) (Ser in P. patens), Thr197 (Glu in P. patens), and Ser225 (Cys in P. patens). However, the current analysis is based only on the two sequences currently available from non-vascular plants, and it will be necessary to expand it in the future.
Residues that are conserved in angiosperm MHXs but differ from the corresponding residues of all the currently identified MHXs of non-seed plants (the MHXs of S. moellendorffii and P. patens), are highlighted in dark green in Figure 4. Six of these residues are localized between amino acids 234 and 242 of AtMHX, and it will be necessary to investigate whether this region is related to some functional properties that distinguish the MHXs of angiosperm from those of non-seed plants. In addition, there is a relatively large difference between the properties of corresponding amino acids for Asp144 and Asp391 (negatively charged residues) of angiosperm MHXs versus the corresponding two asparagines (polar uncharged residues) of non-seed plants; His149 (a positively charged amino acid) of angiosperm versus leucine (a non-polar amino acid) of non-seed plants; and Gly209 of angiosperm versus proline of non-seed plants. Finally, residues that are conserved in either monocot or eudicot MHXs but differ between the two groups, are highlighted in olive-green in Figure 4. There are only three such residues and, among them, Glu160 of the eudicots (serine in monocots) should be particularly noted. It will be important to examine whether the variations described here between the MHXs of various plant groups are related to differences, if any, between the properties of these MHXs.
In most regions, there was a good agreement between the experimental observations in NCX1 and the prediction of its TMSs by the TMpred algorithm (indicated by bold, underlined letters in the Hs sequence). The TMSs predicted by TMpred in the MHXs are indicated by bold, underlined letters in the MHXs sequences. It is reasonable that the positions of MHX TMSs correspond to those of NCX1, particularly where the TMpred predictions for the two groups of proteins agree with each other and with the experimentally-identified positions of NCX1 TMSs. As mentioned, NCX1 was shown to contain two reentrant loops [33–35]. The hydropathy plots of the MHXs resemble that of HsNCX1 also in the regions that correspond to the two reentrant loops of NCX1 (these regions are indicated by dashed lines in Figure 5). This analysis suggests that the MHXs have nine TMSs and two reentrant loops. However, it will be necessary to test this suggestion experimentally.
The TMSs predicted by TMpred were mainly located at equivalent positions of the four representative MHX proteins shown in Figure 4. However, there were certain regions of dissimilarity, for example, an extra TMS was predicted before TMS6 in the MHXs of O. sativa and P. patens, but not A. thaliana or S. moellendorffii. It was interesting to examine if there are subgroups of MHX proteins with distinct structures, for example, whether the MHXs of all monocots show structural similarity to O.sativa_J1. Additional file 10 shows the TMSs predicted by TMpred for all MHXs studied here. These TMSs were predominantly located at equivalent positions for most MHXs. This suggests that there is a relatively large similarity in the structures of various MHX proteins. For only few proteins, TMpred predicted a deviation from the common structure in the region of TMSs 5 and 6, which surround the large central loop. It was predicted that the P. patens MHXs lack TMS5, and that seven out of the 31 MHXs analysed have an extra TMS before TMS6. However, TMpred predictions for the large central loop region were not accurate also for NCX1, which was also predicted to have an extra TMS before TMS6 (Figure 4), while experimental evidence proved that this is not the case. The TMpred analysis was useful for identifying that there was no clear distinction between the MHXs of different phylogenetic groups, e.g., of monocots and eudicots (Additional file 10). Based on this fact, together with the high sequence similarity between different MHX proteins (Additional file 3), we assume it is not likely that some of them have a switched TMS topology, and they are all likely to share the common structure indicated in Figure 4. The E. grandis MHX was predicted to have an extra TMS before TMS1. The SignalP 4.0 algorithm suggests that the E. grandis MHX does not have a signal peptide, and it will be necessary to determine if the initiation point of this protein was correctly annotated.
While NCX1 has a cleaved signal peptide  (highlighted in pink in Figure 4), the corresponding region is missing in the MHXs. Except NCX1, none of the other proteins presented in Figure 4, including CrNCX, was predicted to have a signal peptide by the SignalP 4.0 algorithm . The same algorithm correctly predicted the presence of a cleaved signal peptide in HsNCX1. The experimentally determined orientations of NCX1 loops, as well as N- and C-terminal regions [33–35], are indicated in Figure 4. For NCX1, which is located in the plasma membrane, the “in” or “out” oriented loops face the cytosol or the extracellular space, respectively. However, MHX proteins studied thus far are localized in the vacuolar membrane [11, 79]. For tonoplast proteins, cytosolic loops are considered to be internal, whereas loops that face the vacuolar lumen are considered to be external. This categorization is based on the similar biophysical properties of the vacuolar lumen and the extracellular space in terms of their pH and electrochemical potential. The electrochemical potential plays an important role in determining the orientation of membrane-protein loops [80–82]. It will be necessary to determine if the orientation of MHX loops is similar to that of the corresponding NCX1 loops. The large central loop, which exists in both NCX and MHX proteins and faces the cytosol in NCX1, is much longer in NCX compared to MHX proteins (Figure 4). In accord with the segregation of the protein identified in C. reinhardtii with the NCX family, the length of its central loop is similar to that of HsNCX1 (Figures 4 and 5B).
The degree of sequence conservation in the predicted TMSs and loops of MHX proteins
All non-TMS regions
Three of the ten glycines conserved in MHX and NCX proteins are located in the α1-reentrant loop 1 region and three are in the α2-reentrant loop 2 region (Figure 4). Of the three glycines conserved in the second reentrant loop, two create the GXG motif discussed above. The central amino acid of this motif is mainly Ile or Leu. The only three proteins (among the MHXs and NCXs analysed here) in which the central amino acid of this GXG motif is not Ile or Leu, are the MHXs of S. moellendorffii and P. patens (2nd paralog), where the Xs are Fhe and Thr, respectively, and the NCX of A. pisum, where the X is Val. While the Ile, Leu, and Val residues have a similar nature, it will be interesting to learn if the benzyl or alcohol groups of Fhe or Thr, respectively, confer any special properties to the indicated MHXs of S. moellendorffii or P. patens, respectively.
The study of NCX1 resulted in identification of several sequence elements that play a critical role in the function of this transporter. Amino acids whose mutation resulted in a complete loss of NCX1 activity [25, 36] are highlighted in cyan on the N1 sequence in Figure 4. These residues were mainly localized in the α repeat regions. Many of these residues corresponded to NCX_Mj amino acids that were shown to participate in ion binding (highlighted in cyan on the Mj sequence), which were all from the two α repeats . The nine residues whose functional significance was indicated in both NCX1 and NCX_Mj (that are highlighted in cyan in both the N1 and Mj sequences) include two charged residues, six Ser/Thr residues, and Asn at position 143 (all numbers correspond to the N1 sequence). The charged residues are Glu113 (as discussed above, this residue is Glu in all NCXs and in NCX_Mj, but is Gln in all MHXs) and D814 (which is Asp in all MHXs and NCXs except SmMHX, and Glu in NCX_Mj). Interestingly, six of the positions whose functional significance was indicated in both NCX1 and NCX_Mj include serine or threonine in almost all MHX and NCX proteins and in NCX_Mj (Figure 4). Serine and threonine can be used as phosphorylation sites. NCX1 is controlled by phosphorylation (reviewed in ), and it will be interesting to determine whether the indicated Ser/Thr residues participate in regulating NCX (or MHX) proteins by phosphorylation. These Ser/Thr residues are located at positions 109 (all the MHXS, NCXs and NCX_Mj have there either Ser or Thr), 140 (a completely conserved Ser), 110, 811, and 838 (almost completely conserved serines), and 810 (an almost completely conserved threonine) (Figure 4 and Additional file 9). There are other sites in which serine or threonine residues (some of which shown to be important for NCX1 function) are almost completely conserved in the MHXs and NCXs, including positions 176, 201, 213, and 818 (Figure 4 and Additional file 9).
Six α-repeat residues whose mutagenesis altered the Ca2+ affinity of NCX1  are indicated by red letters on the N1 sequence. A mutations in Thr103 altered NCX1 affinity for cytoplasmic Na+ ions . It was suggested that this interface of a cytoplasmic loop and TMS2 is important for both Na+ transport and secondary regulation by Na+ ions (together with the large cytosolic loop). It will be interesting to examine the role of the corresponding residues of MHX proteins.
The large cytosolic loop of NCX1 regulates its activity . Figure 4 shows the position in this loop of the regulatory Ca2+-binding domains CBD1 and CBD2, which are responsible for NCX1 activation by cytosolic Ca2+ ions . Electrophysiological analysis showed that, similar to NCX1, AtMHX is activated when Ca2+ ions are applied to its cytosolic (but not luminal) side . The CBD domains of NCX1 include a high density of negatively charged residues (Figure 4). Regions with relatively high densities of negatively charged residues are marked with red boxes on the central loops of the representative MHX proteins (Figure 4). The large cytosolic loop of NCX1 also includes the XIP peptide responsible for NCX1 inactivation  (Figure 4). Electrophysiological analysis showed that AtMHX undergoes an inactivation process with a time scale similar to that of NCX1 . While the XIP peptide of HsNCX1 includes eight basic residues, the corresponding regions of plant MHXs, particularly of the angiosperm, include about four basic residues, most of which are evolutionarily conserved (Figure 4). The comparable region of CrNCX includes five basic residues. It will be necessary to determine if, according to the structural model presented in Figure 4, the large central loop of the MHXs faces the cytosol, if it is essential for MHX transport activity, and whether loop elements with similarity to the XIP or CBD regions of NCX1 play a role in MHX inactivation or regulation by Ca2+ ions, respectively.
NCX1 includes a disulfide bond between Cys792 and a cysteine at position 14 or 20 . Ddisulfide bonds are, however, not essential for NCX1 function . Cys792 of HsNCX1 and the equivalent Cys395 of AtMHX are conserved in all NCX and MHX proteins (Figure 4). In parallel to the cysteines at position 14 and 20 of HsNCX1, all MHXs possess a cysteine in the N-terminal region that is equivalent to Cys23 of AtMHX in most proteins (Figure 4 and Additional file 5). Another cysteine residue - equivalent to Cys152 of AtMHX - is conserved in all MHX and NCX proteins. It is possible that some of these conserved cysteines participate in disulfide bond formation in MHX proteins.
The presence of disulfide bonds in NCX1 was demonstrated by comparing its mobility on SDS-PAGE under reducing and non-reducing conditions [52, 85, 86]. This approach is widely used to monitor the formation of disulfide bonds in proteins . While under reducing conditions (in the presence of β-mercaptoethanol) NCX1 appeared as two bands in the gel [85, 86], only a single band was observed under non-reducing conditions [in the presence of N-ethylmaleimide (NEM)]. NEM is a small compound that permanently blocks free cysteines, thereby trapping proteins at their original folding state [83, 87]. The difference between the apparent molecular mass of NCX1 under reducing and non-reducing conditions indicated that it includes disulfide bonds [52, 85, 86].
The MHX family is limited to plants, and constitutes a sixth family within the CaCA superfamily. Among the plants for which genomic information is currently available, more than one full MHX gene was identified only in O. sativa and M. guttatus. MHX gene duplication in O. sativa occurred de novo before the split between the Indica and Japonica subspecies, which happened 200,000–400,000 years ago. Most likely, following an initial duplication in chromosome 2, one MHX paralog translocated to chromosome 11 in Japonica. Some genomes include, in addition to a full MHX gene, loci with partial MHX-homologous sequences. Genomic and EST data suggest that MHX genes underwent functional diploidization in most plant species. Currently, M. guttatus is the only plant in which an EST was identified for more than one MHX-paralogous gene. The prevalence of uORFs in MHX genes is much higher than in most plant genes. These uORFs can limit expression and, potentially, contribute to functional diploidization of the MHXs. The currently available chlorophyte genomes do not include proteins with homology to the MHXs, but only to NCX or NCKX proteins. The MHXs are more similar to the NCXs than to the NCKXs. These data are consistent with the suggestion that the MHXs evolved from the NCXs after the split of the chlorophyte and streptophyte lineages of the plant kingdom, which occurred ~1.2 billion years ago.
A structural model of the MHXs, based on the resolved structure of NCX1, implies that the MHXs include nine TMSs. Altered mobility in reduced and non-reduced conditions suggests the presence of disulfide bonds in AtMHX. There are 32 residues that are completely conserved between all MHX and NCX proteins, among which ten are glycines. These conserved residues include an ELGG motif in the last loop and a GXG motif that was implicated in the formation of a tight-turn in a reentrant loop. There are only three residues in which all MHX proteins differ from all NCX proteins analysed. The identification of sequence elements that distinguish between the MHXs and NCXs, or between the MHXs of specific plant groups, can contribute to clarification of the structural basis of the function and ion selectivity of MHX transporters.
Million years ago
Nonsense mediated mRNA decay
Upstream open reading frame
5′ untranslated region.
We thank Robert K. Vickery for helpful discussions about M. guttatus evolution, Jeffrey P. Tomkins and the Clemson University Genomics Institute for the CTOA20E13 cDNA clone, Yasunari Ogihara and the Kihara Institute for Biological Research for the whoh15o16 (BJ273167) cDNA clone, Arizona Genomics Institute for the ST_BEa0006L05 cDNA clone, and the Phytozome database for sending us the EST data of M. guttatus and P. patens ahead of publication. This work was supported by the Israel Science Foundation (grant no. 199/09).
The data sets supporting the results of this article are included within the article (and its additional files).
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.