TaMSH7: A cereal mismatch repair gene that affects fertility in transgenic barley (Hordeum vulgare L.)

Background: Chromosome pairing, recombination and DNA repair are essential processes during meiosis in sexually reproducing organisms. Investigating the bread wheat (Triticum aestivum L.) Ph2 (Pairing homoeologous) locus has identified numerous candidate genes that may have a role in controlling such processes, including TaMSH7, a plant specific member of the DNA mismatch repair family. Results: Sequencing of the three MSH7 genes, located on the short arms of wheat chromosomes 3A, 3B and 3D, has revealed no significant sequence divergence at the amino acid level suggesting conservation of function across the homoeogroups. Functional analysis of MSH7 through the use of RNAi loss-of-function transgenics was undertaken in diploid barley (Hordeum vulgare L.). Quantitative real-time PCR revealed several T0 lines with reduced MSH7 expression. Positive segregants from two T1 lines studied in detail showed reduced MSH7 expression when compared to transformed controls and null segregants. Expression of MSH6, another member of the mismatch repair family which is most closely related to the MSH7 gene, was not significantly reduced in these lines. In both T1 lines, reduced seed set in positive segregants was observed. Conclusion: Results presented here indicate, for the first time, a distinct functional role for MSH7 in vivo and show that expression of this gene is necessary for wild-type levels of fertility. These observations suggest that MSH7 has an important function during meiosis and as such remains a candidate for Ph2.


Background
In most organisms there are evolutionarily conserved mechanisms in place that minimise the frequency of mismatches introduced during DNA replication [1]. As plants lack a reserved germ-line, mutation occurring in somatic cells can be transmitted to the next generation. Consequently, the need for an effective post-replicative DNA repair mechanism is pronounced. The mismatch repair (MMR) system is an essential component of this DNA repair.
In eukaryotes MMR is undertaken by the MutS and MutL homologues (MSH and MLH). Both MSH and MLH polypeptides form MSH and MLH heterodimeric proteins, respectively, which act together to bind mismatched DNA and initiate repair. Most eukaryotes have genes encoding six MSH proteins; however, a seventh MSH protein (MSH7) has been identified in plants [2].
All MSH proteins, except MSH1, have been shown to act in DNA repair and/or recombination during meiosis [3], with each having a specific yet often overlapping role. The MSH4-MSH5 heterodimer has only been reported to be involved in meiotic recombination [4], while the three remaining dimers are involved in both recombination and MMR. The MSH2-MSH3 heterodimer (MutSβ) binds insertion/deletion loop-outs, the MSH2-MSH6 heterodimer (MutSα) binds base mispairs and small insertion-deletion loop-outs [5,6], while the MSH2-MSH7 heterodimer (MutSγ) binds base mispairs but not insertion-deletion loop-outs [7]. These heterodimers then recruit MLH proteins to initiate MMR.
In addition to roles in MMR and homologous recombination, MSH genes are known to be involved in suppression of homoeologous recombination [8,9]. Recent research indicates that when two divergent sequences undergo recombination, some MSH proteins detect mismatches in the recombination intermediate and the recombination event is subsequently aborted [10]. Studies in bacteria and yeast, supporting these findings, have shown that inactivation of the MMR system leads to elevated levels of both inter- and intra-specific homoeologous recombination and relaxation of the species barrier [8,11-13]. Using yeast (Saccharomyces cerevisiae), Datta et al. showed that between sequences with less than 10% sequence variation, homoeologous recombination was increased by up to 70-fold upon inactivation of MMR [14]. This suppression has also been observed in higher eukaryotes, with studies in plants and humans indicating that proteins involved in MMR play a critical role in suppressing homoeologous recombination [15-17]. In yeast, MSH2 and its two binding partners MSH6 and MSH3 mediate the suppression of homoeologous recombination [18]. In plants MSH2 can also suppress homoeologous recombination [16,19], implicating the plant specific MSH7 in this process since the two polypeptides form a heterodimer. Support for this hypothesis is strengthened by the fact that MSH7 has been mapped to a locus in wheat known to affect homoeologous recombination [20]. The bread wheat (Triticum aestivum) genomes contain several loci that are known to be involved in the suppression of homoeologous recombination. Historically, the two main loci are Ph1 and Ph2 (Pairing homoeologous). Two Chinese Spring derived mutants display the Ph2 phenotype. One of these, ph2a, was generated via X-ray irradiation and contains a D genome deletion [21]. The other, ph2b, is a chemically induced mutation, thought to be a single nucleotide polymorphism (SNP) or a small insertion or deletion (INDEL) [22].
The ph2b mutant (in particular) therefore suggests that Ph2 is a single gene located on the short arm of chromosome 3D [22,23]. Southern analysis using nullisomic-tetrasomic and ditelosomic lines showed that one copy of MSH7 resides on the short arm of chromosomes 3A, 3B and 3D [20]. Furthermore, hybridisation of a TaMSH7 probe to genomic DNA from Chinese Spring and ph2a lines indicated that the copy on chromosome 3D is located in the region deleted in the ph2a mutant [20].
Given the known involvement of MSH genes in the suppression of homoeologous recombination and the mapped location of TaMSH7 to the Ph2 locus in bread wheat, this gene is a strong Ph2 candidate. To understand the role of MSH7 in meiotic recombination in plants, additional research into this important candidate gene is necessary. In a wider context, enhancing meiotic recombination would benefit plant breeders, allowing new strategies for DNA introgression from wild crop relatives to domestic breeding lines [24].
The research presented here is divided into two sections. The first part compares cDNA sequences from various wheat accessions and mutants. In particular, the Chinese Spring D genome copy was compared with the D genome copy from the ph2b mutant to determine whether any SNPs or small INDELs were present within the known ORF of the TaMSH7 sequence. The second part of the study demonstrates that MSH7 loss-of-function results in reduced seed set in transgenic barley (Hordeum vulgare) plants, and shows for the first time that MSH7 plays a necessary role in vivo and that expression of this gene is required for wild-type levels of fertility. Barley was used for this study because, as a diploid, it provides a simpler model than wheat and permits an assessment of the role of MSH7 in recombination processes between homologous chromosomes without the complication of dealing with both homologous and homoeologous chromosomes in wheat.

Results and Discussion
Previous studies in wheat, Arabidopsis and maize (Zea mays) have identified MSH7 as a plant specific member of the MSH protein family [1,20,25]. Given that the MSH2-MSH7 heterodimer has a different binding specificity when compared to other MSH heterodimers, a functionally distinct role for MSH7 within the plant cell is suggested [2]. This study investigated a role for MSH7 in transgenic barley and compared the three sub-genomic copies of MSH7 from bread wheat to determine whether any SNPs or INDELs could account for the Ph2 phenotype that has previously been reported.

Sequencing of TaMSH7 from bread wheat
Three distinct MSH7 sequences were identified in bread wheat that are representative of the A, B and D genome copies. All three sequences were obtained from wheat meiotic cDNA, indicating that each of the three genes is expressed during meiosis. Sequence alignment with T. tauschii (the D genome progenitor of bread wheat) was used to determine the sequence belonging to the D genome while sequences from nullisomic-tetrasomic lines were used to distinguish the A and B genomes (Figure 1A).
Conceptual translation and subsequent alignment of TaMSH7 nucleotide and protein sequences showed 97.7% nucleotide sequence identity and 95% amino acid identity between the three sub-genomic copies (Figure 1B). Almost all amino acid differences between the three TaMSH7 protein sequences were found to be residues that were not conserved amongst other MSH7 and MSH6 proteins (e.g. residues 565, 572, 574, 575, etc.). However, residue 596 from the B genome consensus was a polar serine residue, while all other MSH7 and MSH6 proteins and also EcMutS (E. coli) had non-polar leucine, isoleucine or valine residues (Figure 1B). This difference falls in the non-specific DNA binding domain that is truncated in MSH7 proteins. MSH7 proteins have been shown to bind DNA but the significance (if any) of the domain truncation has yet to be determined. Biochemical studies into the MutS protein family have not uncovered any particular significance of this residue [26] and while possible, it seems unlikely that this amino acid change would result in any major change to protein function.
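The identity figures quoted above are derived from pairwise comparison of aligned sequences. The following is a minimal sketch of one common way to compute such a value, assuming the two sequences are already aligned to equal length and that columns containing a gap are excluded from the comparison. The function name `percent_identity` and the toy sequences are hypothetical and are not taken from the TaMSH7 data; other definitions of identity (for example, counting gapped columns as mismatches) give slightly different values.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are ignored
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment (not real TaMSH7 sequence):
# 11 ungapped columns, 10 of which match.
copy_a = "ATGGCC-TTAGC"
copy_b = "ATGGCCGTTCGC"
print(round(percent_identity(copy_a, copy_b), 1))
```

The same function can be applied to aligned protein sequences, since it only compares characters column by column.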

Sequence of MSH7 from the D genome of the ph2b mutant
The two known Ph2 mutants in bread wheat, ph2a and ph2b, suggest that Ph2 may be a single gene located on the D genome. Dong and colleagues [20] have previously suggested that MSH7 may be a candidate for Ph2. Given that the phenotype observed in the ph2b mutant is believed to be a result of a SNP or small insertion/deletion, the D genome copy of MSH7 from this mutant was sequenced to determine if MSH7 could be validated as the Ph2 gene.
Three SNPs were identified between the wild-type Chinese Spring and ph2b D genome copies of TaMSH7. These SNPs resulted in two changes at the amino acid level (Figure 1C). The first polymorphism resulted in a serine to proline change at position 477. A proline is found at this position in the maize MSH7 orthologue, suggesting that this change is functionally redundant. The second polymorphism resulted in an isoleucine to valine change at residue 496. Valine is also present at this position in rice MSH7 and maize MSH7 suggesting that this change also results in a functional protein. Given the nature of these changes it is unlikely that the ph2b D genome copy of the MSH7 coding sequence contains any mutations that would result in a non-functional or malfunctioning protein. Furthermore, the ph2b D genome copy of MSH7 was well represented in the meiotic cDNA (approximately one third of sequenced ph2b clones) indicating that this gene is expressed during meiosis. This significantly reduces the possibility of a mutation within the promoter or other regulatory elements leading to the Ph2 phenotype.
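Amino acid changes of the kind described above can be listed by comparing two aligned protein sequences position by position. The sketch below assumes ungapped sequences of equal length and reports each difference in the conventional reference-position-variant notation. The function name `list_substitutions`, the five-residue fragments and the starting position are hypothetical and chosen only to illustrate the notation; they are not the real TaMSH7 sequences.

```python
from typing import List


def list_substitutions(reference: str, variant: str, start: int = 1) -> List[str]:
    """Return substitutions between two ungapped, equal-length protein sequences.

    `start` is the residue number of the first character in both sequences.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length")
    changes = []
    for offset, (ref_aa, var_aa) in enumerate(zip(reference, variant)):
        if ref_aa != var_aa:
            changes.append(f"{ref_aa}{start + offset}{var_aa}")
    return changes


# Hypothetical fragments beginning at residue 475 (illustrative only).
wild_type_fragment = "MKSLI"
mutant_fragment = "MKPLV"
print(list_substitutions(wild_type_fragment, mutant_fragment, start=475))
```

Each reported change can then be checked against orthologous sequences to see whether the variant residue also occurs naturally at that position, which is the reasoning applied to the two ph2b differences.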
Although the ph2b mutation was generated in a Chinese Spring background, the difference between the ph2b and parental sequence may be due to genetic variation in Chinese Spring that we and others have observed at several other loci. Results from such sequencing efforts suggest that there are several different 'versions' of Chinese Spring. The differences seen here may also be due to background mutations caused by the chemical mutagenesis of Chinese Spring that led to the initial identification of the ph2b mutant.

Transgenic barley production analysis
Over 55 independent barley lines, transformed with a wheat MSH7 double-stranded RNAi construct (see Methods), were generated with a transformation frequency of approximately 11%. When compared to previously published barley transformation experiments [27-29] that have used the same cultivar (Golden Promise), the frequency reported here is considerably higher. Both PCR and Southern hybridisation were conducted to confirm that each of these lines was positive (Figure 2), with many having a single copy of the hygromycin resistance gene inserted (54% of RNAi MSH7 transgenic lines produced). Only 14% of all lines produced had 4 or more copies of the hygromycin resistance gene inserted. A characteristic phenotype of many of the T0 lines was reduced fertility, as evidenced by lower seed set than the controls that had been transformed with an empty vector containing only the hygromycin resistance gene.

Transgenic barley RNAi loss-of-function analysis
From the population of transgenic T0 lines, 12 (Table 1) were analysed for MSH7 expression using quantitative real-time PCR (Q-PCR). In the majority of these lines expression of the transgene was significantly reduced (Figure 3A). In the T1 generation two single-copy insertion lines were selected for further expression analysis (lines 12 and 41). These lines were chosen based on their T0 expression levels and morphological characteristics which also included reduced seed set and pollen viability. Positive segregants from these lines showed significantly reduced MSH7 expression when compared to null-segregants of the same lines (p = 0.009 for line 12 and p = 0.0008 for line 41) (Figure 3B). A concomitant reduction of expression of MSH2 was also observed in line 12 but not in line 41. There were no significant differences between null and positive segregants in MSH6 expression (Figure 3B).
Based on the reduced MSH2 expression in line 12 we investigated whether MSH2 and/or MSH6 expression could be affected by non-specific targeting of these genes by RNAi mechanisms. To achieve this, sequence identities between the RNAi construct and the various MSH genes were compared. As sequence information was not available for many of the barley MSH genes, rice MSH2 and MSH6 sequences were compared to the segment of rice MSH7 sequence orthologous to that used in the RNAi construct. While not ideal, this was considered an appropriate approximation of sequence identity as the presence of all MSH genes in both monocots and dicots suggests divergence of MSH genes occurred prior to rice/barley divergence [2,25]. This is also supported by previous studies in Arabidopsis which indicate that MSH7 diverged from MSH6 early in eukaryotic evolution [2]. The MSH7 fragment within the RNAi construct showed 53% and 51% sequence identity to MSH6 and MSH2, respectively. Furthermore, the longest segment of the sequence selected for the RNAi construct showing 100% identity to either of these two mismatch repair gene family members was only 9 bp. In plants a ~21 nt RNA with 100% sequence identity is generally needed for RNAi to be effective (reviewed in [30,31]); it is therefore unlikely that the RNAi construct would have affected any other members of the MSH gene family.

Figure 1 MSH7 sequence alignments. The majority of differences in the sub-genomic amino acid sequence were at non-conserved residues. One change, Leu → Ser at residue 596 of genome B (pink), was at a residue that is conserved amongst other MSH7 and MSH6 proteins and the prokaryotic homologue, MutS. (C) Two differences in amino acid sequence between the CS and ph2b D genome sequences were identified (pink). Both these amino acids were present in other MSH7 proteins.
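The off-target check described above, in which the longest stretch of 100% identity between the RNAi fragment and another MSH gene is compared with the ~21 nt length generally needed for RNAi, can be sketched as a longest common substring search. This is an illustrative sketch only and not the procedure used in this study: the names `longest_shared_run` and `may_cross_silence` and the toy sequences are hypothetical, only the given strand is compared (a fuller analysis would also consider the reverse complement), and the 21 nt threshold is the approximate value quoted in the text.

```python
def longest_shared_run(construct: str, target: str) -> int:
    """Length of the longest contiguous stretch shared exactly by both sequences."""
    construct, target = construct.upper(), target.upper()
    best = 0
    # previous[j] holds the length of the exact match ending at the previous
    # construct base and at target base j (dynamic programming, row by row).
    previous = [0] * (len(target) + 1)
    for base_c in construct:
        current = [0] * (len(target) + 1)
        for j, base_t in enumerate(target, start=1):
            if base_c == base_t:
                current[j] = previous[j - 1] + 1
                best = max(best, current[j])
        previous = current
    return best


SIRNA_LENGTH = 21  # approximate length quoted in the text; an assumption here


def may_cross_silence(construct: str, target: str, min_run: int = SIRNA_LENGTH) -> bool:
    """True if the sequences share an exact stretch of at least `min_run` bases."""
    return longest_shared_run(construct, target) >= min_run


# Hypothetical toy sequences (not real MSH sequences): longest shared run is 8.
rnai_fragment = "ACGTTGCAAGGCTT"
other_gene = "TTTGCAAGGAAC"
print(longest_shared_run(rnai_fragment, other_gene))
print(may_cross_silence(rnai_fragment, other_gene))
```

Under this criterion a longest exact match of 9 bp, as reported for MSH2 and MSH6, falls well below the threshold.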

Seed set and seed weight
Positive segregants of lines 12 and 41 displayed reduced fertility as evidenced by reduced seed set (Figure 3C). In line 12 this difference was significant at the 95% confidence level (p < 0.033) and in line 41 significant to 90% confidence (p < 0.077). Seed weight (1000 grain weight) differences between the positive segregants and the nulls for each of these lines (12 and 41) were also statistically significant at the 90% confidence level (p < 0.09). These results, taken together with the Q-PCR data, indicate that MSH7 plays an important role in determining plant fertility.
There are two obvious pathways that could lead to reduced fertility with reduction in MSH7 expression. First, there may be reduced levels of MMR in these plants leading to higher levels of mutation and therefore a reduction in viable seed. Secondly, reduced expression could lower the suppression of homologous recombination during meiosis. Increased recombination is known to lead to chromosomal instability and a reduction in viable gametes due to translocations and non-disjunction during cell division [8,17,21].
Based on the Q-PCR data reported for the T1 transgenics, we cannot rule out the possibility that the reduced level of fertility observed in line 12 was affected by the reduction in expression not only of the MSH7 gene but also of the MSH2 gene. Indeed, similar phenotypes to those observed in this study have been found by Hoffman et al. [32] who showed, using a MSH2 T-DNA insertion mutant, that disabling the MMR system in Arabidopsis leads to high levels of mutation and reduced fertility within two generations in some lines. However, the reduced fertility observed in line 41 of this study can be attributed to the reduction in MSH7 transcript alone, as there was no significant change in expression level of the MSH2 transcript. Importantly, further experiments will still be needed to distinguish between these possible reasons for reduced fertility, as even in the study reported by Hoffman et al. [32], they were not able to show if the observed phenotypes were due to a reduction in MMR, reduced homoeologous recombination or some other mechanism.

Conclusion
The results presented here indicate that bread wheat contains three functionally conserved copies of MSH7, all of which are expressed during meiosis. While SNPs were identified within the D genome copy of TaMSH7, it is unlikely that these amino acid substitutions are responsible for the Ph2 phenotype. Barley plants transformed with an MSH7 RNAi knock-down construct showed a reduction in MSH7 expression accompanied by reduced fertility when compared to null segregants and wild-type. This is consistent with previous reports, suggesting that MSH7 plays a role in recombination and DNA repair during meiosis [2,20]. Reduced seed set in transgenic barley also showed that the in vivo loss of MSH7 function (due to reduced expression) is not compensated for by other endogenous MSH proteins (that are likely to interact with or have a similar role), indicating a distinct functional role for MSH7 within the plant cell.

Methods
Transformed barley plants (cv. Golden Promise) were grown as above. Mature leaves and young spikes undergoing early prophase I were collected from T0 plants and selected T1 lines. The stage of meiosis in both wheat and barley tissue was determined microscopically after staining anther squashes with aceto-orcein. Agrobacterium-mediated transformation experiments were performed using the procedure developed by Tingay et al. [34] and modified by Matthews et al. [35]. The callus induction medium contained 10 µM CuSO4, while the shoot regeneration and plant development media contained 1 µM CuSO4. The media were prepared according to the altered sterilisation procedures described by Bregitzer et al. [36].

Genotyping transformed plants
Plants were genotyped by PCR using the transformed hygromycin phosphotransferase (hpt) gene (primers HvHyg1, GTCGATCGACAGATCCGGTC and HvHyg2, GGGAGTTTAGCGAGAGCCTG) and a single copy endogenous barley gene (HvSAP2) (primers GGATCGATCGTCCAGCTACTA and AGAGTGGGTTGTGCTTGAGAT). HvSAP2 was used as a positive control to confirm the integrity of the DNA used in PCR amplification procedures.
Using the method described by Pallotta et al. [ PCR results were also verified using Southern hybridisation. Genomic DNA (10-15 µg) was digested with EcoRV (New England Biolabs, USA). The DNA fragments were separated on a 1% (w/v) agarose gel and transferred to a Hybond™-N+ nylon membrane (Amersham Pharmacia Biotech Ltd., UK) with 0.4 M NaOH, according to the manufacturer's instructions. A 1.1 kb XhoI DNA fragment, excised from plasmid pCAMBIA1390, was used to detect hpt hybridising sequences in the genomic DNA of the hygromycin-resistant plants. The DNA probe fragment was isolated from an excised gel fragment using the Bresa-Clean™ Nucleic Acid Purification Kit (Bresatec, Australia), according to the manufacturer's instructions. The probe was labelled by random priming [38] using the MegaPrime™ DNA labelling system (Amersham).
Hybridisation was conducted at 65°C using standard conditions [39]. Following hybridisation, the membrane was washed with 0.1× SSC, 1% (w/v) SDS at 65°C for 20 min, air-dried and exposed to X-ray film (RX Fuji Medical X-ray film; RX-U, Japan) at -80°C.

cDNA synthesis and quantitative PCR
Total RNA was isolated using TRI-REAGENT (Astral Scientific Pty Ltd., Australia) according to the manufacturer's protocol. RNA was DNase treated with TURBO DNA-free™ (Ambion, USA) as outlined in the manufacturer's instructions. cDNA was synthesised from 2 µg of total RNA using SuperScript™ III reverse transcriptase (Invitrogen, Australia) according to the manufacturer's instructions. Q-PCR was conducted as described by Crismani et al. [40], using primers shown in Table 2. Q-PCR data are presented as the average of a minimum of seven replicates. To normalise the expression data, a single control gene, HvGAPdH, was used for this single tissue, single time point experiment.
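Normalisation against a single control gene can be illustrated with a simple ratio of replicate means. The sketch below is one plausible form of such a calculation and is an assumption: it is not necessarily the calculation described by Crismani et al. [40]. The function name `normalised_expression` and all replicate values are hypothetical; only the minimum of seven replicates is taken from the text.

```python
from statistics import mean
from typing import Sequence


def normalised_expression(
    target_reps: Sequence[float],
    control_reps: Sequence[float],
    min_reps: int = 7,
) -> float:
    """Mean target-gene signal divided by mean control-gene signal.

    Assumes both inputs are on the same linear scale (e.g. transcript copies),
    not raw threshold-cycle values.
    """
    if len(target_reps) < min_reps or len(control_reps) < min_reps:
        raise ValueError(f"at least {min_reps} replicates are required")
    return mean(target_reps) / mean(control_reps)


# Hypothetical replicate values for one sample (illustrative only).
msh7_reps = [40, 42, 38, 41, 39, 40, 40]            # mean 40
control_reps = [200, 210, 190, 205, 195, 200, 200]  # mean 200
print(normalised_expression(msh7_reps, control_reps))
```

Normalised values from positive and null segregants can then be compared directly, since differences in the amount of input cDNA are divided out.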

PCR amplification of TaMSH7 and sequencing
Meiotic wheat cDNA was generated as for barley. Each PCR reaction contained 100 ng cDNA, 0.2 mM dNTPs, 0.2 µM primers (see Table 2), 2 mM MgCl2 and 1 U Platinum® Taq High Fidelity polymerase (Invitrogen) in 50 µL of 1× high fidelity PCR buffer (Invitrogen). PCR cycling conditions were 95°C for 5 min then 35 cycles of 94°C for 1 min, 56°C for 1 min, 68°C for 2 min, followed by a final extension step at 68°C for 10 min. Amplified products were visualised by 1% agarose gel electrophoresis and subsequently purified using the QIAquick gel extraction procedure (QIAGEN).
Eluted products were then cloned into the pGEM®-T Easy vector (Promega, Australia) according to the manufacturer's protocol. The gene was sequenced with approximately 15× coverage, ensuring all sub-genomic copies were identified. Capillary separation of sequencing reactions was undertaken by the Australian Genome Research Facility (AGRF) in Brisbane (Australia) using the Applied Biosystems fluorescent system. Contigs were generated using Contig Express (VNTI Suite, Version 8, Informax, USA). Consensus sequence generation and further analysis were undertaken in Vector NTI.

Figure 4 MSH7 RNAi transformation vector. Sense (630 bp) and antisense (880 bp) fragments of TaMSH7 create a hairpin loop RNA structure when transcribed. This dsRNA may then reduce HvMSH7 expression through RNAi. The construct contains a hygromycin resistance gene, hygromycin phosphotransferase (hpt), which was used as a selectable marker during tissue culture. This gene was also utilised for analysis of transgene segregation in the T1 population.
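As a worked example, the cycling conditions given for the TaMSH7 amplification can be written as a small program description and summed to give the total programmed hold time. This is a sketch for illustration: the constant and function names are hypothetical, and the total excludes temperature ramping between steps, which depends on the thermocycler.

```python
# Hold times in minutes, transcribed from the cycling conditions in the text.
INITIAL_HOLD_MIN = 5         # 95°C
CYCLES = 35
CYCLE_HOLDS_MIN = (1, 1, 2)  # 94°C, 56°C, 68°C
FINAL_EXTENSION_MIN = 10     # 68°C


def total_hold_minutes(initial: int, cycles: int, cycle_holds: tuple, final: int) -> int:
    """Sum of all programmed hold times, ignoring ramp times between steps."""
    return initial + cycles * sum(cycle_holds) + final


# 5 + 35 * (1 + 1 + 2) + 10 = 155 minutes
print(total_hold_minutes(INITIAL_HOLD_MIN, CYCLES, CYCLE_HOLDS_MIN, FINAL_EXTENSION_MIN))
```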

Seed set and seed weight
Mature T1 seed was collected from ten representative spikes from each plant and dried for 7 to 10 days at 37°C. Average seed weight was then determined and used to calculate the 1000-grain weight. Student's t-tests (assuming unequal variances) were used to determine whether the means of the samples in the segregating T1 populations for seed set and 1000 grain weight were statistically different (Microsoft Office Excel 2003). Graphs were compiled using Microsoft Office Excel 2003.
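The two calculations described above can be sketched as follows: the 1000-grain weight is obtained by scaling the average seed weight, and a t-test assuming unequal variances (commonly called Welch's t-test) yields a t statistic with Welch-Satterthwaite degrees of freedom. The analysis in this study was carried out in Excel; the code below is an independent illustration, the function names are hypothetical, and the seed-set values are invented for the example. The p-value is not computed here because it requires the t-distribution, which a statistics package would supply.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence, Tuple


def thousand_grain_weight(total_weight_g: float, seed_count: int) -> float:
    """Average seed weight scaled to 1000 seeds (grams)."""
    return 1000.0 * total_weight_g / seed_count


def welch_t(sample_a: Sequence[float], sample_b: Sequence[float]) -> Tuple[float, float]:
    """t statistic and Welch-Satterthwaite degrees of freedom (unequal variances)."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each sample mean (sample variance / n).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical seed-set counts per spike (illustrative only).
null_segregants = [20, 22, 24, 22]
positive_segregants = [14, 16, 18, 16]
t_stat, deg_free = welch_t(null_segregants, positive_segregants)
print(round(t_stat, 3), round(deg_free, 1))
print(thousand_grain_weight(2.5, 50))
```

When the two groups have equal size and equal variance, as in this toy data, the Welch degrees of freedom coincide with those of the standard two-sample test; with real data they are usually lower.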