Characterization and transferability of microsatellite markers of the cultivated peanut (Arachis hypogaea)

Background The genus Arachis includes Arachis hypogaea (cultivated peanut) and wild species that are used in peanut breeding or as forage. Molecular markers have been employed in several studies of this genus, but microsatellite markers have been used in only a few investigations. Microsatellites are very informative and are useful for assessing genetic variability, analyzing mating systems and genetic mapping. The objectives of this study were to develop A. hypogaea microsatellite loci and to evaluate the transferability of these markers to other Arachis species. Results Thirteen loci were isolated and characterized using 16 accessions of A. hypogaea. The level of variation found in A. hypogaea using microsatellites was higher than with other markers. Cross-transferability of the markers was also high. Sequencing of the fragments amplified using the primer pair Ah11 from 17 wild Arachis species showed that almost all wild species had a repeated sequence similar to the one observed in A. hypogaea. Sequence data suggested that there is no correlation between the taxonomic relationship of a wild species to A. hypogaea and the number of repeats found in its microsatellite loci. Conclusion These results show that microsatellite primer pairs from A. hypogaea have multiple uses. A higher level of variation among A. hypogaea accessions can be detected using microsatellite markers in comparison to other markers, such as RFLP, RAPD and AFLP. The microsatellite primers of A. hypogaea showed a very high rate of transferability to other species of the genus. These primer pairs provide important tools to evaluate the genetic variability and to assess the mating system in Arachis species.


Background
The origin and the diversity center of the genus Arachis are in South America [1]. This genus comprises 69 species, most of which are diploid and wild. The cultivated species include A. hypogaea L., the cultivated peanut, A. glabrata and A. pintoi, which have been used in forage production [2,3]. This genus is divided into nine sections (Arachis, Erectoides, Heteranthae, Caulorrhizae, Rhizomatosae, Extranervosae, Triseminatae, Procumbentes and Trierectoides) according to their morphology, geographic distribution and sexual compatibility [1].
The extensive morphological variation in A. hypogaea has led to the identification of subspecies, although studies using molecular markers have found little polymorphism in the germplasm of this species [4][5][6]. The observed restriction in genetic variation limits the use of several approaches, such as molecular marker-assisted selection and the construction of a molecular map, which are essential tools in A. hypogaea breeding.
Cultivated peanut is an allotetraploid that contains genomes A and B, which are found in wild diploid species of section Arachis. This species probably arose from a unique cross between the wild diploid species A. duranensis (A genome) and A. ipaënsis (B genome), resulting in a hybrid whose chromosome number was spontaneously duplicated [7]. This duplication isolated A. hypogaea from the wild diploid species, preventing allele exchange with them. The origin through a single and recent polyploidization event, followed by successive selection during breeding efforts, resulted in a highly conserved genome [8]. The morphological variation observed among accessions of A. hypogaea is most probably due to variation in a few genes [9].
Microsatellites are highly polymorphic molecular markers [10], which have been used to analyze genetic variability and to construct molecular maps in several plant species [11][12][13][14]. Hopkins and colleagues [15] analyzed the genetic variation using six microsatellite primer pairs and 19 accessions of A. hypogaea and three accessions of wild Arachis species (A. duranensis, A. ipaënsis, A. monticola). These authors observed that, despite the low frequency of polymorphism found in A. hypogaea, these microsatellite loci were very informative and could provide a useful tool to identify and partition genetic variation in the cultivated peanut. Ferguson and colleagues [16] developed 226 microsatellite primer pairs for A. hypogaea; of the 192 that amplified well, 110 putative loci showed polymorphism in a diverse array of 24 cultivated peanut accessions. Moretzsohn and colleagues [17], analyzing 36 species of Arachis, observed that the cross-species amplification rate of A. hypogaea microsatellite primers was up to 76% in species of section Arachis and up to 45% in species of the other eight sections of the genus Arachis.
Microsatellite markers could be useful to analyze the genetic variation in the germplasm of wild Arachis species. These species have more detectable intraspecific genetic variation than A. hypogaea, as shown by using molecular markers [18,19], and are resistant to numerous pests and diseases that affect the cultivated peanut [20]. The high cost of developing microsatellite markers is the main factor limiting their widespread use in this genus. A good alternative would be the use of a set of primers to obtain cross-species transferability, as reported in other studies [21][22][23][24].
The objectives of this study were to isolate and characterize the microsatellite loci of A. hypogaea and to assess the cross-transferability of these markers to other Arachis species.

Results and Discussion
A total of 68 random clones were selected and sequenced. Thirty-eight (55.9%) of them contained microsatellites. Repeat length ranged from 12 bp to 47 bp. Twenty-four (63.1%) microsatellites were perfect, two (5.3%) were imperfect and 12 (31.6%) were compound repeats. Of these, 16 clones were chosen for primer design, since they had more than 10 repeats. Microsatellite sequences formed by fewer than 10 repeats are considered to be less polymorphic, and thus not very informative.
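The screening step described above (keeping only clones with enough repeat units) can be illustrated with a minimal sketch. It is not the procedure used in this study: the function name `find_microsatellites`, the default threshold of 10 repeat units and the example sequence are assumptions made for illustration, and the sketch detects only perfect repeats, not imperfect or compound ones.

```python
import re

def find_microsatellites(seq, motif_len=2, min_repeats=10):
    """Return (motif, repeat_count, start) for perfect tandem repeats.

    Hypothetical helper: motif_len and min_repeats are illustrative
    parameters, not values taken from the paper's pipeline.
    """
    seq = seq.upper()
    # One motif of motif_len bases followed by at least
    # (min_repeats - 1) further copies of the same motif.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        # Skip homopolymer runs (e.g. AA repeated), which are
        # mononucleotide repeats rather than true dinucleotide motifs.
        if len(set(motif)) == 1:
            continue
        hits.append((motif, len(match.group(0)) // motif_len, match.start()))
    return hits

# Made-up clone sequence: a long AG run and a short AC run.
example = "TTGC" + "AG" * 12 + "CCAT" + "AC" * 5 + "GG"
print(find_microsatellites(example))
```

With the default threshold only the long AG run is reported; lowering `min_repeats` would also report the short AC run, which is the kind of locus the text describes as less informative.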
Seven clones contained AG/TC repeats, three contained AC/TG repeats, five contained AT/TA repeats, and one contained a poly A repeat [(A)35GG(A)9]. Sixty-three percent of the selected clones (10/16) had sequences complementary to the oligonucleotides used in the enrichment procedure. However, the other 37% had different repeats (AT and A) that were not totally complementary to the probes used. The selection of AT sequences using AC and AG oligonucleotides was not reported in other studies where libraries were enriched for these two types of sequences [25][26][27]. In those studies, the hybridization between the probes and single-stranded clones was performed at temperatures above 50°C, thus under very stringent conditions, reducing the possibility of selecting clones through mismatches. In this study, the enrichment was performed at room temperature (around 25°C). Some sequences did not contain repeated sequences, indicating that mismatches occurred. However, the frequency of AT in this group was high. This high percentage could be due to the probes used, (AC)15 and (AG)15, which could have had up to 50% of their sequences complementary to AT/TA regions. Taking into account only the adenines in these probes, since adenines would pair with the thymines of the target or part of the target sequence, temperatures above 35°C would be necessary to break the hydrogen bonds, since 35°C is the melting point of an oligonucleotide formed by 15 adenines. This temperature is 10°C higher than room temperature, allowing a more stable association between the probe and adenine-rich targets (AT and poly A) than with adenine-poor targets, and increasing the frequency of these motifs among the selected sequences through mismatches.
Primers were designed and synthesized for 13 of the 16 selected sequences. This set of primers and a primer pair developed by Hopkins and colleagues [15] were used to amplify microsatellite loci in A. hypogaea and in wild diploid species of eight sections of the genus Arachis.
The 14 primer pairs allowed the detection of 18 putative loci in A. hypogaea (Table 1). Thus, a number of primer pairs amplified loci in both genomes of A. hypogaea. Four primer pairs (Ah7, Ah21, Ah30 and Ah282) amplified two putative loci in A. hypogaea and one locus in both A. duranensis (genome A) and A. ipaënsis (genome B), and the other ten pairs amplified a single putative locus in A. hypogaea and in either A. duranensis or A. ipaënsis (data not shown). Thus, the primer pairs fall into three groups based on the amplification events observed in A. hypogaea, A. ipaënsis and A. duranensis: 1) those that amplify in A. hypogaea and A. duranensis and detect a putative locus in the A genome, 2) those that amplify in A. hypogaea and A. ipaënsis and detect a putative locus in the B genome, and 3) those that amplify in A. hypogaea, A. duranensis and A. ipaënsis and detect putative loci in both genomes.
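The three-way grouping of primer pairs can be written as a simple decision rule. The sketch below is only an illustration: the function name `genome_group` and the placeholder primer names `pair_X` and `pair_Y` are hypothetical, and only the pattern shown for Ah30 follows the text.

```python
def genome_group(in_duranensis, in_ipaensis):
    """Assign a primer pair to a genome group from its amplification
    pattern in the two diploid progenitors (hypothetical helper)."""
    if in_duranensis and in_ipaensis:
        return "A and B genomes"
    if in_duranensis:
        return "A genome"
    if in_ipaensis:
        return "B genome"
    return "unassigned"

# Patterns are (amplifies in A. duranensis, amplifies in A. ipaensis).
# Ah30 follows the text; pair_X and pair_Y are made-up placeholders.
patterns = {
    "Ah30": (True, True),
    "pair_X": (True, False),
    "pair_Y": (False, True),
}
groups = {name: genome_group(*p) for name, p in patterns.items()}
print(groups)
```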
The level of polymorphism varied greatly among the polymorphic loci. Ah51 allowed the amplification of seven alleles and its PIC was 0.79, whereas the least polymorphic primer pair, Ah282, amplified only two alleles and had a PIC of 0.11. The Ah51 primers flank a region that comprises the largest number of repeats (34) of a dinucleotide motif (A and G), whereas Ah282 flanks a region containing two microsatellites, each composed of six trinucleotide repeats. Hopkins and colleagues [15] and He and colleagues [28] observed that some loci, despite their long repeats, were invariant among the cultivated accessions tested. The difference observed in the studies cited above may have been due to the following: 1) distinct numbers of loci were analyzed in these studies; 2) distinct sets of A. hypogaea accessions were used. Moreover, the invariant microsatellites may be located in genes, which would make them less variable despite their long repeats.
Overall, the mean percentage of polymorphic loci was 33%, the mean number of alleles per primer pair within the accessions of A. hypogaea was 4.02, and the PIC was 0.48. In this study, the percentage of polymorphic microsatellite loci was lower than those found in other studies where microsatellite markers were used to evaluate genetic variability within A. hypogaea. Ferguson and colleagues [16], studying a set of 24 accessions of A. hypogaea from 7 countries on different continents, found 57.3% polymorphic microsatellite loci. He and colleagues [28] found that 34% of the microsatellite primer pairs showed polymorphism in a sample that comprised A. hypogaea accessions from eight Latin American countries. Although the percentage of polymorphic microsatellite loci found in this study was lower, it was higher than the percentage of polymorphic loci in A. hypogaea observed using RAPD [6.6% (29)] and AFLP [6.7% (6); 6.4% (30)]. Besides the large percentage of polymorphic loci, Hopkins and colleagues [15] observed that the amount of useful information obtained per polymorphic microsatellite locus was quite high. For instance, in this study, the mean number of alleles was 4.02, and several primer pairs were highly informative, such as primer pair Ah51, which allowed the amplification of seven different fragments.
PCR products were obtained for most of the wild species analyzed (Table 2). In general, fragments close to the size of the fragment expected for A. hypogaea were detected. The transferability of the markers was variable, ranging from 54% for the locus Ah6-125 to 100% for Ah30. The level of polymorphism also varied among loci, ranging from 25 alleles in Ah30 and Ah126 to 15 alleles in Ah11 and Ah20.
The annealing temperatures used to amplify microsatellite loci in wild Arachis species ranged from 10°C below the melting temperature (Tm) of a given primer pair to the melting temperature itself. The need for lower annealing temperatures for some primer pairs suggested that some microsatellite flanking regions were more conserved than others in the Arachis species analyzed. The data also suggested that changes in the flanking regions most probably resulted from point mutations and small deletions and insertions, since major rearrangements would probably have resulted in no amplification due to the interruption or deletion of primer-annealing sites. Point mutations and small rearrangements (deletions and/or insertions) were detected in the flanking regions of the Ah11 locus of some species analyzed (Figure 1). For instance, a sequence of five bases (positions 112 to 116) was absent from A. triseminata.
The cross-transferability of A. hypogaea markers to species of section Arachis was very high, ranging from 60% for Ah20 to 100% for Ah30. A similar level of microsatellite marker transferability was observed from Triticum aestivum L. to its ancestral diploid species [24]. Section Arachis comprises species with genomes (AA and BB) similar to those found in the cultivated peanut (AABB) and with agronomically valuable characteristics, which are introgressed into cultivated peanut mainly by means of crosses with synthetic amphidiploids resulting from crosses between A and B genome species. The resulting F1 has to be backcrossed many times to obtain offspring that have the introgressed characteristic and most of the recurrent parental genome. A genetic map constructed using wild diploid species would be useful to guide the introgression of genes from wild species to A. hypogaea. A map would allow the discovery of markers linked to gene(s) or chromosome regions that are responsible for or involved in the expression of the introgressed characteristic. Similarly, it would allow the selection of markers distributed all over both genomes of A. hypogaea, helping the selection of plants showing the largest percentages of the recurrent parental genome. This approach would be the most efficient way to integrate molecular markers into breeding programs of cultivated peanut, since genetic polymorphism in A. hypogaea is very low [4,15,31] and insufficient to construct a genetic map. Figure 2 shows the relationships among species of section Arachis based on amplification events observed using eight microsatellite primer pairs from A. hypogaea. Two groups were identified. The first consisted of A. hypogaea, A. monticola and all analyzed genome A species; the second contained the species classified as genome B and A. glandulifera Stalker, classified as genome D [32].
The division into two main groups was based essentially on the absence of amplification of two loci (Ah20 and Ah6-125) in genome B species. Despite the few loci analyzed (8), the groups defined by the dendrogram agreed with previous studies that classified species of section Arachis into genomes A, B and D. In this study, A genome species were placed closer to A. hypogaea than B genome species. Tallury and colleagues [33], using AFLPs, found A. ipaënsis and A. williamsii, both B genome species, closer to A. hypogaea than A genome species. This difference in the affinities of A and B genome species with A. hypogaea may be due to the type and number of markers used. The data also agreed with the close relationship between A. glandulifera and B genome species [33]. These findings suggested that flanking regions contain useful phylogenetic information.
PCR products were also obtained for species from sections Caulorrhizae, Erectoides, Extranervosae, Procumbentes, Rhizomatosae, Trierectoides and Triseminatae (Table 3). Five of the eight primer pairs tested (62.5%), namely Ah2, Ah11, Ah19, Ah30 and Ah126, resulted in amplifications from all sections. The pair Ah6-125 (12.5%) produced amplification in six sections, and Ah20 and Ah21 in five sections (25%). Taking into account that 33% of the primer pairs allowed the detection of polymorphism among accessions of A. hypogaea, this set of primers will probably show polymorphism in wild Arachis species, so these species could be analyzed using microsatellite loci with no additional primer development costs. Cross-transferability of Arachis microsatellite markers was also observed in other studies. Hopkins and colleagues [15] Figure 1. All species showed repeated sequences similar to those found in the Ah11 locus of the cultivated peanut, regardless of the section to which the species belonged. These sequences differed from each other only in the number of repeated motifs. Thus, primers for A. hypogaea were able to amplify microsatellites in other Arachis species.
A neighbor-joining tree constructed based on a small part of the flanking regions and on the repeated sequences of the Ah11 locus in 18 species is shown in Figure 3. The species of the different sections of Arachis were scattered throughout the tree and some were located close to species from other sections. The majority of the variation among species reflected differences in the number of motifs among the species and not in the flanking regions. These results suggested that there was no correlation between the number of repeated sequences and the taxonomic relationship among these species, and that the level of information contained in a microsatellite locus did not necessarily positively correlate to the degree of relatedness to A. hypogaea. For instance, A. hoehnei and A. cardenasii, both from section Arachis, had shorter microsatellites than A. repens (Section Caulorrhizae) and A. triseminata (section Triseminatae). A larger number of plants from each species would need to be analyzed in order to test this hypothesis because microsatellites are highly polymorphic and the accessions of the species analyzed may have been extreme in the range of variation found at each analyzed locus.
An analysis of the cross-transferability of microsatellite loci in Vitaceae showed that microsatellite repeats were present in most of the species examined and that flanking sequences were conserved and could be used to examine evolutionary relationships [35]. The potential usefulness of flanking regions to assess taxonomic relationship in Arachis was not approached in this study. However, our results indicate that these regions could be useful for establishing genetic relationship in Arachis, since the relationship established based on amplification events (Figure 2), which depend on the conservation of the flanking regions, agreed with the division of the species of section Arachis into genomes A and B.
Figure 1 Alignment of nucleotide sequences of 18 Arachis species amplified using primer pair Ah11  (14), A. hoehnei (15), A. decora (16), A. macedoi (17) and A. hypogaea (18). The sequences of all species comprised microsatellites, but the number of repeats varied considerably among them.
Some species had more than one fragment amplified using some primer pairs, including accessions of the diploid species A. simpsonii, A. aff. cardenasii, A. linearifolia, A. hermannii, and A. pseudovillosa, which showed more than one fragment using Ah21 (Table 4). These results suggested that the accessions of the above species were heterozygous. Arachis species were expected to be homozygous since they are considered to be autogamous simply by analogy to cultivated peanut [36]. In addition, these species are diploid, a fact that excludes the possibility of the existence of two homozygous loci with different alleles, as observed for Ah21 and Ah30 in A. hypogaea, which is an allotetraploid (Table 3). The data suggested that cross-pollination happens in some Arachis species. Evidence of cross-pollination in A. duranensis was found when different accessions were analyzed using RFLP [37]. The extensive polymorphism detected within accessions of A. cardenasii using cDNA and seed storage protein probes [5,38,39] has also been suggested to be related to a high frequency of cross-pollination. As polymorphic codominant markers, microsatellites are useful tools to analyze the mating system of wild species of Arachis.

Conclusion
These results show that microsatellite primer pairs of A. hypogaea have multiple uses. A higher level of variation among A. hypogaea accessions is detected using microsatellite markers in comparison to other markers, such as RFLP, RAPD and AFLP. The microsatellite primer pairs of A. hypogaea showed a high transferability rate to other species of the genus. These primer pairs are useful tools to evaluate the genetic variability and to assess the mating system among Arachis species.
Figure 2 Phenogram showing the relation among Arachis species based on amplification events obtained using eight primer pairs and 22 Arachis species. The polymorphism was not enough to characterize most species, but they were grouped according to the type of their genomes (A, B and D).

Plant material
Sixteen accessions of A. hypogaea and 38 accessions of species from eight of the nine sections of the genus Arachis were analyzed (Table 3). The samples were obtained from the Arachis Germplasm Bank (EMBRAPA Recursos Genéticos e Biotecnologia, Brasília, DF, Brazil).

DNA extraction
DNA was extracted using the procedure of Grattapaglia and Sederoff [40]. The quality of the DNA was checked on 1% agarose gels and the DNA concentrations were estimated spectrophotometrically (Genesys 5, Spectronic Instruments).

Library construction and primer design
Nine micrograms of genomic DNA from A. hypogaea were digested using 1.35 μl of HaeIII (10 U/μl), 2 μl of AluI (10 U/μl) and 1 μl of RsaI (10 U/μl) (New England Biolabs). The reaction products were electrophoresed on 1% low melting point agarose gels and fragments 200-600 bp in size were excised from the gel, extracted with phenol/chloroform, and ligated into pBluescript (Stratagene). The ligated clones were used to transform Escherichia coli XL1-blue MRF' (Stratagene). The library was amplified overnight at 30°C with shaking (300 rpm) and the plasmids were then isolated by phenol extraction. The library was enriched for AC and AG repeats using a GeneTrapper® cDNA positive selection system (Invitrogen). The selected clones were used to transform XL1-blue MRF' bacterial cells. The white colonies were grown overnight in liquid LB medium supplemented with ampicillin (100 μg/ml). The plasmids were extracted using a Concert® rapid plasmid purification system (Invitrogen). Sequencing was done using T3 and T7 primers. The sequencing reaction mixture consisted of 2 μl of plasmid DNA, 2 μl of Big Dye terminator, 8 pmol of primer and water to a final volume of 10 μl. Sequencing was carried out in an ABI Prism 377 sequencer (Applied Biosystems). The primers were designed using Primer3 software [41].

PCR
Fourteen primer pairs were used for PCR amplification: 13 designed using the sequences selected in the above step and one reported by Hopkins and colleagues [15] (

Electrophoresis
The sequence variation in A. hypogaea and two wild diploid species (A. duranensis and A. ipaënsis) was analyzed using 4% denaturing polyacrylamide gels (19:1 acrylamide/bisacrylamide, 7 M urea) that were silver stained. The sizes of the fragments were estimated based on a 10 bp ladder (Invitrogen).
The PCR products obtained using DNA from wild species were electrophoresed on 3% MetaPhor (FMC Bioproducts) agarose gels for 3 h at 120 V. The agarose gels were stained with ethidium bromide and PCR products were viewed under UV light. The size of fragments was estimated based on a 100 bp ladder (GE).

Analysis of variation in A. hypogaea
The allelic and genotypic frequencies were calculated for the samples analyzed. The genetic variability of the sample as a whole was estimated based on the number of alleles per locus (total number of alleles/number of loci), the percentage of polymorphic loci (number of polymorphic loci/total number of loci analyzed) and Polymorphism Information Content ($\mathrm{PIC} = 1 - \sum_i p_i^2$, where $p_i$ is the frequency of the $i$-th allele at a locus).
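These three summary statistics can be sketched in a few lines. The example below is an assumption-laden illustration, not the authors' analysis: it takes PIC in its simplified form, one minus the sum of squared allele frequencies, and the function names, locus names and fragment sizes are all hypothetical.

```python
from collections import Counter

def allele_frequencies(alleles):
    """Relative frequency of each allele observed at one locus."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}

def pic(alleles):
    """PIC, assumed here to be 1 minus the sum of squared allele frequencies."""
    return 1.0 - sum(f ** 2 for f in allele_frequencies(alleles).values())

def summarize(loci):
    """Alleles per locus, percentage of polymorphic loci and per-locus PIC."""
    n_loci = len(loci)
    total_alleles = sum(len(set(a)) for a in loci.values())
    polymorphic = sum(1 for a in loci.values() if len(set(a)) > 1)
    return {
        "alleles_per_locus": total_alleles / n_loci,
        "percent_polymorphic": 100.0 * polymorphic / n_loci,
        "pic": {name: pic(a) for name, a in loci.items()},
    }

# Made-up fragment sizes (bp) scored in four accessions at two loci.
example_loci = {
    "locus_1": [120, 120, 122, 124],
    "locus_2": [200, 200, 200, 200],
}
print(summarize(example_loci))
```

A monomorphic locus gives a PIC of zero under this definition, and the value approaches one as the number of equally frequent alleles grows.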

Analysis of the locus cross-species transferability
The cross-species transferability of eight loci was evaluated using 37 accessions of 37 species from eight sections of the genus Arachis. The percentage of transferability was calculated for each locus for section Arachis (22 species) and for the whole sample (37 species) as the number of species in which the expected fragment was detected/the total number of species analyzed. A binary matrix based on the amplification events for section Arachis alone was prepared based on the data in Table 4. In this matrix, 1 indicated amplification and 0, no amplification. A genetic distance matrix was calculated using the Nei and Li distance [42] and a dendrogram was constructed using the UPGMA method (unweighted pair group method with arithmetic mean) [43].
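The workflow above (percentage transferability, a 1/0 amplification matrix, Nei and Li distances and UPGMA clustering) can be sketched as follows. This is a minimal illustration under stated assumptions, not the software used in the study: the function names and the four species profiles are hypothetical, and the Nei and Li distance is taken as one minus twice the number of shared amplifications divided by the total number of amplifications in the two profiles.

```python
from itertools import combinations

def transferability(row):
    """Percentage of species (1 = amplified, 0 = not) with the expected fragment."""
    return 100.0 * sum(row) / len(row)

def nei_li_distance(x, y):
    """1 - 2*shared / (n_x + n_y) for two binary amplification profiles."""
    shared = sum(1 for a, b in zip(x, y) if a and b)
    total = sum(x) + sum(y)
    if total == 0:
        return 0.0  # assumption: two empty profiles are treated as identical
    return 1.0 - 2.0 * shared / total

def upgma(profiles):
    """Average-linkage clustering; returns the tree as nested tuples."""
    sizes = {name: 1 for name in profiles}
    dist = {frozenset(p): nei_li_distance(profiles[p[0]], profiles[p[1]])
            for p in combinations(profiles, 2)}
    while len(sizes) > 1:
        # Merge the closest pair of clusters (sorted for a stable order).
        a, b = sorted(min(dist, key=dist.get), key=str)
        merged = (a, b)
        new_dist = {}
        for c in sizes:
            if c in (a, b):
                continue
            d_ac = dist[frozenset((a, c))]
            d_bc = dist[frozenset((b, c))]
            # Size-weighted mean keeps the distance an unweighted
            # average over all original members of the merged cluster.
            new_dist[frozenset((merged, c))] = (
                (sizes[a] * d_ac + sizes[b] * d_bc) / (sizes[a] + sizes[b]))
        dist = {k: v for k, v in dist.items() if a not in k and b not in k}
        dist.update(new_dist)
        sizes[merged] = sizes.pop(a) + sizes.pop(b)
    return next(iter(sizes))

# Made-up amplification profiles over four loci (not data from Table 4).
example_profiles = {
    "sp_A1": [1, 1, 1, 1],
    "sp_A2": [1, 1, 1, 1],
    "sp_B1": [1, 1, 0, 0],
    "sp_B2": [1, 1, 0, 0],
}
print(upgma(example_profiles))
```

In this toy matrix the two loci that fail to amplify in `sp_B1` and `sp_B2` are enough to separate them from the other two species, which mirrors how absence of amplification can drive the grouping in a dendrogram of this kind.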

Sequencing of PCR products and sequence analysis
The PCR products obtained using the pair of primers for locus Ah11 repens, A. subcoriacea, A. triseminata and A. valida) were purified using the Concert® Rapid PCR purification system (Invitrogen). The sequencing reaction mixture had a total volume of 10 μl: 2 μl of purified PCR product, 2 μl of Big Dye Terminator, 6 pmol of one primer, and 5.4 μl of water. The sequencing program consisted of 25 cycles of 96°C for 45 s, 55°C for 55 s, and 60°C for 4 min. The reactions were run in a PTC 100 cycler (MJ Research) followed by sequencing in an ABI Prism 377 sequencer. The sequences were edited using the Sequencher program (version 3.1) (GeneCodes). Sequence alignment and a neighbor-joining tree were obtained using Clustal X (version 1.8) [44].

Authors' contributions
AAH, AVGB and DAP have made substantial contributions in the acquisition, analysis and interpretation of data; MAG and CRL have made contributions to conception, design and interpretation of data. All authors have been involved in revising the manuscript critically and approved the final version.