Prior to this work, important advances were made in the construction of genetic maps of species of Eucalyptus. Although some RFLP markers were mapped in E. nitens  and E. globulus , the most extensive genetic mapping data have been accumulated mainly with dominant RAPD and AFLP markers [4, 42–45]. While RFLP markers are useful for comparative mapping across individuals, species and even more distant taxa, high-throughput genotyping, probe distribution and maintenance are difficult. RAPD and AFLP markers, on the other hand, allow the generation of several hundred markers and provide very good genome coverage, but have limited information content and are almost useless for comparative mapping studies and QTL validation across pedigrees and species. The novel set of 230 microsatellite markers reported herein, added to the 70 markers reported earlier , totals a relatively large set of 300 microsatellites that should allow significant advances in Eucalyptus genetic research. Furthermore, the linkage map presented, involving 234 mapped loci, spans an estimated ~90% of the recombining genome of Eucalyptus, making it the most comprehensive genetic linkage map of a forest tree to date based exclusively on microsatellite markers.
Consolidating all the development and screening data since our initial studies , from 450 primer pairs designed, we obtained 300 operationally usable markers, i.e. a final efficiency of marker development of 67%. Out of these 300 microsatellites, we were able to detect polymorphism and Mendelian segregation at 237, i.e. 79%, in this particular mapping population. Although no published study is yet available that evaluates large sets of microsatellite markers in detail in other Eucalyptus pedigrees, a similar proportion of informative markers could be obtained when genotyping tropical eucalypt progenies within section Latoangulatae of subgenus Symphyomyrtus. In fact, Missiaggia et al.  were able to easily select 100 informative microsatellites distributed throughout the Eucalyptus map in a cross of E. grandis × E. urophylla hybrid parents when mapping QTL for early flowering. When moving microsatellites to other commercially important species such as E. globulus (section Maidenaria) or E. camaldulensis (section Exsertaria), the issue becomes one of transferability first and then information content. While in an earlier study, based on 100 microsatellites, we indicated a transferability of 78% from E. grandis to E. dunnii (section Maidenaria) , estimates of transferability and information content are still limited to reports based on mapping a few microsatellites in E. globulus [17, 44], E. camaldulensis  and a slightly more extensive study involving E. globulus and E. tereticornis . These studies taken together suggest, however, that transferability of these 300 microsatellites within Symphyomyrtus and across sections, particularly to Maidenaria, where E. globulus and E. nitens belong, should remain around 80%. Once markers are robustly transferred, it is likely that polymorphism will be detected at a rate similar to the one in this study, i.e. 70 to 80%.
This entire set of 300 EMBRA microsatellite markers in addition to the 67 published by other groups should therefore allow positioning 200 or more informative markers on any segregating family involving any of the most planted species of Eucalyptus.
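The arithmetic behind these proportions, and behind the projection of 200 or more informative markers, can be laid out in a few lines. The sketch below uses only figures quoted in the text. It treats transferability (about 80%) and the polymorphism rate (70 to 80%) as independent multiplicative rates, which is a simplifying assumption, and the function name is purely illustrative.

```python
def expected_informative(n_markers, transferability, polymorphism):
    """Expected number of markers that both transfer and segregate.

    Assumes the two rates act independently (a simplification)."""
    return n_markers * transferability * polymorphism


designed, usable, segregating = 450, 300, 237
development_efficiency = usable / designed   # fraction of primer pairs that worked
informative_rate = segregating / usable      # fraction polymorphic in this pedigree

total_available = 300 + 67                   # EMBRA set plus markers from other groups
low = expected_informative(total_available, 0.80, 0.70)
high = expected_informative(total_available, 0.80, 0.80)

print(f"development efficiency: {development_efficiency:.0%}")
print(f"informative in this pedigree: {informative_rate:.0%}")
print(f"expected informative markers elsewhere: {low:.0f} to {high:.0f}")
```

Under these assumptions the lower bound still exceeds 200 markers, which is consistent with the expectation stated above.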
Besides microsatellites, other sources of genetic markers such as EST, gene-based and SNP markers have been [14, 17], and will likely increasingly be, mapped in Eucalyptus, providing important anchor loci and candidate genes for positional cloning efforts as well as association mapping experiments. These markers, however, will demand high-throughput typing techniques based on single nucleotide polymorphism assays if they are to be widely used across pedigrees. Other sources of microsatellites will also be important to sample regions of the genome that have not been covered so far. For example, 93 operational microsatellites were derived from a sample sequencing study of 3 megabases of shotgun DNA of Eucalyptus grandis . EST-derived microsatellites are another important source of novel microsatellites. Exploiting the large EST databases constructed in the Genolyptus project , we have been rapidly expanding the number of markers currently being mapped on a set of reference pedigrees. Three important aspects must be pointed out in this respect: (1) EST-derived microsatellites will efficiently complement the ones developed from enriched genomic libraries, sampling different portions of the Eucalyptus genome; (2) microsatellites in transcribed regions, specifically in untranslated regions such as the 5'-UTR, should be evolutionarily older than those in noncoding regions and thus are expected to be more polymorphic, as reported in a survey of some major monocot and dicot species ; (3) genetic mapping of EST-derived microsatellites will enrich the map with transcriptional information, opening up the perspective of co-localization of QTLs and candidate genes in regions of higher recombination.
Eucalyptus microsatellite features
Fully informative markers that allow integration of the parental maps have always been considered key elements for a more detailed examination of interaction among alleles at QTL in forest trees . The pseudo-testcross design and full marker informativeness are, evidently, mutually exclusive. RFLP-based markers do reveal such 3- or 4-allele segregation, although not to an extent that allows map-wide analysis of such detailed QTL properties. In Eucalyptus, 33% and 36% of fully informative RFLP markers were detected respectively [9, 17]. We originally reported 80% of fully informative microsatellite markers based on a mapped set of 20 markers , a proportion biased upward due to a stronger selection of polymorphic markers that was later revised to a more realistic 60 to 70% . In this study 128 of the 234 mapped microsatellites (55%) were fully informative and no markers segregating in a 1:2:1 configuration were detected. Thamarus et al.  found a similar proportion of fully informative markers, 24 of 40 (60%), using a different set of microsatellites in an intraspecific E. globulus pedigree, and also did not detect any marker segregating 1:2:1. These results taken together, and now based on a larger set of mapped markers from different sources, indicate that around 60% of a screened set of Eucalyptus microsatellites should segregate in a fully informative fashion. Furthermore, the fact that the pedigree used in this study is interspecific should not significantly increase the proportion of fully informative markers, because E. grandis and E. urophylla, although separate species, belong to the same section (Latoangulatae).
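The distinction between fully informative, 1:2:1 and testcross (1:1) configurations can be illustrated by enumerating the progeny genotype classes of a cross. The sketch below is only illustrative: allele labels are arbitrary single letters, the function names are hypothetical, and null alleles and dominance are ignored.

```python
from collections import Counter
from itertools import product


def progeny_ratio(parent1, parent2):
    """Expected progeny genotype classes for a diploid cross.

    Each parent is a two-letter string of allele labels, e.g. "ab".
    Genotypes are unordered, so "ba" is counted as "ab"."""
    classes = Counter(
        "".join(sorted(g1 + g2)) for g1, g2 in product(parent1, parent2)
    )
    return dict(classes)


def is_fully_informative(parent1, parent2):
    """True when all four gametic combinations give distinct genotypes,
    i.e. the marker segregates 1:1:1:1 with three or four alleles."""
    return len(progeny_ratio(parent1, parent2)) == 4


print(progeny_ratio("ab", "cd"))  # four alleles: four classes, 1:1:1:1
print(progeny_ratio("ab", "ac"))  # three alleles: four classes, 1:1:1:1
print(progeny_ratio("ab", "ab"))  # both parents share the same two alleles: 1:2:1
print(progeny_ratio("ab", "aa"))  # pseudo-testcross: only one parent segregates, 1:1
```

Only the first two configurations allow the meioses of both parents to be followed at the same locus, which is what makes such markers useful for integrating the parental maps.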
The occurrence of null alleles at microsatellites is a general phenomenon reported in essentially all species where the two-generation analyses required for genetic mapping or paternity determination have been carried out (reviewed in ). In this study, null alleles were inferred at 20 (8%) of the 241 segregating marker loci. Most markers were in fact homozygous null in E. urophylla but amplified both alleles in E. grandis (Table 2). The overall frequency of loci displaying null alleles was only 2% in E. grandis but 8% in E. urophylla, most likely reflecting the fact that the microsatellites were originally developed from an enriched E. grandis library. No other genetic mapping report of Eucalyptus mentions the frequency of microsatellite markers with null alleles. However, the result of this study suggests that even for microsatellites deemed transferable across species, the frequency of null alleles should increase as we move to species more distantly related to E. grandis. The presence of null alleles in heterozygosity is not a problem for the construction of the separate parental maps. When the two segregating alleles are scored in a binary fashion, it is sufficient to observe only one allele while the other is scored as null. However, for the construction of the consensus map, the complete genotypic class information is necessary to perform the analysis, resulting in the exclusion of loci with one or more null alleles. In fact, none of the six fully informative markers with one or more null alleles segregating could be positioned on the consensus map. It will be interesting and important to accumulate data on genetic mapping and null allele frequency for all the microsatellites available for Eucalyptus, so as to arrive at a robust set of markers with a low frequency of sequence polymorphism in the microsatellite priming sites. EST-derived microsatellites will likely supply a good source of such markers.
Combining the linkage information derived from the two parental maps and the consensus map, a total of 234 marker loci were consistently mapped at LOD 3.0. A larger number of markers segregated and were mapped on the E. grandis map (202) than on the E. urophylla map (160). In principle this could be due to a higher level of heterozygosity in the E. grandis parent tree. However, a previous survey of randomly distributed sequence polymorphism with RAPD markers in these same two parents did not show a significant difference in the number of segregating markers, with 272 heterozygous markers from E. grandis and 286 from E. urophylla assayed with the same set of arbitrary sequence primers . The observed difference in mappable microsatellites is most likely due to the incomplete transferability of markers between these two species, as they were originally developed from an E. grandis library. Fourteen microsatellites did not amplify in E. urophylla (Table 2). In addition, some E. grandis microsatellite loci, although yielding amplicons in E. urophylla at the same locus defined by the flanking primer sequences, could be bearing modified simple sequence repeats in E. urophylla. This occurrence, i.e. amplification of a PCR product but absence of sequence polymorphism, has been observed when attempting to transfer microsatellites across related species [53, 54]. In a microsatellite transferability study between Quercus and Castanea, despite the high sequence identity at the flanking regions observed for 14 loci mapped in corresponding linkage groups, the repeat motif in the non-source species was in some cases shortened and/or modified .
Consensus map construction
The effect of merging parental maps into a consensus map on the final quality of locus ordering and on estimates of recombination fraction has been a matter of debate. For example, while Maliepaard et al.  suggested that merging linkage maps with large differences in recombination rates can result in incorrect marker orders in the integrated map, Lespinasse et al.  found that even with significant differences in recombination between the parental meioses, the merged maps displayed only slight differences in marker order. In order to better evaluate the effect of combining segregation data from both parents in a single map, we chose to present the separate parental species maps built using a widely used approach and software (pseudo-testcross and Mapmaker ) and compare them with the consensus map resulting from the integrated segregation data analyzed with Outmap. Although 17% of the two-point estimates of recombination frequency for the same pairs of microsatellite markers differed considerably between the two parental maps (on average by 25%), an overall analysis showed no significant difference in the mean recombination fraction between adjacent markers when comparing the two parental maps. This result indicates that the reported map distances between adjacent markers on the consensus map should be adequate average estimates. As expected, the total size of the consensus map was thus intermediate between the two parental maps. The consensus map had an observed length of 1,567.7 cM and a mean inter-marker distance of 8.4 cM. This mean distance does not fall between the mean inter-marker distances of the two parental maps (10.7 and 9.2 cM), as one might expect (Table 1). The consensus map is, in fact, a newly constructed map based on the consolidation of segregating markers inherited from both parents.
While the total map distance of the consensus was a close average between the two parental maps, the total number of markers mapped on the consensus map is larger or at least equal to the individual parental maps, thus resulting in a denser map and a reduced average inter-marker distance.
All linked microsatellites mapped consistently on the same linkage groups in the two parental maps and 82% of the markers mapped with the same order along the homologous parental linkage groups with most order changes concentrated on linkage groups 1, 8, 9 and 10 (Figure 2). Although biological reasons such as chromosomal rearrangements between the two species could be involved, these order changes are most likely attributable to analytical causes. These include scoring errors due to allele drop-outs generating apparent recombination events and artifacts of the consensus mapping algorithm when attempting to define order between markers segregating only from one or the other parent with fully informative ones, leading to local reordering of adjacent markers.
Considering a total of 241 microsatellite markers amplified for E. grandis and E. urophylla, only 12 were expected to be distorted by chance at p ≤ 0.05. However, 29 (12%) were identified, although only one would remain significantly distorted after applying a stringent Bonferroni correction. All distorted markers but one were mapped on the consensus map and more than half clustered mainly into two linkage groups (groups 2 and 8), with the others scattered across eight linkage groups (Figure 2). The detection of segregation distortion at greater levels than expected by chance has been the rule in mapping reports for many species of plants (e.g. [58, 59]). In Eucalyptus, essentially all the mapping reports to date detected a significant deviation from the expected proportion of distorted markers, although at different levels, usually higher in interspecific [4, 42, 43, 45] than in intraspecific crosses [9, 17]. Distorted markers usually cluster in specific regions of the genome, therefore excluding genotyping errors as a potential cause. Several post-zygotic selection phenomena could be causing such segregation distortions. However, in highly heterogeneous undomesticated forest trees such as Eucalyptus, the most likely cause involves the expression of deleterious alleles in heterozygous condition  or hybrid incompatibility when crossing divergent species. Myburg et al.  observed high levels of distortion in backcross families of an E. grandis × E. globulus F1 hybrid, and used this information to perform a whole-genome analysis of post-zygotic barriers between these two species. Although it was possible to demonstrate that positive and negative heterospecific interactions affect introgression rates in such a wide interspecific pedigree, the fact that the study was carried out with dominant AFLP markers precluded a more detailed analysis of the sources of distortion.
As properly pointed out in that study, the availability of a large set of microsatellites as described in this report, will be a powerful tool to further investigate the nature of post-zygotic barriers in Eucalyptus and thus guide advanced generation hybrid breeding, an exceptionally powerful approach that has been commonly used in Eucalyptus to derive elite clones.
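The reasoning behind the expected number of distorted markers and the Bonferroni threshold can be sketched as follows. This is a minimal illustration for markers segregating 1:1 only (one degree of freedom); fully informative markers segregating 1:1:1:1 would need a test with three degrees of freedom. The allele counts in the example are hypothetical and are not taken from this study.

```python
import math


def chi2_1to1(n_a, n_b):
    """Chi-square goodness-of-fit test of a 1:1 segregation (1 d.f.).

    Returns the statistic and its p-value. For one degree of freedom the
    chi-square survival function equals erfc(sqrt(x / 2))."""
    expected = (n_a + n_b) / 2
    stat = ((n_a - expected) ** 2 + (n_b - expected) ** 2) / expected
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


n_loci = 241
alpha = 0.05
expected_by_chance = n_loci * alpha   # about 12 loci, as quoted in the text
bonferroni_alpha = alpha / n_loci     # per-locus threshold after correction

# Hypothetical marker: 58 progeny carry one allele, 34 carry the other.
stat, p_value = chi2_1to1(58, 34)
print(f"expected distorted by chance: {expected_by_chance:.0f}")
print(f"Bonferroni threshold: {bonferroni_alpha:.5f}")
print(f"distorted at 0.05: {p_value <= alpha}")
print(f"distorted after Bonferroni: {p_value <= bonferroni_alpha}")
```

A marker such as this hypothetical one is flagged at the nominal 5% level but not after correction, which is the pattern described above for most of the 29 distorted loci.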
The consensus map reported in this work does not contain the RAPD markers originally used to bridge the microsatellites mapped earlier [12, 15]. Although the RAPD markers could have been integrated, trying to pack in a very large number of markers could lead to a reduced likelihood support for marker order of the microsatellites, the main focus of this study. Furthermore, given the very limited or nil transferability of RAPD markers to other pedigrees, their presence would add little if any information for future QTL mapping studies. It is important to note, however, that high-throughput, high-multiplex typing methods such as RAPD and particularly AFLP will continue to be important complementary tools in Eucalyptus genetics for high-density mapping  and high-resolution positioning of disease resistance loci (e.g. ) to eventually allow map-based cloning efforts. Novel ultra-high-throughput genotyping methods for transferable, sequence-specific markers such as DArT, successfully evaluated for Eucalyptus , and SNP arrays , combined with the framework of microsatellites described in this work, will most likely provide a robust platform for integrative high-density QTL mapping in the genus Eucalyptus.
Genome length and map coverage
Published estimates of genome length for species of Eucalyptus have varied between 919 cM and 1551 cM, although most estimates to date have remained around 1300 to 1500 cM (reviewed in ), based mostly on markers generated with arbitrary sequence primers. In this work, based only on microsatellites, we obtained a total genome length of 1,814.5 cM for the female E. grandis map, 1,133.4 cM for the E. urophylla male map and 1,567.7 cM for the consensus map. While the lengths of the E. urophylla and consensus maps agree with most estimates to date, the E. grandis total length is larger. This observed length is probably inflated by a few markers that, although grouping at LOD >3.0, extend the map in a disproportionate way. They do not map or map in a different order on the consensus map and should therefore be viewed with caution. They were, however, included on the maps to provide their preliminary linkage group assignment. These markers are: EMBRA114 on group 3, extending the map by 54 cM; EMBRA88 on group 8, extending the map by 37.7 cM; EMBRA140 on group 9, extending the map by 57 cM; and EMBRA102 and EMBRA40 on group 10, extending the map by over 100 cM and displaying map order on the E. urophylla on the consensus map. By removing these five markers from the map we arrive at a more conservative total length of 1,562.3 cM. As a reference, we compared these observed lengths with the ones obtained earlier for these same parental trees based on 240 and 251 RAPD markers . For E. grandis, the conservative estimate of 1,562.3 cM is close to the 1,552 cM of the RAPD map, and for E. urophylla the microsatellite map length of 1,133.4 cM is also close to the 1,101 cM of the RAPD map. No framework mapping adjustment was carried out in this study, as the main objective was to provide the most likely map position for all microsatellite markers reported. However, using the estimated total map length based on the RAPD framework maps (1,620 cM for E. grandis and 1,156 cM for E. urophylla, ), the microsatellite parental maps reported in this study cover respectively 96.4% and 98% of the estimated genome length. We were, however, interested in estimating genome coverage for the consensus map. Using a conservative estimate of the proportion of microsatellite markers that would map as framework markers (i.e. with log likelihood support for order of at least 3.0), we estimated an observed genome coverage of 93% and a theoretical expected genome coverage of 88.6%. These estimates allow us to propose that the microsatellite consensus map covers approximately 90% of the genome.
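The parental coverage figures follow directly from the ratio of observed map length to estimated genome length, as the sketch below shows using the values quoted in the text. The second function is an assumption: it implements a commonly used random-marker model, c = 1 - exp(-2dn/L), for the proportion of a genome of length L lying within d cM of at least one of n markers. The text does not state which formula was used for the theoretical estimate, so that function is shown without being applied to this map.

```python
import math


def observed_coverage(map_length_cm, genome_length_cm):
    """Fraction of the estimated genome length spanned by the map."""
    return map_length_cm / genome_length_cm


def expected_coverage(n_markers, max_dist_cm, genome_length_cm):
    """Assumed random-marker model (not confirmed by the text):
    proportion of the genome within max_dist_cm of at least one marker."""
    return 1.0 - math.exp(-2.0 * max_dist_cm * n_markers / genome_length_cm)


# Values quoted in the text: conservative microsatellite map lengths
# against genome lengths estimated from the RAPD framework maps.
grandis = observed_coverage(1562.3, 1620.0)
urophylla = observed_coverage(1133.4, 1156.0)

print(f"E. grandis coverage: {grandis:.1%}")
print(f"E. urophylla coverage: {urophylla:.1%}")
```

Under the assumed model, expected coverage rises towards 1 as markers are added, so it would be called with the number of framework markers, a chosen distance in cM and the estimated genome length.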
Consolidation of linkage data and comparative mapping in Eucalyptus
Using EMBRA markers that were mapped in other independent studies, it was possible to assign a further 41 microsatellites to this consensus map at the linkage group level. Defining the exact order of these 41 microsatellites relative to the other markers on the linkage groups will require genotyping them on this same set of progeny individuals. However, the consolidation of linkage data carried out in this study demonstrates the power that a more comprehensive map of microsatellites provides for expanding the opportunities for comparative mapping across Eucalyptus species.
Linkage group numbering adopted for this map follows the one originally established for RAPD marker maps. This was an arbitrary numbering that was nevertheless kept to allow integration of microsatellites on the existing maps. Other reports where RFLP, AFLP, EST, other microsatellites and candidate genes were mapped have used different numbering. There is clearly a need to unify linkage group numbering for Eucalyptus species to facilitate the continued addition of new markers and genes. While the numbering proposed here is a first step in this direction, the establishment of a correct numbering system for the chromosomes, and hence for the linkage groups, should derive from cytogenetic studies using BACs previously screened with specific microsatellites as in situ hybridization probes.
This consolidation of microsatellite linkage mapping data will also expand the prospects of making comparative analysis of putative QTL synteny such as that carried out in Eucalyptus by Marques et al.  for vegetative propagation traits and by Thamarus et al.  for wood density QTLs. Another interesting opportunity is the proposition of putative candidate genes for major effect QTLs. For example, an early flowering QTL named Eef1 was recently mapped by Missiaggia et al.  on linkage group 2 flanked by markers EMBRA27 and EMBRA164. Linkage group 2 corresponds to linkage group 4 of Thamarus et al.  where EAP1, the Eucalyptus functional equivalent of the Arabidopsis Apetala1 gene was mapped. As the ectopic expression of the EAP1 in Arabidopsis driven by the 35S promoter caused plants to flower earlier  this comparative mapping information provides an interesting lead to test this gene as a candidate underlying the Eef1 QTL. In a similar way, the linkage group assignment of the CCR gene at the tip of linkage group 10 of Thamarus et al.  indicates that this candidate gene should be located at one of the tips of linkage group 10 of this consensus map, either close to EMBRA33 and EMBRA10 or at the other end close to EMBRA155 and EMBRA127. Polymorphisms at the CCR gene have recently been associated with variation in microfibril angle in Eucalyptus nitens and E. globulus . This combined information can be very valuable for a directed screening of microsatellite markers linked to CCR in populations segregating for microfibril angle in an attempt to validate this QTL in different populations or Eucalyptus species. In an analogous way, once microsatellites are mapped close to the EgMYB2 gene, recently shown to co-localize with a QTL for lignin content , it will be possible to validate this QTL and evaluate the effect of this candidate gene in variable genetic backgrounds.