The final consensus map comprised 2029 loci spanning 1603.5 cM, following the integration of 6 individual maps derived from 6 distinct RIL mapping populations. It has allowed us to map a larger number of markers than was possible in any individual map, to obtain more complete coverage of the sorghum genome and to fill a number of gaps on individual maps. Only two other published sorghum genetic linkage maps are of comparable marker density: the BTx623/IS3620C map, consisting of 2926 loci spanning 1713 cM, and the BTx623/S. propinquum map, consisting of 2512 loci spanning 1059.2 cM. Both of these previously published maps have a higher overall marker density than the present DArT consensus map (1 marker/0.59 cM and 1 marker/0.42 cM, respectively, vs. 1 marker/0.79 cM in the presented consensus map). However, these maps are based on high numbers of RFLP markers or AFLP markers, and it can be argued that the sequential nature of gel-based marker systems such as RFLPs and AFLPs involves high costs and is more labour intensive per assay; DArT markers may therefore represent the most suitable markers for molecular breeding strategies. DArT markers, with their high multiplexing level (all the DArT markers reported here were analysed in a single assay per population), offer sorghum breeding programs an alternative, low-cost approach to whole-genome profiling. The final consensus map presented here consists predominantly of DArT markers (1190; 59%), in addition to 839 non-DArT markers (497 RFLPs, 334 SSRs or STSs and 8 morphological markers).
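The marker densities quoted above follow directly from the loci counts and map lengths given in the text. As a quick check, a minimal Python sketch of the arithmetic is shown here; the dictionary layout and variable names are illustrative only, and the figures are those stated in the paragraph above.

```python
# Loci counts and map lengths (cM) as stated in the text.
maps = {
    "BTx623/IS3620C": (2926, 1713.0),
    "BTx623/S. propinquum": (2512, 1059.2),
    "DArT consensus": (2029, 1603.5),
}

# Average spacing in cM per marker (smaller value = denser map).
cm_per_marker = {
    name: round(length_cm / loci, 2) for name, (loci, length_cm) in maps.items()
}

# Share of DArT markers on the consensus map (1190 of 2029 loci).
dart_percent = round(100 * 1190 / 2029)

for name, spacing in cm_per_marker.items():
    print(f"{name}: 1 marker/{spacing} cM")
print(f"DArT markers: {dart_percent}%")
```

The same calculation can be applied per chromosome to locate regions where the consensus map is sparser than the average.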
The overall consensus map marker order was in good agreement across the individual maps. Locally, the consensus map resolution was slightly compromised by occasional inconsistencies in groups of markers, commonly covering about 1–6 cM, and also by swaps of individual markers over even longer distances. The majority of the 77 observed marker order inconsistencies involved closely spaced markers. Inversion is a common feature of closely spaced markers, and this phenomenon has been observed previously when aligning different sorghum maps [27, 30]. These marker order rearrangements could be real, could be due to error in one of the small mapping populations, or could be explained by the statistical uncertainty of orders at the cM scale that is inherent in datasets derived from a limited number of RILs. Of the 498 markers in common across all 6 maps, only 5 mapped to a truly incongruous location on the corresponding linkage groups in alternative populations, which could be explained by the mapping of paralogous loci in different populations. A similar 1% frequency of paralogous loci was recently observed by  when aligning genetic linkage maps derived from both inter- and intra-specific sorghum populations. Such marker ordering inconsistencies are frequently observed for consensus maps and can be related to the overall number and distribution of commonly mapped bridge markers used for building the framework of the consensus map. For constructing the present DArT consensus map, 251 markers were used as bridge markers (12.5% overall), spaced at average intervals of 5.4 cM. This bridge marker frequency is comparable to other recent consensus map studies, including  who used 10% of all markers as bridge markers to construct a consensus map for barley from 3 doubled haploid populations.
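To illustrate what a local marker order inconsistency between two maps looks like, the following Python sketch flags adjacent pairs of shared markers whose order is reversed in a second map. It is a simplified illustration, not the procedure used in this study; the function name, marker names and positions are all hypothetical.

```python
def order_inconsistencies(ref_map, other_map):
    """Return adjacent pairs of shared markers whose order is reversed.

    Both arguments map marker name -> position (cM) on one linkage group.
    """
    # Only markers present on both maps can be compared.
    shared = sorted(set(ref_map) & set(other_map), key=lambda m: ref_map[m])
    flagged = []
    for left, right in zip(shared, shared[1:]):
        # 'left' precedes 'right' on the reference map; flag the pair if the
        # other map places them in the opposite order.
        if other_map[left] > other_map[right]:
            flagged.append((left, right))
    return flagged


# Hypothetical positions (cM) on one linkage group in two maps.
reference = {"m1": 0.0, "m2": 3.1, "m3": 4.0, "m4": 9.5, "m5": 12.2}
component = {"m1": 0.0, "m3": 2.8, "m2": 3.9, "m4": 10.1, "m6": 15.0}

inversions = order_inconsistencies(reference, component)
print(inversions)
```

In this toy example only the closely spaced pair m2/m3 (0.9 cM apart on the reference map) is flagged, which mirrors the observation that most inconsistencies involve closely spaced markers.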
Differences in local recombination frequencies (map length) between populations can also affect marker ordering between maps, and the importance of similar recombination frequencies across individual maps when constructing a consensus map has previously been noted . A difference ratio was therefore calculated per chromosome, derived from the equation for the distance measurement of interval variables  by , to compare the genetic distances on each map with the TAMU-ARS base map. The overall difference ratios in genetic distance between the TAMU-ARS map and the five other maps were low, varying from 0.0045 (S4) to 0.12 (S5), and were comparable with a recent study  that calculated a difference ratio of 0.05 between two sorghum maps. The low difference ratios observed indicate that there is good agreement in overall distances between common marker pairs across the component maps used in this study. This agreement also provides justification for the "neighbours" consensus map construction strategy adopted here and for the use of the TAMU-ARS genetic distances for the locus positions of the bridge markers along each chromosome. It can also be argued that map distance estimates are less important than marker order, as map distances do vary between different genetic linkage maps by several centimorgans , and that the marker order is the most critical feature for further application of the map, for example, for map-based cloning. Additionally, the synthetic approach to consensus map development, based on the integration of separately constructed component maps, was recently reported to be the preferable consensus map construction strategy, compared to building a consensus map de novo from an integrated set of segregation data , at least until improved or alternative software options become available.
Consensus map features
The non-random distribution of markers across the consensus map, due to both clusters and gaps of markers across chromosomes, is a feature that has also been observed in previous sorghum maps. Figure 4 indicates that there is a clustering of markers around the centromere for every chromosome, with the exception of SBI-06. Such marker-dense regions around the centromeres were also observed by . This is also supported by the recent observation by  that the pericentromeric heterochromatic regions of sorghum chromosomes showed much lower rates of recombination (~8.7 Mbp/cM) compared to euchromatic regions (~0.25 Mbp/cM), with the average rate of recombination across the heterochromatic portion of the sorghum genome being ~34-fold lower than recombination in the euchromatic region. Similarly, the sparseness of markers on the short arm of SBI-06 could also be explained by the observations of  that this chromosome arm showed a relatively low rate of recombination compared to other regions of euchromatin (~2.3 Mbp/cM vs. the overall average of ~0.25 Mbp/cM). Both DArT and non-DArT markers clustered around the centromeres; however, a slightly higher overall proportion of DArT markers was observed in these regions (71% of all markers in the centromeric regions, compared with 59% across the whole map). This is in contrast to the recent high-density DArT consensus map developed for barley, for which  found that DArT markers were significantly less clustered at most centromeric regions of barley chromosomes compared to non-DArT markers. Marker redundancy can also enhance the non-random marker distribution pattern. In previous studies [32, 38, 39], a low level of DArT marker redundancy has been observed; however, during the process of consolidating the most informative DArT clones in new arrays, the large majority of redundant markers are excluded from the final DArT array, and hence DArT marker redundancy should be minimised.
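The fold differences in recombination rate implied by the Mbp/cM figures above can be checked with a short calculation. The Python sketch below uses only the approximate values quoted in the text; the variable names are illustrative, and because the inputs are rounded the result should be read as approximate.

```python
# Physical distance per unit of recombination (Mbp/cM), as quoted in the text.
# A larger value means less recombination per unit of DNA.
MBP_PER_CM = {
    "pericentromeric heterochromatin": 8.7,
    "euchromatin (overall average)": 0.25,
    "SBI-06 short arm": 2.3,
}

euchromatin = MBP_PER_CM["euchromatin (overall average)"]

# How many times lower recombination is, relative to average euchromatin.
fold_lower = {
    region: round(value / euchromatin, 1)
    for region, value in MBP_PER_CM.items()
    if region != "euchromatin (overall average)"
}

for region, fold in fold_lower.items():
    print(f"{region}: ~{fold}-fold lower recombination than euchromatin")
```

The heterochromatin value is consistent with the ~34-fold difference cited above, and the same ratio suggests that recombination on the short arm of SBI-06 is roughly 9-fold lower than the euchromatic average.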
In addition to the uneven distribution of recombination events along chromosomes and the potential for the confounding effects of marker redundancy, non-random marker distribution can also be due to the preferential survey of DNA polymorphism that is unevenly distributed along chromosomes. In particular, areas of low marker density may correspond to regions of similar ancestry or identity by descent in the germplasm included in the initial diversity representation for the development of the sorghum DArT markers . In the present DArT consensus map, there were 3 gaps larger than 10 cM: one on the distal end of the long arm of SBI-05, one on the distal end of the long arm of SBI-08 and one on the distal end of the short arm of SBI-09. These regions of low marker density may therefore be associated with genomic regions that were identical by descent or that had very limited genetic variability in the initial diversity representation used for the development of the DArT array. An alternative hypothesis is that because, in total, nine of the twelve parental genotypes of the six mapping populations used in this study were included in the initial diversity representation, the gaps could be a true reflection of co-ancestral regions between the parents, as opposed to a result of the composition of the array, and may be suggestive of genomic regions containing key adaptive genes that have been fixed by selection through the pedigree. Regions of low marker density have been observed previously; even on the densest meiotic linkage map produced yet, for potato , a gap spanning 14 recombination units was observed. The authors  postulate that this could be due to recombination hot spots or could indicate fixation (homozygosity) of the potato genome in this region. Non-random marker distribution can also be associated with other interesting features of sorghum genome organisation. It has also been noted  that sorghum chromosomes have cytologically distinguishable knobs, which may account for some marker excesses or deficiencies.
Approximately 75% of the consensus map (524 markers spanning 1495 cM) was associated with markers which had skewed segregation in one or more of the six component maps. However, only 407 (19.8% of the markers on the consensus map) of the 524 skewed markers were linked by less than 5 cM to other markers showing distortion. The 117 markers with skewed segregation that were linked by at least 5 cM to markers that were not distorted could reflect residual levels of heterozygosity in the lines (when scored with dominant markers) due to either natural or artificial selection, sampling bias due to lower numbers of markers in these regions, or mis-scoring of the markers. Skewed segregation was observed for both DArT and non-DArT markers; no single marker type showed a particular tendency for skewness. Marked differences were observed, however, in the distribution of markers with skewed segregation across chromosomes, although there was some similarity between the component maps, e.g. the short arm of SBI-01 showed skewed marker segregation in four of the six maps (TAMU-ARS, S2, S4 and CIRAD). Highly significant deviation from the expected 1:1 segregation ratio on SBI-01 towards the BTx623 allele was also observed by , which affected almost the entire linkage group. The authors  also noted other reports of similar skewed segregation in the same genomic region and observed that strong and consistent segregation distortion in one genomic region is less likely to be due to sampling error and more likely suggests selection favouring one parental allele. On the DArT consensus map, SBI-01 has the highest proportion of chromosomal regions associated with skewed segregation (67%). Two other chromosomes (SBI-04 and SBI-08) also have over 50% of the chromosomal regions associated with skewed segregation (51.6% and 54.1%, respectively), once again also observed by . SBI-07 has a significantly lower proportion of the chromosome associated with skewed segregation (9.6%) than any other chromosome on the consensus map. This non-random and consistent distribution pattern of skewed segregation lends weight to previous proposals [18, 25, 40, 41] that distorted segregation is due to the elimination of gametes or zygotes by a lethal factor located in a neighbouring region of the marker. Higher frequencies of skewed markers have also been observed in RIL populations, compared to doubled haploid, backcross or F2 population structures , due to increased opportunities for selection across generations; all six component maps in the current study are based on RIL populations.
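Deviation from the expected 1:1 segregation ratio in a RIL population is commonly assessed with a chi-square goodness-of-fit test on the two parental allele counts. The Python sketch below shows one such test under stated assumptions: the allele counts are hypothetical, the function name is illustrative, and the specific test and significance threshold used for the component maps are not given in this section.

```python
import math


def segregation_chi2(n_parent_a, n_parent_b):
    """Chi-square test of observed allele counts against a 1:1 ratio (1 d.f.)."""
    expected = (n_parent_a + n_parent_b) / 2
    chi2 = ((n_parent_a - expected) ** 2 + (n_parent_b - expected) ** 2) / expected
    # For one degree of freedom, the chi-square upper-tail probability
    # equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts of lines carrying each parental allele at two markers.
balanced_chi2, balanced_p = segregation_chi2(48, 52)
skewed_chi2, skewed_p = segregation_chi2(70, 30)

print(f"balanced marker: chi2={balanced_chi2:.2f}, p={balanced_p:.3f}")
print(f"skewed marker:   chi2={skewed_chi2:.2f}, p={skewed_p:.2e}")
```

Because many markers are tested across a map, a run of adjacent markers skewed towards the same parent is more convincing evidence of selection than a single significant marker, which is the reasoning behind the 5 cM linkage criterion discussed above.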
Of the 1997 markers included in the DArT consensus map, 35 mapped to different chromosomes in the component maps. The frequency of multicopy markers detected in this study (1.8%) is much lower than observed by , who found that 17% of RFLP probes mapped to multiple locations. This could be explained by the differences in marker types. It has been found that DArTs, as a hybridisation-based bi-allelic marker, inherently select against multi-locus markers , as the hybridisation intensities measured for such multi-locus markers tend to appear monomorphic. Variation in the frequency of multicopy markers was observed across chromosomes, with SBI-07, SBI-10, SBI-02 and SBI-05 having a multicopy marker frequency greater than 5%. SBI-06 had the lowest multicopy marker frequency (1.1%). A tendency for the multicopy markers to be present in the centromeric regions across chromosomes was also observed, with approximately 22% of all multicopy markers occurring in the pericentromeric heterochromatic regions, whilst overall only 13% of all markers included in the consensus map are located in the centromeric regions. Centromeric suppression of recombination is associated with the accumulation of repeated sequences  and could explain the tendency towards marker duplication. The non-random distribution of multicopy loci across chromosome pairs has been reported previously [20, 26]. It has been observed  that the duplication of sorghum chromatin closely resembles the pattern for rice, showing ancient duplications in some regions. However, very little evidence was found in the current study for colinearity between chromosomes, lending weight to the argument against an ancient polyploidisation event in the evolution of the sorghum genome [42–44]. It has also been previously observed  that 30% of the sorghum genome showed correspondence to two or more unlinked intervals, which the authors postulated could either be due to very localised colinearity or reflect more recent duplications superimposed on more ancient ones.
Utility of the consensus map for genomics and breeding applications
The DArT consensus map presented in this paper will help link information on sorghum diversity and QTLs to the sorghum physical map and to the sorghum genome sequence. The availability of the primer sequence information for the majority of SSRs (http://sorgblast3.tamu.edu/linkage_groups.htm) and probe sequence information for a subset of RFLP markers with the prefixes bcd, bnl, cdo, csu, psb, RG, rz and umc (http://cggc.agtec.uga.edu/) included on the consensus map already provides immediate opportunities to anchor the presented consensus map to the physical map, hence facilitating sequence mapping of known genes from other species, taking advantage of known syntenic relationships between sorghum, rice, maize and other grasses [45, 46], in addition to a positional cloning approach to identify candidate genes underlying QTLs flanked by sequenced mapped SSRs or RFLPs. To demonstrate this, 42 RFLPs included on the consensus map were sequence mapped on the rice genome (TIGR; http://rice.plantbiology.msu.edu/) and bin-mapped on the maize genome (MaizeGDB; http://www.maizegdb.org/); data are presented in Additional File 4. The syntenic genomic regions between sorghum, rice and maize were largely as expected at the macro-level [45, 46]. However, with the recent availability of both the rice and sorghum whole genome sequences, and the on-going sequencing of the maize genome, not only macro-level synteny but also genic microsynteny can now be further explored. As an example, comparisons for fifteen predicted genes (downloaded from ftp://ftp.jgi-psf.org/pub/JGI_data/Sorghum_bicolor/v1.0/Sbi/) in the 265,271 bp euchromatic region between the two RFLP markers rz630 and umc90 on the sorghum genome (SBI-01) were made between rice and sorghum. BLAST similarities between the sorghum predicted genes and the rice sequence, requiring BLASTn hits with E ≤ 1e-10, are detailed in Additional File 5. Over 73% conserved synteny among the 15 predicted genes was observed, comparable to microsyntenic levels (72%) observed previously  in euchromatic genomic regions in rice and sorghum. Far greater microcolinearity has also been observed  in euchromatic regions, compared to heterochromatic regions.
Further detailed evaluation of the level of genic microcolinearity, both in euchromatic and heterochromatic regions, between rice and sorghum based on the whole genome sequence analysis will provide invaluable knowledge for cereal scientists and will provide new opportunities for sorghum researchers to link QTL and gene information aligned to genetic linkage maps directly to the whole genome sequence and predicted genes. The on-going sequencing of the sorghum DArT clones, when integrated with the whole genome sequence, offers many opportunities to greatly accelerate gene discovery and analysis, in addition to the opportunity to convert the recombination fractions on the consensus map to physical map distances (cM to kb), affording new prospects for the progress of genomic applications. The sorghum whole genome and DArT clone sequences can also be exploited for targeted marker development for specific genomic regions. Because DArT clones are easily sequenced, DArT markers have a significant advantage for positional cloning efforts over AFLPs, which are difficult to sequence and therefore cannot be readily integrated into the whole genome sequence.
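The microsynteny comparison described above combines an E-value filter on BLASTn hits with a count of the predicted genes retaining a significant match. The Python sketch below illustrates that bookkeeping under stated assumptions: the hits are hypothetical rows in the standard 12-column BLAST tabular layout, the gene and chromosome names are invented, and any hit passing the cutoff is treated as conserved, which is a simplification (the study's own results are those in Additional File 5).

```python
E_VALUE_CUTOFF = 1e-10  # threshold stated in the text (E <= 1e-10)

# Hypothetical BLASTn hits in tabular layout: qseqid sseqid pident length
# mismatch gapopen qstart qend sstart send evalue bitscore.
HYPOTHETICAL_HITS = "\n".join([
    "gene1\tOs03\t88.5\t900\t100\t3\t1\t900\t5001\t5900\t1e-120\t450",
    "gene2\tOs03\t81.0\t300\t55\t2\t10\t310\t9100\t9400\t2e-35\t150",
    "gene3\tOs07\t75.0\t60\t15\t0\t1\t60\t200\t260\t3e-04\t40",
    "gene4\tOs03\t90.2\t1200\t115\t2\t1\t1200\t12000\t13200\t0.0\t900",
])


def genes_with_significant_hit(tabular_text, cutoff=E_VALUE_CUTOFF):
    """Return the query gene IDs that have at least one hit with E <= cutoff."""
    significant = set()
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank and comment lines
        fields = line.split("\t")
        if float(fields[10]) <= cutoff:  # column 11 holds the E-value
            significant.add(fields[0])
    return significant


def percent_conserved(all_genes, conserved_genes):
    """Percentage of the predicted genes that retain a significant match."""
    return 100.0 * len(set(all_genes) & conserved_genes) / len(all_genes)


predicted_genes = ["gene1", "gene2", "gene3", "gene4", "gene5"]
conserved = genes_with_significant_hit(HYPOTHETICAL_HITS)
print(sorted(conserved), percent_conserved(predicted_genes, conserved))
```

In this toy data set gene3 fails the E-value cutoff and gene5 has no hit at all, so three of five genes count as conserved. For a 15-gene interval, 11 genes with a conserved match would correspond to 73.3%.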
An additional use of the presented DArT consensus map is in whole genome profiling-assisted breeding. The marker density on the consensus map is sufficient to provide a better choice of markers for specific breeding populations to ensure adequate polymorphic marker coverage in regions of interest. Further, the marker density on the consensus map is suitable for whole genome pedigree analysis, and calculating identity-by-descent through generations. The consensus map provides a large number of markers along the length of the chromosome that can be used to genotype individuals for detecting recombinants, fixing loci, restoring a recurrent genetic background, or assembling complex genotypes in complex crosses. The co-location of a range of marker types (DArTs, RFLPs and SSR markers) on the consensus map will enable sorghum breeders to quickly identify target loci through whole-genome DArT scans and then select markers of interest from the same region for marker-assisted selection.