Over the last decade, many plant pathogen resistance (R) genes or quantitative trait loci (QTL) have been cloned. The largest class of known R genes encodes proteins with a central nucleotide binding (NB) domain and a C-terminal leucine-rich repeat (LRR) domain. Based on the amino-terminal domain, the NB-LRR proteins can be divided into two classes, TNL (TIR-NB-LRR) and CNL (CC-NB-LRR), in which the R proteins possess a Toll/Interleukin-1 Receptor (TIR) domain or a coiled-coil (CC) domain, respectively. The NB domain seems to have NTP-hydrolyzing activity for regulating signal transduction through conformational changes. The LRR domain contains tandemly arrayed repeats that are involved in the specific recognition of pathogen effectors. Both TIR and CC domains are assumed to be involved in protein-protein interactions and signal transduction [4, 5].
Due to the availability of whole genome sequences, NB-encoding resistance gene homolog (RGH) sequences have been annotated and mapped in a number of plant species such as Arabidopsis thaliana, poplar (Populus trichocarpa), potato (Solanum tuberosum) [8, 9], rice (Oryza sativa), sorghum (Sorghum bicolor), grapevine (Vitis vinifera), coffee tree (Coffea arabica), Medicago truncatula, and papaya (Carica papaya). While NB-LRR genes are widely distributed among plant genomes, their numbers vary greatly among species. For example, the papaya and grapevine genomes contain 55 and 535 NB-LRR RGHs, representing 0.2% and 1.8% of their total genes, respectively [12, 15]. A lack of recent genome duplication was believed to be the reason for the overall low number of NB-LRR genes in papaya. NB-encoding genes are unevenly distributed in the plant genome and are mainly organized in multi-gene clusters. The clustered distribution of R genes is assumed to provide a reservoir of genetic variation from which new pathogen specificities can evolve via gene duplication, unequal crossing-over, ectopic recombination or diversifying selection [17, 18]. In addition, nucleotide polymorphism analyses demonstrated extremely high levels of inter- and intra-specific variation in NB-LRR genes, which presumably evolve rapidly in response to changes in pathogen populations [12, 19]. Nevertheless, conservation of synteny for NB-LRR disease resistance genes among phylogenetically related species has also been observed [20, 21]. The extent of genome-wide conservation and synteny of NB-LRR RGHs between different species, however, is not well documented.
Cucumber, Cucumis sativus L. (2n = 2x = 14), is an economically important vegetable crop and a system of choice for studying several important biological processes. In recent years, the application of next generation sequencing technologies has enabled the release of draft genomes of three cucumber lines (‘9930’, ‘Gy14’ and ‘B10’) [23–25], providing powerful tools for understanding the structure and organization of R genes in the cucumber genome. In the 9930 draft genome, 61 NB-containing RGHs were identified, but no details were given for these RGHs, and their number appears to be underestimated when compared with an improved annotation of the 9930 genome (Version 2.0). Thus, one objective of the present study was to conduct genome-wide identification and characterization of NB-LRR type RGHs in the Gy14 draft genome assembly (Version 1.0). Since the ratio of genetic to physical distance varies along the chromosomes (for example, ), information on the genetic map locations of RGHs, especially on a high-density reference genetic map, is very useful for map-based cloning of R genes or association mapping through the candidate gene approach. The association of RGHs with candidate disease resistance genes has been well established in a number of crops such as melon (Cucumis melo) [29, 30], wheat (Triticum aestivum), cucumber, sunflower (Helianthus annuus), and potato. Information on the genetic and physical locations of RGHs has also allowed rapid map-based cloning of several R genes or QTL in rice [35–37], poplar and common bean (Phaseolus vulgaris).
Cultivated cucumber has a very narrow genetic base [28, 40, 41], making it difficult to develop high-density genetic maps. From whole genome sequences, tens of thousands of simple sequence repeat (SSR) markers have been developed [24, 42]. Among all SSR-based cucumber genetic maps constructed thus far [27, 28, 42–46], the one by Ren et al. with 995 SSR loci has the highest marker density. However, this map was developed with a limited number of recombinant inbred lines (RILs) from an inter-subspecific cross between Gy14 and the wild cucumber (C. sativus var. hardwickii) accession PI 183967 (CSH-RIL map hereinafter). Strong recombination suppression was found in this mapping population, and more than one quarter of the mapped loci were clustered across five chromosomes (3, 4, 5, 6 and 7). As a result, the total genetic distance of this map is only 572.9 cM, which is shorter than the expected ~750 cM map length for the cucumber genome. The most recent intra-varietal linkage map of cultivated cucumber was developed with an F2 population of Gy14 × 9930 (CSS-F2 map hereinafter); it contains 735 marker loci with a total map length of 707.8 cM, which allowed for integration of the genetic and physical maps to develop a chromosome-level draft genome assembly of Gy14 (Version 1.0). While such maps are a significant improvement over the earlier AFLP- or RAPD-based maps, marker density on this map is still far from satisfactory for many molecular marker-based applications such as marker-assisted breeding, map-based gene cloning or assembly of a more complete cucumber genome.
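As a rough illustration of the figures quoted above, the following minimal Python sketch computes the mean locus spacing and the fraction of the expected map length covered by the two maps. The locus counts and map lengths are taken from the text; the function names, the dictionary labels and the approximation of mean spacing as map length divided by locus count are assumptions made here for illustration only.

```python
# Map statistics as quoted in the text; dictionary keys are illustrative labels.
MAPS = {
    "CSH-RIL": {"loci": 995, "length_cM": 572.9},
    "CSS-F2": {"loci": 735, "length_cM": 707.8},
}
EXPECTED_LENGTH_CM = 750.0  # approximate expected map length quoted in the text


def mean_spacing(loci: int, length_cm: float) -> float:
    """Average distance per locus, approximated as map length / locus count."""
    return length_cm / loci


def coverage(length_cm: float, expected_cm: float = EXPECTED_LENGTH_CM) -> float:
    """Fraction of the expected genetic length covered by a map."""
    return length_cm / expected_cm


for name, m in MAPS.items():
    spacing = mean_spacing(m["loci"], m["length_cM"])
    frac = coverage(m["length_cM"])
    print(f"{name}: {spacing:.2f} cM per locus, {frac:.0%} of expected length")
```

A mean spacing of this kind hides clustering: in the CSH-RIL map more than one quarter of the loci fall in recombination-suppressed clusters, so a small average interval does not imply even coverage of the genome.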
For cultivated crops like cucumber with limited genetic diversity, development of a dense consensus map is a method of choice to increase marker density. This is usually achieved through map integration, which synthesizes information from multiple segregating populations of diverse genetic backgrounds. Integration allows a larger number of loci to be mapped than in most single crosses, saturating the map and providing a genomic framework for QTL identification, map-based gene cloning, assessment of genetic diversity, association mapping, and marker-assisted selection in molecular breeding. Consensus maps have been constructed in a number of crop species such as lettuce (Lactuca sativa), grapevine, cowpea (Vigna unguiculata), red clover (Trifolium pratense), sorghum, soybean (Glycine max) [53, 54], melon, and oilseed rape (Brassica napus). In cucumber, a consensus map with 1,369 mapped loci was also developed by integrating the CSH-RIL map (Gy14 × PI 183967 RIL) and the S94 × S06 RIL map. A major drawback of this consensus map is that marker orders in the recombination-suppressed regions were not well resolved, which greatly affects the accuracy of the locus order and the quality of the resulting integrated cucumber map. Thus, the second objective of the present study was to develop a high-density consensus map for cultivated cucumber by integrating several individual maps and to anchor all NB-LRR type RGHs identified herein onto this integrated map.
We first scanned the Gy14 draft genome and bioinformatically identified and characterized 70 NB-containing RGH sequences. We then investigated their in silico expression in the cucumber transcriptome and the conservation of sequence homology and colinearity between the cucumber and melon genomes. Through comparison between the Gy14 and 9930 draft genome sequences, we identified DNA polymorphisms in the regions harbouring the RGHs, and genetically mapped these RGH loci on the Gy14 × 9930 F2 linkage map (CSS-F2 map). By integrating three component maps, we developed a cucumber consensus map that contained 1,681 loci and anchored 67 RGH loci and 10 cucumber genes.