As a component of the gene-for-gene resistance mechanism in plants, resistance genes play an important role in recognizing products encoded by specific avirulence genes of a pathogen. In this study, 365 tentative RGLs from common bean were identified by data-mining 454-derived sequences generated in our lab together with common bean ESTs in GenBank. About 60% (218) of the identified PvRGLs came from 454-derived sequences; moreover, 166 (76.15%) of these 218 PvRGLs were new transcripts.
ESTs are highly valuable for genome annotation and gene structure prediction. 454 sequencing is faster and more cost-effective than the Sanger method for producing sequence data and is capable of producing 400 to 600 million base pairs per run with read lengths of 400 to 500 base pairs. It has been successfully used to maximize gene discovery, improve gene predictions, and detect SNPs and mutations [52, 53]. In the past few years, data-mining approaches have been successfully used to isolate RGLs or RGAs from sugarcane, wheat and maize. As reported previously, Dilbirligi and Gill adopted four different data-mining methods (domain search, individual and multiple motif searches, consensus sequence search, and individual full-length search) to mine R-gene-like wheat sequences, and showed that the individual full-length search was the most successful. Using that search with a permissive E value cutoff of e-1, they isolated 243 NBS-LRRs in addition to 101 other types of expressed R-gene candidates. Xiao et al. used three methods (modified AFLP, RACE and data-mining) to isolate RGAs and R-gene-like ESTs from maize and found that data-mining was the most efficient. A total of 186 expressed RGAs were recovered from 550,000 maize ESTs using a moderate E value of e-10 or better. Rossi et al. identified 88 RGLs from sugarcane ESTs using a very stringent E value of e-50 or better; these represented three major classes of R genes, namely NBS-LRR, LRR-TM and PK. These three reports show that the choice of E value cutoff strongly affects the number of RGAs or RGLs recovered. In the present study, a moderate E value of e-10, similar to that used in maize, was used to mine common bean RGLs, and a total of 365 tentative PvRGLs were identified.
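The effect of the E value cutoff on the number of recovered candidates can be illustrated with a minimal sketch. It assumes similarity-search hits in a 12-column tab-separated layout with the E value in the eleventh column; the hit records, sequence names, and the helper `queries_passing` are hypothetical and are not taken from this study's pipeline.

```python
from io import StringIO

# Hypothetical hits in a 12-column tabular layout (assumption):
# query, subject, %identity, length, mismatches, gaps,
# q.start, q.end, s.start, s.end, E value, bit score
HITS = (
    "est_001\tRgene_A\t92.5\t210\t12\t1\t1\t210\t5\t214\t3e-60\t230\n"
    "est_002\tRgene_B\t71.0\t150\t40\t2\t1\t150\t10\t159\t2e-15\t85\n"
    "est_003\tRgene_A\t45.0\t80\t40\t3\t1\t80\t30\t109\t0.04\t35\n"
    "est_004\tRgene_C\t88.0\t300\t30\t1\t1\t300\t1\t300\t1e-52\t200\n"
    "est_002\tRgene_C\t60.0\t120\t45\t2\t5\t124\t20\t139\t5e-8\t55\n"
)


def queries_passing(hits_text, max_evalue):
    """Return the set of query IDs with at least one hit at or below max_evalue."""
    passing = set()
    for line in StringIO(hits_text):
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skip malformed or empty lines
        query, evalue = fields[0], float(fields[10])
        if evalue <= max_evalue:
            passing.add(query)
    return passing


# Count candidate sequences at the three cutoffs discussed in the text.
counts = {
    cutoff: len(queries_passing(HITS, cutoff))
    for cutoff in (1e-1, 1e-10, 1e-50)
}

for cutoff, n in counts.items():
    print(f"E <= {cutoff:g}: {n} candidate sequences")
```

On this toy input the candidate set shrinks as the cutoff becomes more stringent, which is the qualitative pattern described above; the actual numbers in any real search depend on the database and the query set.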
Of the 365 tentative PvRGLs, 29 belonged to the NBS-LRR type, 96 to the LRR, LRR-TM, and LRR-PK types, and 229 to the PK type, while six and five contained sequences similar to putative TM domains and Toxin reductase domains, respectively. The number of RGLs identified in the present study was about twice that reported in maize (365 versus 186), for three reasons. Firstly, 1.77 million 454-derived sequences and common bean ESTs were screened to identify RGLs in common bean, about three times as many sequences as in maize (550,000 ESTs). Secondly, 454 sequencing generates more sequence data than Sanger sequencing, so transcripts expressed at extremely low levels can be detected. Finally, some identified PvRGLs match the same R genes or RGLs and may therefore represent a single gene more than once. For example, PvRGL083 and PvRGL236 match different regions of the same BAC-end sequence.
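The class counts and the comparison with maize quoted above can be checked with a few lines of arithmetic. This is only a consistency check on the figures stated in the text; the variable names are illustrative.

```python
# Class counts as stated in the text.
class_counts = {
    "NBS-LRR": 29,
    "LRR / LRR-TM / LRR-PK": 96,
    "PK": 229,
    "TM": 6,
    "Toxin reductase": 5,
}

total = sum(class_counts.values())

# Percentage share of each class, rounded to one decimal place.
shares = {name: round(100 * n / total, 1) for name, n in class_counts.items()}

# Ratios behind "about twice" and "about three times".
rgl_ratio = 365 / 186               # common bean PvRGLs vs maize RGAs
seq_ratio = 1_770_000 / 550_000     # sequences screened, bean vs maize

print(total, shares)
print(round(rgl_ratio, 2), round(seq_ratio, 2))
```

The five classes sum to the reported total, and the two ratios are consistent with the "about twice" and "about three times" statements.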
RT-PCR was used to examine the expression of the PvRGLs in this study. Results indicated that all of the selected PvRGLs were expressed in the leaves of genotype Sierra (Figure 1). In contrast, many RGLs or RGAs amplified from genomic DNA using degenerate primers, or mined from whole-genome sequences, are not expressed. Previously, eight classes of disease-resistance related sequences were amplified from common bean DNA using degenerate primers based on the conserved NBS domain. Expression analysis indicated that three of these RGAs (SB1, SB3 and SB8) were not expressed. In Lotus, 62 NBS-encoding sequences were considered pseudogenes because they encode incomplete protein sequences. In Arabidopsis, at least 12 NBS-LRR-encoding genes were predicted to be pseudogenes due to frameshift and nonsense mutations.
The PvRGLs discovered in this study correspond to most of the 25 previously reported common bean R genes or RGAs in the PRGdb database. PvRGL266 and PvRGL275 have strong hits to the P. vulgaris TL5601 disease resistance protein gene, with amino acid identities of 90% and 85%, respectively. TL5601, located in the I locus of common bean, controls resistance to Bean Common Mosaic Virus. PvRGL262 matched coiled-coil NBS-LRR (CNL)-B11 with an amino acid identity of 94%. CNL-B11 was mapped to the B4 R gene cluster, which contains at least three R genes (Co-9, Co-y, and Co-z) and QTL effective against anthracnose and Bean golden yellow mosaic virus [55, 56]. PvRGL309 corresponded to part of a polygalacturonase-inhibiting protein (PGIP) gene. PGIP can inhibit fungal endopolygalacturonases and is considered an important factor in plant resistance to phytopathogenic fungi. PvRGL294 was identical to, but much longer at the 5' end than, the previously reported RGLs SB3 and OB9.
So far, most of the known R genes have been cloned by map-based cloning and transposon tagging approaches. Therefore, mapping RGLs and RGAs to genomes and/or genetic maps is very important and will facilitate R gene cloning. In this study, 105 PvRGLs could be integrated into the common bean FPC physical map by comparing PvRGLs to P. vulgaris BAC-end sequences. Additionally, we were able to anchor 237 PvRGLs to the common bean genetic map by using conserved syntenic blocks between common bean and soybean. The PvRGLs are broadly but unevenly distributed among the 11 linkage groups of common bean, with a strong tendency to cluster. For example, 41 PvRGLs, the largest number, were mapped to Pv8, and 17 of them clustered at the bottom of Pv8. David et al. (2008) also found that many specific R genes against various pathogens cluster together: for example, in the B4 resistance gene cluster, 73 BAC clones (FI159954 - FI160067) were identified by using an NBS probe, PRLJ1, to screen a common bean BAC library. In the present study, PvRGL173 was anchored at the top of Pv4 by in silico mapping. PvRGL173 also shows high similarity to BAC-end sequences FI160023 (92.56%, E value 5e-153) and FI159996 (96.79%, E value 1e-101), both of which fall within this accession range, so PvRGL173 is likely located in the B4 resistance gene cluster. The B4 resistance gene cluster contains various R genes conferring resistance to different pathogens, such as Colletotrichum lindemuthianum (anthracnose), Uromyces appendiculatus (rust), and Pseudomonas syringae pv. phaseolicola (halo blight), in addition to QTLs for resistance to Bean golden yellow mosaic virus (BGYMV) and anthracnose; therefore, further work will be needed to determine the role of PvRGL173. In Medicago, two superclusters of disease resistance genes have been identified. One is located at the top of Mt3 and contains 73 CNL- and 9 TNL-encoding sequences. The other is located at the bottom of Mt6 and contains 57 TNL-encoding sequences.
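The notion of clustering on a linkage group can be made concrete with a small sketch. The text does not state a formal clustering criterion, so the rule used here (consecutive loci no more than `max_gap` apart, at least `min_size` loci per cluster) is an assumption, and the map positions are hypothetical rather than taken from the PvRGL map.

```python
def find_clusters(positions, max_gap, min_size=3):
    """Group sorted map positions into clusters.

    A cluster is a run of loci in which each locus lies within max_gap
    of the previous one; runs shorter than min_size are discarded.
    """
    clusters, current = [], []
    for pos in sorted(positions):
        # Close the current run when the gap to the next locus is too large.
        if current and pos - current[-1] > max_gap:
            if len(current) >= min_size:
                clusters.append(current)
            current = []
        current.append(pos)
    if len(current) >= min_size:
        clusters.append(current)
    return clusters


# Hypothetical positions (in cM) of candidate loci on one linkage group.
positions_cM = [2.0, 15.5, 40.1, 41.0, 41.8, 43.2, 70.0, 88.5, 89.0, 89.9]

clusters = find_clusters(positions_cM, max_gap=2.0)
for c in clusters:
    print(f"{len(c)} loci between {c[0]} and {c[-1]} cM")
```

With these toy positions the isolated loci are dropped and two groups remain, one in the middle and one near the end of the linkage group. The thresholds would need to be chosen to suit the marker density of the actual map.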
Clustering of R genes facilitates genetic variation among R genes and promotes the evolution of new R genes. Bertioli et al. analyzed synteny among Arachis, Lotus, and Medicago and found that retrotransposons were associated with the evolution of some resistance gene clusters. Several other mechanisms, such as duplication, gene conversion, and unequal crossing-over, have been proposed to explain R gene clustering and evolution [1, 38]. Further studies will be needed to better understand R gene evolution in common bean.