Genome-wide association study and transcriptome analysis discover new genes for bacterial leaf blight resistance in rice (Oryza sativa L.)

Background Rice (Oryza sativa) bacterial leaf blight (BLB), caused by the hemibiotrophic Xanthomonas oryzae pv. oryzae (Xoo), is one of the most devastating diseases affecting the production of rice worldwide. The development and use of resistant rice varieties or genes is currently the most effective strategy to control BLB. Results Here, we used 259 rice accessions, which are genotyped with 2 888 332 high-confidence single nucleotide polymorphisms (SNPs). Combining resistance variation data of 259 rice lines for two Xoo races observed in 2 years, we conducted a genome-wide association study (GWAS) to identify quantitative trait loci (QTL) conferring plant resistance against BLB. The expression levels of genes, which contains in GWAS results were also identified between the resistant and susceptible rice lines by transcriptome analysis at four time points after pathogen inoculation. From that 109 candidate resistance genes showing significant differential expression between resistant and susceptible rice lines were uncovered. Furthermore, the haplotype block structure analysis predicted 58 candidate genes for BLB resistance based on Chr. 7_707158 with a minimum P-value (–log 10 P = 9.72). Among them, two NLR protein-encoding genes, LOC_Os07g02560 and LOC_Os07g02570, exhibited significantly high expression in the resistant line, but had low expression in the susceptible line of rice. Conclusions Together, our results reveal novel BLB resistance gene resources, and provide important genetic basis for BLB resistance breeding of rice crops. Supplementary Information The online version contains supplementary material available at 10.1186/s12870-021-03041-2.


Background
Rice (Oryza sativa) is providing approximately 20% dietary energy supply for world's people [1]. However, rice production worldwide is severely threatened by bacterial leaf blight (BLB), a plant disease caused by Xanthomonas oryzae pv. oryzae (Xoo) [2]. Xoo is now prevalent in ricegrowing areas of world, but due to its host-shifting capacity, this bacterium also threatens wheat production in both south America and Asia [3,4]. Damage from this disease has led to rice production losses of 20%-30%, reaching devastating levels of up to 80%-90% in India and Philippines [5,6]. Generally, this damage begins at the tillering stage, becoming more widespread as the incidence of disease increases with host plant growth peaking at the flowering stage [7]. The easiest way to prevent BLB is to apply chemical pesticides; however large-scale use of a variety of pesticides threatens the safety of rice food products. Additionally, because BLB spreads rapidly, such chemical control applied in a monsoon climate is ultimately unpractical, since once a BLB infestation occurs on a large scale, its effective control by pesticides is difficult if not impossible [8]. Therefore, developing and applying resistant rice cultivars is the most effective way to control this disease and ensure food security.
Most studies of BLB resistance have focused on a single resistant parent or bi-parental genetic mapping populations. To genetically map many agronomic traits and disease resistance loci in plants, genome-wide association study (GWAS) based on high-density single nucleotide polymorphisms (SNPs) and next-generation sequencing has been widely used [24,25]. For example, Li et al. (2015) detected 97 loci associated with resistance to stripe rust in wheat via GWAS methods; and genes conferring resistance to Verticillium dahlia were identified in cotton by GWAS [26]. For rice, researchers have used GWAS to detect 27 loci related to rice blast resistance [27]. Yet due to its insufficient marker density and linkage disequilibrium (LD), GWAS does not provide an accurate target gene at a given locus. Transcriptome analyses can overcome this limitation by detecting and distinguishing the expression of candidate genes of different genotypes. Recently, Wen et al. [28] identified a set of candidate genes associated with white mold resistance in soya bean by combining transcriptome and GWAS approaches.
In this study, 259 rice lines were inoculated with two Xoo races P3 and P6, respectively, to evaluate their BLB resistance. We performed a GWAS of BLB resistance using 2 888 332 high-confidence SNPs (missing data < 20%; minor allele frequency [MAF] > 1%). Building on this, we explored candidate resistance genes by analyzing the transcriptomes of the most resistant rice line NSIC RC154 and the most susceptible line CT 9737-6-1-3P-M, at five post-infection time points (0, 12, 24, 48, and 72 h). Our results allowed us to detect candidate genes linked to BLB resistance in these rice lines. This should provide important gene resource for improving disease resistance breeding in rice.
Population-structure and phylogenetic-tree analysis indicated that these 259 rice varieties contained three subgroups (K = 3): landrace indica cultivar (94), improved indica cultivar (146) and japonica (19) (Fig. 1B). According to the principal component analysis (PCA) of these 259 rice lines, the 29.54% genetic variation was explained on the first two PCs (Fig. 1C). These results indicated that the rice varieties used in this study harbor abundant genetic variation in the core germplasm of rice. The LD decay for 259 rice cultivars was estimated to be 194 kb (Fig. 1D), suggested the rice lines exhibited moderate LD [29].

Phenotypic variation among rice cultivars
All 259 rice lines were inoculated with two Xoo races (i.e., P3 and P6) at the booting stages, and the resistant phenotypes to each were investigated over 2 years. For the P3 Xoo race, the BLB disease severity index (DSI) in all lines ranged from 0.26% to 99.46% in 2018 (average = 37.94%) and from 2.82% to 100% in 2019 (average = 30.34%). For P6, the corresponding BLB DSI values were 1.46-85.32% (2018; average = 30.92%) and 2.49-97.84% (2019; average = 31.92%) ( Fig. 2A-D; Table S1). The wide range of BLB DSI values observed in the different rice lines (a 16-and 12-fold difference, respectively, for the two Xoo races) demonstrated substantial genotypic variability is associated with rice resistance to BLB. Additionally, the BLB incidence rate for the two races was normal ( Fig. 2A-D). Analysis of variance (ANOVA) of BLB DSI for both races revealed significant differences among genotypes, implying the presence of dominant loci conferring BLB resistance. Crucially, the level of BLB resistance was highest among the improved cultivars and lowest among the landraces (Fig. 2E, F). This latter result indicated that artificial selection has been successful in rice breeding applications.

GWAS for bacterial leaf blight resistance
Based on the 2 888 332 high-confidence SNPs, the GWAS of two resistance traits (each Xoo race was presumed to select for different plant trait) was conducted by a mixed linear model (MLM). For the GWAS analysis, phenotypic data of these two traits (in 2 years) and their best linear unbiased prediction (BLUP) values of each trait per year were used. A total of 196 and 164 SNP loci were significantly associated with resistance to P3 and P6, respectively (− log 10 P ≥ 5) ( Fig. 3; Tables S5, S6, S7, S8, S9 and S10). Intriguingly, 63 SNP loci of them were common to all traits. Next, the GWAS was applied to the selected 240 varieties of indica rice. Similarly, the phenotypic data of the two traits and their yearly BLUP values were used this analysis. Among significant association SNPs, when compared with results of the GWAS implemented for all 259 rice varieties, the strongest signals was also identified on Chr.7 ( Figure S1). However, some known loci did not    Figure S1). Accordingly, the 19 japonica rice varieties can contribute to the BLB GWAS, so the GWAS results of all 259 rice varieties were investigated further. Candidate genes were detected within 200 kb upstream and downstream of the significant associated SNP, according to the LD decay of the rice genome (Fig. 1D). A total of 2 081 genes (significant for 2 years and BLUP values) were detected for P3, and likewise 1 297 genes were significantly detected for P6. Of those, 954 genes were common to both resistance traits (Tables S5, S6, S7, S8, S9 and S10). The number of these genes was higher on Chr. 7 and Chr. 11, when compared with the other rice chromosomes (Table S5, S6, S7, S8, S9 and S10).
We summarized 41 previously reported fine-mapped QTLs or genes related to BLB resistance using Oryzabase database (http:// www. shigen. nig. ac. jp/ rice/ oryza base/ gene/ list). To further confirm these significant genes or SNPs found associated with BLB, the results were compared with those of 41 previously reported fine-mapped QTLs or genes. Of the loci detected in the P3 strain, the sixteen previously QTLs fine-mapped QTLs or genes were identified ( Fig. 3A-C). For the loci that were detected in the P6 strain, seven SNPs were adjacent to previously reported QTL or genes ( Fig. 3D-F). This confirmed that our results were reliable, has a strong potential for more deeply exploring novel resistance genes in rice.

Comparison of the transcriptomes of resistant NSIC RC154 and susceptible CT 9737-6-1-3P-M lines of rice
Because of the strong population structure and large extent of LD in rice, GWAS-identified loci often fall within gene deserts or in regions with many equally plausible causative genes. This makes it difficult to robustly identify functional genes. Combining the results of GWAS with RNA-seq data has been used to detect the function genes.
Furthermore, previous reports also indicated that genes that were present different expression levels in different resistance lines are most likely to be association with disease resistance [28].
The resistant line NSIC RC154 and the susceptible line CT 9737-6-1-3P-M exhibited a substantially different response to the Xoo inoculation; the number of differentially expressed genes (DEGs) in NSIC RC154 (11 674; 3 167 up-regulated and 8 507 down-regulated) (Table S12) (Table S13). Among these DEGs, although 8 851 DEGs of them were common by the two rice varieties at different time points, 2 817 DEGs were only found in the resistance line NSIC RC154 (Table S14). Therefore, those latter genes may figure prominently in conferring resistance to BLB. At earlier stages (12 h and 24 h) of inoculation, 9 023 DEGs (2 477 up-regulated, and 6 546 down-regulated) were detected in the resistant line NSIC RC154, compared with 8 040 DEGs (2 201 up-regulated, and 5 839 down-regulated) in the susceptible line CT 9737-6-1-3P-M. When challenged with the pathogen, NSIC RC154 harbored more DEGs than did CT 9737-6-1-3P-M at the early stages. This suggested that NSIC RC154 reacts more strongly than CT 9737-6-1-3P-M to pathogen attack and invasion. We presumed that NSIC RC154 might have numerous resistance-related genes that are primed for a quick response to pathogen infection.

Expression analysis of the GWAS-identified genes
Among the 954 genes found significantly associated with resistance against the two Xoo races, 161 (36 upregulated, 125 down-regulated) were DEGs in the resistant rice line NSIC RC154, and 197 (64 up-regulated, 140 down-regulated) were DEGs in the susceptible rice line CT 9737-6-1-3P-M. Furthermore, a set of 109 DEGs that underwent significant differential expression between resistant and susceptible rice lines was detected (Fig. 4A). Gene Ontology (GO) analysis revealed a stark enrichment of DEGs in several functional categories, namely signal transducer activity, purine nucleoside binding, receptor activity, motor activity, and molecular transducer activity (Fig. 4B). The KEGG (Kyoto Encyclopedia of Genes and Genomes) annotations indicated that the pathways enriched with these DEGs were closely related to phenylpropanoid biosynthesis, biosynthesis of secondary metabolites, photosynthesis, diterpenoid biosynthesis, as well as the biosynthesis of stilbenoid, diarylheptanoid and gingerol (Fig. 4C). Elucidating the detailed mechanisms of BLB resistance based on these results is difficultly; however, these DEGs were great importance as they most likely as candidate genes for enhancing resistance to BLB.
Furthermore, among those 109 DEGs with significant differential expression levels between resistant and susceptible rice lines, nearly half (45) were expressed more in the resistant than susceptible line (Fig. 4A). These 45 genes contain a large number of defense-associated protein encoded genes, such as the NBS-LRR disease resistance protein encoding the genes LOC_Os11g44960, LOC_Os11g45050, LOC_Os07g02570, LOC_Os11g44990, LOC_Os07g02620, and LOC_Os07g02560. It is known that the NBS (nucleotide-binding state)-LRR (Leucine-rich repeat) disease resistance protein is critically involved in plant-pathogen interactions, and its NBS play a key role in regulate activity of this protein [30]. Perhaps more importantly, of the six NBS-LRR disease resistance protein-encoding genes, LOC_Os07g02570, LOC_ Os11g44990, LOC_Os07g02620, and LOC_Os07g02560 displayed induced up-regulation in the resistant line at 24 hpi, whereas no significant changes in those four genes were detectable in the susceptible line after the Xoo inoculation. Reactive oxygen species (ROS) is important in plants by triggering plant basal defense responses [31]. The peroxidase precursor-encoding gene, LOC_Os07g02440, was up-regulated in the resistant line at 24 hpi, but not induced in the susceptible line at all time points tested (Fig. 4A). The expression of this gene implies that ROS could be crucially involved in marshaling defense against the Xoo pathogen early in its infection of rice.
Moreover, the natural-resistance-associated macrophage protein (NRAMP)-encoding gene LOC_Os07g15460, the glycosyl hydrolase family 3 protein-encoding gene LOC_Os11g44950, and the cytochrome P450-encoding gene LOC_Os02g17760 all underwent significantly higher expression in the resistant line compared with the susceptible line. According to other research, iron is a key element for most living organisms, and pathogens are likely to compete with their hosts for its acquisition [32]. The bacterial plant pathogen Dickeya dadantii depends strongly on its siderophore-mediated iron uptake system for systemic disease progression on several host plants, including Arabidopsis thaliana [33]. In rice plants, several metal ions such as Zn 2+ , Mn 2+ , Fe 2+ , and Cd 2+ are transported via NRAMP transporter proteins [34]. In our study, the NRAMP-encoding gene LOC_Os07g15460 was found This gene may thus contribute to resistance to BLB via iron transfer. Furthermore, three protein kinase-encoding genes (LOC_Os11g44660, LOC_Os11g44560, and LOC_ Os07g02450) were down-regulated in the resistant line by the Xoo inoculation, but their expression levels were not induced in the susceptible line's transcriptome at any time point examined. Further research should try to functionally validate effects of these genes, which is needed to reveal the molecular mechanisms of complex BLB resistance traits in rice.

Expression validation of candidate genes by qRT-PCR
The seven candidate genes were selected for verify the RNA-Seq data. These consisted of six encoding NBS-LRR disease resistance protein (LOC_Os11g44960, LOC_ Os07g02560, LOC_Os07g02570, LOC_Os11g44990, and LOC_Os07g02440) and one protein kinase gene (LOC_ Os11g44660) (Fig. 5). The threshold cycle (Ct) values of each gene were normalized relative to those of the UBQ gene (internal control). The relative expression levels of these genes were detected using qRT-PCR method and compared with transcriptome results. The results indicated that these genes expressed differently in the resistance line after the Xoo P6 strain inoculation, in way that was consistent with the RNA-Seq data (Fig. 5).

Identification of new BLB resistance genes by haplotype and expression analyses
In this study, the strongest signals identified on Chr.7 were novel (Fig. 3), and the region of SNPs located on Chr.7 contained many genes that encode NBS-LRR resistance proteins. The candidate region was estimated using pairwise LD correlations. We focused on the locus mapped from 0.747 to 0.981 Mb with 58 candidate genes (Fig. 6A). Among these genes, the transcriptome data uncovered two NLR protein-encoding genes, LOC_Os07g02560 and LOC_Os07g02570, which went significantly high expression in the resistant line but had low expression in the susceptible line of rice (Fig. 6B). To confirm the functioning of LOC_Os07g02560 and LOC_Os07g02570 in BLB resistance, six resistant and six susceptible varieties were selected to examine the expression levels of LOC_Os07g02560 and LOC_Os07g02570 after inoculation with Xoo P6. Expression of these two genes was higher in the resistant than susceptible varieties at 24 hpi (Fig. 6C). We also analyzed the sequences difference of these two candidate genes in resistant (NSIC RC154) and susceptible (CT 9737-6-1-3P-M) varieties. We found four SNPs (Chr. Chr.7_926479, A-G) in coding region were detected. These results indicated those two genes are significantly associated with resistance to BLB, and so they may be promising candidate resistance genes for this disease.

Discussion
For 259 rice accessions, we evaluated their resistance levels to BLB over 2 years. The results showed that not all rice sub-populations were equally resistant. Moreover, among the resistant levels of the 259 rice lines, there was a remarkable disparity between the two Xoo pathogen races tested, which indicated that the R genes with specific resistance were only carried by certain rice varieties. This finding is consistent with those of a smaller study by Zhang et al. [35], in which the level of indica rice resistance differed significantly among six Xoo races, with the latter divided into three groups based on the lesions' size (length values). Furthermore, the resistance level of each line could be inferred from their incidence rate: < 10%  Error bars represent standard errors from four biological replicates (*, P < 0.05; **, P < 0.01; ***P < 0.001) possibility of selecting appropriate materials for accelerating BLB-resistance breeding and the genetic study of rice.
GWAS is an important approach for detecting the function genes of complex traits. It has been used to detect new genes associated with resistance to diseases and important agronomic traits in plants. In maize, for instance, Li et al. [36] used GWAS to illuminate the role of ZmFBL41, which encodes an F-box protein, in that corn plant's resistance to sheath blight. In this study, we identified 63 BLB resistance loci, containing 954 significantly associated genes, though a GWAS of 2,888,332 high-confidence SNPs. The five significant SNPs found, namely Chr. 12_17579641, Chr. 11_28745675, Chr. 11_21379864, Chr. 08_27163888, and Chr. 11_17915331 were respectively located near the cloned R genes xa25 [37], Xa26 [14], Xa21 [38], Xa23 [39], Xa33 [40], xa13 [16], and Os-11N3 [41] (Fig. 3). These results indicate that a relatively high resolution of GWAS is attainable when using a relatively large population, which considerably strengthened the investigation of genetic diversity and generation of a high-density SNP map for rice. By successfully combining the GWAS with transcriptome data to discover BLB resistance candidate genes in rice, novel loci were identified on chromosomes 7 (Chr.7_707158) that were significantly associated with rice resistance to both P3 and P6. Our data provide important information for future gene function studies of BLB resistance. It is anticipated these findings can serve as a robust reference for function gene discover on complex traits in rice and other plant species.
Compared with the susceptible line CT 9737-6-1-3P-M, the resistant line NSIC RC154 had more upregulated genes at early time points following the pathogen inoculation. Furthermore, in combining the GWAS and transcriptome data, 109 significant associations with BLB-resistance DEGs were obtained. After assigning these genes to KEGG functional annotation, the pathways of phenylpropanoid biosynthesis, biosynthesis of secondary metabolites, photosynthesis, diterpenoid biosynthesis, and stilbenoid, diarylheptanoid, and gingerol biosynthesis were found to be all enriched. Phenylpropanoid compounds play key roles in plant defense, ranging from constitutive or inducible physical and chemical barriers against pathogenic infections, to acting as signal molecules involved in local and systemic signaling for the induction of one or more defense genes [42]. The enriched phenylpropanoid biosynthesis suggests that secondary defense metabolites figure prominently in early resistance to BLB in rice. Other pathways, however, such as diterpenoid biosynthesis and photosynthesis, likely also participated in plant disease resistance. For example, Oscyp71Z2 governs BLB resistance by regulating the biosynthesis of diterpenoid phytoalexin [43]. Our results suggest that rice resistance to BLB is a rather complex trait, in that it depends on a well-coordinated and activated network of multiple defense pathways.
Many BLB resistance genes have been searched for and applied to rice breeding [44]. Although some Xoo resistance genes are now known, most are specific resistance genes; e.g., xa25, xa26, and Xa1 [30,45,46]. Therefore, given the diminished plant resistance to BLB caused by evolving Xoo populations, it is imperative we find new genes conferring resistance traits and combine them with known resistance genes to develop durable and sustainable resistant lines of rice. In this study, multiple NLR genes were found localized at the Chr.7_707158 locus, where several NBS-LRR genes are also clustered. Through the haplotype and expression analyses, an NBS-LRR gene cluster (LOC_Os07g02560 and LOC_Os07g02570) that confers broad-spectrum resistance in rice to both Xoo races was uncovered. But whether all these mutations in fact drove the loss of plant resistance to Xoo requires further careful investigation. Furthermore, plant genomes often will encode several hundred NLR proteins that are involved in defense responses, some of which occur in clusters at specific loci following gene duplication and amplification events. Previous reports indicate that genetically linked NLR genes may act together to recognize a pathogen's avirulent effectors, such as RPS4/RPS1, RGA4/RGA5, and Pikp-1/Pikp-2. In such pairs, one gene functions as a "sensor" that perceives pathogen effectors while the other is a "helper" required to activate immune signaling [47][48][49]. Similarly, LOC_Os07g02560 and LOC_Os07g02570 were co-localized in an LD block. Yet whether these two NBS-LRR genes in the R-cluster are subject to the same regulatory mechanism as RPS4/RPS1, RGA4/RGA5 or Pikp-1/Pikp-2 is unknown and merits further study.

Conclusions
We integrated the GWAS and transcriptome results of our study to provide some new, useful gene resources against bacterial blight in rice. Two candidate genes LOC_Os07g02560 and LOC_Os07g02570, were thus obtained. These findings should provide reliable targets for assessing candidate genes for use in the breeding of BLB resistance. More work remains to be done, however, to verify which additional genes underpin resistance to BLB in rice.

Plant materials and phenotypic evaluation
A total of 259 rice lines were used in this study. These varieties were collected from different country, including Senegal, China, Malaysia, Brazil, Colombia, and Mexico.
Of these, 146 lines were provided by the International Rice Research Institute (IRRI) while the other 113 had been preserved by the Rice Research Institute of Sichuan Agriculture University, China. Information of 259 rice lines can be found in Table S1. To evaluate BLB resistance, the seeds of all 259 rice varieties were sown in a greenhouse. Then transplant to an experimental field (at Sichuan Agriculture University) after 30-day-old, with 10 plants per row.
We used two representative Xoo strains P3 and P6 to artificially inoculate plants. The strains were cultivated separately on peptone dextrose agar (PDA) medium for 2 days at 30 °C; each Xoo race was suspending using sterile water at a concentration of 10 8 cells ml −1 as inoculum. At the rice tillering stage, 15 of the uppermost leaves of each rice variety were infected with the two Xoo races, using the leaf-clipping method [50]. Lesion lengths were measured at 14 days after inoculation, when lesions were easily visible. A BLB disease score was recorded for each line, as the lesion length divided by the leaf length. The average disease score of each variety was calculated based on 15 individual leafs. Data were processed with Microsoft Excel 2010. Statistical analysis of BLB scores among different rice varieties or sub-populations was done using ANOVA followed by Dunnett's multi-comparison test in SPSS v16.0 (IBM Corp., Armonk, USA).

DNA extraction and sequencing
Young leaves of 21-day-old seedlings of each rice variety were sampled to extract their genomic DNA. The cetyl trimethyl ammonium bromide (CTAB) method was used to extract genomic DNA [51]. The purity and concentration of DNA was determined, respectively, through a Nano-Photometer spectrophotometer (IMPLEN, CA, USA) and a Qubit DNA Assay Kit with a Qubit 2.0 Fluorometer (Life Technologies, CA, USA). The DNA samples of all 259 rice varieties were first fragmented by sonication to 350 bp fragments. These DNA fragments were sequencing after end-polished, A-tailed, and ligated to full-length adapters. Next, the raw sequences having a 150-bp read length were obtained. Among raw data, those reads that contained adapter sequence stretches of -Ns, or had low quality scores were deleted. The high quality paired-end reads were mapped to the Nipponbare rice genome (ftp:// ftp. ensem blgen omes. org/ pub/ plants/ relea se_ 36/ fasta/ oryza_ indica/ dna/) by Burrows-Wheeler Aligner software tool with the command "mem -t 4 -k 32 -M" [52]. After their alignment, genomic variants in GVCF format for each accession were identified by the Haplotype Caller module and GVCF model in, Genome Analysis Toolkit (GATK) software [53]. All the GVCF files were then merged together. A raw genotype file was then filtered by these parameters: depth for each individual ≥ 4; minor allele frequency (MAF) ≥ 0.01, genotype quality for each individual ≥ 4; and; a miss rate ≤ 0.2. In this way, a total of 2 888 332 SNPs were obtains, and further annotated using ANNOVAR software (v2013-05-20) [42]. These SNPs were divided into five groups according to their annotations: CDS SNPs, upstream SNPs (those positioned within 1 kb of the transcription start site), intergenic SNPs, downstream SNPs (located within 1 kb of the transcription stop site), and intron SNPs.

PCA, population structure, and LD analysis
The neighbor-joining (NJ) tree was built through the P-distance, by using the 2,888,332 SNPs in the Tree Best software (v1.9.2) and 1000 bootstrap replications [54]. The population structure of 259 rice lines was identified by the program ADMIXTURE (v1.23) [55], with a K-value ranging from 2 to 3. A principal component analysis (PCA) was then carried out using GCTA software [56]. To do this, the genetic relationship matrix was first calculated through the parameter "-make-grm"; the top three principal components were obtained by the parameter "-pca3". To identify the LD of this rice population, the squared Pearson correlation coefficient (r 2 ) between pairwise SNPs was calculated through the "Pop-LD-decay" software tool [57], whose program parameters were set to "-MaxDist 1000 kb-MAF 0.05 -Miss 0.1". The average r 2 value was calculated for pairwise markers in a 1-kb window, and these values averaged across the whole rice genome.

Estimation of breeding value
The breeding values were calculated by BLUP (best linear unbiased predictor), using the "lme4" package in the R computing platform (v. 3.2.2) [58], as follows: where, the Y, μ, Line, and Loc are respectively the phenotype, intercept, variety effects, and environmental effects. The Rep is the number of replications, and ε indicated the random effects; the Line × Loc term denotes the interaction between variety and environment, while Rep × Loc is the interaction between replication and environment.

GWAS analysis
Only those SNPs with a sequencing depth ≥ 4, missing rate < 0.2 and MAF ≥ 0.01 were used in the GWAS, with the latter analyzed using the EMMAX (beta version) software package [59]. The matrix of pairwise genetic distances, which calculated using EMMAX, formed the variance-covariance matrix of random effects.

Transcriptome analysis
To further identify candidate resistance genes positioned around the GWAS-identified loci, the resistant rice variety, NSIC RC154, and the susceptible variety CT 9737-6-1-3P-M (both confirmed) were grown and inoculated with the more virulent Xoo race P6 in a greenhouse by the leaf-clipping method [50]. From each rice variety, leaves sample were obtained at 12, 24, 48, and 72 hpi, respectively, and each treatment has three replicates. Control samples of non-inoculated, fresh leaves of seedlings at 12 h were also collected. Place all leaf samples in liquid nitrogen and stored at -80℃ for their RNA isolation. Total RNA was isolated with the Plant Total RNA Isolation Kit (Sangon Biotech, Shanghai, China), according to the manufacturer's instructions. We used the NEBNext Ultra ™ RNA Library Prep Kit for Illumina (NEB, USA) for RNA-Seq libraries construction. The Illumina Hi-Seq platform was used sequencing, and 125-bp paired-end reads were generated. Among raw data, the reads having a low quality score and those containing adaptor sequences and stretches of -Ns were removed. An index of the Nipponbare rice reference genome was built using Bowtie v2.2.3, to which the above paired-end reads were aligned using TopHat v2.0.12 [60][61][62]. To count the number of reads mapped to each gene, HTSeq v0.6.1 software was used [63]. The expression value of each gene was present based on FPKM (fragments per kilobase of transcript sequence per million) that calculated using Cuffdiff software (v2.2.1).
The differential expression analysis of two treatments (each treatment contains three biological replicates) was carried out in R, using the "DESeq" package (v1.18.0) [64]. Differential expression levels of gene in the two treatments sample comparisons were determined based on the negative binomial distribution. Benjamini and Hochberg's approach was used to adjust P-values for controlling the false discovery rate (FDR). Genes with the |log twofold change |> 1 and adjusted P-values of < 0.05 were designated as differentially expressed [65].

Haplotype analyses
Haplotype blocks were distinguished by using the confidence interval method [66], and Haploview software [67]. For this, the Hardy-Weinberg P-value cut-off was set to 0.001, with a MAF of 0.05.

qRT-PCR
Relative expression levels of seven candidate genes were investigated in rice plants by qRT-PCR. Total RNA extraction and reverse transcription were performed as described previously. The PCR reactions were using 20µL volume, which contained cDNA template 3 µL and forward and reverse gene-specific primers 0.8 µL, respectively. Each PCR set four replicated times. The ubiquitin (UBQ) gene was used as an internal control for data normalization. The 2 −∆∆Ct method was used calculating gene expression levels. The primers used in this experiment are provided in Table S15.