Genome-wide identification of nucleotide-binding domain leucine-rich repeat (NLR) genes and their association with green peach aphid (Myzus persicae) resistance in peach

Resistance genes (R genes) are a class of genes that confer immunity to a wide range of diseases and pests. In planta, NLR genes are essential components of the innate immune system. Genes belonging to the NLR family have been found in a number of plant species, but little is known about them in peach. Here, 286 NLR genes were identified in the peach genome by using their homologous genes in Arabidopsis thaliana as queries. Each of these 286 NLR genes contained at least one NBS domain and one LRR domain. Phylogenetic and N-terminal domain analysis showed that these NLRs could be separated into four subfamilies (I-IV), and their promoters contained many cis-elements responsive to defense and phytohormones. In addition, transcriptome analysis showed that 22 NLR genes were up-regulated after infestation by green peach aphid (GPA) and showed different expression patterns. This study clarified the NLR gene family and its potential functions in the aphid resistance process. The candidate NLR genes might be useful in elucidating the mechanism of aphid resistance in peach. Supplementary Information The online version contains supplementary material available at 10.1186/s12870-023-04474-7.

Introduction
The plant innate immune system ensures normal growth during pathogen infection [1]. Plants have evolved cell surface and intracellular receptors that can recognize pathogen-derived chemicals or molecules. There are two layers of the immune system, called pattern-triggered immunity (PTI) and effector-triggered immunity (ETI) [1]. PTI is induced when surface pattern-recognition receptors (PRRs) bind pathogen-derived molecules at the plasma membrane. ETI is generally induced inside cells, when pathogen virulence factors (known as effectors) are recognized by NLR receptors, thereby inducing immune responses [2][3][4].
NLR genes are the most important R genes in plants [5]. The proteins encoded by these genes are highly similar and usually have three conserved domains: a central NBS domain, a C-terminal LRR domain with a variable number of repeats, and a variable N-terminal domain [6]. In angiosperms, NLR genes were mainly characterized into two categories: TIR-NBS-LRR (TNL) and non-TIR-NBS-LRR (nTNL), which is also known as CC-NBS-LRR (CNL) [6]. Recently, some other types of domains were identified, such as the resistance to powdery mildew (RPW8) domain, which consists of a transmembrane coiled-coil (TM-CC) domain [7]. It was reported that NLR genes encoding the TM-CC domain were not directly involved in the recognition of specific pathogens, but participated in the downstream signaling pathway of the disease resistance process. For example, NRG1 (DQ054580.1) in tobacco and ADR1 (AT1G33560.1) in Arabidopsis thaliana can both regulate the accumulation of the defense hormone salicylic acid during the immune response, and ADR1 can also act as an "auxiliary NBS-LRR" to transduce signals from specific NBS-LRR receptors during ETI [6,8]. P-kinase, Hydrolase and DUF676 are new domains found at the N-terminal of R proteins, which were identified in the genomes of Physcomitrella patens, Marchantia polymorpha and Sphagnum fallax, respectively, but the functions of these bryophyte-specific NLR subclasses have not yet been explored [9][10][11].
The NB-ARC (NBS) domain belongs to the signal transduction ATPases with numerous domains (STAND) superfamily [12] and functions in binding and hydrolyzing ATP [13,14]. In Arabidopsis thaliana, the NBS domain usually contains 8 conserved motifs [15]: P-loop, RNBS-A, kinase 2, RNBS-B, RNBS-C, GLPL, RNBS-D and MHDV. These motifs are all conserved in the NBS domain of other species [16]. Kinase 2 may be an important regulator of ATP hydrolysis, whereas P-loop, GLPL and MHDV may be involved in the regulation of nucleotide binding. Mutation of the aspartate in the MHDV region of tomato I-2 resulted in continuous activation [17]. Mutations in the P-loop region of RPM1 and other NLR genes in Arabidopsis thaliana inactivated the proteins [18]. The leucine-rich repeat (LRR) domain is more polymorphic than the NBS domain; it is composed of leucine-rich repeats of 20-30 residues and forms a β-strand/α-helix structure [19]. Therefore, LRR domains are often involved in protein-protein interactions. Some pathogens, including Listeria and Streptococcus, can enter host cells by encoding proteins with LRR domains [20].
Plants rely on NLR proteins to respond to invasive pathogens and activate the immune response, thereby obtaining resistance to bacteria, viruses, nematodes and pests [21]. In previous studies, many NLR proteins conferring pest resistance have been identified in different plants. For example, Rpi-blb2 confers broad-spectrum resistance to pathogen isolates in potato [22]. Mi-1.2, which is similar to Rpi-blb2, confers specific resistance to root-knot nematodes and aphids in tomato [23]. In gramineous plants, the resistance of wheat to aphids was dominated by Adnr1 [24]. The RMES1 locus, which contains five predicted NLR genes in the sorghum genome, was shown to confer resistance to Melanaphis sacchari [25]. In addition, the Dp-fl locus in Malus pumila, which confers resistance to Dysaphis plantaginea, contains 19 genes acting as R genes, 2 of which are NLRs [26].
Peach is the fourth largest deciduous fruit crop in the world and has valuable nutrition [27]. Green peach aphid (Myzus persicae, GPA) is the most harmful pest during peach production. It pierces and sucks new shoots and leaves, resulting in leaf curling and limited growth. It can also secrete honeydew and spread viruses between species. In the last decades, several genetic loci conferring resistance to aphids have been identified and mapped on the peach genome. Most of the genes belong to the resistance genes encoding NLR proteins [28]. In peach, a strong candidate gene responsible for the dominant GPA resistance in the Rm3 locus was identified [29]. However, the regulatory mechanism and the other NLR genes responding to GPA infestation remain unknown.
In this study, we analyzed the NLR gene family in peach. A total of 286 NLR genes were identified, and their chromosome location, phylogenetic relationship, gene structure, conserved domains and promoter cis-elements were analyzed. Transcriptome analysis showed that the expression of 22 identified NLR genes was significantly up-regulated after GPA infestation. The results provide a basis for further study on the function of NLR genes in aphid resistance.

Identification and distribution of NLR gene family in peach
The NLR genes in the peach genome were identified according to the NBS and LRR domains. First, the 195 NLR genes in Arabidopsis thaliana were used as queries to find candidate genes in peach using the NCBI BLASTP toolkit. Then, their protein domains were further analyzed, especially the number of NBS and LRR domains. Finally, 286 NLR genes with at least one NBS domain and one LRR domain were selected in this study. These NLR genes are unevenly distributed on the peach chromosomes, with most present on chr.1 (14.3%), chr.2 (25.52%) and chr.8 (27.27%) (Fig. 1).
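The domain-based selection step can be illustrated with a short Python sketch. It is not the pipeline used in this study: the input table, the gene identifiers and the rule of matching Pfam domain names (`NB-ARC` for the NBS domain, names beginning with `LRR` for LRR domains) are assumptions made only for the example.

```python
from collections import defaultdict

def select_nlr_candidates(domain_hits):
    """Keep genes with at least one NBS (NB-ARC) and at least one LRR domain.

    domain_hits: iterable of (gene_id, domain_name) pairs, for example
    parsed from a Pfam domain table. The name-matching rule is an assumption.
    """
    domains = defaultdict(list)
    for gene_id, domain_name in domain_hits:
        domains[gene_id].append(domain_name)

    selected = []
    for gene_id, names in domains.items():
        has_nbs = any(name == "NB-ARC" for name in names)
        has_lrr = any(name.startswith("LRR") for name in names)
        if has_nbs and has_lrr:
            selected.append(gene_id)
    return sorted(selected)

# Hypothetical domain hits for three made-up BLASTP candidates
hits = [
    ("geneA", "TIR"), ("geneA", "NB-ARC"), ("geneA", "LRR_8"),
    ("geneB", "NB-ARC"),                      # no LRR domain, rejected
    ("geneC", "LRR_1"), ("geneC", "NB-ARC"), ("geneC", "RPW8"),
]
candidates = select_nlr_candidates(hits)
print(candidates)
```

In this toy input, only the candidates carrying both domain types are retained; the N-terminal domain (TIR, RPW8 or none) is ignored at this step and only matters for the later subfamily classification.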

Phylogenetic relationships, domains, motifs and number of exons of peach NLR gene family
To uncover the evolutionary relationship of the peach NLR genes, a neighbor-joining (NJ) phylogenetic tree was constructed using protein sequences of peach NLRs and 20 reported NLR genes in Arabidopsis thaliana. The results showed that the peach NLR genes could be divided into four subfamilies (I-IV), which included 153, 104, 11 and 18 peach NLR genes, respectively (Fig. 2).
According to the differences in the N-terminal domain, subfamilies I-III were mainly characterized as CNL, TNL and RNL, respectively, although some NLRs without an N-terminal domain were also clustered in subfamilies I and II (Fig. S1). By further checking the whole sequences of these NLRs, we found that the conserved N-terminal domain was not completely deleted, leaving incomplete CC or TIR domains (Fig. S2). Subfamily IV contained NLRs without an N-terminal domain. Phylogenetic analysis suggested that CNL, TNL and RNL were all derived from subfamily IV, which was consistent with the previous study [15].
Gene structure analysis showed that peach NLR genes contained multiple exons and UTRs, and that there were significant differences among subfamilies. The average numbers of exons and UTRs of these NLR genes were 4.69 and 4.47, respectively. The numbers in subfamily I (3.31, 5.80) were mostly lower than those in subfamily II (6.16, 4.19) (Table S2), while multiple exons were identified in subfamilies II and III. In contrast, the gene coding sequences of subfamilies I and IV were simpler than the others. In addition, the smallest gene (Prupe.4G236500) has 3 exons and no UTRs, much simpler than the longest (Prupe.2G118000), which has 4 exons and 3 UTRs (Fig. S3).

Gene duplication and collinearity analysis
To further clarify the expansion and evolution of peach NLR genes, gene duplication events were investigated. In total, 9 pairs of homologous genes in the peach genome (Prupe.1G389500/Prupe.7G138500, Prupe.1G541300/Prupe.8G077100, Prupe.2G055200/Prupe.2G066600, Prupe.2G057100/Prupe.2G068000, Prupe.2G040500/Prupe.2G053700, Prupe.2G043000/Prupe.2G504200, Prupe.2G045200/Prupe.2G055200, Prupe.2G055200/Prupe.2G068900, Prupe.2G057100/Prupe.2G068000) were identified, indicating that duplication was a major mode of gene expansion (Fig. 3A). In addition, we analyzed the collinearity of NLR genes among peach, Arabidopsis thaliana and Prunus armeniaca. A total of 6 pairs of NLR genes were identified between Arabidopsis thaliana and peach, and 56 pairs were identified between Prunus armeniaca and peach (Fig. 3B). This result showed that NLR genes had relatively low homology between peach and Arabidopsis thaliana, but high homology between Prunus armeniaca and peach. NLR copy number varies greatly across species [30,31]. Our results show that species with distant evolutionary relationships have much lower homology in NLR genes than plants of the same genus. Positive selection has been found in NLR genes, which has contributed to making the NLR family one of the most variable gene families in plant genomes [32,33].
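As a simple illustration of how the listed gene pairs can be summarized, the following Python sketch separates pairs located on the same chromosome from pairs spanning two chromosomes. It assumes that the chromosome number can be read from the gene ID pattern `Prupe.<chromosome>G<number>`; the function and variable names are hypothetical and the sketch is independent of the MCScanX analysis itself.

```python
def chromosome_of(gene_id):
    """Return the chromosome number encoded in an ID such as 'Prupe.2G055200'.

    Assumes the 'Prupe.<chromosome>G<number>' naming pattern.
    """
    return int(gene_id.split(".")[1].split("G")[0])

# Gene pairs exactly as listed in the text
pairs_text = (
    "Prupe.1G389500/Prupe.7G138500,Prupe.1G541300/Prupe.8G077100,"
    "Prupe.2G055200/Prupe.2G066600,Prupe.2G057100/Prupe.2G068000,"
    "Prupe.2G040500/Prupe.2G053700,Prupe.2G043000/Prupe.2G504200,"
    "Prupe.2G045200/Prupe.2G055200,Prupe.2G055200/Prupe.2G068900,"
    "Prupe.2G057100/Prupe.2G068000"
)
pairs = [tuple(p.split("/")) for p in pairs_text.split(",")]

# Collapse repeated entries, ignoring the order of genes within a pair
unique_pairs = sorted({tuple(sorted(p)) for p in pairs})

same_chromosome = [p for p in unique_pairs
                   if chromosome_of(p[0]) == chromosome_of(p[1])]
between_chromosomes = [p for p in unique_pairs
                       if chromosome_of(p[0]) != chromosome_of(p[1])]

print(len(pairs), len(unique_pairs),
      len(same_chromosome), len(between_chromosomes))
```

Applied to the nine entries listed above, the sketch finds eight distinct pairs (Prupe.2G057100/Prupe.2G068000 is listed twice), six of which lie on chromosome 2 and two of which span different chromosomes.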

Subcellular localization of peach NLRs
NLR proteins have been shown to localize to the nucleus during effector-induced activation in some plant species [34][35][36]. For example, in the presence of the cognate powdery mildew effector AvrA10, the CNL protein MLA10 was transferred to the nucleus and interacted with WRKY and MYB6 transcription factors to further activate the defense response in barley [37]. However, a number of recent studies have demonstrated that coordinated nucleo-cytoplasmic transport of plant NLRs is required for the full activation of the defense response, suggesting that a single NLR protein may activate distinct signaling pathways in the cytoplasm and nucleus [38]. Subcellular localization of peach NLR proteins was predicted using an online tool (https://www.genscript.com/wolf-psort.html). A total of 1289 results were predicted, including 20% in the cytoplasm, 17% in the plasma membrane, 15% in the chloroplast, and relatively few in other organelles (Fig. 4A). In this study, three different types of NLR genes (CNL: Prupe.2G274900, TNL: Prupe.6G152300, RNL: Prupe.7G138800), which had the closest phylogenetic relationship with the three types of Arabidopsis thaliana NLR genes, respectively, were cloned into the pCAMBIA1300 vector fused with a GFP reporter. The results showed that all three types of NLR were localized in both the nucleus and the cytoplasm (Fig. 4B), which was consistent with previous studies.

Promoter element analysis of peach NLR genes
The cis-elements in the promoter sequences of the 286 NLR genes were predicted using the PLANTCARE database (http://bioinformatics.psb.ugent.be/webtools/plantcare/html/). In total, 14 types of cis-elements were mainly enriched in these promoters, including 5 types of plant hormone response elements (ABA, GA, MeJA, IAA, SA), 3 types of stress response elements (defense and stress, low-temperature, wound-responsive element) and  S4). Among the total elements, plant hormone elements accounted for 35.3%, stress response elements accounted for 48.6%, and growth-related elements accounted for 16.1%. In addition, a heat map showing the number of cis-elements in each NLR gene was constructed, and the results showed that the most enriched cis-element type was the light-responsive element. Hormone-associated elements, such as MeJA, ABA and SA, were also greatly enriched in these promoters, which indicated that NLRs might participate in stress-triggered signaling pathways. However, no significant differences in the number and distribution of promoter elements were found between subfamilies (Fig. 5).
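The category shares reported above are simple proportions of element counts. A minimal Python sketch of such a tally is given below; the hormone and stress labels follow the text, whereas the counts, the function name and the handling of unlisted elements as "other" are assumptions for illustration only.

```python
from collections import Counter

# Element-to-category mapping taken from the hormone and stress lists in the text
CATEGORY = {
    "ABA": "hormone", "GA": "hormone", "MeJA": "hormone",
    "IAA": "hormone", "SA": "hormone",
    "defense and stress": "stress", "low-temperature": "stress",
    "wound-responsive": "stress",
}

def category_shares(element_counts):
    """Return the percentage of all predicted elements in each category.

    Elements missing from CATEGORY are pooled as 'other' (an assumption).
    """
    totals = Counter()
    for element, count in element_counts.items():
        totals[CATEGORY.get(element, "other")] += count
    grand_total = sum(totals.values())
    return {cat: round(100 * n / grand_total, 1) for cat, n in totals.items()}

# Made-up counts, not data from this study
counts = {"ABA": 30, "MeJA": 20, "low-temperature": 25, "light": 25}
shares = category_shares(counts)
print(shares)
```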

Response of NLR genes to aphids and their expression patterns in different tissues
Among the three dominant loci, a strong candidate gene responsible for the dominant GPA resistance in the Rm3 locus was identified [29]. However, little is known about the other NLR genes involved. In this study, to understand the expression patterns of peach NLR genes during aphid infestation, the published transcriptome data [29] were reanalyzed. Twenty-two NLRs were significantly up-regulated after aphid infestation. Among them, 8 genes (Prupe.1G217900, Prupe.1G389500, Prupe.1G545200, Prupe.3G016700, Prupe.5G256000, Prupe.6G243400, Prupe.7G138600, Prupe.8G027300) showed much higher expression levels than the others (Fig. 6A). Tissue-specific analysis of these 22 NLR genes showed that they were highly expressed in leaf, stem and root, but barely expressed in fruit. This result was consistent with their function in disease and insect resistance (Fig. 6B).
To clarify the role of these 22 NLR genes in the aphid resistance process, leaf samples were collected at 0 h, 3 h, 6 h, 12 h, 24 h and 48 h after aphid infestation. As a control, leaf samples without any treatment were collected at the same time points. The expression levels were determined by qRT-PCR. The results showed that most of the genes were highly expressed at 3 h (Prupe.1G217900, Prupe.2G060400, Prupe.2G283300, Prupe.4G224500, Prupe.5G019000, Prupe.7G065500, Prupe.7G138600, Prupe.7G139100, Prupe.8G023100, Prupe.8G023800) and then rapidly decreased to the normal level. Only a few genes showed lower expression (Prupe.2G283200, Prupe.3G016700, Prupe.8G027300). In leaves not infested by aphids, the expression of most genes did not change significantly with time; only 3 genes (Prupe.2G283200, Prupe.2G283300, Prupe.3G016700) were highly expressed (Fig. 6C). Overall, most of the genes showed higher expression levels at the early stage after aphid feeding and then declined to the normal level (Fig. 6C). Such an expression pattern might be an appropriate way for the plant immune system to ensure self-protection.

Discussion
The plant immune system plays an important role in protecting cells and tissues from pathogen infection through the PTI and/or ETI pathways [39,40]. Insects such as aphids and brown planthoppers can produce and release salivary proteins into host cells, which further activate the ETI system [21,41]. Over the past few decades, NLRs that confer resistance to various pathogens, including bacteria, fungi, viruses, nematodes and insects, have been isolated from plants [12]. Peach is cultivated worldwide because of its high nutritional and economic value. However, most peach cultivars are susceptible to aphids, especially GPA [27]. In addition, the NLR genes in the peach genome have not been systematically analyzed and classified. In this study, a total of 286 NLR genes were identified in the peach genome. Bioinformatic features of these NLR genes and their expression during aphid infestation were analyzed, which provides a foundation for further study of the mechanism of aphid resistance.
Analysis of the distribution of NLR genes showed that most of them were clustered within small regions of the chromosomes (Fig. 1). For example, dozens of NLR genes are concentrated in only two to three positions on Chr.1, Chr.2, Chr.3, Chr.6, Chr.7 and Chr.8. This distribution, in which most NLR genes exist in clusters and only a few exist at single-gene loci, is consistent with the previous analysis in Arabidopsis thaliana [15]. There are two mechanisms for the formation of NLR gene clusters. One is the formation of a gene cluster by multiple tandem duplications of ancestral genes. Such gene clusters, composed of closely related genes, are considered to be homogeneous clusters. The other is the formation of a gene cluster by genes with distant genetic relationships, or even belonging to different categories (TNL and nTNL, respectively), that are clustered in adjacent positions due to various mechanisms, such as translocation or ectopic duplication [42,43].
NLR genes have been reported in many plant species, such as Vitis vinifera (535), Oryza sativa (508), Glycine max (429), Solanum tuberosum (438), Populus (416) and Gossypium spp. (355) [44,45]. Phylogenetic analysis showed that the 286 identified NLR genes were divided into four subfamilies according to the differences in the N-terminal domain, which was consistent with previous reports (Fig. 2) [15]. According to the results of the phylogenetic tree and conserved domain analysis, the NLR genes comprised 93 TNLs, 134 CNLs, 11 RNLs and 48 NLs. Among them, NL is divided into NBS_TIR-LRR, NBS_CC-LRR and NBS-LRR (Table 1), which is caused by the deletion of the CC or TIR domain at the N-terminal [15]. Similar to soybean, TNL genes in peach are fewer than CNL genes, because the evolution rates of TNL and CNL are different [46].
The localization of plant NLR proteins might be associated with the localization of effectors [47]. In general, the activation of NLR proteins occurs in the nucleus. For example, in the presence of the powdery mildew effector AvrA10, the CNL protein MLA10 in barley needs to be transferred into the nucleus to interact with WRKY and MYB6 and further activate downstream defense signaling [34,37]. In addition, some plant NLR proteins were not located exclusively in the nucleus during the resistance response. For example, a CNL protein recognizing potato virus X (PVX) was located in both the nucleus and the cytoplasm [38,48]. In the present study, subcellular localization of three representative genes from subfamilies I-III showed that CNL, TNL and RNL could localize in both the nucleus and the cytoplasm (Fig. 4). Because pathogen recognition and resistance occur in the cytoplasm [45,49], this localization indicates their potential function in pathogen resistance.
Most plant immune responses are accompanied by the release of phytohormones [47]. Salicylic acid (SA) participates in the process of ETI, possibly through the activation of genes involved in cell death [50]. In addition, jasmonic acid (JA) is also involved in the plant immune system, and there is complex crosstalk between SA and JA [51]. Analysis of cis-elements in the promoters of peach NLR genes identified many elements responsive to plant hormones, such as SA, GA, ABA and MeJA (Fig. 5). JA is involved in balancing plant growth and disease resistance [52]. Therefore, peach NLR genes that respond to JA might also have such functions, which provides valuable resources for elucidating the balance between growth and resistance.
The formation of pest resistance is a complex and highly coordinated process. Plants have developed a variety of insect resistance mechanisms to decrease pest survival, growth, development and reproduction [53]. GPA is one of the most dominant aphids affecting peach growth. During GPA infestation, a large number of NLR genes were significantly activated (Fig. 6C). For example, Prupe.1G217900, Prupe.1G545200, Prupe.2G060400, Prupe.4G224500, Prupe.5G019000, Prupe.7G065500, Prupe.7G138600, Prupe.8G023100 and Prupe.8G023800 were highly expressed after 3 h of GPA infestation. Prupe.2G022500, Prupe.2G283300, Prupe.5G025600 and Prupe.7G139100 were highly expressed after 6 or 12 h of GPA infestation. Tissue-specific expression analysis showed that peach NLR genes were mainly expressed in root, leaf and stem, indicating their roles in disease and insect resistance (Fig. 6B). The differentially expressed NLR genes identified during GPA infestation might be useful in elucidating the mechanism of aphid resistance in peach.

Identification of putative peach NLR genes
The protein sequences of NLR genes in Arabidopsis thaliana were obtained from NCBI (National Center for Biotechnology Information). Using the NLR genes in Arabidopsis thaliana as queries, the homologous NLR genes in peach were identified using the BLASTP tools in NCBI, and the NBS and LRR domains were checked manually to obtain the final set of peach NLR genes. Structural domains were analyzed using Pfam (http://pfam-legacy.xfam.org/) [54]. Physicochemical properties were analyzed and characterized using TBtools software [55].

Phylogenetic relationships, gene structure, motif and collinearity analysis
The phylogenetic tree of peach NLRs was constructed using MEGA11 software and viewed using the Evolview website (http://www.evolgenius.info/evolview/) [56]. Chromosome information of peach NLR genes was extracted from the peach genome annotation file and converted into a readable BED file by GSDS2.0 (Gene Structure Display Server 2.0, http://gsds.gao-lab.org/) [57]. Gene structure was further viewed using GSDS2.0. MEME 5.4.1 (https://meme-suite.org/meme/tools/meme) was used to predict and analyze the conserved protein motifs [58]. The base sequence value was set to 15 and other parameters were left at their default values. The structure of the conserved protein motifs was plotted using TBtools [55]. Genome-wide collinearity between peach and Arabidopsis thaliana was analyzed using MCScanX software and mapped using TBtools [59].

Subcellular localization analysis
Three peach NLR genes representing the main types (TNL, CNL and RNL) were selected according to the phylogenetic tree. Their CDS sequences were obtained from NCBI and cloned into the pCAMBIA1300 vector fused with GFP under the CaMV 35S promoter (35S:Prupe.2G274900-GFP, 35S:Prupe.6G152300-GFP and 35S:Prupe.7G138800-GFP). Then, the recombinant constructs were transferred into Agrobacterium tumefaciens GV3101 for transient overexpression in tobacco leaves using previously described methods [60]. The GFP reporter was viewed using a confocal laser microscope (Zeiss LSM880, Germany). Primers used in this section are listed in Table S3.

Promoter cis-element analysis
The promoter sequences of the 286 peach NLR genes (2 kb upstream of the 5'UTR) were downloaded from the Genome Database for Rosaceae (https://www.rosaceae.org/) and submitted to the PLANTCARE database for promoter element prediction [61]. Their distribution and heat map were plotted using TBtools.
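For readers who wish to derive comparable promoter regions from an annotation file, the following Python sketch computes the coordinates of the 2 kb upstream region in a strand-aware manner. It is an illustration only; the function name, the 1-based inclusive coordinate convention and the example coordinates are assumptions, not part of the procedure described above.

```python
def promoter_interval(start, end, strand, length=2000, chrom_length=None):
    """Return the (start, end) of the region upstream of a transcript.

    start, end: 1-based inclusive coordinates of the transcript, including
    the 5'UTR, with start <= end. For genes on the minus strand the upstream
    region lies after `end`, and the extracted sequence would still need to
    be reverse-complemented.
    """
    if strand == "+":
        p_start = max(1, start - length)   # clip at the chromosome start
        p_end = start - 1
    elif strand == "-":
        p_start = end + 1
        p_end = end + length
        if chrom_length is not None:       # clip at the chromosome end
            p_end = min(p_end, chrom_length)
    else:
        raise ValueError("strand must be '+' or '-'")
    return p_start, p_end

# Hypothetical coordinates
plus_example = promoter_interval(5001, 8000, "+")
minus_example = promoter_interval(1000, 4000, "-")
clipped_example = promoter_interval(501, 900, "+")
print(plus_example, minus_example, clipped_example)
```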

RNA-seq analysis
The comparative transcriptome data were generated in a previous study, and the clean reads under SRP144490 were downloaded from NCBI [29]. Clean reads were mapped to the reference peach genome (release version 2.0_a2.1) using TopHat. FPKM (fragments per kilobase of exon per million reads mapped) values and differentially expressed genes were calculated using Cufflinks [62].
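The FPKM unit defined above corresponds to a simple normalization by transcript length and sequencing depth. The Python sketch below shows only this basic definition with made-up numbers; the abundance estimation performed by the software itself is more elaborate, and the function and argument names are hypothetical.

```python
def fpkm(fragments, exon_length_bp, total_mapped_fragments):
    """Fragments per kilobase of exon per million mapped fragments.

    FPKM = fragments / (exon_length_bp / 1e3) / (total_mapped_fragments / 1e6)
    """
    if exon_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragments * 1e9 / (exon_length_bp * total_mapped_fragments)

# Hypothetical gene: 500 fragments, 2000 bp of exons, 20 million mapped fragments
example_fpkm = fpkm(500, 2000, 20_000_000)
print(example_fpkm)
```

With these illustrative values, 500 fragments over 2 kb of exon in a library of 20 million mapped fragments give 250 fragments per kilobase, or 12.5 FPKM.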

RNA extraction and gene expression analysis
The aphid-resistant cultivar 'Zao You Tao' was used for gene expression analysis. To mimic aphid infestation, 10 aphids were put on new young leaves, which were bagged with 100-mesh insect screens to prevent the aphids from escaping. Leaf samples were collected at 0 h, 3 h, 6 h, 12 h, 24 h and 48 h after infestation, immediately frozen in liquid nitrogen and then stored at -80 °C for analysis. Total RNA was extracted from the samples using an RNA extraction kit (Tiangen, China), and first-strand cDNA was synthesized using the PrimeScript first-strand cDNA synthesis kit (Takara, Dalian, China). Real-time quantitative polymerase chain reaction (qRT-PCR) was performed on an ABI7500 system using SYBR Premix ExTaq (Takara, China) with the following procedure: 95 °C for 5 min, followed by 45 cycles of 95 °C for 10 s, 58 °C for 10 s and 72 °C for 20 s. The relative expression level was calculated by the 2^−ΔΔCT method [63]. Primers for qRT-PCR are listed in Table S3.
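The 2^−ΔΔCT calculation can be written out as a short Python sketch. The CT values below are invented, and the reference (internal control) gene is left generic because it is not named in this section; the function name is hypothetical.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Relative expression by the 2^-ddCT method.

    dCT  = CT(target) - CT(reference), computed separately per sample
    ddCT = dCT(treated) - dCT(control)
    """
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    delta_delta = delta_treated - delta_control
    return 2 ** (-delta_delta)

# Hypothetical CT values: infested leaf vs. untreated leaf at the same time point
fold_change = relative_expression(22.0, 18.0, 25.0, 18.0)
print(fold_change)
```

In this example the target gene reaches threshold three cycles earlier in the infested sample after normalization (ΔΔCT = −3), which corresponds to an eightfold increase in relative expression.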

Fig. 1 Distribution of the peach NLR gene family on the peach genome. Chromosomes 1-8 are indicated by bars of gene density, and peach NLR genes are marked in red font

Fig. 2 Phylogenetic analysis of the peach NLR family. Subfamily I indicates CNL, subfamily II indicates TNL and subfamily III indicates RNL. Subfamily IV contained NLRs without an N-terminal domain. The red circle represents Arabidopsis thaliana

Fig. 3 Collinearity analysis of NLR genes. Chr denotes chromosome, and the red lines represent the connections between collinear genes. A Collinearity analysis of NLR genes within peach. B Collinearity of NLR genes among Prunus persica, Arabidopsis thaliana and Prunus armeniaca

Fig. 4 Subcellular localization of peach NLR genes. A Prediction of subcellular localization in the WoLF PSORT and CELLO databases. B Subcellular localization of three typical peach NLR proteins. The photographs were taken under bright light, in the dark field for the GFP-derived green fluorescence, and merged, respectively. Scale bars, 20 μm

Fig. 6 Gene expression analysis of peach NLR genes. A Heat map of differentially expressed peach NLR genes during aphid infestation. Different colors represent the relative expression levels of genes. B Relative expression of peach NLR genes in different tissues. C Relative expression of peach NLR genes at different stages after aphid infestation

Table 1 The number of different types of NLR genes in peach