Genome-wide identification of the small heat shock protein (HSP20) gene family in grape and expression profiles during berry development


Background
Studies have shown that HSP20 genes play an important role in regulating plant growth, development and stress response. However, the grape HSP20 gene family has not been well studied.

Results
A total of 51 VvHSP20 genes were confirmed in the grape genome. They were divided into eleven subfamilies (CI, CII, CIII, CV, CVI, CVII, MI, MII, ER, CP and PX/Po) based on the phylogenetic tree and subcellular localization. Further structural analysis showed that VvHSP20s in the same phylogenetic group shared the same motifs and that gene structure was relatively conserved during evolution. In addition, the majority of the VvHSP20 genes were located at the proximal or distal ends of the chromosomes, and four groups of VvHSP20 genes were identified as tandem duplicates, which suggests that tandem duplication played a predominant role in the expansion of the VvHSP20 family. To determine the functions of VvHSP20s during grape berry development, the expression profiles of VvHSP20 genes were analyzed after H2O2 treatment.
VvHSP20 genes were indeed involved in grape berry development, and the differences in transcription levels among VvHSP20s may be the result of functional differentiation of the genes during evolution.

Conclusions
The results provide valuable information on the evolutionary relationship of the VvHSP20 family and on the functional properties of the VvHSP20 genes for grape berry development.

Background
As one of the most important cultivated fruit crops in the world, grape has high economic value.
'Kyoho' is a tetraploid interspecific hybrid and mid-late ripening grape cultivar derived from the cross Vitis vinifera x Vitis labrusca, and it is widely cultivated in China. Our previous studies on 'Kyoho' have shown that hydrogen peroxide (H2O2) treatment promotes early ripening of 'Kyoho' grape, making it ripen 20 days earlier than the control [1,2]. Other studies in tomato [3] and pear [4] have also demonstrated that H2O2 is associated with fruit development. H2O2 is an early component of the heat signaling pathway and is a necessary condition for the activation of HSP20 synthesis [5]. In addition, the response of HSP20s to H2O2 has been revealed in tomato and rice, in which H2O2 induced the expression of mitochondrial HSP22 and chloroplast HSP26, respectively [6,7]. It has been reported that the heat shock protein HSP21 can protect photosystem II (PSII) from oxidative stress, promote color change during fruit ripening, and play a key role in the conversion of chloroplasts to chromoplasts during fruit ripening [8].
The expression of heat shock proteins (HSPs) is activated or increased under high-temperature stress. According to molecular weight and sequence homology, HSPs can be divided into five families: HSP100, HSP90, HSP70, HSP60 and HSP20 [9,10]. Among them, HSP20 proteins have a molecular weight between 15 and 42 kDa and are therefore also called small HSPs. In some plant tissues, HSP20s make up the largest proportion of HSPs [9]. HSP20s possess a typical conserved domain, known as the α-crystallin domain (ACD), which comprises a conserved sequence of 80-100 amino acids with a compact β-strand structure and contains two conserved regions (CRs): CR I with β2, β3, β4 and β5, and CR II with β7, β8 and β9, together with a β6 loop [11]. HSP20s can prevent and/or reduce the aggregation of other proteins damaged by high temperature or other toxic stress, and assist in the refolding or degradation of stress-damaged proteins [12,13]. Thus, HSP20s function as molecular chaperones and are an important part of the cellular chaperone system.
In plants, HSP20 genes are involved in many developmental processes and in responses to abiotic stresses [14,15]. Under heat stress, HSP20s can prevent the aggregation and irreversible denaturation of heat-denatured proteins, ensuring that other proteins can function normally at high temperature and providing a strong basis for improving the heat resistance of plant organs. HSP20s have been shown to be located in mitochondria, cytoplasm and endoplasmic reticulum [16].
The number of HSP20 genes in plants is about four times greater than that in animals [17]. Members of the HSP20 gene family have been investigated in many plants, such as Arabidopsis, rice, soybean, watermelon, pepper and tomato. There are 19 HSP20 genes in Arabidopsis [11], 39 in rice [18], 51 in soybean [19], 44 in watermelon [20], 35 in pepper [21] and 42 in tomato [22]. To date, the HSP20 gene family members of grape have not been identified. In addition, HSP20 genes were found among the differentially expressed genes between the H2O2 treatment and the control of 'Kyoho' (unpublished data). Therefore, this study aims to elucidate the composition, gene structure, evolution and expression of the grape HSP20 gene family, in an attempt to explain its structural and functional characteristics and to lay a solid basis for further utilization of plant heat shock proteins.

Results
Genome-wide identification of the VvHSP20 gene family in grape
A total of 61 candidate VvHSP20 genes were obtained by Hidden Markov Model (HMM) analysis and submitted to the CDD, Pfam and SMART databases to confirm the ACD domain. Sequences without the typical ACD domain or with a molecular weight outside the 15-42 kDa range were discarded. Finally, 51 sequences were retained and confirmed as grape HSP20 genes. Detailed information on the physicochemical properties of the VvHSP20s is listed in Table 1. The lengths of the VvHSP20 proteins ranged from 108 (VvHSP20-31 and VvHSP20-36) to 365 amino acids (VvHSP20-44); the molecular weights of the VvHSP20s were between 12.64 kDa (VvHSP20-31) and 40.59 kDa (VvHSP20-44). The predicted pI values of the VvHSP20s ranged from 4.68 (VvHSP20-44) to 9.48 (VvHSP20-21).

Phylogenetic analysis of VvHSP20 genes
An unrooted Neighbor-Joining (NJ) phylogenetic tree was constructed from a complete alignment of the amino acid sequences of HSP20 proteins from grape, Arabidopsis and tomato (Fig. 1). In total, 19 sequences from Arabidopsis, 26 from tomato and 51 from grape were included in the tree. According to the phylogenetic and subcellular localization analyses, the grape HSP20 proteins were divided into eleven subfamilies (CI, CII, CIII, CV, CVI, CVII, MI, MII, ER, CP and PX/Po) (Fig. 1, Table 1). The clustering of the subfamilies is largely consistent with subcellular localization, i.e., proteins in the same cluster are located at the same subcellular site. Specifically, the six C subfamilies (CI, CII, CIII, CV, CVI and CVII), the MI and MII subfamilies, and the CP, ER and PX/Po subfamilies localize to the cytoplasm/nucleus, mitochondria, chloroplast, endoplasmic reticulum and peroxisome, respectively. The 96 HSP20s were classified into 15 distinct subfamilies, except for four unclassified VvHSP20s (VvHSP20-16, VvHSP20-17, VvHSP20-41 and VvHSP20-44), whose subcellular localization could not be predicted by the online tool Protcomp. Most of the HSP20s, including 35 of the 47 classified VvHSP20s, fell into CI-CVII, which suggests that the cytosol may be the main site of function of plant HSP20s.

Characterization of the amino acid sequences and gene structure of VvHSP20s
As shown in Fig. 2a, the 51 VvHSP20s were divided into 11 subgroups, except for the unclassified HSP20s (VvHSP20-16, VvHSP20-17, VvHSP20-41 and VvHSP20-44). Ten conserved motifs of the VvHSP20 proteins were identified using the MEME website. Details of the 10 motifs are outlined in Table 2. The lengths of these conserved motifs ranged from 6 to 60 amino acids (Fig. 2b, Table 2). The ACD consists of two conserved regions, CRI with β2, β3 and β4, and CRII with β7, β8 and β9, separated by a hydrophilic β6 loop of variable length (Fig. 3). VvHSP20-2, 3, 42, 50 and 51 lacked the β6 loop, and VvHSP20-39 lacked β4. These differences in the composition of the ACD point to functional diversity among the VvHSP20s. VvHSP20 proteins in the same phylogenetic group had the same motifs, indicating that they were highly conserved.
Genes of the same subgroup had the same intron phase, which indicated that gene structure was relatively conserved during evolution.

Analysis of cis-elements in VvHSP20 gene promoters
To understand the possible role of the cis-regulatory elements of VvHSP20 genes, the promoter sequences (2000 bp upstream of the translation start site) of the 51 VvHSP20 genes were submitted to PlantCARE to detect cis-elements. Nine response elements related to abiotic stress and hormones, namely MeJA responsiveness, salicylic acid responsiveness, light responsiveness, gibberellin responsiveness, auxin responsiveness, abscisic acid responsiveness, defense and stress responsiveness, low-temperature responsiveness and HSE1, were identified and are displayed in Fig. 5. In addition, most of the VvHSP20 genes possessed W boxes, MYB binding sites and CCAAT-boxes.
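The promoter window described above (2000 bp upstream of the translation start site) depends on gene orientation. The following is a minimal sketch, not the authors' pipeline, of how such a window could be cut from a chromosome sequence; the function names, the 1-based coordinate convention and the toy sequences are assumptions made for illustration.

```python
# Illustrative sketch only: helper names and coordinates are hypothetical.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def upstream_region(chrom_seq: str, start_codon_pos: int, strand: str,
                    length: int = 2000) -> str:
    """Return up to `length` bp upstream of the translation start site.

    `start_codon_pos` is the 1-based forward-strand coordinate of the first
    base of the start codon (the A of ATG as read on the coding strand).
    The result is always written 5'->3' on the coding strand.
    """
    if strand == "+":
        end = start_codon_pos - 1              # 0-based index of the A (exclusive end)
        begin = max(0, end - length)           # clip at the chromosome start
        return chrom_seq[begin:end]
    if strand == "-":
        begin = start_codon_pos                # first forward base beyond the A
        end = min(len(chrom_seq), begin + length)  # clip at the chromosome end
        return reverse_complement(chrom_seq[begin:end])
    raise ValueError("strand must be '+' or '-'")


if __name__ == "__main__":
    # Toy plus-strand gene: ATG starts at position 5.
    print(upstream_region("GGGGATGCCC", 5, "+"))
    # Toy minus-strand gene: CAT on the forward strand, A of ATG at position 6.
    print(upstream_region("CCCCATTTGA", 6, "-"))
```

Genes close to a chromosome end simply yield a shorter window in this sketch; whether such cases occurred in the grape data is not stated in the text.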

Expression patterns of VvHSP20s under H2O2 treatment
Gene expression is closely related to gene function. To determine the functions of VvHSP20s in grape, a heatmap of 50 VvHSP20 genes was constructed using FPKM values from RNA-seq data of control and H2O2-treated berries of 'Kyoho' (Fig. 6; the sampling periods are given in Materials and methods and Table 3). HSP20-35 is not shown because its expression level was extremely low and was not detected by RNA-seq during fruit development.
Most VvHSP20s were down-regulated after the treatment, especially at the fourth period. However, the opposite trend was observed for a few genes, including HSP20-14, HSP20-21 and HSP20-32.
These results indicated that most of the VvHSP20 genes responded to the H2O2 treatment, and that the response mechanisms of different VvHSP20 genes to H2O2 differed.
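One common way to express the up- or down-regulation described above is a log2 fold change between treated and control FPKM values. The sketch below is illustrative only: the pseudocount of 1, the threshold of one log2 unit and the input values are assumptions, not settings or data reported in this study.

```python
import math


def log2_fold_change(fpkm_treated: float, fpkm_control: float,
                     pseudocount: float = 1.0) -> float:
    """log2 ratio of treated to control FPKM.

    The pseudocount (an assumption here) avoids division by zero for
    genes that are not detected in one of the two conditions.
    """
    return math.log2((fpkm_treated + pseudocount) / (fpkm_control + pseudocount))


def classify(fpkm_treated: float, fpkm_control: float,
             threshold: float = 1.0) -> str:
    """Label a gene 'up', 'down' or 'unchanged' using a hypothetical cutoff."""
    lfc = log2_fold_change(fpkm_treated, fpkm_control)
    if lfc >= threshold:
        return "up"
    if lfc <= -threshold:
        return "down"
    return "unchanged"


if __name__ == "__main__":
    # Made-up FPKM pairs (treated, control) for three hypothetical genes.
    for name, treated, control in [("geneA", 0.0, 7.0),
                                   ("geneB", 3.0, 1.0),
                                   ("geneC", 5.0, 5.0)]:
        print(name, round(log2_fold_change(treated, control), 2),
              classify(treated, control))
```

A statistical test across replicates would still be needed to call a gene differentially expressed; the cutoff here only summarizes direction and magnitude.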
Based on the statistical significance of the gene expression levels from RNA-seq and on the phylogenetic clusters to which the genes belong, fourteen differentially expressed VvHSP20 genes were selected for further validation by qRT-PCR in the control and the H2O2 treatment (Fig. 7).
Consistent with the RNA-seq data, the expression level of most genes decreased after the treatment.
Except for HSP20-33, the remaining 13 genes were strongly down-regulated at the fourth period. Notably, VvHSP20-18 and VvHSP20-26 were hardly expressed after treatment. Similar expression patterns were found within the tandem duplicated gene group (VvHSP20-26 and VvHSP20-29), indicating that the tandem duplicated VvHSP20 genes had similar functions and structures. The members of the CI subgroup (VvHSP20-25, VvHSP20-26, VvHSP20-29 and VvHSP20-33) also had similar expression patterns after the treatment, indicating that they had similar functions in the response to H2O2 treatment.

Expression patterns of ABA-related genes under H2O2 treatment
ABA is known to play an important role in grape [23,24]. In a previous study [1], H2O2 treatment promoted early fruit ripening of 'Kyoho'. To further explore the role of ABA in this process, the expression of ABA-related genes was analyzed using RNA-Seq data and qRT-PCR. As shown in

Discussion
Hydrogen peroxide acts not only as a stress-inducing factor but also as a signal molecule. An imbalance between the generation and removal of ROS, such as hydrogen peroxide (H2O2), leads to oxidative stress in aerobic organisms [27,28]. The signaling function of H2O2 is well established through the identification of a number of genes whose expression levels are regulated by H2O2 [29,30]. Among the H2O2-inducible genes, heat shock proteins (HSPs) are related to defense or stress responses [5]. However, the relationship between hydrogen peroxide and HSP20 in grape berry development is not clear.
Therefore, a preliminary study on this issue was conducted.
HSP20 proteins are ubiquitous ATP-independent molecular chaperones, which play an important role in plant growth and development and prevent or reduce the irreversible aggregation of denatured proteins under stress [14,15]. Although HSP20s block the aggregation of non-native proteins and stabilize them in an ATP-independent way [17], they cannot refold non-native proteins by themselves. For example, pea Hsp18.1 works with the Hsp70 system to refold heat-denatured proteins [31]. In recent years, owing to the availability of whole genome sequences, HSP20 families have been identified in several plants, such as Arabidopsis [11], tomato [32], rice [18] and soybean [19]. However, there are few studies on the HSP20 family in grape.
Following an integrated approach to detect HSP20s in grape, 51 putative VvHSP20 genes were identified. These genes were divided into 11 subgroups (CI, CII, CIII, CV, CVI, CVII, MI, MII, ER, CP and PX/Po). Previous research identified 12 HSP20 subgroups in Arabidopsis (CI-CVII, MI, MII, ER, CP and PX/Po) [11,33]. Likewise, four new nuclear subgroups (CVIII, CIX, CX and CXI) were reported in rice [9]. However, several subgroups, including CIV and the rice subgroups CVIII, CIX, CX and CXI, were not identified among the VvHSP20 genes of grape. One study showed that the CIV subgroup may be involved in coping with various stress conditions and may be developmentally regulated [33]. Under normal growth conditions, members of the CVIII subgroup may be heat-induced and the CX subgroup may be related to specific housekeeping functions [9]. A similar situation has been reported in other plants; for example, the CIV, CV, CVIII, CIX, CX and CXI subgroups were also absent from the HSP20 family in pepper [21]. In addition, the HSP20 family of rice lacked the CIV and CVII subgroups [9]. It can therefore be inferred that gene gain and loss events are widespread in plant species. The absence of subgroups may be due to gene loss during the evolution of the HSP20 family.
The structure of genes plays a crucial role in the evolution of multigene families. The results showed that most of the VvHSP20 genes (89.1%) had no intron or only one short intron.
Plants tend to retain genes without introns or with shorter introns [34]. This was consistent with previous reports in pepper (97.14%) [21] and tomato (83.33%) [32]. Most VvHSP20s in the CII and ER subgroups had no intron, which was consistent with those in pepper, rice and soybean [18,19,21], but the exon-intron structure of the CI group in grape differed from that in these species, indicating that the intron pattern might not be well preserved among species. In addition, the instability index of most VvHSP20 proteins was greater than or equal to 40, indicating that most of them were unstable proteins. Instability is believed to be a common feature of stress proteins and also reflects the rapid induction of VvHSP20 genes [35].
HSP20s can be induced not only by environmental stresses, including heat, cold, drought and salinity, but also during various developmental processes, such as embryogenesis, germination and fruit development [19,36-38]. In this study, the expression of VvHSP20s was down-regulated by H2O2 treatment during fruit development. Similarly, FaHSP17.4 was highly expressed in leaves and flower organs of 'Fengxiang' strawberry, but its expression decreased gradually during fruit development [36]. In addition, HSP expression is induced at specific developmental stages in plants. HSP20s were highly expressed during the development of zygotic embryonic tissues and during pollen maturation in rice and tomato [9,39]. NJJS4, an HSP20-coding gene, accumulated in strawberry fruit (Fragaria x ananassa, receptacle) during ripening [40]. Class II sHSP17.4 was expressed at almost all stages of fruit development and remained at a high level in the later stage of fruit ripening, whereas Class II sHSP17.6 peaked at the turning stage and Class I HSP17.7 reached a high level at the pink stage [41]. Four differentially expressed HSP20 genes were identified from RNA-Seq results of tomato fruit of Heinz 1706, and they were considered to play an important role in fruit development [42]. These observations indicate that HSP20s are associated with fruit development.
ABA plays an important role in promoting fruit ripening. In non-climacteric grape berries, ABA is considered to be the main signal that triggers the onset of ripening-related processes, as it peaks at veraison, accompanied by the beginning of berry softening and skin coloration [43]. The content of ABA is determined by the dynamic balance between endogenous ABA biosynthesis and catabolism [44].
Previous studies have shown that 9-cis-epoxycarotenoid dioxygenase (NCED) is a key enzyme in ABA biosynthesis [45] and that CYP707A (a key ABA degradation enzyme gene) plays a predominant role in ABA catabolism in vivo in strawberry [46,47]. NCED plays an important role in the abscisic acid (ABA) mediated signaling pathway [45,48]. In this study, NCED3 had a low expression level at the early stages of fruit development and rapidly increased at the K4 stage in the control. However, it peaked at veraison and then rapidly decreased at the H4 stage. This is consistent with the changes of ABA during fruit development; ABA levels peak at the veraison stage and decrease thereafter [49,50]. ABA catabolism and biosynthesis are closely linked through feedback and feedforward loops to limit the amount of ABA needed for fruit growth and to rapidly increase the amount of ABA before fruit ripening [47]. The CYP707A4 gene was highly induced at the H1 stage, then gradually decreased and finally reached its lowest value at veraison after H2O2 treatment. A previous study showed that the expression level of FveCYP707A4a was higher in the early stages of fruit development in woodland strawberry [47]. This may be because a high level of ABA inhibits early fruit growth [47] and because ABA degradation was accelerated after hydrogen peroxide treatment.

Conclusion
In this study, the HSP20 gene family of grape was comprehensively identified. The phylogenetic relationships, gene structures, conserved motifs and cis-acting elements of 51 VvHSP20 genes were analyzed, and their expression levels were explored by RNA-Seq and qRT-PCR analysis. The 51 HSP20s were divided into eleven subfamilies according to the phylogenetic tree and subcellular localization. The expression levels of HSP20 genes in grape under H2O2 treatment were verified by qRT-PCR analysis, providing a basis for further functional analysis of HSP20 genes in fruit development. Finally, the expression levels of ABA-related genes were verified, which confirmed that H2O2 indeed affected ABA metabolism and the expression of HSP20 genes to promote fruit development and ripening.

Identification of HSP20 genes in grape genome
We downloaded the grapevine reference genome assembly and protein sequences from the Ensembl Plants database (http://plants.ensembl.org/index.html). The HMM profile of HSP20 (PF00011) was downloaded from the Pfam protein family database (http://pfam.xfam.org/) and used to identify grape HSP20 candidates. The resulting putative HSP20 protein sequences were submitted to CDD (https://www.ncbi.nlm.nih.gov/Structure/bwrpsb/bwrpsb.cgi), Pfam and SMART (http://smart.embl-heidelberg.de/) to further confirm the conserved HSP20 domain. Predicted protein sequences lacking the HSP20 domain or with a molecular weight outside the range of 15-42 kDa were removed. Finally, 51 HSP20 genes were identified after all redundant putative HSP20 sequences were removed. The physicochemical properties of the HSP20 proteins were predicted with the ProtParam online tool (https://web.expasy.org/protparam/). Subcellular localization was predicted using the online tool Protcomp (http://linux1.softberry.com/). The VvHSP20 genes were named according to their positions on the pseudomolecules [18].
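The molecular-weight criterion in this step can be illustrated with a short calculation. This is a minimal sketch that sums standard average residue masses and adds one water molecule; it stands in for the ProtParam prediction used in the study rather than reproducing it, and the function names and test sequences are hypothetical.

```python
# Average residue masses in daltons (standard values for the 20 amino acids).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # one water per chain for the free N- and C-termini


def molecular_weight_kda(protein_seq: str) -> float:
    """Approximate average molecular weight of a protein in kDa.

    Raises KeyError for non-standard residue codes (e.g. X), which this
    sketch does not attempt to handle.
    """
    seq = protein_seq.upper().rstrip("*")  # drop a trailing stop symbol if present
    return (sum(RESIDUE_MASS[aa] for aa in seq) + WATER_MASS) / 1000.0


def passes_mw_filter(protein_seq: str, low_kda: float = 15.0,
                     high_kda: float = 42.0) -> bool:
    """True if the sequence falls inside the 15-42 kDa window from the text."""
    return low_kda <= molecular_weight_kda(protein_seq) <= high_kda


if __name__ == "__main__":
    # Artificial poly-alanine sequences, used only to exercise the filter.
    for n in (100, 300, 700):
        seq = "A" * n
        print(n, round(molecular_weight_kda(seq), 2), passes_mw_filter(seq))
```

In practice this check would be combined with the domain confirmation step, since the text requires both the ACD domain and the molecular-weight window.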

Phylogenetic analysis of HSP20 genes in plants
The amino acid sequences of HSP20s from Arabidopsis thaliana and Solanum lycopersicum, together with the newly identified VvHSP20s, were used for phylogenetic analysis. Multiple sequence alignment of the HSP20 amino acid sequences was performed with MEGA 7.0 using default parameters. The NJ phylogenetic tree was constructed from the aligned HSP20 sequences using MEGA 7.0.
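Neighbor-Joining starts from a matrix of pairwise distances between aligned sequences. The sketch below computes the simple p-distance (proportion of differing sites) as one possible input; the substitution model actually applied in MEGA is not stated in the text, so this choice, the pairwise handling of gaps and the toy alignment are assumptions.

```python
from itertools import combinations


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing residues between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped
    (pairwise deletion, an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped columns shared by the two sequences")
    return differing / compared


def distance_matrix(alignment: dict[str, str]) -> dict[tuple[str, str], float]:
    """Pairwise p-distances for every unordered pair of sequences."""
    return {
        (name_a, name_b): p_distance(alignment[name_a], alignment[name_b])
        for name_a, name_b in combinations(alignment, 2)
    }


if __name__ == "__main__":
    # Hypothetical toy alignment; the names are placeholders, not real genes.
    toy = {"seq1": "MKV-L", "seq2": "MRVAL", "seq3": "MKVAL"}
    for pair, dist in distance_matrix(toy).items():
        print(pair, dist)
```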

Gene structure and conserved motif analysis of VvHSP20 proteins
The conserved motifs of the VvHSP20s were identified with the MEME program (version 4.11.2, http://alternate.meme-suite.org/tools/meme), using the following parameters: an optimum motif width of 6-200 amino acid residues and a maximum of 10 motifs. Gene structures of the VvHSP20 genes were identified using TBtools software [51].

Chromosomal location and gene duplication of HSP20 genes
The chromosomal location information of the VvHSP20 genes was obtained from the Ensembl Plants database (http://plants.ensembl.org/index.html), and the chromosome location images were generated using the MapDraw V2.1 tool (http://mg2c.iask.in/mg2c_v2.0/). VvHSP20 gene duplication was defined according to the following criteria: (1) the aligned sequence covers at least 70% of the longer gene; and (2) the similarity of the aligned region is ≥70% [52]. The duplication events of VvHSP20 genes were determined using MCScanX (Multiple Collinearity Scan) [53]. A syntenic analysis was conducted locally using Circos software.
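The two duplication criteria can be written as a single check on a pairwise alignment. The following is a minimal sketch under the assumption that the alignment length, the similarity of the aligned region and both gene lengths are already available from an alignment tool; the function name and the example numbers are hypothetical.

```python
def is_duplicated_pair(aligned_length: int, similarity: float,
                       length_a: int, length_b: int,
                       min_coverage: float = 0.70,
                       min_similarity: float = 0.70) -> bool:
    """Apply the two criteria from the text to one gene pair.

    aligned_length: length of the aligned region (same unit as the gene lengths)
    similarity:     similarity of the aligned region, as a fraction (0-1)
    length_a/b:     lengths of the two genes being compared
    """
    longer = max(length_a, length_b)
    if longer <= 0:
        raise ValueError("gene lengths must be positive")
    coverage = aligned_length / longer   # criterion 1: share of the longer gene
    return coverage >= min_coverage and similarity >= min_similarity


if __name__ == "__main__":
    # Made-up alignment summaries: (aligned length, similarity, len A, len B).
    examples = [(350, 0.82, 480, 400),   # coverage 0.73, passes both criteria
                (300, 0.90, 480, 400),   # coverage 0.63, fails criterion 1
                (350, 0.65, 480, 400)]   # similarity 0.65, fails criterion 2
    for ex in examples:
        print(ex, is_duplicated_pair(*ex))
```

Whether a duplicated pair counts as tandem additionally depends on chromosomal position, which this sketch does not evaluate.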

Plant materials
The samples were collected in 2017 from the farm of Henan University of Science & Technology, Luoyang, China. Six-year-old, naturally grown 'Kyoho' grapevines were sprayed twice with 300 mmol/L H2O2, and vines sprayed with distilled water (containing 0.03% silicon wet-77 surfactant) served as the control. The first spraying was conducted at 25 days post anthesis (dpa) in 2017 and the second at 35 dpa.
Samples were taken at 35 days after flowering and then every ten days until the treated fruits were ripe (Table 3).
qRT-PCR was performed using the TransStart Top Green qPCR SuperMix kit (TRANSGEN, Beijing, China) in a total reaction volume of 10 µL. Each VvHSP20 gene was assayed in three independent technical replicates.
The relative expression changes of VvHSP20 genes were calculated using the 2^(-ΔΔCt) method [57].
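As an illustration of the 2^(-ΔΔCt) calculation, the sketch below averages technical replicate Ct values, normalizes the target gene to a reference gene, and compares treatment with control. The reference gene is not named in this section, so the Ct values and variable names are hypothetical.

```python
from statistics import mean


def relative_expression(ct_target_treated: list[float],
                        ct_reference_treated: list[float],
                        ct_target_control: list[float],
                        ct_reference_control: list[float]) -> float:
    """Fold change of a target gene (treated vs control) by 2^(-ddCt).

    Each argument is a list of technical replicate Ct values.
    """
    # dCt: target normalized to the reference gene within each condition
    delta_ct_treated = mean(ct_target_treated) - mean(ct_reference_treated)
    delta_ct_control = mean(ct_target_control) - mean(ct_reference_control)
    # ddCt: treated relative to control
    delta_delta_ct = delta_ct_treated - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


if __name__ == "__main__":
    # Hypothetical Ct values: the target amplifies two cycles later after
    # treatment while the reference gene is unchanged, giving a fold change
    # of about 0.25, i.e. roughly four-fold down-regulation.
    fold = relative_expression(
        ct_target_treated=[26.1, 25.9, 26.0],
        ct_reference_treated=[18.0, 18.0, 18.0],
        ct_target_control=[24.0, 24.1, 23.9],
        ct_reference_control=[18.0, 18.0, 18.0],
    )
    print(round(fold, 3))
```

The method assumes near-100% amplification efficiency for both the target and the reference primers.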
Statistical analyses were performed using SPSS version 21.0, and differences between mean expression levels were assessed by ANOVA followed by Duncan's multiple range test.
The FPKM values of VvHSP20 genes were obtained from the RNA-seq data (accession code, SRA: PRJNA541089).

Chromosomal locations of VvHSP20 genes on grape chromosomes. Blue lines indicate gene positions.

Figure 5
Cis-element analysis of putative VvHSP20 promoters. Different cis-elements with the same or similar functions are shown in the same color.

Supplementary Files
This is a list of supplementary files associated with this preprint.
Table S1. Primers used for the qRT-PCR reactions.xlsx