In-silico analysis of heat shock transcription factor (OsHSF) gene family in rice (Oryza sativa L.)

Background One of the most important cash crops worldwide is rice (Oryza sativa L.). Under varying climatic conditions, however, its yield is negatively affected. In order to create rice varieties that are resilient to abiotic stress, it is essential to explore the factors that control rice growth and development and are a source of resistance. HSFs (heat shock transcription factors) control a variety of plant biological processes and responses to environmental stress. In-silico analysis offers a platform for thorough genome-wide identification of OsHSF genes in the rice genome. Results In this study, 25 randomly dispersed HSF genes with significant DNA binding domains (DBD) were found in the rice genome. According to a gene structural analysis, all members of the OsHSF family share Gly-66, Phe-67, Lys-69, Trp-75, Glu-76, Phe-77, Ala-78, Phe-82, Ile-93, and Arg-96. Rice HSF family genes are widely expressed in the vegetative organs, most strongly in the roots, followed by the leaf and stem; in contrast, in reproductive tissues, the embryo and lemma exhibit the highest levels of gene expression. According to chromosomal localization, tandem duplication and repetition may have aided in the development of novel genes in the rice genome. OsHSFs have a significant role in the regulation of gene expression, the regulation of primary metabolism and tolerance to environmental stress, according to gene networking analyses. Conclusion Six genes, viz. Os01g39020, Os01g53220, Os03g25080, Os01g54550, Os02g13800 and Os10g28340, were annotated as promising genes. This study provides novel insights for functional studies on the OsHSFs in rice breeding programs. With the ultimate goal of enhancing crops, the data collected in this survey will be valuable for performing genomic research to pinpoint the specific function of the HSF genes during stress responses. Supplementary Information The online version contains supplementary material available at 10.1186/s12870-023-04399-1.


Background
The growth of plants is significantly impacted by a variety of detrimental environmental variables, including biotic and abiotic stresses [1], because they can hasten chlorophyll deterioration and reduce photosynthetic efficiency. Abiotic stresses such as high temperature and drought are particularly important because they can severely restrict plant growth, development, and function. Because plants are sessile and cannot actively avoid stress, they depend on physiological and biochemical processes to withstand external extremes [2,3]. As a result, they must deploy a wide range of complex and effective mechanisms to maintain normal physiology, metabolism, and development under stress conditions. Transcription factors such as ABRE Binding Factor and MYC are involved in calcium signaling and in the abscisic acid and jasmonate signaling pathways that regulate reactive oxygen species (ROS) and cell signaling pathways [4]. For plants to be resistant to stress, transcription factor (TF) gene expression is essential. For the reception and transmission of signals, eukaryotes usually contain a set of transcription factors called heat shock factors (HSFs). Plant stress responses and heat tolerance are induced through heat shock factors and their regulation of downstream genes [5,6]. Numerous studies have documented interactions between heat and oxidative stress in cellular pathways. The production of ROS is regarded as a link between stressful situations such as flooding, exposure to UV radiation and pathogen attack [7]. Previously, it was proposed that redox-responsive transcription factors such as HSFA4a are probably responsible for detecting ROS levels in Arabidopsis. These "sensors" are thought to function upstream in a cascade that controls some stress-responsive proteins and other TFs, including the Zat and WRKY gene families [8].
In-depth investigation has shown that a variety of HSFs, including HSFA1b, HSFA4a, and HSFA8, are suspected of taking part in abiotic stress-induced, ROS-regulated gene networks. It is proposed that the generation of various ROS triggers HSF activation, which in turn regulates other genes. These mechanisms could act as a molecular bridge between the cellular response to heat stress and other types of stress [9]. Heat shock transcription factors are the primary regulatory components of the plant heat stress response. The sequence of the Arabidopsis thaliana genome revealed 21 open reading frames (ORFs) that encode putative HSFs, which were divided into three groups, A, B and C, based on phylogenetic analysis and structural features [10]. HSFs contain a DNA binding domain (DBD), which interacts via a helix-turn-helix motif with "heat-shock element" (5′-nGAAnnTTCn-3′) regulatory sequences found in target genes, and an oligomerization domain, which is responsible for HSF trimerization and has a bipartite heptad repeat pattern in the hydrophobic-associated region (HR-A/B) [11]. The HSF gene family has been characterized in several plant species, including A. thaliana [12], Brachypodium distachyon [13], Glycine max [14], Solanum lycopersicum [15], Populus trichocarpa [16], Triticum aestivum and Zea mays [17][18][19].
However, the function of HSFs in rice growth and development and in responses to stressors, as well as the transcript expression profiles of HSF genes, have not been thoroughly investigated. Computational biology methods offer a practical and stable foundation on which additional wet-lab research can be carried out. Numerous abiotic stresses have been connected to HSF genes. In this study, we examined this important gene family in detail using the whole annotated rice genome sequence (TIGR Rice Annotation release 7).
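The heat-shock element consensus 5′-nGAAnnTTCn-3′ quoted above can be searched for in a promoter sequence with a regular expression. The following is a minimal sketch only: the helper name `find_hse` and the promoter fragment are invented for illustration, only the given strand is scanned, and this is not part of the pipeline used in this study.

```python
import re

# Consensus from the text: 5'-nGAAnnTTCn-3', where n is any nucleotide.
# The lookahead allows overlapping sites to be reported.
HSE_PATTERN = re.compile(r"(?=([ACGT]GAA[ACGT]{2}TTC[ACGT]))")

def find_hse(sequence):
    """Return (0-based start, matched 10-mer) for each HSE-like site."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in HSE_PATTERN.finditer(seq)]

# Invented promoter fragment (an assumption, not a real rice sequence).
promoter = "ttcgAGAAGCTTCTaaccTGAATTTTCGgg"
hits = find_hse(promoter)
for start, site in hits:
    print(start, site)
```

A fuller scan would also search the reverse complement and tolerate the imperfect repeats that real heat-shock elements often show.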

Identification of HSF genes in Oryza sativa genome
The genome of the Oryza sativa L. japonica cultivar Nipponbare was initially mined for HSF genes using ESTs and cDNA sequences. The National Centre for Biotechnology Information (NCBI) https://www.ncbi.nlm.nih.gov/ [20], the Database of Rice Transcription Factors (DRTF) http://planttfdb.gao-lab.org/index.php?sp=Osj [21], the MSU Rice Genome Annotation Project Database http://rice.uga.edu/ [22] and the Plant Genome Database (PlantGDB) https://www.plantgdb.org/ [23] were used to mine the HSF genes. HSF genes in the rice genome were predicted using the BLAST online tool available at http://rice.uga.edu/analyses_search_blast.html on the RAP-DB website [24]. Sequences with more than 80% coverage in the BLAST analysis were further analysed using the online tool GENSCAN (http://hollywood.mit.edu/GENSCAN.html). On both sides of each hit, the open reading frame (ORF) was expanded by around 2000 bp [25]. Additionally, the HSF domains in the query sequences were validated using the SMART (Simple Modular Architecture Research Tool) programme (http://smart.embl-heidelberg.de/).
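The 80% coverage threshold described above can be illustrated with a short filter. This is a sketch under stated assumptions: the text does not define how coverage was computed, so coverage is taken here as the aligned query span divided by the query length, and the `BlastHit` record, the function names and the example hits are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class BlastHit:
    subject_id: str    # hypothetical identifier of the matched sequence
    query_start: int   # 1-based, inclusive
    query_end: int     # 1-based, inclusive
    query_length: int  # full length of the query sequence

def query_coverage(hit):
    """Percentage of the query covered by the alignment (assumed definition)."""
    span = hit.query_end - hit.query_start + 1
    return 100.0 * span / hit.query_length

def filter_by_coverage(hits, min_coverage=80.0):
    """Keep hits whose coverage is strictly greater than the threshold."""
    return [h for h in hits if query_coverage(h) > min_coverage]

# Invented example hits.
hits = [
    BlastHit("candidate_A", 1, 450, 500),    # 90% coverage
    BlastHit("candidate_B", 101, 400, 500),  # 60% coverage
    BlastHit("candidate_C", 1, 400, 500),    # exactly 80%, not "more than"
]
kept = [h.subject_id for h in filter_by_coverage(hits)]
print(kept)
```

Whether a hit at exactly 80% should be retained depends on how the threshold was applied in practice; the strict comparison here simply follows the wording "more than 80%".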

Phylogenetic and MEME motif analysis
Through the use of Clustal Omega (https://www.ebi.ac.uk/Tools/msa/clustalo/), the protein sequences obtained from several public repositories were aligned to remove the redundant sequences. Bootstrap (5000 replicates) and pairwise deletion were used as the default parameters to create a combined unrooted neighbor-joining (NJ) tree. In addition, conserved motifs in the rice HSF protein sequences were searched for using the online tool Multiple Em for Motif Elicitation (MEME Suite version 5.5.0; https://meme-suite.org/meme/tools/meme).

Distribution of intron and exon size in OsHSF family genes
Using the Gene Structure Display Server (GSDS; http://gsds.gao-lab.org/), the positions of introns and exons in OsHSF genes were determined from gaps discovered during the alignment of full-length cDNA transcripts with genomic sequences [5]. In brief, exons are contiguous blocks of homologous sequence shared between full-length cDNA and genomic sequences. Introns are the gaps between exons, consisting wholly of genomic sequence, when a single full-length cDNA is matched to a contiguous stretch of genomic sequence [26]. To better comprehend the range and magnitude of HSF family genes, the total length of a gene is estimated by adding the lengths of each of its exons.
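The definitions above (exons as aligned blocks, introns as the genomic gaps between them, gene length as the sum of exon lengths) can be sketched in a few lines. The function names and coordinates below are illustrative assumptions, not output from GSDS.

```python
def introns_from_exons(exons):
    """Derive intron intervals from exon intervals.

    exons: list of (start, end) genomic coordinates, 1-based and inclusive.
    Returns the gaps between consecutive exons as (start, end) tuples.
    """
    ordered = sorted(exons)
    introns = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start > prev_end + 1:
            introns.append((prev_end + 1, next_start - 1))
    return introns

def total_exon_length(exons):
    """Sum of exon lengths, the gene-length estimate described in the text."""
    return sum(end - start + 1 for start, end in exons)

# Invented two-exon gene model.
exons = [(101, 400), (901, 1500)]
introns = introns_from_exons(exons)
length = total_exon_length(exons)
print(introns, length)
```

For a gene model with two exons this yields a single intron, the pattern reported for most OsHSF genes.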

Protein 3D structure
Using the online programme AlphaFold, available at https://alphafold.ebi.ac.uk/ [27], the 3-D structures of the rice HSF proteins were predicted, as shown in Fig. 1.

Gene expression analysis
The rice expression profile database (RiceXPro) [28], a public repository of gene expression, was utilized to analyze and confirm the expression of the OsHSF gene(s). Data from microarray experiments were used to study the entire life cycle of the rice plant, including field development (leaf day time, root day time, leaf sunset, leaf night time, root night time, reproductive organs, grain at early stage, grain ripening, spatio-temporal profile), and plant hormones (abscisic acid, auxin, brassinosteroid, cytokinin, gibberellin, and jasmonic acid in root and shoot). Creating a table of normalised signal intensity values for each gene in each plant tissue gives the most precise quantitative measurement of the transcript levels for particular genes.

Identification and chromosomal distribution of OsHSFs
With the development of genomic sequencing technology, it is now possible to recover the protein/nucleotide sequences of all OsHSF family genes. After eliminating the duplicated sequences, 25 OsHSFs were discovered in the study, as indicated in Table 1. Using HMM and EMBL-EBI, all OsHSF proteins were evaluated for the presence of the DBD. The SMART online tool confirmed the OsHSF DBDs.
Table 2 lists all of the properties of the OsHSF genes. The 25 HSF genes were localized on rice chromosomes as shown in Fig. 2. Chromosome-1 and chromosome-3 had a maximum of 5 and 6 OsHSF genes respectively, whereas a single copy of an OsHSF gene was localized on chromosome-4 and chromosome-5. In contrast, chromosome-6, chromosome-7 and chromosome-8 harbor three paralogous genes, while two paralogous genes were identified on each of chromosome-2 and chromosome-9. Except for OsHSF13800, OsHSF06630, OsHSF12370, OsHSF25080 and OsHSF08140, all other OsHSF genes were located on the lower arm of the chromosomes.
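Because the locus identifiers used in this paper encode the chromosome number (for example, Os01g39020 lies on chromosome 1), a per-chromosome tally can be derived directly from a list of IDs. The sketch below is illustrative: the regular expression and function names are assumptions, and the input is only the six genes named as promising in the abstract, not the full set in Table 1.

```python
import re
from collections import Counter

# Assumed locus format: "Os" + two-digit chromosome + "g" + five digits.
LOCUS_RE = re.compile(r"^Os(\d{2})g\d{5}$")

def chromosome_of(locus_id):
    """Extract the chromosome number from a locus ID such as Os01g39020."""
    match = LOCUS_RE.match(locus_id)
    if match is None:
        raise ValueError(f"unrecognised locus ID: {locus_id}")
    return int(match.group(1))

def count_per_chromosome(locus_ids):
    """Count how many loci fall on each chromosome."""
    return Counter(chromosome_of(locus) for locus in locus_ids)

# The six genes highlighted in the abstract.
promising = ["Os01g39020", "Os01g53220", "Os03g25080",
             "Os01g54550", "Os02g13800", "Os10g28340"]
counts = count_per_chromosome(promising)
print(sorted(counts.items()))
```

The strict pattern also catches malformed identifiers, such as one with a missing digit, before they distort the tally.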

Characteristics of each group in the rice HSFs family genes
The responses of these genes to abiotic factors have been documented in Arabidopsis, Brachypodium and Oryza species. These genes must be classified in accordance with various stress regimes in order to be included in unique groups based on their protein similarity, which may point to related functions within their evolutionary placement. Table 4 provides a summary of the roles of each gene in the OsHSF family. The two sub-groups of clade-I, Ia and Ib, harbour nine genes in total. Clade-II comprises 14 genes, all involved in anther, ovary, embryo, endosperm and root development. Clade-III and clade-IV each contain a single gene. The gene in clade-III is involved in vegetative (leaf blade and root) development along with reproductive (pistil and palea) development. The gene in clade-IV is involved in vegetative development (leaf blade, leaf sheath, stem and root) and reproductive development (pistil, palea, lemma, anthers and inflorescence).

Distribution of motifs
OsHSF TFs contain functionally important motifs linked to mitochondria and chloroplasts. Such functional sequence motifs are typically conserved among members of a subgroup in large families of plant transcription factors, and proteins sharing these motifs within a subgroup are likely to have similar activities. Multiple alignment analysis with Clustal Omega was used to investigate the conserved motifs in the nucleotide sequences of each clade in the rice OsHSF gene family. MEME Suite version 5.5.0 was used to examine rice OsHSF protein sequences for the presence of conserved motifs. Overall, 15 conserved motifs were predicted that correspond to the OsHSF domain, as shown in Fig. 5; the conserved motifs found in the OsHSF family are listed in Table 3.

Gene structure analysis
The GSDS tool was used to examine the intron-exon organization of the selected OsHSFs in order to determine the structural link between the genes. The quantity and structure of introns and exons have a significant impact on how gene families have evolved. The number of exons and introns was found to be largely conserved: 84% (21/25) of OsHSFs contain just one intron, the exceptions being Os08g43334, Os03g12370, Os09g35790 and Os03g06630 (Fig. 6). The remaining OsHSF genes have two introns. Exon counts for Os08g43334, Os03g12370, Os09g35790, and Os03g06630 were seven, two, three, and three, respectively. All OsHSFs contained 5′ and 3′ untranslated regions (UTRs). In terms of intron number, intron phase, exon length, and overall gene length, similar intron-exon patterns were observed in OsHSF genes belonging to the same class and subclass.

Expression profiles of OsHSFs at different developmental stages
Investigations into the OsHSF gene expression patterns were conducted on various time scales and at various growth stages. Transcriptome profiles provide insight into the potential role of genes in a variety of biological processes, despite the fact that protein expression is not always associated with gene expression. The transcriptome data used in the current study were downloaded from the rice expression database RiceXPro. Os01g39020, Os01g53220, Os03g25080, Os01g54550, Os10g28340, and Os02g13800 were among the OsHSFs with the highest up-regulation during different growth phases (Fig. 7). Os01g39020, Os01g53220, Os01g54550, Os02g29340, Os03g58160, Os03g63750, Os06g35960, Os07g08140, Os07g44690, Os09g28200, Os09g35790, and Os10g28340 are also up-regulated throughout the development of reproductive organs and the grain ripening stages. A total of 4 OsHSFs exhibited little or no expression, as shown in Table 4.

Expression profiles of OsHSFs at different plant hormone stages
A wide variety of plant hormones have an impact on rice growth, development, and yield. RiceXPro was used to survey rice OsHSF expression in response to several plant hormones (Table 5). The genes Os01g39020, Os01g53220, Os02g13800, Os03g12370, Os03g25080, Os04g48030, Os06g35960, Os07g08140, Os08g43334 and Os09g28354 had the highest expression in the root and shoot in response to abscisic acid. Os01g54550 and Os03g25080 were moderately overexpressed in the root and shoot at various times when gibberellins were present. Only two OsHSFs (Os01g39020 and Os01g54550) displayed considerably increased expression in the root and shoot when auxin was present. A single gene, Os01g43590, showed moderate expression in the root and shoot when brassinosteroid was present. The genes Os01g39020, Os01g53220, Os02g29340, and Os08g43334 had the maximum expression under cytokinin; however, no discernible effect was seen in the shoot. Most OsHSF genes in the shoot are up-regulated in response to jasmonate. However, some OsHSFs could only be activated by a specific hormone, while other OsHSFs displayed minimal expression in response to any hormone stimulation.

Coexpression of OsHSFs gene family
The HyperTree graphical presentation illustrates the relationships between coexpressed genes as a hierarchy ordered by ascending mutual rank (MR) value, as illustrated in Fig. 13. The HyperTree nodes were labeled with transcription factor names. It reveals the association of HSF genes with other TFs such as G2-like, GRAS, RWP-RK, bZIP, trihelix and WRKY, as shown in Fig. 14. As a result, the gene network shows how the 25 HSF genes have overlapping activities and provides valuable information that could be used to better understand the molecular mechanism of rice reproductive evolution.
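The text does not define the mutual rank (MR) value. A common definition in coexpression databases is the geometric mean of the two directional correlation ranks, MR(A,B) = sqrt(rank of B in A's list × rank of A in B's list), with smaller values indicating stronger coexpression. The sketch below assumes that definition; the partner names and ranks are invented for illustration.

```python
import math

def mutual_rank(rank_a_to_b, rank_b_to_a):
    """Geometric mean of the two directional ranks (assumed MR definition)."""
    return math.sqrt(rank_a_to_b * rank_b_to_a)

def sort_by_mutual_rank(partners):
    """Order coexpression partners by ascending MR.

    partners: dict mapping a partner name to (rank_a_to_b, rank_b_to_a).
    """
    scored = [(mutual_rank(*ranks), name) for name, ranks in partners.items()]
    return [name for _, name in sorted(scored)]

# Hypothetical partners of one guide gene, with invented ranks.
partners = {
    "partner_X": (4, 9),   # MR = 6
    "partner_Y": (1, 4),   # MR = 2
    "partner_Z": (25, 4),  # MR = 10
}
ordered = sort_by_mutual_rank(partners)
print(ordered)
```

Ordering by ascending MR places the most tightly coexpressed partners closest to the guide gene, which matches the ascending-MR layout described for the HyperTree.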

Discussion
To provide food security under diverse climate scenarios and an ever-increasing global population, it is imperative to understand the molecular mechanisms of plants and discover genetic resources related to agricultural productivity. It has been discovered through the sequencing of the crop plants that the number of  [31] hypothesised that the rice genome contained 23 genes that encode HSF. OsHSFs are essential to plant growth, according to earlier findings. We therefore used the RiceXPro database to examine the specific expression of OsHSFs across 12 different developmental stages (Table 4). Many genes had increased expression, which was indicative of how they functioned at various developmental stages. In particular, Os01g39020, Os01g53220, Os03g25080, Os01g54550, Os02g13800 and Os10g28340 were highly expressed across all the growth stages. This gives significant support for the outcome of our analysis and creates a solid foundation for subsequent research to characterize the functions of Os01g39020, Os01g53220, Os01g54550, Os02g13800 and Os10g28340 under different plant hormonal levels. Similarly, it has been suggested that OsHSFs are crucial for plants to cope with abiotic stresses. According to Kumar et al. (2018), TaHSFA6e modulates wheat's resistance to drought and heat stress during the reproductive phases [32]. Induction of heat shock proteins (HSPs) contributes to Arabidopsis adaptation to salt and heat stress [33]. The expression of OsHSFs was assessed under abiotic stress conditions using microarray analysis. Most OsHSF family genes displayed stress-specific expression; however, some OsHSFs exhibited up-regulation only under a particular stress. According to Jiang (2016), OsHSFs improve plant tolerance to heat and salinity stress and increase sensitivity to abscisic acid [34]. Similarly, in our study, the Os01g39020, Os01g53220, Os01g54550, Os02g13800 and Os10g28340 genes exhibited the highest expression and may be used to improve the development of plant reproductive organs, leaf and root diurnal responses and grain ripening; these genes also performed well under different hormonal levels. The study found various transcription factor transcripts under different stress conditions. Arabidopsis, sunflower and Medicago truncatula express the HSFA9 gene solely during seed development [35,36]. The rice gene Os03g12370, which is a homolog of the Arabidopsis and sunflower gene, was not expressed during seed development. In our investigation, six OsHSF genes had enhanced expression in specific tissues. The Os09g28354, Os01g39020 and Os01g53220 genes are associated with reproductive organ tissues as well as seed and root tissues. Os02g13800 in root and reproductive organs, Os05g45410 and Os01g54550 in leaf, vegetative and ripening stages, Os03g58160 in panicle, Os01g53220 in flower and Os03g25080 in the pistil have a significant effect under stress conditions. Proposed models for HSF A1a/A1b in Arabidopsis and HSF A1a in tomato suggest that constitutively produced OsHSFs may be crucial for the regulation of stress-induced HSF genes [37,38].
It is well known that osmotic stress, salt, cold, and heat all significantly increase the expression of HSFs in Arabidopsis. In this study, expression profile analysis revealed that OsHSFs also respond to various abiotic stresses. Os03g53340, Os07g44690, Os01g53220, Os01g54550 and Os02g13800 play a major role in the ROS accumulation pathways. Our findings are consistent with those of Wang et al. (2022) [39], who hypothesised that OsHSFs would function as sensors for changes in ROS intensity. Among all OsHSFs, the gene Os03g53340 showed the greatest level of expression at both oxidative stress time points. Furthermore, the data imply that Os02g32590 and Os01g39020 could be involved in the delayed reaction to oxidative stress. It is notable that Os07g08140 seems to be the least responsive to the stress conditions, whereas Os03g53340 had the greatest transcript regulation under stress. The co-induction of OsHSF genes may provide important details about the pathways that respond to stress. The DNA-binding domain of plant HSF genes is divided into two portions by an intron. This intron is located at the same position in each OsHSF; however, its size varies [40]. The majority of HSF genes contain only one intron in the DBD, and rice is not an exception to this rule (Fig. 6). In addition, none of the rice HSF genes is intron-less, contrary to the general finding that roughly 20% of rice genes are intron-less [19,41]. Intron-less genes have been found in several rice transcription factor families, such as the MADS box [42], C2H2 zinc finger [43], bZIP [44], SAUR [45] and F-box [46] gene families. Alternative splicing may occur and vary according to environmental stresses and at certain developmental stages. About 10% of Oryza genes and 11.6% of Arabidopsis genes are alternatively spliced across numerous tissues [41,47]. It is hypothesized that the evolution of a gene family is significantly influenced by the increase or decrease in exon number. As a result, the quantity and distribution of introns and exons in OsHSF genes were examined. Our findings showed that, with the exception of Os08g43334, Os03g12370, Os09g35790, and Os03g06630, all OsHSF genes had one intron and two exons (Fig. 6). Furthermore, exon and intron lengths and positions varied significantly between subclasses, whereas they were highly conserved within the same subclass. Introns are reported to promote gene expression through effects on transcription initiation, mRNA accumulation and translational efficiency [48].
According to Xie et al. (2019), the OsHSF family of genes exhibits a co-expression pattern under various abiotic stimuli in Arabidopsis [49]. Our study indicates that OsHSF TFs regulate multiple mechanisms in rice. During co-expression analysis of selected OsHSF gene(s), it was found that the OsHSF genes trigger C2H2-type zinc finger proteins, which enhance plant drought resistance by activating the expression of related target genes and increasing the levels of osmotic regulation [50]. The OsHSF genes were also coexpressed with Golden 2-like (G2-like) family genes, which have been characterized as regulating the formation of chloroplasts during the transition and early maturing phases (Fig. 14) [51].
The CCAAT-binding complex (CBC), which regulates primary and secondary metabolism, development, stress reactions, and pathogenicity in fungi and plants, is activated by the OsHSFs. The CBC is normally composed of heterotrimeric core subunits [52]. Moreover, the sequence-specific DNA-binding TFs known as "growth regulating factors" (GRFs) regulate numerous aspects of plant growth and development [53,54]. BZR1/BES1 and OsHSFs play a crucial role in BR signaling and also act as regulators in multi-signal-regulated plant growth and development events by directly networking with other key proteins or genes [42,55].
In the co-expression analysis, the AP2/EREBP genes were also triggered that play indispensable roles in root  high temperature circumstances [57]. Systematic investigation of the rice TF family genes reveals that they are up-regulated under heat stress and contribute in a multiplicative way to the OsHSF genes [58]. During plant growth, auxin stimulates the cell wall and also influences root formation [59].
The root nodule (RN) symbiosis is dependent on two GRAS domain transcription factors known as the nodulation signaling pathway proteins (NSP1 and NSP2). Their rice homologs, OsNSP1 and OsNSP2, effectively reversed the RN symbiosis-defective phenotypes of the mutants of the corresponding genes in the model legume Lotus japonicus [60,61]. Through cell differentiation, OsHSFs and the RWP-RK domain regulate the development of female gametophytes. This is a useful attribute for identifying early maturing rice genotypes that flower under high temperature [62].
Co-expression of HSF and WRKY TFs, which respond to biotic and abiotic stresses, controls plant growth and development. It is still unclear how WRKY TFs regulate plant height in rice and react to drought stress at the molecular level. In rice, the majority of the WRKY genes show variable responses towards cold, heat, PEG and salinity stresses [63]. Recently, the HSFA2e gene has been annotated to confer thermo-tolerance in transgenic Arabidopsis plants [64]. In the field, plants are subjected to a variety of stresses; hence it is crucial to develop crop types that are resilient to a variety of stress conditions. The ability of OsHSFs to respond to stress can be used to create transgenic rice plants that are tolerant to abiotic stress [65].

Conclusion
A comprehensive in-silico investigation, including phylogenetic analysis, gene structure and conserved motif analysis, chromosomal location, evolutionary analysis, and OsHSF expression profiling, was carried out to better understand the function of 25 OsHSF genes. According to expression profiling, Os03g53340, Os01g54550, Os02g13800, and Os01g39020 are the key heat shock regulators (HSR) in rice, and Os03g53340 is crucial for the early activation of the heat shock protein gene under heat stress. These findings lay the foundation for studying developmental processes and responses to various stresses through functional validation approaches such as overexpression and knockout via CRISPR/Cas9 systems. OsHSFs take part in the abiotic stress response pathway not only under heat shock but also under other abiotic stresses. This information can be used to produce stress-tolerant rice cultivars suitable for changing climate conditions.

Fig. 1 Protein structure of rice HSFs. The prediction model confidence level is presented at the bottom.

The phylogram is divided into a total of four clades, namely clade-I to clade-IV. Clade-I is further distinguished into sub-groups I-a and I-b, with a total of nine OsHSF genes. With 14 OsHSF genes, clade-II is further split into clade-IIa and clade-IIb. Clade IIa also has two sub-clades called clade IIab and clade IIac. Two distinct groups with a single gene each are clade-III and clade-IV. Finally, the genes are characterized into OsHSF proteins, which are applied for abiotic factors like heat shock and drought resistance.

Fig. 7 Role of Os02g13800 gene in field development (A) and network image (B)

Fig. 8 Role of Os01g39020 gene in field development (A) and network image (B)

Fig. 9 Role of Os01g53220 gene in field development (A) and gene networking analysis on the basis of locus ID (B)

Fig. 10 Role of Os01g54550 gene in field development (A) and gene networking analysis on the basis of locus ID (B)

Fig. 11 Role of Os03g25080 gene in field development (A) and network image (B)

Fig. 12 Table 5
Fig. 12 Role of Os08g43334 gene in field development (A) and gene networking analysis on the basis of locus ID (B)

Fig. 13 Networking of HSF gene family members [25] triggering multiple genes. Red dots represent the endocytosis process, blue dots the spliceosome process, pink dots ascorbate and aldarate metabolism and yellow dots glutathione metabolism

Fig. 14 Hyper tree of single guide gene of heat shock factor gene family

Table 1
Basic information of HSF gene family

Table 2
Features of HSF gene family for chromosomal localization

Table 3
Conserved motif sequence of Oryza sativa L. HSFs involved in root development, vegetative growth and reproductive stages (embryo development). These genes have resistance against water stress in early seed germination and at the time of flowering during high temperature.

Table 4
Role of HSF gene family in field development