The CYP450 Multigene Family of Fontainea and Insights Into Diterpenoid Synthesis


Background: Cytochrome P450s (CYP450s) are enzymes that play critical roles in the biosynthesis of physiologically important compounds across all organisms. Although they have been characterised in a large number of plant species, no information relating to these enzymes is available from the genus Fontainea (family Euphorbiaceae). Fontainea is significant as the genus includes species that produce medicinally significant epoxy-tigliane natural products, one of which has been approved as an anti-cancer therapeutic. Results: A metabolome analysis showed that Fontainea species possess a chemical profile different from various other plant species. The diversity and expression profiles of Fontainea CYP450s were investigated from leaf and root tissue. A total of 103 and 123 full-length CYP450 genes were identified in Fontainea picrosperma and Fontainea venosa, respectively (and a further 127 and 125 partial-length genes), and were phylogenetically classified into clans, families and subfamilies. The majority of CYP450 identified are most active within root tissue (66.2% F. picrosperma, 65.0% F. venosa). Fontainea CYP450 associated with diterpenoid biosynthesis were classified into 2 subfamilies (CYP71D and CYP726A), of which CYP726A1, CYP726A2 and CYP71D1 appear to be exclusive to Fontainea species, and are significantly more highly expressed in root tissue compared to leaf tissue. Conclusion: This study presents a comprehensive overview of the CYP450 gene family in Fontainea that may provide important insights into the biosynthesis of the medicinally significant epoxy-tigliane diterpenes found within the genus.


Background
Diterpenes, also known as diterpenoids or isoprenoids, are a structurally diverse class of small molecules that are widespread throughout the plant kingdom. Diterpenes exhibit many and varied biological activities and consequently there is significant commercial interest in their potential applications as pharmaceuticals, food products, and industrial and agricultural chemicals [1][2][3][4][5]. Tigilanol tiglate (TT), a novel epoxy-tigliane diterpene ester extracted from the fruit of Fontainea picrosperma (family Euphorbiaceae) [3,6], is of particular current interest due to its effectiveness as a local treatment for a range of cancers in humans and companion animals [3,7,8,9]. Recently, TT was approved by regulatory authorities in Europe and the USA as a veterinary pharmaceutical for the treatment of non-metastatic canine mast cell tumours. TT cannot be synthesised on a commercial scale and instead is manufactured by purification from the fruit of F. picrosperma [10]. The natural biosynthetic pathway leading to tigliane esters such as TT is currently unknown, yet given the important role cytochrome P450s (CYP450s) play in the biosynthetic pathways of both primary and secondary metabolites [11], it is likely that they are critical to the biosynthesis of TT and epoxy-tiglianes more generally.
CYP450s are widely distributed in eukaryotes, where they form a large and diverse class of enzymes consisting of more than 35,000 members [12,13] and play vital roles in biosynthesis of natural products, degradation of xenobiotics, biosynthesis of steroid hormones, drug metabolism and synthesis of secondary metabolites [11,12]. They catalyse reactions in biosynthetic pathways of many compounds such as alkaloids, flavonoids, lignans, isoprenoids, phenolics, antioxidants and phenylpropanoids [14,15]. In most plant species, root tissue has relatively higher levels of CYP450 gene expression compared to leaf tissue [16,17]. CYP450s have relatively low sequence identity among different organisms (such as plants and animals) [18,19]. In plants, the CYP450 superfamily is one of the largest gene families of enzyme proteins. For instance, the CYP450 gene family is the third largest gene family present in Arabidopsis [20]. At present, 5100 plant CYP450s have been annotated and clustered into two different categories (A-type and non-A-type) and 11 different clans [21]. The A-type CYP450 enzymes are grouped as the CYP71 clan, whereas the non-A-type are subdivided into 10 clans (CYP51, CYP72, CYP74, CYP85, CYP86, CYP97, CYP710, CYP711, CYP727, and CYP746) [14,15,19] according to the standard nomenclature system [22]. The CYP71 clan includes more than 50% of all plant CYP450s [23,24] and phylogenetic analysis of diterpenoid CYP450 members reveals that CYP71 and CYP726 are important for the biosynthesis of diterpenoids in the Euphorbiaceae family [25].
Plant CYP450s are membrane-bound enzymes [26] that primarily localize to the endoplasmic reticulum, chloroplast or mitochondria and other secretory organelles [20]. The molecular mass of plant CYP450s ranges from 45 to 62 kDa, with an average of 55 kDa [27]. They have four conserved key domains: a heme-binding domain, I-helix, K-helix and PERF/W domain [27]. The heme-binding signature motif has 10 conserved residues, among which cysteine is highly conserved, while the heme-iron motif has a binding site for oxygen, which is a requirement for transferring electrons to their substrate [28].
To date, there are no reports of CYP450 genes in Fontainea, yet next-generation sequencing (NGS) provides an ideal tool for their elucidation in this genus. Fontainea picrosperma and Fontainea venosa are closely related species that presumably produce similar arrays of natural products, including epoxy-tigliane diterpenes. In this study, we report the general metabolomic profiles of these two Fontainea species and compare them with those of non-Fontainea species. Towards better understanding these similarities and differences, we used NGS transcriptomics to elucidate the Fontainea CYP450 family and their relative gene expression in leaf and root tissue, with particular focus on those predicted to be involved in diterpenoid synthesis.

Metabolomic analysis and CYP450 identification
Metabolomic analysis of leaf tissue from F. picrosperma, F. venosa and 4 other non-Fontainea plant species provided a total of 49,098 mass spectral ions extracted from the LC-MS dataset. A partial least squares-discriminant analysis (PLS-DA) was performed to analyse the chemodiversity among samples. The PLS-DA model with three components accounting for 20.8%, 25.0% and 15.8% of the total variance showed that F. picrosperma and F. venosa were notably separated from other species (Fig. 1A). This untargeted metabolomic analysis indicated that the Fontainea species were considerably more closely related to each other from a chemical perspective than to two Euphorbiaceae (Manihot esculenta, Ricinus communis) and two non-Euphorbiaceae (Arabidopsis thaliana, Solanum lycopersicum) plants.
To investigate if chemodiversity is correlated with CYP450 diversity, the F. picrosperma and F. venosa CYP450 were initially identified through NGS, de novo reference assembly and gene ontology. Transcriptome libraries were constructed from combined root and leaf tissue for both F. picrosperma and F. venosa plants using Illumina HiSeq 2500 trimmed reads (150-200 bp). In total, 12 Gb and 30 Gb raw reads were generated for F. picrosperma and F. venosa libraries, respectively, which were assembled into 192,639 (N50 1,450 bp) and 246,608 (N50 1,248 bp) contigs. From these reference Fontainea transcriptomes, we identified 103 and 123 full-length CYP450 genes (and 127 and 125 partial-length) in F. picrosperma and F. venosa, respectively, all of which contained the conserved cytochrome P450 domain. The 4 different recognised CYP450 motifs/regions (I-helix, K-helix, PERF and heme-binding) are presented with noted conservation (Fig. 1B). A comparative CYP450 sequence identity analysis of the same 6 species used in the metabolomic analysis demonstrated that the two Fontainea species share considerably more CYP450 homologs with each other (48% of CYP450) than with other species of Euphorbiaceae and non-Euphorbiaceae (Fig. 1C). The non-Euphorbiaceae species A. thaliana and S. lycopersicum shared no CYP450 homologs with F. picrosperma or F. venosa.
F. picrosperma CYP450 genes were classified into 10 clans, within which there were 37 families and 45 subfamilies (Table 1). In F. venosa, CYP450 genes were classified into 9 clans, containing 37 families and 67 subfamilies (Table 2). The general characteristics of Fontainea full-length CYP450 proteins were also investigated, including the amino acid (aa) length, molecular weight, isoelectric point (pI) and presence of a secretory signal peptide. Length varied from 300-618 aa in F. picrosperma and 301-632 aa in F. venosa, with molecular weights ranging from ~27-71 kDa. The pI values ranged from 5.22-9.8 in F. picrosperma and 5.29-9.91 in F. venosa. We also calculated their instability index (II) and found that 18 F. picrosperma and 52 F. venosa CYP450 were stable (instability index < 40). The GRAVY values were negative for all CYP450 proteins, indicating that they are hydrophilic. In F. picrosperma, 68 CYP450 were predicted to have a secretory pathway signal peptide, whereas only 5 sequences (FpCYP74B2, FpCYP707A2, FpCYP707A3, FpCYP707A4 and FpCYP88A1) contained chloroplast-targeting peptides. In F. venosa, 78 CYP450 had secretory pathway signal peptides and 3 (FvCYP707A1, FvCYP707A2 and FvCYP707A4) had chloroplast-targeting signal peptides. There were no CYP450 with mitochondrial-targeting signal peptides.
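The GRAVY value mentioned above is conventionally the mean Kyte-Doolittle hydropathy over all residues of a protein, with negative values read as hydrophilic. The following is a minimal sketch of that calculation, not the tool used in this study; the function name `gravy` and the example peptides are hypothetical and chosen only for illustration.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Return the grand average of hydropathy (GRAVY) of a protein sequence.

    The value is the sum of per-residue hydropathy divided by the length.
    Non-standard residues are rejected rather than silently skipped.
    """
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    unknown = set(seq) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


if __name__ == "__main__":
    # Hypothetical toy peptides, not Fontainea sequences.
    for peptide in ("RKDE", "AILV"):
        value = gravy(peptide)
        label = "hydrophilic" if value < 0 else "hydrophobic"
        print(peptide, round(value, 2), label)
```

Under this convention, a protein is called hydrophilic when its GRAVY is below zero, which is the criterion applied to the Fontainea CYP450 proteins in the text.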

Phylogenetic and putative functional analysis of CYP450
A phylogenetic analysis containing 1042 CYP450s from 6 species (F. picrosperma, F. venosa, R. communis, M. esculenta, A. thaliana and S. lycopersicum) confirmed that the majority of F. picrosperma and F. venosa CYP450 do not show substantial relatedness to CYP450 from species outside of the Euphorbiaceae family (Additional Fig. 1). There were 72 CYP450 genes exclusive to Fontainea (identity >92%). A Fontainea-specific CYP450 phylogeny showed that in F. picrosperma, 16 CYP85 were assigned into 7 families that form a single clade, while the CYP72 clan contained 17 genes assigned to 3 families (Fig. 2A). In F. venosa, 14 genes were assigned into 7 families that formed a single clade for CYP85. In the CYP72 clade, 17 CYP450 clustered into 4 families. A single CYP450 was represented in clan CYP710 (FpCYP710A1, FvCYP710A1) and CYP51 (FpCYP51G, FvCYP51G), which are phylogenetically most related to CYP85.
The majority of F. picrosperma CYP450 belong to the CYP71 clan (57 genes; 57.68%), which is also known as A-type, followed by the CYP72 and CYP85 clans (Fig. 2B). The CYP71 clan is responsible for alkaloid, sesquiterpenoid, cyclic terpenoid and flavonoid biosynthesis. The majority of F. venosa CYP450 also belong to the CYP71 clan (68 genes; 55.28%), followed by the CYP72 and CYP85 clans (Fig. 2B). The non-A type encompass the remaining 46 CYP450, which belong to 9 CYP450 clans and 21 families in F. picrosperma. In F. venosa, there were 55 non-A type CYP450, which belong to 8 clans and 19 families. Of note, representative CYP450 from CYP711 were absent from F. venosa, while CYP97 were better represented in F. venosa compared to F. picrosperma.

CYP450 genes involved in diterpenoid metabolism
Phylogenetic analysis showed that the Fontainea CYP450 classified within subfamilies CYP71D and CYP726A were relatively closely related to those found in other Euphorbiaceae species (Jatropha curcas, Euphorbia peplus, Euphorbia latex and R. communis), confirming their position within diterpenoid CYP450 subfamilies (Fig. 4A). This was additionally supported by conserved motif analysis. Of the 4 F. picrosperma CYP726A, FpCYP726A4 was most divergent. A single CYP726A was identified in F. venosa (FvCYP726A1), which formed a clade with FpCYP726A1-A2. Three of the Fontainea diterpenoid CYP450 were present in both F. picrosperma and F. venosa but not in other Euphorbiaceae.
To confirm tissue expression of the common Fontainea diterpenoid CYP450 genes CYP726A1, CYP726A2 and CYP71D1, 12 biological samples of F. picrosperma were quantitatively analysed by mapping RNA-seq reads to the reference transcriptome. These data confirmed that gene expression was significantly higher in root tissue compared to leaf tissue (Fig. 4B). The housekeeping genes glyceraldehyde-3-P dehydrogenase (GAPC) and elongation factor 1-alpha (EF1α) showed consistency in both leaf and root tissues and higher expression in root tissue compared to leaf tissue [29].

Discussion
Cytochrome P450s are evolutionarily conserved enzymes that are involved in the catalysis of numerous reactions required for growth, development, defence [30] and secondary metabolism [31]. Prior to this study, no CYP450 genes had been identified, let alone characterized, in any species of the genus Fontainea. This is significant as CYP450 genes are likely to be important for future understanding of the biosynthetic pathways that produce medicinally significant diterpene esters, such as TT, which are unique to Fontainea. Towards that aim, we have identified and classified putative full-length CYP450 encoding genes in two species of Fontainea, F. picrosperma and F. venosa. Phylogenetic analysis allowed us to identify groups of genes for further evaluation. Moreover, their expression profiles in leaf and root tissues were investigated, with a particular focus on the CYP450 genes linked with diterpenoid biosynthesis, potentially involved in the production of TT.
We report 103 and 123 full-length CYP450 genes from F. picrosperma and F. venosa, respectively, that were classified into clans, which cumulatively consisted of 37 families and 67 subfamilies that fit into conformed plant-derived functions, most prominently diterpenoid, flavonoid and other functions. An ortholog comparison showed that CYP450 genes of Fontainea species are largely unique when compared to other plant species of both Euphorbiaceae and non-Euphorbiaceae. In support of our metabolomics analysis, the S. lycopersicum and A. thaliana CYP450 showed low overall similarity to those of Fontainea species. This may be attributed to the unique biosynthesis of diterpenoid derivatives that are phorbol ester-specific, found in Fontainea and other members of the Euphorbiaceae family.
The total number of Fontainea CYP450 identified in this study was consistent with that found in other plant transcriptomics CYP450 research, including the total number of full-length sequences, clans, families and subfamilies [16,32,33]. For example, transcriptomic studies allowed the elucidation of 151 full-length CYP450 genes in Lonicera japonica [32], 118 full-length in Taxus chinensis [33] and 116 full-length in Salvia miltiorrhiza [16]. However, in S. miltiorrhiza, the tissues used for transcriptomics included leaves, roots and flowers, while in L. japonica, flowers and buds were used. Our study identified 127 and 125 partial-length CYP450 genes from F. picrosperma and F. venosa, respectively. To obtain the full-length sequences, additional RNA-seq from the stem, flower and fruit would be helpful, as well as from different stages of development. In addition, this could be complemented by genome sequencing.
If a genome is available for a species, it provides an alternative route to CYP450 gene identification through genome-wide interrogation. In A. thaliana, this approach identified 246 genes that clustered into 9 clans and 47 families [21,34], while a much larger number were identified from the soybean (Glycine max) and rice (Japonica) genomes, containing 332 and 355 CYP450 genes, respectively [35]. Far fewer were present in the legume Medicago truncatula, where 151 putative CYP450 genes were identified, including 135 novel CYP450 [36]. We expect that once a genome is available for Fontainea, a more complete list of full-length CYP450 genes will be established.
Our results using these predictive protein characterisation analyses (i.e. molecular weight, cell localization, function) were in line with prior studies of CYP450 proteins. CYP450s are typically anchored on the surface of the endoplasmic reticulum [37] and some may target to the plastids or mitochondria [38]. It is common that animal CYPs are anchored to mitochondria, but there is no report of any plant CYP with mitochondrial localization [36] except in maize (Zea mays), where 3 CYPs have been reported [39]. In our study of F. picrosperma and F. venosa, no deduced CYP450 proteins were predicted to have mitochondrial targeting peptides. CYP74A1, CYP74B1 and CYP74B2 in F. picrosperma and F. venosa and CYP726A4 and CYP71B12 in F. picrosperma were found in the chloroplast. In other plants, such as Triticum araraticum and Z. mays, all members of CYP74 and CYP701 were targeted to the chloroplast [39].
The diversity of different CYP450s between F. picrosperma and F. venosa, and other species, likely contributes to the observed differences in their chemical profiles. In all plant species that have been researched to date, the largest CYP450 clan is CYP71 [22]. The families and subfamilies within the clan have diverged remarkably during plant evolution, and many are known to be involved in secondary metabolite biosynthesis of flavonoids and alkaloids [40]. Similarly, the CYP71 clan is the largest CYP450 clan in Fontainea (see Fig. 2). In contrast, two CYP711 representatives were identified from F. picrosperma, but were absent in F. venosa, although CYP711 have been described in other plant species [24]. In our phylogenetic tree, the CYP74 clan is adjacent to the CYP711 clan, suggesting that Fontainea CYP711 may also function within the metabolism of oxylipins and strigolactone signals [24], as strigolactones have been identified as branching inhibition hormones in plants, and several CYP711 have been experimentally confirmed as strigolactone biosynthetic enzymes [41,42]. We additionally found that F. venosa had more CYP97 genes compared to F. picrosperma; the CYP97 clan is involved in the hydroxylation of carotenoids [43]. Carotenoids are a group of widely distributed pigments derived from the ubiquitous isoprenoid biosynthetic pathway and play diverse roles in plant primary and secondary metabolism. Carotenoids contain two pigments, carotene and lutein, which absorb and transfer energy to protect chlorophyll [34]. We speculate that this may partially explain why F. venosa has darker leaves compared to F. picrosperma.
We found that Fontainea (F. picrosperma and F. venosa) CYP450 genes were more actively expressed in root tissue compared to leaf tissue (see Fig. 3). Those significantly more highly expressed in the root include CYP450 that have been associated with fertility reduction (CYP78A), UV stress tolerance (CYP84A), gibberellin metabolism (CYP714A) and jasmonic acid metabolism (CYP74A) [44][45][46][47]. Those significantly more highly expressed in the leaf include those previously associated with catalysing successive oxidation steps of the plant hormone jasmonoyl-isoleucine for catabolic turnover (CYP94), ABA 8'-hydroxylase activity, which affects ABA levels to control seed dormancy (CYP707A), hydroxylation of carotenoids (CYP97), biosynthesis of castasterone in the brassinosteroid biosynthetic pathway (CYP85A) and glucosinolate metabolism (CYP83) [37,43,48,49,50]. Some species variation existed in CYP450 homolog tissue expression (see Fig. 3B). This may be explained by the different growth and developmental stages of the plants from which the tissue was sampled, as CYP450s are involved in the regulation of plant hormone metabolism, growth and development, and hormones are involved in the formation and development of flowers, leaves, stems and fruits [51].
Diterpenoids are one of the most widespread classes of secondary metabolites in higher plants. They are synthesized from basic isoprene units (C5H8) and further modified by various oxidoreductases, acyltransferases, dehydrogenases and glucosyltransferases [52]. CYP450-dependent oxidative modification is essential for the biosynthesis of diterpenes [52]. Of the countless natural products formed in plants, diterpenoids are one of the most diverse groups, consisting of more than 12,000 metabolites [53] that have proven to be valuable as therapeutic drugs. According to previous research, the CYP71D and CYP726A subfamilies are key CYP450s involved in diterpenoid biosynthesis [25,53] and most of these diterpenoids can only be found in plants [54]. Our phylogenetic analysis of diterpenoid CYP71 clan members revealed that Fontainea have representatives within two different diterpenoid subfamilies, namely CYP71D and CYP726A.
F. picrosperma diterpenoid CYP450 genes are significantly more highly expressed in root tissue compared to leaf tissue. The expression of genes can be affected by the developmental stage of plants, environmental conditions, seasonal and diurnal effects as well as biotic and abiotic stress [55]. Therefore, future research should explore the expression of the identified CYP450 genes under these different scenarios and in additional tissues. In other Euphorbiaceae, diterpenoid genes were found to be co-regulated in rhizome and hairy roots [17] or highly expressed in root tissue compared to leaf and flower [16]. Nonetheless, Fontainea CYP71D and CYP726A genes are excellent candidates for involvement in diterpenoid biosynthesis pathways, in particular the biosynthesis of epoxy-tigliane diterpene esters, which are only found in species of Fontainea, although further experiments are required to confirm this hypothesis. Their identification allows for experimental analysis of their function, for example by in vitro expression of the proteins followed by activity detection, or by knock-in and knock-out experiments. This may be followed by activity detection in vivo, depending on the availability of a robust experimental system. Also, the analysis of high TT-producing F. picrosperma, compared with low producers, will provide guidance about CYP450 (and other genes) that potentially regulate TT production.

Conclusions
This research represents an important first step in understanding the role of CYP450 genes involved in the biosynthesis of diterpene esters found only in species of Fontainea. A metabolome analysis showed that Fontainea species possess a chemical profile different from other plant species. This could at least partially be explained by the diversity of unique CYP450 found in Fontainea. Further intra-genus chemical variation could also be due to variation in CYP450 between the two Fontainea species investigated, including the diterpenoid CYP450 genes. Our study showed that the majority of Fontainea CYP450 genes identified are more active in root tissue compared to leaf tissue. The root-derived F. picrosperma diterpenoid CYP450 genes identified in this study (i.e. CYP71D1, CYP726A1 and CYP726A2) are strong candidates as key enzymes in the biosynthesis of medicinally significant diterpene esters of the epoxy-tigliane class.

Materials And Methods
Metabolome analysis

Plant tissue collection for transcriptomics
F. picrosperma and F. venosa seedlings were provided by EcoBiotics Ltd and plants were grown in the University of the Sunshine Coast (Sippy Downs) greenhouse. Plants were grown in independent pots and kept in the greenhouse at ambient temperature and humidity according to Mitu et al. [29]. Plants were 2-4 years old when sampled. Healthy, fully expanded leaves (a single leaf from each plant) and actively growing root tips (1 cm² root), including the apical meristem and root caps, were dissected from plants of each species and preserved following the procedure described by Mitu et al. [29].

RNA isolation
Total RNA was isolated from ~100 mg of leaf and root tissue

De novo assembly and functional annotation
Two reference transcriptome libraries were prepared from two individual plant samples of F. picrosperma and F. venosa leaf and root tissue. The quality of raw reads from each library was checked separately using FastQC [56], and reads were trimmed using Trimmomatic [57]. Trimmed reads of the different RNA-seq libraries for F. picrosperma and F. venosa were merged separately prior to assembly using Trinity [58], which applies a de novo reconstruction method. Quality of the assembly was assessed using the built-in Trinity Perl script to generate an N50 value. Alignment coverage rate was calculated using the program Bowtie [59] with a cut-off set at 70%. Following assembly, with an E-value cut-off of 10⁻⁵ [60].
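The N50 statistic used above to assess assembly quality can be illustrated with a short calculation. This is a minimal sketch of the standard N50 definition, not the Trinity script itself; the function name `n50` and the toy contig lengths are hypothetical.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the length L of the contig at which, when contigs are sorted
    from longest to shortest, the running total of bases first reaches
    at least half of all assembled bases.
    """
    if not contig_lengths:
        raise ValueError("no contig lengths given")
    ordered = sorted(contig_lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    # Hypothetical toy assembly: 1500 bases in total, so half is 750.
    # The two longest contigs (500 + 400 = 900) pass that mark.
    print(n50([100, 200, 300, 400, 500]))
```

A higher N50 indicates that more of the assembly sits in long contigs, which is why it is reported alongside the contig counts for each Fontainea transcriptome.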

Classification and characterization of Fontainea CYP450 genes
An integrated HMM-search and InterProScan-verification approach was applied to identify the putative CYP450 gene families in Fontainea species. The CYP450 family HMM model was used through HMMER3 [61]. The filtered sequences were further searched by BLAST using NCBI (https://www.ncbi.nlm.nih.gov/) with a cut-off E-value of 10⁻⁵. Sequences annotated as CYP450 members were collected. The full-length CYP450 proteins were identified manually using nucleotide sequences in ExPASy [62] based on Chen et al. [16], and after annotation those sequences that did not match any plant species were discarded. After filtering, the coding sequences of the resultant subjects were retrieved. Finally, results from the two methods were integrated and corrected manually. A BLAST search of F. picrosperma and F. venosa CYP450 genes against 814 previously identified CYP450 genes from 4 different plant species (A. thaliana, R. communis, M. esculenta, S. lycopersicum) was performed for CYP450 gene classification.
All full-length CYP450 genes were named according to the standard CYP450 nomenclature [22]. Briefly, 40%, 55% and 97% sequence identities were used as cut-offs for family, subfamily and allelic variants, respectively.
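The identity cut-offs above can be read as a simple decision rule applied to the percent identity between a query protein and its best-matching named CYP450. The sketch below illustrates that rule only; the function name `naming_level`, the returned labels, and the choice of strict "greater than" comparisons at each threshold are assumptions made for illustration, not details stated in the text.

```python
# Cut-offs taken from the text (percent amino acid identity).
FAMILY_CUTOFF = 40.0
SUBFAMILY_CUTOFF = 55.0
ALLELE_CUTOFF = 97.0


def naming_level(identity_pct):
    """Map percent identity to the closest shared nomenclature level.

    Assumption: a threshold is met only when identity is strictly
    greater than the cut-off.
    """
    if not 0.0 <= identity_pct <= 100.0:
        raise ValueError("identity must be between 0 and 100")
    if identity_pct > ALLELE_CUTOFF:
        return "allelic variant"
    if identity_pct > SUBFAMILY_CUTOFF:
        return "same subfamily"
    if identity_pct > FAMILY_CUTOFF:
        return "same family"
    return "different family"


if __name__ == "__main__":
    # Hypothetical identities, not values measured in this study.
    for pct in (35.0, 48.5, 72.0, 98.6):
        print(pct, "->", naming_level(pct))
```

In practice such a rule only proposes a name; the text notes that the final assignments were also integrated and corrected manually.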
According to a previous study [51], functions of CYP450 clans were identified. We also calculated their instability index (II) using the ExPASy ProtParam tool (https://web.expasy.org/protparam/). Theoretical isoelectric points (pI) and molecular weights (kDa), as predicted by the ExPASy tool (http://www.expasy.org/tools/), were used to assess the physicochemical properties of each putative full-length CYP450 protein.

Phylogenetic analysis of CYP450
One hundred and three (103) full-length genes in F. picrosperma and 123 genes in F. venosa were used for phylogenetic representation in Fontainea species. Sequence alignment was performed using ClustalW in Geneious 11.02 software. The phylogenetic tree was constructed using FastTree with the maximum likelihood (ML) algorithm. The statistical bootstrap support of each branch was assessed by re-sampling the amino acid positions 1,000 times. The maximum likelihood phylogenetic tree and evolutionary analyses were carried out using the iTOL web server (https://itol.embl.de/) [64]. For conserved domain identification, multiple sequence alignment of full-length Fontainea protein sequences was carried out using the ClustalX program with default parameters [65]. The alignment file was submitted to the WebLogo generator (http://weblogo.berkeley.edu/) [66] to generate logos of the conserved domains.
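A sequence logo of a conserved domain displays, for each alignment column, how much information (in bits) the residues carry. As a rough illustration of that idea, the sketch below computes per-column information content from an alignment as log2(20) minus the Shannon entropy of the column. It is a simplified version that omits the small-sample correction logo tools may apply; the function name `column_information` and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def column_information(aligned_seqs, alphabet_size=20):
    """Return the information content (bits) of each alignment column.

    Gap characters ('-') are ignored. A fully conserved column scores
    log2(alphabet_size); a column of only gaps scores 0.0.
    """
    if not aligned_seqs:
        raise ValueError("no sequences given")
    width = len(aligned_seqs[0])
    if any(len(seq) != width for seq in aligned_seqs):
        raise ValueError("sequences must have equal aligned length")
    max_bits = math.log2(alphabet_size)
    bits = []
    for col in range(width):
        residues = [seq[col] for seq in aligned_seqs if seq[col] != "-"]
        if not residues:
            bits.append(0.0)
            continue
        total = len(residues)
        entropy = -sum(
            (count / total) * math.log2(count / total)
            for count in Counter(residues).values()
        )
        bits.append(max_bits - entropy)
    return bits


if __name__ == "__main__":
    # Hypothetical toy alignment; columns 1 and 3 are fully conserved.
    toy = ["FGCG", "FACG", "FSCG"]
    print([round(b, 2) for b in column_information(toy)])
```

Columns with high values correspond to tall letter stacks in a logo, such as the conserved cysteine of the heme-binding region.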

Relative CYP450 gene expression
Total RNA was extracted from leaf and root tissue of 3 different F. picrosperma and F. venosa plants using an RNeasy plant extraction kit from QIAGEN® (Hilden, Germany) according to Mitu et al. [29]. The expression levels of CYP450 genes were calculated using the CLC Genomics Workbench 11.01 software package following default parameters. Raw counts for RNA-sequencing data of F. picrosperma and F. venosa genes were normalized to Transcripts Per Million (TPM). Levels of expression were represented as the log2 ratio of transcript abundance between leaf and root tissues. Next, we generated a z-score from the sequencing depth-normalized read counts. Expression of each enzyme in leaf and root tissue was analysed by normal clustering. Relative expression profiles of CYP450 genes were presented in the form of a heatmap, which was constructed from z-scores with ClustVis (https://biit.cs.ut.ee/clustvis/) [67], using default parameters and a hierarchical clustering analysis to assess biological sample relatedness. We also identified homolog sequences in both Fontainea species (percentage of identity 92%). Another heatmap was constructed with the z-score of the average TPM value of 3 different plants of each species. To determine statistically significant differences between leaf and root tissue of homolog sequences, we used Microsoft Excel 2013 to conduct Student's t-test. Values are reported as the average z-score from three different plants of each Fontainea species. Significant differences: p < 0.05.
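The normalisation steps described above (TPM, log2 ratio and z-score) follow standard formulas, sketched below. This is an illustration of the arithmetic, not the CLC or ClustVis implementation; the function names, the pseudocount of 1, the root-over-leaf direction of the ratio, and the use of the sample standard deviation are all assumptions made for the example.

```python
import math
from statistics import mean, stdev


def tpm(raw_counts, lengths_bp):
    """Convert raw read counts to Transcripts Per Million.

    Each count is divided by transcript length, then the length-scaled
    rates are rescaled so that they sum to one million.
    """
    if len(raw_counts) != len(lengths_bp):
        raise ValueError("counts and lengths must have equal length")
    if any(length <= 0 for length in lengths_bp):
        raise ValueError("transcript lengths must be positive")
    rates = [count / length for count, length in zip(raw_counts, lengths_bp)]
    total = sum(rates)
    if total == 0:
        raise ValueError("all counts are zero")
    return [rate / total * 1e6 for rate in rates]


def log2_ratio(root_tpm, leaf_tpm, pseudocount=1.0):
    """log2 of root over leaf abundance; the pseudocount avoids log(0)."""
    return math.log2((root_tpm + pseudocount) / (leaf_tpm + pseudocount))


def z_scores(values):
    """Centre and scale one gene's values across samples (for a heatmap)."""
    mu = mean(values)
    sd = stdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(value - mu) / sd for value in values]


if __name__ == "__main__":
    # Hypothetical counts for three transcripts of differing length.
    values = tpm([100, 300, 50], [1000, 1500, 500])
    print([round(v) for v in values])
    print(round(log2_ratio(values[1], values[0]), 2))
    print([round(z, 2) for z in z_scores(values)])
```

Positive log2 ratios under this convention would indicate higher abundance in root than in leaf; reversing the arguments gives the opposite sign.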

Phylogenetic and quantitative analysis of diterpenoid CYP450
Previously identified diterpenoid CYP450 genes from members of the Euphorbiaceae family were acquired from NCBI (www.ncbi.nlm.nih.gov), then used to identify Fontainea diterpenoid CYP450 genes by homology. Identified genes were used to construct a phylogenetic tree using Geneious 11.02 software following ClustalW alignment. The phylogenetic tree was constructed using the maximum likelihood method, as described above. For quantitative analysis, RNA-seq data from leaf and root tissue of 12 F. picrosperma individuals were used. The high-quality cleaned reads were aligned to the F. picrosperma reference transcriptome using CLC Genomics Workbench 11.01 following default settings.

Declarations
Availability of data and materials
The raw sequence data from this study have been deposited in the publicly accessible NCBI Sequence Read Archive (SRA) database as accession number PRJNA687112. All data generated and used in this article are included as Additional Figures 1-3 and Additional Files 1-2.

Funding
The authors acknowledge financial support from EcoBiotics Ltd and the University of the Sunshine Coast.

Author Contributions
SAM performed the experiments, analysed the data, wrote the paper, and prepared figures and/or tables. AHK helped in transcriptome analysis and TDT conducted the metabolomics analysis. The project was conceptualized by SMO, PWR and SFC. All authors contributed to drafting the manuscript and approval of the submitted version.

Ethics approval and consent to participate
Not applicable.

Consent for publication
Not applicable.
Competing Interests
SAM, AHK, SFC and TDT declare no competing interests. PWR is a shareholder and Executive Director of EcoBiotics Ltd and QBiotics Group. SMO is a shareholder and Non-Executive Director of QBiotics Group.

Table 1. List of full-length Fontainea picrosperma CYP450s identified in this study. Cellular location of the protein predicted using the TargetP program. 'C': chloroplast; 'S': secreted; '*': unknown and '-' not secreted.

Graph showing gene expression of diterpenoid genes across 12 biological replicates of Fontainea picrosperma in leaf and root tissue. Significant differences: * p < 0.05 and ** p < 0.01. TPM, transcripts per million.