Analysis of the WUSCHEL-RELATED HOMEOBOX gene family in the conifer picea abies reveals extensive conservation as well as dynamic patterns

Background Members of the WUSCHEL-RELATED HOMEOBOX (WOX) gene family have important functions during all stages of plant development and have been implicated in the development of morphological novelties during evolution. Most studies have examined the function of these genes in angiosperms and very little is known from other plant species. Results In this study we examine the presence and expression of WOX genes in the conifer Picea abies. We have cloned 11 WOX genes from both mRNA and genomic DNA and examined their phylogenetic relationship to WOX genes from other species as well as their expression during somatic embryogenesis and in adult tissues. Conclusions Our study shows that all major radiations within the WOX gene family took place before the angiosperm-gymnosperm split and that there has been a recent expansion within the intermediate clade in the Pinaceae family. Furthermore, we show that the genes from the intermediate clade are preferentially expressed during embryo development in Picea abies. Our data also indicates that there are clear orthologs of both WUS and WOX5 present in the P. abies genome.


Background
The morphogenesis of the plant body is a developmental process that continues throughout the entire lifespan of the plant. Different morphologies in different plant lineages are dependent on developmental differences that have a molecular basis in the function and/or expression of regulatory genes. In seed plants major patterning events take place during embryo development, the embryo is polarised with an apical shoot meristem, giving rise to all above ground parts, and a basal root meristem, giving rise to the root system. The specification of the body plan during embryogenesis requires the coordination of cell fates according to their position along the embryo axis, from the apical to the basal part. During post-embryonic development new organs are formed successively from the shoot-and root meristems at the same time as the meristems are maintained. The genetic control of patterning and morphogenesis during plant development is dependent on a large number of genes. Many aspects of the molecular regulation of plant development have been analysed, primarily in the model angiosperm species Arabidopsis thaliana, whereas very little is known in plant species outside the angiosperms.
In angiosperms members of the WUSCHEL-RELATED HOMEOBOX (WOX) gene family play important roles determining cell fates during plant development. The WOX gene family is characterised by the phylogenetic relationship of the homeodomain of these genes. It is present only in the "green" lineage comprising land plants and green algae. The number of WOX genes present in the genomes of different species correlates to some degree with the complexity of the species body pattern. Only one WOX gene is found in the genomes of the green micro algae Ostreococcus tauri and O. lucimarinus, three WOX genes are present in the genome of the moss Physcomitrella patens, nine in the lycophyte Selaginella moellendorffii, whereas the genome of A. thaliana contains 15 WOX genes. Phylogenetic analyses have divided the WOX gene family into three major clades [1][2][3]. Only one clade contains genes from early diverging plants, e.g. the moss P. patens, as well as the green algae O. tauri and O. lucimarinus, and this clade is therefore referred to as the ancient clade [1]. The ancient clade also contains representatives from A. thaliana, namely AtWOX10, 13, and 14. The other two clades are termed the intermediate clade, containing AtWOX8, 9,11 and 12, and the modern clade, containing AtWUS and AtWOX1-7 [1].
The role of the WOX genes during plant development has been studied to some detail in A. thaliana as well as in Petunia hybrida, Zea mays and Oryza sativa (e.g. [4][5][6][7]). The founding member of the gene family, A. thaliana WUSCHEL (AtWUS), has been shown to act in the organising centre of the shoot apical meristem (SAM) to maintain the stem cell population [4,8]. A similar role has been proposed for AtWOX5 in the root apical meristem [9] and for AtWOX4 in the cambial meristem [10,11]. Other WOX genes have been implicated in the patterning and morphogenesis of the early embryo, e.g. AtWOX2, AtWOX8 and AtWOX9 [12][13][14] and in regulating flower development and/or inflorescence architecture, e.g. AtWOX1, AtWOX3, AtWOX6 and P. hybrida EVERGREEN (PhEVG) and SISTER OF EVERGREEN (PhSOE) [5]. The WOX genes function as transcriptional regulators and at least some can act as both activators and repressors depending on tissue type or developmental stage [15]. Furthermore, all WOX genes examined show very specific expression patterns, both spatially and temporally, which are important for their functions (e.g. [16]). Several studies suggest that the WOX gene family may be involved in the evolution of developmental processes [3,5,17,18]. Thus, analysis of the tissue specific expression of WOX genes is of interest to elucidate similarities and differences in the regulatory mechanisms of plant development also in species outside the angiosperms.
The WOX genes in O. tauri (OtWOX) and P. patens (PpWOX01-03), all belonging to the ancient clade, have been shown to be constitutively expressed in all tissues and at all developmental stages analysed [3]. The A. thaliana ancient clade genes AtWOX13 and AtWOX14 are also expressed in most tissues and developmental stages (roots, shoots and reproductive organs), although the expression pattern is limited to certain cells within an organ [3]. In the conifer Picea abies the intermediate clade gene PaWOX8/9 has been shown to be preferentially expressed during embryo development [19]. Furthermore, the P. abies modern clade gene PaWOX2 was shown to have a similar expression pattern as PaWOX8/ 9 [19,20]. The A. thaliana genes AtWOX8 and AtWOX9, which belong to the intermediate clade are expressed in specific domains from the first cell division of the proembryo until the end of embryogenesis and the same is true for the A. thaliana modern clade gene AtWOX2 [13]. In the fern Ceratopteris richardii it has been shown that CrWUL, a gene with strong similarity to AtWUS, is expressed in the shoot meristem [21]. Another AtWUS homolog, GgWUS from the gymnosperm Gnetum gnemon, is expressed in both the root and shoot meristems. Furthermore, heterologous expression of GgWUS in A. thaliana confers the same phenotype as the overexpression of AtWUS [2]. Thus, although little is known about the WOX genes outside the angiosperms most studies have revealed extensive similarities, at least with regard to gene expression, suggesting a conserved function for these genes. However, the shoot specific expression of WUS and root specific expression of WOX5 seem to be specific to angiosperms as the gymnosperm WUS homologs GgWUS, Ginkgo biloba WUS (GbWUS), and Pinus sylvestris WUS (PsWUS) were shown to be expressed in both the shoot and the root [2]. Furthermore, Nardmann et al. (2009) could only detect one homolog of WUS/WOX5 in these gymnosperms suggesting that the WUS and WOX5 genes are the result of an angiosperm specific gene duplication [2].
In this paper we present an analysis of the WOX gene family in the conifer P. abies. We have identified 11 WOX homologs and analysed the phylogenetic relationship of these genes to other known WOX genes. Our phylogenetic analyses have identified one member of the ancient clade and several members in the intermediate and modern clades. The P. abies WOX genes of the intermediate clade group together apart from the angiosperm genes, whereas there are clear P. abies orthologs of most angiosperm modern clade genes. Furthermore, we have analysed the expression of the different WOX genes in different tissues and developmental stages with a special focus on somatic embryo development.

Results
Cloning and phylogeny of P. abies WOX genes We have successfully isolated 11 P. abies WOX genes using degenerate primers targeting the homeodomain. The isolated homeodomain sequences were extended by genome walking to acquire the full genomic sequences (exons and introns). Each genomic locus was then cloned using gene specific primers to confirm the genome walking and the exon-intron pattern was predicted. The gene structure of the isolated genes is presented in Figure 1. Two of these genes have been described as cDNAs earlier, namely PaWOX2 and PaWOX8/9 [19,20]. The P. abies WOX genes were named according to their closest homolog in A. thaliana as indicated by phylogenetic analyses (see below) except for PaWOX8A, PaWOX8B, PaWOX8C, and PaWOX8D, which were named after AtWOX8, which is the A. thaliana gene in the intermediate clade with the lowest number, and A-D because the P. abies genes are more similar to each other (65% average pairwise identity, also see Additional file 1) than they are to any A. thaliana genes. Due to long introns we were not able to amplify the full-length genomic sequence of PaWOX13 by either genome walking or with gene specific primers, thus the gene model is made from a combination of genomic and transcript data and lacks sequence in both introns ( Figure 1).
We have cloned transcripts containing a full coding sequence for all genes except for PaWUS, PaWOX8C and PaWOX8D for which we could not detect any transcripts in the tissues analysed. According to our predictions these latter three genes have intact open reading frames (ORFs), although the predicted PaWUS intron is more than 2 kb and contains repetitive sequences ( Figure 1). We were not able to detect any transcript of PaWUS using primers targeting either the homeobox alone or targeting the entire ORF.
To put the P. abies WOX genes in a phylogenetic context the homeodomain sequences from the P. abies WOX genes were aligned to a subset of homeodomain sequences from other WOX genes. We included sequences from green algae, bryophytes, lycophytes, ferns, gymnosperms, and angiosperms and used two different phylogenetic methods and both nucleotide and protein alignments to infer the phylogeny. The analyses showed good support for the three major clades of WOX genes, the ancient clade, intermediate clade, and modern clade ( Figure 2). However, there was some uncertainty as to the placement of the fern sequences CrWOXA and CrWOXB in the intermediate clade since the support is quite low, especially in the Maximum Likelihood (ML) analyses.
According to the phylogenetic analysis there was one P. abies representative in the ancient clade, PaWOX13, and several representatives of the intermediate clade (PaWOX8/ 9 and PaWOX8A-D) and the modern clade (PaWOX2-5, PaWUS) ( Figure 2). In order to be able to elucidate the relationships in the intermediate and modern clades in more detail new sub-trees including only members from the respective clades were generated using additional sequence outside the homeodomain (Figures 3 and 4).
In the intermediate clade there were no clear P. abies orthologs to the angiosperm sequences ( Figure 3). The ML analysis using nucleotide sequences supported a monophyletic group for the P. abies genes (data not shown) whereas the other methods suggested a division into two sub-clades. There was good support for the monophyly of the genes in the clade containing AtWOX11 and 12 ( Figure 3). However, the AtWOX8 and 9 genes, as well as the PhEVG, PhSOE and VvWOX9 genes, did not show a monophyletic relationship. It is therefore difficult, with this limited data set, to conclude if the P. abies genes are more closely related to AtWOX11 and 12, closer to AtWOX8 and 9, or if they should group as sister to the angiosperm sequences.
In the modern clade there was almost a complete set of orthologous sequences to the A. thaliana sequences ( Figure 4). The support was good for the orthology of PaWUS and AtWUS, as well as for PaWOX3 and AtWOX3. The support was not very strong, but the topology and number of genes still support the orthology of PaWOX2 and AtWOX2, PaWOX4 and AtWOX4, and PaWOX5 and AtWOX5. The only modern clade A. thaliana sequences we did not find orthologs for in P. abies were AtWOX1, AtWOX6 and AtWOX7. The support for the orthology of PaWOX2 and AtWOX2 was rather strong in the full WOX gene tree ( Figure 2) but not as strong in the modern clade sub-tree ( Figure 4). The monophyly of the WOX4 clade had a low support in the full WOX gene tree ( Figure 2). In the modern clade sub-tree there was no support for the monophyly of the WOX4 genes. Although the monophyly is in question PaWOX4 still groups closer to AtWOX4 than to any other A. thaliana WOX gene. The support for PaWOX5 to group with AtWOX5 was not very strong (Figures 2 and 4). However, there was good support for the WUS/WOX5 sub-clade ( Figure 4) including all WUS and WOX5 orthologs from the different species.

Expression of WOX genes in P. abies
In order to get a better understanding of where the P. abies WOX genes are active we analysed their expression in different tissues and at various developmental stages using quantitative real-time PCR (qPCR) ( Figure 5). PaWOX8/9 was mainly expressed during embryo development although some expression could be detected  outside the embryo at very low levels ( Figure 5A). PaWOX2 and PaWOX8A showed a very specific expression pattern with expression only detectable in proembryogenic masses (PEMs) and late embryos ( Figure 5A). No expression was detected in early or mature embryos. PaWOX8A expression could also be detected in young needles although this expression was extremely low. Thus, PaWOX2, PaWOX8A and PaWOX8/9 are all more or less embryo specific. PEMs were the only tissue where we could detect expression of PaWOX8B ( Figure 5A). The high sequence similarity between PaWOX8B, C, and D made it very difficult to design gene specific primers, therefor the expression we detected could potentially also include PaWOX8C and D and not only B. However, we cloned and sequenced several qPCR products (16) and they all corresponded to PaWOX8B.
The highest expression of PaWOX3 was detected in mature embryos and shoot tips whereas a very low expression was detected in PEMs, early embryos and root tips ( Figure 5B). PaWOX4 showed expression in all d. e. c. f. g. h. a. b. c. d. e. f. g. h.

Modern clade
Intermediate clade Ancient clade tissues analysed with a maximum in shoot tips and cambium ( Figure 5B). PaWOX5 expression was only detected in mature embryos, shoot tips and root tips ( Figure 5B). Interestingly, PaWOX5 was the only gene not expressed in PEMs. PaWOX13 expression was detected in all tissues analysed ( Figure 5B). The expression was somewhat higher in old needles (last years shoot) and in cambial tissue whereas the expression levels in other tissues seemed to be very similar.

Discussion
Previous studies of the WOX gene family in plants have identified three major clades; the ancient clade with representatives from all lineages of land plants as well as green algae, and the intermediate and modern clades with representatives found only in ferns and seed plants [1][2][3]21]. In our study we have identified 11 P. abies WOX genes representing all major clades of WOX genes. Only one P. abies gene groups into the ancient clade (Figure 2), PaWOX13, which is in accordance to what has been reported for other gymnosperms (G. gnemon, G. biloba, and P. sylvestris) [2]. At this point we cannot exclude the existence of additional genes of the ancient clade in gymnosperms. We could detect expression of PaWOX13 in all tissues tested, both during embryogenesis and in more adult tissues. The ancient clade genes AtWOX13 and AtWOX14 from A. thaliana are expressed in most tissues, although there is no clear expression data from the embryo stage [3]. Thus at this scale of resolution the expression patterns of the ancient clade genes are similar in P. abies compared to A. thaliana, i.e. a general expression in most tissues.
We could identify five different genes belonging to the intermediate clade (PaWOX8/9 and PaWOX8A, B, C, and D). The phylogenetic analyses show that these genes group very closely even though there is not enough support to confirm if they are monophyletic or not (Figure 3). Several of the P. abies intermediate clade genes are very similar to each other, i.e. PaWOX8B-D, especially in the coding regions (Additional file 1), suggesting that they are the result of very recent gene duplications. We could identify at least two different sequence variants of each gene, which we believe are allelic (data not shown). Therefore we cannot exclude the existence of more intermediate clade genes in P. abies. Based on this we suggest that the P. abies intermediate clade genes are the result of an expansion that happened after the split between gymnosperms and angiosperms. Interestingly, we could find at least four different EST sequences from Pinus taeda in Genbank (accession numbers DR690333, DT636589, DR692518, DT634290) belonging to the intermediate clade. Phylogenetic analyses group these genes within the P. abies intermediate clade genes suggesting that the expansion might be common to species within the Pinaceae family (Additional file 2). Whether this is common also to other gymnosperm lineages is still an open question.
PaWOX8A and PaWOX8/9 show expression preferentially during embryo development with only a very low expression for PaWOX8/9 in non-embryo tissues. These expression patterns could be compared with those for AtWOX8, 9, 11, and 12 since the phylogenetic analyses do not conclusively show which of these A. thaliana gene(s) are the closest homologs of the P. abies genes. AtWOX8 and AtWOX9 are expressed during embryo development from very early stages and throughout development [13]. AtWOX8 expression is very low outside the embryo whereas AtWOX9 is expressed also in more adult tissues where it is involved in the maintenance of the SAM [22,23]. The expression of AtWOX11 and AtWOX12 has so far not been described in any detail but array data suggest that at least AtWOX11 is expressed during embryo development (Arabidopsis eFP Browser, [24]). Thus, the expression of PaWOX8/9 and PaWOX8A seems to be comparable to the A. thaliana intermediate clade genes although their expression is more limited to embryo development. Furthermore, the before mentioned P. taeda EST sequences were all derived from embryogenic tissues, which might suggest that the embryo specific function of these genes is common within the Pinaceae family. PaWOX8B expression was only detected in PEMs, which is a very specific tissue type for embryogenic cell cultures and therefor we do not know if the gene has any specific function during the life cycle of the plant. We could not detect any expression of PaWOX8C and D, which might suggest that these genes are inactive pseudogenes. However,  there is nothing in the genomic sequences of these loci to support this notion. Taken together, PaWOX8B-D might all be silenced, or at least partially silenced, recent copies of either PaWOX8A or PaWOX8/9.
Our phylogenetic analyses identified an almost complete set of orthologous sequences in the modern clade between P. abies and A. thaliana. The only A. thaliana sequences we could not find orthologs of were: AtWOX1, AtWOX6 and AtWOX7. AtWOX7 is most likely the result of a very recent duplication in the A. thaliana lineage since AtWOX5 and AtWOX7 are more closely related to each other than to WOX5 orthologs from O. sativa or V. vinifera (Figure 4). The lack of orthologous sequences to AtWOX1 and 6 might be because we were unable to amplify them with our primers, alternatively there has been either an angiosperm specific diversification or a gymnosperm specific gene loss.
The orthology of PaWOX2 with AtWOX2 is well supported in the full WOX tree (Figure 2 support is lower in the modern clade sub-tree ( Figure 4). In A. thaliana AtWOX2 is expressed during embryo development and at low levels outside the embryo [13]. This is similar to the pattern seen for PaWOX2 ( Figure 5). However, in a previous study PaWOX2 was shown to be expressed both in embryos and seedlings [20]. We could recapitulate the results by Palovaara & Hakman [20] when using the same primers as they used whereas our primers only detected expression in PEMs and late embryos (Additional file 3). These differences might seem strange at first but the main difference between our primers and the primers of Palovaara & Hakman [20] is that one of our primers spans an exonintron boundary and does not amplify genomic DNA or unspliced transcripts whereas the primers of Palovaara & Hakman [20] are situated within the same exon and therefore cannot discriminate between processed and unprocessed transcript or genomic DNA (Additional file 3). It will be interesting to learn if the difference between our results and those of Palovaara & Hakman (2008) is due to that PaWOX2 is regulated post-transcriptionally. The orthology of PaWOX3 and AtWOX3 is well supported (Figure 4), which is suggestive of a conserved function for these genes in the different lineages. We could detect expression of PaWOX3 mainly in shoot tips and mature embryos, which, at this level of resolution, is similar to the expression reported for AtWOX3 as well as for a Z. mays WOX3 ortholog [25,26].
In our phylogenetic analyses the WOX4 clade has the lowest support (Figures 2 and 4). Furthermore, we could not find a clear WUS-box in the deduced PaWOX4 protein (see Additional file 4). However, since AtWOX4 is the closest A. thaliana homolog we named the P. abies gene in this clade PaWOX4. The expression of PaWOX4 is not limited to any particular tissue types, rather we could detect expression in most tissues. Interestingly, high expression was detected in the cambium, which might suggest that the function of WOX4 as a regulator of the cambial meristem is conserved between angiosperms and gymnosperms.
In contrast to other analysed gymnosperms (G. gnemon, G. biloba and P. sylvestris) [2] we found sequences grouping with both WUS and WOX5 (Figure 4). The phylogenetic support for the WUS clade, including PaWUS, is good (Figure 4), and is further supported by the fact that the PaWUS protein sequence contains an extra amino acid residue in the homeodomain, which is specific to proteins in the WUS clade [2]. In angiosperm WUS proteins, as well as in PaWUS and GbWUS, this extra amino acid is a tyrosine whereas it is a histidine in GgWUS and the fern sequences CrWUL and CaWUL [21]. The PaWOX5 and PsWUS proteins on the other hand contain no extra amino acid, consistent with them not being WUS orthologs, but rather orthologous to WOX5. Thus, the results presented here together with previous analyses [2], suggest that the gene duplication leading to distinct WUS and WOX5 genes predates the gymnosperm-angiosperm split. Although our data support the notion that there was both a WUS and WOX5 ortholog present in the last common ancestor of angiosperms and gymnosperms it might not mean that the division of function between the apical meristems of shoots and roots was present. Expression of NjWUS, from the basal angiosperm Nymphea jamesoniana, has only been detected in the shoot [2] whereas expression of the gymnosperm genes GgWUS and PaWOX5 has been detected in both shoot tips and root tips ( [2] and this study). This suggests that the subfunctionalisation of WUS and WOX5 took place at the base of the angiosperm lineage after the split between angiosperms and gymnosperms. It is interesting to note that in G. biloba and G. gnemon it is the WUS ortholog that is expressed whereas the existence of a WOX5 ortholog is uncertain [2] and in the conifers P. sylvestris [2] and P. abies it is the WOX5 ortholog that is transcribed and we could not find any expression of the WUS ortholog. Further investigations aimed at understanding the tissue specific expression and function of the gymnosperm WUS/ WOX5 genes might reveal specific functions for these genes in gymnosperms compared to angiosperms and if there are differences within the gymnosperms.

Conclusions
The results presented here show that the major diversifications within the WOX gene family happened before the split between angiosperms and gymnosperms approximately 300 million years ago. Together with previous studies it confirms the dynamic patterns and lineage specific evolutionary trajectories within certain branches of the WOX gene tree. First, it shows that there has been an independent expansion of the intermediate clade in the Pinaceae family. Second, it shows that there are clear orthologs of both WUS and WOX5 present in the gymnosperm P. abies. Thus, further investigations into the specific function of these genes might give insights into lineage specific developmental and morphogenetic differences.

Plant material
Picea abies L. Karst embryos and plants were grown as follows. Proliferating embryogenic cultures from cell line 28:05 were cultivated on solidified medium supplemented with the plant growth regulators (PGRs) auxin and cytokinin. To stimulate differentiation of early embryos the cultures were transferred to medium lacking PGRs and for further development of the embryos the cultures were transferred to medium supplemented with abscisic acid (ABA) [27]. Somatic embryo plants regenerated from cell line 28:05 were grown under accelerated growth conditions for seven growth periods in a phytotron [28]. Samples from the plants were collected at the end of the growth period during the second week (out of nine) under autumn growth conditions. Seedlings were derived from open pollinated seeds from a seed orchard germinated for two weeks in vermiculite at 22°C in micro-greenhouses under daylight conditions (approx. 16 h light).
Samples from nine different stages or tissues were collected for RNA extraction. Samples from embryogenic cultures were annotated as follows: proliferating Proembryogenic masses (PEM) in the presence of PGRs; Early embryos (EE) one week after withdrawal of PGRs; Late embryos (LE) two weeks after transfer to ABA containing medium; Mature cotyledonary embryos (ME) five weeks after transfer to ABA containing medium. The following tissues were collected from somatic embryo plants: shoot tips (ST), i.e. the topmost 2-3 mm of the elongating shoot; cambium (CM) dissected from branches where the bark was first removed; young needles, from elongating shoots (YN) and two year or older needles (ON). Root tips (RT) were collected from seedlings and the topmost 2-3 mm were sampled. The sampling was performed mid-day and all tissues were frozen in liquid nitrogen and stored at −80°C after collection.

DNA and RNA isolation and cDNA synthesis
Genomic DNA was isolated from PEMs by DNeasy plant mini kit (Qiagen) or from needles from somatic embryo plants using CTAB (adapted from [29]). Total RNA was extracted from the samples described above using spectrum plant total RNA kit (Sigma) or RNeasy plant mini kit (Qiagen). DNase I (Sigma) was used to remove any genomic contamination. The RNA quality was checked by agarose gel electrophoresis and quantified by spectrophotometry. For each sample, 1 μg of total RNA was reverse transcribed with ReversAid H Minus First Strand cDNA Synthesis Kit (Fermentas) using 1:1 mixed random primer and oligo-dT primer (Fermentas) according to the manufacturer's instructions.
Cloning of P. abies WOX genes P. abies WOX genes were identified by PCR using degenerate primers described in [30] as well as additional primers targeting the homeodomain in genomic DNA isolated from embryogenic cultures (Primer sequences can be found in Additional file 5). PCR-fragments of expected size (approx. 160 bp) were cloned using TOPO-kit from invitrogen and sequenced.
Each unique WOX homeodomain was extended by genome walking (5' and 3') using either Thermal Asymmetric Interlaced PCR (TAIL-PCR) [31] or Genome Walker Universal Kit (Clontech). Each gene was subsequently cloned using gene specific primers and sequenced. The gene structure was determined from the corresponding transcripts using gene specific primers or by 5'-and 3' extension using 5'/3' RACE technology (GeneRacer kit, Invitrogen). All cDNAs were cloned from PEM with the exceptions of PaWOX3, which was isolated from shoot tips (ST), and PaWOX5, which was isolated from root tips (RT). Gene predictions were performed with FGENESH (http://linux1.softberry.com/berry.phtml) trained with dicot plants (Arabidopsis thaliana). Primer sequences are listed in Additional file 5. Gene and cDNA accessions are listed in Additional file 6.
Sequences were aligned using the translation alignment tool and the MAFFT plug-in in Geneious (Biomatters Ltd. Auckland, New Zealand) [32] and edited manually. For the large tree only the homeodomain was used, whereas additional sequences outside the homeodomain was used for the intermediate clade and modern clade sub-trees (alignments can be found in Additional file 4). Parts of the sequences that were difficult to align were excluded from the analyses to reduce noise. Furthermore, only genes for which the full-length sequence was known was used for the construction of the sub-trees.
Phylogenetic analyses were performed using both Bayesian inference (BI) and maximum likelihood (ML). For the BI analysis we used MrBayes (v. 3.1.2) [33]. The results were viewed using the MrBayes plug-in in Geneious. The nucleotide analyses were performed using the GTR model with a 4 category discrete gamma-distribution of rate variation among sites (GTR + G model). The data set was partitioned into three sets corresponding to codon position and rates were allowed to differ across partitions. The JTT model with gamma-distribution of rates among sites was chosen for the protein analysis with the aid of ProtTest 2.4 [34][35][36]. For the large tree the analysis was run for 10.000.000 generations with a burn-in of 1250. For the intermediate clade and modern clade sub-trees the analysis was run for 3.000.000 generations with a burn-in of 600 and 1250 respectively. The results from the intermediate clade runs were consistent despite the low burn-in (see Additional file 2). The ML analyses were performed using raxmlGUI 1.0 [37,38]. For nucleotide data the GTR gammacat model was used and for protein data we used the JTT + gamma model. Support was calculated using the rapid bootstrap method implemented in RAxML with