Figure 1 | BMC Plant Biology

Figure 1

From: Impact of recurrent gene duplication on adaptation of plant genomes

Workflow overview. (a) Example of a protein tree as found in GreenPhylDB. The Arabidopsis sequences 7–12 (cluster B) are related only by duplication (nodes with red boxes) and are therefore ultraparalogs (UP; red lines). Sequences related only by speciation are superorthologs (SO; blue lines): sequences 1–6 (cluster A), 13 and 14 (cluster C), and 15–17 (cluster D). Sequences 13 and 15, for example, are paralogs (they are related by duplication) but not ultraparalogs (a speciation event occurred after the duplication). Only clusters containing at least six sequences (clusters A and B, bold) were used for further analysis. The dashed lines indicate that SO and UP clusters can come from the same GreenPhyl tree (SO1 and UP1 datasets) or from separate trees (SO2 and UP2 datasets). (b) The corresponding CDS sequences were downloaded for each cluster. The clusters were aligned using PRANK [71] and cleaned with GUIDANCE [42]. (c) For all alignments, phylogenetic trees were created using PhyML [76]. (d) In all alignments, positive selection was inferred on codons using PAML's codeml [74] and on branches using mapNH [78, 79].
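The cluster definitions in panel (a) can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes each internal node of the tree already carries an event label ("D" for duplication, "S" for speciation); the `Node` class, the helper names and the toy tree are all hypothetical and do not reproduce the tree in the figure. A subtree counts as an ultraparalog cluster when every internal node is a duplication, and as a superortholog cluster when every internal node is a speciation; only the largest such subtrees with at least six sequences are kept.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical tree node: leaves have a name, internal nodes an event."""
    name: Optional[str] = None    # sequence label (leaves only)
    event: Optional[str] = None   # "D" = duplication, "S" = speciation
    children: List["Node"] = field(default_factory=list)


def leaf(name):
    return Node(name=name)


def dup(*children):
    return Node(event="D", children=list(children))


def spec(*children):
    return Node(event="S", children=list(children))


def leaves(node):
    """Return the leaf names below a node, left to right."""
    if not node.children:
        return [node.name]
    return [n for child in node.children for n in leaves(child)]


def is_pure(node, event):
    """True if every internal node in this subtree has the given event."""
    if not node.children:
        return True
    return node.event == event and all(is_pure(c, event) for c in node.children)


def maximal_clusters(node, event, min_size=6):
    """Collect the largest subtrees made of a single event type.

    event="D" yields ultraparalog clusters, event="S" superortholog clusters.
    Clusters with fewer than min_size sequences are discarded.
    """
    if not node.children:
        return []
    if is_pure(node, event):
        members = leaves(node)
        return [members] if len(members) >= min_size else []
    return [c for child in node.children
            for c in maximal_clusters(child, event, min_size)]


# Toy tree (assumption, for illustration only).
so_clade = spec(spec(spec(leaf("s1"), leaf("s2")), leaf("s3")),
                spec(leaf("s4"), spec(leaf("s5"), leaf("s6"))))
up_clade = dup(dup(leaf("u1"), leaf("u2")),
               dup(dup(leaf("u3"), leaf("u4")), dup(leaf("u5"), leaf("u6"))))
mixed = dup(spec(leaf("m1"), leaf("m2")), spec(leaf("m3"), leaf("m4")))
root = dup(so_clade, dup(up_clade, mixed))

up_clusters = maximal_clusters(root, "D")
so_clusters = maximal_clusters(root, "S")
```

In this toy tree, `m1` and `m3` are paralogs but not ultraparalogs, because speciation nodes sit below the duplication that separates them, and the two small speciation-only pairs are dropped by the six-sequence threshold.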
