Phylogenetic relationships in the YlmG family of proteins. (A) Amino acid sequence alignment of the YlmG family. The amino acid sequences were collected from the National Center for Biotechnology Information database. The alignment includes YlmG family proteins of A. thaliana (ATH), the red alga Cyanidioschyzon merolae (CME), the cyanobacterium S. elongatus PCC 7942 (S7942), and S. pneumoniae (SPN). The locus IDs or GI numbers of the sequences are indicated with the name of the species. (B) Phylogenetic tree of the YlmG family. The tree shown is the maximum-likelihood tree constructed with the PHYML program. The numbers at selected nodes are posterior probabilities from Bayesian inference (left) and local bootstrap values from the maximum-likelihood analysis (right). The tree includes proteins of photosynthetic eukaryotes: A. thaliana (ATH), Oryza sativa (OSA), Chlamydomonas reinhardtii (CRE), Ostreococcus tauri (OTA), C. merolae (CME), Thalassiosira pseudonana (TPS), and Phaeodactylum tricornutum (PTR); apicomplexans: Plasmodium vivax (PVI) and Theileria annulata (TAN); cyanobacteria: Synechocystis sp. PCC 6803 (S6803), S. elongatus PCC 7942 (S7942), Gloeobacter violaceus PCC 7421 (G7421), and Prochlorococcus marinus str. MIT 9312 (P9312); and other bacteria: Escherichia coli (ECO), Bacillus subtilis (BSU), Streptococcus pneumoniae (SPN), Chlamydophila caviae (CCA), Rhizobium etli (RET), Rhodospirillum rubrum (RRU), Caulobacter sp. K31 (C-K31), Chloroflexus aggregans (CAG), Chromohalobacter salexigens (CSA), and Pseudomonas syringae (PSY). The locus IDs or GI numbers of the sequences are shown with the name of the species. White boxes indicate non-photosynthetic organisms. A single asterisk (*) marks proteins whose gene disruption had no effect on photosystem activity, while a double asterisk (**) marks proteins whose gene disruption reduced photosystem activity [29, 30, 32].
Posterior probabilities and bootstrap values for all branches are shown in Additional file 1.