The loss of photosynthesis pathway in a holoparasitic plant Aeginetia indica revealed by plastid genome and transcriptome sequencing

Background: With three origins of holoparasitism, Orobanchaceae provides an ideal system to study the evolution of the holoparasitic lifestyle in plants. Because holoparasitic plants have lost the capability of photosynthesis, the evolution of holoparasitism can be revealed by plastid genome degradation and the coordinated changes in the nuclear genome. Among the three clades with holoparasitic plants in Orobanchaceae, only Clade VI has no available plastid genome sequences for holoparasitic plants.
Results: In this study, we sequenced the plastome and transcriptome of Aeginetia indica, a holoparasitic plant in Clade VI of Orobanchaceae, to study its plastome evolution and the corresponding changes in the nuclear genome in response to the loss of photosynthetic function. Its plastome is reduced to 86,212 bp in size, and almost all photosynthesis-related genes have been lost. Most protein-coding genes in the plastome showed a signal of relaxed purifying selection. Plastome and transcriptome analyses indicated that the photosynthesis pathway is completely lost, and that the porphyrin and chlorophyll metabolism pathways are partially retained, although chlorophyll synthesis is not possible.
Conclusions: Our study suggests the loss of photosynthesis-related functions in A. indica in both the nuclear and plastid genomes. The Aeginetia indica plastome also provides a resource for comparative studies on the repeated evolution of holoparasitism in Orobanchaceae.

Plastid genomes of holoparasites in Clade III and Clade V of Orobanchaceae differ markedly in genome size and gene content. Plastid genome sizes of holoparasites in Clade III range from 45,673 bp (Conopholis americana) to 120,840 bp (Orobanche californica) [9]. However, the plastid genome of Lathraea squamaria from Clade V is 150,504 bp [12], much larger than those in Clade III. The number of intact genes in the plastid genomes of Conopholis americana and Orobanche species ranges from 21 to 34 [9], and almost all genes related to photosynthesis (pet, psa, psb, and rbcL) have been lost or have become pseudogenes. In contrast, the plastid genome of Lathraea squamaria has 46 intact genes, including many genes related to photosynthesis (such as psa, psb and pet). This might be because the holoparasitic lineage in Clade V is younger than those in Clade III [12].
In addition to plastome degradation, the nuclear genomes of holoparasitic plants are also expected to evolve in response to the loss of photosynthetic capability, since the photosynthesis-related genes in the plastid genome interact with many genes in the nuclear genome [7,13]. Changes in the expression of nuclear genes can be revealed by transcriptome sequencing. For example, the expression of genes in the photosynthesis and chlorophyll synthesis pathways has been examined in some parasitic plants [7,14,15].
Aeginetia is a small holoparasitic genus of Orobanchaceae and it consists of about four species distributed in southern and southeastern Asia [16]. According to the phylogenetic analyses of Orobanchaceae, Aeginetia, along with Hyobanche, Harveya and Christisonia, forms a monophyletic holoparasitic lineage in Clade VI [4,5].
Aeginetia indica is the most widespread species in this genus [17]. It usually parasitizes the roots of Poaceae plants such as Miscanthus and Saccharum [18]. In a recent study, transcriptome data of A. indica were used to detect genes horizontally transferred from Fabaceae and Poaceae species [19]. So far, neither the plastid genome sequence nor the degradation of photosynthesis-related pathways has been studied in this holoparasitic plant.
In this study, we assembled the plastid genome of A. indica using Illumina short reads produced by genome skimming. We also sequenced transcriptomes from multiple tissues to examine changes in the expression of genes involved in photosynthesis. The results of this study will contribute to our understanding of the coordinated evolution of plastid and nuclear genomes and will also facilitate comparative analysis of the convergent evolution of holoparasitism in Orobanchaceae.

Complete plastid genome of A. indica
The complete plastid genome of A. indica has a typical quadripartite structure, and it is 86,212 bp in length, with 22,301 bp of the LSC region, 529 bp of the SSC region, and 31,691 bp each of the IR regions (Fig. 1). AT content of this plastid genome was 65.64%. Based on the DOGMA and GeSeq annotation, the plastid genome of A. indica contains 54 putative intact genes and three pseudogenes.
These intact genes comprise 24 tRNA genes, 4 rRNA genes, 8 rpl genes, 12 rps genes and 6 other genes, namely ycf1, ycf2, accD, matK, infA and clpP (Table 1). The three pseudogenes are ψatpA, ψatpI and ψndhB. In the LSC region, ψatpA is truncated at the 88th codon and ψatpI has a premature stop codon at the 32nd codon. ψndhB in the IR region became a pseudogene due to an internal stop codon at the 53rd codon.

Table 1
Transfer RNA genes: trnH-GUG, trnQ-UUG, trnS-GCU, trnC-GCA, trnD-GUC, trnY-GUA, trnE-UUC, trnS-UGA, trnG-UCC, trnM-CAU, trnS-GGA, trnL-UAA, trnA-UGC, trnF-GAA, trnW-CCA, trnL-UAG, trnN-GUU, trnL-CAA, trnfM-CAU, trnI-CAU, trnV-GAC, trnI-GAU, trnT-GGU, trnP-UGG
Ribosomal RNA genes: rrn4.5, rrn5, rrn16, rrn23
Other protein-coding genes: ycf1, ycf2, accD, clpP, matK, infA
Pseudogenes: ψndhB, ψatpA, ψatpI

Supplementary information
Additional file 1: Figure S1. Maximum likelihood tree of seven species in Orobanchaceae based on sequences of 20 plastid genes shared among them. Numbers at the nodes are bootstrap values. Scale in substitutions per site.
Additional file 2: Figure S2. The expression of genes in the photosynthesis pathway observed in the Aeginetia indica transcriptome. Genes with detected expression are in the red boxes. With courtesy of © www.genome.jp/kegg/kegg1.html.
Additional file 3: Figure S3. The expression of genes in the porphyrin and chlorophyll metabolism pathway detected in the Aeginetia indica transcriptome. Genes with detected expression are in the red boxes. With courtesy of © www.genome.jp/kegg/kegg1.html.
Additional file 4: Table S1. Relaxation of purifying selection in parasitic plants of Orobanchaceae based on branch model analysis of 20 protein-coding genes shared by seven species of Orobanchaceae. The likelihood ratio test was used to compare the three models (M0: one-ratio model; M2: two-ratio model; M3: three-ratio model). P-values are in bold when they are less than 0.05.
Additional file 5: Table S2. Expression level of unigenes of Aeginetia indica in the photosynthesis pathway based on transcriptome analysis.

The SSC region in the plastome of A. indica is severely reduced in size, and only two genes, rpl15 and trnL-UAG, were found in this region (Fig. 1). The two IR regions have expanded toward both the LSC and SSC regions. In L. philippensis and other autotrophic plants, an intact ycf1 gene usually spans the IR and SSC regions, and the rps8, rpl14, rpl16, rps3, rpl22 and rps19 genes are in the LSC region. In A. indica, by contrast, there is an intact ycf1 gene in each of the IR regions, and the rps8, rpl14, rpl16, rps3, rpl22 and rps19 genes have all shifted into the IR regions.
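The region lengths and gene counts reported in this section can be cross-checked with simple arithmetic. The following is a minimal sketch in Python; the function name and the dictionary layout are illustrative choices and are not part of the original analysis.

```python
# Consistency check for the plastome figures reported in the text.
# All numbers are taken from the text; names are illustrative only.

def plastome_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a quadripartite plastome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir

# Region lengths in bp, as reported for A. indica
LSC, SSC, IR = 22_301, 529, 31_691
REPORTED_TOTAL = 86_212

# Intact gene counts by category, as reported for A. indica
gene_counts = {"tRNA": 24, "rRNA": 4, "rpl": 8, "rps": 12, "other": 6}
REPORTED_INTACT_GENES = 54

if __name__ == "__main__":
    total = plastome_length(LSC, SSC, IR)
    print("regions sum to reported length:", total == REPORTED_TOTAL)
    print("gene categories sum to reported count:",
          sum(gene_counts.values()) == REPORTED_INTACT_GENES)
    # Share of the plastome occupied by the two expanded IR copies
    print("IR fraction of plastome: %.3f" % (2 * IR / total))
```

Both checks hold for the reported values, and the IR fraction makes the extent of the IR expansion described above explicit.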

Plastid genome rearrangements in A. indica
The plastomes of A. indica and L. philippensis were aligned with Mauve 2.4.0 (Fig. 2). We identified four locally collinear blocks (LCBs) between the two species, and the A. indica plastid genome has undergone two major inversions relative to L. philippensis. One is a 1,452 bp inversion in the LSC region that contains an intact accD gene. The other is a large inversion of 60,255 bp that contains an intact infA gene at the boundary of the LSC and IR B regions, the complete SSC and IR B regions, and most of the IR A region.

Relaxed purifying selection of A. indica plastid genes
A total of 20 protein-coding genes shared among the seven species in Orobanchaceae, including 10 rps genes, 7 rpl genes, and the accD, infA and matK genes, were used for phylogenetic analysis. The maximum likelihood tree was strongly supported, with bootstrap values of 100 for all branches (Figure S1). The three Striga species clustered into one clade, and Buchnera americana was sister to them. Aeginetia indica was sister to the clade consisting of these four species.
The ratio (ω) of the non-synonymous (dN) to the synonymous (dS) substitution rate can be used as an indicator of selection pressure. The two-ratio model (M2) was first compared with the one-ratio model (M0). The ω values of all genes except rpl20 and rps18 were larger in the parasitic plant branch than in the nonparasitic plant branch (Table S1), and the likelihood ratio test showed that M2 fits significantly better than M0 for nine genes, i.e. accD, infA, rpl22, rps11, rps14, rps19, rps2, rps3 and rps7, suggesting that these genes were under relaxed purifying selection in parasitic plants. Using the three-ratio branch model (M3), we found that hemiparasitic species had higher or much higher ω than holoparasitic species for 13 of 18 genes (ω values of the remaining two genes are not available), while holoparasitic species had slightly higher ω than hemiparasitic species for only five genes (Table S1). This suggests that the protein-coding genes retained in the plastome of A. indica still play important functional roles and are not under more relaxed selective pressure than those of hemiparasitic species.

Transcriptome analysis for A. indica
The photosynthesis pathway (ko00195) from the KEGG pathway database contains 63 genes (30 plastid genes and 33 nuclear genes). In the A. indica plastome, genes involved in photosystem I and II, the cytochrome b6f complex, and photosynthetic electron transport are completely lost. The only two F-type ATPase related genes (atpA and atpI) in its plastome are pseudogenes. Based on the transcriptome analysis, only 14 unigenes in the photosynthesis pathway were expressed (Table S2). The 14 genes included one gene encoding the PSII 6.1 kDa protein, seven involved in photosynthetic electron transport and six encoding components of F-type ATPase.

Four genes, atpA, clpP, rpl2 and rpl23, contain introns, which is consistent with the retention of matK, whose product functions in intron splicing.
The loss of photosynthesis-related genes is a common phenomenon in holoparasitic plants, such as Aphyllon and Orobanche [10,11]. Loss of housekeeping genes has also been observed in other holoparasitic plants. For example, the plastid genome of Balanophora laxiflora is only 15,505 bp in size, with most genes lost [23], and Rafflesia lagascae appears to have lost its entire plastid genome [24]. Some housekeeping genes have been transferred to the nuclear genome, and their proteins can move back to the plastid to perform their functions [25]. Previous studies proposed models of plastome evolution in parasites and of the order of gene losses [26-28]. The five stages in these models are the "Photosynthetic", "Degradation I", "Stationary", "Degradation II" and "Absent" stages. The order of gene losses starts with ndh genes, followed by psa/psb genes and rpo genes, then atp genes, the rbcL gene, nonessential housekeeping genes and other metabolic genes such as accD, clpP, ycf1 and ycf2, and ends with the remaining housekeeping genes such as rpl and rps genes. According to these models, the plastome of A. indica is in the "Stationary" stage.

Rearrangement of the A. indica plastome
The rearrangement of the A. indica plastome relative to the L. philippensis chloroplast genome comprises two inversions. One is a small fragment with an intact accD gene in the LSC region, while the other is a very large fragment spanning most of the IR A region and the intact SSC and IR B regions. Large inversions around the IR regions were also observed in the plastomes of two other holoparasitic plants (Phelipanche ramosa and P. purpurea), and such large-scale rearrangements may be caused by relaxed selective pressure and progressive plastome nonfunctionalization [9].

The loss of photosynthesis pathway in A. indica
Aeginetia indica has no photosynthetic activity and obtains all its carbon through the connection with its host [19]. In the present study, the loss of the photosynthesis pathway in A. indica was confirmed by the loss of photosynthesis genes in its plastome and by the absence of detectable expression of many genes in the photosynthesis pathway in its transcriptome. Chlorophyll, the pigment at the reaction center of PSII, absorbs light energy and plays an important role in photosynthesis [29]. Chlorophyll synthesis is impossible in A. indica because some key genes in the later stage of the porphyrin and chlorophyll metabolism pathway showed no detectable expression. In contrast, an intact chlorophyll synthesis pathway was previously found in the holoparasitic plant Phelipanche aegyptiaca, suggesting that the chlorophyll synthesis pathway is expressed for functions other than photosynthesis [30].

Conclusions
The plastid genome of Aeginetia indica, a holoparasitic plant from Clade VI of

Assembly, annotation and alignment of plastid genome
The plastid genome of A. indica was assembled from Illumina sequencing data using NOVOPlasty [31], with the parameters insert size (300 bp), K-mer (37) and coverage cut off (1500). Annotation of the plastid genome was performed by combining the DOGMA program [32] and GeSeq in OGDRAW [33]. Genes which contain one or more ... [37]. The ratios (ω) of non-synonymous (dN) to synonymous (dS) substitution rates for the 20 shared genes were estimated using codon-based analysis (codeml) in the PAML v.4.8a package [38]. Different branch models were used to analyze selective pressures among these species. The null one-ratio model (M0), which assumes a single ω for all branches, was fitted first, and then the likelihood of a two-ratio model (M2), with a foreground ω1 for parasitic species and a background ω2 for autotrophic species, was compared with that of M0. Moreover, a three-ratio branch model (M3), which assumes different ω values for holoparasitic, hemiparasitic and autotrophic species, was compared with M2. The likelihood ratio tests for M0 vs M2 and M2 vs M3 were conducted with the Chi-square distribution, with the degrees of freedom equal to the difference in the number of parameters between the models, to evaluate the fit of the data to the alternative branch models.
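The likelihood ratio test described above can be illustrated with a short Python sketch. It assumes that the log-likelihoods of two nested models are already available (for example, read from codeml output); the function names and the example log-likelihood values are hypothetical and are not results from this study. The Chi-square tail probability is computed with the standard library only, using the closed-form recurrence for integer degrees of freedom.

```python
import math

def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a Chi-square variable with integer df."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Base cases: df = 2 (even) or df = 1 (odd)
    if df % 2 == 0:
        q, nu = math.exp(-half), 2
    else:
        q, nu = math.erfc(math.sqrt(half)), 1
    # Recurrence: Q(x | nu + 2) = Q(x | nu) + (x/2)^(nu/2) e^(-x/2) / Gamma(nu/2 + 1)
    while nu < df:
        q += half ** (nu / 2.0) * math.exp(-half) / math.gamma(nu / 2.0 + 1.0)
        nu += 2
    return q

def likelihood_ratio_test(lnl_null: float, lnl_alt: float, df: int):
    """Return (2 * delta lnL, p-value) for nested models.

    df is the difference in the number of free parameters,
    e.g. 1 for M0 vs M2 and 1 for M2 vs M3 (one extra omega each).
    """
    stat = 2.0 * (lnl_alt - lnl_null)
    return stat, chi2_sf(stat, df)

if __name__ == "__main__":
    # Hypothetical log-likelihoods for one gene (NOT values from the study)
    lnl_m0, lnl_m2 = -1000.0, -998.0
    stat, p = likelihood_ratio_test(lnl_m0, lnl_m2, df=1)
    print("2*dlnL = %.2f, p = %.4f" % (stat, p))
```

With one extra ω parameter per comparison, a test statistic above about 3.84 corresponds to P < 0.05, which is the threshold used for the bold P-values in Table S1.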

Transcriptome sequencing
Total RNA was isolated separately from flower, sepal, fruit, and stem tissues of A. indica. The quality and concentration of RNA were determined using 1% agarose gel electrophoresis and a Qubit fluorometer, respectively. mRNAs of these four tissues were purified with Oligo dT, and then used to construct cDNA

Transcriptome analysis
Raw sequencing data were filtered by removing the adaptors and low quality reads.

Declaration
Ethics approval and consent to participate:

Fig. 1 The plastid genome map of Aeginetia indica. Genes shown outside and inside the outer circle