BeMADS1 is a key to delivery MADSs into nucleus in reproductive tissues-De novo characterization of Bambusa edulis transcriptome and study of MADS genes in bamboo floral development

Background The bamboo Bambusa edulis has a long juvenile phase in situ, but can be induced to flower during in vitro tissue culture, providing a readily available source of material for studies on reproductive biology and flowering. In this report, in vitro-derived reproductive and vegetative materials of B. edulis were harvested and used to generate transcriptome databases by use of two sequencing platforms: Illumina and 454. Combination of the two datasets resulted in high transcriptome quality and increased length of the sequence reads. In plants, many MADS genes control flower development, and the ABCDE model has been developed to explain how the genes function together to create the different whorls within a flower. Results As a case study, published floral development-related OsMADS proteins from rice were used to search the B. edulis transcriptome datasets, identifying 16 B. edulis MADS (BeMADS). The BeMADS gene expression levels were determined qRT-PCR and in situ hybridization. Most BeMADS genes were highly expressed in flowers, with the exception of BeMADS34. The expression patterns of these genes were most similar to the rice homologs, except BeMADS18 and BeMADS34, and were highly similar to the floral development ABCDE model in rice. Transient expression of MADS-GFP proteins showed that only BeMADS1 entered leaf nucleus. BeMADS18, BeMADS4, and BeMADS1 were located in the lemma nucleus. When co-transformed with BeMADS1, BeMADS15, 16, 13, 21, 6, and 7 translocated to nucleus in lemmas, indicating that BeMADS1 is a key factor for subcellular localization of other BeMADS. Conclusion Our study provides abundant B. edulis transcriptome data and offers comprehensive sequence resources. The results, molecular materials and overall strategy reported here can be used for future gene identification and for further reproductive studies in the economically important crop of bamboo.


Background
Bamboo is important not only to human industry, but also in the environment and for animal habitat. Because bamboo has a long juvenile phase, an unpredictable flowering time and dies after flowering, it is difficult to investigate its reproductive biology. In large bamboo forests, bamboo flowering can cause economic and ecological damage. For example, in 1970-80, a widespread flowering of the bamboos Bashania fangiana and Fargesia denudata in China threatened the food source of pandas in the affected area [1]. In 1963-73, two-thirds of the Phyllostachys bambusoides stands were flowering in Japan, limiting the bamboo industry [2]. Therefore, the mechanism timing bamboo flowering is of interest outside academic pursuits. To investigate this topic, a reliable source of reproductive materials is required. Using tissue culture, bamboo can be induced to flower [3] by addition of cytokinin [4]. Additionally, vegetative shoots can be induced by auxin treatment [5]. Using tissue culture, genomic resources have been established for the bamboo Bambusa edulis, including microarray [6] and Expressed Sequence Tag (EST) libraries [7].
Next Generation Sequencing (NGS) has been employed to supplement the microarray and EST libraries for non-model plants [8,9]. This method was also applied to the bamboos Dendrocalamus latiflorus [10,11] and P. heterocycla [12]. The Bambusa genus comprises more than one hundred species, which are widely distributed in the tropical and subtropical areas of Asia, Africa, and Oceania. There are many important economic species, such as B. edulis and B. oldhamii, which are grown for human consumption, B. pervariabilis and B. tuldoides, which are grown for building and furniture supplies, B. textilis and B. rigida, which are grown for fiber, and B. ventricosa and B. multiplex var. riviereorum, which are grown for ornamental use. Additionally, Bambusa has been used for cross-breeding with other bamboo genera [13]. Compared with the transcriptome resources of Dendrocalamus and Phyllostachys, the transcriptome data from Bambusa is limited.
Generally, flower morphology is diverse and unique, and therefore serves as an excellent material for taxonomic and evolutionary studies [14]. Recent studies on floral development-related genes in dicot plants can be understood by the ABCDE model of flower initiation [15,16]. A and B class genes cooperate to form the petal. B and C class genes cooperate to form the stamen. A whorl that only expresses a C class gene develops into a carpel. D class genes are related to ovule identity. E class genes are expressed in all four whorls of floral organs and ovule and correlate to the floral meristem determinacy [16][17][18]. Interestingly, all genes thus far identified in this model, except AP2, which belongs to the APE-TALA2/ ethylene-responsive element binding protein (AP2/EREBP) family, are MADS genes. MADS genes encode transcription factors. Based on amino acid sequences, these genes can be divided into two types: type I (SRF-like) and type II (MEF-like). In plants, the MEF-like MADS-domain proteins contain four conserved domains: the MADS (M) domain, the Intervening (I) domain, the Keratin-like (K) domain and a C-terminal domain. Therefore, these type II proteins are called MIKC-type MADSbox proteins. All MADS genes in the ABCDE model of plant floral development are MIKC-type MADS.
The ABCDE model was developed through research in dicot plants. However, the monocots, specifically the family Poaceae, contain important cereal crops, such as rice (Oryza sativa), maize (Zea mays), wheat (Triticum spp.), and barley (Hordeum vulgare) [19]. Together with bamboo, these species form the Bambusoideae, Ehrhartoideae (rice) and Pooideae (Wheat, barley and oats; BEP) phylogenetic clade. Similarities and differences in the genetic sequences and expression patterns of floral development genes in this clade are informative for both macroevolution [20] and agricultural application. Furthermore, since monocot flower development can directly affect the grain yield, the mechanism of flowering is an important topic in Poaceae research. Additionally, the morphology of monocot floral organs is different from that in the dicots. In rice and bamboo, the inflorescence is composed of spikelets. Each spikelet contains one floret. The floret is divided into four whorls, namely: lemma and palea (whorl 1), two lodicules (whorl 2), six stamens (whorl 3), and gynoecium (one ovary and two stigmas, whorl 4) [21]. In rice, MADS genes have been identified and divided into the ABCDE gene classes [20][21][22][23][24][25][26][27][28]. Compared with rice (Oryza sativa), relatively fewer MADS-box genes have been characterized in bamboo [29][30][31]. Therefore, to systematically study MADSbox genes involved in floral formation in bamboo, the B. edulis NGS transcriptome databases were developed and searched to identify putative flower developmentrelated MADS (BeMADS) genes.

Results and discussion
RNA-Seq, de novo assembly and sequence analysis Three B. edulis transcriptome libraries (454, Illumina and Hybrid, Additional file 1) were constructed from RNA derived from different developmental stages and various tissues in vitro (roots, stems, leaves and flowers). To comprehensively cover the B. edulis transcriptome, equal amounts of total RNA from each sample were pooled together before the mRNA was isolated, enriched, sheared into smaller fragments, and reverse-transcribed into cDNA. We performed RNA-Seq analyses by either Roche 454 or Illumina sequencing platforms based on the twophase assembly approach. The resulting sequencing data were subjected to bioinformatic analysis.
The size distribution of the B. edulis unigenes identified from the three transcriptome datasets is shown in Figure 1A and Table 1. These set of unigenes were annotated using BLASTX searches of a variety of protein databases, taking into account the identity between the unigene sequence and the sequence in the database (E-value ≤10 −5 ). The size distributions of the BLAST-aligned predicted proteins in the three B. edulis transcriptome datasets are shown in Figure 1B.
Currently, there are several NGS platforms, i.e. Illumina/Solexa Genome Analyzer, Roche 454 GS FLX and Applied Biosystems SOLiD, used in genome and transcriptome research, each with advantages and weaknesses. In research using NGS, the accuracy and length of the sequences are important. For instance, while the read length obtained using the Sanger method is longer, the method is more expensive. Illumina technology has higher sequencing coverage, resulting in higher accuracy, but the read length is short, making it difficult to obtain long contigs during de novo assembly. Therefore, integration of multiple sequencing platforms is one strategy for de novo sequencing when there is no reference genome available [32]. Through a hybrid assembly, contigs averaging 670 nts were constructed, an average length longer than that reported for the D. latiflorus transcriptomes, which only used Illumina methods [10,11].
Some pre-assembled sequences were lost during the integration of the Illumina and 454 sequences. Therefore, in this report, the transcriptomes derived from each sequencing platform are also presented. This allowed searches for DNA sequences of interest in two de novo transcriptomes and one virtual hybrid transcriptome, with the results further assembled after hunting in the three databases. Figure 1 Overview of sequence reads and assembly of the three B. edulis transcriptomes. The length distribution of the contigs obtained from de novo assembly of high-quality, clean reads from NGS data across three datasets, namely sequence data from Roche 454, Illumina, and Hybrid transcriptome. Panel A shows the lengths of all contigs from each dataset. Panel B: shows the contig lengths for only those contigs that had BLASTX hits in the NCBI protein database.

Functional annotation of B. edulis transcriptome
To predict the function of these assembled transcripts, non-redundant sequences were submitted to a BLASTX (E-value ≤ 10 −5 ) search against the following databases: Gene Ontology (GO), NCBI non-redundant database (Nr), Swiss-Prot, Kyoto Encyclopedia of Genes and Genomes pathway (KEGG) and Orthologous Groups of proteins (COG) (Additional files 2 and 3). Nearly 77.0% (11,646 unigenes for 454 dataset), 71.6% (39,261 unigenes for Illumina dataset) and 86.7% (7,141 unigenes for Hybrid dataset) of all predicted unigenes significantly matched a sequence in at least one of the four databases used for annotation (Additional files 2, 3, and Figure 2). In order to determine if a complete representation of the known genes within a gene family could be found in our datasets, the MADS gene family was used for further transcriptome validation.
Sixteen putative BeMADS genes identified from B. edulis transcriptome database Using16 floral-specific rice MADS protein sequences, 16 BeMADS genes were identified (Table 2, accession no. is shown in Additional file 4). When using the data from only one sequencing platform, most of the sequences were partial and some could not be identified. For example, BeMADS2, 5, 8, 14, 15, and 18 were not found in the Illumina database. BeMADS4, 7, 13, 21, and 34 were not found in the 454 database. Combining the sequences from the three databases resulted in identification of full-length transcripts for BeMADS1, 2, 3, 4, 8, 14, 15, 16 and 58 (Table 2). These results indicated that combination of different sequencing platforms resulted in longer sequence lengths and more complete transcriptome assembly. The same observation was made in the Phalaenopsis transcriptome study [32].
The high sequence homology in the MADS gene family, especially the highly conserved M domain in the N-terminal region, can be a problem in distinguishing between paralogs during de novo assembly and promoter walking. To identify the promoter region and to clone full length genes, a BAC strategy was used [8,[33][34][35] to identify 7 additional full length BeMADS genes in B. oldhamii ( Table 2).
In addition to a sequencing strategy, it is possible to search databases from other closely related species. According to chloroplast genome results, P. heterocycla, D. latiflorus and B. oldhamii are highly homologous species [36,37]. Some bamboo gene sequences, including genomic, full-length cDNA, and EST, have been published [7,[10][11][12]. From the NCBI database, P. heterocycla and D. latiflorus MADS genes were identified. Integration of the data from different bamboo species will prove important not only for gene identification, but also for evolutionary studies.
Evolutionary relationships among bamboo and other monocot MADS genes similar to genes in the ABCDE model of floral development To identify the putative functional classification of the bamboo BeMADS in relation to the ABCDE model and to understand the phylogenetic relationships with other known MADS-box genes regulating floral development, we collected full-length amino acid sequences of MADS from bamboo (16), rice (16) [38], maize (10) [39] and wheat (18) [40] to perform phylogenetic analysis ( Figure 3). Our phylogenetic tree is organized with an overlay of the ABCDE model classes for ease of discussion, based on this [28].
BeMADS14, BeMADS15 and BeMADS18 belonged to the AP1 family in the A class ( Figure 3), which includes the FUL1, FUL2 and FUL3 clades [20,41]. BeMADS14, like OsMADS14, belonged to the FUL1 clade. BeMADS15 sorted into the FUL2 clade, close to ZAP1 from maize and OsMADS15 from rice. BeMADS18, like OsMADS18, belonged to FUL3 clade. These genes, identified as transcripts from B. edulis, clustered with genes that were hypothesized to occur twice in grass genomes due to duplication events [20].
BeMADS2, BeMADS4 and BeMADS16 were most orthologous to the B class proteins (Figure 3). BeMADS2 and BeMADS4 belong to the PI family, with BMADS2 closely related to OsMADS2 and maize ZMM2, and BeMADS4 most closely related to OsMADS4 ( Figure 3). BeMADS16 was most closely related to OsMADS16 (SPW1) in the AP3 clade. The presence of one AP3 ortholog (BeMADS16) and two PI orthologs (BeMADS2, BeMADS4) is similar to the other monocots and bolsters the hypothesis that early in the evolution of the monocots there was an ancient gene duplication event of the PI ortholog [21,42]. Four proteins, BeMADS3, BeMADS13, BeMADS21 and BeMADS58, belong to the AG family ( Figure 3), which functionally classifies as a C/D class MADS protein. In the C class functional group, BeMADS3 and BeMAD S58 were most closely related to rice OsMADS3 and OsMADS58, respectively. BeMADS13 and BeMADS21 were most orthologous to the D class functional group and closely related to rice OsMADS13 and OsMADS21, respectively ( Figure 3). Based on the phylogenetic tree analysis, the D class had four subclades in the grasses, and each subclade contained at least one gene from rice, maize or wheat. The AG family of proteins is divided between the C and D classes, the first of which contains the rice proteins OsMADS3 and OsMADS58which are like AG, SHATTERPROOF1 (SHP1), and SHATTERPROOF2 (SHP2) in Arabidopsis -and the second of which contains OsMADS13 and OsMADS21which are like SEEDSTIK (STK) in Arabidopsis [43]. Our data show that the bamboo BeMADS proteins in the C/D group also contain one gene in each subclade (Figure 3), which can be interpreted as a major gene duplication event that occurred in both grass C and D clades before the separation of the maize, rice, wheat and bamboo lineages [44,45].
Five proteins, BeMADS1, BeMADS5, BeMADS7, BeMADS8, and BeMADS34, were most closely related to the SEP family, which belongs to the E functional group (Figure 3). The class E genes in rice belong to two clades -the SEPclade (Clade II) and the LOFSEP-clade (Clade I) [41]. The OsMADS1/LEAFY HULL STERILE 1 (LHS1), OsMADS5/ OSM5, and OsMADS34/PANICLE PHYTOMER 2 (PAP2) grouped into the LOFSEP-clade [46]. While this class can be divided into multiple layers of derived clades, the most informative may be the five distinct subclades, 1-5 [40,41]. This phylogenetic division indicates that the bamboo BeMADS genes in E-group are closely related to the OsMADSs in each clade and that at least one BeMADS falls into each subclade (Figure 3), similar to the homologous genes identified from rice, maize and wheat. According to these results, the common ancestor of these species may contain at least five SEP-like genes.
There is one family of MIKC-MADS that does not have a defined role in the ABCDE model, the AGL6 clade. Recently, it was reported that OsMADS6/MOSAIC FLORAL ORGANS 1 (MFO1) plays a synergistic role in regulating floral organ identity, floral meristem determinacy and meristem fate with class B (OsMADS16), C (OsMADS3), and D (OsMADS13) genes and with the YABBY gene DROOPING LEAF (DL), which was previously known to function in carpel specification [28,47,48]. These results suggest that rice AGL6-clade gene may have an E-class function. Our phylogenetic analysis indicates that BeMADS6 belongs to the AGL6 family and is most similar to OsMADS6 (Figure 3). Past phylogenetic analysis showed that AGL6-like genes are sister to the SEP-like genes [49]. Interestingly, SEP genes were only identified in angiosperms, but AGL6-like genes were identified in both angiosperms and gymnosperms.
As a whole, this phylogenetic tree shows that bamboo contains MADS proteins not only in each putative functional group but also in each sub-clade and that the BeMADS are most often sister to the rice OsMADS. Therefore, functional experiments in bamboo can be designed based on previous work in monocots. Recently, two AP1/SQUA-like MADS-box genes from bamboo (Phyllostachys praecox), PpMADS1 (FUL3 subfamily) and PpMADS2 (FUL1 subfamily), were found to play roles in floral transition, since they caused early flowering through upregulation of AP1 when overexpressed in Arabidopsis. Yeast two-hybrid experiments demonstrated that PpMADS1 and PpMADS2 might interact with different partners to play a part in floral transition of bamboo [31].

BeMADS gene expression
The expression patterns of the 16 BeMADS were analyzed by real-time quantitative RT-PCR using genespecific primer sets across several tissue types and floret ages ( Figure 4). Data are grouped by functional classes, A-E, on the right. Most of the BeMADS genes were highly expressed in the floral organ (F). BeMADS34 was expressed in various tissues, but most highly expressed in stem (S). This result is different to the presumed ortholog in rice, OsMADS34, which is ubiquitously expressed but highly expressed in spikelet and has been shown to be involved in inflorescence and spikelet development [50].
The process of bamboo flower development can be divided into 5 stages, from small floral buds to mature flower (stages 1-5). The expression level of the A-, B-and E-class BeMADS genes were high in the youngest floral buds (stage 1) and decreased through floral maturity. The expression of C-and D-class BeMADS genes were reduced in stage 1, slightly increased in stages 3 to 4, and decreased in stage 5 ( Figure 4). Expression of BeMADS in class E showed two overall patterns, one that was high throughout floral development and one that was high just in stage 1.
We further analyzed the expression patterns of the BeMADSs in bamboo floral organs. From the outer whorl (See figure on previous page.) Figure 3 Phylogenetic tree based on amino acid sequences of MIKC-type MADS-box genes. 60 MIKC-type MADS-box genes were used: 16 from Bambusa edulis, 16 from rice (Oryza sativa), 10 from maize (Zea mays), and 18 from wheat (Triticum aestivum L.). Deduced full-length amino acid sequences were used for the alignments. The phylogenetic tree was constructed by the neighbor-joining method and evaluated by bootstrap analysis (MEGA version 4.0). Numbers on major branches indicate bootstrap percentage for 1,000 replicates. Six Arabidopsis sequences of the FLC subfamily were used as outgroups. Proteins from B. edulis were highlighted with red boxes. The three grass clades of FUL1, FUL2, and FUL3 within the AP1 subfamily and the two major clades of the SEP subfamily are labeled on the right. The five grass clades within the SEP subfamily are indicated by numbers showing their respective name according to previous studies [41], namely 1: LHS1/OsMADS1, 2: OsMADS5, 3: OsMADS34, 4: OsMADS7/45, 5: OsMADS8/24. Subfamilies of the plant MIKC-type genes and the functional classification according to the A/B/C/ D/E classes are indicated at the right margin. to the inner whorl within the floral organ, we divided the flower into lemma (Le, whorl 1), palea (Pa, whorl 1), lodicule (Lo, whorl 2), anther (An, whorl 3) and pistil (Pi, whorl 4). Our results showed that for the A-class genes, BeMADS14 was expressed throughout, but higher in the lemma and pistil, BeMADS15 was expressed in the lemma and palea, and BeMADS18 was most highly expressed in the pistil (Figure 4). The BeMADS14 homolog OsMADS14 was only detected in inflorescence and developing caryopses by transcript analysis [51]. Based on in situ hybridization analysis, OsMADS14 was expressed in the early spikelet meristem, the primordia of flower organs, and the reproductive organs, but did not express in the vegetative organs [51]. These data are consistent with that of BeMADS14, which was only expressed in floral organ (Figure 4). The BeMADS15 homolog OsMADS15 was first detected in the spikelet meristem and then in vegetative organs only after emergence of spikelet organs, including lodicules, palea, lemma, and glumes [52]. BeMADS15 showed a similar expression pattern, but very low expression in the lodicules (Figure 4), same like the ortholog in wheat, TaAP1-3 [40]. The expression pattern of BeMADS18 was different from the rice ortholog OsMADS18 and the wheat ortholog TaAP1-2. OsMADS18 is expressed in roots, leaves, inflorescences, and developing kernels, but not in young seedlings. The OsMADS18 transcript was also detected in leaves following germination after four weeks and increased during the reproductive phase [22]. A similar gene expression pattern was also found for wheat TaAP1-2, which is highly expressed in roots, stems, leaves, different developmental stages of spikes and different spikelet organs, including the glumes, lemma, and palea [40]. It is interesting that TaAP1-2 was also expressed at low levels in developing caryopses, lodicules, stamens and pistils [40]. However, our result showed that BeMADS18 was more highly expressed in the fourth whorl (pistil) than in other whorls in the floral organ. While BeMADS18 is classified into the A class by sequence similarity and phylogenetic analysis, its expression pattern differs somewhat from typical A-class genes from other grasses. Perhaps BeMADS18 functions in pistil formation with other functional genes in the C or E class.
A single copy of an AP3/DEF-like gene but two copies of the PI/GLO-like genes is a phenomenon common in other plant species, including Arabidopsis, Antirrhinum, rice, maize, and wheat, and also bamboo ( Figure 3). B class genes are required to specify petal and stamen identity [53]. Whether of PI/GLO or AP3/DEF lineage, the mRNA of B class genes (BeMADS2, BeMADS4 and BeMADS16) showed a similar expression pattern: mainly in flower, with low levels detected in lemma and palea, but high levels in lodicules and anthers (Figure 4). This may indicate redundant function as a safety measure to insure flower development. Transcripts of the AP3/DEFlike OsMADS16/SPW1 and maize SILKY1 (SL) were detected mainly in the lodicules and stamen primordia during floral development, but not in developing carpels [21,24]. The expression patterns of BeMADS16 and wheat TaAP3 are similarly in mature female organs [40], but the function of TaAP3 is unknown. The PI/GLO-like BeMADS2 and BeMADS4 display similar expression patterns, but BeMADS2 was highly expressed in anthers and BeMADS4 was highly expressed in lodicules. However, BeMADS2 and BeMADS4 expression patterns were still similar to other members of the PI family in the floral organ [40,42,52]. Rice in situ hybridization data showed that in the late stage of floral development OsMADS2 mRNA was not detected in the glumes, lemma, palea, pistil primordia or developing pistils, but limited to and highly expressed in lodicules. Expression in stamens occurred in later developmental stages once all the floral organs were differentiated [52]. To further explore the spatial and temporal expression pattern of BeMADS2 in early floral bud development of bamboo, we investigated the expression pattern of genes by in situ hybridization. BeMADS2 was highly expressed in the anthers of second flowers ( Figure 5). This result correlated with the qRT-PCR data (Figure 4). We also found that BeMADS4 and BeMADS2 showed similar expression patterns to wheat orthologs TaPI-1 and TaPI-2, including the initial expression in spike primordia and later expression in developing caryopses (5 days after anthesis), lodicules, stamens, and pistils from fully emerged spikes [40].
The C class genes are part of the AG-lineage and include BeMADS3 and BeMADS58, which were mainly expressed in the floral bud and then later in anthers and pistils, with especially high levels in pistils (Figure 4). This result is consistent with the involvement of the C class genes in development of the third (stamen) and fourth (carpel) whorls [26]. A similar result was also found for the other C class genes OsMADS3, OsMADS58, TaAG-1, and TaAG-2. In rice, in situ hybridization results indicated that OsMADS3 and OsMADS58 were limited to stamens, carpels, and ovule primordia. Only OsMADS3 was strongly expressed in the presumptive region from which the stamen, carpel, and ovule primordia subsequently differentiate, whereas OsMADS58 remained during differentiation and development [26]. Wheat TaAG-1 and TaAG-2 transcripts gradually increased during spike development and were only detected in the stamens and pistils [40]. The spatial and temporal expression of BeMADS3 and BeMADS58 requires further analysis.
The D class genes also belong to the AG-lineage and include BeMADS13 and BeMADS21, which were mainly expressed in flower and concentrated in pistils (Figure 4). This expression pattern of D class genes was consistent with the gene function in ovule identity determination and floral meristem determinacy [44]. The D class genes OsMADS13, maize ZAG2 and Arabidopsis STK have a similar expression pattern in floral organs [44,54,55]. The gene expression of rice OsMADS21 was very low in developing anthers, carpels, styles/stigmas, and ovule [44]. During the late stage of flower development, OsMADS21 was particularly evident in the inner cell layers of the ovary and in the ovule integuments, an expression region that overlapped with that of OsMADS13 [44]. Based on the qRT-PCR results, the expression amount was no different between BeMADS13 and 21. The expression localization was also similar: highly expressed in pistil.
The E class genes, such as Arabidopsis SEPALLATA (SEP), function in specification of sepal, petal, stamen, carpel, and ovule [16,56] and interact with genes from the other four ABCD groups at the protein level to form higher order MADS-box protein complexes that control the development of the fourth whorls within the flower [16,17,[56][57][58]. The E class genes in the SEP lineage in bamboo were BeMADS1, BeMADS5, BeMADS7, BeMADS8 and BeMADS34. BeMADS6 was located in the AGL6 lineage. The six genes were expressed in various flower structures, but were most highly expressed in the lemma (BeMADS1 and BeMADS5), lodicule (BeMADS7 and BeMADS8), and pistil (BeMADS1, BeMADS5, BeMADS7, BeMADS8 and BeMADS34) for the 5 SEP-like genes and in the palea and lodicule for the AGL6-like BeMADS6 (Figure 4). The expression pattern of E class genes in rice differed from BeMADS in the same group, such as the BeMADS1 homolog OsMADS1. OsMADS1 was not detected before glume primordia emergence, after which it was mainly present in the spikelet meristem, and then limited to the lemma and palea, with very low expression in the carpel [59]. BeMADS1 was expressed through the entire flower development, at all examined stages and tissues, but was highly expressed in the pistil, moderately expressed in lemma and anther, and very limited in anthers and lodicules (Figure 4). We also investigated the spatial and temporal expression pattern of BeMADS1 in early floral bud development of bamboo by in situ hybridization. Our result showed that the transcripts of BeMADS1 could also be detected in the pistil ( Figure 5), correlating with the expression pattern determined by qRT-PCR ( Figure 4). The other E class genes in rice, OsMADS7 and OsMADS8, were first detected in spikelet meristems, were not in lemma or palea primordia at a later stage, but were found in developing lodicules, stamens, and carpels during spikelet development [27]. Our result also showed that BeMADS7 and BeMADS8 have similar expression patterns in floral organs, but low levels in the anthers (Figure 4). The expression of BeMADS34 was high in the fourth whorl (pistils) ( Figure 4) and differed to that of its rice ortholog OsMADS34, which was initially expressed throughout the floral meristem and subsequently detected in palea, lemma, and the sporogenic tissue of the anthers in the mature flower [51]. A previous expression study showed a grass AGL6-like gene to mainly express in the inflorescence [60]. The BeMADS6 homolog in rice, OsMADS6, was first detected in the floral meristem and later in palea, lodicules, and pistil and at lower levels in stamens [48]. This similar expression pattern in floral organs was also shown for BeMADS6 (Figure 4).
In summary, we used transcriptomics to identify 16 BeMADS genes and used amino acid homology to cluster them according to their similarity to genes in the ABCDE model of floral development. Gene expression analysis demonstrated, except for BeMADS18 and 34, that most BeMADS have similar expression patterns during flower development as their better studied orthologous genes in rice.

Subcellular localization of BeMADS proteins
The putative functions of all the BeMADS proteins are as transcription factors. The localization of these proteins was predicted to be nuclear. To investigate the subcellular localization of BeMADS family members, B. edulis leaves  Figure 5). Except for some of the signal for BeMADS1-YFP, the fifteen BeMADS proteins, representing each of the 5 classes, were found throughout the cytoplasm when transiently expressed in leaves (Additional file 7).
When lemma was used as bombardment material for subcellular localization, some of the BeMADS proteins were localized to the nucleus ( Figure 6A). Interestingly, all of the signal for BeMADS1, 4, and 18 were localized in the nucleus ( Figure 6B). In lemma, BeMADS14, 15 (A class), 2, 16 (B class), 58 (C class), 13, 21 (D class), 6 and 7 (E class) did not localize into the nucleus. These results indicated that BeMADS proteins were only translocated into the nucleus in the tissues (lemma) where the gene is normally expressed. Since it is difficult to obtain the bamboo flowers from the field, in vitro bamboo flowers were used as target tissues.
Because MADS proteins form tetramers with other MADS proteins when functioning in floral development [15,61], we hypothesized that some BeMADS proteins do not translocate into the nucleus of lemmas without another MADS protein(s) to assist their import. These 9 BeMADS genes (linked to YFP) were co-transformed into lemmas with other nucleus BeMADS proteins (BeMADS1, 4, or 18, as CFP fusions). Our results indicated that only BeMADS1 could facilitate the translocation of these BeMADS proteins into the nucleus in lemma cells ( and BeMADS58 (class C)]. BeMADS14 protein was still in the cytosol when co-transformed with BeMADS1 ( Figure 7). The subcellular localization of MADS proteins can be affected by plant growth regulators, growth conditions, like sugar starvation [62], or other protein. For instance, the Arabidopsis MADS SOC1 can interact with AGL24 and then translocate into nucleus to activate LEAFY (LFY) expression [63]. According to our results, BeMADS1 plays an important role (directly or indirectly) in translocation of other cytosol BeMADS proteins into the nucleus, where they presumably can then function as transcription factors.
These data support previous studies using comprehensive matrix-based screens for petunia and Arabidopsis MADS-box transcription factor interactions, such as FRET (Fluorescence resonance energy transfer)-FLIM (fluorescence lifetime imaging microscopy) imaging and yeast two-, three-or four-hybrid analyses that revealed that MADS-box proteins form multimeric complexes [17,64]. This is the first report on monocot MADS subcellular localization using co-transformation with other proteins or testing different tissue for transient expression.

Conclusions
Using two different sequencing platforms, a transcriptome database of B. edulis was established from plant material grown in tissue culture. The N50 and number of genes in the combined databases are higher than previous bamboo transcriptome results, which used only Illumina methods. The cost of the combined strategy is less than whole genome sequencing. Although the contigs do not contain full length cDNA sequences, these cDNA can be identified by using other public resources, such as moso bamboo whole genome sequences or B. oldhamii BAC sequences. To show the usefulness of this strategy, 16 members of the floral developmentrelated MADS gene family were further investigated and cloned. Gene expression and amino acid sequence phylogeny were analyzed and compared to results from other monocot plant species. Since bamboo flowers are difficult to obtain as material for taxonomic and evolutionary studies, these protein sequences may be able to supplement morphological assessments of relatedness and serve as evidence for taxonomy both within Bambusideae and within the wider BEP monocot group.

Plant materials and RNA extraction
The B. edulis tissue culture system was established by following our previous protocol [4]. Multiple shoots were incubated in MS medium supplemented with 0.1 mg/l thidiazuron to induce flowering. The inflorescences were subcultured in medium containing 5 mg/l napththalene acetic acid to induce roots, shoots and flowers [65]. The total RNA of these organs were isolated using Trizol reagent (Invitrogen, Carlsbed, CA, USA), following the manufacturer's instructions. The pooled RNA was used for NGS sequencing.
The cDNA library preparation, sequencing and assembly on Illumina platform The cDNA library preparation followed the protocol described previously [8]. The raw sequencing data were filtered to remove low-quality sequences, including ambiguous nucleotides, adaptor sequences, and repeat sequences. The de novo transcriptome assemblies of these short reads were performed by the SOAPdenovo program [66] and organized into putative unigenes, which were used for further analysis.
Roche 454 cDNA library preparation, sequencing and assembly The cDNA library was constructed using the cDNA Rapid Library Preparation Kit (454 Life Sciences, Roche), starting from 200 ng of mRNA. All steps, including RNA fragmentation, cDNA synthesis, adaptor ligation and product quantification, followed protocols provided by the manufacturer. The resulting cDNA libraries were run on the Roche 454 GS FLX Titanium system. The raw sequence data (.sff) for all reads was obtained from the 454 Genome Sequencer (FLX System). The GS De Novo Assembler software version 2.8 was used for quality/primer trimming and isotig assembling with default parameters, except the "isotig length threshold" was set to 100 bp (default 3) and "Extend low depth overlaps" was enabled. (An isotig is meant to be analogous to an individual transcript.) The output 454 isotigs were then used in further analysis.
Hybrid transcriptome assembly combining data from 454 and Illumina platforms A two-phase hybrid assembly approach was performed in order to integrate the 454 platform (producing long reads with homopolymer errors) and Illumina platform (generating huge amount of short reads). The first transcriptome was assembled from the Illumina paired-end reads using a fast short-read assembler (SOAPdenovo) with multiple k-mers ranging from 41-51 bp. The second transcriptome was assembled from the 454 reads using a long-read assembler (MIRA) [67] with parameters tuned for 454 sequencing. The third transcriptome was generated from the pre-assembled contigs from the Illumina and 454 data through merger by MIRA with parameters tuned for assembly of Sanger sequencing reads. We aimed to merge concordant contigs assembled from the two platforms into longer contigs and to discard singleton contigs seen by only one platform. The merged MIRA contigs (the hybrid transcriptome) were used in the downstream analysis.

Functional annotation and classification
Clean reads were obtained by removing the adaptor sequences, empty reads, and the low-quality sequences (with ambiguous sequences 'N'). Functional annotation of the unigenes was performed by running our assembly against the NCBI non-redundant protein (Nr) database (http:// www.ncbi.nlm.nih.gov), the Swiss-Prot protein database (http://www.expasy.ch/sprot), the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database (http:// www.genome.jp/kegg) and the Cluster of Orthologous Groups (COG) database (http://www.ncbi.nlm.nih.gov/COG) using BLASTX algorithm (E-value threshold: 10 −5 ). The proteins that had the highest sequence similarity to our unigenes were used to determine functional annotations. The GO (Gene Ontology) annotations for the unigenes according to component function, biological process or cellular component ontologies were determined by Blas-t2GO [68]. The WEGO software [69] was used to analyze the GO functional classification for all the unigenes and to understand the distribution of gene functions in B. edulis at the macro level. Pathway assignments were made according to KEGG mapping [70]. Sequences were mapped to the KEGG biochemical pathways according to the Enzyme Commission (EC) distribution within the pathway database.

Phylogenetic analysis of BeMADS proteins
The MADS amino acid sequences from bamboo identified in this report and from other plant species were obtained from the NCBI database (http://www.ncbi. nlm.nih.gov/). Comparison with the bamboo MADS proteins was conducted by aligning all sequences in FASTA format using CLUSTAL W [71]. Multiple sequence alignment, phylogenetic, and molecular evolutionary analyses were conducted using MEGA software version 4 [72]. The distance matrices for the aligned sequences with all gaps ignored were calculated using the Kimura two-parameter method. Further molecular phylogenetic analyses used the neighbor-joining (NJ) method after alignment [73]. One thousand bootstrap resampling replicates were conducted to estimate support for the clades. Arabidopsis FLC genes were used as the root [74].

Real-time quantitative reverse transcription (qRT)-PCR
Plant tissues (Additional file 5) from in vitro cultures were excised for organ-specific RNA extraction, which was performed using Trizol as described above. RNA (2 μg) was reversely transcribed using Superscript III Reverse Transcriptase kit (Invitrogen) according to the manufacturer's instructions. The expression level of a target gene was detected with SYBR Green real-time PCR on Rotor-Gene Q real-time thermocyclers (Corbett Research, Australia). Data were analyzed using the Rotor-Gene Q software version 2.0 (Corbett Research) and Microsoft Excel (Microsoft, USA). Tubulin was used as the internal control. Experiments were performed for