Field transcriptome revealed critical developmental and physiological transitions involved in the expression of growth potential in japonica rice

Background Plant growth depends on synergistic interactions between internal and external signals, and yield potential of crops is a manifestation of how these complex factors interact, particularly at critical stages of development. As an initial step towards developing a systems-level understanding of the biological processes underlying the expression of overall agronomic potential in cereal crops, a high-resolution transcriptome analysis of rice was conducted throughout life cycle of rice grown under natural field conditions. Results A wide range of gene expression profiles based on 48 organs and tissues at various developmental stages identified 731 organ/tissue specific genes as well as 215 growth stage-specific expressed genes universally in leaf blade, leaf sheath, and root. Continuous transcriptome profiling of leaf from transplanting until harvesting further elucidated the growth-stage specificity of gene expression and uncovered two major drastic changes in the leaf transcriptional program. The first major change occurred before the panicle differentiation, accompanied by the expression of RFT1, a putative florigen gene in long day conditions, and the downregulation of the precursors of two microRNAs. This transcriptome change was also associated with physiological alterations including phosphate-homeostasis state as evident from the behavior of several key regulators such as miR399. The second major transcriptome change occurred just after flowering, and based on analysis of sterile mutant lines, we further revealed that the formation of strong sink, i.e., a developing grain, is not the major cause but is rather a promoter of this change. Conclusions Our study provides not only the genetic basis for functional genomics in rice but also new insight into understanding the critical physiological processes involved in flowering and seed development, that could lead to novel strategies for optimizing crop productivity.


Background
The high quality sequence of Oryza sativa L. ssp. japonica cv. Nipponbare genome elucidated the entire genetic blueprint of a major cereal crop that provides food for almost half the world population [1]. Subsequently, complete annotation of every trancriptional unit has become an enormous challenge not only for a complete understanding of the biology of rice, but more importantly, for efficient utilization of that information for genomics-based crop improvement [2][3][4]. Gene expression profiling is an important strategy for obtaining knowledge on presumed function of genes that comprise an organism [5]. Microarray analyses of the rice transcriptome encompassing different cell types [6], tissues and organs [7], specific stages of growth and development [8,9], and specific treatment conditions [10,11] have generated a large amount of information that provides initial clues for understanding the function of genes based on their time, place and level of expression in the plant.
Although rapid progress has been made during the last decade in understanding genes involved in developmental transitions, particularly the vegetative transition (juvenile-to-adult) and the transition to flowering [12][13][14], the physiological changes associated with phase transition have been poorly defined. Recently, microarray analysis of the vegetative transition in maize revealed that photosynthesis related genes are upregulated in the juvenile phase [15]. The changes in physiological state of the plant triggered by internal or external stimuli under natural field condition are thought to be reflected as corresponding changes in the transcriptome. However, the global configuration and complexity of the transcriptome that underlies physiological processes has not been scrutinized in sufficient depth particularly in a cereal crop. In order to understand these transcriptional programs reflecting physiological states it is essential to monitor the expression profiles of a large number of genes, including uncharacterized ones, throughout the life cycle of the rice plant in the field and to do this at high resolution.
Here, we establish a field transcriptome profile of the model rice cultivar, Nipponbare, by spatiotemporal gene expression analysis of 48 tissues and organs at various stages of growth and by continuous gene expression profiling of leaves at weekly intervals from transplanting until harvesting. Our gene expression profiling provides baseline information for functional characterization of genes revealed by the complete sequencing of the rice genome and for more exhaustive annotation of the elucidated genome. More importantly, we uncovered two drastic changes of leaf transcriptional programs reflecting growth stage-specific gene expression signatures that not only confirmed previously known physiological processes but also established new insights into developmental plant physiology that were never before demonstrated by studies involving non-global or semiglobal approaches.

Generation of gene expression profiles covering various tissues and organs
We performed spatiotemporal gene expression profiling using 48 different tissue and organ types representing the entire growth and developmental cycle from transplanting to harvesting (Table 1; See Additional file 1). Samples for vegetative organs, such as leaf blade, leaf sheath, root, and stem, were obtained at midday (12:00) and midnight (24:00) at the vegetative, reproductive, and seed ripening stages with reference to the number of days after transplanting (DAT). The entire inflorescence and specific floral organs, such as anther, pistil, lemma, and palea, were collected at various developmental stages. After the onset of pollination, the ovary, embryo, and endosperm were sampled at 10:00 AM based on the number of days after flowering (DAF). Transcriptome analysis was performed with the Agilent 44K rice microarray, which contains 35,760 independent probes corresponding to 27,201 annotated loci published in RAP-DB [4]. We obtained a total of 143 microarray data representing triplicate expression profiles for each organ/tissue sample except for one sample of anther (Table 1). Correlation coefficients calculated for each of the replicates indicates that all but two were above 0.9, testifying to the quality of the expression data (See Additional file 2). The number of expressed genes across organs/tissues did not vary significantly and ranged from 63-76% ( Figure 1A) and about 43% (15,224) of the transcripts were expressed in all organs/tissues. Principal component analysis (PCA) revealed three distinct transcriptome clusters corresponding to the profiles of vegetative organs such as leaf, stem, and root; reproductive organs such as anther, pistil, and entire inflorescence; and the endosperm ( Figure 1B). The profiles of lemma and palea clustered together with the reproductive organs in earlier stages of development and with the vegetative organs at the later stages. Relative expression levels of Gene Ontology (GO) categories using samples from various organs/tissues at various developmental stages revealed that photosynthesis-related genes had high expression values in leaf blade, leaf sheath, stem, and lemma/palea at the later developmental stage, while cell proliferation-related genes had high scores in inflorescence, anther, pistil, and lemma/palea in the early developmental stage (See Additional file 3). The transcriptome profiles of the endosperm were quite different from the others ( Figure 1B) and the GO categories related to glycogen biosynthesis showed high relative expression values, consistent with it being a specialized tissue for nutrition and storage.

Organ/tissue-specific gene expression
The degree of a gene's specificity for a particular organ or tissue was estimated by the Shannon entropy scores [16], leading to the identification of 731 organ/tissuespecific genes corresponding to 660 loci ( Figure 2; See Additional file 4). Nineteen percent of these genes were categorized as conserved hypothetical protein and hypothetical protein in the RAP-DB. We divided the organ/tissue-specific genes into 7 clusters based on the organ/tissue specificity of expression. The majority of the genes identified belonged to leaf-(Cluster 5), root-(Cluster 6), and seed-(Cluster 3) specific classes. Most of the genes specifically expressed in floral organs were found in anther (Cluster 1). Pistil-(Cluster 2), leaf sheath/stem-(Cluster 4), and inflorescence-(Cluster 7) specific genes belong to minor clusters, respectively. Many seed-specific genes (Cluster 3) were expressed in both the embryo and endosperm or in the endosperm alone, while only a small number of genes showed embryo-specific expression ( Figure 2). In addition, most of the seed-specific genes were induced from 5 days after flowering, when the embryo sac cavities are fully filled with endosperm cells and starch accumulation has been initiated, suggesting that most seed-specific genes are involved in grain filling and seed maturation. Among the 41 transcription factors showing organ or tissue-specific expression, 29 genes were seed-specific. These genes include OsVP1, which is an ortholog of the Arabidopsis ABA insensitive 3 (ABI3), and a homologue of Leafy cotyledon 1 (LEC1) (See Additional file 4), transcription factors that function in seed maturation [17]. The 29 seed-specific transcription factors contain 3 MADS-, 4 NAC-, 5 AP2-EREBP-, and 7 CCAAT-family genes. MADS-, NAC-, and CCAAT-family genes tend to express mainly in endosperm, and the expression of MADS genes were induced at early stages of seed development (from 1 to 14 DAF) in contrast with NAC and CCAAT genes which were expressed at the later stages (from 5 to 42 DAF) (See Additional file 5). On the other hand, AP2-EREBP genes were expressed mainly in embryo throughout seed development. These results suggested that each family of transcription factor might have a distinct function in embryo/endosperm development and grain filling. The seed-specific genes expressed at the early developmental stage include SLRL2 [18], a repressor of gibberellin (GA) signaling, and OsETR2;2 [19], a putative ethylene receptor that negatively regulates ethylene signaling, suggesting that repression of GA and ethylene signaling might also play a role in seed development.
The inflorescence specific genes (cluster 7) include LAX PANICE (LAX) and FRIZZY PANICLE (FZP). LAX gene encodes a putative transcriptional regulator containing a helix-loop-helix (bHLH) domain, which plays a role in axillary meristem formation [20,21] whereas FZP, an ERF transcription factor, represses the formation of the axillary meristem and establishes the spikelet meristem identity [22]. Differentiation of floral organs is more complex than other parts of the plant. Among them, the anther showed a unique feature in which most of the anther-specific genes (Cluster 1) were expressed only in a particular developmental stage ( Figure 2). These results indicate the complex regulation of gene expression in both the gametophytic and sporophytic tissues during anther development [9]. Pollen-specific genes contained 4 transcription factors, one of which encodes Tapetum Degeneration Retardation (TDR), a basic helix-loop-helix (bHLH) transcription factor [23]. TDR is a putative ortholog in rice of ABORTED MICRO-SPORES (AMS) in Arabidopsis, which play a role in tapetal cell development and postmeiotic microspore formation [24], and has recently been reported to interact with two bHLH proteins, AtbHLH089 and AtbHLH091 [25]. Os04g0599300 encodes a close homolog of AtbHLH089 and AtbHLH091, and is also involved in pollen-specific expressed genes, implying that in rice TDR would interact with the bHLH protein encoded by Os04g0599300 as in Arabidopsis. These results suggest Figure 2 Heat map of organ and tissue-specific expressed genes . A total of 731 organ/tissue-specific genes identified by the Shannon entropy based method were analyzed by hierarchical clustering. A heat map was constructed using the relative expression values of the genes based on correlation distance and average linkage method. As a result, the 731 genes were grouped into 7 clusters based on organ/tissuespecificity of gene expression. High expression values are shown in red. Details of the samples used for each organ and tissue are described in Table 1. that a wide range of expression profiling can be very useful as well in elucidating the interactome in cereal crops. In comparison with the anther, only a few specific genes were identified in the pistil (Cluster 2), where megasporogenesis and megagametogenesis occur. This is probably because rice has a monocarpellary ovary with a single ovule and transcripts associated with such events may be masked.
To characterize the expression profile of lemma and palea, we performed a two-way analysis of variance (ANOVA; FDR < 0.05) using tissues (lemma/palea) and the sizes corresponding to the developmental stages as factors. The two-way ANOVA identified 23 genes that were differentially expressed between lemma and palea, irrespective of the developmental stages, while 20,007 genes showed differential expression among the stages, irrespective of the tissues. None of genes showed interaction between the two factors. The results implied that lemma and palea have similar transcriptome profile as predicted from the similar morphology and function. However, among the 23 genes, 13 genes encode transcription factors including DROOPING LEAF [26] and MOSAIC FLORAL ORGANS1/OsMADS6 [27] which were expressed specifically in the lemma and palea, respectively (See Additional file 6), suggesting that these transcription factors maybe key regulators in the differentiation of lemma and palea.

Diurnal and growth stage-specific gene expression
The transcriptomes of vegetative organs at daytime and nighttime showed diurnal patterns for about 7% of transcripts particularly in mature leaf blade ( Figure 3A). The number of genes with diurnal expression pattern was much less in leaf sheath, and rare in root and stem, reflecting the importance of diurnal regulation of gene expression in the leaf blade for its biological functions such as photosynthesis. At the vegetative stage, a total of 20 genes were universally expressed with a diurnal pattern in leaf blade, leaf sheath, and root, including circadian-associated genes [28], OsPRR95 and OsPCL1, which showed high expression values at daytime and nighttime, respectively (See Additional file 7). Although diurnal expression depends on the daily rhythm induced by the light/dark cycle, several genes including the circadian-associated genes are also diurnally regulated in the root, which is not exposed to light under field conditions. It was recently reported that in Arabidopsis the circadian clock of the root is different from that of the shoot and is synchronized by a photosynthesis-related signal from the shoot [29].
In the leaf blade, leaf sheath, and root, the expression of many genes also showed growth-stage specific signatures. We extracted 215 genes that universally showed changes in expression in all 3 of these tissues from vegetative to reproductive stages ( Figure 3B; See Additional file 8). These genes included four MADS box transcription factors, OsMADS1, OsMADS14, OsMADS15, and OsMADS18, which were highly expressed in the reproductive phase. OsMADS14 and OsMADS15 are homologs to an Arabidopsis floral identity gene APETALA1, and were reported to be induced by Hd3a and RFT1, rice orthologs of Arabidopsis florigen gene FLOWERING LOCUS T (FT) [30,31]. Hd3a and RFT1 are synthesized in leaf blade and transported to the shoot apical meristem (SAM) through phloem as a florigen [31,32]. Although the expression of OsMADS14 and OsMADS15 may not be directly affected by Hd3a and RFT1 particularly in roots, the transition from vegetative to reproductive phase may have induced the changes in the transcriptome of vegetative organs resulting in the expression of such reproductive organ identity genes. Among the universally downregulated genes going from the vegetative to reproductive phase, we also found a number of phosphate (Pi)-starvation induced genes which may be related to the physiological state transition associated with the reproductive phase change as discussed below.
Continuous gene expression profiling throughout the entire growth cycle in the field In order to further understand the transcriptional programs associated with growth stage of rice grown under the natural field conditions, we performed continuous gene expression profiling of the leaves from 13 until 125 DAT in 2008 to establish a transcriptome profile encompassing the entire growth phase in the field. The uppermost fully-expanded leaf in the main stem, representing the 1st leaf up to 76 days after transplanting (DAT) and the flag leaf from 83 DAT until harvesting, were sampled at 12:00 PM every 7 days, covering 17 different growth stages with three replicates (See Additional file 1). For analyses, we used 29,119 probes with raw signal intensities above 100 in at least one sample of all 51 expression profiles. Interestingly, Pearson's correlation coefficients (PCCs) calculated across the 51 expression profiles identified three phases with high PCC scores, namely, 13-41 DAT (phase 1), 48-90 DAT (phase 2), and 97-125 DAT (phase 3) which approximately correspond to the vegetative, reproductive, and ripening stages, respectively (Figure 4). These results suggested that two major transcriptome changes occurred in the leaves from transplanting until harvesting.
First major transcriptome change associated with reproductive transition The first major change observed between phase 1 and phase 2 was assumed to be associated with the transition from vegetative to reproductive stage. The expression  Figure 5B). This suggests that induction of flowering might be controlled by RFT1 in the natural conditions in Tsukuba, Japan (~36°N), where natural day length at the time of reproductive transition is under long-day (LD) conditions. Consistent with this observation, Hd3a reportedly functions as a mobile flowering signal in short-day (SD) conditions while RFT1 functions in LD [31]. In addition, Itoh et al. [33] reported that the critical day length for Hd3a expression was around 13.5 h further supporting the fact that Hd3a was not induced before the reproductive transition in our field conditions.
We observed the reduction of miR169 precursors at the first transcriptome change ( Figure 5C; See Additional file 10). The target of miR169 is the HAP2 type transcription factor (also as known as NF-YA), which is thought to be involved in various traits, e.g., flowering and drought tolerance in Arabidopsis [34,35], and nodule development in Medicago truncatula [36]. Ten HAP2 genes have been identified in rice [37]. The expression of six HAP2 genes with the predicted miR169 target sites in their 3' UTRs (OsHAP2C, D, E, F, G, and H) increased in the first transcriptome change, but those of two HAP2 genes without a target site (OsHAP2A and OsHAP2B) did not change (See Additional file 10), suggesting the function of miR169 in the regulation of HAP2 expression in the first major transcriptome change. In Arabidopsis, CONSTANS (CO), which contains a CCT domain, is the key regulator of flowering genes [38,39]. The CCT domain exhibits similarity to a domain of HAP2, which mediates the formation of the HAP trimeric complex, HAP2/HAP3/HAP5. It has been suggested that replacement of CO with AtHAP2 in the HAP trimeric complex by overexpression of AtHAP2 delays flowering via down-regulation of FT [34]. In SD-flowering rice plants, the CCT-domain containing proteins Hd1 and Ghd7 regulate flowering by repressing expression of the florigen genes in LD [40][41][42]. Wei et al. [43] has reported that DTH8 QTL for days-to-heading encodes a putative HAP3 subunit for the trimeric HAP2/HAP3/HAP5 complex and suppresses flowering in LD, and further speculated that the formation of the Hd1/DTH8/HAP5 and Ghd7/DTH8/ HAP5 complex might be associated with the suppression of flowering by the downregulation of Ehd1 and Hd3a in LD. In this scenario, increased expression of OsHAP2 caused by miR169 reduction promotes reproductive transition in rice through functional inhibition of the CCT-domain containing proteins and the resultant induction of RFT1 expression, which was observed at the first major transcriptional change. Three of the six HAP2 genes were universally upregulated in root as well as leaf from vegetative to reproductive stages (See Additional file 8). In plant, HAP system has been thought to play diverged roles in gene transcription because each subunit in HAP complex, HAP2/HAP3/ HAP5, represents a gene family [37,44]. For example, it has been reported that NFYA5, a HAP2 type transcriptional factor regulated by miR169, is important for drought resistance in Arabidopsis [35]. Therefore, miR169-mediated HAP2 genes expression might synchronously regulate not only flowering time but also other agronomically important traits such as resistance to biotic and abiotic stress.
We extracted 1,316 genes with different expression patterns at 41 DAT and 48 DAT. A total of 357 upregulated and 333 downregulated genes were then selected based on their similarity in expression patterns from the results of hierarchical cluster analysis (See Additional file 11 and 12). The upregulated genes comprised a large number of 'newly expressed' genes, which were hardly detectable at 34 and 41 DAT (See Additional file 11). Gene Ontology (GO) analysis showed that the genes encoding protein kinase were significantly enriched among the upregulated genes ( Figure 5D). The results indicated that many signal transduction pathways accompanied by protein phosphorylation processes participate in the transition between phase 1 and phase 2. A number of genes that are induced under Pi-starvation conditions were downregulated from 41 to 48 DAT [45,46]. In Arabidopsis, regulation of miR399 and the ubiquitin-conjugating E2 enzyme gene PHO2 plays a central role in the maintenance of Pi homeostasis [47,48]. miR399 generated in shoots serves as a longdistance signal that represses PHO2 in roots under Pi-starvation conditions, resulting in activation of Pi uptake and translocation [49,50]. Five precursors of miR399 were downregulated in leaves before the initiation of panicle development and the potential rice ortholog, OsPHO2 [47], was upregulated in roots, suggesting an alteration of Pi homeostasis at this stage ( Figure 5E; See Additional file 13). MGD2 and MGD3 encode type-B monogalactosyldiacylglycerol (MGDG) synthase and are involved in Pi-starvation induced lipid remodeling for Pi-recycling, a typical response of Pi starvation [51]. Os08g0299400, a close homolog of MGD2 and MGD3 of Arabidopsis, was 225.4-fold downregulated from 41 DAT to 48 DAT, consistent with the relaxation of Pi demand described above.
PHR1 is a key transcriptional activator in controlling Pi uptake and allocation, and the PHR1 binding motif is often found in the upstream regions of Arabidopsis genes induced by Pi-starvation [52]. The PHR1 binding motif was enriched in the 1-kb upstream regions of the 333 downregulated genes, further supporting the alteration of Pi homeostasis (See Additional file 14). The expression of many Pi-response genes was changed in leaf sheath and root as well as leaf blade in the transition from the vegetative to reproductive phases (See Additional file 8). These observations strongly suggest that the rice plant undergoes a change in Pi homeostasis at the vegetative-reproductive phase transition. Pi is an important nutrient for increasing the number of tillers, one of the components of grain yield. The high demand for Pi during vegetative stage may be vital for proper development of tillers and rice plants may not need much Pi after the reproductive-phase transition, when few tillers are produced.
Taken together, the first transcriptome change involves not only the initiation of panicle development but also various aspects of the physiological state, which might be prerequisite for proper flowering and later developmental stages. The drastic phase change in shoot apical meristem is initiated by long-distance transport of FT family protein synthesized in leaves. Our results suggest that changes in physiological state also occurred in other tissues and organs, at least at the same time of induction of FT-like gene expression in leaves under the natural field conditions, and revealed interesting trends suggesting the potential role of similar or related signaling events in mediating the transcriptome change. One possibility is that floral transition and the shift in Pi homeostasis are parallel consequences of the same signaling event. Another is that the transcriptomes associated with Pi homeostasis and floral transition were consequences of independent signaling that happened to be developmentally coincident of each other. Although plant development is thought to be a continuous process, the phase transition maybe characterized by a transcriptome which is distinguishable from both the vegetative and reproductive phases. Further studies using various growth conditions as well as various cultivars and mutant lines maybe necessary to clarify the machinery of phase change.

Second major transcriptome change associated with senescence
Next, we focused on the leaf expression profiles from 62 DAT to 125 DAT to examine the second major transcriptome change observed at the transition between phase 2 and phase 3. The expression profile of 29,119 genes revealed that the change in transcriptome occurred around 90 DAT ( Figure 6A), when most of the rice plants in the field were at various stages of flowering. Eighty genes showing very high and transient expression at 90 DAT were pollen-specific genes, suggesting contamination of the leaf samples by pollen dispersed during anthesis (See Additional file 15). PCA excluding these genes revealed that the transcriptome change mainly occurred between 90 DAT and 97 DAT, the start of the post-flowering process, i.e., seed development ( Figure 6B). We extracted differentially expressed genes including 423 upregulated and 573 downregulated genes between 83 DAT and 97 DAT (See Additional file 16). Among the 423 upregulated genes, six NAC transcription factors were identified (See Additional file 16), one of these (Os07g0566500) is a close homolog of wheat NAM-B1, which was isolated as a QTL gene accelerating senescence and increasing nutrient remobilization from leaves to developing grains [53]. OsNAP (Os03g0327800) is a close homolog of AtNAP, of which a loss-of function mutation is known to result in a delay of leaf senescence in Arabidopsis [54]. These results suggest that the second transcriptome change is associated with leaf senescence, an active process whereby nutrients are salvaged from senescent leaves for use by emerging leaves and reproductive organs. To examine the role of formation of a very strong sink, i.e., developing seeds, in the second transcriptome change, we performed expression profiles on three independent sterile-mutant lines, pair1 [55], pair2 [56,57], and mel1-1 [58], in 2009 (See Additional file 17). The fertile and sterile lines basically showed similar expression profiles at the same sampling time, but the transcriptome change in the fertile lines was more rapid and enhanced than that of the sterile lines ( Figure 6C and 6D). This result indicates that the second major transcriptome change is associated with leaf senescence, which autonomously starts independent of the development of the sink, but is accelerated by the sink formation. Delaying leaf senescence in order to maintain the photosynthetic activity for as long as possible may improve source potential. However, we noted that the expression of photosynthesis related genes as described in KEGG database [59], namely, osa00196 (Photosynthesisantenna proteins) and osa00195 (Photosynthesis), decreased more drastically in fertile plants than in sterile plants (See Additional file 18), suggesting that the process of nutrient translocation has a negative effect on photosynthetic potential in senescent leaves. It is therefore unlikely that delaying leaf senescence is a viable approach for improving source potential in rice. In contrast with the QTLs for sink size [60][61][62], the QTL gene associated with source potential has not yet been cloned, presumably due to the complexity of sink-source interaction which makes it difficult to monitor physiological traits associated with photosynthesis and nutrient translocation. We have shown here particularly in the analysis of the first transcriptome change that a wide range of transcriptome profile could provide new insights into many physiological processes that underlie phase transition. Therefore, further high-resolution transcriptome analysis and data mining for integrative physiology will be essential in elucidating the complex regulation of sink-source interaction.

Conclusions
By spatiotemporal gene expression profiling, we were able to clarify organ/tissue-specific, diurnal, and growth stage-specific gene expression signatures throughout the entire growth cycle under natural field conditions. Our analysis also highlights the synchronized change of gene expression across separate organs and tissues suggesting the possible involvement of a long distance signaling mechanism in developmental and diurnal gene expression. Consistent with this, we found that rice florigen gene and miR399 expressed in leaves have direct effects on gene activity in other organs. More importantly, we have identified two major changes of leaf transcriptional programs reflecting growth stage-specific gene expression signatures. The drastic change in phosphate-homeostasis state before the panicle differentiation as evident from the behavior of several key regulators such as miR399 under the natural field conditions maybe associated with reproductive processes involved in expression of agronomic traits that determine crop yield. Taken together, a field transcriptome obtained by a wide range of gene expression profiling provided not only baseline information for functional characterization of genes but also revealed critical developmental and physiological transitions involved in the expression of growth potential under natural field conditions. With the accompanying gene expression profile database, RiceXPro [63] http://ricexpro.dna.affrc.go.jp/, our resources on gene expression profiling may contribute to innovative crop improvements that have not yet been tried in classical or molecular breeding.

Plant materials and sampling
Rice (Oryza sativa L. ssp. japonica cv. Nipponbare) seeds were sown in germinating boxes. At 30 days after germination, the seedlings were transplanted in a paddy field in Tsukuba, Japan (~36°N) and grown under normal conditions during the 2008 cultivation season. To clarify the effect of sink formation during leaf senescence, three sterile Tos17 insertion mutant lines http:// tos.nias.affrc.go.jp/~miyao/pub/tos17/, namely, pair1 (ND0016) [55], pair2 (NC0122) [56,57], and mel1-1 (NC0005) [58], were also grown in the field in 2009 for leaf sampling. Fertile plants (homozygous or heterozygous) derived from a segregating population were used for comparison. All samples were immediately frozen in liquid nitrogen and stored at -80°C until RNA isolation.

Microarray analysis
Total RNA was isolated from each plant sample using RNeasy plant mini kit (QIAGEN). For endosperm samples, total RNA was isolated by phenol extraction [64], and further purified using the mini spin column from the RNeasy plant mini kit. The extracted RNA was quantified using NanoDrop ND-1000 UV-VIS spectrophotometer (NanoDrop) and checked for quality using an Agilent 2100 Bioanalyzer (Agilent Technologies, Palo Alto, CA, USA). One-color spike-mix was added to the total RNA prior to the labeling reaction. Labeling was performed using a Quick Amp Labeling Kit, One-Color (Agilent Technologies) in the presence of cyanine-3 (Cy3)-CTP according to the manufacturer's protocol. For microarray hybridization, 1650 ng of Cy3-labeled cRNA (except for samples with low cRNA yield) was fragmented and hybridized on a slide of rice 4 × 44K microarray RAP-DB (Agilent; G2519F#15241) at 65°C for 17 hours. Hybridization and washing of the hybridized slides were performed according to the manufacturer's instruction. Slides were scanned on the Agilent G2505B DNA microarray scanner, and background correction of the Cy3 raw signals was performed using the Agilent Feature Extraction software (version 9.5.3.1).

Statistical analysis
The processed raw signal intensity of all probes (45,151) were subjected to 75 percentile normalization with GeneSpringGX11 for inter-array comparison (Agilent Technologies) and transformed to log2 scale. A total of 35,760 probes were extracted after the normalization and used for analysis of the gene expression profile of various organs/tissues from 143 microarray data. For comparison of expression patterns of each gene, we performed an additional normalization procedure. The median expression values across the data used for each analysis were subtracted for each gene. The gene-normalized values were assigned as relative expression values. Analysis of the continuous expression profiles comprising of 51 microarray data, was based on 29,119 probes with raw signal intensities above 100 in at least one sample. Unpaired t-test and PCA were performed with the GeneSpringGX11. In the t-test, the p values were adjusted for multiple testing by the Benjamine and Hochberg's method to correct the false discovery rate (FDR). For hierarchical cluster analyses in Additional file 11 and 16, we used the uncentered Pearson correlation and centroid linkage algorithm with relative expression values with the GeneSpringGX11. We performed statistical analysis of sterile and fertile lines using expression profiles of ND0016, NC0122, and NC0005 as three replicates. Analysis was based on 24,577 probes with raw signal intensities above 100 in at least one sample. The heat maps in Figure 2, 4, Additional file 3, and 5 were constructed using heatmap.2 in the "gplots" package of R program http://www.R-project.org. Clustering of fertile and sterile lines in Figure  6C were performed based on correlation distance and complete linkage with normalized signal intensities of 24,577 probes using the R program.

Extraction of organ/tissue-specific expressed genes
We used Shannon entropy to evaluate the organ and tissue specificity of gene expression [16]. The raw signal intensity values were applied to quantile normalization and transformed to log2 scale with GenespringGX11 prior to calculation of Shannon entropy. We selected a total of 731 organ/tissue-specific genes with entropy values below 4.5 and relative expression values above 8 in at least one sample. Transcription factors contained in the organ/tissue-specific genes were extracted based on the information in PlnTFDB [65] Analysis of miRNA expression In addition to the 35,760 probes, the rice 4x44K microarray RAP-DB contains 340 independent probes for the precursors of microRNAs with two different probes designed for each precursor. We used these probes to examine the expression of miR169 and miR399.

GO analysis
For the GO analysis, generic GO annotated in RAP-DB was converted to GO slim using map2slim.pl script available from the go-perl page at CPAN http://search. cpan.org/~cmungall/go-perl/. For analysis of significantly enriched GO categories, we used the BiNGO plugin http://www.psb.ugent.be/cbd/papers/BiNGO/ for