Complementary genetic and genomic approaches help characterize the linkage group I seed protein QTL in soybean
- Yung-Tsi Bolon†1,
- Bindu Joseph†2,
- Steven B Cannon3,
- Michelle A Graham3,
- Brian W Diers4,
- Andrew D Farmer5,
- Gregory D May5,
- Gary J Muehlbauer6,
- James E Specht7,
- Zheng Jin Tu8,
- Nathan Weeks3,
- Wayne W Xu8,
- Randy C Shoemaker3 and
- Carroll P Vance1, 6
© Bolon et al; licensee BioMed Central Ltd. 2010
Received: 15 October 2009
Accepted: 3 March 2010
Published: 3 March 2010
The nutritional and economic value of many crops is effectively a function of seed protein and oil content. Insight into the genetic and molecular control mechanisms involved in the deposition of these constituents in the developing seed is needed to guide crop improvement. A quantitative trait locus (QTL) on Linkage Group I (LG I) of soybean (Glycine max (L.) Merrill) has a striking effect on seed protein content.
A soybean near-isogenic line (NIL) pair contrasting in seed protein and differing in an introgressed genomic segment containing the LG I protein QTL was used as a resource to demarcate the QTL region and to study variation in transcript abundance in developing seed. The LG I QTL region was delineated to less than 8.4 Mbp of genomic sequence on chromosome 20. Using Affymetrix® Soy GeneChip and high-throughput Illumina® whole transcriptome sequencing platforms, 13 genes displaying significant seed transcript accumulation differences between NILs were identified that mapped to the 8.4 Mbp LG I protein QTL region.
This study identifies gene candidates at the LG I protein QTL for potential involvement in the regulation of protein content in the soybean seed. The results demonstrate the power of complementary approaches to characterize contrasting NILs and provide genome-wide transcriptome insight towards understanding seed biology and the soybean genome.
Seed protein and oil are crucial to the value of many crop species. During seed development, carbon and nitrogen are partitioned among protein, oil, and carbohydrates [1–6]. In legumes, particularly soybean (Glycine max (L.) Merrill), protein and oil are primary nutritional components of mature seed. Protein and oil comprise some 40% and 20%, respectively, of soybean seed. Protein meal is a major byproduct of soybean processing, and high seed protein content allows processors to derive meal with high nutritional value . A better understanding of the genetic basis of seed protein variation is important for developing strategies to improve seed quality traits not only in soybean but also in other legumes and cereal grains.
Storage reserves account for the majority of the protein in the seed [8, 9]. The period of seed development where these reserves accumulate is commonly referred to as the seed filling stage, a 4- to 5-week period of cell expansion that occurs once cell division is complete . The most prevalent seed storage proteins in soybean are beta-conglycinin and glycinin [11, 12]. A number of diverse and interlinked processes, including photosynthesis, sucrose signaling, and transport, are associated with seed development and the regulation of complex traits [2, 13, 14].
Seed constituents and seed size are inherited in a quantitative manner. Many quantitative trait loci (QTLs) associated with seed protein and size have been identified in several species including wheat, Arabidopsis, rice, pea, and barley. In soybean, numerous QTLs associated with protein have been identified [19–23]. The seed protein QTL mapped to soybean linkage group I (LG I) is of particular interest due to the large additive effect that accounts for its consistent detection in many soybean mapping populations [22, 24, 25] and across multiple environments. Inheritance of the high protein allele from G. soja at LG I resulted in a seed protein increase of 18 to 24 g/kg, and this increase was also associated with lower oil concentration [24, 25, 27]; a negative phenotypic correlation between soybean seed protein and oil content is well documented [28–31]. Nichols et al. fine-mapped the LG I protein QTL region to a 3 cM interval using BC5F5-derived near-isogenic lines (NILs) contrasting in seed protein and oil. Although linkage analysis is a valuable tool for localizing genetic regions of interest for a trait, the capabilities of mapping can be greatly enhanced by genomic approaches to identify genes that may control these traits.
Analyses of transcript profiles by microarrays have provided insight into the genes and processes involved in developing seed of Arabidopsis [33, 34], soybean [35–37], Medicago truncatula [4, 38, 39], wheat , barley [5, 41], and rice panicles . Transcript changes, especially when used to contrast NILs, have proven useful for the discovery of genes of interest in soybean and other species [5, 43–45].
In the present study, we leveraged a combination of resources (a NIL pair that differed substantially in seed protein, transcript profiling by Affymetrix® Soy GeneChip microarray, Illumina® high-throughput transcriptome sequencing platforms, and the newly available soybean genome sequence) to assess genomic and genetic contributions to seed protein traits in soybean. The objectives of our study were to: 1) define the borders of the genomic segment encompassing the LG I protein QTL region, 2) characterize transcript accumulation in the developing seed of a NIL pair known to produce contrasting final seed protein content, and 3) identify candidate genes for this seed protein QTL. The accomplishment of these objectives constitutes the first step toward understanding the genetic and molecular mechanisms underlying the regulation of seed protein. In addition, the large dataset provided through this study is a valuable tool for further analysis of the soybean transcriptome.
Demarcation of the QTL region
To obtain a physical map of the protein QTL region, BAC (bacterial artificial chromosome) libraries of soybean genomic DNA were scanned for alignment to known markers, and a BAC-based physical map was assembled to span markers Satt239 and Satt496 (Figure 1). This BAC-based map accounted for approximately 1.2 Mb of the QTL region. Newly derived SSR (simple sequence repeat) markers from the BAC sequence that were polymorphic between A81-356022 and PI468916 were screened to determine if they segregated in the P-C609-45-2 population. Because the introgressed QTL-containing segment was segregating in the P-C609-45-2 population, markers located in that region were expected to segregate in the population. Upon release of the soybean whole genome sequence, alignment of BAC sequences to the soybean whole genome assembly (version Glyma1, ) identified chromosome 20 as the best match to all the BACs in the LG I protein QTL physical map. The order of BAC sequence alignment to chromosome 20 was in agreement with the physical map (Figures 1C, D, and 1E).
Forty-eight SSR markers (Figure 1D), including 42 SSR markers (see Additional file 1) derived from the BAC sequences and from the whole genome sequence spanning the QTL region plus six previously genetically mapped SSR markers , were screened for segregation as described above. Thirty-four of the 42 SSR markers derived in this study segregated in the P-C609-45-2 population. The high and low protein phenotypes of the segregating progeny corresponded to the expected parental marker alleles originating from the high and low protein parents .
The protein QTL region was delineated to approximately 8.4 Mbp of genomic sequence between Sat_174 and ssrpqtl_38, the two closest non-segregating SSR markers flanking the left and right borders of the protein QTL region on chromosome 20 (Figure 1D). The coordinates of the borders stretch from 24.54 Mb to 32.92 Mb on chromosome 20.
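As a quick check of the interval arithmetic, the span implied by the reported border coordinates can be computed directly. The snippet below is only a sketch; the function and variable names are illustrative and do not come from the study.

```python
# Border coordinates of the LG I protein QTL region on chromosome 20, in Mb,
# as reported in the text above.
LEFT_BORDER_MB = 24.54
RIGHT_BORDER_MB = 32.92


def region_span_mb(start_mb: float, end_mb: float) -> float:
    """Return the length of an interval in Mb, rounded to two decimals."""
    if end_mb < start_mb:
        raise ValueError("end must not be smaller than start")
    return round(end_mb - start_mb, 2)


span = region_span_mb(LEFT_BORDER_MB, RIGHT_BORDER_MB)
print(f"QTL region span: {span} Mb")
```

The result, 8.38 Mb, agrees with the approximately 8.4 Mbp stated for the region.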
Phenotypic evaluation of seed protein and oil in NILs
Transcript accumulation changes during seed fill
To examine transcript accumulation changes during seed fill, transcript profiles were evaluated in seeds of each genotype (LoPro or HiPro) from the four stages above by Soy Genome Affymetrix® GeneChip analyses. Out of 37,701 soybean probesets on the GeneChip, 64-69% were defined as 'present' in three out of three replicates by MAS5 analysis of the various seed stages in both genotypes. These detection figures are comparable to those found in seed microarray studies of other species [34, 38]. Differences in the transcriptomes of the NIL pair may reflect or affect the high and low protein and oil phenotypes seen in the lines. Using Student's t-test to evaluate significance, Affymetrix® GeneChip probesets with at least 1.5-fold change between stages were identified at an FDR (false discovery rate) of less than 5%. Transcript accumulation changes across stages were evaluated with reference to the stage one profiles (stage two vs. stage one, stage three vs. stage one, and stage four vs. stage one). In both genotypes, no probesets from the stage two versus stage one comparison qualified under the FDR < 0.05 criterion, so this comparison was excluded from further analysis. The number of probesets representing differentially accumulated transcripts with higher accumulation in stage three compared to stage one was greater in HiPro than in LoPro (716 vs. 616), and this difference was again apparent between stages four and one (2094 vs. 1294) (see Additional files 2 and 3).
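The selection rule described above (at least 1.5-fold change and FDR below 5%) can be sketched in a few lines. The text here does not name the FDR procedure, so the Benjamini-Hochberg adjustment used below is an assumption, as are the probeset names, intensities, and p-values, which are invented for illustration. In a real analysis the p-values would come from the t-test on replicate measurements.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in input order.

    Assumption: the study's FDR method is not specified in this section;
    Benjamini-Hochberg is used here only as a common choice.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


def select_probesets(records, min_fold=1.5, max_fdr=0.05):
    """Keep probesets passing both the fold-change and the FDR threshold.

    records: list of (probeset_id, mean_stage_a, mean_stage_b, p_value),
    with means as positive, linear-scale intensities.
    """
    q_values = benjamini_hochberg([r[3] for r in records])
    selected = []
    for (probe_id, mean_a, mean_b, _), q in zip(records, q_values):
        ratio = mean_b / mean_a
        fold = max(ratio, 1.0 / ratio)  # fold change in either direction
        if fold >= min_fold and q < max_fdr:
            selected.append(probe_id)
    return selected


# Hypothetical example data (not from the study).
records = [
    ("probe_A", 100.0, 400.0, 0.001),  # 4-fold up, small p-value
    ("probe_B", 100.0, 120.0, 0.002),  # significant but only 1.2-fold
    ("probe_C", 200.0, 90.0, 0.03),    # about 2.2-fold down
    ("probe_D", 150.0, 300.0, 0.5),    # 2-fold but not significant
]
selected = select_probesets(records)
print(selected)
```

Applying both filters together means a probeset with a large fold change but a weak p-value (probe_D) and one with a strong p-value but a small fold change (probe_B) are both excluded.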
Analysis of all probeset expression changes revealed that 18.2% of the genes that significantly increased in expression over time in either genotype were shared between LoPro and HiPro (see Additional file 4). Transcripts common to both genotypes that increase significantly in stage four seed as opposed to stage one seed include: beta-conglycinins and glycinins, sucrose binding proteins, heat shock chaperonins, late embryogenesis messages, seed maturation proteins, glutathione S-transferases and peroxidases, iron binding and flavonoid synthesis proteins, and numerous transporters. Interestingly, 25 transcripts with ubiquitin-related annotations were found to increase in accumulation over time in both genotypes while three were found to decrease in accumulation levels in both genotypes (see Additional files 4 and 5). It is noteworthy that some 53 transcription factor messages showed enhanced abundance at stage three or four versus stage one seed.
Of the genes that decreased in expression over time, 30.2% of these genes were shared between LoPro and HiPro (see Additional file 5). Transcripts common to both genotypes that were reduced in abundance at stage four as opposed to stage one include genes involved in flavonoid metabolism, cell wall deposition, kinases (particularly those related to cell cycle), response to arachidonic acid, strictosidine synthesis, and disease resistance response. Twenty transcription factor annotations were common to both lines and displayed reduced abundance by stage three or four.
During seed development, the synthesis of seed storage products is coordinated with carbohydrate and nitrogen metabolic processes involving many transporters . Some 26 transport-related transcripts increased in abundance in both genotypes, including gene transcripts annotated as ammonium, sugar, metal, and ion transporters (see Additional file 4). Meanwhile, some 33 transport-related transcripts decreased in accumulation levels in both genotypes, and these included transcripts annotated as ammonium, sugar, and ABC transporters (see Additional file 5).
A high number of microtubule-related gene transcripts were also found to decrease in abundance, supporting a role for fundamental transport mechanisms [50, 51] and the slowing of cell expansion  during these stages of seed development. Eleven microtubule-related transcripts, including those involved in activity and movement, were found to decrease in abundance versus four microtubule-related transcripts that increased in abundance in both genotypes (see Additional files 4 and 5). Cyclin-related transcripts were also found in both genotypes. It is interesting to note that of the transcripts directly associated with cell division cycle annotations, those that increased in abundance included transcripts for Cdc48 and five transcripts annotated as tyrosine kinase specific for activated (GTP-bound) p21cdc42Hs (see Additional file 4). Those that decreased in abundance included transcripts for Cdc20 and Cdc50 (see Additional file 5). At least one transcript related to Cdc2 was found to accumulate in both directions for both genotypes over time (see Additional files 4 and 5).
Sucrose is well known for its many roles during seed development [3, 53–55]. Sixteen transcripts with sucrose-related annotations were found to increase in accumulation in both genotypes, and these annotations included sucrose-binding protein and sucrose degradation and transport-related genes (see Additional file 4). This number is in contrast to the five sucrose-related transcripts that were found to decrease in accumulation in both genotypes and that included sucrose degradation and sucrose response genes (see Additional file 5).
The 15 most highly upregulated Affymetrix® probesets found in HiPro from stage one to stage four
Ratio of Means (Stage 4/Stage 1)
Cluster: Beta-conglycinin, beta chain precursor; Glycine max
Cluster: Seed maturation protein PM31; Glycine max (E-value 4 × 10⁻⁸⁷)
Cluster: G. max mRNA from stress-induced gene; Glycine max (E-value 2 × 10⁻⁷⁹)
Cluster: Oxidoreductase, short chain dehydrogenase/reductase family, putative; Medicago truncatula (E-value 6 × 10⁻⁵⁶)
Cluster: Sucrose-binding protein 2; Glycine max
Cluster: Glycinin G3 precursor [Contains: Glycinin A subunit; Glycinin B subunit]; Glycine max
Cluster: Oxidoreductase, short chain dehydrogenase/reductase family, putative; Medicago truncatula (E-value 1 × 10⁻⁶⁵)
Cluster: Ferritin-2, chloroplast precursor; Glycine max (E-value 1 × 10⁻¹⁴²)
Cluster: Late embryogenesis-abundant protein; Glycine max (E-value 7 × 10⁻⁵⁴)
Cluster: Ferritin-2, chloroplast precursor; Glycine max (E-value 1 × 10⁻¹⁴²)
Cluster: Expressed protein; Oryza sativa (japonica cultivar-group) (E-value 1 × 10⁻¹⁹)
Rep: Pv42p - Phaseolus vulgaris (Kidney bean) (French bean) (E-value 1 × 10⁻¹⁴)
Cluster: Basic 7S globulin 2 precursor (Bg) (SBg7S); Glycine max
The 15 most highly upregulated Affymetrix® probesets found in LoPro from stage one to stage four
Ratio of Means (Stage 4/Stage 1)
Cluster: Heat shock protein Hsp20; Medicago truncatula (E-value 2 × 10⁻⁶⁵)
Cluster: Specific tissue protein 1; Cicer arietinum (Chickpea) (E-value 1 × 10⁻⁴⁵)
Cluster: Putative peptide transporter; Arabidopsis thaliana (E-value 2 × 10⁻⁵⁵)
Cluster: Suspensor-specific protein; Phaseolus coccineus (E-value 5 × 10⁻²³)
Cluster: Glutathione S-transferase GST 11; Glycine max (E-value 1 × 10⁻¹²⁴)
Rep: At5g54075 - Arabidopsis thaliana (E-value 5 × 10⁻⁷)
Cluster: Os12g0514100 protein; Oryza sativa (japonica cultivar-group) (E-value 2 × 10⁻¹⁷)
Cluster: Expressed protein; Arabidopsis thaliana (E-value 1 × 10⁻¹²)
Cluster: predicted protein; Magnaporthe grisea 70-15 (E-value 4 × 10⁻⁰⁷)
Cluster: Hypothetical protein F21F14.210; Arabidopsis thaliana (E-value 3 × 10⁻⁹⁹)
Transcripts for specific genes were also examined closely. The effect of Dof transcription factors on seed oil regulation has been previously documented, where GmDof4 and GmDof11 were found to contribute to high seed oil phenotypes in Arabidopsis. In our study, Dof22 and Dof24 genes were upregulated in the HiPro soy line, but no significant difference was seen in the transcript abundance for Dof4 and Dof11 in either genotype (data not shown).
Differentially accumulated transcripts between NILs identified by microarray
Differentially accumulated transcripts between LoPro and HiPro identified by Affymetrix® Soy GeneChip
Ratio of Means
2.43 × 10⁻¹⁵
9.16 × 10⁻¹¹
1 × 10⁻¹⁹
2.63 × 10⁻¹²
1.98 × 10⁻⁰⁸
5.51 × 10⁻¹⁵
1.04 × 10⁻¹⁰
6 × 10⁻²⁴
1.05 × 10⁻¹⁴
1.32 × 10⁻¹⁰
4.38 × 10⁻¹³
4.12 × 10⁻⁰⁹
2.19 × 10⁻¹⁰
9.17 × 10⁻⁰⁷
5.95 × 10⁻¹²
3.74 × 10⁻⁰⁸
2.99 × 10⁻¹¹
1.61 × 10⁻⁰⁷
1.01 × 10⁻¹⁰
4.74 × 10⁻⁰⁷
2 × 10⁻²⁷
8.08 × 10⁻⁰⁷
2.42 × 10⁻⁰³
3 × 10⁻²³
8.35 × 10⁻⁰⁷
2.42 × 10⁻⁰³
1.51 × 10⁻⁰⁶
3.79 × 10⁻⁰³
2.31 × 10⁻⁰⁵
4.58 × 10⁻⁰²
3 × 10⁻⁷⁵
An N-way ANOVA test was also conducted to examine transcript accumulation differences simultaneously across multiple factors: genotype and time (stage) within the genotype. At FDR < 0.05, a total of 66 Soy Affymetrix® probesets were detected with differential changes in transcript accumulation using this method (see Additional file 9). Interestingly, five transcription factor-related transcripts, annotated as bZIP, ethylene-responsive, or heat shock, were detected with differential accumulation patterns (see Additional file 9). Again, the six probesets with the most highly differential accumulation values were represented (Table 3, Figure 3, see Additional file 9).
To further validate the microarray data, quantitative reverse transcriptase-polymerase chain reaction (qRT-PCR) was performed. Specific primers were designed for the three genes and an actin control. Significant differences between LoPro and HiPro were observed for pqi2 and pqi3 (Figure 4B). However, no significant transcript level fold changes were observed for pqi1 (Figure 4B). Thus, only two of the three genes identified as upregulated in LoPro in prior analyses were determined to display differentially accumulating transcripts between the two genotypes by qRT-PCR.
Genes with differentially accumulated transcripts between NILs map to the LG I protein QTL
Transcripts identified by N-way ANOVA (see Additional file 9) were aligned to the genome sequence to show the range and distribution along the soybean chromosomes (Figure 5B). The soybean genome sequence reveals a general bias toward gene-rich chromosome ends , a phenomenon that has been observed in other plant genomes . However, a striking concentration of probes (16 out of 66) mapped to chromosome 20 at the protein QTL region (Figure 5B). The presence of differentially accumulating transcripts in this region is consistent with the development of a near-isogenic line pair that displays variation in seed protein phenotype and segregation of markers within the protein QTL region. Recently, Wei et al.  also performed a transcriptome analysis using rice superhybrid LYP9 and mapped differentially expressed genes to yield-related QTLs in the rice genome.
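Counting how many aligned probesets fall inside the QTL interval is a simple coordinate filter. The sketch below uses the border coordinates reported for the region (24.54 Mb to 32.92 Mb on chromosome 20); the chromosome label, probeset names, and positions are hypothetical placeholders, not data from the study.

```python
# Borders of the LG I protein QTL region reported in the text (24.54-32.92 Mb).
QTL_CHROM = "Gm20"          # hypothetical label for chromosome 20
QTL_START_BP = 24_540_000
QTL_END_BP = 32_920_000


def in_qtl_region(chrom, position_bp):
    """True if a position lies inside the QTL interval (borders inclusive)."""
    return chrom == QTL_CHROM and QTL_START_BP <= position_bp <= QTL_END_BP


def count_in_region(alignments):
    """alignments: iterable of (probeset_id, chromosome, position_bp)."""
    inside = [pid for pid, chrom, pos in alignments if in_qtl_region(chrom, pos)]
    return len(inside), inside


# Hypothetical alignments for illustration only.
example_alignments = [
    ("probe_1", "Gm20", 25_000_000),  # inside the interval
    ("probe_2", "Gm20", 40_000_000),  # same chromosome, outside
    ("probe_3", "Gm05", 26_000_000),  # other chromosome
    ("probe_4", "Gm20", 32_920_000),  # exactly on the right border
]
n_inside, ids_inside = count_in_region(example_alignments)
print(f"{n_inside} of {len(example_alignments)} probesets map to the QTL region")
```

Applied to the real alignment table, the same filter would give the count reported above (16 of 66 probes).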
Differentially accumulated transcripts between NILs identified by high-throughput transcriptome sequencing
Because the Soy Genome Affymetrix® GeneChip does not represent the complete set of soybean genes, high-throughput transcriptome sequencing (HTTS) was performed to confirm the microarray data and search for additional candidate genes. Using the same RNA samples prepared for microarray analysis as templates for high-throughput deep sequencing, more than 76 million reads were sequenced, each 36 or 46 nucleotides in length, using the Illumina® Genome Analyzer. Sequences were generated from random priming sites within transcript cDNA from each of the four stages in LoPro and in HiPro, producing more than 7 million reads per stage. Of these reads, more than 20 million aligned uniquely to the genome sequence. The soybean genome sequencing consortium predicted 68,013 gene models and 5,977 additional transposon-like gene models. From that initial set of gene models, the consortium identified 46,430 "high-confidence" genes. In the current seed NILs sequencing effort, 40,352 of the 46,430 high-confidence genes (86%) show evidence of expression. An additional 6,078 predicted genes not in the high-confidence set show evidence of expression from the seed NILs data.
Differentially accumulated transcripts between LoPro and HiPro identified by Illumina® high-throughput transcriptome sequencing
2 × 10⁻²¹
no alignments with E-value < 10⁻¹⁰
no alignments with E-value < 10⁻¹⁰
5 × 10⁻⁷²
Putative ammonium transporter AMT1
no alignments with E-value < 10⁻¹⁰
ATP synthase D chain
4 × 10⁻¹⁷
no alignments with E-value < 10⁻¹⁰
Putative uncharacterized protein
2 × 10⁻²¹
no alignments with E-value < 10⁻¹⁰
Affymetrix® GeneChip vs. Illumina® high-throughput transcriptome sequencing analysis
Close comparison of the transcripts identified by HTTS (Table 4, #1 and #8) showed the presence of the two most highly differentially accumulated transcripts identified by Affymetrix® GeneChip analysis (Table 3, #1-pqi2 and #3-pqi3). Examination of the coordinates of the most highly differentially accumulated transcripts revealed a distance of 3.7 Mb between pqi2 and pqi3 (Figure 5A). However, the positioning of the soybean target sequence from the Affymetrix® GeneChip for these genes did not directly conform to the predicted gene models in the soybean genome (version Glyma1, ).
Interestingly, two pairs of transcripts identified from the Illumina® deep sequencing analysis (Table 4, #2 and #3, #5 and #6) appeared in the same region with overlapping chromosome coordinates but on opposite strands. Transcripts with sequence homology to known proteins included an ethylene receptor and a glutamyl-tRNA synthetase that presented differentially accumulated transcripts at only one stage, as well as a putative ammonium transporter (Table 4). Examination of the available Affymetrix® Soy GeneChip target equivalents that overlapped the region, however, did not provide support for the ethylene receptor and ammonium transporter transcript accumulation differences (Table 4, see Additional file 11). In all, the union of Affymetrix® Soy GeneChip and Illumina® deep sequencing transcriptome data yielded 13 genes with differentially accumulating transcripts that mapped to the protein QTL region at LG I on chromosome 20 (Tables 3 and 4).
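Transcript pairs with overlapping coordinates on opposite strands, as described above, can be found with a pairwise interval comparison. The following is a minimal sketch under the assumption of 1-based, inclusive coordinates; the transcript names, chromosome label, and positions are invented for illustration and are not the coordinates of the Table 4 transcripts.

```python
from typing import NamedTuple


class Transcript(NamedTuple):
    name: str
    chrom: str
    start: int   # 1-based, inclusive (assumed convention)
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


def overlap_length(a: Transcript, b: Transcript) -> int:
    """Number of shared bases between two transcripts (0 if none)."""
    if a.chrom != b.chrom:
        return 0
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


def antisense_pairs(transcripts):
    """Return name pairs that overlap in coordinates but lie on opposite strands."""
    pairs = []
    for i, a in enumerate(transcripts):
        for b in transcripts[i + 1:]:
            if a.strand != b.strand and overlap_length(a, b) > 0:
                pairs.append((a.name, b.name))
    return pairs


# Hypothetical transcripts for illustration only.
transcripts = [
    Transcript("t1", "Gm20", 1000, 2000, "+"),
    Transcript("t2", "Gm20", 1500, 2500, "-"),
    Transcript("t3", "Gm20", 1800, 2200, "+"),
    Transcript("t4", "Gm20", 3000, 3500, "-"),
]
pairs = antisense_pairs(transcripts)
print(pairs)
```

The pairwise loop is quadratic in the number of transcripts, which is acceptable for a short candidate list; a genome-wide scan would normally sort by start coordinate first.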
Genome-wide gene expression coverage
Ten genes at the LG I protein QTL region with high expression evidence.
1 × 10⁻¹¹²
Putative 40S ribosomal protein
1 × 10⁻¹⁵³
Putative isopenicillin N epimerase
1 × 10⁻¹⁰⁰
ADP ribosylation factor 002
5 × 10⁻³⁹
1 × 10⁻¹⁵⁹
Putative uncharacterized protein
5 × 10⁻³⁹
Nicotiana lesion-inducing like
Ser/Thr protein kinase - Lotus japonicus
1 × 10⁻⁴⁷
Figure 7 shows the gene region for pqi2 where only four of the eight libraries show transcript counts (values for A1 to A4 only; red, orange, yellow, green), consistent with transcript accumulation in LoPro versus HiPro (Tables 3 and 4). The coverage depth track shows the extent of redundancy in coverage at any nucleotide location; for this gene, peak coverage is at approximately 12 reads in any single location. The coverage track shows transcript accumulation at four of the seven predicted exons in the Glyma1.01 gene model for Glyma20g18880 but also at several other regions outside the predicted gene model. Thus, the HTTS data provide information about expression patterns as well as gene structure and can aid in the improvement of soy gene annotation in the soybean genome while providing genome-wide expression data on seed development.
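A coverage depth track of the kind described above records, for each nucleotide, how many aligned reads span it. The sketch below illustrates one way to compute it with a difference array. It is not the pipeline used in the study, and the read start positions and region bounds are hypothetical; only the read lengths (36 and 46 nucleotides) are taken from the text.

```python
def coverage_depth(reads, region_start, region_end):
    """Per-base read depth over [region_start, region_end] (1-based, inclusive).

    reads: iterable of (read_start, read_length) for reads aligned to the
    same chromosome as the region. Reads are clipped to the region.
    """
    length = region_end - region_start + 1
    diff = [0] * (length + 1)  # difference array: +1 at read start, -1 past read end
    for read_start, read_length in reads:
        lo = max(read_start, region_start)
        hi = min(read_start + read_length - 1, region_end)
        if lo > hi:
            continue  # read lies entirely outside the region
        diff[lo - region_start] += 1
        diff[hi - region_start + 1] -= 1
    depth = []
    running = 0
    for delta in diff[:length]:
        running += delta
        depth.append(running)
    return depth


# Hypothetical reads (start, length) over a 100-bp region.
reads = [(1, 36), (10, 36), (30, 46), (90, 36)]
depth = coverage_depth(reads, region_start=1, region_end=100)
peak = max(depth)
print(f"peak coverage: {peak} reads")
```

The peak of this array corresponds to the "peak coverage" figure quoted for a gene region, and runs of zero depth inside a predicted exon indicate positions with no supporting reads.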
Seed protein and oil relationships
It has long been documented that seed protein and oil content are inversely correlated in the soybean seed [28–31, 46, 61]. Low oil alleles are consistently cotransmitted with high protein alleles in many instances [30, 62], and attempts to separate these two traits through chromosomal recombination in the NILs used in this study have not been successful . It has been hypothesized that this relationship may be due to either very tight linkage or pleiotropic effects . Whether one phenotype directly or indirectly results in the other is unknown, and the timing of events regarding differential accumulation of contrasting protein and oil levels in the seed is uncertain. GmDof4 and GmDof11 transcription factors, however, have been reported to activate genes involved in lipid biosynthesis and simultaneously suppress the expression of storage protein genes .
Transcription factors have also been shown to influence seed traits in other studies. For example, the putative AP2/EREBP transcription factor WRINKLED1 was found to be involved in the regulation of seed oil accumulation in Arabidopsis [63, 64], and a QTL encoding a NAC transcription factor was found to control grain protein and leaf senescence in wheat . In addition, seed mass in Arabidopsis has been shown to be regulated by the APETALA2 (AP2) class of transcription factors . Verdier et al.  evaluated the expression of transcription factors throughout seed development of Medicago truncatula. They found some 343 transcription factors were expressed equally throughout seed development while 169 had differential expression at one or more stages. Cluster analysis demonstrated six different clusters of transcription factor genes that corresponded to the developmental stages evaluated. Many of the 53 transcription factors that were found to be upregulated in this study during seed development of the soybean NILs were similar to those described by Verdier et al. .
Transcriptional suppression of some aspect of seed protein accumulation could be envisioned for the low protein/high oil NIL homozygous for the G. max allele of the LG I QTL. However, transcriptional suppression of seed oil accumulation in the NIL homozygous for the G. soja allele (assuming a repulsion-based pleiotropy of the two alleles of the candidate gene underlying this QTL) would be envisioned to occur in a time frame late in seed fill. This assumption is due to the observation that the rate of seed oil accumulation in HiPro did not differ from that of LoPro until the last stage of seed fill (Figure 2). Although HiPro matures slightly earlier and generally yields less seed than LoPro [27, 32], these differences do not fully account for the striking differences in NIL seed protein content observed at the early stages of seed fill. Whether additional differences in the morphology or composition of the seed exist between the near-isogenic lines remains to be seen. Further detailed investigation is in progress to study the temporal and spatial distribution and partitioning of candidate gene expression that may govern the relationship between protein and oil accumulation in the developing soybean seed.
Processes and pathways influencing seed content
Comprehensive evaluations of seed transcripts through microarray analyses have been reported for Arabidopsis, Medicago truncatula [4, 38], barley [5, 41], and wheat. These studies commonly report differential expression of hundreds of genes at one or more stages of seed development involved in processes related to carbon and nitrogen metabolism, protein processing, transport of nutrients, organ development (transcription factors), signal transduction, and phytohormone balance. The transcript accumulation patterns we observed during NILs seed fill by GeneChip® microarray data were consistent with these studies.
Prior studies have demonstrated the transcription and accumulation of both mRNA and protein for beta-conglycinin and glycinin genes during the seed fill stage of seed development [66–68]. Transcripts for these seed storage proteins were identified during seed fill with particular abundance in the HiPro line (Tables 1 and 2, see Additional files 2, 3, 4, 5, 6, 7). Additional classes of genes with roles in seed development and maturation, flavonoid metabolism, and sucrose binding were also identified. Proteome analyses of the seed filling stages in soybean have provided support for the presence of gene transcripts found in this study with the detection of proteins associated with protein destination and storage, metabolism, and disease/defense. Expression of different protein isoforms has been shown to display different accumulation trends, and the activities of many genes may have multiple roles during seed filling. This phenomenon may be reflected in the increase and decrease in accumulated transcripts of lipoxygenase-related genes in this study, consistent with proteomic data on various lipoxygenases in the developing soybean seed.
Carbon metabolism directed toward oil and protein deposition plays an important role in seed quality. Changes in seed protein or oil in many plant species have been linked to the activity of acetyl-CoA carboxylase (ACCase) [69–71] and phosphoenolpyruvate carboxylase (PEPC) [72–75]. Recent proteomic and microarray studies have shown the presence of peptides and transcripts for both enzymes during seed development [6, 38, 76]. Overexpression of Arabidopsis acetyl-CoA carboxylase led to increased oil content of Brassica napus seeds and potato tubers. The acetyl-CoA carboxylase gene has also been associated with a major groat oil content QTL. In addition, inhibition of plastid acetyl-CoA carboxylase resulted in lower seed oil. In soybean, a significant correlation was found between phosphoenolpyruvate carboxylase activity and seed protein and oil concentrations, although this correlation was found to be higher for seed protein. Furthermore, overexpression of phosphoenolpyruvate carboxylase in Vicia narbonensis seed was shown to increase seed storage capacity and protein content. Although we found no significant differences in transcript expression of ACCase and PEPC between NILs, we observed that transcripts corresponding to several forms of ACCase and PEPC were expressed at all stages of seed development in this study (data not shown). Interestingly, some forms of ACCase were expressed at higher levels in the seed than others. Such data may reflect the importance of enhanced isoforms of ACCase and PEPC in seed development compared to isoforms expressed elsewhere in the plant.
Impaired storage metabolism has been linked with decreased sucrose levels , and sucrose may affect carbon flux at the transcriptional or post-transcriptional levels . Studies have shown the importance of photosynthesis in seed filling metabolism  and for the biosynthesis of seed storage products [83, 84] consistent with the wide array of photosynthesis-related genes detected during seed fill in this study. Regulation of protein destination, storage, and proteolysis, as well as metabolic and photosynthetic pathways, may contribute to the contrasting seed phenotypes seen in the NIL pair.
Additional transcript accumulation changes have been documented during seed development. A heat shock protein and peptide transporter were among the annotations of the transcripts with the greatest fold change increases from stage one to stage four in LoPro (see Additional file 3). Both a peptide transporter and heat shock-related proteins were previously found to increase dramatically during seed development in a high oil soybean line. Down-regulation of lipoxygenases and sucrose UDP-glycosyltransferase during seed development in a high oil soybean line of a previous study is also consistent with the detection of down-regulated lipoxygenase and UDP-glycosyltransferase transcripts in LoPro (see Additional file 3). The transcript accumulation patterns of these genes may be a feature common to soybean lines with high oil phenotypes.
Candidates for regulation of seed protein and oil
We identified 14 genes mapping to the protein QTL region at LG I that may play a role in the regulation of seed protein and oil. Thirteen of these 14 genes displayed differentially accumulating transcripts. Of these 13, 11 were found at high levels in the low protein line with low or no detectable levels in the high protein line. Based on sequence homology searches to protein databases, these candidates include a potential regulatory protein in the Mov34-1 family, a heat shock protein Hsp22.5, and an ATP synthase (Table 3).
Although the Mov34-1 candidate appeared to possess versatile domains for the potential regulation of multiple processes, transcripts isolated from this candidate region contained numerous stop codons, raising the possibility of non-coding genes. The same was true for a number of the other candidates and may account for the high percentage of genes with no significant E-value returns to the Uniprot protein database . There is increasing evidence for the role of riboregulators, either as long non-protein coding RNAs or processed into small RNAs in plant development , and these molecules may play a role in seed protein and oil accumulation. Two pairs of genes among the candidates (Table 4) were found to possess overlapping transcripts; one possibility is that these overlapping transcripts form double-stranded RNAs that may be processed into small RNAs .
Evidence for the expression of heat shock proteins during the stress-independent development of the seed has previously been observed [89, 90]. Interestingly, heat shock protein genes were found to be expressed at higher levels in the low protein line of a near-isogenic line pair in barley , a phenomenon also observed in the LoPro line of this study. Previous studies have detailed an indirect relationship among the accumulation of storage proteins, lipid biosynthesis, and photosynthesis in the seed, correlating to the availability and distribution of ATP [83, 84, 91, 92]. Further investigation into the modulation of ATP synthase levels on energy status and storage product accumulation in the soybean seed will shed light on the potential role for ATP synthase as a candidate gene. Currently, the occurrence of additional candidate genes from even earlier stages of seed development is being evaluated through differential analysis of transcriptome profiles of the near-isogenic line pair.
Potential modes of regulation for seed protein and oil
The LoPro line was converted into the HiPro line upon inheritance of a G. soja allele at the LG I protein QTL region. However, the LoPro line is also the high oil line, and a number of scenarios may explain how gene expression differences relate to variation in protein and oil phenotypes in the seed.
Protein content may be positively regulated by the expression of a gene that increases protein production in HiPro. Alternatively, protein content may be negatively regulated by expression of a gene in LoPro that inhibits or reduces protein accumulation and thus allows for increased oil accumulation. Significant protein differences would then be observed at an earlier stage than oil differences, as in Figure 2. Inhibition of protein accumulation could take place at many levels, including transcriptional and post-transcriptional control and regulation of protein synthesis, transport, and turnover. The presence of candidate genes with non-coding segments raises the possibility of regulation at the transcriptional level that may affect the transcription of genes outside the list of candidates shown in this study.
Differences in transcriptome profiles may correlate directly or indirectly with the differences in protein and oil accumulation between the NILs. Previous studies have shown that seed storage proteins are largely controlled by transcriptional regulation during the seed fill stage (reviewed in ). Extensive analysis of cis-regulatory elements of seed storage proteins has demonstrated interaction of these elements with bZIP and MYB factors [39, 93–96]. Transcription of a candidate gene in LoPro may result in negative regulation of transcriptional regulators or key factors involved in high protein accumulation. The presence of sequence polymorphisms in gene sequences or promoter regions within the segregating region of the protein QTL may account for the low or absent levels of differentially accumulating gene transcripts in HiPro versus LoPro (Table 4).
In an alternative scenario, oil content may be regulated. Gene expression or transcript accumulation leading to a higher oil phenotype may act in concert with other factors to directly or indirectly lead to reduced protein accumulation. Genes regulated by transcription factors could initiate this effect. In support of this model, batch analysis of the promoter regions of the genes with the greatest differentially accumulated transcripts between the NILs revealed a number of transcription factor binding sites and seed-specific motifs (data not shown). A regulatory factor expressed in the high oil LoPro line may activate higher oil synthesis or accumulation pathways. This is consistent with the greater abundance of candidate gene transcript accumulation seen in LoPro (Table 4). Inheritance of a G. soja allele that does not allow for expression or accumulation of the high oil gene could account for the low oil and high protein phenotype in HiPro.
Utility of the HTTS dataset for understanding the soybean genome
Although we focused on the transcripts derived from the LG I region of the genome in this study, the high-throughput transcriptome sequencing (HTTS) data set we obtained comprises more than 76 million reads and 2.76 × 10⁹ nucleotides of transcript data and is an excellent resource for increasing our understanding of the soybean genome. The use of HTTS in conjunction with microarrays allowed us to detect a more comprehensive set of soybean gene transcripts. Our observation that 86% of gene transcripts in soybean were present during seed development greatly extends previous microarray-dependent seed development studies.
Recent reports demonstrate the value of high-throughput transcriptome sequencing in eukaryotes for identification of novel transcripts and transcript isoforms, untranslated regions, and gene structures, leading to improved genome annotation [97–100]. For the soybean genome, current gene models using the 8× genome sequence assembly (version Glyma1) were predicted based on protein coding sequences. By comparison, our transcriptome dataset encompasses both protein coding and non-protein coding sequences and will be useful for identification of transcripts outside of gene models. Analyses of our dataset also show evidence for the existence of novel transcript isoforms, including alternative splicing, between genotypes and among seed stages (data not shown). Moreover, beyond the detection of feature polymorphisms reported here, a comparative analysis of common transcripts between soybean lines will provide a multitude of single nucleotide polymorphisms useful in following agronomic traits in breeding populations. Currently we are analyzing high-throughput sequencing of transcripts from many soybean tissues. Those data, along with the seed transcriptome data, will be compiled into an atlas of gene expression for soybean.
This study provided the rare opportunity to intersect structural mapping and molecular profiling studies. Here, we compared the transcript abundance profiles of the developing seed from a soybean NIL pair with contrasting seed protein and identified gene candidates at the LG I protein QTL for potential involvement in the regulation of protein content in the soybean seed. The entire transcriptome sequencing dataset generated from this study is also provided as a valuable resource.
Control of protein and oil accumulation in the seed occurs at many different levels and is likely influenced by more than one gene. Of the candidate genes identified in this study, any combination could be responsible for the observed change in protein and oil phenotypes conditioned by the alleles of the LG I QTL. Other protein/oil QTLs have been identified in QTL mapping studies, but the LG I QTL is of great interest because its additive effect on seed protein and oil is the largest of any QTL identified to date. The models presented here are compatible with the role of additional genes and pathways as well as mixed models for control of seed protein and oil. Resources that include the availability of additional recombinants and the use of markers derived from this study will allow for further demarcation of the QTL region. Further studies are being conducted on additional mapping populations to dissect the relationship between protein and oil levels, and functional studies are under way to identify and validate the role(s) of candidate genes in the accumulation of protein and oil in the seed.
Physical mapping of the QTL region
The QTL flanking SSRs from a previous genetic study, Satt239 and Satt496, as well as three other SSR markers (Sat_174, Sat_219, and Satt700) in the vicinity of the putative QTL region were used to PCR (polymerase chain reaction) screen multi-dimensional pools of the soybean [Glycine max (L.) Merrill] 'Williams 82' and 'Fairbault' BAC libraries. BAC clones were end-sequenced using M13 forward and reverse primers at the Iowa State University DNA sequencing and synthesis facility. The BAC libraries were then rescreened by PCR using primers designed from BAC end-sequences, and the BAC contigs were extended by chromosome walking. BACs were fingerprinted using restriction enzymes EcoRI and AccI, and BAC overlap was confirmed by FPC (FingerPrinted Contig) 4.6.4. BAC overlap was also verified by PCR using primers from BAC end-sequences. A minimal tiling path of BACs was identified and subsequently sequenced.
BAC sequencing and assembly
BAC DNA was isolated by plasmid midi-prep (Qiagen, Valencia, CA). Randomly sheared BAC DNA was size selected for 2 to 3 kb and subcloned into the vector pCR® 4Blunt-TOPO® using the TOPO® shotgun subcloning kit (Invitrogen). The recombinant plasmids were transformed into competent TOP10 E. coli cells by electroporation. Transformants were isolated on LB plates containing kanamycin. Subclones were sequenced using M13 forward and reverse primers at the Iowa State University DNA sequencing and synthesis facility. Vector trimming, removal of poor quality reads, and sequence assembly were carried out in SeqManII (DNASTAR, Inc.) with default parameters and a minimum match percentage of 95% for sequence assembly. Contigs were ordered based on the positions of the reverse and forward reads of the same subclones. Sequence gaps were filled either by complete sequencing of the subclones that spanned the gaps or by PCR amplification across the gap using BAC DNA followed by complete sequencing of the PCR products.
Demarcation of the QTL region
The BAC sequences were aligned to the sequence scaffolds (version Glyma0 and Glyma1, ) of the genome sequence http://www.soybase.org by BLASTN . All the BAC sequences showed the best match to chromosome 20. Additional SSRs were identified from within the putative QTL region and tested for polymorphism between lines A81-356022 and PI468916. All the polymorphic SSRs were initially amplified from 'Williams 82' (the reference genotype for which the genome sequence is available) to verify that the primers were amplifying products of expected sizes and therefore were targeted to the QTL region. Further, the polymorphic markers from within the QTL region were screened for segregation in the population P-C609-45-2 described below that segregates for only the 3 cM region surrounding the QTL . This SSR analysis identified the recombination break points for a more precise positioning of the QTL region.
Development of NILs
NILs were developed by introgression of the high protein QTL allele on LG I from G. soja PI468916 into G. max A81-356022 for BC5F5 populations [25, 32]. The NIL population P-C602-15-6 contained 53 lines. A single BC5F5 plant from P-C609-45-2 that was heterozygous for the Satt496 marker in the LG I protein QTL region was designated as P-C609-45-2-2 and produced 39 BC5F6 lines. A NIL pair (LD04-15154 = HiPro and LD04-15146 = LoPro) derived from P-C609-45-2-2 was chosen from among the BC5F6 lines for segregation at the LG I protein QTL region for marker Satt496 and for corresponding high and low seed protein phenotypes from field trials. Additional markers for segregating and non-segregating regions were confirmed for the NIL pair and verified in the parental lines as described above.
Plant growth and experimental design
To minimize uncontrolled environmental variation, the NIL pair consisting of LoPro and HiPro was grown in growth chambers at the University of Minnesota. Soybeans were initially grown in the growth chamber at a photoperiod of 14/10 and thermocycle of 22°C/10°C. Day length and temperature were monitored to mimic Illinois field growing conditions. Contrasting NILs were planted in staggered pairs, and three biological replicates were conducted following a completely randomized design. Each replicate was harvested at the same time of day and consisted of seed samples at four developmental stages pooled from three plants. Samples were harvested from the NILs in parallel and flash frozen in liquid nitrogen before storage at -80°C. Stage one corresponded to 25 to 50 mg, stage two to greater than 50 to 100 mg, stage three to greater than 100 to 200 mg, and stage four to greater than 200 to 300 mg seed.
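The seed weight classes above can be written as a small lookup. The following Python sketch only restates the four weight ranges given in the text; the function name `seed_stage` is hypothetical, and treating seed outside 25 to 300 mg as unclassified is an assumption.

```python
def seed_stage(weight_mg):
    """Return the developmental stage (1-4) for a seed weight in mg.

    Boundaries follow the ranges stated in the text. Returning None for
    seed lighter than 25 mg or heavier than 300 mg is an assumption.
    """
    if 25 <= weight_mg <= 50:
        return 1
    if 50 < weight_mg <= 100:
        return 2
    if 100 < weight_mg <= 200:
        return 3
    if 200 < weight_mg <= 300:
        return 4
    return None


if __name__ == "__main__":
    for weight in (30, 75, 150, 250, 400):
        print(weight, seed_stage(weight))
```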
Seed protein and oil analysis
The NILs were grown to maturity, and seed from both genotypes was harvested at each of the four stages. Seed was also harvested from the final mature seed stage, and replicate samples were pooled by stage and genotype and analyzed for protein and oil at the Agricultural Experiment Station chemical laboratories at the University of Missouri-Columbia (UMC). Soybean seed was weighed before and after freeze-drying and then submitted to UMC for laboratory analysis. A combustion protocol using AOAC Official Method 990.03 was used to analyze protein concentration in the soybean seed samples. Oil levels were determined by ether extraction following AOAC Official Method 920.39A.
Seed was ground with liquid nitrogen by mortar and pestle. Total RNA was isolated by a modified TRIzol® (Invitrogen) protocol  and then digested with on-column RNase-free DNase (Qiagen) and purified by RNeasy column (Qiagen). RNA quality was evaluated by gel electrophoresis, spectrophotometer, and Agilent 2100 bioanalyzer.
Microarray preparation and processing
Processing and labeling of RNA samples was performed by Qiagen® Target Prep Robot at the Biomedical Image Processing Facility at the University of Minnesota. Synthesis of cDNA was performed using the SuperScript Double-Stranded cDNA Synthesis Kit (Invitrogen) on 5 μg of total RNA from each sample, and biotinylated cRNA was produced using the Enzo BioArray HighYield RNA transcript labeling kit (Enzo Life Sciences, Farmingdale, NY, U.S.A.) in the presence of biotinylated UTP and CTP. Samples were purified by RNeasy kit (Qiagen), quantified by Biotek® Synergy HT plate reader, and chemically fragmented using the Affymetrix® GeneChip sample cleanup module. Samples were then hybridized to the Soy Genome Affymetrix® GeneChip using an Affymetrix® Hybridization Oven 640, and arrays were washed on an Affymetrix® Fluidics Station 450 using Affymetrix® fluidics protocol EukGE-WS2v4_450. Details of this protocol can be found in the Affymetrix® Genechip Expression Analysis Technical Manual, Section 2, Chapter 3 http://www.affymetrix.com/support/downloads/manuals/expression_analysis_technical_manual.pdf.
Microarray data processing and analysis
The Soy Genome Affymetrix® GeneChip http://www.Affymetrix.com, containing greater than 37,500 probesets and representing 35,611 soybean transcripts, was used to assess gene expression. Microarray data were analyzed using Expressionist Pro software from Genedata Inc. Raw data in the form of .CEL files from the Affymetrix® GeneChip were uploaded to the platform, and the robust multi-array average (RMA) algorithm was used to condense and normalize all soybean probeset data with a median of ten thousand. Correlation coefficients for the three biological replicates assessed per sample genotype and time point (stage) ranged from 0.9809 to 0.9982 after normalization. The detection quality was set to a value of one to ensure that all probe sets were considered. MAS5.0 data condensation and normalization were also performed for comparison purposes. An FDR value was computed for each P value. Differentially accumulated gene transcript lists were produced at false discovery rates estimated at 5% or less. Microarray data sets were deposited under experiment GM11 in the Plant Expression database (PLEXdb).
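The FDR procedure is not named in the text above. As an illustration only, the following Python sketch assumes a Benjamini-Hochberg adjustment, which is one common way to attach an FDR estimate to each P value; the function name `bh_fdr` and the example P values are hypothetical, and the analysis software may have used a different estimator.

```python
def bh_fdr(p_values):
    """Benjamini-Hochberg adjusted P values (assumed method, see lead-in).

    Returns a list in the same order as the input. Each adjusted value is
    the smallest p * m / rank over all ranks at or above that P value's rank.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so the running minimum enforces
    # monotonicity of the adjusted values.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


if __name__ == "__main__":
    example = [0.01, 0.04, 0.03, 0.5]  # made-up P values
    fdr = bh_fdr(example)
    # Keep transcripts whose estimated FDR is 5% or less.
    print([i for i, q in enumerate(fdr) if q <= 0.05])
```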
Single Feature Polymorphisms (SFPs) were identified using a method based on the Li-Wong model. This method compares the relative probe intensities of each of the 11 probes on the Affymetrix® GeneChip between genotypes. Statistical analysis of the probe affinity difference was calculated using the feature intensity of the perfect match (PM) probes. Given the raw intensity (S) of each feature (probe) determined by the gene expression level (I), the affinity (A) between the target transcript and the probe, and random error (E) [58, 109–111], the equation can be modeled as $A_{tij} + E_{tij} = S_{tij} - I_{ti}$. Here, $S_{tij}$ is the raw PM intensity and $I_{ti}$ is derived from the RMA expression value of each gene for the designated genotype ($t$), probe set ($i$), and probe ($j$), where $E_{t_1 ij} \approx E_{t_2 ij}$, since E is an independent identically distributed error with a mean of zero. The Bioconductor Affymetrix® package was used to extract PM intensity and to calculate RMA expression, and the Bioconductor Siggenes package was used to evaluate all probe sets.
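The model above implies that, for one probe set, the per-probe difference in (S - I) between two genotypes approximates the difference in probe affinity, because the error terms are taken to be roughly equal. The Python sketch below illustrates only that arithmetic. It assumes PM intensities and the expression value are on the same scale (for example log2), uses made-up numbers, and does not reproduce the statistical testing done with Siggenes; `affinity_differences` is a hypothetical name.

```python
def affinity_differences(pm_t1, expr_t1, pm_t2, expr_t2):
    """Per-probe estimate of A(t1) - A(t2) for a single probe set.

    pm_t1, pm_t2: PM intensities S for each probe j in genotypes t1 and t2.
    expr_t1, expr_t2: probe-set expression values I for each genotype.
    Assumes all values share one scale and E(t1) is close to E(t2).
    """
    if len(pm_t1) != len(pm_t2):
        raise ValueError("both genotypes need the same number of probes")
    return [(s1 - expr_t1) - (s2 - expr_t2) for s1, s2 in zip(pm_t1, pm_t2)]


if __name__ == "__main__":
    # Made-up values for a three-probe example; real probe sets have 11.
    diffs = affinity_differences([10.0, 9.5, 8.0], 9.0, [10.5, 10.0, 10.9], 9.5)
    # The probe with the largest absolute difference is the SFP candidate.
    candidate = max(range(len(diffs)), key=lambda j: abs(diffs[j]))
    print(diffs, candidate)
```

A probe whose difference stands out from the others in its probe set is a candidate SFP; deciding what counts as standing out requires the statistical step that this sketch omits.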
Genes were annotated using the Affymetrix® GeneChip Soybean Genome Array Annotation http://www.soybase.org/AffyChip from SoyBase and The Soybean Breeder's Toolbox in conjunction with annotations from the HarvEST soy assembly website http://www.harvest-web.org. Unannotated genes were individually scanned by BLASTX and TBLASTX at an E-value cutoff of 10⁻⁴. The UniProt protein database, the Pfam protein database, the Arabidopsis thaliana genome database (TAIR, http://www.arabidopsis.org), and the Medicago truncatula genome database http://www.medicago.org were used for annotation purposes. TAIR gene ontology (GO) and GO slim annotations were provided for each Arabidopsis match. BLASTP results with an E-value of less than 10⁻¹⁰ were used to describe gene sequences referenced on the soybean genome (version Glyma1).
Statistical analysis of gene ontology and expression
The consensus sequences of the soybean genes on the Soy Genome Affymetrix® GeneChip were compared to the most recent release of predicted genes in the Arabidopsis genome (TAIR v. 8, http://www.arabidopsis.org) using TBLASTX (E < 10⁻⁴). The top Arabidopsis gene was used to query the Arabidopsis gene ontology (TAIR ATH_GO_GOSlim.20080308, http://www.arabidopsis.org). A database was created linking each Affymetrix® probe to the most similar Arabidopsis gene (E < 10⁻⁶) and its corresponding gene ontology information. Custom Perl scripts were used to mine the database for the GO slim annotations of the differentially expressed genes of interest.
To determine if particular GO slim categories were over-represented in our expression data, the number of genes matching each GO slim category was determined. This procedure was repeated to determine the number of genes matching each GO slim category for all the soybean consensus sequences represented on the chip. For each GO slim category, Fisher's exact test was used to compare the number of differentially expressed genes in the GO slim category, the number of genes not differentially expressed in the GO slim category, the number of differentially expressed genes outside the GO slim category, and the number of genes not differentially expressed and outside the GO slim category. To correct for multiple testing, a Bonferroni correction was used to adjust the two-tail probability P value: the P value obtained using Fisher's exact test was multiplied by the total number of GO categories represented on the Affymetrix® Soy GeneChip. Only P values below 0.05 after Bonferroni correction are reported. Further, only GO slim categories that were significantly over-represented in the expression data are reported.
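The over-representation test can be illustrated with a short standard-library Python sketch. It builds the two-tailed Fisher's exact P value from the hypergeometric distribution for the 2 × 2 table described above and then applies the Bonferroni multiplication. The function names and the example counts are hypothetical; the original analysis used custom Perl scripts that are not shown in the text.

```python
from math import comb


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher's exact P value for the table [[a, b], [c, d]].

    a: differentially expressed (DE) genes in the GO slim category
    b: non-DE genes in the category
    c: DE genes outside the category
    d: non-DE genes outside the category
    """
    row1, row2, col1 = a + b, c + d, a + c
    denominator = comb(row1 + row2, col1)

    def table_probability(x):
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    observed = table_probability(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more probable than the observed one.
    total = sum(
        p for p in (table_probability(x) for x in range(low, high + 1))
        if p <= observed * (1 + 1e-9)
    )
    return min(1.0, total)


def bonferroni(p_value, n_categories):
    """Multiply by the number of GO categories tested, capped at 1."""
    return min(1.0, p_value * n_categories)


if __name__ == "__main__":
    p = fisher_exact_two_tailed(12, 30, 40, 900)  # made-up counts
    print(p, bonferroni(p, 100))  # 100 categories is a made-up number
```

The two-tailed P value alone does not indicate direction, so a category would still need to be checked for an excess (rather than a deficit) of differentially expressed genes before being reported as over-represented.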
Quantitative RT-PCR was performed and analyzed using the Applied Biosystems Real-Time PCR system. Gene-specific primers spanning a maximum of 150 bp were designed using Primer Express® software (Applied Biosystems). Gene-specific actin primers were also used for control and calculation purposes. Template cDNA was synthesized from total RNA using a reverse transcription cDNA synthesis kit (Invitrogen). Reactions with no reverse transcriptase were performed as controls. Quantitative RT-PCR was performed in three replicates in a 96-well plate using SYBR® Green (BioRad) at 35 cycles. Results were calculated using the comparative CT method to evaluate gene expression in LoPro vs. HiPro or HiPro vs. LoPro with respect to the actin control at each stage.
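For reference, the comparative CT calculation is commonly written as a fold change of 2 raised to the power of minus ΔΔCT. The Python sketch below shows that arithmetic for one gene at one stage, with actin as the reference. It assumes roughly 100% amplification efficiency and averages replicate CT values before subtraction; both are assumptions, and the function name and CT values are made up.

```python
from statistics import mean


def fold_change_ddct(target_ct_a, actin_ct_a, target_ct_b, actin_ct_b):
    """Fold change of line A relative to line B by the comparative CT method.

    Each argument is a list of replicate CT values. Assumes amplification
    efficiency close to 100% so each cycle is a two-fold difference.
    """
    delta_a = mean(target_ct_a) - mean(actin_ct_a)  # normalize A to actin
    delta_b = mean(target_ct_b) - mean(actin_ct_b)  # normalize B to actin
    return 2 ** -(delta_a - delta_b)


if __name__ == "__main__":
    # Made-up CT values: LoPro as line A, HiPro as line B.
    lopro_vs_hipro = fold_change_ddct(
        [22.1, 21.9, 22.0], [18.0, 18.1, 17.9],
        [25.0, 25.2, 24.8], [18.0, 17.9, 18.1],
    )
    print(lopro_vs_hipro)
```

Swapping the A and B arguments gives the reciprocal, which corresponds to reporting HiPro vs. LoPro instead of LoPro vs. HiPro.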
Total RNA from stages one through four of LoPro and HiPro was used for Illumina® sequencing. Poly A+ RNA was isolated from total RNA through two rounds of oligo-dT selection (Invitrogen Inc., Santa Clara, CA). The mRNA was annealed to high concentrations of random hexamers and reverse transcribed. Following second strand synthesis, end repair, and A-tailing, adapters complementary to sequencing primers were ligated to cDNA fragments. Resultant cDNA libraries were size fractionated on agarose gels, and 250 bp fragments were excised and amplified by 15 cycles of polymerase chain reaction. Ensuing libraries were quality assessed using the Agilent 2100 bioanalyzer platform and sequenced for 36 or 46 cycles on an Illumina® Genome Analyzer DNA sequencing instrument using standard Illumina® procedures.
Sequencing data processing and analysis
To process the data for analysis, files were mirrored to an off-instrument computer, and the Illumina® platform software was used for image analysis, base-calling, and per-base confidence scoring. Individual transcript tags were identified, counted, and scored for uniqueness. Sequence reads were then aligned against the 8X soybean genome sequence assembly (version Glyma1) using MAQ. Read mappings were retained if they met the following criteria: they had a mapping quality of 99, or had no mismatches, or the sum of the quality scores of the mismatched bases was less than or equal to six (using Phred quality scores). If a read mapped equally well to multiple locations (therefore producing a mapping score of zero), MAQ randomly returned one of the locations. Counts were made with respect to predicted genes in the Glyma1.01 annotation by incrementing the count for a gene when any part of a read overlapped the longest splice variant of the gene model. Counts per gene and tissue are displayed in an "expression" GBrowse track at http://soybase.org/gbrowse, and all reads, without regard to gene boundaries, are displayed in another expression GBrowse track. The significance of gene expression between treatment pairs (e.g., A1 LoPro vs. A1 HiPro) was tested for each gene by comparing the normalized values for that gene against a two-tailed binomial distribution using a P-value of 0.001. Normalizations were calculated by multiplying the count values in each treatment by the experiment-wide average over the treatment sum. The test for significance for a given gene, with counts C1 and C2 (and C1 < C2), is whether the probability of observing C1 or fewer counts out of C1 + C2 trials (counts observed for genes from both treatments) is less than or equal to 0.0005 (for a two-tailed threshold of 0.001).
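The normalization and binomial test described above can be sketched in standard-library Python. Two points are interpretations rather than statements from the text: the "experiment-wide average" is read as the mean of the per-treatment count totals, and normalized counts are rounded to integers before the binomial tail is computed. Function names, treatment labels, and counts are hypothetical.

```python
from math import comb


def normalize_counts(counts_by_treatment):
    """Scale each treatment so its total equals the mean treatment total.

    counts_by_treatment maps treatment name -> {gene: raw read count}.
    """
    totals = {t: sum(genes.values()) for t, genes in counts_by_treatment.items()}
    mean_total = sum(totals.values()) / len(totals)
    return {
        t: {gene: count * mean_total / totals[t] for gene, count in genes.items()}
        for t, genes in counts_by_treatment.items()
    }


def lower_tail_probability(count_a, count_b):
    """P(X <= C1) for X ~ Binomial(C1 + C2, 0.5), with C1 the smaller count.

    Rounding normalized counts to integers is an assumption of this sketch.
    """
    c1, c2 = sorted((round(count_a), round(count_b)))
    trials = c1 + c2
    return sum(comb(trials, k) for k in range(c1 + 1)) / 2 ** trials


def is_differential(count_a, count_b, one_tail_threshold=0.0005):
    """Apply the one-tail cutoff that corresponds to a two-tailed 0.001."""
    return lower_tail_probability(count_a, count_b) <= one_tail_threshold


if __name__ == "__main__":
    raw = {
        "A1_LoPro": {"geneX": 40, "geneY": 160},  # made-up counts
        "A1_HiPro": {"geneX": 2, "geneY": 98},
    }
    norm = normalize_counts(raw)
    for gene in ("geneX", "geneY"):
        print(gene, is_differential(norm["A1_LoPro"][gene], norm["A1_HiPro"][gene]))
```

Under these assumptions, a gene with zero reads in one line needs at least 11 normalized reads in the other to pass the cutoff, since 0.5 to the 11th power is the first value below 0.0005.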
Soybean physical mapping
Sequence information was downloaded from the latest soybean genome sequence assembly (version Glyma1) to obtain 50,527 unique soybean gene identifiers with chromosome locations. Soy Genome Affymetrix® GeneChip probeset consensus sequences were retrieved http://www.Affymetrix.com for a total of 61,035 cDNA sequences. The NCBI blast program was used to align Affymetrix® Soy GeneChip probeset consensus sequences against the soybean cDNA database (Glyma1.cDNA.fa, http://www.phytozome.net/soybean.php) containing 75,778 sequences. With the blastn search tool, the match matrix BLOSUM62 was used with the following parameters: mismatch penalty -3, E-value 10⁻⁵, and bit score 100. This analysis aligned 36,406 Affymetrix® soy identifiers to soy genome identifiers with chromosome locations. The genome sequences and probesets with chromosome information were imported into the genome browser of GeneSpring version 7.3.1 http://www.Agilent.com for mapping of genes and probeset locations onto chromosomes.
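As an illustration of how the alignment thresholds could be applied, the Python sketch below filters BLAST hits and keeps the best-scoring gene model per probeset. It assumes the standard 12-column tabular BLAST output (E-value in column 11, bit score in column 12), interprets the thresholds as E ≤ 10⁻⁵ and bit score ≥ 100, and breaks ties by highest bit score. The text does not state the output format or the tie-breaking rule, and the identifiers in the example are made up.

```python
def best_hits(tabular_lines, max_evalue=1e-5, min_bitscore=100.0):
    """Map each query (probeset) to its best subject (gene model).

    tabular_lines: iterable of tab-separated BLAST hit lines in the
    12-column tabular layout (an assumption of this sketch).
    """
    best = {}
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue or bitscore < min_bitscore:
            continue
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


if __name__ == "__main__":
    # Made-up hits: probeset_1 has two passing hits, probeset_2 has none.
    hits = [
        "probeset_1\tgene_A\t99.0\t300\t3\t0\t1\t300\t1\t300\t1e-150\t540",
        "probeset_1\tgene_B\t92.0\t300\t24\t0\t1\t300\t1\t300\t1e-90\t350",
        "probeset_2\tgene_C\t85.0\t60\t9\t0\t1\t60\t1\t60\t0.002\t45",
    ]
    print(best_hits(hits))
```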
This work was supported by the U.S. Department of Agriculture, Agricultural Research Service, Current Research Information System (CRIS No. 3640-21000-024-00D). We are grateful for funding from the Minnesota Soybean Research and Promotion Council and for funding from the United Soybean Board.
We would like to acknowledge the use of resources at the MSI Supercomputing Institute at the University of Minnesota. Mention of trade names or commercial products in this report is solely for the purpose of providing specific information and does not imply recommendation or endorsement by the U.S. Department of Agriculture.
- Hobbs DH, Flintham JE, Hills MJ: Genetic control of storage oil synthesis in seeds of Arabidopsis. Plant Physiol. 2004, 136 (2): 3341-3349. 10.1104/pp.104.049486.
- Weber H, Borisjuk L, Wobus U: Molecular physiology of legume seed development. Annu Rev Plant Biol. 2005, 56: 253-279. 10.1146/annurev.arplant.56.032604.144201.
- Hajduch M, Ganapathy A, Stein JW, Thelen JJ: A systematic proteomic study of seed filling in soybean. Establishment of high-resolution two-dimensional reference maps, expression profiles, and an interactive proteome database. Plant Physiol. 2005, 137 (4): 1397-1419. 10.1104/pp.104.056614.
- Gallardo K, Firnhaber C, Zuber H, Hericher D, Belghazi M, Henry C, Kuster H, Thompson R: A combined proteome and transcriptome analysis of developing Medicago truncatula seeds: evidence for metabolic specialization of maternal and filial tissues. Mol Cell Proteomics. 2007, 6 (12): 2165-2179. 10.1074/mcp.M700171-MCP200.
- Jukanti AK, Heidlebaugh NM, Parrott DL, Fisher IA, McInnerney K, Fischer AM: Comparative transcriptome profiling of near-isogenic barley (Hordeum vulgare) lines differing in the allelic state of a major grain protein content locus identifies genes with possible roles in leaf senescence and nitrogen reallocation. New Phytol. 2008, 177: 333-349.
- Dam S, Laursen BS, Ornfelt JH, Jochimsen B, Staerfeldt HH, Friis C, Nielsen K, Goffard N, Besenbacher S, Krusell L, Sato S, Tabata S, Thogersen IB, Enghild JJ, Stougaard J: The proteome of seed development in the model legume Lotus japonicus. Plant Physiol. 2009, 149 (3): 1325-1340. 10.1104/pp.108.133405.
- Liu K: Soybeans: Chemistry, Technology, and Utilization. New York: Chapman & Hall; 1997.
- Hill JE, Breidenbach RW: Proteins of soybean seeds. I. Isolation and characterization of the major components. Plant Physiol. 1974, 53: 742-746. 10.1104/pp.53.5.742.
- Rubel A, Rinne RW, Canvin DT: Protein, oil and fatty acid in developing soybean seeds. Crop Sci. 1972, 12: 739-741.
- Herman EM, Larkins BA: Protein storage bodies and vacuoles. Plant Cell. 1999, 11 (4): 601-614. 10.1105/tpc.11.4.601.
- Thanh VH, Shibasaki K: Major proteins of soybean seeds. A straightforward fractionation and their characterization. J Agric Food Chem. 1976, 24: 1118-
- Roberts RC, Briggs DR: Isolation and characterization of the 7S component of soybean globulins. Cereal Chem. 1965, 42: 71-
- Gutierrez L, Van Wuytswinkel O, Castelain M, Bellini C: Combined networks regulating seed maturation. Trends Plant Sci. 2007, 12 (7): 294-300. 10.1016/j.tplants.2007.06.003.
- Domoney C, Duc G, Ellis TH, Ferrandiz C, Firnhaber C, Gallardo K, Hofer J, Kopka J, Kuster H, Madueno F, Munier-Jolain NG, Mayer K, Thompson R, Udvardi M, Salon C: Genetic and genomic analysis of legume flowers and seeds. Curr Opin Plant Biol. 2006, 9 (2): 133-141. 10.1016/j.pbi.2006.01.014.
- Uauy C, Distelfeld A, Fahima T, Blechl A, Dubcovsky J: A NAC Gene regulating senescence improves grain protein, zinc, and iron content in wheat. Science. 2006, 314 (5803): 1298-1301. 10.1126/science.1133649.
- Ohto MA, Fischer RL, Goldberg RB, Nakamura K, Harada JJ: Control of seed mass by APETALA2. Proc Natl Acad Sci USA. 2005, 102 (8): 3123-3128. 10.1073/pnas.0409858102.
- Zhang W, Bi J, Chen L, Zheng L, Ji S, Xia Y, Xie K, Zhao Z, Wang Y, Liu L, Jiang L, Wan J: QTL mapping for crude protein and protein fraction contents in rice (Oryza sativa L.). J Cereal Sci. 2008, 48: 539-547. 10.1016/j.jcs.2007.11.010.
- Timmerman-Vaughan G, Mills A, Whitfield C, Frew T, Butler R, Murray S, Lakeman M, McCallum J, Russell A, Wilson D: Linkage mapping of QTL for seed yield, yield components, and developmental traits in pea. Crop Sci. 2005, 45: 1336-1344.
- Mansur LM, Lark KG, Kross H, Oliveira A: Interval mapping of quantitative trait loci for reproductive, morphological, and seed traits of soybean (Glycine max L.). Theor Appl Genet. 1993, 86 (8): 1432-2242.
- Lee SH, Bailey MA, Mian MAR, Carter TE, Shipe ER, Ashley DA, Parrot WA, Hussey RS, Boerma HR: RFLP loci associated with soybean seed protein and oil content across populations and locations. Theor Appl Genet. 1996, 93: 649-657. 10.1007/BF00224058.
- Fasoula VA, Harris DK, Boerma HR: Validation and designation of quantitative trait loci for seed protein, seed oil, and seed weight from two soybean populations. Crop Sci. 2004, 44: 1218-1225.
- Csanádi G, Vollman J, Stift G, Lelly T: Seed quality QTLs identified in a molecular map of early maturing soybean. Theor Appl Genet. 2001, 103: 912-919. 10.1007/s001220100621.
- Panthee DR, Pantalone VR, West DR, Saxton AM, Sams CE: Quantitative trait loci for seed protein and oil concentration and seed size in soybean. Crop Sci. 2005, 45: 2015-2022. 10.2135/cropsci2004.0720.
- Diers BW, Keim P, Fehr WR, Shoemaker RC: RFLP analysis of soybean seed protein and oil content. Theor Appl Genet. 1992, 83: 608-612. 10.1007/BF00226905.
- Seboldt AM, Shoemaker RC, Diers BW: Analysis of a quantitative trait locus allele from wild soybean that increases seed protein concentration in soybean. Crop Sci. 2000, 40: 1438-1444.
- Brummer EC, Graef GL, Orf JH, Wilcox JR, Shoemaker RC: Mapping QTL for seed protein and oil content in eight soybean populations. Crop Sci. 1997, 37: 370-378.
- Chung J, Babka HL, Graef GL, Staswick PE, Lee DJ, Cregan PB, Shoemaker RC, Specht JE: The seed protein, oil, and yield QTL on soybean linkage group I. Crop Sci. 2003, 43: 1053-1067.
- Brim CA, Burton JW: Recurrent selection in soybeans. II. Selection for increased percent protein in seeds. Crop Sci. 1979, 19: 494-498.
- Burton JW: Quantitative genetics: results relevant to soybean breeding. Soybeans: Improvement, Production, and Uses. Edited by: Wilcox JR. 1987, Madison, WI: ASA, CSSA, and SSSA, 16: 211-247. 2nd edition.
- Wilcox JR, Cavins JF: Backcrossing high seed protein to a soybean cultivar. Crop Sci. 1995, 35: 1036-1041.
- Cober ER, Voldeng HD: Developing high-protein, high-yield soybean populations and lines. Crop Sci. 2000, 40: 39-42.
- Nichols DM, Glover KD, Carlson SR, Specht JE, Diers BW: Fine mapping of a seed protein QTL on soybean linkage group I and its correlated effects on agronomic traits. Crop Sci. 2006, 46: 834-839. 10.2135/cropsci2005.05-0168.
- Ruuska SA, Girke T, Benning C, Ohlrogge JB: Contrapuntal networks of gene expression during Arabidopsis seed filling. Plant Cell. 2002, 14 (6): 1191-1206. 10.1105/tpc.000877.
- Schmid M, Davison TS, Henz SR, Pape UJ, Demar M, Vingron M, Scholkopf B, Weigel D, Lohmann JU: A gene expression map of Arabidopsis thaliana development. Nat Genet. 2005, 37 (5): 501-506. 10.1038/ng1543.PubMedGoogle Scholar
- Goldberg RB, Barker SJ, Perez-Grau L: Regulation of gene expression during plant embryogenesis. Cell. 1989, 56: 149-160. 10.1016/0092-8674(89)90888-X.PubMedGoogle Scholar
- Le BH, Wagmaister JA, Kawashima T, Bui AQ, Harada JJ, Goldberg RB: Using genomics to study legume seed development. Plant Physiol. 2007, 144 (2): 562-574. 10.1104/pp.107.100362.PubMedPubMed CentralGoogle Scholar
- Vodkin L, Jones S, Gonzalez DO, Thibaud-Nissen F, Zabala G, Tuteja J: Genomics of soybean seed development. Genetics and Genomics of Soybean. Edited by: Stacey G. 2008, New York: Springer, 2: 163-184. full_text.Google Scholar
- Benedito VA, Torres-Jerez I, Murray JD, Andriankaja A, Allen S, Kakar K, Wandrey M, Verdier J, Zuber H, Ott T, Moreau S, Niebel A, Frickey T, Weiller G, He J, Dai X, Zhao PX, Tang Y, Udvardi MK: A gene expression atlas of the model legume Medicago truncatula. Plant J. 2008, 55 (3): 504-513. 10.1111/j.1365-313X.2008.03519.x.PubMedGoogle Scholar
- Verdier J, Thompson RD: Transcriptional regulation of storage protein synthesis during dicotyledon seed filling. Plant Cell Physiol. 2008, 49 (9): 1263-1271. 10.1093/pcp/pcn116.PubMedGoogle Scholar
- Wan Y, Poole RL, Huttly AK, Toscano-Underwood C, Feeney K, Welham S, Gooding MJ, Mills C, Edwards KJ, Shewry PR, Mitchell RA: Transcriptome analysis of grain development in hexaploid wheat. BMC Genomics. 2008, 9: 121. 10.1186/1471-2164-9-121.
- Sreenivasulu N, Usadel B, Winter A, Radchuk V, Scholz U, Stein N, Weschke W, Strickert M, Close TJ, Stitt M, Graner A, Wobus U: Barley grain maturation and germination: metabolic pathway and regulatory network commonalities and differences highlighted by new MapMan/PageMan profiling tools. Plant Physiol. 2008, 146 (4): 1738-1758. 10.1104/pp.107.111781.
- Wei G, Tao Y, Liu G, Chen C, Luo R, Xia H, Gan Q, Zeng H, Lu Z, Han Y, Li X, Song G, Zhai H, Peng Y, Li D, Xu H, Wei X, Cao M, Deng H, Xin Y, Fu X, Yuan L, Yu J, Zhu Z, Zhu L: A transcriptomic analysis of superhybrid rice LYP9 and its parents. Proc Natl Acad Sci USA. 2009, 106 (19): 7695-7701. 10.1073/pnas.0902340106.
- Wang CS, Todd JJ, Vodkin LO: Chalcone synthase mRNA and activity are reduced in yellow soybean seed coats with dominant I alleles. Plant Physiol. 1994, 105 (2): 739-748. 10.1104/pp.105.2.739.
- He G, Luo X, Tian F, Li K, Zhu Z, Su W, Qian X, Fu Y, Wang X, Sun C, Yang J: Haplotype variation in structure and expression of a gene cluster associated with a quantitative trait locus for improved yield in rice. Genome Res. 2006, 16 (5): 618-626. 10.1101/gr.4814006.
- Li R-J, Wang H-Z, Mao H, Lu Y-T, Hua W: Identification of differentially expressed genes in seeds of two near-isogenic Brassica napus lines with different oil content. Planta. 2006, 224: 952-962. 10.1007/s00425-006-0266-4.
- Schmutz J, Cannon SB, Schlueter J, Ma J, Mitros T, Nelson W, Hyten DL, Song Q, Thelen JJ, Cheng J, Xu D, Hellsten U, May GD, Yu Y, Sakurai T, Umezawa T, Bhattacharyya MK, Sandhu D, Valliyodan B, Lindquist E, Peto M, Grant D, Shu S, Goodstein D, Barry K, Futrell-Griggs M, Abernathy B, Du J, Tian Z, Zhu L, Gill N, Joshi T, Libault M, Sethuraman A, Zhang XC, Shinozaki K, Nguyen HT, Wing RA, Cregan P, Specht J, Grimwood J, Rokhsar D, Stacey G, Shoemaker RC, Jackson SA: Genome sequence of the palaeopolyploid soybean. Nature. 2010, 463 (7278): 178-183. 10.1038/nature08670.
- Cregan PB, Jarvik T, Bush AL, Shoemaker RC, Lark KG, Kahler AL, Kaya N, VanToai TT, Lohnes DG, Chung J, Specht JE: An integrated genetic linkage map of the soybean genome. Crop Sci. 1999, 39 (5): 1464-1490.
- Storey JD, Tibshirani R: Statistical significance for genomewide studies. Proc Natl Acad Sci USA. 2003, 100 (16): 9440-9445. 10.1073/pnas.1530509100.
- Weber H, Golombek S, Heim U, Borisjuk L, Panitz R, Manteuffel R, Wobus U: Integration of carbohydrate and nitrogen metabolism during legume seed development: implications for storage product synthesis. J Plant Physiol. 1998, 152: 641-648.
- Asada T, Collings D: Molecular motors in higher plants. Trends Plant Sci. 1997, 2: 29-37. 10.1016/S1360-1385(96)10051-0.
- Hashimoto T: Dynamics and regulation of plant interphase microtubules: a comparative view. Curr Opin Plant Biol. 2003, 6: 568-576. 10.1016/j.pbi.2003.09.011.
- Borisjuk L, Walenta S, Weber H, Mueller-Klieser W, Wobus U: High-resolution histographical mapping of glucose concentrations in developing cotyledons of Vicia faba in relation to mitotic activity and storage processes: glucose as a possible developmental trigger. Plant J. 1998, 15 (4): 583-591. 10.1046/j.1365-313X.1998.00214.x.
- Koch KE: Sucrose metabolism: regulatory mechanisms and pivotal roles in sugar sensing and plant development. Curr Opin Plant Biol. 2004, 7: 235-246. 10.1016/j.pbi.2004.03.014.
- Koch KE: Carbohydrate-modulated gene expression in plants. Annu Rev Plant Physiol Plant Mol Biol. 1996, 47: 509-540. 10.1146/annurev.arplant.47.1.509.
- Smeekens S: Sugar-induced signal transduction in plants. Annu Rev Plant Physiol Plant Mol Biol. 2000, 51: 49-81. 10.1146/annurev.arplant.51.1.49.
- Wang HW, Zhang B, Hao YJ, Huang J, Tian AG, Liao Y, Zhang JS, Chen SY: The soybean Dof-type transcription factor genes, GmDof4 and GmDof11, enhance lipid content in the seeds of transgenic Arabidopsis plants. Plant J. 2007, 52: 716-729. 10.1111/j.1365-313X.2007.03268.x.
- Xu WW, Cho S, Yang SS, Bolon YT, Bilgic H, Jia H, Xiong Y, Muehlbauer GJ: Single-feature polymorphism discovery by computing probe affinity shape powers. BMC Genet. 2009, 10: 48. 10.1186/1471-2156-10-48.
- Cui X, Xu J, Asghar R, Condamine P, Svensson JT, Wanamaker S, Stein N, Roose M, Close TJ: Detecting single-feature polymorphisms using oligonucleotide arrays and robustified projection pursuit. Bioinformatics. 2005, 21: 3852-3858. 10.1093/bioinformatics/bti640.
- Rostoks N, Borevitz JO, Hedley PE, Russell J, Mudie S, Morris J, Cardle L, Marshall DF, Waugh R: Single-feature polymorphism discovery in the barley transcriptome. Genome Biology. 2005, 6 (6): R54. 10.1186/gb-2005-6-6-r54.
- Gill KS, Gill BS, Endo TR: A chromosome region-specific mapping strategy reveals gene-rich telomeric ends in wheat. Chromosoma. 1993, 102: 374-381. 10.1007/BF00360401.
- Wilcox JR: Increasing seed protein in soybean with eight cycles of recurrent selection. Crop Sci. 1998, 38: 1536-1540.
- Helms TC, Orf JH: Protein, oil and yield of soybean lines selected for increased protein. Crop Sci. 1998, 38: 707-711.
- Focks N, Benning C: wrinkled1: A novel, low-seed-oil mutant of Arabidopsis with a deficiency in the seed-specific regulation of carbohydrate metabolism. Plant Physiol. 1998, 118 (1): 91-101. 10.1104/pp.118.1.91.
- Cernac A, Benning C: WRINKLED1 encodes an AP2/EREB domain protein involved in the control of storage compound biosynthesis in Arabidopsis. Plant J. 2004, 40 (4): 575-585. 10.1111/j.1365-313X.2004.02235.x.
- Verdier J, Kakar K, Gallardo K, Le Signor C, Aubert G, Schlereth A, Town CD, Udvardi MK, Thompson RD: Gene expression profiling of M. truncatula transcription factors identifies putative regulators of grain legume seed filling. Plant Mol Biol. 2008, 67 (6): 567-580. 10.1007/s11103-008-9320-x.
- Walling L, Drews GN, Goldberg RB: Transcriptional and post-transcriptional regulation of soybean seed protein mRNA levels. Proc Natl Acad Sci USA. 1986, 83 (7): 2123-2127. 10.1073/pnas.83.7.2123.
- Harada JJ, Barker SJ, Goldberg RB: Soybean beta-conglycinin genes are clustered in several DNA regions and are regulated by transcriptional and posttranscriptional processes. Plant Cell. 1989, 1 (4): 415-425. 10.1105/tpc.1.4.415.
- Meinke DW, Chen J, Beachy RN: Expression of storage-protein genes during soybean seed development. Planta. 1981, 153: 130-139. 10.1007/BF00384094.
- Ohlrogge JB, Jaworski JG: Regulation of Fatty Acid Synthesis. Annu Rev Plant Physiol Plant Mol Biol. 1997, 48: 109-136. 10.1146/annurev.arplant.48.1.109.
- Roesler K, Shintani D, Savage L, Boddupalli S, Ohlrogge J: Targeting of the Arabidopsis homomeric acetyl-coenzyme A carboxylase to plastids of rapeseeds. Plant Physiol. 1997, 113 (1): 75-81. 10.1104/pp.113.1.75.
- Turnham E, Northcote DH: Changes in the activity of acetyl-CoA carboxylase during rape-seed formation. Biochem J. 1983, 212 (1): 223-229.
- Macnicol PK, Jacobsen JV: Endosperm acidification and related metabolic changes in the developing barley grain. Plant Physiol. 1992, 98 (3): 1098-1104. 10.1104/pp.98.3.1098.
- Gonzalez MC, Osuna L, Echevarria C, Vidal J, Cejudo FJ: Expression and localization of phosphoenolpyruvate carboxylase in developing and germinating wheat grains. Plant Physiol. 1998, 116 (4): 1249-1258. 10.1104/pp.116.4.1249.
- Golombek S, Heim U, Horstmann C, Wobus U, Weber H: Phosphoenolpyruvate carboxylase in developing seeds of Vicia faba L.: gene expression and metabolic regulation. Planta. 1999, 208 (1): 66-72. 10.1007/s004250050535.
- Smith AJ, Rinne RW, Seif RD: Phosphoenolpyruvate carboxylase and pyruvate kinase involvement in protein and oil biosynthesis during soybean seed development. Crop Sci. 1989, 29: 349-353.
- Agrawal GK, Hajduch M, Graham K, Thelen JJ: In-depth investigation of the soybean seed-filling proteome and comparison with a parallel study of rapeseed. Plant Physiol. 2008, 148 (1): 504-518. 10.1104/pp.108.119222.
- Klaus D, Ohlrogge JB, Neuhaus HE, Dormann P: Increased fatty acid production in potato by engineering of acetyl-CoA carboxylase. Planta. 2004, 219 (3): 389-396. 10.1007/s00425-004-1236-3.
- Kianian SF, Egli MA, Phillips RL, Rines HW, Somers DA, Gengenbach BG, Webster FH, Livingston SM, Groh S, O'Donoughue LS, Sorrells ME, Wesenberg DM, Stuthman DD, Fulcher RG: Association of a major groat oil content QTL and an acetyl-CoA carboxylase gene in oat. Theor Appl Genet. 1999, 98: 884-894. 10.1007/s001220051147.
- Thelen JJ, Ohlrogge JB: Both antisense and sense expression of biotin carboxyl carrier protein isoform 2 inactivates the plastid acetyl-coenzyme A carboxylase in Arabidopsis thaliana. Plant J. 2002, 32 (4): 419-431. 10.1046/j.1365-313X.2002.01435.x.
- Rolletschek H, Borisjuk L, Radchuk R, Miranda M, Heim U, Wobus U, Weber H: Seed-specific expression of a bacterial phosphoenolpyruvate carboxylase in Vicia narbonensis increases protein content and improves carbon economy. Plant Biotechnol J. 2004, 2 (3): 211-219. 10.1111/j.1467-7652.2004.00064.x.
- Huber SC, Hardin SC: Numerous posttranslational modifications provide opportunities for the intricate regulation of metabolic enzymes at multiple levels. Curr Opin Plant Biol. 2004, 7: 318-322. 10.1016/j.pbi.2004.03.002.
- Allen DK, Ohlrogge JB, Shachar-Hill Y: The role of light in soybean seed filling metabolism. Plant J. 2009, 58 (2): 220-234. 10.1111/j.1365-313X.2008.03771.x.
- Borisjuk L, Nguyen TH, Neuberger T, Rutten T, Tschiersch H, Claus B, Feussner I, Webb AG, Jakob P, Weber H, Wobus U, Rolletschek H: Gradients of lipid storage, photosynthesis and plastid differentiation in developing soybean seeds. New Phytol. 2005, 167 (3): 761-776. 10.1111/j.1469-8137.2005.01474.x.
- Rolletschek H, Weber H, Borisjuk L: Energy status and its control on embryogenesis of legumes. Embryo photosynthesis contributes to oxygen supply and is coupled to biosynthetic fluxes. Plant Physiol. 2003, 132 (3): 1196-1206. 10.1104/pp.102.017376.
- Wei W-H, Chen B, Yan X-H, Wang L-J, Zhang H-F, Cheng J-P, Zhou X-A, Sha A-H, Shen H: Identification of differentially expressed genes in soybean seeds differing in oil content. Plant Sci. 2008, 175: 663-673. 10.1016/j.plantsci.2008.06.018.
- Apweiler R, Bairoch A, Wu CH, Barker WC, Boeckmann B, Ferro S, Gasteiger E, Huang H, Lopez R, Magrane M, Martin MJ, Natale DA, O'Donovan C, Redaschi N, Yeh LS: UniProt: the Universal Protein knowledgebase. Nucleic Acids Res. 2004, 32 (Database issue): D115-119. 10.1093/nar/gkh131.
- Ben Amor B, Wirth S, Merchan F, Laporte P, d'Aubenton-Carafa Y, Hirsch J, Maizel A, Mallory A, Lucas A, Deragon JM, Vaucheret H, Thermes C, Crespi M: Novel long non-protein coding RNAs involved in Arabidopsis differentiation and stress responses. Genome Res. 2009, 19: 57-69. 10.1101/gr.080275.108.
- Borsani O, Zhu J, Verslues PE, Sunkar R, Zhu JK: Endogenous siRNAs derived from a pair of natural cis-antisense transcripts regulate salt tolerance in Arabidopsis. Cell. 2005, 123 (7): 1279-1291. 10.1016/j.cell.2005.11.035.
- zur Nieden U, Neumann D, Bucka A, Nover L: Tissue-specific localization of heat-stress proteins during embryo development. Planta. 1995, 196: 530-538. 10.1007/BF00203653.
- DeRocher AE, Vierling E: Developmental control of small heat shock protein expression during pea seed maturation. Plant J. 1994, 5 (1): 93-102. 10.1046/j.1365-313X.1994.5010093.x.
- Rolletschek H, Radchuk R, Klukas C, Schreiber F, Wobus U, Borisjuk L: Evidence of a key role for photosynthetic oxygen release in oil storage in developing soybean seeds. New Phytol. 2005, 167 (3): 777-786. 10.1111/j.1469-8137.2005.01473.x.
- Borisjuk L, Rolletschek H, Walenta S, Panitz R, Wobus U, Weber H: Energy status and its control on embryogenesis of legumes: ATP distribution within Vicia faba embryos is developmentally regulated and correlated with photosynthetic capacity. Plant J. 2003, 36 (3): 318-329. 10.1046/j.1365-313X.2003.01879.x.
- Kawagoe Y, Murai N: A novel basic region/helix-loop-helix protein binds to a G-box motif CACGTG of the bean seed storage protein β-phaseolin gene. Plant Sci. 1996, 116: 47-57. 10.1016/0168-9452(96)04366-X.
- Pla M, Vilardell J, Guiltinan MJ, Marcotte WR, Niogret MF, Quatrano RS, Pagès M: The cis-regulatory element CCACGTGG is involved in ABA and water-stress responses of the maize gene rab28. Plant Molecular Biology. 1993, 21: 259-266. 10.1007/BF00019942.
- Ezcurra I, Wycliffe P, Nehlin L, Ellerström M, Rask L: Transactivation of the Brassica napus napin promoter by ABI3 requires interaction of the conserved B2 and B3 domains of ABI3 with different cis-elements: B2 mediates activation through an ABRE, whereas B3 interacts with an RY/G-box. Plant J. 2000, 24: 57-66. 10.1046/j.1365-313x.2000.00857.x.
- de Pater S, Pham K, Chua NH, Memelink J, Kijne J: A 22-bp fragment of the pea lectin promoter containing essential TGAC-like motifs confers seed-specific gene expression. Plant Cell. 1993, 5: 877-886. 10.1105/tpc.5.8.877.
- Emrich SJ, Barbazuk WB, Li L, Schnable PS: Gene discovery and annotation using LCM-454 transcriptome sequencing. Genome Res. 2007, 17 (1): 69-73. 10.1101/gr.5145806.
- Sultan M, Schulz MH, Richard H, Magen A, Klingenhoff A, Scherf M, Seifert M, Borodina T, Soldatov A, Parkhomchuk D, Schmidt D, O'Keeffe S, Haas S, Vingron M, Lehrach H, Yaspo ML: A global view of gene activity and alternative splicing by deep sequencing of the human transcriptome. Science. 2008, 321 (5891): 956-960. 10.1126/science.1160342.
- Wilhelm BT, Marguerat S, Watt S, Schubert F, Wood V, Goodhead I, Penkett CJ, Rogers J, Bahler J: Dynamic repertoire of a eukaryotic transcriptome surveyed at single-nucleotide resolution. Nature. 2008, 453 (7199): 1239-1243. 10.1038/nature07002.
- Wang ET, Sandberg R, Luo S, Khrebtukova I, Zhang L, Mayr C, Kingsmore SF, Schroth GP, Burge CB: Alternative isoform regulation in human tissue transcriptomes. Nature. 2008, 456 (7221): 470-476. 10.1038/nature07509.
- Soderlund C, Longden I, Mott R: FPC: A system for building contigs from restriction fingerprinted clones. Comput Appl Biosci. 1997, 13: 523-535.
- Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ: Basic local alignment search tool. J Mol Biol. 1990, 215: 403-410.
- AOAC: Official Methods of Analysis. Gaithersburg, MD: AOAC International 2006.
- Boddu J, Cho S, Muehlbauer G: Transcriptome analysis of trichothecene-induced gene expression in barley. Mol Plant Microbe Interact. 2007, 20 (11): 1364-1375. 10.1094/MPMI-20-11-1364.
- van de Mortel M, Recknor JC, Graham MA, Nettleton D, Dittman JD, Nelson RT, Godoy CV, Abdelnoor RV, Almeida AM, Baum TJ, Whitham SA: Distinct biphasic mRNA changes in response to Asian soybean rust infection. Mol Plant Microbe Interact. 2007, 20 (8): 887-899. 10.1094/MPMI-20-8-0887.
- Irizarry RA, Hobbs B, Collin F, Beazer-Barclay YD, Antonellis KJ, Scherf U, Speed TP: Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics. 2003, 4 (2): 249-264. 10.1093/biostatistics/4.2.249.
- Hubbell E, Liu WM, Mei R: Robust estimators for expression analysis. Bioinformatics. 2002, 18 (12): 1585-1592. 10.1093/bioinformatics/18.12.1585.
- Wise RP, Caldo RA, Hong L, Shen L, Cannon EK, Dickerson JA: BarleyBase/Plexdb - a unified expression profiling database for plants and plant pathogens. Methods in Molecular Biology. 2007, Totowa, NJ, U.S.A.: Humana Press, 406: 347-363.
- Wu Z, Irizarry RA, Gentleman R, Murillo FM, Spencer F: A model based background adjustment for oligonucleotide expression data. J Am Stat Assoc. 2004, 99: 909-917. 10.1198/016214504000000683.
- Li C, Wong WH: Model-based analysis of oligonucleotide arrays: expression index computation and outlier detection. Proc Natl Acad Sci USA. 2001, 98: 31-36. 10.1073/pnas.011404098.
- Hubbell E, Liu W-M, Mei R: Robust estimators for expression analysis. Bioinformatics. 2002, 18 (12): 1585-1592. 10.1093/bioinformatics/18.12.1585.
- Bateman A, Coin L, Durbin R, Finn RD, Hollich V, Griffiths-Jones S, Khanna A, Marshall M, Moxon S, Sonnhammer EL, Studholme DJ, Yeats C, Eddy SR: The Pfam protein families database. Nucleic Acids Res. 2004, 32: D138-141. 10.1093/nar/gkh121.
- Berardini TZ, Mundodi S, Reiser L, Huala E, Garcia-Hernandez M, Zhang P, Mueller LM, Yoon J, Doyle A, Lander G, Moseyko N, Yoo D, Xu I, Zoeckler B, Montoya M, Miller N, Weems D, Rhee SY: Functional annotation of the Arabidopsis genome using controlled vocabularies. Plant Physiol. 2004, 135 (2): 1-11. 10.1104/pp.104.040071.
- The Gene Ontology Consortium: Gene ontology: tool for the unification of biology. Nature Genet. 2000, 25: 25-29. 10.1038/75556.
- Fisher R: A preliminary linkage test with Agouti and undulated mice; the fifth linkage-group. Heredity. 1949, 3: 229-241. 10.1038/hdy.1949.16.
- Bonferroni CE: Il calcolo delle assicurazioni su gruppi di teste. Studi in Onore del Professore Salvatore Ortu Carboni. 1935, 13-60.
- Li H, Ruan J, Durbin R: Mapping short DNA sequencing reads and calling variants using mapping quality scores. Genome Res. 2008, 18 (11): 1851-1858. 10.1101/gr.078212.108.