A new allele of flower color gene W1 encoding flavonoid 3'5'-hydroxylase is responsible for light purple flowers in wild soybean Glycine soja

Background: Glycine soja is a wild relative of soybean that has purple flowers. No flower color variant of Glycine soja has been found in the natural habitat.

Results: B09121, an accession with light purple flowers, was discovered in southern Japan. Genetic analysis revealed that the gene responsible for the light purple flowers was allelic to the W1 locus encoding flavonoid 3'5'-hydroxylase (F3'5'H). The new allele was designated as w1-lp. The dominance relationship of the locus was W1 > w1-lp > w1. One F2 plant and four F3 plants with purple flowers were generated in the cross between B09121 and a Clark near-isogenic line with w1 allele. Flower petals of B09121 contained lower amounts of four major anthocyanins (malvidin 3,5-di-O-glucoside, petunidin 3,5-di-O-glucoside, delphinidin 3,5-di-O-glucoside and delphinidin 3-O-glucoside) common in purple flowers and contained small amounts of the 5'-unsubstituted versions of the above anthocyanins, peonidin 3,5-di-O-glucoside, cyanidin 3,5-di-O-glucoside and cyanidin 3-O-glucoside, suggesting that F3'5'H activity was reduced and flavonoid 3'-hydroxylase activity was increased. F3'5'H cDNAs were cloned from Clark and B09121 by RT-PCR. The cDNA of B09121 had a unique base substitution resulting in the substitution of valine with methionine at amino acid position 210. The base substitution was ascertained by dCAPS analysis. The polymorphism associated with the dCAPS markers co-segregated with flower color in the F2 population. F3 progeny test, and dCAPS and indel analyses suggested that the plants with purple flowers might be due to intragenic recombination and that the 65 bp insertion responsible for gene dysfunction might have been eliminated in such plants.

Conclusions: B09121 may be the first example of a flower color variant found in nature. The light purple flower was controlled by a new allele of the W1 locus encoding F3'5'H. The flower petals contained unique anthocyanins not found in soybean and G. soja. B09121 may be a useful tool for studies of the structural and functional properties of F3'5'H genes as well as investigations on the role of flower color in relation to adaptation of G. soja to natural habitats.


Background
Soybean (Glycine max (L.) Merr.) is believed to have been domesticated in north-eastern China from its wild relative, Glycine soja Sieb. & Zucc. [1]. Glycine soja is native throughout China, the adjacent area of Russia, Korea, Japan and Taiwan [1]. Flower color of G. soja is almost exclusively purple; by contrast, 33% (5,544 out of 16,855) of the soybean accessions in the USDA Soybean Germplasm Collections have white flowers (Dr. R.L. Nelson, personal communication 2006). The reason why G. soja almost lacks flower color variants is uncertain [2]. A few white-flowered G. soja accessions were reported in a Chinese germplasm collection, but these had high 100-seed weight, strongly suggesting recent outcrossing with G. max [2]. One white-flowered plant (PI 424008C) was found in 1998 among the progeny of a purple-flowered G. soja accession (PI 424008A) that was originally introduced from South Korea in 1976 [2]. Genetic analysis indicated that the white flower was caused by a recessive allele at the W1 locus similar to the white-flowered soybeans [2].
In soybean, six genes (W1, W2, W3, W4, Wm and Wp) primarily control flower color and two genes (T and Td) control pubescence color [3,4]. The hydroxylation pattern of B-ring in flavonoids plays an important role in the coloration of seed coats, flower and pubescence of soybeans. The B-ring of flavonoids can be hydroxylated at either the 3' position leading to the production of cyanidin-based pigments, or at both the 3' and 5' positions to produce delphinidin-based pigments.
Two key enzymes involved in this pathway are flavonoid 3'-hydroxylase (F3'H) and flavonoid 3'5'-hydroxylase (F3'5'H), which are both microsomal cytochrome P450 dependent monooxygenases that require NADPH as a co-factor [5]. Chromatographic experiments suggested that T and W1 loci are responsible for the formation of flavonoids with 3', 4' and 3', 4', 5' B-ring hydroxylation patterns, respectively [6][7][8]. Hence, T and W1 are presumed to encode F3'H and F3'5'H, respectively. The F3'H cDNA was cloned and characterized from a pair of near-isogenic lines (NILs) for the T locus, To7B (TT, tawny pubescence) and To7G (tt, gray pubescence) [9]. Sequence analysis revealed that they differed by a single-base deletion of C in the coding region of To7G. The deletion generated a truncated polypeptide lacking the GGEK consensus sequence of F3'H gene and the heme-binding domain, resulting in non-functional protein.
The W1 gene has a pleiotropic effect on flower and hypocotyl color: soybean cultivars with purple/white flowers have purple/green hypocotyls. The soybean F3'5'H gene was cloned from NILs for W1; this confirmed that W1 encodes F3'5'H and showed that the gene of white-flowered NILs contained a 65 bp insertion in the coding region [10]. In addition to the F3'5'H protein, a cytochrome b5 is required for full activity of F3'5'H in petunia, and mutation in cytochrome b5 results in a reduction in F3'5'H activity and alteration of anthocyanin amount and composition [11].
Yasuda discovered B09121, a flower color variant of G. soja, on a slope between a paddy field and a ditch in Karatsu, Saga Prefecture (southern Japan) in 2002 (unpublished result) (Figure 1). Its banner petals have a pale pinkish hue and a pronounced light purple pigmentation that originates from the base of the petals and spreads in streaks towards the petal margins. We designated the flower color as light purple. Considering its growth habit, small seed size and unique flower color, it is unlikely that the flower color of B09121 was derived from outcrossing with soybean. To our knowledge, B09121 is the first example of a flower color variant of G. soja found in nature. The first objective of this study was to determine the genetic basis of purple flower color by crossing experiments. The second objective was to analyze the flavonoids in flower petals of G. soja accessions. The third objective was to clone and characterize a gene responsible for light purple flowers.

Genetic analysis
A Canadian soybean cultivar Harosoy (W1W1 W2W2 w3w3 W4W4 WmWm WpWp tt) with purple flowers and gray pubescence and a NIL of a US soybean cultivar Clark for W1 gene, Clark-w1 (L63-2373, w1w1 W2W2 w3w3 W4W4 WmWm WpWp TT) with white flowers and tawny pubescence were crossed with B09121 having light purple flowers and tawny pubescence in 2005. Flowers of Harosoy and Clark-w1 were emasculated one day before opening and pollinated with B09121. Hybridity of the F1 plants was ascertained by tawny pubescence color in the cross with Harosoy and by spindly growth habit in the cross with Clark-w1. Seeds of L63-2373 were provided by the USDA Soybean Germplasm Collection. The NIL was produced by crossing Clark with T139 and backcrossing the progeny to Clark up to BC6 [14].
A total of 120 to 130 F2 seeds derived from each of the crosses with Harosoy and Clark-w1 were planted in a field (low-humic andosols) on June 13, 2007 at the National Institute of Crop Science, Tsukuba, Japan (36°06'N, 140°05'E). A bulk of 30 seeds each of 36 F3 families derived from the cross with Clark-w1 were planted at the same location on June 8, 2008. N, P and K were applied at 3.0, 4.4 and 8.3 g m⁻², respectively. Flower color was scored in individual F2 and F3 plants. Banner petals were collected with forceps on the day of opening. Two 200 mg samples of banner petals were collected in 2 ml of MeOH containing 0.1% (v/v) HCl for anthocyanin analysis. Two 200 mg samples in 2 ml of absolute MeOH were also collected for the determination of flavonol and dihydroflavonol. High performance liquid chromatography (HPLC) of anthocyanins, flavonols and dihydroflavonol was performed following previously described protocols [12]. The 2 ml extracts were filtered through disposable filtration units (Maishoridisc H-13-5, Tosoh, Japan) and 10 μl from each sample was subjected to HPLC analysis. The amount of flavonoids was estimated from the pertinent peak area in the HPLC chromatogram (detection wavelength of anthocyanins = 530 nm; flavonols = 351 nm; dihydroflavonols = 290 nm). The peak area was subjected to analysis of variance using Statistica software (StatSoft, Inc. Tulsa, OK).

RNA extraction and cDNA cloning
Total RNA was extracted from banner flower petals (100 mg) of Clark, Clark-w1 and B09121 using the TRIZOL Reagent (Invitrogen) according to the manufacturer's instructions. cDNA was synthesized by reverse transcription of 5 μg of total RNA using the Superscript III First-Strand Synthesis System (Invitrogen) and an oligo(dT) primer according to the manufacturer's instructions. The full-length cDNA was cloned from Clark and B09121, using a pair of PCR primers (5'-AACTAGCAAATTAATTAGCTT and 5'-CAACCCAAACATTACTTAT) and end-to-end PCR. The PCR mixture contained 0.5 μg of cDNA, 10 pmol of each primer, 10 pmol of nucleotides and 1 unit of ExTaq in 1 × ExTaq Buffer supplied by the manufacturer (Takara) in a total volume of 50 μl. A 5 min denaturation at 94°C was followed by 30 cycles of 30 sec denaturation at 94°C, 1 min annealing at 58°C and 1 min extension at 72°C. A final 7 min extension at 72°C completed the program. The PCR was performed in an Applied Biosystems 9700 thermal cycler. The ~1.8 kbp PCR product was cloned into pCR 2.1 vector (Invitrogen) and sequenced.

dCAPS and indel analyses
Genomic DNA of Clark, Clark-w1, B09121 and 36 F2 plants that were used for F3 progeny tests was isolated from trifoliolate leaves by CTAB [19]. A pair of PCR primers (5'-GTCTAACGAGTTCAAGGCCAT, 5'-CAACTTGGCCAAAAAGGGTAT) was designed to detect a single-base substitution at nucleotide number 653 that is unique to B09121. The first primer contains a nucleotide C that is mismatched with its target DNA to artificially create a restriction site of NcoI (CCATGG) in Clark (Figure 2). The base substitution within the restriction site would result in presence/absence of the restriction site in the amplified product to generate a polymorphism. The PCR mixture contained 30 ng of genomic DNA, 5 pmol of each primer, 10 pmol of nucleotides and 1 unit of ExTaq in 1 × ExTaq Buffer supplied by the manufacturer (Takara) in a total volume of 25 μl. After an initial 30 sec denaturation at 94°C, there were 30 cycles of 30 sec denaturation at 94°C, 1 min annealing at 56°C and 1 min extension at 72°C. A final 7 min extension at 72°C completed the program. The amplified products were digested with NcoI, and the digests were separated on an 8% nondenaturing polyacrylamide gel in 1 × TBE buffer (90 mM Tris-borate, 2 mM EDTA, pH 8.0). After electrophoresis, the gel was stained with ethidium bromide and the DNA fragments were visualized under UV light.
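The logic of the dCAPS marker can be illustrated with a minimal Python sketch. It uses the forward primer given above and the known NcoI recognition sequence (CCATGG, cut after the first C). The amplicon length of 100 bp, the poly-A filler and the variant bases "GA" and "AG" are placeholders chosen for illustration only; the actual template sequence and the exact position of the substituted base within the site are not taken from the text.

```python
# Sketch of the dCAPS design: the mismatched forward primer plus the two
# template bases that follow it form an NcoI site (CCATGG) only when those
# two bases are "GG".
PRIMER = "GTCTAACGAGTTCAAGGCCAT"  # forward dCAPS primer from the text
NCOI_SITE = "CCATGG"              # NcoI recognition sequence
NCOI_CUT_OFFSET = 1               # NcoI cuts C^CATGG, one base into the site


def make_amplicon(bases_after_primer, total_length=100):
    """Build a placeholder amplicon: primer, allele bases, then filler.

    The filler (poly-A) and total_length are assumptions, not real sequence.
    """
    filler_len = total_length - len(PRIMER) - len(bases_after_primer)
    return PRIMER + bases_after_primer + "A" * filler_len


def ncoi_fragments(amplicon):
    """Return fragment lengths after cutting at the first NcoI site, if any."""
    start = amplicon.find(NCOI_SITE)
    if start == -1:
        return [len(amplicon)]  # no site: the product stays uncut
    cut = start + NCOI_CUT_OFFSET
    return [cut, len(amplicon) - cut]


if __name__ == "__main__":
    print("site present (GG):", ncoi_fragments(make_amplicon("GG")))
    print("site absent  (GA):", ncoi_fragments(make_amplicon("GA")))
```

With "GG" after the primer, the site begins at the third-from-last primer base and the cut falls 18 bases from the 5' end of the product; any substitution in either of the two template bases leaves the product uncut.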
A pair of indel PCR primers (5'-TTTTGAGCTTATTCCATTTGG, 5'-TGAATATTCGAACCCAACCA) was designed to identify the 65 bp insertion in F3'5'H gene of soybean lines with w1 allele based on the previous report [10]. The PCR profile and electrophoresis conditions were identical to those of the dCAPS analysis except that annealing was conducted at 59°C.

Semi-quantitative RT-PCR analysis
Semi-quantitative RT-PCR was conducted by reverse transcription of 5 μg of total RNA using the Superscript III First-Strand Synthesis System and an oligo(dT) primer according to the manufacturer's instructions. To test the transcription level of the F3'5'H gene, PCR reactions were carried out in a volume of 25 μl, using 125 ng of cDNA. The initial 30 sec denaturation at 94°C was followed by 26 cycles of 30 sec denaturation at 94°C, 1 min annealing at 59°C and 1 min extension at 72°C. A final 7 min extension at 72°C completed the program. The primers were 5'-GACGCTGAGGATATTCAACC and 5'-AGAAATCTGTGAGGTCACGA. A soybean actin gene was used as a control. The initial 30 sec denaturation at 94°C was followed by 20 cycles of 30 sec denaturation at 94°C, 1 min annealing at 56°C and 1 min extension at 72°C. A final 7 min extension at 72°C completed the program. The primers were 5'-CTGGGGATGGTGTCAGCCACAC and 5'-CACCGAACTTTCTCTCGGAAGGTG. PCR products were loaded on a 1.2% agarose gel, stained with ethidium bromide and visualized under UV light.

Accession Numbers
Sequence data from this article have been deposited with the DDBJ Data Libraries under accession nos. AB540111 (Clark) and AB540112 (B09121).

(Table 1). Results of the F3 progeny tests supported the hypothesis that light purple flower was controlled by a new allele at the W1 locus that was dominant to the w1 allele. We designated the allele as w1-lp (light purple). The gene symbol was approved by the Soybean Genetics Committee. Dominance relationship of the locus was W1 > w1-lp > w1.
In contrast, flower petals of white-flowered line PI 424008C contained no anthocyanins. Further, the flower petals of B09121 contained about half the amount of A1 to A4 compared with Clark. The HPLC chromatogram of B09121 exhibited three additional anthocyanin peaks, A5 to A7, that were not found in soybeans or other G. soja lines (Figure 3). Based on the comparison of retention time with authentic specimens, peonin, cyanin and chrysanthemin, A5, A6 and A7 were estimated as peonidin 3,5-di-O-glucoside, cyanidin 3,5-di-O-glucoside and cyanidin 3-O-glucoside, respectively.
Eight peaks, F1 to F8, corresponding to flavonol glycosides, were detected by HPLC analysis: F1 (kaempferol

Table 1 Segregation of flower color in F1 plants and F2 population derived from a cross between a soybean cultivar Harosoy with purple flowers and B09121, a Glycine soja accession with light purple flowers, and segregation of flower color in F1 plants, and F2 and F3 populations derived from a cross between a soybean near-isogenic line Clark-w1 with white flowers and B09121 in Tsukuba, Japan.

The amounts of flavonol glycosides estimated by peak areas in HPLC analysis are presented in Table 4. PI 424008A and PI 424008C contained all eight of the flavonol glycosides found in Clark. COL/AOMORI/1983/NASU-2 lacked F4 and Kokaigawa-1 lacked F8. B09121 was devoid of F4 and F7. However, the total amount of flavonol glycosides was not significantly different among the Clark and G. soja lines included in this report. One peak (F9) corresponding to dihydroflavonol (aromadendrin 3-O-glucoside) was found by HPLC analysis in Clark and all of the G. soja lines. The amount of aromadendrin 3-O-glucoside estimated by peak area in HPLC analysis is presented in Table 5. The G. soja lines had 33 to 155% more aromadendrin 3-O-glucoside than Clark.

position 210) and from glutamic acid to valine (amino acid position 475), respectively (Figure 2). The latter amino acid substitution was also found in Chin-Ren-Woo-Dou, whereas the former was unique to B09121.
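The valine-to-methionine substitution at position 210 is attributed to a single base substitution, and the standard genetic code constrains which change this can be. The following Python sketch checks which valine codon can become the methionine codon ATG through exactly one base change. It is an inference from the genetic code only; the actual codon sequences of Clark and B09121 are not quoted from the deposited sequences.

```python
# Which valine codon can become methionine (ATG) by one base substitution?
VALINE_CODONS = ("GTT", "GTC", "GTA", "GTG")  # standard genetic code
METHIONINE_CODON = "ATG"                      # the only methionine codon


def single_base_change(codon, target):
    """Return [(position, old_base, new_base)] if exactly one base differs.

    Positions are 1-based within the codon; otherwise return an empty list.
    """
    diffs = [
        (i + 1, old, new)
        for i, (old, new) in enumerate(zip(codon, target))
        if old != new
    ]
    return diffs if len(diffs) == 1 else []


if __name__ == "__main__":
    for codon in VALINE_CODONS:
        change = single_base_change(codon, METHIONINE_CODON)
        if change:
            print(codon, "->", METHIONINE_CODON, change)
```

Only GTG qualifies, through a G-to-A change at the first codon position, which is the kind of change a marker such as dCAPS can score.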

dCAPS and indel analysis
The PCR reaction for dCAPS analysis produced bands with expected size of about 100 bp in Clark, Clark-w1 and B09121 (Figure 4). NcoI digested the bands of Clark and Clark-w1 and shortened the bands by 18 bp. In contrast, the band from B09121 was largely undigested. In addition, a faint band with approximately the same size was also observed among the digested bands in B09121. Similar results were obtained in dCAPS analysis using a different set of PCR primers and a different restriction enzyme (HphI) (data not shown). The PCR reaction for indel analysis produced shorter bands (255 bp) in Clark and B09121, and a longer band in Clark-w1 (310 bp) due to the 65 bp insertion (Figure 4). In addition, a faint band was also observed in Clark-w1 that was approximately the same size as the shorter bands in Clark and B09121.
In dCAPS analysis of the F2 population, plants with w1w1 genotype (white flower) had only shorter bands, whereas plants with w1-lpw1-lp genotype (light purple flower) had longer bands with faint bands similar to B09121 (Figure 5). Heterozygous plants with the w1-lpw1 genotype (light purple flower) had both bands at similar band intensity. In indel analysis, plants with w1-lpw1-lp genotype had only shorter bands, whereas plants with w1w1 genotype had longer bands and faint shorter bands. Heterozygous plants with w1-lpw1 genotype had both bands at similar band intensity. Thus, dCAPS and indel markers co-segregated in plants with white and light purple flowers. In contrast, the F2 plant with purple flowers had the two bands with similar intensities in dCAPS analysis and only a shorter band in indel analysis.
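The way the two markers are read together can be summarized as a small lookup, sketched here in Python. The band labels ("cut", "uncut", "short", "long") and the function name are illustrative choices rather than terms from the analysis, and faint bands are ignored so that only major bands are scored.

```python
# Sketch: combine major dCAPS and indel band patterns into an interpretation.
# "cut"/"uncut" refer to the NcoI-digested dCAPS product; "long" means the
# indel product carries the 65 bp insertion. Faint bands are not scored.
PATTERNS = {
    (frozenset({"cut"}), frozenset({"long"})):
        "w1w1 (white)",
    (frozenset({"uncut"}), frozenset({"short"})):
        "w1-lpw1-lp (light purple)",
    (frozenset({"cut", "uncut"}), frozenset({"short", "long"})):
        "w1-lpw1 (light purple)",
    (frozenset({"cut", "uncut"}), frozenset({"short"})):
        "putative intragenic recombinant (purple)",
}


def interpret_markers(dcaps_bands, indel_bands):
    """Return the interpretation for a pair of major-band patterns."""
    key = (frozenset(dcaps_bands), frozenset(indel_bands))
    return PATTERNS.get(key, "unclassified")


if __name__ == "__main__":
    print(interpret_markers({"cut", "uncut"}, {"short"}))
```

The last pattern is the informative one: heterozygosity at the base substitution combined with absence of the insertion is not expected from either parental allele alone.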

Semi-quantitative RT-PCR analysis
Results of semi-quantitative RT-PCR suggested that the transcript level of the F3'5'H gene was not substantially different among lines (data not shown).

Discussion
Flower color of G. soja is almost exclusively purple, whereas about 30% of soybean cultivars have white flowers. The reason why G. soja almost lacks flower color variants is uncertain [2]. In 1998, researchers found a white-flowered variant of PI 424008A, a USDA accession of G. soja with purple flowers that was originally introduced from South Korea in 1976 [2]. The mutation may have occurred during propagation at USDA. To our knowledge, B09121 is the first example of flower color variant found in the natural habitat. Genetic analysis suggested that light purple flower is controlled by a new allele at the W1 locus, w1-lp. Dominance relationship of the locus was W1 > w1-lp > w1. Interestingly, one F2 plant and four F3 plants with purple flowers were generated in the cross with Clark-w1. Considering the fact that purple-flowered plants were produced from heterozygous plants (w1-lp w1) in both F2 and F3 generations and that frequency of purple-flowered plants was similar (about 1%) across generations, the purple flower color may have been produced by a crossover in the W1 gene instead of seed contamination or out-crossing. The purple-flowered F2 plant produced F3 plants with purple and light purple flowers at a 3:1 ratio; this suggests that the region including the 65 bp insertion was eliminated from the genome. The dCAPS and indel analyses indicated that the base substitution was heterozygous but the indel region was homozygous without the 65 bp insertion in the purple-flowered F2 plant. The results further supported elimination of the insertion that is responsible for gene dysfunction from the plant by intragenic recombination. It remains to be investigated whether the existence of tandem repeats derived from the insertion is responsible for the high frequency of intragenic recombination. Sequence analysis of F3'5'H cDNA from Clark and B09121 indicated that two amino acids (amino acid numbers 210 and 475) were substituted. The former substitution has not been observed in soybean cultivars examined to date. It may be responsible for light purple flower and unique anthocyanin composition. However, no catalytic domains have been assigned to the region. The spontaneous mutation leading to flower color change may not have affected amount of F3'5'H gene transcripts, based on the results from semi-quantitative RT-PCR analyses.
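Fit to a 3:1 segregation ratio, as discussed above for the progeny of the purple-flowered F2 plant, is conventionally assessed with a chi-square goodness-of-fit test. The Python sketch below shows the calculation. The counts used are hypothetical and are not the observed data, and the test itself is a standard choice assumed here rather than a procedure named in the text.

```python
# Chi-square goodness-of-fit test of observed counts against a Mendelian ratio.
CRITICAL_CHI2_DF1_P05 = 3.841  # critical value for 1 degree of freedom, P = 0.05


def chi_square(observed, ratio):
    """Return the chi-square statistic for observed counts vs. an expected ratio."""
    total = sum(observed)
    ratio_sum = sum(ratio)
    expected = [total * r / ratio_sum for r in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


if __name__ == "__main__":
    # Hypothetical family: 22 purple and 8 light purple plants.
    observed = [22, 8]
    stat = chi_square(observed, (3, 1))
    fits = stat < CRITICAL_CHI2_DF1_P05
    print(f"chi-square = {stat:.3f}; consistent with 3:1 at P = 0.05: {fits}")
```

A statistic below the critical value means the counts do not deviate significantly from the 3:1 expectation; with families of about 30 plants the test has limited power, so it supports rather than proves the ratio.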

Flavonoids in flower petals of G. soja with purple flowers were generally similar to those of soybean cultivars with purple flowers. Flower petals of PI 424008C with white flowers had no anthocyanin but contained comparable amounts of flavonol glycosides and dihydroflavonols. This is consistent with the result from Clark-w1, a soybean NIL with white flowers, whose flower petals contained no anthocyanin although it had levels of flavonol glycosides and dihydroflavonol similar to Clark [12]. The present results confirmed that W1 solely controls anthocyanin biosynthesis in G. soja.
Purple flowers of soybean and G. soja contain four major anthocyanins with 3'4'5'-substituted form, malvidin 3,5-di-O-glucoside, petunidin 3,5-di-O-glucoside, delphinidin 3,5-di-O-glucoside and delphinidin 3-O-glucoside. In contrast, flower petals of B09121 contained lower amounts of the four major anthocyanins. In addition, they contained small amounts of the 5'-unsubstituted versions of the above anthocyanins, peonidin 3,5-di-O-glucoside, cyanidin 3,5-di-O-glucoside and cyanidin 3-O-glucoside. It appears that F3'5'H activity was reduced and F3'H activity was increased in flower petals of B09121. Flower petals of soybean and G. soja contain large amounts of kaempferols and very small amounts of quercetins [12,13], suggesting that F3'H activity may be very low. Further, alleles at the T locus encoding F3'H did not affect 3'-hydroxylation of flavonols in flower petals [12,13]. Therefore, the F3'H gene is unlikely to be responsible for the anthocyanin alteration. Alternatively, mutation in the F3'5'H gene possibly led to a reduction in F3'5'H activity and an increase in F3'H activity. In petunia, a cytochrome b5 is required for full activity of F3'5'H. A mutation in cytochrome b5 reduced 3'4'5'-substituted anthocyanins and increased 3'4'-substituted anthocyanins [11]. It is possible that the amino acid substitution might directly affect the amount and composition of anthocyanins or interact with cytochrome b5. Functional analysis using yeast recombinant assays may be necessary to identify the amino acid substitution that led to light purple flower and unique anthocyanin composition, to investigate the association with cytochrome b5, and to verify whether the amino acid substitution generated F3'H activity. Transformation experiments using a soybean line with w1 allele may be necessary to verify the function of the F3'5'H gene from B09121. B09121 may be the first example of a flower color variant of G. soja found in the natural habitat. It may be a useful tool for studies of the structural and functional properties of F3'5'H genes as well as investigations on the role of flower color in relation to adaptation of G. soja to natural habitats.

Conclusions
This study is the first report of a flower color variant of wild soybean G. soja discovered in nature. Genetic analysis revealed that light purple flower of the accession B09121 was controlled by a new allele of W1 locus encoding F3'5'H. The new allele was designated as w1-lp. The dominance relationship of the locus was W1 > w1-lp > w1. In crossing experiments, purple-flowered plants were generated in the cross between B09121 and a soybean near-isogenic line with w1 allele. F3 progeny test, and dCAPS and indel analyses suggested that the plants with purple flowers might be due to intragenic recombination. Flower petals of B09121 contained lower amounts of four major anthocyanins common in purple flowers and contained small amounts of the 5'-unsubstituted versions of the above anthocyanins that are absent in soybeans and other G. soja accessions. The results suggested that F3'5'H activity was reduced and flavonoid 3'-hydroxylase activity was increased in the flower petals. The cDNA of B09121 had a unique base substitution resulting in the substitution of valine with methionine. B09121 may be a useful tool for studies of the structural and functional properties of F3'5'H genes as well as investigations on the role of flower color in relation to adaptation of G. soja to natural habitats.