Flax plants (Linum usitatissimum L. cv AC McDuff) were grown at the AAFC Harrington farm (Harrington, PEI, Canada) in the 2008 to 2011 growing seasons. Plants were grown in four replications each year. At anthesis, referred to as 0 days after anthesis (0 DAA), individual flowers were tagged. Developing bolls were harvested at 0, 8, 16, 24 and 32 DAA and at maturity, and were immediately frozen in liquid nitrogen as previously described in Arabidopsis, soybean and flax [46, 47]. The 0 DAA samples consisted of ovaries free of other flower tissues, whereas the other boll samples (8–32 DAA and maturity) contained seeds at different developmental stages (Additional file 7). At the flowering stage, young leaf and stem tissues were similarly collected. Developing bolls, leaves and stems were stored at −80°C until use.
Before RNA isolation, ovules (0 DAA) and developing seeds were extracted from the bolls. Total RNA was isolated using Trizol (Invitrogen, Carlsbad, ON, Canada) as previously described. RNA samples were further purified using the Invitrogen PureLink™ RNA Mini kit (Invitrogen, Mississauga, ON, Canada) as per the manufacturer’s instructions and quantified by spectrophotometry. RNA quality was verified by agarose gel electrophoresis and with the Experion RNA analyzer (BioRad, Mississauga, ON, Canada).
Library mining and UGT cloning
The flax NAPGEN EST database (Plant Biotechnology Institute, NRC, Saskatoon) was mined using the keywords UGT, glucosyltransferase and glycosyltransferase. A total of 893 UGT hits were found amongst 178,656 ESTs. For primer design, we retained members of UGT subclasses 71 (7 hits), 88 (3 hits) and miscellaneous (7 hits). A set of 19 flax-specific primer pairs and one degenerate primer pair was designed (Additional file 8).
Total RNA (2 μg) from all developmental stages was used as template to synthesize cDNA using the first strand cDNA synthesis kit (Invitrogen, Mississauga, ON, Canada) following the manufacturer’s instructions. After treatment with 2 U RNase H (Invitrogen), the cDNA samples were diluted 10-fold and 1 μL was used as template. Each of the 20 primer pairs (Additional file 8) was used in PCR reactions consisting of an initial denaturation at 94°C for 2 min followed by 35 cycles of 94°C for 30 s, 60–63°C for 30 s and 72°C for 60 s, prior to a final extension at 72°C for 10 min. Aliquots of 10 μL of the PCR products were resolved on 1% agarose gels stained with ethidium bromide. The amplified fragments were purified using the QIAquick gel extraction kit (Qiagen) for direct sequencing and for TOPO cloning (Invitrogen) in E. coli prior to sequencing.
The identities of the partial sequences obtained were confirmed by BLASTx against the NCBI non-redundant protein sequence (nr) database using an E-value cutoff of 1e−30. The relationship between the partial sequences was inferred from a phylogenetic consensus tree constructed using the UPGMA method with 1000 bootstrap replicates as implemented in MEGA4.
To clone the full length UGTs, 5′ and 3′ gene specific primers (GSP) and nested gene specific RACE PCR primers were designed from representative sequences of each group observed in the consensus tree (Additional file 1) and were used in 5′ and 3′ cDNA end amplification reactions. Briefly, using the GeneRacer kit (Invitrogen, Mississauga, ON, Canada), the purified total RNA was dephosphorylated using calf intestinal phosphatase and decapped with tobacco acid pyrophosphatase. The RNA oligos were ligated to the decapped mRNA by T4 RNA ligase (Invitrogen, CA, USA) before reverse transcription of the mRNA using oligo-dT primers. The 5′ and 3′ RACE PCR reactions were carried out using eight pairs of GSP and nested primers (Additional file 9) following the kit’s specifications. The expected 5′ and 3′ RACE PCR products of the putative UGTs CL809, CL5227, CL8584, RP131 and RP250 were gel-purified, cloned in the TOPO 4.0 vector (Invitrogen, Mississauga, ON, Canada) and sequenced using M13 forward and reverse primers. New primer sets containing restriction sites compatible with the multiple cloning site of the pYES2/NT C plasmid vector (Invitrogen, Mississauga, ON, Canada) were designed from the 5′ and 3′ ends (Additional file 10) for the amplification of the full length cDNAs (Additional file 2). The amplified full length cDNAs were gel-purified, restriction digested and cloned into pYES2/NT C. The cDNA corresponding to one of the UGT clones reported by Bavkar et al. (accession JN088324.1) was also cloned as described above. The plasmids carrying the full length cDNA clones were sequenced using the T7 promoter primer (5′-TAATACGACTCACTATAGGG-3′) and the CYC1 reverse primer (5′-GCGTGAATGTAAGCGTGAC-3′).
UGT structural gene organization
To characterize the structural organization of the flax genomic DNA corresponding to each of the five UGTs, a BLASTn search within the flax sequence assembly (http://www.linum.ca) was performed to identify the 5′ and 3′ untranslated regions (UTRs), and the intron and exon structure of the coding regions. The PROSITE scan tool of the ExPASy web interface (http://www.expasy.org) was used to determine the position of the conserved motifs characteristic of plant UGTs such as the PSPG box.
In silico analysis of UGTs
To characterize the relative abundance of the cloned UGTs, an in silico EST analysis was performed. The five full length UGT sequences were compared to 13 previously described flax tissue-specific EST libraries (globular embryo, heart embryo, torpedo embryo, cotyledon embryo, mature embryo, pooled endosperm, globular stage seed coat, torpedo stage seed coat, etiolated seedling, leaves, stem, stem peel and mature flower). The number of EST hits corresponding to each query UGT in each library was recorded and plotted.
UGT real time gene expression analysis
To assess the transcript expression levels of the putative cloned UGTs in developing flax seed, leaf and stem tissue, real-time PCR primers were designed from the five flax UGTs, one PLR and one ribosomal RNA (EU307117) sequence (Additional file 11). The rRNA primers were used for data normalization. Total RNA was extracted from three separate biological replicates for each seed developmental stage (0, 8, 16, 24, 32 DAA, and mature seed). First strand cDNA was obtained as described earlier. The cDNA samples were quantified by spectrophotometry or Qubit (Invitrogen) and diluted to 100 ng/μL. Real-time PCR reactions were performed using the SYBR Green PCR Master Mix (BioRad Laboratories, Canada) on a CFX96 Real Time system (BioRad). For each sample, three biological and three technical replicates were obtained, for a total of 9 data points. The 25 μL real-time amplification reactions consisted of 1x SYBR Green Master Mix, 300 nM of each primer and 100 ng of first strand cDNA obtained from ovaries (0 DAA), developing seeds (8, 16, 24, 32 DAA), mature seeds, leaves or stems, or water in place of cDNA as a control. Real-time PCR reactions were performed as follows: denaturation at 95°C for 10 min followed by 40 cycles of 95°C for 30 s and 60°C for 30 s. Following the final amplification cycle, a melting dissociation curve was generated to ensure the specificity of the primers and to confirm the uniqueness of the amplification product. Relative expression was calculated following the 2^(−ΔΔCT) method described by Livak and Schmittgen and is reported as fold change.
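The fold-change calculation referred to above can be sketched as follows. This is a minimal illustration, assuming the rRNA amplicon is the reference and that one sample is chosen as the calibrator; the text does not state which sample served as calibrator, and the function name and all CT values below are hypothetical.

```python
from statistics import mean


def fold_change(ct_target_sample, ct_ref_sample,
                ct_target_calibrator, ct_ref_calibrator):
    """Relative expression by the 2^(-ddCT) method.

    Each argument is a list of replicate CT values. The reference gene
    (here assumed to be rRNA) normalizes the target gene within each
    sample; the calibrator sample then sets the baseline (fold change 1).
    """
    delta_sample = mean(ct_target_sample) - mean(ct_ref_sample)
    delta_calibrator = mean(ct_target_calibrator) - mean(ct_ref_calibrator)
    delta_delta_ct = delta_sample - delta_calibrator
    return 2 ** (-delta_delta_ct)


# Hypothetical CT values for one UGT at a seed stage versus a calibrator stage.
example = fold_change(
    ct_target_sample=[24.0, 24.2, 23.8],
    ct_ref_sample=[12.0, 12.1, 11.9],
    ct_target_calibrator=[27.0, 27.1, 26.9],
    ct_ref_calibrator=[12.0, 12.0, 12.0],
)
print(f"fold change relative to calibrator: {example:.2f}")
```

The method assumes roughly equal amplification efficiencies for the target and reference primers, so a three-cycle difference in normalized CT corresponds to an eight-fold difference in transcript level.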
SDG lignan profiling in developing flax seeds
To assess SDG lignan biosynthesis in developing flax seeds, 250 mg of ovary or seed at six developmental stages was used as starting material, following modifications to a protocol described by Popova et al. Developing flax seed tissue was ground to a fine powder in liquid nitrogen using a mortar and pestle. The powder (200 mg) was transferred into a glass centrifuge vial and defatted with 2 mL hexane (1:10 w/v) on a Wrist Action Shaker (Burrell Scientific, Pittsburgh, PA, USA) for 2 h at room temperature. After centrifugation at 1500 rpm for 15 min, the supernatant was discarded. The pellet was rinsed with 2 mL hexane, centrifuged and air-dried for 15 min. The defatted material was extracted with 2 mL of 70% (v/v) methanol/water at 55°C for 2 h using rotation in an oven, with intermittent manual shaking 2–3 times. A final vigorous shaking was performed for 15 min on the Wrist Action Shaker at room temperature. The samples were centrifuged at 1500 rpm for 15 min and the supernatant (S1) was collected in a new capped vial. The residue was rinsed again with 0.5 mL 70% methanol and centrifuged, and the supernatant (S2) was collected and pooled with S1. The total supernatant volume was recorded before hydrolysis. The combined samples (S1 + S2) were hydrolysed for 1 h at 60°C with 0.5 N NaOH at a ratio of 3:5 (v/v). After hydrolysis, the samples were immediately neutralized using 0.5 N HCl at a ratio of 0.4 mL for every 0.5 mL extract. The hydrolysate was cooled and purified via solid phase extraction using 10 mL Waters HLB columns (Waters, Mississauga, ON, Canada). The eluted lignan fractions were collected in glass vials and dried using a rotary evaporator (Heidolph instrument Gamborg, Germany). The dried material was dissolved in methanol:water (50:50), filtered and injected for UPLC-MS analysis using a commercially available SDG standard (Chromadex, Irvine, CA, USA) as reference.
An Acquity H-Class quaternary pump UPLC system (Waters, Mississauga, ON, Canada) equipped with in-line degassing, a diode array detector (DAD), a robotic autosampler, sample and column temperature controls and a tandem quadrupole mass spectrometer (TQD) was used for lignan profiling. A ternary solvent system consisting of water, acetonitrile and 10% formic acid in water was used for UPLC-MS analysis. UV–vis spectra were recorded from 210–600 nm. To confirm the identity of the profiled metabolites, the MS was run in ESI mode at 3000 V capillary voltage, in scanning mode from 100–2000 a.m.u., with a fragmentation setting of 150 V and 13.0 L/min carrier gas (N2) flow at 350°C and 60 psi. The post-hydrolysis SDG lignan peak was identified and quantitated through comparison (UV–vis absorption, retention time) to a commercial standard. Other phenolic compounds, including hydroxycinnamic acids liberated by the base hydrolysis, were present but were not quantified. A standard curve for SDG was created, relating integrated peak area (mAU*s) (Y) to concentration of SDG (mg/mL) (X). In brief, 1 mg of authentic standard was dissolved in 50% methanol and a serial dilution was created in triplicate, halving the concentration each time. The resulting standard curve was linear from 0.5 mg/mL to 0.00781 mg/mL (R² = 0.9901) and was used to determine SDG content in relation to developmental stage (DAA). For each of the six developmental stages, three extractions and HPLC analyses were performed from three biological replicates and the values were presented as the mean of the three data points.
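The standard-curve step described above can be illustrated with a short sketch: build the two-fold dilution series, fit peak area against concentration by ordinary least squares, and invert the fit to estimate the SDG concentration of a sample. The function names and the peak areas are hypothetical and for illustration only; the fitting software actually used is not stated in the text.

```python
def dilution_series(top_conc, n_points):
    """Two-fold serial dilution starting at top_conc (mg/mL)."""
    return [top_conc / 2 ** i for i in range(n_points)]


def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept, with R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1 - ss_res / ss_tot


def concentration_from_area(area, slope, intercept):
    """Invert the standard curve: peak area (mAU*s) -> SDG (mg/mL)."""
    return (area - intercept) / slope


# Seven two-fold steps span 0.5 mg/mL down to about 0.00781 mg/mL.
concs = dilution_series(0.5, 7)
# Hypothetical integrated peak areas (mAU*s), not measured data.
areas = [5010.0, 2490.0, 1262.0, 620.0, 318.0, 150.0, 82.0]

slope, intercept, r_squared = fit_line(concs, areas)
print(f"lowest standard: {concs[-1]:.5f} mg/mL, R^2 = {r_squared:.4f}")
print(f"sample estimate: {concentration_from_area(900.0, slope, intercept):.4f} mg/mL")
```

Estimates are only meaningful for peak areas that fall within the range covered by the standards.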
Heterologous expression of flax UGTs in yeast
The pYES2/NT C plasmid constructs harbouring the cDNA of the five UGTs described in this study were used to transform the yeast strain INVSc1 using the S.c. EasyComp™ transformation kit (Invitrogen, CA, USA). The flax UGT cDNA of GenBank accession JN088324.1 was similarly transformed for comparative functional analyses. Single transformant INVSc1 yeast colonies were inoculated into 15 mL of Saccharomyces cerevisiae minimal medium without uracil (SC-U, prepared as recommended by Invitrogen) supplemented with 2% raffinose and grown for 3 days under shaking at 30°C until the OD600 reached 2.0. The culture was diluted in 50 mL of induction medium (SC-U supplemented with 1% raffinose and 2% galactose) to achieve an initial OD600 of 0.4. The culture was further incubated under shaking at 30°C for 24 h, with 5 mL sub-samples collected at 0, 4, 8, 12 and 24 h to monitor protein expression. The OD600 at each time point was recorded. The induced yeast cells were harvested by centrifugation at 1,500 × g for 5 min at 4°C. The cells were washed using 500 μL cold sterile distilled water and centrifuged. The pellets were washed again at 4°C in 500 μL of lysis buffer (50 mM sodium phosphate, pH 7.4, supplemented with 5% glycerol and 1 mM PMSF). After centrifugation, the cells were mechanically disrupted by vortexing for 30 s in the presence of an equal volume of 425–600 μm acid-washed glass beads (Sigma Aldrich, Canada). After vortexing, the sample was incubated on ice for 30 s. The vortexing and incubation cycle was repeated 4 times to ensure complete cell lysis. The lysates were centrifuged at 18,620 × g for 10 min at 4°C and the supernatant was collected. The optimum induction time for all the UGTs was determined by western blot using equal amounts of protein and anti-Xpress™ antibodies, which recognize the epitope present between the 6x histidine tag and the multiple cloning site of the construct.
The polyhistidine-containing recombinant proteins were purified using the ProBond™ purification system (Invitrogen, CA, USA) following the manufacturer’s instructions. The purified enzymes were concentrated using 0.5 mL Ultracel®-10K Amicon membrane columns (Millipore, Ireland). Protein concentrations were determined using the Bradford protein assay kit (BioRad Laboratories, Canada).
The crude and purified recombinant protein extracts obtained from the yeast cultures harbouring the five different UGT cDNAs reported in this study and the one derived from JN088324.1 were reacted with different aglycone substrates, including SECO (Chromadex, Irvine, CA, USA), silibinin, quercetin, kaempferol and the phenolic acids coumaric acid, caffeic acid, sinapic acid, cinnamic acid and ferulic acid (Sigma Aldrich, Canada). The 100 μL reaction mixture consisted of reaction buffer (50 mM sodium phosphate, 1 mM PMSF, 5% glycerol, pH 7.4), 280 μM aglycone substrate (acceptor for glycosylation) and 1.64 mM UDP-glucose (sugar donor) (Sigma Aldrich, Canada). The reaction mixtures were pre-incubated at 30°C for 10 min and the reactions were initiated with the addition of 50 μg of enzyme. After incubation at 30°C for 30 min, the reactions were stopped with 100 μL of 0.5% trifluoroacetic acid in acetonitrile. The reaction mixtures were passed through 0.2 μm filters (Pall Life Sciences, Mississauga, ON, Canada) to remove any particulates that might have formed during the reaction. The separation and identification of the reactants and products derived from the enzyme assays were carried out using a Waters H-Class Acquity UPLC system (Waters, Mississauga, ON) equipped with a TQD tandem mass spectrometer. The formation of glycosylated products was monitored by examining the masses and the principal fragments of eluted peaks via ESI–mass spectrometry. Two parallel MS2 scans were performed ranging from 120–800 a.m.u., using 15 and 45 V cone voltages. Selected ion recording (SIR) spectra were also collected to enhance the sensitivity of detection of SECO, SMG and SDG. The capillary voltage was 3 kV, the extractor was set to 3 V and the RF lens to 0.1 V.
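The dilution of the starter culture to the induction OD600 follows from C1·V1 = C2·V2. The sketch below assumes that OD600 scales linearly with cell density over this range and that 50 mL is the final induction volume; neither is stated explicitly in the text, and the function name is hypothetical.

```python
def inoculum_volume_ml(starter_od600, target_od600, final_volume_ml):
    """Volume of starter culture needed to reach target_od600 in final_volume_ml."""
    if starter_od600 <= 0 or target_od600 > starter_od600:
        raise ValueError("starter OD600 must be positive and at least the target OD600")
    return target_od600 * final_volume_ml / starter_od600


# Starter culture at OD600 2.0, induction culture at OD600 0.4 in 50 mL.
volume = inoculum_volume_ml(2.0, 0.4, 50.0)
print(f"{volume:.1f} mL starter culture + {50.0 - volume:.1f} mL induction medium")
```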
Chromatographic conditions consisted of a binary gradient system composed of 3% formic acid in water (A) and acetonitrile (B), varied according to the following gradient: t0, A = 68%; t1 = 4.4 min, A = 0%; t2 = 6 min, A = 0% isocratic; t3 = 7 min, A = 68%; t4 = 8 min, A = 68% isocratic. Peaks detected at 280 nm, indicative of phenolic compounds, were validated using authentic standards (SECO and SDG) purchased from Chromadex (Irvine, CA, USA). A standard curve for SDG was created as detailed above. Standard purified SMG was prepared as described by .
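The gradient programme above can be written as a table of time points and evaluated at any time in the run. The sketch below assumes linear ramps between consecutive time points, which is the usual pump behaviour but is not stated in the text; the names GRADIENT and percent_a are hypothetical.

```python
# (time in min, % solvent A) taken from the gradient description.
GRADIENT = [(0.0, 68.0), (4.4, 0.0), (6.0, 0.0), (7.0, 68.0), (8.0, 68.0)]


def percent_a(t_min):
    """Percentage of solvent A (3% formic acid in water) at time t_min.

    Assumes linear interpolation between consecutive gradient points.
    """
    if t_min < GRADIENT[0][0] or t_min > GRADIENT[-1][0]:
        raise ValueError("time is outside the 0-8 min gradient programme")
    for (t_start, a_start), (t_end, a_end) in zip(GRADIENT, GRADIENT[1:]):
        if t_start <= t_min <= t_end:
            fraction = (t_min - t_start) / (t_end - t_start)
            return a_start + fraction * (a_end - a_start)


for t in (0.0, 2.2, 5.0, 6.5, 8.0):
    a = percent_a(t)
    print(f"t = {t:3.1f} min: A = {a:5.1f}%, B = {100.0 - a:5.1f}%")
```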
Kinetic and biochemical characterization of UGT74S1
To determine the optimal pH, temperature and enzyme concentration, and the effect of metal cofactors on enzyme activity, pH values from 6.0 to 9.0, temperatures from 25°C to 50°C, enzyme amounts from 10 to 120 μg and two concentrations (1 and 10 mM) of seven metal cofactors (NaCl, KCl, MgCl2, MnCl2, CaCl2, FeSO4 and CuSO4) were tested in 100 μL reaction mixtures. To determine the initial velocity of the recombinant UGT74S1 enzyme, a time course study (5, 10, 15, 30, 45 and 60 min) using the optimum enzyme concentration and fixed excess substrate concentrations (280 μM SECO; 1.67 mM UDP-glucose) was conducted at 30°C, pH 8. Linearity was maintained in assays up to 30 min at 30°C. The initial velocity of the reaction was measured at 10 min, at which point no more than 10% of SECO had been converted to SDG. The assays were then carried out using various substrate concentrations (70–1400 μM SECO with UDP-glucose fixed at 1.67 mM; 0.82–6.56 mM UDP-glucose with SECO fixed at 280 μM) under optimum conditions for 30 min for the determination of kinetic parameters. The apparent Vmax and Km values for the glucosyl donor and acceptor substrates in the presence of 80 μg of the enzyme were determined from Lineweaver-Burk plots. The kcat was determined by dividing Vmax by the enzyme concentration.
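The Lineweaver-Burk estimation named above rests on the double-reciprocal form of the Michaelis-Menten equation, 1/v = (Km/Vmax)(1/[S]) + 1/Vmax, so that a straight-line fit of 1/v against 1/[S] gives Vmax = 1/intercept and Km = slope/intercept. The sketch below is a minimal illustration with synthetic, noise-free velocities generated from hypothetical parameters; the function names, units and all numerical values are assumptions, not data from this study.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def lineweaver_burk(substrate, velocity):
    """Apparent Km and Vmax from a double-reciprocal (1/v versus 1/[S]) fit."""
    inv_s = [1.0 / s for s in substrate]
    inv_v = [1.0 / v for v in velocity]
    slope, intercept = linear_fit(inv_s, inv_v)
    vmax = 1.0 / intercept
    km = slope / intercept
    return km, vmax


def kcat(vmax, enzyme_conc):
    """Turnover number: Vmax divided by enzyme concentration (consistent molar units)."""
    return vmax / enzyme_conc


# Hypothetical SECO concentrations (uM) within the 70-1400 uM range of the assays.
seco = [70.0, 140.0, 280.0, 560.0, 1400.0]
# Synthetic velocities from Michaelis-Menten with hypothetical Km = 200, Vmax = 5.
true_km, true_vmax = 200.0, 5.0
rates = [true_vmax * s / (true_km + s) for s in seco]

km_est, vmax_est = lineweaver_burk(seco, rates)
print(f"apparent Km = {km_est:.1f} uM, apparent Vmax = {vmax_est:.2f} (velocity units)")
print(f"kcat = {kcat(vmax_est, 0.5):.1f} per unit time, for a hypothetical 0.5 uM enzyme")
```

With real, noisy data the reciprocal transform gives the lowest substrate concentrations the most weight, so the estimates are usually described as apparent values, as in the text.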