Comparative transcriptomic analysis of seed coats with high and low lignin contents reveals lignin and flavonoid biosynthesis in Brassica napus

Brassica napus L. (2n = 38, AACC) is one of the most important oil crops and sources of protein for animal feed worldwide. Lignin is a large molecule aromatic polymer and a major cell wall component. However, lignin in the seed coat reduces the availability and restricts the development of rapeseed cake. Therefore, it is critical to reduce the lignin content of the seed coat. Here, high-lignin (H-lignin) and low-lignin (L-lignin) content recombinant inbred lines (RILs) were selected from an RIL population for analysis. The cross-section results indicated that the seed coat of the H-lignin lines was thicker than that of the L-lignin lines, especially the palisade layer. The seed coats and embryos at 35, 40 and 46 days after flowering (DAF) were subjected to RNA sequencing (RNA-Seq), and the expression of the BnPAL and BnC4H gene families in the lignin pathway was significantly higher in the H-lignin seed coat than in the L-lignin seed coat. The Bn4CL gene family also showed this trend. In addition, among the genes related to plant hormone synthesis, BnaC02g01710D was upregulated and BnaA07g11700D and BnaC09g00190D were downregulated in H-lignin lines. Some transcription factors were upregulated, such as BnNAC080, BnNAC083, BnMYB9, BnMYB9-1, BnMYB60 and BnMYB60-1, while BnMYB91 was downregulated in H-lignin lines. Moreover, most genes of the flavonoid pathway, such as BnCHS and BnDFR, were strongly expressed in H-lignin seed coat. In Our study, some key genes such as hormone synthesis genes, transcription factors and miRNAs related to lignin and flavonoid biosynthesis were identified. A regulatory model of B. napus seed coat lignin was proposed. These results provide new insight into lignin and flavonoid biosynthesis in B. napus.

of plant cell walls, providing sufficient compressive strength to the cells to make the outer wall hydrophobic and waterproof [4][5][6]. Lignin appears in various tissues of plants, such as stems, roots, petioles, pods, and seed coats. There are three main types of lignin, i.e., p-hydroxyphenyl lignin (H-type lignin), syringyl lignin (S-type lignin) and guaiacyl lignin (G-type lignin), all of which are formed by polymerization of three monomers, p-coumaryl alcohol, sinapyl alcohol and coniferyl alcohol [7]. The lignin in angiosperms is mainly composed of G-type and S-type lignin, while that in gymnosperms is mainly G-type lignin [8].
In recent years, genes encoding the key enzymes in lignin synthesis have been cloned and functionally studied by a large number of researchers with the ultimate goal of altering the content or composition of lignin in plants. It was proven that a reduction in L-Phenylalanine ammonia-lyase (PAL) activity resulted in a decrease in lignin content [9]. Sewalt et al. [10] revealed that the content and composition of lignin were changed when the expression of PAL and cinnamate 4-hydroxylase (C4H) was altered in transgenic tobacco lines (Nicotiana tabacum). The lignin content of transgenic poplar (Populus tremuloides Michx.) and Arabidopsis lines were reduced when 4-coumarate: CoA ligase (4CL) expression was downregulated, and the G-and S-type lignin monomer ratio was also reduced [11][12][13]. As reported in some studies, genes encoding enzymes, including coumarate 3-hydroxylase (C3H) [14,15], hydroxycinnamoyl-CoA shikimate/quinate transferase (HCT) [16], caffeoyl coenzyme A O-methyltransferase (CCoA-OMT) [17], cinnamoyl CoA reductase (CCR ) [18,19], ferulate 5-hydroxylase (F5H) [20], caffeic acid O-methyltransferase (COMT) [17] and cinnamyl alcohol dehydrogenase (CAD) [18,21], have played a critical role in lignin biosynthesis. Laccase (LAC) and peroxidase (PER) have important catalytic effects in the formation of lignin polymers from lignin monomers [22][23][24]. In addition, NST1, NST2 and NST3 of the NAC transcription factor family regulate the formation of the Arabidopsis secondary wall, and as upstream regulators directly affect the downstream AtMYB46 and AtMYB83, while overexpression of NSTs leads to ectopic lignification [25][26][27]. Except for F5H, most of the genes encoding key enzymes in the synthesis of lignin monomers are directly regulated by AtMYB58 via AC-acting elements [28]. Although many studies have been conducted on related topics, the molecular mechanisms of this phenotype remain unclear due to the genomic complexity of B. napus and the influence of other factors on the rapeseed seed coat [29,30].
Reducing the lignin content in the seed coat is one of the important goals of rapeseed breeding. Previous studies have mapped candidate genes for the seed coat lignin content. Liu et al. [31] detected five and three QTLs accounting for 4.7% to 21.9% and 7.3% to 16.9% of the phenotypic variation for cellulose and hemicellulose by using a high-density SNP genetic map, and this result was supported by Wang et al. [32] observed three significant associations on A05, A09 and C05 by genome-wide association study (GWAS). To date, there is no transcriptomic information about the seed coat, especially for lines of B. napus with extremely different seed lignin contents. In the present study, different lines with high and low lignin contents in the seeds were screened, and high-throughput sequencing was used to identify the key genes involved in lignin synthesis in the seed coat of B. napus.

Screening and quality trait analysis of H-and L-lignin lines
Seeds of 172 RILs of B. napus were harvested in May 2014 and May 2015 and scanned by a near-infrared rapid quality analyser. The ADL contents in the L-lignin lines and H-lignin lines ranged from 0.97% to 1.34% and from 2.49% to 4.70%, respectively. The L-lignin lines GH06, RIL134, RIL215, RIL216, and RIL238 and the H-lignin lines P174, RIL4, RIL14, RIL103 and RIL162 were used for subsequent comparisons ( Table 1). The degree of yellowing in L-lignin lines was 116.58 to 144.71, whereas that of the H-lignin lines was 47.45 to 67.35 (Table 1). Under natural light, the coats of mature seeds in the H-lignin lines appeared greyish black, while those of the L-lignin lines appeared ginger yellow (Fig. 1A). In 2014, the average oil contents of the H-and L-lignin lines were 36.70% and 37.60%, respectively. The average oil contents of the H-and L-lignin lines were 40.99% and 42.63% in 2015, respectively, which were 4% ~ 5% higher than those in 2014. In 2014, the average protein contents of the H-and L-lignin lines were 28.25% and 30.89%, respectively. In contrast, the protein content was lower than that in 2015, and the average protein contents of the H-and L-lignin lines were 25.58% and 27.80%, respectively (Table 1). Three H-lignin (RIL14, RIL162 and P174) and L-lignin (RIL216, RIL238 and GH06) lines were chosen for comparison of hull rates. The results showed that the hull rate of H-lignin seeds was 15.26% ~ 18.76%, while that of L-lignin lines was 10.94% ~ 12.26% (Fig. 1B). The hull rate of H-lignin seeds was significantly higher than that of L-lignin lines (Table S1).

Microscopic observation of seed coat thickness of Hand L-lignin lines
To investigate whether the lignin content is related to the seed coat thickness, transverse sections of the seed coat of the H-and L-lignin lines (P174/GH06) were obtained for microscopic analysis. It was revealed that the palisade (p) of the H-lignin lines was thicker than that of the L-lignin lines; otherwise, the mucilaginous epidermal (mu) and endothelial layers (en) were not significantly different between the L-lignin and H-lignin lines (Fig. 1C). The results indicated that the thicker seed coat in the H-lignin line than in the L-lignin line mainly resulted from the difference in the palisade layer. Further analysis revealed that lignin was mainly distributed in the palisade of the seed coat and in small amounts in the endothelium (Fig. 2). The lignin staining in the H-lignin line seed coat was very strong (Fig. 2B, D and F), while the seed coat of the L-lignin line remained very light with  Because the lignin content in the seed coat is relatively low (relative to that in other tissues such as stems and roots) and the palisade of the seed coat is pigmented, it was difficult to observe dyeing.

Transcriptomic analysis of H-and L-lignin seed coats and embryos
A total of 98.58 Gb of clean data were obtained, and the Q30 of 12 samples was ≥ 91.33% (Table S2), which verified the quality of the transcriptome sequencing data. The obtained 12 samples of clean data were compared with the B. napus reference genome, and the average comparison efficiency of seed coats and embryos was 71.60% and 77.52%, respectively (Table S3). A total of 15,030 DEGs were screened (the number of DEGs is the sum of each DEG set) (Table S4). With the development of the seed, the number of upregulated and downregulated genes in the seed coat increased (Fig. S1), which indicated that complex changes may occur.

GO and TopGO enrichment analyses
The upregulated and downregulated DEGs of H-and L-lignin lines in the three seed development stages were annotated with the GO database. At 35 DAF, 1698 DEGs were annotated for "LC1 vs HC1", and more than 80% of DEGs were enriched in cell and cell parts. More than 60% of upregulated DEGs were enriched in organelles, metabolic processes and cellular processes; more than 60% of downregulated DEGs were enriched in organelles and metabolic processes (Fig. S2A). At 40 DAF, a total of 1839 DEGs were annotated for "LC2 vs HC2", and more than 80% of DEGs were enriched in cell and cell parts. More than 60% of upregulated genes were enriched in organelles, metabolic processes and cellular processes, and more than 40% of the DEGs were enriched in catalytic activity, binding and response to stimulus ( Fig. S2B). At 46 DAF, 2093 DEGs were annotated for "LC3 vs HC3", and the upregulated genes were mainly enriched in cell, cell part, organelle, catalytic activity and metabolic process; the downregulated genes were mainly enriched in cell, cell part and organelle, and catalytic activity (Fig. S2C). TopGO analysis was performed to further explore the enrichment of DEGs in the three stages of seed development in the H-and L-lignin lines. Based on KS significance, GO terms associated with lignin biosynthesis were screened with a threshold of KS < 0.01. During seed development, DEGs associated with the "cell wall" and "secondary metabolic processes" were significantly enriched in "LC2 vs HC2" and "LC3 vs HC3" ( Table 2). In addition, cinnamic acid biosynthetic process, vacuole, phenylalanine ammonia-lyase activity and transcinnamate 4-monooxygenase activity were significantly enriched, which indicated that these biological metabolic processes have a close relationship with lignin synthesis. The "biological process" (GO: 0008150) category of the GO primary classification contained nine biological processes that might be involved in lignin synthesis, namely, metabolic process (GO: 0008152), cellular process (GO: 0009987), developmental process (GO: 0032502), immune system process (GO: 002376), singleorganism process (GO: 0044699), multicellular organismal process (GO: 0032501), response to stimulus (GO: 0050896), localization (GO: 0051179) and biological regulation (GO: 0065007) (Fig. 3A). The cinnamic acid biosynthesis process was significantly enriched, and the secondary GO terms specifically enriched were metabolic process/single-organism process, single-organism metabolic process, secondary metabolic process, phenylpropanoid metabolic process, phenylpropanoid biosynthesis process, and cinnamic acid biosynthetic process (Fig. 3B). The cinnamic acid biosynthesis process was significantly enriched, indicating that some of the genes involved in this process might promote the biosynthesis of seed coat lignin.

KEGG metabolic pathway enrichment analysis
To explore the metabolic pathways enriched for the DEGs, the related DEGs in 12 samples were subjected to KEGG (Kyoto Encyclopedia of Genes and Genomes) metabolic pathway enrichment analysis. The KEGG metabolic pathways are divided into five functional regions, namely, organismal systems, metabolism, genetic information processing, environmental information processing, and cellular processes. In "LC1 vs HC1", most of the DEGs were enriched in the biosynthesis of amino acids, starch and sucrose metabolism, carbon metabolism, etc. (Fig. S3A). Most of the DEGs were enriched in ribosomes, biosynthesis of amino acids, starch and sucrose metabolism and protein processing in the endoplasmic reticulum. In "LC2 vs HC2" (Fig. S3B) and in "LC3 vs HC3", most of the DEGs were enriched in ribosome, starch and sucrose metabolism, phenylpropanoid biosynthesis and biosynthesis of amino acids. (Fig. S3C). This result indicated that the synthesis of a large number of amino acids, starch and sucrose promoted the formation of phenylpropane compounds, thereby further promoting lignin biosynthesis. Moreover, the DEGs were significantly enriched in phenylalanine metabolism, biosynthesis of amino acids and flavonoid biosynthesis during the three stages of seed development (Fig. S4). As expected, cluster analysis of the phenylpropanoid-lignin synthesis and flavonoid synthesis pathways revealed that related genes were upregulated in the seed coat and downregulated in the embryo (Fig. 4A, C), and the expression levels in the H-lignin lines (HC1, HC2 and HC3) were significantly higher than those in the L-lignin lines (LC1, LC2 and LC3) (Fig. 4B, D).
For DEG analysis, the genes encoding PAL, C4H and 4CL in the phenylpropane pathway were highly expressed and upregulated in H-lignin lines (Fig. 5). BnPAL4 was highly expressed in the seed coat of H-lignin lines, and the differential expression fold changes in "LC1 vs HC1"  S4B). Among the specific pathways for lignin synthesis, the BnCCoAOMT and BnCOMT gene families were strongly expressed in the seed coat (Fig. 5). We also observed that the genes encoding key enzymes of the flavonoid pathway were highly expressed, especially the CHS and DFR family genes. Moreover, the expression of most genes was higher in the H-lignin lines than in the L-lignin lines (Fig. 5). At 35, 40 and 46 DAF, all members of the BnPAL gene family and BnC4H gene family were significantly more highly expressed in the H-lignin seed coat than in the L-lignin seed coat (Fig. S5B, C), and most members of Bn4CL also exhibited the same phenomenon (Fig. S5D). To further verify the expression of these key genes related to lignin biosynthesis in the seed coat of B. napus, 22 genes, including 20 upregulated and two downregulated genes, were selected for qRT-PCR analysis (Table  S5). The gene expression levels (log 2 FC) by qRT-PCR were consistent with those by RNA-Seq (Figs. 6, S6), and the reliability of the data provided by RNA-Seq was confirmed.

Identification of candidate genes by combining GWAS and RNA sequencing
To identify potential candidate genes related to seed coat lignin content, we systematically explored the DEGs between H-and L-lignin lines. In total, 4270 significantly DEGs were identified (Fig. S9). For the seed coat hull lignin content, 123 candidate genes were detected by GWAS in a previous report [32]. Among the candidate genes, seven genes, BnaA05g27640D, BnaA07g15220D, BnaA09g31780D, BnaC04g16150D, BnaC05g42720D, BnaC09g40740D, and BnaC09g43250D, were confirmed by RNA sequencing analyses (Fig. 8).

Possible network of seed coat lignin in B. napus
The regulatory network of B. napus seed coat lignin was drawn with the combination of the present transcriptome data and that from previous studies [33,34]. The results are shown in Fig. 9. The NAC domain transcription factors play a "master switch" role in the regulation of the secondary wall and further regulate downstream MYB9, MYB60 and MYB91 through SND3 and NST1, while the MYB domain transcription factors are a "secondary switch". MYB9 and MYB60 positively regulate the structural genes of downstream lignin-specific synthesis, whereas MYB91 negatively regulates the structural genes. Moreover, MYB91 may also regulate other transcription factors. NAC domain transcription factors may also be actively affected by gibberellins and jasmonic acid, and are negatively affected by auxin. In addition, gibberellins, jasmonic acid and auxin may also act directly on MYB domain transcription factors.

Lignin content and seed coat thickness
The seed coats of yellow-seeded B. napus lines are thin, and the proportion occupied by the embryo is large, thereby increasing the contents of oil and protein in the same genetic background [35]. The seed lignin content is significantly negatively correlated with the seed oil content [36]. The hull rate of yellow-and black-seeded B. napus is different at different developmental stages. In this study, the hull rate of the H-lignin lines was significantly higher than that of the L-lignin lines, which was consistent with the results of Wang et al. [32]. To observe the thickness of the seed coat, seed coat cross sections were observed microscopically. The results suggested that the mature rapeseed seed coat was roughly divided into three layers, the mucous epidermal cells and the palisade and endothelial layers, from outside to inside. The seed coat of the H-lignin lines was obviously thicker than the Previous reports have shown that the palisade layer of the seed coat is developed from the endothelial layer of the seed [37], pigment accumulates in the palisade layer, and the black seed pigment is significantly more abundant than the yellow seed pigment. We proposed that lignin was deposited in the palisade cells, which increased the thickness of the cell wall, thereby increasing the overall thickness of the palisade layer.

Effect of transcription factors on seed coat lignin
Related reports indicate that transcription factors play an important role in plant secondary walls [33,34]. In this study, the DEGs at the three stages of seed development were annotated into the plant transcription factor database, and 120, 135 and 136 DEGs were screened and matched to 28, 34 and 34 families of transcription factors in the "LC1 vs HC1", "LC2 vs HC2" and "LC3 vs HC3" comparisons, respectively. The number of MYB and NAC transcription factors associated with lignin biosynthesis was higher, and the number of upregulated NAC transcription factors was greater than that of downregulated transcription factors. NST1, NST2 and NST3 of the NAC transcription factor family regulate the formation of the Arabidopsis secondary wall, and as upstream regulators directly affect downstream AtMYB46 and AtMYB83, overexpression of NSTs leads to ectopic lignification [25]. Miao et al. [38] showed that MYB46 interacted with four target genes, MYB83, CCR1, 4CL1, and PAL1. In this study, eight NAC transcription factors showed upregulated expression. In particular, BnNAC080 and BnNAC083 showed high differential expression fold changes, and their upregulated expression may regulate structural genes of lignin biosynthesis. Zhou et al. (2009) observed that AtMYB58 regulates most of the genes encoding key enzymes in the synthesis of lignin monomers through AC-acting elements [28]. In this study, five MYB transcription factors related to lignin synthesis were screened. In the H-lignin lines, BnMYB9, BnMYB60, BnMYB60-1 and BnMYB9-1 were upregulated, while BnMYB91 was downregulated.

Effect of hormone regulation on seed coat lignin
In this study, DEGs encoding plant hormones were annotated with the plant hormone database and were associated with auxin, brassinolide, cytokinin, ethylene, gibberellin and jasmonic acid, which indicated that these hormones play a role in the regulation of seed coat lignin. Analysis of differential expression fold changes and expression patterns suggested that upregulation of BnaC02g01710D and BnaA07g11700D may promote the synthesis of lignin in the seed coat, whereas downregulation of the B. napus auxin pathway may have an inhibitory effect on the regulation of lignin. Cecchetti et al. (2013) showed that exogenous application of auxin can decrease the expression level of AtMYB26 and inhibit secondary wall deposition, indicating that auxin is a negative regulator of lignin biological regulation [39], which is consistent with the results of our study. Gibberellin 20-oxidase (GA20ox) is a key enzyme in the anabolism of gibberellin in plants [40]. The upregulated expression of BnaC02g01710D may increase the content of gibberellin and promote the formation of lignin in the seed coat. In the xylem of Coleus blumei, GA3 and low concentrations of IAA promote an increase in S-type lignin content [41]. Jasmonic acid affects lignin biosynthesis at the transcriptional level by AtMYB61 [42][43][44]. In this study, the upregulated expression of Bna-A07g11700D might promote the biosynthesis of lignin.

Association between seed lignin and flavonoid biosynthesis
In plants, since lignin monomers are derived from the same sources of flavonoid biosynthesis, the biosynthesis of lignin and flavonoids is jointly affected by upstream phenylalanine anabolism [9]. In Arabidopsis, silencing AtHCT leads to a decrease in biomass, which has a major impact on the growth of the plant. In HCT-silenced plants, inhibition of lignin synthesis leads to conversion to flavonoids by metabolic flux reversal of chalcone synthase activity [45]. In this study, in the common phenylpropanoid pathway, the expression levels of DEGs encoding the enzymes PAL, C4H and 4CL were higher in the seed coat of H-lignin lines than in that of L-lignin lines. Similar results have been found by Jiang et al. (2019), which suggests that most DEGs are downregulated in yellow-seeded B. napus [46], and these results are consistent with the decreased flavonoid and lignin contents. 4CL catalyses the formation of p-coumaroyl-CoA by p-coumaric acid, which is the precursor of lignin and flavonoid biosynthesis. In the H-lignin lines, the expression of genes encoding PAL, C4H, and 4CL was significantly higher than that in the L-lignin lines. Among the specific pathways for lignin synthesis, CCoA-OMT catalyses the conversion of caffeoyl-CoA to feruloyl-CoA, which is the first methyltransfer reaction and is mainly involved in the synthesis of G-type lignin monomers. In addition, the expression of COMT was also high, and we speculated that lignin in the seed coat was mainly composed of G-type and S-type lignin monomers. Qu et al. (2013) found that the flavonoid content in the seed coat of B. napus was the highest at 42 d after pollination, and black-seeded B. napus had a significantly higher lignin content than yellow-seeded B. napus [47], which is consistent with the transcriptome data of this study. In the flavonoid synthesis pathway, the upregulated expression of DEGs encoding most of the enzymes in the H-lignin lines indicated that these enzymes have higher activity and therefore contribute to high flavonoid content in these lines. Liu et al. [31] constructed a high-density genetic linkage map for a B. napus recombinant inbred line population using a genome-wide single nucleotide polymorphism marker determined by a 60 K chip for four different environmental lines of B. napus. QTL analysis of the cell wall fibre components ADL, cellulose and hemicellulose showed that the seed colour and fibre traits of 11 QTLs were detected. The SNP sites of ADL traits are mainly located on chromosomes A09 and C05. Similar results have been found, for example, Bna.CCR1 is localized in silico near the peak of a corresponding seed fibre QTL [48]. In addition, Wang et al. [32] performed a GWAS analysis on the seed ADL content of B. napus, detected significant SNPs on A05, A09 and C05 and proposed the key genes BnaA.PAL4, BnaA.CAD2/BnaA. CAD3 and BnaC.CCR1 for lignin biosynthesis [32]. The comparison of the GWAS and our transcriptome results revealed that BnPAL4, which was verified by qRT-PCR, is the key gene for the difference in the lignin content among the lines.

Conclusions
Conclusively, our study revealed the key genes related to lignin biosynthesis in the Brassica napus seed coat. First, the seed coat of the H-lignin lines was thicker than that of the L-lignin lines, especially with regard to the palisade layer. Then, KEGG and GO analyses suggested that phenylpropanoid and flavonoid biosynthesis were significantly enriched in the H-lignin lines. The genes BnPAL, BnC4H and Bn4CL of the lignin biosynthesis pathway, BnCHS and BnDFR of the flavonoid biosynthesis pathway, BnaC02g01710D, BnaA07g11700D and BnaC09g00190D of plant hormone synthesis pathways, and some transcription factors BnNAC080, BnNAC083, BnMYB9, BnMYB9-1, BnMYB60, BnMYB60-1 and BnMYB91 were screened in the current study. Finally, by combined analysis of RNA-Seq and reported data, a model of B. napus seed coat lignin was proposed.

Plant materials
The seeds of two lines GH06 and P174 were observed by our lab. The population of 172 RILs was derived by single seed descent from F 2 offspring of a cross between the Chinese semi-winter rapeseed parental lines GH06 (yellow seeds, low lignin content) and P174 (black seeds, high lignin content) [31].

Data analysis
Analyses of the means from three individual experiments and determination of significant differences by Duncan's multiple range test were performed with SPSS 17.0. The level of significance (α) was set at 0.01 and 0.05.

Determination of lignin, fatty acid and protein contents
Seed traits were determined using a near-infrared rapid quality analyser (NIR System 6500, FOSS, Sweden) according to the method of Dimov et al. [49]. The instrument was warmed up for at least 50 min before scanning. A seed quality near-infrared model was used to determine seed quality traits, including the contents of acid detergent lignin (ADL), acid detergent fibre (ADF), neutral detergent fibre (NDF), fat, and protein, and the degree of yellowing in seeds.

Determination of the hull rate
Seeds of the H-and L-lignin lines were dried in an electric thermostat-controlled blast drying oven at 80 ± 2 °C for 72 h and weighed, and the total dry weight (W) (g) of the seeds was recorded. Each seed sample was approximately 3.00 g, and 3 replicates were used. The weighed seeds were immersed in sterile water for 24 h until fully imbibed, and the seed coat and embryos were separated by using anatomic needles. The seed coat was dried in an electric thermostat blast oven at 80 ± 2 °C for 48 h, and the dry weight of the seed coat (W1) (g) was recorded. The formula was as follows: Microscopic observation of seed coat thickness and staining of rapeseed At 40 DAF, mature seeds were used for the observation of seed coat thickness according to the method of Zhou et al. [28]. A cryostat (CM1850UV Leica) was used for UV disinfection, and the chassis temperature was decreased to -20 °C before use. Embedding and sectioning: The embedding agent was added dropwise until the rapeseed seeds were completely covered and the colour of the embedding agent changed from transparent to white. The temperature of the cryostat cabinet was adjusted from -20 °C to -10 °C for more than 30 min. The sample holder was attached to the sample head on the microtome, and the section thickness was adjusted to 1 μm. The clean slides were placed flat on top of the cut tissue sheets to attach the tissue to the slides, and the prepared sections were placed under a microscope (Nikon SMZ1500) for observation. Sections from seeds at 40 DAF and mature seeds were stained with phloroglucinol-HCl as described by Zhou et al. [28]. The sections were transferred to glass slides, 1 mL of 1% phloroglucinol solution was added for 5 min, and then 1 mL concentrated hydrochloric acid solution was added. The sections were observed under a microscope (Nikon ECLIPSE E600) with 4 × , 10 × , 20 × , 40 × and 100 × objective lenses, and high-resolution photographs were taken.

RNA sequencing and analysis
Total RNA was extracted from rapeseed coats and embryos of five lines at 35, 40 and 46 DAF using EZ-10 RNA Miniprep Kits (Sangon Biotech Co., Ltd., Shanghai, China). The Agilent Bioanalyzer 2100 system (Agilent Technologies, CA, USA) was used to assess RNA integrity. High-quality RNA from all three samples was obtained by Biomarker Technologies Co., Ltd. (Beijing, China) for RNA sequencing and analysis. TopHat2 software was used for reference genome (Brassica napus v4.1) mapping.
Fragments per kilobase of exon per million fragments mapped (FPKM) values were used to represent the levels of gene expression. The EBSeq R package was used to analyse the differentially expressed genes (DEGs) with an false discovery rate (FDR) < 0.01 and |log 2 (fold change)|≥ 1. Two samples were compared using the "A vs B" method, wherein the "A" sample was used as a control. If the expression level of a DEG in sample "B" was higher than that in "A", the DEG was upregulated; otherwise, it was downregulated. We used the GOseq R package [50] to perform GO enrichment analysis of the DEGs, and DEG enrichment in KEGG pathways was assessed using KOBAS software [51]. In this study, DEGs of the three sets of comparisons ("LC1 vs HC1", "LC2 vs HC2" and "LC3 vs HC3") were annotated in Plant Transcription Factor Database v4.0 [52] and Arabidopsis Hormone Database 2.0 [53]. The raw RNA-Seq data have been uploaded to NCBI under the accession number PRJNA597958.

Prediction of miRNAs targeting the BnPAL, BnC4H and Bn4CL genes
The genome sequences of BnPAL, BnC4H and Bn4CL family genes were submitted as candidate genes for predicting potential microRNAs (miRNAs) by searching against the available B. napus reference miRNA sequences by using a plant small RNA target analysis server (psRNATarget) [54]. Cytoscape software was used to analyse the interactions between miRNAs and corresponding target genes.

Validation of RNA-Seq data
A quantitative reverse-transcription PCR (qRT-PCR) experiment was performed to verify the differential expression levels of DEGs measured by RNA-Seq according to Zhao et al. [55] and Zhang et al. [56]. Primers were designed using Primer Premier 5 and synthesized by Sangon Biotech (Shanghai) Co., Ltd.
(Shanghai, China). Primer Premier 5.0 software was used to design specific primers for quantitative RT-PCR (Table S5), and qPCR was performed using a Bio-Rad CFX96 Real-time System with SYBR ® Green PCR Supermix (California, USA). Each reaction contained 10 μL of SYBR Supermix, 0.4 μL of each primer (10 μM), 7.2 μL of H 2 O, and 2 μL of cDNA in a final volume of 20 µL. Three technical replicates were used for each reaction. The following procedure was used for qRT-PCR: 98 °C for 30 s, followed by 40 cycles of 98 °C for 10 s and 60 °C for 30 s. The BnActin7 gene was used as a control. To verify the expression levels detected by RNA-Seq, the RNA-Seq data were compared to the data obtained by qRT-PCR.