A combination of genome-wide association study and transcriptome analysis in leaf epidermis identifies candidate genes involved in cuticular wax biosynthesis in Brassica napus

Background Brassica napus L. is one of the most important oil crops in the world. However, climate-change-induced environmental stresses negatively impact on its yield and quality. Cuticular waxes are known to protect plants from various abiotic/biotic stresses. Dissecting the genetic and biochemical basis underlying cuticular waxes is important to breed cultivars with improved stress tolerance. Results Here a genome-wide association study (GWAS) of 192 B. napus cultivars and inbred lines was used to identify single-nucleotide polymorphisms (SNPs) associated with leaf waxes. A total of 202 SNPs was found to be significantly associated with 31 wax traits including total wax coverage and the amounts of wax classes and wax compounds. Next, epidermal peels from leaves of both high-wax load (HW) and low-wax load (LW) lines were isolated and used to analyze transcript profiles of all GWAS-identified genes. Consequently, 147 SNPs were revealed to have differential expressions between HW and LW lines, among which 344 SNP corresponding genes exhibited up-regulated while 448 exhibited down-regulated expressions in LW when compared to those in HW. According to the gene annotation information, some differentially expressed genes were classified into plant acyl lipid metabolism, including fatty acid-related pathways, wax and cutin biosynthesis pathway and wax secretion. Some genes involved in cell wall formation and stress responses have also been identified. Conclusions Combination of GWAS with transcriptomic analysis revealed a number of directly or indirectly wax-related genes and their associated SNPs. These results could provide clues for further validation of SNPs for marker-assisted breeding and provide new insights into the genetic control of wax biosynthesis and improving stress tolerance of B. napus.


Background
Brassica napus L. (2n = 38, genome AACC) is an allotetraploid crop evolved from natural hybridization between two diploid progenitor species, Brassica rapa (AA, 2n = 20) and Brassica oleracea (CC, 2n = 18), followed by chromosome doubling about~7500 years ago [1]. It is a globally important oil crop which contributes edible oil, biofuels, and industrial compounds such as plasticizers and stabilizer for plastics, lubricants, and surfactants [2]. However, unfavorable environmental factors induced by climate changes such as drought and high temperature, severely influence its yields and qualities [3,4]. For stabilizing or enlarging oilseed supply, it is important to improve stress tolerance of B. napus. The aerial parts of plants are covered with cuticular wax, a mixture of hydrophobic compounds. Both increased cuticular wax deposition and improved tolerance to water deficiency stress were observed in some transgenic crops with the overexpression of wax-associated genes [5][6][7][8], suggesting that this hydrophobic layer could protect plants from abiotic stresses.
Cuticular waxes are consisted of very long chain fatty acids (VLCFAs) and their derivatives such as alcohols, aldehydes, alkanes, ketones, and wax esters [9,10]. The biosynthesis of these aliphatic wax components involves C 16 -C 18 fatty acids synthesis in the plastid, C 16 -C 18 fatty acid elongation in the endoplasmic reticulum (ER), and subsequent modification via either the decarbonylation or the acyl reduction pathway [11].
Many genes involved in cuticular wax biosynthesis and its regulation have been identified in Arabidopsis. However, the genetic basis of cuticular waxes in B. napus has still not been fully understood. The wax load on B. napus leaves is considerably higher in comparison to that on Arabidopsis leaves. However, the wax composition patterns are similar in these two plant species, both consisting of fatty acids, aldehydes, alkanes (predominant compounds), primary alcohols, secondary alcohols, ketones, and esters. Characterization of a novel dominant glossy mutant BnaA.GL revealed that the suppression of the CER1 (a putative aldehyde decarbonylase gene involving in the biosynthesis of alkane) and other wax-related genes such as MAH1 (a midchain alkane hydroxylase gene involving in the biosynthesis of secondary alcohol and ketone) and WSD1 (a wax ester synthase/acyl-CoA:diacylglycerol acyltransferase gene involving in the biosynthesis of wax ester) drastically altered the wax production via the alkane pathway in B napus [12]. The product of the KCS1 encodes a condensing enzyme KCS1 (3-ketoacyl-CoA synthase 1) which is involved in the critical fatty acid elongation process in wax precursor biosynthesis [13]. Overexpression of BnKCS1-1, BnKCS1-2, and BnCER1-2 in B. napus promoted cuticular wax production and increases drought tolerance [14]. Overexpression of BnLAS, a member of the GRAS family of putative transcriptional regulators, resulted in inhibition of growth, delays in leaf senescence and flowering time, and more epidermal wax deposition on transgenic leaves of Arabidopsis [15]. Overexpression of BraLTP1 in B. napus, a lipid transfer protein gene from B. rapa, caused abnormal green coloration, reduced wax deposition, and resulted in leaf water loss [16]. However, these progresses are still not enough to elucidate the molecular mechanisms of cuticular wax production in B. napus.
Recently, genome-wide association studies (GWAS) based on high-throughput genotyping technologies become available as a powerful alternative for dissecting the genetic architecture of complex traits in crops [17,18]. In rapeseed, traits including flowering time [19], seed oil content [20], seed weight and seed quality [21], branch angle [22], harvest index [23], and resistance to Sclerotinia [24], have been dissected by GWAS. However, up to date, no genome-wide association mapping of wax traits in rapeseed has been reported.
In a previous study, 517 B. napus accessions were used to analyze the leaf wax phenotype, and the heritability of wax compositions suggested that wax variations were mainly driven by genetic factors and were possibly suitable for GWAS [25]. Luo et al. [26] also applied GWAS analysis to identify putative SNPs associated with as many as 50 leaf wax traits in Camelina. These researches suggest that GWAS is feasible for detecting genes related to wax compositions in B. napus. In addition, cuticular waxes are synthesized in the ER of epidermal cells, and therefore, an aerial epidermis transcriptome might be more efficient in identifying the candidate genes involved in the synthesis of wax and cutin [27].
Here, we quantified the levels of total cuticular waxes, wax classes and wax compounds in leaves of 192 B. napus accessions for two years. Then, 31 wax traits were used to perform GWAS to detect genes potentially related to cuticular wax biosynthesis. To further aid identification of wax related genes and explore molecular mechanism of wax biosynthesis, expression profiles of all GWAS-identified genes were determined by transcriptome of the leaf epidermis from high-and low-wax load B. napus lines. This study first used the GWAS tool and the epidermis transcriptome to identify candidate genes related to B. napus wax traits. Our results provided insight into the genetic regulation of B. napus cuticular wax metabolism; therefore, laid a foundation for genetic improvement of B. napus stress tolerance by wax modification.

Phenotypic variations of leaf cuticular wax
A total of 192 B. napus accessions were used to characterize the leaf wax profiles in 2016 and 2017. The cuticular wax profiles in this panel were similar to those reported by Tassone et al. [25] and Holloway et al. [28]. The leaf wax was mainly consisted of long-chain fatty acids, aldehydes, alkanes, primary alcohols (1-alcohols), secondary alcohols (2-alcohols), ketones, and alkyl esters. In total 21 predominant compounds were obtained from seven wax classes, such as C 27 alkane, C 29 alkane, C 31 alkane, and C 29 2-alcohol, etc. Total wax coverage, amounts of wax classes, and the amount of each predominant wax compound were assessed as single trait. Additionally, the sum of C 29 alkane, C 29 ketone and C 29 2-alcohol, the three most abundant compounds from same biosynthesis pathway were also assessed as a single trait (Total C 29 ). The sum of products from alkane-forming pathway and the sum of products from alcohol-forming pathway were also characterized as single trait, respectively. A complete list of the 31 wax traits was provided in Table 1. Extensive phenotypic variations of these wax traits were observed in two consecutive years (Table 1; Additional file 1: Fig. S1). The total wax coverage ranged from 7.75 to 53.93 μg·cm − 2 in 2016 (with an average of 27.44 μg·cm − 2 ) and from 4.23 to 44.83 μg·cm − 2 in 2017 (with an average of 18.65 μg·cm − 2 ).
A two-way ANOVA analysis indicated that most wax traits were influenced by genotype (G), year (Y) and their interactions (G × Y) (P < 0.001), suggesting the indispensable role of environment on wax synthesis regulation (Table 2). Heritability values ranged from 0.60 (C 28 acid) to 0.84 (C 27 alkane) for each independent wax trait ( Table 2). Most of the wax traits in B. napus showed continuous variations and approximated a normal distribution (Additional file 1: Fig. S1), suggesting that the wax traits were controlled by multiple genes.
High correlation coefficients were observed between C 29 alkane and C 29 ketone (r = 0.69) and between C 29 alkane and C 29 2-alcohol (r = 0.67), which were produced from alkane branch pathway, and between 1-alcohol and C 38 ester (r = 0.87), C 40 Ester (r = 0.93), and C 42 ester (r = 0.63), which were produced from alcohol branch pathway (Additional file 2: Table S1). High positive correlation coefficient was also found between the products from alkane-forming pathway and alcohol-forming pathway (r = 0.72), indicating that these wax compositions were not independently regulated (Additional file 2: Table S1).
Population structure and relative kinship of the association panel A subset of 4623 SNPs with missing data < 0.2 and MAF > 0.2, which distributed evenly across the entire B. napus genome, was selected for population structure and relative kinship analysis (K). Population structure analysis can provide information about the optimal number of subgroups (i.e. the optimal K value) and the proportion of each subgroup in each accession (i.e. Q matrix), which is useful to select the Q matrix corresponding to the optimal K value in the next association analysis, so as to control the false positives caused by the population structure. A clustering inference performed with possible clusters (K) from 1 to 10 showed that the most significant change in likelihood occurred when K increased from 2 to 4 (Fig. 1a), and the highest Δk-value was observed at k = 4 (Fig. 1b). Based on the Δk method described by Evanno et al. [29], the 192 accessions could be divided into four major sub-populations, which were designated as P1, P2, P3, and P4 (Fig. 1c). Most of the spring rapeseed accessions were distributed in P1, while most of the winter accessions were distributed in P2 and P3 (Additional file 3: Table S2).
The analysis of genetic relatedness revealed that 55.8% of the pairwise kinships were equal to 0, and 69% of them ranged from 0 to 0.05 (Fig. 1d), suggesting that most of the accessions in this panel have no or weak kinship, which might be attributed to the broad ranging collections of the genotypes. The results of genetic relatedness analysis would be used in GWAS model as random effect covariate matrix (K matrix) to avoid the false positives in the next association analysis.
The Linkage disequilibrium (LD) decay rate was measured as the chromosomal distance at which the average pairwise correlation coefficient (r 2 ) between all pairs of SNP markers dropped to half of its maximum value. The genome-wide LD decay of A and C subgenome for 192 rapeseed lines were shown in Fig. 1e. The LD of A subgenome decayed faster than that of the C subgenome. The average distance for A subgenome was 500 kb, and for C subgenome was 1600 kb, where r 2 decayed to 0.1.

Association mapping in B. napus for wax traits
To evaluate the effects of population structure (Q, PCA) and kinship (K), six models, including Q, PCA, K, PCA + K, Q + K and naïve (without controlling for Q, PC and K), were separately performed association analysis with the 31 wax traits (Additional file 4: Fig. S2). According to the P values from six models, the population structure and kinship can be corrected effectively by the linear mixed model such as PCA + K, Q + K and K when performing GWAS. Eventually, the PCA + K model was selected to perform association mapping for C 28 acid, C 26 alkane, C 30 alkane, C 29 2-alcohol, C 29 ketone, and C 38 ester, while Q + K model for the remaining 25 wax traits for controlling population structure in GWAS. Thus, a total of 202 significantly associated SNPs for 31 wax traits were identified in a genome-wide scan (P < 2.95E-05) ( Fig. 2; Additional file 5: Fig. S3; Additional file 6: Table S3). Among these SNPs, 18 were co-associated with multiple wax traits (Additional file 6: Table S3). For example, Bn-A01-p6380934, Bn-A05-p2030789, and Bn- scaff_15798_1-p733219 were simultaneously associated with total wax, alkanes, alkane-forming pathway, and 1alcohol-forming pathway. The marker Bn-A02-p25285941 was closely related to C 26 1-alcohol, total 1-alcohols and 1-alcohol-forming pathway. Bn-A08-p16793918 was closely related to C 38 ester, C 40 ester, C 42 ester, and total esters. Bn-A05-p19622826 was closely related to C 28 aldehyde, C 29 2-alcohol, and total 2-alcohols. According to A-and C-subgenome's LD decay, genes within~250 kb upstream and downstream to the associated SNPs on A-subgenome and~800 kb on C-subgenome were selected for identification of candidate genes. No SNP was significantly associated with C 30 aldehyde.
Genome-wide expression profiles in B. napus epidermis based on RNA-seq data Next, RNA from epidermis was pooled among highwax load (HW) lines and low-wax load (LW) lines separately and performed sequencing. The wax load of leaves was 53-70% lower in LW lines when compared to HW lines (Fig. 3b). A reduction in the alkane, 2-alcohol and ketone content was prominent in leaves of LW lines ( Fig. 3a and c). By mapping all unique sequences to the B. napus genome, a total of 216.49 million mapped reads (258.15 clean reads) were obtained (Additional file 7: Table S4). A high correlation (R 2 > 0.98) was observed among the three biological replicates, suggesting that the sequencing results were reliable (Additional file 8: Fig. S4). Using software DESeq [30] and FDR < 0.001 and absolute fold change ≥4 as the criteria, a total of 6966 differentially expressed genes (DEGs) between HW and LW lines were detected (Additional file 9: Fig. S5). Among the DEGs, 4608 genes were down-regulated while 2358 genes were up-regulated in LW when compared to HW (Fig. 4a). Since the epidermal samples used in this study possibly contained all epidermal cell types, thus cell type-specific genes, like guard cell-specific genes, could also be included in the DEGs. Among 490 DEGs identified as encoding transcription factors (TFs), 152 were up-regulated in LW when compared to HW, while 338 were downregulated (Additional file 10: Table S5). Additionally, 20 TFs were only expressed in LW, while 109 TFs only in HW. Among these differentially expressed TFs, 86 belonged to MYB type, 44 belonged to zinc finger type, 40 belonged to AP2/ERF, and the others (Additional file 10: Table S5). About 70% of the MYB, WRKY, ERF and zinc finger type genes were down-regulated in LW, of which 21 MYBs, 7 WRKYs, 3 ERFs and 7 zinc finger types were not expressed in LW. Some orthologs of well-characterized Arabidopsis genes related to wax regulation were identified, such as MYB16 and MYB30 (Additional file 10: Table S5). Note: Total C 29 , the sum of C 29 Alkane, C 29 Ketone and C 29 2-Alcohol; Alkane Pathway, the sum of products from alkane-forming pathway; 1-Alcohol Pathway, the sum of products from alcohol-forming pathway  It is reported that the overexpression of MYB30 in transgenic Arabidopsis plants promoted the production of cuticular wax [31], while MYB16 functioned as a major regulator of cuticle formation in vegetative organs [32]. Our results indicated that MYB family could play a role in the wax regulation of B. napus.

Functional classification of DEGs in the B. napus epidermis
To monitor the difference of gene expression pattern between LW and HW lines, Gene Ontology (GO) enrichment analysis was conducted (Additional file 11: Fig. S6). Significantly overrepresented top GO terms of Note:G and Y indicate genotype and year, respectively, and G x Y indicate interaction of G and Y. Total C 29 , the sum of C 29 Alkane, C 29 Ketone and C 29 2-Alcohol; Alkane Pathway, the sum of products from alkane-forming pathway; 1-Alcohol Pathway, the sum of products from alcohol-forming pathway DEGs between HW and LW were enriched in response to stress, cell wall, and transcription factor activity, etc. (Additional file 12: Fig. S7). Among KEGG significantly enriched pathways, 12 DEGs were annotated in fatty acid elongation (ko00062) (Fig. 4b) and 14 DEGs were annotated in wax, and cutin and suberin biosynthesis (ko00073) (Fig. 4 c and d). In the most cases, more than one DEG was assigned to the same enzyme in KEGG pathway.

Identification of candidate genes
For decreasing false positive error, the expression profile of candidate gene regions on A-and C-subgenome were determined by transcriptome of leaf epidermis from HW and LW lines. Totally, 792 GWAS-identified genes, which associated 147 GWAS-identified SNPs, were revealed to have differential expression between HW and LW lines, including 344 up-regulated genes and 448 down-regulated genes in LW when compared to those in HW ( Fig. 4a; Additional file 6: Table S3). KEGG pathway analysis showed that some differentially expressed GWAS-identified genes enriched in fatty acid elongation, wax biosynthesis, and cutin and suberin biosynthesis pathway (Fig. 4b, c and d). Proposed wax-related genes were listed in Table 3, including some reported A. thaliana orthologous genes. For example, BnaA10g00700D, BnaC09g16050D and BnaC09g51620D were annotated as KCS1, CER1 and MAH1, which were mainly involved in VLCFAs biosynthesis, alkane biosynthesis, and secondary alcohol and ketone biosynthesis, respectively ( Fig. 4b and c; Table 3) [13,33,34].
To further validate the efficiency of RNA-seq analysis, expression profiles of 10 genes that were commonly identified by GWAS and RNAseq were detected by qRT-PCR. The results showed that the expression changes in these 10 genes from LW and HW were similar to those based on RNAseq analysis (Fig. 5), suggesting the reliability of the RNA-seq data.

Discussion
B. napus is an allotetraploid crop with complex genome structure, which imposes a huge challenge to genomewide SNP discovery. In the present study, a total of 192 lines were genotyped with the Brassica 60 K SNP array, and the variations in cuticular wax were investigated for two consecutive years. A total of 202 significantly associated SNPs for 31 wax traits were identified in GWAS (Additional file 6: Table S3). These SNP corresponding genes within the LD decay range were selected for identification of the candidates. Considering significant limitations in existing methods for the genome-wide identification of genes, such as false positive and negative results remaining in GWAS [35], we further analyzed the transcriptome of epidermal cells in B. napus leaves from HW and LW lines. By integrating GWAS with transcriptomic data, 73% GWAS-identified SNPs, including 792 genes, were revealed to have differential expression between HW and LW lines (Fig. 4a). Although it was impossible to clearly say that these genes were associated with wax traits, they should nevertheless be considered potential wax-related genes.
Based on GO analysis, organic substance metabolic process, cellular metabolic process, primary metabolic process, single-organism cellular process and response to stimulus enriched most of the related genes. A number of genes enriching in cellular process and metabolic process suggested that a complex polygenic network was involved in wax production or deposition in B. napus. Cuticular wax is synthesized in the plastids of epidermal The results show averages of three replicates, and error bars indicate ± SD. HW, high wax load rapeseed; LW, low wax load rapeseed cells with the de novo C 16 and C 18 fatty acyl-acyl carrier proteins (ACPs) synthesis. BnaC09g10800D, the candidate gene in the LD range of the SNP Bn-scaff_17487_1-p812141, encoded ortholog of Arabidopsis KAS III [36] and was potentially involved in the de novo fatty acid synthesis. Before being exported to the ER, C 16 and C 18 fatty acids were released from ACPs by fatty acyl-ACP thioesterases (FaTA and FaTB) and subsequently esterified to Coenzyme A (CoA) by long-chain acyl-CoA synthetases (LACS) [37]. The acyl-CoA forms of these fatty acids are then elongated to wax precursors of VLCFAs by fatty acid elongase (FAE) complex in the ER. In this study, BnaA05g23790D, BnaC06g43550D and BnaA07g0 8340D, locating within the LD range of marker Bn-A05-p19622826, Bn-scaff_16485_1-p747170 and Bn-A07-p6765464, separately, encoded FaTA which was potentially involved in the release of C 18 fatty acid from ACP [38]. BnaA10g00700D, BnaA10g02480D and BnaA10g0 0380D, the SNP Bn-A10-p4786596 corresponding genes, were annotated as genes encoding KCS1, KCS2 and Fig. 4 Differentially expressed genes between high-wax load (HW) lines and low-wax load (LW) lines. a The number of differentially expressed genes between HW lines and LW lines. b, c and d Annotation map of KEGG pathway of differentially expressed genes involved in fatty acid elongation, wax biosynthesis, and cutin and suberin biosynthesis, respectively. Green represents down-regulation, red represents up-regulation, and blue means up-and down-regulation co-existed in LW lines compared to HW lines. The number of differentially expressed genes encoding a specific enzyme was noted in italics next to the colored box Bn-A02-p5516551 BnaA02g05260D-BnaA02g06170D (4) BnaA02g05700D* 4.14 AT5G22500** Alcohol-forming fatty acyl reductase (AlcFAR1) CYP86A4 in KEGG pathway, which catalyze the biosynthesis of VLCFAs and cutin ( Fig. 4b and d) [13,39,40].
Wax components synthesized in the ER need to be exported to the extracellular matrix. In the current study, some genes were identified to function in lipid transportation and maintaining ER morphology. For a Thresholds for significantly differential expression between high-wax load (HW) and low-wax load (LW) lines were set to false discovery rate (FDR) < 0.001 and fold change ≥4. example, BnaA07g08720D, locating in LD to the SNP Bn-A07-p6765464, is orthologous to Arabidopsis LTPG1 which is participated in the output of cuticular wax [47]; BnaC04g45800D, locating in LD to the marker Bn-scaff_ 16888_1-p1834154, is orthologous to B. rapa LTP1, which is involved in wax production or deposition as a lipid transfer protein member [48]. BnaA08g17990D, locating in LD to Bn-A08-p16793918, is orthologous to Arabidopsis ERD2 which mediates the transport of soluble ER-localized proteins containing a C-terminal K/ HDEL signal [49]. Although the cuticle is usually considered independently from the epidermal cell wall underlying polysaccharide, the cuticle and the cell wall are structurally related (the cuticular layer protrude deeply into cell wall) and have some overlapping functions with each other [50]. Several DEGs related to cell wall were also observed in this study. For example, BnaA05g32440D and BnaA05g32300D, locating in LD to the SNP Bn-A05-p23445454, encoded orthologs of Arabidopsis xylan biosynthesis gene AXY9 and cellulose synthase-like D3 (CSLD3), respectively, which were involved in cell wall formation [51,52]. Some stress-responsive related genes have also been identified, such as BnaC04g11110D, which potentially encoded ortholog of Arabidopsis Responsive to Dehydration gene (RD20), belonging to a member of lipid surface protein family. An increased sensitivity to Botrytis cinerea infection and water deficiency was observed in rd20 mutant [53]. To adapt to dry environment, plants increase leaf cuticular wax deposition to restrict nonstomatal water loss and avoid dehydration [54]. BnaC02g39310D, BnaC02g39360D and BnaA02g27940D, potentially encoding orthologs of Arabidopsis bZIP63, WRKY74 and bHLH105/ILR3 which were involved in modulating responses to starvation, salt tolerance and multiple stress responses [55][56][57], were also obtained in this study. Plant cuticular wax has been associated with improved plant stress tolerance. In this study, some stress-responsive related genes have been identified by a combined GWAS-RNAseq approach, suggesting that cuticular wax biosynthesis directly or indirectly related with the response to environmental stimulus. However, it remains unknown whether these genes have overlapping effects on wax biosynthesis besides regulating plant responses to stresses. Further studies are required to explore the precise roles of these identified genes.
In this study, some orthologs of well-characterized Arabidopsis wax-related genes could be identified by epidermis transcriptome, such as BnaC02g37590D (orthologous to AtMYB16) and BnaC02g05990D (orthologous to AtMYB30) (Additional file 10: Table S5) [31,32], whereas they couldn't be identified by GWAS, implying that the role of these genes in wax production might act through variable expressions. Our results showed that combined GWAS and the epidermis transcriptome sequencing analysis could increase the efficiency to detect genes associated with cuticular wax biosynthesis and to prioritize likely candidates, though some wax-related genes that did not vary in gene transcription but had an impact on enzyme function/activity might be overlooked.

Conclusions
This study first used the GWAS tool and the epidermis transcriptome to identify candidate genes associated with B. napus wax traits. A total of 202 SNPs were found to be significantly associated with 31 wax traits. Furthermore, 792 GWAS-identified genes and their associated 147 SNPs were revealed to have differential expressions between HW and LW lines, including 344 up-regulated genes and 448 down-regulated genes in LW when compared to those in HW. These identified SNPs could provide clues for further exploration and validation for marker-assisted breeding, and the proposed wax-related genes could provide new insights into the genetic control of wax metabolism and improving stress tolerance of B. napus.

Plant material and experimental design
A total of 192 B. napus accessions classified as spring, semi-winter and winter types, were used for an association analysis in this study (Additional file 3: Table S2). All the accessions were provided by the Chongqing Rapeseed Engineering & Technology Research Center, Southwest University, Chongqing, China. Most of these accessions were derived from research institutions in China, with the remaining introduced from Germany, Demark, and Canada. The experiments were conducted at Xiema Experimental Station (29°45′N, 106°22′38E), Beibei, Chongqing, China. The B. napus accessions were seeded in randomized complete blocks with three replicates in September, 2016 and 2017. Each accession was planted in two rows, 40 cm between rows, and 20 cm between plants. Routine field management was carried out. Nine weeks after planting (late December, pre-flowering stages for all accessions) uniform and fully expanded leaf samples from all plots were collected for wax trait investigation. The average month temperature in September, October, November and December in 2016 and 2017 were 22.9 and 23.3°C, 18.9 and 17.4°C, 12.7 and 13.3°C, and 9.4 and 8.3°C, respectively; whereas the month rainfall were 95.7 and 71.9 mm, 96.9 and 178.4 mm, 89.7 and 25.5 mm, and 17.1 and 6.9 mm, respectively.

Cuticular wax analysis
The cuticular wax composition of leaves from all accessions was determined as described by Wang et al. [8] with some modifications. Three leaves (third leaf from the top) were sampled from three plants in each replicate for each accession. The sampling was finished within three days. Leaf cuticular waxes were extracted in chloroform containing 10 μg tetracosane (Sigma Aldrich, Missoui, USA) as an internal standard. Then the extracts were derivatized with 50 μL N,O-bis (trimethylsilyl) trifluoroacetamide (BSTFA) and 50 μL pyridine. The derivatized extracts were dried and re-dissolved in chloroform for GC-FID analysis. The wax compound was identified and quantified by mass spectrums, internal standard and leaf area. Leaf areas were determined using ImageJ software (http://rsb.info.nih.gov/ij/).

Statistical analysis
Wax traits was investigated for three replicates in 2016 and 2017 (Additional file 13: Table S6). Each wax trait of each accession was defined as the average of the three replicates in the same year. For each wax trait, best linear unbiased predictors (BLUP) were estimated for each line across the two years based on a linear model using the lme4 package [58]. The BLUP values were used as phenotypes for the association analysis. The broad-sense heritability was calculated as H 2 = δ g 2 / (δ g 2 + δ ge 2 /n + δ e 2 / nr), where δ g 2 is the genetic variance, δ ge 2 is the interaction variance of the genotype with year, δ e 2 is the error variance, n is the number of years and r is the number of replicates within a year.
SNP genotyping, filtering and in silico mapping of SNPs SNP genotyping was performed using the Brassica 60 K Illumina SNP array, and SNP data were analyzed using Illumina GenomeStudio genotyping software. SNPs with call frequencies < 0.8 or MAF < 0.05 were excluded in the study. After processing, 31,846 SNP sequences were aligned with the genome sequences of B. napus, with an E-value cut-off of <'1E-10'. The blast-hits with a minimum E value and a maximum score were selected for further analysis.
Population structure, relative kinship and disequilibrium (LD) analysis A subset of 4623 SNPs distributed evenly across the entire genome (missing data < 0.2, MAF > 0.2, and unique position on chromosome) was selected for population structure and relative kinship analysis (K). The modelbased program STRUCTURE 2.3.4 software was used to estimate the population structure (Q) with a Bayesian Markov Chain Monte Carlo model (MCMC) [59]. Five independent runs were performed with a K-value (the putative number of genetic groups) from 1 to 10. Both the length of burning period and the number of MCMC replications after burning were set to 100,000 iterations under the "admixture model". The optimal K value was determined by the log probability of the data [LnP(D)] in the STRUCTURE output and a statistic Δk based on the rate of change of LnP(D) between successive k values as described by Evanno et al. [29]. The results of replicate run from STRUCTURE were integrated to acquire a Q matrix with the CLUMPP software [60] and graphically displayed using DISTRUCT software [61]. The relative kinship matrix of the natural population was calculated using TASSEL 5.2.1 [62]. All negative values between two accessions were set to 0. The linkage disequilibrium (LD) between pair-wise SNPs (with MAF ≥ 0.05) on A-and C-subgenome was estimated by a parameter r 2 calculated with the software TASSEL version 5.2.1 [62].

Genome-wide association studies
Trait-SNP association analysis was separately performed in six models including Q (controlling for population structure), PCA (controlling for principal component), K (controlling for kinship), PCA + K (controlling for both principal component and kinship), Q + K (controlling for both population structure and kinship), and naïve (without controlling for population structure and kinship) model. The naive, Q and PCA models were performed using a general linear model (GLM); the K, Q + K and P + K models were performed using a mixed linear model (MLM). Both GLM and MLM were implemented in TASSEL 5.2.1 [62]. For each trait, the optimal model was selected based on the distribution of the -log 10 (P) values of each SNP against the expected value in a Quantile-quantile (Q-Q) plot [63]. A uniform threshold for the significant SNPs-trait association was set to −log 10 (P) = 4.50 and a Manhattan plot was generated in the R package qqman [64]. Genes within~250 kb upstream and downstream to the associated SNPs on Asubgenome and~800 kb on C-subgenome were selected for identification of candidates.

Transcriptome sequencing and identification of differentially expressed genes
Three high-wax load lines (HW) and three low-wax load lines (LW) were selected from the GWAS population for transcriptome sequencing, including Zhongshuang11, Shilijia, and Yangyou6 (HW) and SWU68, Tonglinghuaye, and Shengguang77 (LW). Leaf epidermis was collected from HW and LW lines, separately, with three biological replicates. In each replicate, two independent plants were sampled from each line and pooled together. Epidermal peels were manually dissected from leaves as a thin transparent film. Total RNA were extracted from peel samples for sequencing and quantitative reversetranscription polymerase chain reaction (qRT-PCR). Sequencing library preparation and sequencing reactions were conducted at the Biomarker Technologies Corporation (Beijing, China). Sequencing libraries were constructed using NEBNext UltraTM RNA Library Prep Kit (NEB, USA). Subsequently, these libraries were sequenced on an Illumina platform and paired-end reads were generated.
Raw reads were transformed into clean reads after removing reads containing adapter, reads containing ploy-N and low-quality reads. These clean reads were then mapped to the B. napus reference genome sequence using Hisat2 tools soft. Quantification of gene expression levels were estimated by fragments per kilobase of transcript per million fragments mapped. Genes with significantly differential expression between HW and LW were identified based on the following criteria: false discovery rate (FDR) < 0.001 and absolute fold change ≥4. Transcription factor prediction was performed using BMKCloud (www.biocloud.net).
GO and KEGG enrichment analysis of differentially expressed genes (DEGs) To assess the biological significance of DEGs, GO enrichment analysis was implemented by the GOseq R packages based Wallenius non-central hyper-geometric distribution [65], which can adjust for gene length bias in DEGs. Statistically significant GO terms were obtained based on Kolmogorov-Smirnov like test. KEGG pathway enrichment analysis was performed by using the KOBAS software [66].

Validation of DEGs by qRT-PCR
According to the results of combining GWAS with RNA-seq analysis, ten candidate genes were selected for qRT-PCR analysis. The primers for qRT-PCR are listed in Additional file 14: Table S7, and the B. napus actin 7 gene (BnACT7) was used as the internal control. The total RNA was the same as used for RNA sequencing. The qRT-PCR assay was performed on a Bio-Rad CFX96 Real-Time PCR Detection System using the SYBR Premix Ex TaqII (Takara, Beijing, China). The relative gene expression levels were calculated using the 2 −ΔΔCt method. Three independent biological replicates, each with two technical replicates were analyzed for HW and LW, respectively.