A consensus and saturated genetic map provides insight into genome anchoring, synteny of Solanaceae and leaf- and fruit-related QTLs in wolfberry (Lycium Linn.)

Background Lycium Linn. (Solanaceae) is a genus of economically important plants producing fruits and leaves with high nutritional value and medicinal benefits. However, genetic analysis of this plant and molecular breeding for quality improvement are limited by the lack of sufficient molecular markers. Results In this study, two parental strains, ‘Ningqi No. 1’ (Lycium barbarum L.) and ‘Yunnan Gouqi’ (Lycium yunnanense Kuang et A.M. Lu), and 200 F1 hybrid individuals were resequenced for genetic analysis. In total, 8,507 well-selected SNPs were developed, and a high-density genetic map (NY map) was constructed with a total genetic distance of 2,122.24 cM. A consensus genetic map was established by integrating the NY map and a previously published genetic map (NC map) containing 15,240 SNPs, with a total genetic distance of 3,058.19 cM and an average map distance of 0.21 cM. The 12 pseudochromosomes of the Lycium reference genome were anchored using this consensus genetic map, with an anchoring rate of 64.3%. Moreover, weak collinearities between the consensus map and the pepper, potato, and tomato genomes were observed. Twenty-five stable QTLs were identified for leaf- and fruit-related phenotypes, including fruit weight, fruit longitude, leaf length, the fruit index, and the leaf index; these stable QTLs were mapped to four different linkage groups, with LOD scores ranging from 2.51 to 19.37 and amounts of phenotypic variance explained from 6.2% to 51.9%. Finally, 82 out of 188 predicted genes underlying stable QTLs for fruit-related traits were differentially expressed according to RNA-seq analysis. Conclusions A chromosome-level assembly can provide a foundation for further functional genomics research for wolfberry. The genomic regions of these stably expressed QTLs could be used as targets for further fine mapping and development of molecular markers for marker-assisted selection (MAS). The present study provided valuable information on saturated SNP markers and reliable QTLs for map-based cloning of functional genes related to yield and morphological traits in Lycium spp. Supplementary Information The online version contains supplementary material available at 10.1186/s12870-021-03115-1.


Variation calling and annotation
Whole-genome resequencing generated a total of 4,549 million clean paired-end reads, 23,968 and 227,408 million in the female and male parents, respectively, with a Q30 value of 92.06%, indicating that high-quality source data were generated. The average depths in the female and male parents were 34⋅ and 18⋅, respectively, and the average depth in the offspring was 2.58⋅ (Supplementary Table 1). All clean reads were mapped onto the scaffolds of the Lycium reference genome, with 9,015,078 SNPs and 1,317,594 InDels called between the parents. The variation maps of SNPs and InDels are shown in Fig. 2 (Table 2). To further integrate our published map [14] , a consensus genetic map was constructed, which contained 15,240 SNPs with a total genetic distance of 3,058.19 cM and an average map distance of 0.21 cM (Table 2 and Supplementary Fig. 2B). Compared with the NY map, this consensus map harbored 6,733 more SNPs and 0.04 cM less average distance, indicating a higher resolution and likely the genetic map with the largest number of SNP markers for ligneous plants. In addition, the maximum gap was also somewhat narrowed. The consensus map represented a comprehensive improvement. Genetic map-assisted genome assembly High-density linkage maps can assist in chromosome-level genome assembly. To assemble the genome of Lycium at the chromosome level on the basis of our high-density consensus genetic map, we used ALLMAPS. Finally, ~ 1.21 Gb of scaffolds were mounted to 12 pseudochromosomes of Lycium, accounting for ~ 64.3% of the genome sequence, 51.3% of which were oriented ( Fig. 3; Supplementary Fig. 3; Table 3). The longest pseudochromosome was LG10, with a total length of 132.33 Mb, whereas only 57 Mb and 57.69 Mb were mounted onto LG05 and LG01, respectively, in line with the trends in SNP numbers in the genetic map. More scaffolds were not mounted (783,254), among which scaffolds < 1 kb accounted for 99.3% (777,737/783,254).
ALLMAPS scaffolding was performed by inferring and maximizing the collinearity between the genetic map and scaffolds/contigs. By comparing the collinearities between all LGs and pseudochromosomes, we found that certain collinearities occurred between each LG and the corresponding pseudochromosomes. Pseudochromosome 04 showed the lowest P value. Moreover, the P values of collinear pseudochromosomes 01, 02, 08, 09, and 11 were all > 0.8 ( Supplementary Fig. 3).

QTL mapping
Using the resequencing genetic map and continuous phenotypic data from 2016 to 2018, a large number of QTLs responsible for seven agronomic traits were mapped. The QTLs were distributed in all the LGs of wolfberry except LG08, with phenotypic variance explained (PVE) values ranging from 6-73.6%, and the maximum LOD value was 21.39. The fruit-related QTLs were mapped mainly to LG06, LG07, LG09, LG10, and LG12, whereas the leaf-related QTLs were located mainly in LG01, LG06, LG09, and LG10 (Supplementary Table 2).
We further screened stable QTLs detected in at least two years, and a total of 25 such QTLs were identi ed (Supplementary Table 3 predicted genes within 150 kb upstream and downstream of the markers in stable QTLs. To further explore the expression of stable QTLs for fruit-related traits, we performed RNA-seq and found that 82 out of 188 predicted genes showed differential expression (Supplementary Table 4). These genes represent valuable resources for further gene cloning and marker-assisted selection (MAS).

Discussion
A high-density genetic map can provide valuable information for deciphering the genetic basis of important and complex agronomic traits. With the rapid development of sequencing technology, SNPs and InDels have been mined for high-density linkage map construction, QTL dissection, candidate gene discovery and breeding applications in crops [10,15,16]. Whole-genome resequencing for SNP genotyping in biparental segregation populations has been successfully applied to plants [17][18][19]. Because of our complete de novo assembled genome sequence of Lycium (Cao et al., unpublished 2021), it has become possible to use whole-genome resequencing for genome-wide SNP discovery and high-density genetic map construction in Lycium. In this study, whole-genome resequencing resulted in 9,015,078 SNPs and 1,646,131 InDels, and a high-density genetic map containing 8,507 SNPs spanning 2,122.24 cM with an average distance of 0.25 cM between adjacent markers was constructed based on a hybrid F 1 population (NY map). However, the maximum gap in the integrated map was 19.41 cM, in LG 11. This indicated that there was recombination or undeveloped markers in this area [20]. Compared with that in the rst sequencing-based genetic map of Lycium that we constructed [14], the number of F 1 individuals was reduced by 100 (the number in the rst population containing 302 individuals), whereas the average distance was very similar to that of the published genetic map. Nevertheless, the total number of SNPs and total map distance of this resequencing-based linkage map were higher, indicating higher resolution for resequencing than for reducedrepresentation sequencing. The consensus map developed by integrating these two high-density maps indicated that a saturated genetic map had been obtained for Lycium.
High-density genetic maps can improve contig/scaffold assembly at the chromosome level [21][22][23]. Generally, a genetic map can distribute 60-90% of all the assembled contigs/scaffolds on pseudochromosomes [24][25][26]. Given the increased marker density, the distribution rate of the consensus genetic map will increase accordingly [23,27]. In this study, a high-density genetic map of Lycium was constructed on the basis of resequencing, and a consensus map was produced by integrating the high-density genetic map and a previously published map [14]. The nal genetic map contained 15,240 markers in 12 LGs. Based on this consensus genetic map, the scaffold Lycium genome (Cao et al., unpublished 2021) was mounted onto 12 pseudochromosomes with a distribution rate of 64.30%, which is relatively low. There were 783,254 unmounted scaffolds, which was 476.72 times the number of anchored scaffolds.
In addition, the average length of the unplaced scaffolds was 0.86 kb. Therefore, we speculate that a fragmented genome assembly is the cause of the low distribution rate. For the Lycium genome used in this study, the distribution rate could bene t from the use of PacBio long-read sequencing [28], a BioNano optical map [24], or Hi-C technology [29].
To gain insights into the evolutionary history of wolfberry, we conducted collinearity analyses on the genomes of wolfberry, tomato, pepper, and potato. Similar to previous results [14], the genetic map of wolfberry showed low collinearity with these three genomes overall, suggesting greater differentiation of wolfberry. However, high collinearity was observed in some cases, such as segments on chromosomes 02, 04, 06, 10, and 11 of wolfberry for wolfberrypepper pairs and wolfberry-tomato pairs, suggesting that these intervals are highly conserved among different species of Solanaceae.
Fruits and leaves are the main medicinal and edible parts of Lycium spp. [30]. In our study, many QTLs (Supplementary Table 2) were identi ed for fruit and leaf traits based on phenotypic data collected over 3 years. Most of the measured traits were located at more than two QTLs, with QTLs for the FI mapped to LG04, LG09, LG11, and LG12 at different positions. In addition, we found a large number of QTLs linked to two or three traits, with those for the FW, FL and FI traits mapped to 37.344 cM in LG10 with LOD and PVE values up to 16.02 and 62.5%, respectively. These results indicated that fruit and leaf traits in Lycium are controlled by multiple loci and single loci with pleiotropic effects [31].
Stable QTLs are valuable and useful for MAS-based breeding programs and have been identi ed in perennials [32]. In this study, 25 stable QTLs associated with two leaf and three fruit traits were identi ed in LG07, LG09, LG10, and LG11 (Supplementary Table 3), four of which were identi ed in two LG10 regions (38.6 cM and 66.3 cM) with a high LOD value (15.7-45.3) and were related to three traits (LI, FW, and FI). In addition, in a previous study [14], two stable QTLs for FW were located in the same two LG regions (133.6 cM and 146.4 cM) as two stable QTLs for FD, which might indicate that these stable QTLs (qLI10-10, qFW10, qFI10-2, and qFI10-5) show high reliability in different environments and should be considered major QTLs. Of note, given these major QTLs, there is a genetic basis for the phenotypic correlation between FW and the FI, which is consistent with the strong correlation observed between the two traits in phenotypic analyses ( Supplementary Fig. 1A-C). Therefore, we speculate that two regions (38.6 cM and 66.3 cM) in LG10 play crucial roles in regulating Lycium fruit growth and development. The SNPs underlying these QTLs could be veri ed using the 100 remaining F 1 individuals in the future, and the tightly linked SNPs could be converted into kompetitive allele-speci c PCR (KASP) markers and potentially used as early selection markers. Weeds were managed manually.

Mapping population construction and phenotyping
The leaf-and fruit-related traits were measured in the F 1 population (NY) and the two parents. LL was the maximum distance between the leaf base and tip. LW was the widest distance across the leaf. FW was the weight of one mature fruit. FL was the maximum distance between the top and bottom of the fruit. FD was the widest distance across a fruit. LL, LW, FW, FL, and FD were measured according to methods described elsewhere [33]. LL, LW, FL, and FD were measured using Vernier calipers, whereas FW was acquired using an electronic balance (SE602F, Ohaus, NJ, USA). LI and FI were calculated according to Equations (i) and (ii): (i) LI = LL/LW [33] and (ii) FI = FL/FD [33]. In total, 30 leaves and fruits collected from each tree for 3 consecutive years (from 2016 to 2018) were used to obtain phenotypic data. The average values of each trait per individual were used for QTL analysis. Complex variance analysis, variance analysis, and correlation analysis were carried out using SPSS V17.0 software (SPSS Inc., Chicago, IL, USA).

Population resequencing and genotyping
Genomic DNA was extracted from the young leaves of both parents and 200 F 1 plants using a TRIzol Kit (Tiangen Biotech, Beijing, China). DNA concentration was measured using a NanoDrop spectrophotometer (ND2000, Thermo Fisher Scienti c, USA), and DNA quality was monitored by electrophoresis on 0.85% agarose gels. The genomic DNA was sheared into 350-bp fragments using an S2/E210 Ultrasonicator (Covaris, USA). The puri ed products were then ligated for end repair, subjected to 3'A and adaptor addition, and selected according to fragment size on a 1% agarose gel. The Raw reads were ltered to generate high-quality clean reads by (i) removing adaptor sequences, (ii) ltering reads with > 10% unidenti ed nucleotides, and (iii) removing reads with > 50% bases with a low Phred quality score (≤ 10). Burrows-Wheeler Aligner [34] was used to align the clean reads to the Lycium genome (Cao et al., unpublished 2021), and duplicates were identi ed using Picard (Picard: http://sourceforge.net/projects/picard/). SNPs and insertions and deletions (InDels; 1-5 bp) were called using GATK software [35] and then annotated by SnpEff [36]. Genome variation maps of SNPs and InDels were drawn by Circos [37]. SNP genotypes were encoded using biallelic coding rules and eight genotyping patterns (aa × bb, ab × cd, ef × eg, hk × hk, lm × ll, nn × np, ab × cc, and cc × ab). All patterns except aa × bb were selected to construct a high-density genetic map for a cross-pollinated (CP) population.

Genetic linkage map construction and QTL mapping
The resulting SNPs were further screened. Speci cally, hk×hk and nn×np segregation-type SNPs with a depth < 6⋅ in the parents, an integrity < 60%, and signi cant segregation distortion (SD) (Chi-square P < 0.05) were ltered out, whereas the remaining segregation types with a depth < 10⋅ in the parents, an integrity < 70%, and highly signi cant SD (chi-square P < 0.01) were ltered out. SNP markers were arranged into linkage groups (LGs) based on pairwise modi ed logarithm of odds (MLOD) scores. Markers with MLOD scores > 5 were assigned to a single LG. SMOOTH algorithms [38] were used to correct genotypes, and then the k-nearest neighbor approach [39] was used for genotype imputation. The JoinMap software (V4.1) mapping function for the CP population type was applied for linear arrangement within LGs. Map distances were estimated using the Kosambi mapping function [40]. A heat map of adjacent SNP relationships was generated using R (www.r-project.org/). MapQTL V6.0 with the interval mapping (IM) model was used for QTL analyses [41], and the LOD threshold value was set to 2.5. QTLs with a threshold LOD value > 3.0 and PVE > 10% were considered major QTLs.
Genetic map-assisted genome scaffold assembly and genome synteny analyses SLAF markers of the published genetic map (NC map) [14] and SNPs identi ed in this study were aligned to the Lycium scaffold-level genome assembly (Cao et al., unpublished 2021) using BLAT software [42]. The corresponding relationship and shared markers of these two LGs were extracted according to the locations of SLAF and SNP markers. An integrated genetic map was constructed using BioMercator v4.2 [43]. This integrated genetic map was used to anchor the scaffolds of the Lycium genome at the chromosome level using ALLMAPS software [44]. The SNPs integrated genetic map were aligned to the genomes of solanaceous relatives, namely, pepper (http://peppersequence.genomics.cn/page/species/download.jsp #5), tomato (ftp://ftp.solgenomics.net/tomato_genome/assembly/build_2.50/), and potato (http://solanaceae.plantbiology. msu.edu /pgsc_download.shtml), using BLAT [42] , and the physical positions of the homologous sequence were used to generate a collinearity diagram in R.

Avaiability of data
All the sequencing clean data were uploaded to the China National GeneBank DataBase (CNP0001536). However, the data will be made public on 1 April 2022. Before 1 April 2022, the datasets are available from the corresponding author on reasonable request.
Authors' contributions ZJ and AW conceived and designed the experiments; LH, YY, HT and ZB performed F 1 population construction and phenotying; WY and LY performed the RNA isolation and RNA-seq data analysis; XY performed the genetic analysis of the plant populations, genetic map analysis and QTL mapping; ZJ and CY wrote the manuscript and ZJ revised the manuscript. All authors have read and approved the nal manuscript.

Declarations
Ethics approval and consent to participate Not applicable.

Consent for publication
Not applicable.

Competing interests
The authors declare that they have no competing interests.
showing differences in fruits and leaves.  The right panel represents the correlations between the consensus map and the reference genome. The ρ-values represent the Pearson correlation coe cients.