A consensus and saturated genetic map provides insight into genome anchoring, synteny of Solanaceae and leaf- and fruit-related QTLs in wolfberry (Lycium Linn.)

Lycium Linn. (Solanaceae) is a genus of economically important plants producing fruits and leaves with high nutritional value and medicinal benefits. However, genetic analysis of this plant and molecular breeding for quality improvement are limited by the lack of sufficient molecular markers. In this study, two parental strains, ‘Ningqi No. 1’ (Lycium barbarum L.) and ‘Yunnan Gouqi’ (Lycium yunnanense Kuang et A.M. Lu), and 200 F1 hybrid individuals were resequenced for genetic analysis. In total, 8,507 well-selected SNPs were developed, and a high-density genetic map (NY map) was constructed with a total genetic distance of 2,122.24 cM. A consensus genetic map was established by integrating the NY map and a previously published genetic map (NC map) containing 15,240 SNPs, with a total genetic distance of 3,058.19 cM and an average map distance of 0.21 cM. The 12 pseudochromosomes of the Lycium reference genome were anchored using this consensus genetic map, with an anchoring rate of 64.3%. Moreover, weak collinearities between the consensus map and the pepper, potato, and tomato genomes were observed. Twenty-five stable QTLs were identified for leaf- and fruit-related phenotypes, including fruit weight, fruit longitude, leaf length, the fruit index, and the leaf index; these stable QTLs were mapped to four different linkage groups, with LOD scores ranging from 2.51 to 19.37 and amounts of phenotypic variance explained from 6.2% to 51.9%. Finally, 82 out of 188 predicted genes underlying stable QTLs for fruit-related traits were differentially expressed according to RNA-seq analysis. A chromosome-level assembly can provide a foundation for further functional genomics research for wolfberry. The genomic regions of these stably expressed QTLs could be used as targets for further fine mapping and development of molecular markers for marker-assisted selection (MAS). The present study provided valuable information on saturated SNP markers and reliable QTLs for map-based cloning of functional genes related to yield and morphological traits in Lycium spp.

varieties of Lycium occur in China [2]; of these, Lycium barbarum ('goji berry' or Chinese wolfberry) has been domesticated and widely cultivated in Northwest China for > 600 years [2,3]. The edible fruits and leaves of L. barbarum are used as functional foods and traditional Chinese medicinal herbs in China [4,5]. Many compounds from L. barbarum fruits and leaves, including flavonoids, carotenoids, and polysaccharides, have been reported to be closely associated with the health-enhancing effects of this species [5,6]. However, it is difficult to improve the quality of Lycium because of the unclear molecular genetic mechanisms underlying Lycium fruit and leaf traits.
Next-generation sequencing (NGS) coupled with the growing number of reference genome sequences presents an opportunity to redesign genotyping strategies for more effective genetic mapping and genome analysis [7], which can result in ultra-high-density linkage map construction and localized quantitative trait loci (QTLs) for multiple traits [8,9]. Resequencing and high-density genetic mapping in crops with complete genome sequences identified days to heading8 (Dth8) and lax panicle1 (Lax1) as candidate genes in rice [10] and sequence alterations in a novel ion transporter gene (GmCHX1) inducing salt tolerance in wild soybean [11]. Moreover, structural variations were reported in allotetraploid cotton [12]. With the decreasing costs of sequencing technologies, wholegenome sequencing has been applied to an increasing number of plant species; in addition, the numerous single nucleotide polymorphism (SNP) markers developed by aligning resequencing data to the corresponding reference genome can provide a powerful approach for deciphering the genetic basis of complex traits and for large-scale gene discovery [13].
The first sequencing-based linkage map for Lycium was constructed by specific length amplified fragment sequencing (SLAF-seq) using a diploid F 1 population derived from a cross between 'Ningqi No. 1' (NN) and 'Chinese gouqi' , and 18 stable leaf and fruit QTLs were mapped onto the resulting genetic map [14]. Recently, a 1,891-Mb Lycium genome sequence (Cao et al., unpublished 2021) provided an opportunity to develop SNP markers for population genotyping. In the present study, we used an F 1 population (Fig. 1) of Lycium with the shared parent 'Ningqi No. 1' to map QTLs for agronomic traits. Genotyping was performed using resequencing followed by SNP identification. The resulting SNPs were used to construct a high-density linkage map and an integrated consensus map. Using these maps, we were able to map yield-related QTLs in Lycium. Such QTLs and closely linked markers could then be used for molecular breeding to improve Lycium yield and quality.

Leaf and fruit trait variation
Seven leaf-and fruit-related phenotypic traits were measured for three continuous years from 2016 to 2018. The coefficient of variation of most phenotypic traits was > 10%, with the highest value for fruit index (FI) in 2018 (30%) and the lowest (9%) for fruit diameter (FD)  (Table 1), indicating that all traits showed natural variation in the F 1 population. The seven traits were normally or partially normally distributed (Table 1 and Fig. 2A-C). Correlation analysis revealed significant or extremely significant positive correlations between leaf width (LW) and leaf length (LL), leaf index (LI) and LL, fruit longitude (FL) and LL, FI and LL, and LI and FI (P < 0.05) and between single-fruit weight (FW) and FL, FW and FD, and FI and FL (P < 0.01), respectively, whereas a significant or extremely negative correlation was observed between pairs LI and LW and LI and FD (P < 0.05) and between FI and FD (P < 0.01), respectively ( Fig. 2A-C).

Variation calling and annotation
Whole-genome resequencing generated a total of 4,549 million clean paired-end reads, 23,968 and 227,408 million in the female and male parents, respectively, with a Q30 value of 92.06%, indicating that high-quality source data were generated. The average depths in the female and male parents were 34 × and 18 × , respectively, and the average depth in the offspring was 2.58 × (Supplementary Table 1). All clean reads were mapped onto the scaffolds of the Lycium reference genome, with 9,015,078 SNPs and 1,317,594 InDels called between the parents.
The variation maps of SNPs and InDels are shown in Fig. 3. The SNP density was 4,880 per Mb, and the InDel density was 714 per Mb. Most annotated SNPs (63.74%) were located in intergenic regions, whereas in the coding sequence (CDS) region, most SNPs were nonsynonymous (54.66%). Similar to the SNPs, more than half of the InDels (53.87%) were annotated in intergenic regions, whereas 1.37% were located in the CDS. Of the CDS InDels, 60.77% gave rise to frameshift mutations (Fig. 3). Of all the SNPs, 8,734,495 were successfully classified into eight genotyping patterns, and a set of 3,451,010 SNPs (excluding those with pattern type aa × bb) were used to construct a high-density genetic map of Lycium.

Construction of an ultradense genetic map and a consensus genetic map
To guarantee a set of high-quality SNP markers, SNPs with a depth in the parents < 10 × , SNPs with an integrity ≤ 70%, and highly significant SNPs with SD (P < 0.01) were filtered out. Finally, a set of 10,446 SNPs was used to construct a high-density genetic map of wolfberry, into which 8,507 SNPs were successfully integrated (Supplementary Fig. 1A). The integrated genetic map included 12 linkage groups (LGs) with a total genetic distance of 2,122.24 cM and an average map distance of 0.25 cM.

Genetic map-assisted genome assembly
High-density linkage maps can assist in chromosomelevel genome assembly. To assemble the genome of Lycium at the chromosome level on the basis of our highdensity consensus genetic map, we used ALLMAPS. Finally, ~ 1.21 Gb of scaffolds were mounted to 12 pseudochromosomes of Lycium, accounting for ~ 64.3% of the genome sequence, 51.3% of which were oriented ( Fig. 4; Supplementary Fig. 2; Table 3). The longest pseudochromosome was LG10, with a total length of 132.33 Mb, whereas only 57 Mb and 57.69 Mb were mounted onto LG05 and LG01, respectively, in line with the trends in SNP numbers in the genetic map. More scaffolds were not mounted (783,254), among which scaffolds < 1 kb accounted for 99.3% (777,737/783,254). ALLMAPS scaffolding was performed by inferring and maximizing the collinearity between the genetic map and scaffolds/contigs. By comparing the collinearities between all LGs and pseudochromosomes, we found that certain collinearities occurred between each LG and the corresponding pseudochromosomes. Pseudochromosome 04 showed the lowest P value. Moreover, the P values of collinear pseudochromosomes 01, 02, 08, 09, and 11 were all > 0.8 ( Supplementary Fig. 2).

QTL mapping
Using the resequencing genetic map and continuous phenotypic data from 2016 to 2018, a large number of QTLs responsible for seven agronomic traits were mapped. The QTLs were distributed in all the LGs of wolfberry except LG08, with phenotypic variance explained (PVE) values ranging from 6% to 73.6%, and the maximum LOD value was 21.39. The fruit-related QTLs were mapped mainly to LG06, LG07, LG09, LG10, and LG12, whereas the leafrelated QTLs were located mainly in LG01, LG06, LG09, and LG10 (Supplementary Table 2).
We further screened stable QTLs detected in at least two years, and a total of 25 such QTLs were identified (Supplementary Table 3). We identified 13 QTLs for the leaf index (LI), and these QTLs were located in LG10 from 45.967 cM to 73.823 cM (27.856 cM), including 44 SNPs. The largest LOD (13.13) and PVE (50.5%) values were observed for qLI10-4 and qLI10-12, respectively (Fig. 6). Some LI QTLs in LG10 were gathered tightly with an average interval of less than 0.63 cM per marker, indicating that these QTLs might belong to the same QTL (Fig. 6). For LL, two QTLs (qLL9-1 and qLL9-2) were mapped to LG09 and supported by  We extracted predicted genes within 150 kb upstream and downstream of the markers in stable QTLs. To further explore the expression of stable QTLs for fruitrelated traits, we performed RNA-seq and found that 82 out of 188 predicted genes showed differential expression (Supplementary Table 4). These genes represent valuable resources for further gene cloning and marker-assisted selection (MAS).

Discussion
A high-density genetic map can provide valuable information for deciphering the genetic basis of important and complex agronomic traits. With the rapid development of sequencing technology, SNPs and InDels have been mined for high-density linkage map construction,   [20]. Compared with that in the first sequencing-based genetic map of Lycium that we constructed [14], the number of F 1 individuals was reduced by 100 (the number in the first population containing 302 individuals), whereas the average distance was very similar to that of the published genetic map. Nevertheless, the total number of SNPs and total map distance of this resequencing-based linkage map were higher, indicating higher resolution for resequencing than for reduced-representation sequencing. The consensus map developed by integrating these two highdensity maps indicated that a saturated genetic map had been obtained for Lycium. High-density genetic maps can improve contig/scaffold assembly at the chromosome level [21][22][23]. Generally, a genetic map can distribute 60-90% of all the assembled contigs/scaffolds on pseudochromosomes [24][25][26]. Given the increased marker density, the distribution rate of the consensus genetic map will increase accordingly [23,27]. In this study, a high-density genetic map of Lycium was constructed on the basis of resequencing, and a consensus map was produced by integrating the high-density genetic map and a previously published map [14]. The final genetic map contained 15,240 markers in 12 LGs. Based on this consensus genetic map, the scaffold Lycium genome (Cao et al., unpublished 2021) was mounted onto 12 pseudochromosomes with a distribution rate of 64.30%, which is relatively low. There were 783,254 unmounted scaffolds, which was 476.72 times the number of anchored scaffolds. In addition, the average length of the unplaced scaffolds was 0.86 kb. Therefore, we speculate that a fragmented genome assembly is the cause of the low distribution rate. For the Lycium genome used in this study, the distribution rate could benefit from the use of PacBio long-read sequencing [28], a BioNano optical map [24], or Hi-C technology [29].
To gain insights into the evolutionary history of wolfberry, we conducted collinearity analyses on the genomes of wolfberry, tomato, pepper, and potato. Similar to previous results [14], the genetic map of wolfberry showed low collinearity with these three genomes overall, suggesting greater differentiation of wolfberry. However, high collinearity was observed in some cases, such as segments on chromosomes 02, 04, 06, 10, and 11 of wolfberry for wolfberry-pepper pairs and wolfberry-tomato pairs, suggesting that these intervals are highly conserved among different species of Solanaceae. In addition, marker density was associated with collinearity as more markers meant a greater probability of markers aligning to the corresponding reference genome sequences. The markers on a high-density genetic map were typically developed by reduced-representation sequencing [14,30] or re-sequencing [17][18][19]31], which could yield thousands of randomly distributed markers. But the reference genomes of plants always harbor hundreds of Mbp at least. So only the collinearity of conserved regions could be detected using a sequencing-based high density genetic map and we believe that a comprehensive genomic synteny could be realized using a genome assembly of wolfberry.
Fruits and leaves are the main medicinal and edible parts of Lycium spp. [32]. In our study, many QTLs (Supplementary Table 2) were identified for fruit and leaf traits based on phenotypic data collected over 3 years. Most of the measured traits were located at more than two QTLs, with QTLs for the FI mapped to LG04, LG09, LG11, and LG12 at different positions. In addition, we found a large number of QTLs linked to two or three traits, with those for the FW, FL and FI traits mapped to 37.344 cM in LG10 with LOD and PVE values up to 16.02 and 62.5%, respectively. These results indicated that fruit and leaf traits in Lycium are controlled by multiple loci and single loci with pleiotropic effects [33].
Stable QTLs are valuable and useful for MAS-based breeding programs and have been identified in perennials [34]. In this study, 25 stable QTLs associated with two leaf and three fruit traits were identified in LG07, LG09, LG10, and LG11 (Supplementary Table 3), four of which were identified in two LG10 regions (38.6 cM and 66.3 cM) with a high LOD value (15.7-45.3) and were related to three traits (LI, FW, and FI). In addition, in a previous study [14], two stable QTLs for FW were located in the same two LG regions (133.6 cM and 146.4 cM) as two stable QTLs for FD, which might indicate that these stable QTLs (qLI10-10, qFW10, qFI10-2, and qFI10-5) show high reliability in different environments and should be considered major QTLs. Of note, given these major QTLs, there is a genetic basis for the phenotypic correlation between FW and the FI, which is consistent with the strong correlation observed between the two traits in phenotypic analyses (Fig. 1A-C). Therefore, we speculate that two regions (38.6 cM and 66.3 cM) in LG10 play crucial roles in regulating Lycium fruit growth and development. The SNPs underlying these QTLs could be verified using the 100 remaining F 1 individuals in the future, and the tightly linked SNPs could be converted into kompetitive allele-specific PCR (KASP) markers and potentially used as early selection markers.

Mapping population construction and phenotyping
A hybrid wolfberry population derived from a cross between 'Ningqi No. 1' (NN) (L. barbarum L.) and 'Yunnan Gouqi' (YG) (Lycium yunnanense Kuang et A.M. Lu) was generated in August 2014. The female parent NN is a major artificial breeding cultivar in Northwest China. Its fruit is bright red and elliptical with a sweet taste, and its leaves are lanceolate. The male parent YG is a wildtype wolfberry with dark red, long, oval, bitter-tasting fruit (Fig. 1). Seeds of the F 1 hybrid and the two parents were collected and sown in the Ningxia Academy of Agriculture and Forestry Sciences National Wolfberry Germplasm Resources Garden (38°38 N, 106°90 E), Yinchuan City, Hui Autonomous Region, Ningxia, China, in May 2015. In total, 300 F 1 individuals were grown, 200 of which were randomly selected to establish the mapping population. Water and fertilizer management was the same as that used in the production field. Weeds were managed manually.
The leaf-and fruit-related traits were measured in the F 1 population (NY) and the two parents. LL was the maximum distance between the leaf base and tip. LW was the widest distance across the leaf. FW was the weight of one mature fruit. FL was the maximum distance between the top and bottom of the fruit. FD was the widest distance across a fruit. LL, LW, FW, FL, and FD were measured according to methods described elsewhere [35]. LL, LW, FL, and FD were measured using Vernier calipers, whereas FW was acquired using an electronic balance (SE602F, Ohaus, NJ, USA). LI and FI were calculated according to Equations (i) and (ii): (i) LI = LL/LW [35] and (ii) FI = FL/FD [35]. In total, 30 leaves and fruits collected from each tree for 3 consecutive years (from 2016 to 2018) were used to obtain phenotypic data. The average values of each trait per individual were used for QTL analysis. Complex variance analysis, variance analysis, and correlation analysis were carried out using SPSS V17.0 software (SPSS Inc., Chicago, IL, USA).

Population resequencing and genotyping
Genomic DNA was extracted from the young leaves of both parents and 200 F 1 plants using a Super Plant Genomic DNA DP360 Kit (Tiangen Biotech, Beijing, China). DNA concentration was measured using a Nan-oDrop spectrophotometer (ND2000, Thermo Fisher Scientific, USA), and DNA quality was monitored by electrophoresis on 0.85% agarose gels. The genomic DNA was sheared into 350-bp fragments using an S2/ E210 Ultrasonicator (Covaris, USA). The purified products were then ligated for end repair, subjected to 3' A and adaptor addition, and selected according to fragment size on a 1% agarose gel. The final library was constructed by PCR. Library quantification was performed using an Agilent 2100 Bioanalyzer (Agilent Technologies, Palo Alto, CA, USA), and the fragments of the libraries were paired-end sequenced (PE125) using the Illumina XTen platform (Illumina, San Diego, CA, USA) according to the manufacturer's recommendations.The data that support the findings of this study have been deposited into CNGB Sequence Archive (CNSA) of China National GeneBank DataBase (CNGBdb) with accession number CNP0001536.
Raw reads were filtered to generate high-quality clean reads by (i) removing adaptor sequences, (ii) filtering reads with > 10% unidentified nucleotides, and (iii) removing reads with > 50% bases with a low Phred quality score (≤ 10). Burrows-Wheeler Aligner [36] was used to align the clean reads to the Lycium genome (Cao et al., unpublished 2021), and duplicates were identified using Picard (Picard: http:// sourc eforge. net/ proje cts/ picard/). SNPs and insertions and deletions (InDels; 1-5 bp) were called using GATK software [37] and then annotated by SnpEff [38]. Genome variation maps of SNPs and InDels were drawn by Circos [39]. SNP genotypes were encoded using biallelic coding rules and eight genotyping patterns (aa × bb, ab × cd, ef × eg, hk × hk, lm × ll, nn × np, ab × cc, and cc × ab). All patterns except aa × bb were selected to construct a high-density genetic map for a cross-pollinated (CP) population.

Genetic linkage map construction and QTL mapping
The resulting SNPs were further screened. Specifically, hk × hk and nn × np segregation-type SNPs with a depth < 6 × in the parents, an integrity < 60%, and significant segregation distortion (SD) (Chi-square P < 0.05) were filtered out, whereas the remaining segregation types with a depth < 10 × in the parents, an integrity < 70%, and highly significant SD (chi-square P < 0.01) were filtered out. SNP markers were arranged into linkage groups (LGs) based on pairwise modified logarithm of odds (MLOD) scores. Markers with MLOD scores > 5 were assigned to a single LG. SMOOTH algorithms [40] were used to correct genotypes, and then the k-nearest neighbor approach [41] was used for genotype imputation. The JoinMap software (V4.1) mapping function for the CP population type was applied for linear arrangement within LGs. Map distances were estimated using the Kosambi mapping function [42]. A heat map of adjacent SNP relationships was generated using R (www.rproje ct. org/). MapQTL V6.0 with the interval mapping (IM) model was used for QTL analyses [43], and the LOD threshold value was set to 2.5. QTLs with a threshold LOD value > 3.0 and PVE > 10% were considered major QTLs.
Genetic map-assisted genome scaffold assembly and genome synteny analyses SLAF markers of the published genetic map (NC map) [14] and SNPs identified in this study were aligned to the Lycium scaffold-level genome assembly (Cao et al., unpublished 2021) using BLAT software [44]. The corresponding relationship and shared markers of these two LGs were extracted according to the locations of SLAF and SNP markers. An integrated genetic map was constructed using BioMercator v4.2 [45]. This integrated genetic map was used to anchor the scaffolds of the Lycium genome at the chromosome level using ALL-MAPS software [46]. The SNPs integrated genetic map were aligned to the genomes of solanaceous relatives, namely, pepper (http:// peppe rsequ ence. genom ics. cn/ page/ speci es/ downl oad. jsp #5), tomato (ftp:// ftp. solge nomics. net/ tomato_ genome/ assem bly/ build_2. 50/), and potato (http:// solan aceae. plant biolo gy. msu.edu /pgsc_ download.shtml), using BLAT [44] and the physical positions of the homologous sequence were used to generate a collinearity diagram in R.