Demographic history and genetic differentiation of an endemic and endangered Ulmus lamellosa (Ulmus)

Background Ulmus lamellosa (one of the ancient species of Ulmus) is an endemic and endangered plant that has undergone climatic oscillations and geographical changes. The elucidation of its demographic history and genetic differentiation is critical for understanding the evolutionary process and ecological adaption to forests in Northern China. Results Polymorphic haplotypes were detected in most populations of U. lamellosa via DNA sequencing. All haplotypes were divided into three phylogeographic clades fundamentally corresponding to their geographical distribution, namely THM (Taihang Mountains), YM (Yinshan Mountains), and YSM (Yanshan Mountains) groups. The YSM group, which is regarded as ancestral, possessed higher genetic diversity and significant genetic variability in contrast to the YSM and YM groups. Meanwhile, the divergence time of intraspecies haplotypes occurred during the Miocene-Pliocene, which was associated with major Tertiary geological and/or climatic events. Different degrees of gene exchanges were identified between the three groups. During glaciation, the YSM and THM regions might have served as refugia for U. lamellosa. Based on ITS data, range expansion was not expected through evolutionary processes, except for the THM group. A series of mountain uplifts (e.g., Yanshan Mountains and Taihang Mountains) following the Miocene-Pliocene, and subsequently quaternary climatic oscillations in Northern China, further promoted divergence between U. lamellosa populations. Conclusions Geographical topology and climate change in Northern China played a critical role in establishing the current phylogeographic structural patterns of U. lamellosa. These results provide important data and clues that facilitate the demographic study of tree species in Northern China. Supplementary Information The online version contains supplementary material available at 10.1186/s12870-020-02723-7.


Background
Examining how historical and contemporary ecological factors contribute to the demographic history and genetic differentiation of plants is a central question in ecology and evolution. Climatic oscillations have dramatically influenced the demographic history and patterns of genetic diversification in many plant species, particularly during the Pleistocene periods with more frequent glacial-interglacial cycles [1][2][3][4][5][6][7][8]. In Asia, at least four major glaciations are thought to have occurred, which likely affected its flora and fauna, although the glacial advances were not as extensive as that of Europe and North America. Northern China, in particular, experienced severe climatic oscillations throughout the Quaternary, but was never covered by massive ice-sheets [9][10][11]. The climatic oscillations caused by repeated glaciations shifted species ranges and isolated populations, which could generate distinct population structures, namely, "southern richness" and "northern purity" [12][13][14].
Meanwhile, the potential role of geology in facilitating the population divergence and species diversification of plants has been revealed [9,11,15,16]. In China, the geographical and physiognomy complexity caused by the uplift of mountains [17,18] often provided dispersal corridors for the connection, gene flow, and range expansions/contractions of fragmented populations [19,20]. Mountainous areas might serve as shelters for the range shifts of plants in response to climatic oscillations [21][22][23][24][25][26]. Some mountains, such as the Taihang, Yanshan, Yinshan, and Qinling Mountains of Northern China, are an important component of the South-North vegetation transect, and crucial biodiversity areas covered with diverse plant biomes, ranging from tropical to cold forests and taiga [27,28]. These mountains form diverse topographies and wide spectrum environmental conditions, which provided refugia for species during the global scale climatic changes of the Quaternary [29,30].
Previous researches have revealed different refugia patterns for forest trees in China, such as single refugium, multiple refugia, microrefugia, and cryptic refugia [31][32][33]. Nevertheless, most studies have focused on the populations or species in the Southern, Southwestern, and Northwestern regions of China [3,24,34], with little research being done on the Northern areas. The potential roles of ecological factors (e.g., geologic and climatic oscillations) in promoting the divergence of plant populations in Northern China remain unclear.
Ulmus lamellosa (Ulmus) is primarily distributed across and endemic in Northern China, which is an important tree in many communities and terrestrial ecosystems. The abundance of decylic acid in the seeds of this species makes it a critical medicinal and light industrial resource [35], and further, the tree trunks have rich ornamental characteristics [36].
Unfortunately, U. lamellosa populations are constantly decreasing due to environmental degradation and anthropogenic destruction. Thus, it has been classified as a Class II State-Protected Endangered Plant Species in China [37,38]. As an ancient and long-lived tree species, U. lamellosa might have existed prior to the uplift of Taihang, Yinshan, and Yanshan Mountains, and experienced the climatic oscillations of the Quaternary and even the Tertiary [39]. Consequently, this species was considered to be an ideal species for investigating the influences of geographical topology and climatic dynamics on species divergence in Northern China.
Using chloroplast derived DNA sequences, Liu et al. (2017) preliminarily investigated the molecular phylogeographical structures and biogeographic history of U. lamellosa. A significant phylogeographic pattern was found for this tree species, where the intraspecific divergence of all cpDNA haplotypes began to occur in the late Miocene. Most studied U. lamellosa populations have not experienced recent expansions. Across the distribution ranges of this species, multiple refuge areas were identified in Northern China [40]. These results provided insights into the phylogeography and demographical history of U. lamellosa. However, chloroplast genes have a relatively slower evolutionary ratio than do nuclear genes, are generally of maternal inheritance, and cannot reflect pollen flow in angiosperms [41][42][43]. Furthermore, the results based on chloroplast genes provide only limited information on the phylogeographic structures and evolutionary history of plants, particularly intra-specifically [25,44].
For this study, we proposed the hypothesis that U. lamellosa had multiple refuges in Northern China, where geological changes and climatic oscillations played important roles in the demographic history of this species. Thus, our main goals were to comparatively investigate the demographic history and genetic differentiation of U. lamellosa, analyze the phylogeographic patterns of this boreal tree, identify the refugia patterns during glacial periods, and estimate intraspecies haplotype divergence times during the evolutionary process of this species. These results would shed light not only on the effect mechanisms of climatic oscillations and orogeny of mountains (Taihang, Yanshan and Yinshan Mountains) on the genetic differentiation and distribution patterns of populations, but also offer a theoretical foundation for the investigation of other forest trees across Northern China.

Plant materials
Our field work had been approved by the Chinese government and implemented in compliance with the laws of the People's Republic of China. Each sampling participant had a permission letter from School of Life Sciences of Shanxi Normal University. By referencing to the phenotypic characteristics of the collected samples, the professor (Su Junxia, Shanxi Normal University) of systematic botany assisted in the classification and identification. Voucher specimens were kept at the Herbarium of School of Life Sciences, Shanxi Normal University (No: 20180703422-20,180,703,436).
A total of 232 individuals across fourteen natural populations of U. lamellosa were collected from its distribution range in Northern China. According to the filed sampled sites, three main geographical groups were assigned based on mountain boundaries, which were marked as THM (Taihang Mountains), YSM (Yanshan Mountains), and YM (Yinshan Mountains), respectively. Details on the geographical locations of the populations and the number of sampled individuals are presented in Additional file 1: Table S1 and Fig. 1. Individuals within each population need to be kept at least 30 m away from each other to prevent the occurrence of sampling clones. The fresh leaves of each individual were stored in silica gel at room temperature for DNA extraction.

DNA extraction, amplification, and sequencing
Through Cetyl Trimethylammonium Bromide (CTAB) methods, the total genomic DNA was extracted from fresh leaves [45,46] followed by the determination of its quality and concentration with the spectrophotometer and agarose gel electrophoresis (0.8%), respectively. The universal primer ITS4 and ITS5 used to amplify of the nuclear ribosomal internal transcribed spacer regions (nrITS: ITS1, 5.8SrDNA and ITS2), and the single-copy gene (Aat) [47,48], were selected for this study due to the high sequence variation within and among U. lamellosa populations [40].
The DNA sequences obtained from nrITS and Aat were aligned using CLUSTAL X software [49] and then adjusted manually. All indels were encoded into the data matrix according to the simple indel coding method [50], as implemented in INDELCODER [51].

Data analysis Genetic diversity and differentiation
For each population, DNASP software [52][53][54][55] was employed to calculate the nucleotide diversity (π), the number of haplotypes, and the haplotype diversity (H d ). The presence of phylogeographic structure was tested by using PERMUT [56][57][58] software (10,000 permutations) based on the parameter N ST (a distance matrix based on the number of mutational steps between haplotypes) significantly higher than G ST (haplotype frequencies). NET-WORK software [59] was utilized to infer the relationship between the differentiation and evolution of haplotypes with the maximum parsimony median-joining model. STRUCTURE software can detect the potential population structure more accurately duo to eliminating the interference of population locations [60][61][62]. With the associated gene frequency admixture model, for each K (K = 1-10), ten independent runs were conducted with an initial burn-in of 100,000 iterations and 1000,000 subsequent Markov Chain Monte Carlo (MCMC) steps. The log probability of data (LnP(D)) and the second rate of change in the likelihood distribution (ΔK) were applied to evaluate the potential clusters (K) [63]. The Q matrix was obtained by integrating the membership coefficient matrix of repeated operation from the results of STRUCTURE analysis by using CLUMPP software. And it was finally visualized as a diagram by using DIST RUCT software [60][61][62][63].
To quantify and assess the proportion of explainable genetic variance between groups and among populations within groups, an hierarchical analysis of molecular variance (AMOVA) [64] was conducted by ARLEQUIN software [54,65], and then tested the significance with 1000 permutations.

Population demographic history
Mismatch distribution and neutrality tests To infer whether there existed significant historic demographic expansion events in each geographical group, the mismatch distribution analysis was performed by ARLEQUIN software with 1000 parametric bootstraps [65,66]. The parametric SSD (the sum of squared deviations) and HRag (Harpending's raggedness index) were applied to appraise the goodness-of-fit under a sudden expansion model [67]. Meanwhile, the Tajima's D (Tajima, 1989) [68] and Fu's F S (Fu, 1997) neutrality tests [69], as two alternate method, were used to infer potential expansions events.
The time of occurrence of expansion events was calculated by the following formula [66,70]: T = τ/2 u, where τ was the parameter value for the mode of the mismatch distribution and u was the result of the formula u = μkg. The value μ, k and g represent the number of substitutions per site per year (s/s/y), the average sequence length under study, and the generation time in years, respectively [71,72]. The parameter value α (maturity age of species, about 30 years) and [S/1 − S] (about 100; S, annual survival rate of adult individuals) was utilized to estimate the expansions time by using g = a + [S/(1 − S)] [73,74].
Divergence time To relate the historic events and differentiation among U. lamellosa haplotypes, the divergence time was calculated using BEAST software with a calibrated molecular clock [44,[75][76][77][78][79]. The HKY, as the optimal model, was employed in BEAST analysis according to the Akaike information criterion (AIC) using jMODELTEST software [80,81], and coalescence with a constant size was set as the prior tree. Due to the absence of records of ITS mutation rates in U. lamellosa or other Ulmaceae family members, the minimum and maximum values of all average ITS mutation rates of angiosperms [3.2 × 10 − 10 and 9.0 × 10 − 9 substitutions per site per year (s/s/y)] was selected to date the demographic events in this study [82].
The fossil records for Ulmus (15.97-13.82 Ma, https:// www.paleobiodb.org/classic) were to be calibrated. The analysis was implemented for 50 million generations, with sampling frequency of once per 1000 generations. The log files were run in TRACER [83] and the 1000, 000 generations were discarded as burn-in to ensure that the overall ESS value met the requirements (> 200). The TREEANNOTATOR software was applied to calculated an maximum clade credibility (MCC) tree with posterior probability (PP) limit > 0.5 using after removing the first 20% of trees as burn-in. And then the FIGTREE was used to visualize the final file [79,84].
Approximate Bayesian computation ABC analysis (Approximate Bayesian Computation) [76,[85][86][87] was performed in DIY-ABC software to model the potential demographic history and to evaluate the plausible evolutionary scenario of U. lamellose [88]. To simplify the number of model settings and operation time, we pooled all U. lamellosa populations into three geographical groups (THM, YSM, and YM). First, seven population divergence scenarios were tested with 7,000,000 coalescent simulations: scenario 1 (Simple split model), scenarios 2-4 (Hierarchical split model), and scenarios 5-7 (Isolation with admixture model). For the common parameters between models, the range was the same and the prior distribution was uniform. Based on the observation data closest to 1% of the simulated data, the posterior probability of each evolutionary event was predicted and the confidence interval up to 95% could be obtained by logistic regression. The optimal evolution scenario was filtered by using the highest a posteriori probability (PP) as the selection criterion.
In order to verify a goodness fit scenario, the "model checking" function in DIYABC software, namely principal component analysis (PCA), was used to analyze the simulated datasets (100,000) in the summary statistics. Based on the pseudo-observed datasets (PODs, 1000), type I and type II error rates was evaluated using the "evaluate confidence in scenario choice" functional option of DIYABC. According the above indices, the ancestral group of the optimal scenarios was further finally inferred.
Historical and contemporary gene flow The coalescent method, implemented in MIGRATE-N software, was employed to investigate the historical gene flow pattern between the three geographical U. lamellosa groups [89][90][91][92]. The Bayesian method was used to estimate historical migration rates (M = m/μ, where m = migration rate) among groups and the effective population size (θ = 4Neμ, where μ = mutation rate) of different groups [89,93]. The eq. 4Nm = θ * M was employed to compute the number of migration per generation (Nm) of each population of U. lamellosa [89,93,94].
Furthermore, the gene flow analysis was performed under the HKY mutation model. It was run for five replications with constant mutation rates and starting parameters based on F ST calculations. Each replicate was analyzed with 3 long chains and 10 short chains. The static heating treatment scheme corresponded to four temperature levels (1, 1.2, 1.5, and 3) and 1000,000 trees was discarded as burn-in. Then, the average of the results of five independent operations, namely the historical gene flow, would be obtained with a confidence interval of 95%.
BAYESASS software was used to calculate the contemporary gene flow [90,92] between the three geographical U. lamellosa groups. This analysis was ran for 10,000, 000 iterations with a sampling frequency of 1000 following a burn-in of 1000,000. The delta values for a (allele frequencies), m (migration), and f (inbreeding) was adjusted to 0.5 for guaranteeing the fixed numbers of changes were limited within the range from 20 to 60% of the total number of iterations. The data with the lowest deviance in 20 runs were selected in the further analysis. And the 95% confidence interval was calculated by mc ± 1.96 × SD [74].
Isolation by distance and environment To assess the influences of geographic isolation (IBD) and / or environmental conditions (IBE) on the observed genetic differentiation of U. lamellosa [54,95,96], the Mantel test of correlations was performed between genetic, geographic, and environmental distances, respectively.
Pairwise F ST estimated from the ITS / Aat sequences was used as the genetic distances matrix. And the geographic distances between populations was calculated by using the GENALEX software [97]. While the matrix of the environmental distances was computed using PAS-SAGE software [98].
In addition, a multiple matrix regression with randomization (MMRR) was also performed to test whether the genetic distance responded to variations in the geographic and/or environmental distances via using the R package 'PopGenReport' [99,100]. Correlation coefficients of the Mantel test (r) and regression coefficient of the MMRR (r 2 ) and their significance were determined based on 9999 random permutations.

Phylogeographical structures and genetic differentiation
The alignment of ITS and Aat sequences from 232 U. lamellosa individuals produced a consensus sequence of 652 and 392 base pairs, respectively. For the ITS sequences, eighteen haplotypes were defined with 27 polymorphic sites, including 17 parsimonious informative sites (Table 1). Between the 18 haplotypes, the most common was H1 (in 13 of 14 populations), followed by H2 (in 3 of 14 populations). Thirteen of the 14 populations were polymorphic, whereas only one population exhibited one haplotype (MS population, Table 1 and Fig. 1a). All haplotypes presented a 'star-like' network, where the common haplotype (H1) was in the central position and regarded as the most widespread one that connected other subordinate haplotypes (Fig. 1b).
For the Aat data, twenty-three haplotypes (H1-H23) were defined, which contained 33 polymorphic sites, 26 of which were parsimonious informative sites (Table 1 and Fig. 1c). Among all of the haplotypes, the most common was H4 (in 5 of 14 populations). Eleven of the 14 populations possessed high haplotype diversity, and only one haplotype was found in the MS, LS, and WLS populations. As the most common and widespread haplotype, compared with the others s, H4 had more connections, was relatively situated in the middle of the network (Fig. 1d).
A high genetic diversity was detected for U. lamellosa. At the species level, the estimated haplotype diversity (H d ITS , H d Aat ) was 0.424 and 0.941, whereas the nucleotide diversity (π ITS , π Aat ) was 0.00212 and 0.01601, respectively. For the group level based on ITS data, the highest values of H d ITS (0.909) and π ITS (0.00939) existed in the YSM group, followed by the THM and YM groups. On the contrary, the H d Aat and π Aat values of the YSM group were slightly lower than that of the THM and YM groups via Aat data.
Between populations, the highest H d value was detected in the different populations for different data (H d ITS = 0.798 for the SFS population; H d Aat = 1.000 for the SGS, HHG, SFS, and MLG populations). The highest π ITS value also existed in the SFS population (0.00583), and the highest π Aat value was found in the SGS and HHG populations ( Table 1).
The phylogenetic relationships of the ITS haplotypes (H1-H18) constructed using the Bayesian approach, revealed that all haplotypes were clustered into three major clades I-III (Fig. 2), which was essentially consistent with the geographical population distribution (namely YSM, THM, and YM groups). Clade I contained H9-H12, which were found in the YSM group. Clade II contained H8 and H16, which were found in THM and YSM groups. Clade III consisted of twelve haplotypes (H1-7, H13-15, H17, and H18) found in the THM and YM groups. For Aat data, 23 haplotypes were also clustered into three major clades (Additional file 2: Fig. S1), which was basically consistent with that of the haplotype network. Clade I and II contained most of the haplotypes, which were found in the THM and YSM groups. Clade III contained H5-H7, H15-H17, and H23, which were found in the THM and YM groups.
No matter the ITS or Aat data, the genetic differentiation between populations was significantly higher when computed using a distance matrix (N ST ITS = 0.435; N ST Aat = 0.738) than when using haplotype frequencies (G ST ITS = 0.117; G ST Aat = 0.192; P < 0.05), which indicated a significant phylogeographic structure within U. lamellosa.
For ITS, the STRUCTURE output showed that ΔK supported the existence of three clusters (Additional file 3: Fig.  S2), which corresponded roughly with our geographic assignment to U. lamellosa (Table 1 and Fig. 1). Nevertheless, the STRUCTURE output for Aat data indicated that ΔK supported the existence of four clusters (Additional file 3: Fig. S2). However, when K = 4, the THM group was further subdivided into two clusters. Therefore, we preferred to divide the U. lamellosa into three clusters. Additionally, a higher inter-group differentiation of this species was confirmed by AMOVA analysis (Φ ST ITS = 0.550, Φ ST Aat = 0.674, P < 0.01, Table 2). Significant variation (ITS: 40.60%, Aat: 41.96%) occurred between the three groups ( Table 2).

Inference of demographic history Expansion events and divergence time
The SSD and HRag values were non-significant (P > 0.05) for all groups based on the ITS sequences. Nevertheless, the unimodal mismatch distribution for the THM group was a perfect fit to the expected expansion  (Table 3). When Ulmus pumila and Ulmus rubra were used as outgroups, the phylogenetic tree based on ITS data revealed that all of the haplotypes from U. pumila and U. rubra formed a monophyletic group, whereas the haplotypes of U. lamellosa formed another monophyletic group (Fig. 2). The haplotypes identified in the outgroups were divergent from all of the other haplotypes at~14. 89

IBD and IBE
Pairwise F ST values between populations (used as genetic distance) were correlated against the respective pairwise geographic and environmental distances (Additional file 4: Fig. S3). The positive correlation between the matrix of genetic distances and geographical distances was significant (ITS: r = 0.594, P < 0.05; Aat: r = 0.391, P < 0.05). However, the correlations between genetic distances estimated from the ITS /Aat and environmental distance were non-significant (P > 0.05).
Multiple matrix regression with randomization (MMRR) analyses also revealed a significant effect of geographical distance on the divergence of this species. Furthermore, IBD explained more genetic variation than IBE (ITS: MMRR ≈ 12.4 times as much; Aat: MMRR ≈ 3.7 times as much).

Discussion
Phylogeographic structures and genetic differentiation of U. lamellosa A high level of genetic diversity was observed for U. lamellosa (H d ITS = 0.424, π ITS = 0.00212; H d Aat = 0.941, π Aat = 0.01601), which was similar with other plants, such as Gentiana lawrencei var. farreri (H d = 0.414, π = 0.0026), and was consistent with the results of Liu et al. [40]. Compared with cpDNA data (H d = 0.783, π = 0.00891) [40], the genetic diversity index of U. lamellosa Table 3 Neutrality tests (Tajima's D and Fu's F S tests) and mismatch distribution analyses was higher than that of ITS sequences and lower than that of Aat sequences ( Table 1). The different evolutionary trajectories and rates of nuclear and plastid markers might have been the cause of different degrees of variation, and reflected different biogeographic events such as pollen flow and seed dispersal [7,16,18,44,57,101,102]. High genetic diversity might reflect the accumulation of nucleotide mutations over long evolutionary U. lamellosa time-scales [103,104]. In addition, U. lamellosa was widely distributed in Northern China (Additional file 1: Table S1 and Fig. 1). The geographic distribution of this species might also be one of the factors that determined high genetic diversity [105].
Being an ancestor of the other two groups (Fig. 3), the YSM group had a higher diversity and variation (Table 1), which suggested a smaller influence of historical climatic oscillations on this group. The Yanshan Mountains, which run roughly East-West, is an important mountain range in Northern China [106]. The heterogeneous geology of this mountain might have facilitated the existence of multiple glacial refugia for the plants [28,107]. In our study, higher haplotype diversity was detected in the YSM group with five unique haplotypes (H8-H12, H16). Thus, the YSM group could be regarded as a refugia for U. lamellosa during the glacial period.
Meanwhile, QLY, SGS, JM, and HLS populations in the THM group (Table 1) exhibited the highest genetic diversity, although this group's total genetic diversity was lower. It is possible that THM was another refugia for U. lamellosa. However, the YM group, having not only the lowest genetic diversity, but also being a descendent of THM and YSM (Fig. 3), was not regarded as a refugia in this study, which was inconsistent with Liu et al. [40].
A significant genetic differentiation between groups ( Table 2, Fig. 1b, d, Additional file 2: Fig. S1) was revealed in the studied species, which suggested a long period of isolation during the distribution of U. lamellosa. Our results also found asymmetrical and lower contemporary gene flow (ITS: 0.0058-0.0690, Aat: 0.0083-0.1636, Fig. 4) among the three groups. Meanwhile, significant isolation by distance (r ITS = 0.594, r Aat = 0.39, P < 0.05; Additional file 4: Fig. S3) was presented in populations of this species. Such findings pinpointed that, as barriers, mountains might have driven population differentiation through long-term geographic isolation. More strikingly, most haplotypes were private as single groups. This pattern more likely resulted from long-term in situ diversification between isolated montane habitats.
Moreover, a strong phylogeographic structure was found, which was consistent with Liu et al. [40]. Theoretically, phylogeographic structures often arise as a consequence of restricted gene flow between populations and genetic drift within populations over relatively long timelines [56,108,109]. With the uplift of mountains in Northern China, the U. lamellosa habitats were naturally fragmented [38]. Climatic fluctuations that occurred during inter−/postglacial would further accelerate the habitat fragmentation of this species.
Such isolation made it difficult for pollinators to locate or access flowers and thus formed potential barriers to gene flow. Further, due to natural decomposition and animal consumption, the dispersal of U. lamellosa seeds was also restricted [36,57]. The historical and modern restricted gene flow (Fig. 4) between groups supported this view. In addition, fragmentation is a consequence of isolation that can arise either from geographic or environmental factors [40]. The fragmented habitats of U. lamellosa could not counteract the effects of genetic drift. Therefore, limited gene flow and genetic drift were the primary causes of the phylogeographical structure for U. lamellosa.

Demographic population history of U. lamellosa
Neither ITS nor Aat sequences indicated that most of the U. lamellosa populations did not experience expansion, which was supported by the mismatched distribution (Table 3), as well as non-significance Tajima's D and Fu's F S values, and consistent with Liu et al. [40]. Interestingly, only the THM group (ITS data showed) of U. lamellosa experienced a recent demographic expansion that occurred~3800 years ago. This was strongly supported by ABC analyses, which revealed that the THM group had the larger effective population size (5.92 × 10 5 , 95% CI: 1.88 × 10 5_ 9.52× 10 5 ) over ancestral populations (YSM group). Furthermore, a higher gene flow occurred from THM to YM (Fig. 4).
Despite the absence of major Quaternary glaciations, significant climatic oscillations still occurred in Northern China [110][111][112]. During this period, plant communities might have experienced the upward and downward migration along the local altitude, rather than southward retreat or northward advance in latitude [113]. Therefore, U. lamellosa might have taken refuge in situ and experienced only limited expansion during the interglacial period. The geographical distribution of haplotypes also supported this expansion pattern in the historical demography of U. lamellosa, namely the occurrence of common haplotypes H1 (ITS data) and H4 (Aat data) in the populations (Fig. 1a, c).
The intraspecific divergence of all U. lamellosa ITS haplotypes (13.35 Ma, 95% HPD: 9.38-14.89 Ma) likely began during the period between Middle-Miocene and the early stage of the Late Miocene (Fig. 2). This estimated divergence time was earlier than the Late Miocene based on cpDNA data (9.27 Ma, 95% HPD: 5.17-13.33 Ma) [40]. The Miocene to Pliocene period is an important differentiation epoch for this tree species in China. The Miocene was a period of climatic instability and abrupt environmental change. A global cooling event reoccurred in the Miocene (14 Ma) [11,14,114]. Prior to and following 16~12 Ma, the monsoon circulation was significantly enhanced in Asia [14,[115][116][117]. Most areas of Northern China were gradually altered from the arid climate, which was controlled by the original planetary climate, to the present warm temperate monsoon climate system [118,119]. Thus, Northern China was not only under the control of global climate change, but also affected by the expansion of an arid climate from the West to East [120]. Paleoclimate oscillation caused by corresponding events prompted the U. lamellosa populations to migrate from the Yanshan Mountains (YSM, ancestral group) to the Taihang Mountains (THM group, the descendent).
Furthermore, the deformation of the Taihang Mountains occurred with the significant uplift of the Tibetan-Himalayan Plateau [14,[121][122][123]. The uplift height of the northern section of the Taihang Mountains was only half of the Western portion of the Yanshan Mountains during the Miocene [121]. The low height of the Taihang Mountains did not develop as a geographical barrier and did not block U. lamellosa migration, from the northwest and northeast, which is supported by the more frequent gene exchange between the three groups ( Fig. 4 and Additional file 3: Fig. S2).
From the Eocene to the end of the Pliocene, the Yinshan Mountains emerged rapidly, which might have hindered glacial activities and climatic oscillations. Consequently, the U. lamellosa populations of the Taihang Mountains began to migrate northward to form the YM group. The higher gene flow from the THM to YM group (Fig. 4) supported the above points.
Certainly, the demographic history of U. lamellosa had also been influenced by climatic oscillations and historical processes during the Quaternary [14,124], which was supported by the divergence time of Clade II (2.89 Ma, 95% HPD: 1.24-8.86) (Fig. 2). During this period, the Taihang and Yanshan Mountains were uplifted sharply along with the Himalayan orogeny movement to finally form in the Pleistocene (~2.5 Ma B.P.) [16]. Meanwhile, repeated glacial periods had occurred since the Quaternary. These events would further promote differentiation between U. lamellosa populations and lead to the current pattern of this species.

Conclusions
Higher genetic diversity and significant genetic variations occurred in U. lamellosa populations. During glaciation, the regions of Yanshan and Taihang Mountains might be regarded as refugia for this long-lived tree species. The divergence of U. lamellosa intraspecies haplotypes occurred during the Miocene-Pliocene, which was associated with major Tertiary geological and/or climatic events in Northern China. Subsequently, the series of Mountains (e.g., Yanshan and Taihang Mountains) uplifted in Northern China once Tertiary and Quaternary climatic oscillations blocked gene exchange and further promoted the divergence of populations.
In summary, the causes of the phylogeography pattern of U. lamellosa were suggested based on geological evidence and paleoclimate data. The results in this study may provide insights into how the endangered U. lamellosa species might be protected and utilizing. Further, they may play an important role in the exploration of genetic differentiation patterns and dynamic changes of other tree species, while offering a scientific basis toward understanding the evolutionary history and ecological adaption of forests in Northern China.