Analyses of DNA sequence and morphological data separately
The combined cpDNA matrix, comprising six chloroplast regions (trnL-F, matK, rps16, atpI-atpH, trnH-psbA, and trnT-L), had an aligned length of 5662 bp, of which 4719 (83.35 %) sites were constant, 560 (9.89 %) were variable but uninformative, and 383 (6.76 %) were parsimony informative. We were unable to amplify cpDNA regions from P. confluens. Modeltest indicated GTR + G as the best-fit model for the cpDNA sequence data. The strict consensus of six trees from the MP (Maximum Parsimony) analysis (L = 1182, CI = 0.884, RI = 0.873) was generally congruent in topology with the ML (Maximum Likelihood) tree and the majority rule consensus BI (Bayesian Inference) tree (Additional file 1: Figure S2). Support values less than 50 % are marked with an asterisk.
In the nuclear DNA analysis, with P. confluens added to the matrix, the ILD (incongruence length difference) test gave a p value of 0.42, indicating that the sequence data from ITS and PeCYC1D were congruent. The combined nuclear DNA matrix of ITS and PeCYC1D consisted of 1662 bp, of which 1213 (72.98 %) sites were constant, 228 (13.72 %) were variable but uninformative, and 221 (13.30 %) were parsimony informative. Modeltest indicated GTR + G as the best-fit model for the combined nuclear DNA data. The strict consensus of eight trees from the MP analysis (L = 642, CI = 0.872, RI = 0.849) was congruent with the ML tree and the majority rule consensus BI tree (Additional file 1: Figure S3).
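The site categories reported for the two matrices can be illustrated with a short sketch. It assumes the usual parsimony definition (a site is informative when at least two character states each occur in at least two taxa) and assumes that gaps and ambiguous bases are ignored; the function names and the toy alignment are hypothetical and are not taken from the study. The last lines recompute the percentages from the counts reported above.

```python
from collections import Counter

# Assumption: gaps and ambiguous bases are ignored when classifying a site.
MISSING = set("-?N")


def classify_column(column):
    """Classify one alignment column as constant, uninformative, or informative."""
    counts = Counter(b.upper() for b in column if b.upper() not in MISSING)
    if len(counts) <= 1:
        return "constant"
    # Informative: at least two states are each present in at least two taxa.
    shared_states = sum(1 for n in counts.values() if n >= 2)
    return "informative" if shared_states >= 2 else "uninformative"


def percentages(constant, uninformative, informative):
    """Return the total number of sites and the percentage in each class."""
    total = constant + uninformative + informative
    return total, {
        "constant": round(100 * constant / total, 2),
        "uninformative": round(100 * uninformative / total, 2),
        "informative": round(100 * informative / total, 2),
    }


def summarize(alignment):
    """Tally site classes over an alignment given as equal-length strings."""
    tallies = Counter(classify_column(col) for col in zip(*alignment))
    return tallies, percentages(
        tallies["constant"], tallies["uninformative"], tallies["informative"]
    )


# Hypothetical four-taxon toy alignment (not real Petrocosmea data).
toy = ["ACGTA", "ACGTC", "ATGTC", "ATGAA"]
toy_tallies, (toy_total, toy_pct) = summarize(toy)

# Counts reported in the text for the cpDNA and nuclear matrices.
cp_total, cp_pct = percentages(4719, 560, 383)
nuc_total, nuc_pct = percentages(1213, 228, 221)
```

With the reported counts, the sketch returns totals of 5662 and 1662 sites and the same percentages as given in the text.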
In the combined cpDNA and nuclear DNA analysis, P. rosettifolia and P. longianthera were removed because their positions differed markedly between the cpDNA and nuclear DNA trees, but P. confluens was included despite lacking cpDNA data. The ILD test gave a value of p = 0.25, indicating that the data from the two genomes, excluding these two species, did not contain significant incongruence. Modeltest suggested that the GTR + G model best fit the combined data. The combined dataset consisted of 7320 bp, of which 774 (10.57 %) sites were variable and 587 (8.02 %) were parsimony informative. Parsimony analysis resulted in a single tree (L = 1767, CI = 0.886, RI = 0.872), which was congruent with the ML tree and the majority rule consensus BI tree (Fig. 2).
The MP-ML-BI tree of the combined cpDNA and nuclear DNA datasets was similar to the separate cpDNA and nuclear DNA trees but had stronger support (Fig. 2; Additional file 1: Figures S2-S3). The combined cpDNA and nuclear DNA tree comprises five main clades labeled A–E (Fig. 2). Each clade receives strong or maximum support, and the clades are joined successively with strong to maximum support (Fig. 2).
For the analysis of the morphological data, forty-one morphological characters were coded. The strict consensus of 125 trees from the MP analysis (L = 82, CI = 0.842, RI = 0.972) was congruent with the majority rule consensus BI tree (Additional file 1: Figure S4). Like the DNA trees, the morphological tree comprises five major clades that include the same species as the molecular trees. However, most nodes within the five major clades have weak to moderate support, with frequent polytomies.
Analysis of combined DNA sequence and morphological data
In the analysis of the combined DNA and morphological data, P. rosettifolia and P. longianthera were again removed because of the discrepancies in their placement between the ITS and cpDNA data. With these two species excluded, the ILD test gave a value of p = 0.082, indicating that the DNA and morphological data did not contain significant incongruence. The combined dataset consisted of 7361 characters, of which 774 (10.51 %) were variable and 628 (8.53 %) were parsimony informative. Parsimony analysis resulted in a single tree (L = 1853, CI = 0.882, RI = 0.888), which was congruent with the majority rule consensus BI tree (Fig. 3).
The trees from the combined DNA and morphological dataset and from the combined DNA dataset are identical in topology, with only minor differences in the support values of some branches (Figs. 2-3). The tree of combined DNA and morphological data consists of five major clades labeled A-E, each with strong to maximum support, which are joined together with maximum support (Fig. 3). Clade A, which consists of four taxa (P. kerrii var. kerrii, P. kerrii var. crinita, P. menglianensis, and P. grandifolia) of sect. Deinanthera sensu Wang (1985) [9] and one species (P. parryorum) of sect. Anisochilus sensu Wang (1985) [9], is sister to the remaining species with maximum support. The five species share a series of synapomorphies exclusive to clade A, i.e., a vestigial caulescent habit with ascendant leaves, an upper lip slightly shorter than the lower lip, anthers that are constricted at the tip, and two dark red-brown spots on the lower side of the corolla tube below the filaments (Figs. 1, 4). In addition, P. kerrii var. kerrii is sister to P. parryorum with maximum support, a relationship that is reflected morphologically in their shared blue-violet flowers with geniculate filaments. In contrast, P. kerrii var. crinita is sister to P. grandifolia/P. menglianensis with maximum support rather than sister to the type variety of P. kerrii, consistent with their shared white flowers with straight filaments. Petrocosmea kerrii var. kerrii and P. kerrii var. crinita therefore appear to be two independent species, because they are not recovered as an exclusive monophyletic group.
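The monophyly argument for the two varieties of P. kerrii can be made concrete with a minimal sketch. The nested-tuple tree below encodes only the relationships within clade A described above; treating the two sister groups as each other's closest relatives is an assumption, and the helper names are hypothetical. This is an illustration of the reasoning, not the software used in the study.

```python
def leaves(node):
    """Return the set of tip names below a node of a nested-tuple tree."""
    if isinstance(node, str):
        return {node}
    return set().union(*(leaves(child) for child in node))


def is_monophyletic(tree, taxa):
    """True if some clade of the rooted tree contains exactly the given taxa."""
    taxa = set(taxa)

    def visit(node):
        if leaves(node) == taxa:
            return True
        if isinstance(node, str):
            return False
        return any(visit(child) for child in node)

    return visit(tree)


# Clade A as described in the text (arrangement of the two groups assumed).
clade_a = (
    ("P. kerrii var. kerrii", "P. parryorum"),
    ("P. kerrii var. crinita", ("P. grandifolia", "P. menglianensis")),
)

kerrii_varieties = {"P. kerrii var. kerrii", "P. kerrii var. crinita"}
kerrii_is_monophyletic = is_monophyletic(clade_a, kerrii_varieties)
kerrii_parryorum_pair = is_monophyletic(
    clade_a, {"P. kerrii var. kerrii", "P. parryorum"}
)
```

Under this topology the two varieties of P. kerrii do not form a clade, whereas P. kerrii var. kerrii plus P. parryorum does.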
Clade B contains eight taxa (P. coerulea, P. begoniifolia, P. melanophthalma, P. confluens, P. hexiensis, P. duclouxii, P. sichuanensis, and P. mairei var. intraglabra) of sect. Anisochilus sensu Wang (1985) [9] and is sister to clades C-E with maximum support. Petrocosmea mairei var. intraglabra and P. sichuanensis form a maximally supported sister pair that is joined successively, with strong support, by P. duclouxii (MP-BS (bootstrap) = 96 %; PP (posterior probability) = 100 %), P. hexiensis (MP-BS = 99 %; PP = 100 %), and P. confluens (MP-BS = 98 %; PP = 100 %). Petrocosmea coerulea and P. melanophthalma are sister species with moderate support (MP-BS = 78 %; PP = 98 %) and are further grouped with P. begoniifolia (MP-BS = 70 %; PP = 100 %). The two branches in clade B are joined with strong support (MP-BS = 97 %; PP = 100 %). The species of clade B are defined by their short upper lips with semiorbicular corolla lobes. The morphological synapomorphies of clade B also include two upper corolla lobes that are strongly reflexed backward and two purple spots on the lower side of the corolla tube below the filaments (Fig. 1). Petrocosmea mairei var. intraglabra is evidently a species distinct from P. mairei var. mairei, which is nested in clade D (Figs. 2-3).
Clade C includes eight taxa (P. iodioides, P. martinii var. leiandra, P. martinii var. martinii, P. minor, P. sericea, P. shilinensis, P. xingyiensis, and P. huanjiangensis) of sect. Anisochilus and two species (P. grandiflora and P. yanshanensis) of sect. Petrocosmea. Clade C contains two maximally supported lineages. In one lineage, P. grandiflora and P. yanshanensis are strongly supported sister species (MP-BS = 97 %; PP = 100 %) that are grouped in sequence with P. sericea (MP-BS = 98 %; PP = 100 %), P. martinii var. martinii (MP-BS = 99 %; PP = 100 %), and the maximally supported sister pair P. iodioides and P. martinii var. leiandra. In the other lineage, P. minor and P. shilinensis are sister to each other (MP-BS = 71 %; PP = 97 %) and are further grouped with P. xingyiensis with moderate support (MP-BS = 73 %; PP = 100 %); together they are sister to P. huanjiangensis with strong support (MP-BS = 98 %; PP = 100 %).
The eight species traditionally placed in sect. Anisochilus all share a specific floral character: the two upper corolla lobes are fused along nearly their entire length, and each lobe is folded and rolled laterally so that the upper lip forms a carinate-plicate shape that encloses the style. In the traditional classification, the upper lip of these species is described only by the phrase “indistinctly 2-lobed, emarginate, or undivided”. This specific structure of the upper lip is recognized here for the first time in Petrocosmea (Fig. 1). Petrocosmea grandiflora and P. yanshanensis, a pair of sister species, exhibit a series of floral characters that differ distinctly from those of the other species of clade C (Fig. 5). These two species are strikingly similar to the species of clade E in the external appearance of the corolla (Fig. 5), which is why they were all formerly placed in sect. Petrocosmea. Nevertheless, the highly fused upper lips in the flowers of P. grandiflora and P. yanshanensis, a synapomorphy shared with the other species of clade C, indicate their membership in clade C. The similarity between these two species and members of clade E is likely the result of convergent floral evolution. Clade C is sister to clades D and E with maximum support.
Clade D comprises six taxa (P. forrestii, P. mairei var. mairei, P. barbata, P. cavaleriei, P. xanthomaculata, and P. longipedicellata) of sect. Anisochilus and two newly described species, P. nanchuanensis and P. glabristoma, with strong support (MP-BS = 98 %; PP = 100 %). Petrocosmea nanchuanensis is sister to a maximally supported branch containing P. barbata and P. longipedicellata, which are grouped with strong support (MP-BS = 91 %; PP = 100 %) with two maximally supported sister species, P. cavaleriei and P. xanthomaculata. These five species, as a maximally supported branch, are further united with three well-resolved sister species, P. glabristoma, P. forrestii, and P. mairei var. mairei. The species in clade D have a bilateral corolla generally similar to that of the species in clade B. However, the two lobes of the upper lip are extended forward rather than reflexed backward. In addition, they can be easily recognized by two bright yellow spots or cicatrices on the lower lip and by hairs on the upper lip in the corolla throat (Fig. 1).
Five species (P. nervosa, P. oblata, P. flaccida, P. sinensis, and P. qinlingensis) of sect. Petrocosmea form clade E with maximum support. In clade E, P. oblata and P. flaccida are sister species with maximum support, and this pair is grouped with another pair of sister species, P. sinensis and P. qinlingensis, with strong support (MP-BS = 90 %; PP = 100 %). Petrocosmea nervosa is sister to the remaining species in clade E with maximum support. The species of clade E all share a large bilobed upper lip that is equal or almost equal to the trilobed lower lip (Fig. 1). Correspondingly, their styles are generally located in the center of the flower. In addition, longitudinally dehiscent anthers and three yellow spots on the upper side of the corolla tube below the filaments are unique to the species of clades D and E, supporting their sister relationship.
Ancestral area and character state reconstructions
The results of the ancestral area reconstruction using S-DIVA in RASP are shown in Fig. 6. The most recent common ancestor of Petrocosmea was inferred to have occurred in the border region of China, Thailand, India, and Myanmar, east and southeast of the Himalaya-Tibetan Plateau. Petrocosmea has diversified greatly in southwestern China, especially in the Hengduan Mountain-Yungui Plateau region, and has further spread to central China (Fig. 6).
For the ancestral character state reconstructions, twelve diagnostic characters were analyzed on the posterior set of trees derived from the combined molecular data analysis (Fig. 2). These were selected from among all the scored characters because they may represent important adaptations in the speciation of Petrocosmea. They are plant habit, ratio of the upper lip to the lower lip, structure of the upper lip, character of the corolla throat, dorsoventrally equal/unequal development of the ovary, length ratio of corolla tube to corolla lobes, inflation of the lower part of the corolla tube, position of the anther and filament relative to the ovary and style, type of anther dehiscence, exsertion of the style with curvature type of the style tip, constriction at the top of the anther, and straight versus geniculate filaments (Figs. 1, 4, 7-8, Additional file 1: Figure S5). We found that the plants of clade A retained a vestigial caulescent habit with ascendant leaves, which subsequently transitioned to a habit consisting of a short rhizome with rosette leaves spreading on the ground (Fig. 1). An upper to lower lip ratio of 1:2 was inferred to have arisen independently twice, in clades B and D. The upper lip is reflexed backward in clade B but extended forward in clade D (Figs. 1, 7). The upper to lower lip ratio is 1:4 in the main branch of clade C, but the upper lip is secondarily lengthened to equal the lower lip in clade E and in the P. grandiflora/P. yanshanensis branch of clade C (Figs. 1, 5, 7). Corolla throat ribbing and equal versus unequal dorsoventral development of the gynoecium were correlated in all taxa, and character state mapping indicates that a corolla throat ribbed on both the upper and lower surfaces, together with a gynoecium that develops only slightly unequally dorsoventrally, is the ancestral state for Petrocosmea (Fig. 8).
Similarly, four other characters were correlated: corolla tube length, corolla tube inflation on the lower side, number of fertile stamens and type of dehiscence, and exsertion and orientation of the style. The ancestral states for these are a corolla tube that is equal to or slightly longer than the lobes and is inflated on the lower surface, two fertile stamens with poricidal dehiscence, and an exserted style that is bent downward (Fig. 8). In clades D and E, the tube is shortened and not inflated; although there are also only two fertile stamens, their dehiscence is longitudinal and the exserted style is bent upward (Fig. 8).
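The logic of such a reconstruction can be illustrated with a simple parsimony sketch. This is not the method used in the study, whose reconstructions were run on the posterior set of trees; Fitch parsimony is substituted here only because it is compact. The backbone (A, (B, (C, (D, E)))) follows the clade relationships described above, each clade is collapsed to a single tip, and coding clades A-C as poricidal is an assumption based on the stated ancestral state. All names in the code are hypothetical.

```python
def fitch(node, tip_states):
    """Bottom-up Fitch pass on a rooted binary tree of nested tuples.

    Returns (set of most-parsimonious states at this node, number of changes).
    """
    if isinstance(node, str):
        return {tip_states[node]}, 0
    left, right = node
    left_set, left_changes = fitch(left, tip_states)
    right_set, right_changes = fitch(right, tip_states)
    common = left_set & right_set
    if common:
        # The children agree on at least one state: no extra change needed.
        return common, left_changes + right_changes
    # The children disagree: keep both options and count one change.
    return left_set | right_set, left_changes + right_changes + 1


# Backbone of the five major clades, each collapsed to one tip.
backbone = ("A", ("B", ("C", ("D", "E"))))

# Anther dehiscence; states for clades A-C are assumed for illustration.
dehiscence = {
    "A": "poricidal",
    "B": "poricidal",
    "C": "poricidal",
    "D": "longitudinal",
    "E": "longitudinal",
}

root_states, n_changes = fitch(backbone, dehiscence)
```

Under these assumptions the root is reconstructed as poricidal with a single change, which falls on the branch leading to clades D and E.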
A series of novel morphological traits is correlated with cladogenetic events in Petrocosmea. These morphological novelties are mainly reflected in the size and shape of the upper lip. In clade A, the two upper corolla lobes are slightly smaller than the three corolla lobes of the lower lip, generating a moderate floral zygomorphy as in Raphiocarpus. In clade B, the two upper corolla lobes are remarkably reduced relative to the three lobes of the lower lip. In clade C, the two much shortened upper corolla lobes are fused and extremely specialized. In clade D, although the upper lobes are generally similar in size to those in clade B, they are extended forward with a flat face, in contrast to the two upper corolla lobes reflexed backward in clade B. The flowers in clade E are nearly actinomorphic, as reflected in the equal length of the upper and lower lips, a deep sinus among the five corolla lobes, and a much shortened corolla tube (Fig. 1). These morphological variants in the size and shape of the upper lip are consistent with a series of counterparts in other floral organs, such as the character of the corolla throat, the length ratio of corolla tube to corolla lobes, inflation of the lower part of the corolla tube, position of the anther and filament relative to the ovary and style, type of anther dehiscence, exsertion of the style with curvature type of the style tip, and dorsoventrally equal/unequal development of the ovary (Figs. 1, 4, 8).